showing updated pages to users who don't actively clear their cache?

04-03-2007, 02:01 PM

I am maintaining a website for my mother where she is displaying her paintings, www.taimiborg.se . It's a very simple and small 'old-school' site that does not use cookies or any other things (at least not that I consciously implemented).

Yesterday I made an update to a page, Utställningar/Upcoming exhibitions. I could see the new page immediately, but some 12 hours later my mother called me and said she kept getting the old page.

It seems to me that it was a cache issue because when I told her to refresh the page the new one appeared.

I usually don't have problems like this since I have set the cache in all my browsers to small sizes and frequent clearing. However, there must be many users like my mother who don't really know what a cache is and why one can refresh a page. So now I am faced with the risk that they too have old pages in their cache and will not see the new updated pages when they revisit the site.

Actually I thought that this was somewhat of a non-problem. Even if IE/Firefox/Opera allow cache sizes of 200 MB (and thus keep files for a long time before automatically purging to make room for newer files), I assumed that once I close my browser and restart it, the browser compares the cached page to the requested page when loading it. I thought this was normal cache behaviour of browsers, but the experience of my mother suggests otherwise.

So I wonder: when developing websites, what is the standard method for trying to ensure that a visitor is viewing the most up-to-date pages rather than some old cached ones on their harddrives? Should cookies be used for this? Or some meta-tag?

Please bear in mind that I am not a professional HTML programmer (obviously) so I appreciate pedagogical answers.

Thank you!

04-03-2007, 05:41 PM
not sure if you have any serverside languages available but a simple trick is to append a random (or logical) string to the query string, this makes the browser think it has a different page to display...

e.g. in PHP
<a href="http://blah.com/index.html?r=<?= rand(1, 100) ?>">link</a>
or ?r=<?= date('dmy') ?> etc

I assume you can do the same with javascript for internal links

04-03-2007, 05:54 PM
I would not recommend Firepages' method of appending random strings to a URL. Doing so violates URL usability guidelines (http://www.useit.com/alertbox/990321.html) and will cause problems for search engines, negatively impacting any search engine rankings a site has now or could potentially have in the future.

Like you I've never actually thought about this being a problem before. I don't have any solid suggestions either... personally I'd ignore it because I've no experience of this causing problems for me or for any users of my websites. Not a particularly helpful suggestion I know... if you really want to do it, my hunch is there should be a JavaScript out there that will do this (although I've not looked).

04-03-2007, 06:28 PM
Thanks for that warning Pennimus.

I certainly don't want to mess up the Google ranking, where her site is currently at the top position. Furthermore, since this is not a huge site with any strange needs, I want to follow web standards as closely as possible so as to maximise usability. I kind of thought that web developers had long ago solved this problem (if indeed it is a problem?) of providing the most up-to-date content to the user - instead of them taking old pages from their cache - and that it had been incorporated into HTML.

(Also I was not quite sure what to do with that PHP string above since I have no clue about how to use PHP. I am pretty sure the web-hoster supports it (one.com) but I don't know where I would start if implementing that function in PHP or javascript. (I build my sites inside GoLive with I think purely HTML and CSS building blocks and have no scripting or programming knowledge.))

I ran a few extra tests and played around with some dummy pages on her site. On my computer (WinXP Pro SP2), updated pages would show immediately in IE, Opera and Firefox if I clicked the link for a page and did not use the back button or address bar autocomplete, and, most importantly, the most up-to-date page always showed if the browser was closed down and then re-opened.

So perhaps I need to go to her computer and start looking there in order to understand why her Firefox did not show the new page even after the whole computer had been shut down for the night and rebooted in the morning.

It would be helpful if other people could also weigh in on this precise point:

1) If I visit index.html and click a link to page X and then close down the browser without clearing the cache, and then
2) update page X on the server, and then
3) restart the same browser and from index.html click the link leading to page X,

am I correct in expecting that the updated page should load automatically rather than the old page being fetched from the browser cache (given default browser settings)?
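The three steps above can be sketched as a conditional request. This is only a minimal simulation, assuming the server sends a Last-Modified header and the browser chooses to revalidate with If-Modified-Since; the function name `respond` and the timestamps are hypothetical, and real browsers may skip this check when they consider the cached copy still fresh.

```python
from email.utils import formatdate, parsedate_to_datetime


def respond(page_mtime, if_modified_since=None):
    """Simulate how a static-file server might answer a (conditional) GET.

    page_mtime: modification time of page X on the server (Unix seconds).
    if_modified_since: value of the browser's If-Modified-Since header, if any.
    Returns (status_code, response_headers).
    """
    last_modified = formatdate(page_mtime, usegmt=True)
    if if_modified_since is not None:
        cached_time = parsedate_to_datetime(if_modified_since).timestamp()
        # HTTP dates have one-second resolution, so compare whole seconds.
        if int(page_mtime) <= cached_time:
            # Nothing changed: the browser may reuse its cached copy.
            return 304, {"Last-Modified": last_modified}
    # First visit, or the page changed: send the full page again.
    return 200, {"Last-Modified": last_modified}


# Step 1: first visit, the browser stores the page and its Last-Modified value.
uploaded_at = 1175522400  # hypothetical upload time
status, headers = respond(uploaded_at)
print(status)

# Step 3 without step 2: page unchanged, revalidation yields "not modified".
print(respond(uploaded_at, headers["Last-Modified"])[0])

# Step 3 after step 2: page X was updated an hour later, so it is sent again.
print(respond(uploaded_at + 3600, headers["Last-Modified"])[0])
```

The stale page only appears when the browser never sends the conditional request at all, which is why the freshness information sent by the server matters.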

04-04-2007, 02:34 AM
don't know about url usability guidelines... absolutely sure however that it has no negative effect whatsoever on search engine rankings (at least the important ones)

of course at the end of the day it is a hack...but it works

04-04-2007, 04:47 PM
Why so sure? I'm telling you it does. Consider this simplistic scenario...

1. Blogger comes to your website. Finds www.site.com/content.html?random=1
2. Blogger likes content. Blogs about it, linking to it in the process.
3. Google finds blog, follows link and indexes www.site.com/content.html?random=1
4. Google thinks "nice content... let's wait to see if anyone else links to it before I consider ranking it"
5. Blogger 2 comes to your site. Finds www.site.com/content.html?random=2
6. Blogger likes content. Blogs about it, linking to it in the process.
7. Google finds blog, follows link and indexes www.site.com/content.html?random=2
8. Google thinks, "hmmm, haven't I seen this somewhere before?"
9. Google follows a link back to your homepage, then sees a link to www.site.com/content.html?random=5432673, indexes it. Etc etc...

I could go on and on but I think you see my point.
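The scenario above can be illustrated with a short sketch: the same page reached through different random query strings counts as several distinct URLs until the random parameter is removed. The helper `strip_param` is hypothetical, and the URLs are the example ones from the list with a scheme added; how a real search engine treats such links is not modelled here.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def strip_param(url, name="random"):
    """Return url with the query parameter `name` removed."""
    parts = urlsplit(url)
    kept = [(k, v) for k, v in parse_qsl(parts.query, keep_blank_values=True)
            if k != name]
    return urlunsplit(parts._replace(query=urlencode(kept)))


# Links as the bloggers and the homepage would publish them.
links = [
    "http://www.site.com/content.html?random=1",
    "http://www.site.com/content.html?random=2",
    "http://www.site.com/content.html?random=5432673",
]

# As plain strings these are three different addresses for one page.
print(len(set(links)))

# Without the random parameter they collapse to a single address.
print({strip_param(u) for u in links})
```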

04-14-2007, 03:06 PM
I found this tag on another site:


1) Do I understand correctly that this expires the document after 7 days and thus forces a re-check with the server even if there is a page in the local browser cache?

2) I presume the googlebot in that case is also scheduling a revisit after 7 days. In case there is no update of the page for a long time and the googlebot comes back week after week (perhaps months) to always find the same page as before, can this affect the site-ranking or indexing negatively?

I.e., is there a downside to specifying expiry dates for documents that could remain valid much longer?

Thanks a lot!
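As a rough illustration of what a 7-day expiry means for a cache, here is a minimal sketch. It assumes the expiry is expressed as an HTTP-style Expires date; the tag quoted in the post is not shown, so this is not necessarily what that tag does. The names `expires_header` and `is_fresh` and the fetch time are hypothetical.

```python
from datetime import datetime, timedelta, timezone
from email.utils import format_datetime, parsedate_to_datetime


def expires_header(fetched_at, lifetime=timedelta(days=7)):
    """Build an Expires value `lifetime` after the time the page was fetched."""
    return format_datetime(fetched_at + lifetime, usegmt=True)


def is_fresh(expires, now):
    """True while a cache may reuse its copy without asking the server."""
    return now < parsedate_to_datetime(expires)


# Hypothetical fetch time; must be timezone-aware UTC for usegmt=True.
fetched = datetime(2007, 4, 14, 15, 6, tzinfo=timezone.utc)
expires = expires_header(fetched)

# Six days later the cached copy is still served as-is.
print(is_fresh(expires, fetched + timedelta(days=6)))

# Eight days later the cache has to check with the server again.
print(is_fresh(expires, fetched + timedelta(days=8)))
```

Under this reading, the downside of a long expiry is that an update made on day 2 may not be seen until day 7, while a short expiry only costs extra checks with the server.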