It's surprisingly difficult to tell with certainty when a web page was written. There are some clues we can gather that might help - a little.

How do I find out what date a website or any thing on Google is written. Many times I look at Google to find websites but can never find out when a particular website is written.

In an absolute sense: you don't. Surprising as it might seem, that kind of information actually doesn't exist. There's no place, no standard, no way to absolutely, positively say that this web page was written on this or that date.

However.

While not absolute, and not 100% reliable, there are clues.

Let's look at what some of those are.

Definitions

First, I do need to be clear that Google has nothing, really, to do with this. It's just a way for you to find websites and web pages, and doesn't really factor into the "when was it written" question. So I'm explicitly not going to be talking about Google - or any search engine - at all.

Second, what we really care about here are not web sites, but web pages. A site may itself have a creation date, but in reality what we typically care about is the recency of a particular web page we're looking at.

The Best Source: The Page Itself

As silly as it sounds, the most authoritative source for the date of a web page is the web page itself. By this I mean that many pages include an "updated" or "posted" date somewhere on the page. Even here on Ask Leo! you'll see dates on all the articles:

[Image: Dates on Pages]

In this example you'll see two: I place the most current "posted" date near the bottom of the article, below the related links.

The second is a reflection of the fact that the article I chose as my example had a major rewrite. So I choose to place the date of the rewrite as the posted date (highlighted here in the lower, red oval), and then include an explanatory statement about the rewrite, including its original date (in the upper, green oval).

There are, unfortunately, a bunch of problems with this:

  • I could lie. There's nothing that forces these dates to be accurate.

  • There's no standard location. Look above or below the main content of web pages, or perhaps in the web page footer, for the most common locations.

  • They may not be there at all.

But in general, if the site takes the time to post a date with their content, that's where I'd head.
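If you'd rather not hunt around by eye, the search for a visible date can be roughly automated. What follows is only a sketch under my own assumptions: it recognizes just one date style (the "July 8, 2010" form), and the names TextCollector and visible_dates are made up for illustration. It finds date-like text on a page; it cannot tell you which date, if any, is the "posted" date, or whether it's honest.

```python
import re
from html.parser import HTMLParser

MONTHS = ("January|February|March|April|May|June|July|August|"
          "September|October|November|December")
# Matches dates written like "July 8, 2010"; other styles are not covered.
DATE_PATTERN = re.compile(r"\b(?:%s) \d{1,2}, \d{4}\b" % MONTHS)


class TextCollector(HTMLParser):
    """Collect the text of a page, skipping script and style blocks."""

    def __init__(self):
        super().__init__()
        self.chunks = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth:
            self.chunks.append(data)


def visible_dates(html):
    """Return every "Month D, YYYY" string found in the page text, in order."""
    collector = TextCollector()
    collector.feed(html)
    collector.close()
    return DATE_PATTERN.findall(" ".join(collector.chunks))
```

Hand visible_dates the HTML of a page and you get back a list of candidate dates. A page with several of them (comment timestamps, copyright lines, and so on) still needs a human to decide which one matters.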

HTTP Headers

What most people want is some kind of magic date information that's somehow within the web page information - it must be there, we just can't see it.

Well, there is always a date returned, but it's not the date we want. When a server answers an HTTP request for a web page, the response includes a "Date" header recording when that response was generated. This is not the date of the page; it's (roughly) the date you requested the page.

Not helpful.

There is sometimes an additional date returned in a header called "Last-Modified", which is intended to reflect the date that the page being requested was last altered.

Once again, there are several problems with this approach:

  • It's not required. In fact, in researching this issue, I note that my server does not send Last-Modified information when you fetch a page.

  • There's no standard as to exactly what it means.

  • Typically it means the last date (and time) that the file you're accessing was altered - but that can often have no relationship to when the content it contains was written. For example, on my site a page is "altered" every time someone leaves a comment as the page is updated to contain the new comment. That is completely unrelated to when the article itself was written.

  • It could lie.

So the closest technological resource we have is woefully inadequate.
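For the curious, here is a minimal sketch of how you might look at those two headers yourself using Python's standard library. The function names header_dates and fetch_header_dates are my own inventions for illustration, and the sketch assumes the server answers a HEAD request the same way it would answer a normal page request, which not every server does.

```python
from email.utils import parsedate_to_datetime
from urllib.request import Request, urlopen


def header_dates(headers):
    """Pull the Date and Last-Modified values out of a mapping of HTTP headers.

    Returns a dict with two keys; a value is None when the header is
    missing or cannot be parsed as an HTTP date.
    """
    found = {}
    for name in ("Date", "Last-Modified"):
        raw = headers.get(name)
        try:
            found[name] = parsedate_to_datetime(raw) if raw else None
        except (TypeError, ValueError):
            # Different Python versions raise different errors on bad dates.
            found[name] = None
    return found


def fetch_header_dates(url):
    """Send a HEAD request and report the dates the server chose to send."""
    request = Request(url, method="HEAD")
    with urlopen(request, timeout=10) as response:
        return header_dates(response.headers)
```

Calling fetch_header_dates on a page's address returns both values. Expect "Date" to be within seconds of right now, and expect "Last-Modified" to be missing quite often. When it is present, all of the caveats in the list above still apply.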

(If you want to see the headers that are returned with web page requests, Firefox has several addons that will do so. My favorite is Firebug - actually a fairly extensive web debugging addon. Another might be "Live Headers", an addon that simply displays headers. I'll warn you that you'll be heading into some fairly geeky territory, though.)

History

OK, I lied: I'll mention Google one more time.

Google, and all other search engines like it, could track historical changes to pages as they periodically spider the internet. If one week a page appears, and the next week it changes, it seems like search engine spiders could track this activity.

To the best of my knowledge they do not. Or if they do, they don't typically make the information available.

The Internet Archive, on the other hand, does exactly that.

Using the Internet Archive's "Wayback Machine", I can actually view web pages "as they were" at some point in the past. As long as the Internet Archive had spidered and captured that web page on that date in the past, that is.

Sadly, the Internet Archive also has some serious limitations:

  • It's spotty - even within a site not all pages on that site may be included.

  • It's spotty - not all sites are included. In fact, webmasters can actually request that they not be included.

  • It's spotty - not all dates are included. The Internet Archive's spider checks "periodically" at what appears to be a rate of every few weeks. Changes occurring faster than that are not captured.

  • It may not be current. They state that it may take up to 6 months for pages to appear. The Ask Leo! home page, for example, is current on Internet Archive only through June of 2008.

But even with all those limitations, it can be a useful piece of data for researching when, approximately, a web page changed.

If it's in the archive, of course.
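The same lookup can be done from a script. The sketch below relies on the Internet Archive's public "availability" service as I understand it: you pass a page address and a target date, and it answers with the capture closest to that date. The endpoint address and the shape of the JSON response are my assumptions about that service, not something covered above, and the function names are made up for illustration.

```python
import json
from datetime import datetime
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumed address of the Internet Archive's availability service.
AVAILABILITY_API = "https://archive.org/wayback/available"


def availability_url(page_url, near="19960101"):
    """Build a query asking for the capture closest to `near` (YYYYMMDD)."""
    return AVAILABILITY_API + "?" + urlencode({"url": page_url, "timestamp": near})


def closest_capture(api_response):
    """Return (capture_time, snapshot_url) from a decoded response, or None."""
    closest = api_response.get("archived_snapshots", {}).get("closest")
    if not closest or not closest.get("available"):
        return None
    captured = datetime.strptime(closest["timestamp"], "%Y%m%d%H%M%S")
    return captured, closest["url"]


def lookup(page_url, near="19960101"):
    """Query the service over the network and return the closest capture, if any."""
    with urlopen(availability_url(page_url, near), timeout=10) as response:
        return closest_capture(json.load(response))
```

Asking for a capture near a very early date, as the default does, tends to surface one of the oldest captures on file. That only tells you the page existed no later than that capture; it says nothing about how long the page had been around before the spider first noticed it.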

The reason archiving of this sort is so challenging is simply the sheer quantity of data involved. An ideal archive would keep an entire copy of the entire world wide web every so often. That's more data than can be reasonably managed.

Apparently It's Timeless

Combining those approaches can often get you interesting information, but as we've seen: each approach has some serious limitations.

In the end the answer remains no. There's no definitive way to determine when a web page - or its content - was created.

Article C4362 - July 8, 2010

Leo A. Notenboom has been playing with computers since he was required to take a programming class in 1976. An 18 year career as a programmer at Microsoft soon followed. After "retiring" in 2001, Leo started Ask Leo! in 2003 as a place for answers to common computer and technical questions.

9 Comments
Irving Stein
July 13, 2010 9:40 AM

this may work:


javascript:alert(document.lastModified)

Just copy and paste the line above in your address bar and
hit your ENTER key - and you'll know the date and time the page
you're viewing was last updated!
Please comment on this
Irv

Which, as stated in the article, may well have nothing to do with what was written on the page. Pages get "updated" for random things and reasons. Example: some sites update the copyright notice every January 1 - so then the last modified changes even though the content did not. That's just an example; there are many reasons that pages could be "modified" without content changing.
Leo
14-Jul-2010

Francisco Torres
July 13, 2010 10:05 AM

It is possible to detect page age with a simple google hack. View: http://www.labnol.org/internet/search/find-publishing-date-of-web-pages/8410/
also there is an installable tool that does it
http://www.linkdiagnosis.com/

That technique is frequently inaccurate. It's off by quite a bit on some of my pages. Like everything else I covered in the article it's data, just don't take it as absolute truth.
Leo
17-Jul-2010

James
July 13, 2010 12:50 PM

Right, Irving, that was the first thing that popped into my head. But, what you have to remember is if that webpage is generated differently each time, it will give you a time that doesn't seem right. For example, try that line on www.google.com . It will probably return a time a few seconds before you checked.

Mike
July 13, 2010 3:00 PM

While knowing when a page was written may be important, sometimes the date you read it is just as critical. Specifically, citations for papers and articles often call for an article's retrieval or access date more often than the publication date. But sometimes both, if they're available.

Gabe
July 14, 2010 6:05 AM

Thanks for dating your articles, Leo. I consider it an integral part of a professional article. It's always very frustrating when you think you're reading something very current until it references a "current event" that happened many years ago. I've even seen this on some news sites.

TuneUp
July 16, 2010 10:26 AM

Leo thanks for posting this question—I am always looking for recent articles to refer to in my blog and finding the date can be frustrating. I agree that the page is the best source; the date is often hidden in the footer. Irving, I tried your tip but it didn’t work for me, do you put it after the site’s web address?

DuLe
August 8, 2010 4:35 PM

Wow, this is a question similar to one I asked Leo many moons ago. Here’s why I made my original inquiry....

I use Google a lot. Seems to me that as the Internet expands exponentially so do the old/useless search results -- such as dead websites, expired coupons, bad links, and just really old information in general.

Here’s just one lame example of a thousand I could cite: I want to read a review of a concert held last night. I Google Artist A performing at Venue B. In the search results I get a review of every concert Artist A has held at Venue B. I don’t want a review of a concert ten years ago; I want last night’s! Further, putting a date into the search criteria rarely helps because dates are not in/on web pages.

Usually, whether searching for concert reviews, medical advice, attempting to buy electronics online, or whatever else one searches for, finding the most recent information is nearly impossible. And, as I mentioned, the bigger the Internet grows the larger I see this as a real issue. (If not a “real” issue at least it’s like searching for a needle in an ever-growing haystack.)

I know nothing of web design but, seems to me, if there was a universal requirement that every web page had its creation date -- either embedded or somehow expressed -- a search engine could read that date information and could then show the most recent results first.

Am I, technically, way off base -- or just a dreamer?

I'll choose dreamer.

I agree it would be nice, but as you can see from the article there are many interpretations of which date would apply, and even then various sites would "game the system" and provide incorrect dates so as to get otherwise undeserved attention.

Anyway, regardless of what it could be, it is what it is.
Leo
09-Aug-2010

Coly Moore
July 18, 2012 6:43 PM

I completely missed the date at the end of each of your articles. Well done! I do wish there were more like you (but of course Ask Leo is inimitable!).

span
April 16, 2013 4:09 PM

This will work in most cases - On the page in question, type this in the address bar..

Javascript:document.lastModified

Hit RETURN and look in upper left of screen

To get back to the page, click Refresh or Reload, whichever per your browser. BACK doesn't do it.

If the date is current, the time-stamp may be off due to time zone's origin.

Comments on this entry are closed.
