Websites can do a MUCH better job of helping visitors find what they want; the 404 'not found' handling built into most web servers is one powerful way to do it.

404

Not Found

The requested URL was not found on this server.

Those are words that no internet user ever wants to see.

So why are websites so quick to show that message for even the smallest error?

Yes, the requested page wasn't found. Fine. Stuff happens.

But did your site even try to see if it could do something better than give up? Did it even try to see if the user's intent could be inferred from what was given?

Did the site even try to give the user what they wanted anyway?

404: The page of last resort

Having cute or funny "404 - page not found" pages has become one popular way to try to make light of something that is nothing more than an annoyance to your visitor. They wanted something, they clicked a link to your site, they expected something, and your site didn't give it to them.

Fail.

Throwing up a "Sorry, I can't find that page" message should be your site's absolute last resort.

"So, if the incoming URL isn't correct," I hear you asking, "how is my site supposed to figure out what the visitor intended?"

These are computers, for heaven's sake. Make them do a little work before they give up.

Start with Webmaster Tools

169 references to pages that don't exist

Google's Webmaster Tools for your site is your starting point. That's where you'll find a lengthy list of pages elsewhere on the internet that contain links back to your site - links that don't work.

Make those URLs work.

Find out what people are linking to and then, even if they're doing it incorrectly, make those links work.

Yes, it's probably their fault for using an incorrect URL. And yes, in an ideal world, you'd push back and have them fix all of those links to your site.

Ain't gonna happen. Out of your control. Blame them if you like, but that doesn't help the people who are clicking on those links to get to your site.

However, what you do have control over is your own site; it has the ability to serve your visitors better by tolerating as much broken cr*p as it can and getting more visitors to where they intended, even if the link that they're using is wrong.

Use your web server's '404 handler'

404 handlers are one of the internet's greatest untapped resources to improve user experience.

Most web servers let you supply your own page to display when a requested page isn't found; this is often referred to as the "404 handler". This site, Ask Leo!, is running on an Apache web server and, in its configuration, it has the following line:

ErrorDocument 404 /404resolver.php

If all that your 404 page does is display an error, then it's not "handling" squat.

It can do more; much more.

  • It can fix common mistakes or patterns in incoming broken links:

    Example: For some reason, I find a number of incoming links to Ask Leo! with a space immediately prior to the ".html". I strip the space and send the visitor to the intended page.

    Ask Leo! currently has over a dozen identified patterns that it checks to try and send visitors to the pages that they intended via a 301 redirect.

  • It can handle one-off errors:

    Example: Someone out there has a link to "http://ask-leo.com/w...or_backups.html" - with the actual ellipses in the link - which is not a page on my site. There's only one page that starts with "w" and ends with "or_backups.html", so visitors are redirected there transparently.

    Ask Leo! has over 100 of these special "one-off" redirections for broken incoming links.

  • It can even handle URLs broken by email programs that wrap lengthy URLs and break them:

    Examples: Visit http://ask-leo.com/who_is - you'll be taken to the only page on the site that begins with "who_is". Even ambiguity can be handled - visit http://ask-leo.com/who for a list of possible pages to choose from.

    Isn't that better than "Sorry, we can't find it"?
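As a rough illustration of the first two ideas (this is not the code that Ask Leo! runs), here is a minimal Python sketch of pattern fixes plus a one-off lookup table. Everything named here is hypothetical: `PATTERN_FIXES`, `ONE_OFFS`, `fix_path`, `known_pages`, and the example paths. The truncated-extension rule is an assumed example of a pattern, not one confirmed by the article.

```python
import re

# Hypothetical pattern fixes: (compiled regex, replacement).
# The first rule mirrors the "space before .html" case described above;
# the second (a truncated extension) is an assumed example.
PATTERN_FIXES = [
    (re.compile(r"(?:\s|%20)+\.html$"), ".html"),  # "page .html" -> "page.html"
    (re.compile(r"\.h(?:tm)?$"), ".html"),         # "page.h" / "page.htm" -> "page.html"
]

# Hypothetical one-off table: exact broken path -> intended path.
ONE_OFFS = {
    "/old_name.html": "/new_name.html",
}


def fix_path(path, known_pages):
    """Return a corrected path, or None if no rule applies.

    One-off entries are trusted as written; a pattern fix is accepted
    only when the corrected path is actually in known_pages.
    """
    if path in ONE_OFFS:
        return ONE_OFFS[path]
    for pattern, replacement in PATTERN_FIXES:
        candidate = pattern.sub(replacement, path)
        if candidate != path and candidate in known_pages:
            return candidate
    return None
```

A handler built this way would issue a 301 redirect to whatever `fix_path` returns and fall through to its next strategy (or, as a last resort, the "not found" page) when it returns `None`.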

Put just a little effort into the software that you put in place to deal with "not found" errors and your site will have a true 404 handler that does whatever it can to actually get your visitor to the page that they intended.
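For truncated URLs, one possible approach is a simple prefix match against the list of pages that exist. The Python sketch below is an illustration only, under stated assumptions: the function name `resolve_prefix`, the `min_length` guard, the case-insensitive comparison, and the page names used with it are all hypothetical and not taken from this site's handler.

```python
def resolve_prefix(requested, known_pages, min_length=3):
    """Match a possibly truncated path against known pages by prefix.

    Returns ("redirect", page) for exactly one match,
    ("choices", [pages]) when several pages match, and
    ("not_found", None) when nothing matches or the prefix is too short.
    """
    prefix = requested.lower()
    # Assumed guard: a very short prefix such as "/" would match every page.
    if len(prefix) < min_length:
        return "not_found", None
    matches = sorted(p for p in known_pages if p.lower().startswith(prefix))
    if len(matches) == 1:
        return "redirect", matches[0]
    if matches:
        return "choices", matches
    return "not_found", None
```

With hypothetical pages `/who_is_leo.html` and `/who_owns_this.html`, a request for `/who_is` would come back as a single redirect target, while `/who` would come back as a list of both pages for the visitor to choose from.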

It's work. It's programming.

Of course it is. It requires someone with a little bit of web server knowledge and some basic web programming skills to put a better infrastructure for handling your 404 errors into place.

After that, it gets pretty routine. Check the webmaster tools every so often for newly created incoming-yet-broken links out on the internet - and add the information to handle 'em. Keep an eye out for patterns and push those back to the programmer as needed.
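Spotting patterns in a list of broken links can itself be partly automated. The Python sketch below is one hedged way to triage such a list; it assumes that you've already extracted the broken paths into a plain list of strings and that real pages end in ".html". The labels and the function names `classify` and `summarize` are hypothetical.

```python
from collections import Counter


def classify(path):
    """Return a rough label for how a broken path appears to be damaged."""
    if " " in path or "%20" in path:
        return "contains space"
    if "..." in path:
        return "contains ellipsis"
    if not path.endswith(".html"):
        return "missing or truncated extension"
    return "other"


def summarize(broken_paths):
    """Count broken paths per label, most common label first."""
    return Counter(classify(p) for p in broken_paths).most_common()
```

A label that keeps showing up is a candidate for a general rule; anything that lands in "other" is more likely to need a one-off entry.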

It is programming.

But it's not rocket science.

Is it worth it?

Obviously, I believe that it's definitely worth the effort.

I've put a lot of effort into this site's 404 handler with the express intent of getting people to the answers that they're looking for, even if their email program broke a link, someone posted a mangled link in a newsgroup, or who-knows-what happened to make that link less than perfect.

Why?

In a word: reputation.

User reputation and search engine reputation.

Let's face it. You will get broken incoming links, and those links will be out of your control. Users will enter a typo or otherwise screw things up, sites on which links are posted will screw things up, or mail programs will screw things up. Being tolerant of those kinds of screw-ups makes your site a more robust, valuable, and accessible resource.

One broken link in a forum post, for example, either tells post readers that your site is broken when you return a "Not Found" ... or it gets you traffic. Make no mistake - visitors will blame your site even if it's the incoming link that's broken. They don't know or care about broken links and they shouldn't have to.

I also have to believe that Google and other search engines notice sites that proactively make that Not Found list on Webmaster Tools shrink rather than continually grow.

It's an indicator that the site owner is working on their site quality.

It's also an indicator that the site owner is trying to help site visitors get to the content that they're actually looking for.

That's a reputation worth aspiring to.

It's one measure of quality that you actually have in your control. Why not use it?

PS: I'm not in a position to share actual code for all of the above. A few years ago, however, I did write up the basics along with some PHP snippets in an article, "Tolerate Broken URLs", on my old MT Tips site.

Article C4815 - May 9, 2011

Leo A. Notenboom has been playing with computers since he was required to take a programming class in 1976. An 18-year career as a programmer at Microsoft soon followed. After "retiring" in 2001, Leo started Ask Leo! in 2003 as a place for answers to common computer and technical questions.

10 Comments
Michael Horowitz
May 16, 2011 4:27 PM

Listing all the pages that start with "who" is very impressive. Care to share, conceptually, how you do this? I'm guessing you ask PHP to list all the files in a folder, put the results into an array and parse through the array.

I suspect you also adjust for case errors (assuming your site is running on Linux/Unix).

I noticed you adjusted a .h suffix to the correct .html. Excellent :-)

Actually you just listed the conceptual algorithm - for sites that use static HTML, just look for files whose names begin with what was given. If there's one, 301 redirect to it; if there's more than one, present the list.
Leo
17-May-2011

Snert
May 17, 2011 8:46 AM

Highly informative. And I thank you yet again for simple, concise information.
I don't have a web site but now I understand what happens when I encounter a 404.

Benmara
May 17, 2011 9:29 AM

Loved this article
I wish I knew how to "PHP", lol, but I can barely .html!
So I did the next best thing....I copied the error pages (404, 405, etc) to my pc and revised them to look just like my site pages and added my menu.
I still get mistaken links to these pages but now people can "menu" to what they actually want. Not as good as your way, but what is a 50 yo PC idiot to do? LOL

Benmara

Actually that, alone, is a HUGE step in the right direction. Putting up a page that maintains your site's style, and gives the user options to navigate or search your site is so much better than "sorry, can't find it".
Leo
17-May-2011

Tom R.
May 17, 2011 9:47 AM

The worst situation is when I click on an internal link in a website and get a 404. If you can't get your own links to work, you lose. I'm outta there and I ain't coming back!

I agree, to a point. After maintaining this site now for nearly 8 years I can see how easy it is for links - even internal ones - to go stale and break as things change on the site.
Leo
17-May-2011

Ron
May 17, 2011 9:57 AM

Thanks for the great service, and advice.

Sure it requires a little more programming effort, but that is what we are paid the big bucks to do. We do the programming once, and our users benefit "forever". Good trade. This idea is a prime example of using computers to do the repetitive, "mechanical", things they do best, rather than asking people to do them.

PS: I like the link to the "Tolerate Broken URLs" page, but ... near the middle of the page it says there is a link to download full code, I don't see it. Please either add the link or remove the "teaser" text.

Suggestion: I like that this page includes a message about Javascript. However, please consider moving that message up to the "Post a Comment ..." line. That way I can enable Javascript BEFORE typing all of my comment and potentially losing it (as happens all too often on many sites) when I do eventually notice the message and enable Java.

Suggestion 2: please tell us specifically which site(s) to enable for comments. I've got Firefox NoScript, so I enable selectively rather than all sites. I don't want to enable all the ad sites too.

Great suggestions all around, thank you. You should see the Javascript message above the comment block as well, now, and referencing the domain you need to enable (ask-leo.com). Of course since you already posted a comment, you figured that out :-). Thanks.
Leo
17-May-2011

Glenn P.
May 17, 2011 12:36 PM

Hey, Leo -- dude! -- You actually don't mind that Benmara has just ripped off your HTM/PHP code and violated your copyright...?!? (Wow! I'm way impressed.)

In that case, how about putting up your code in a more public & more user-friendly location -- and/or releasing it under the GPL?

He certainly didn't steal any PHP - that shouldn't be accessible to anyone but me. As for releasing it - no. The problem is that the techniques are simple in concept, but the details vary dramatically based on how the website is implemented. I've released some code snippets on the MT Tips site mentioned, but that's about all that makes sense: example snippets.
Leo
17-May-2011

YK
May 17, 2011 1:31 PM

If you are using Apache (without major rewrite rules), have a look at mod_speling.
Otherwise, I sometimes use regular expressions over a list of file-names (generated by directory scanning) and return the matches. (You might want to consider a caching extension like APC to avoid thrashing your hard disks.)

Richard Deem
May 17, 2011 2:16 PM

Google has some JS code that attempts to find the correct page by guessing the URL. It does pretty well if the spelling is similar. No programming required! I tried to put in the code, but your comment system ate it. Here is the URL:

Creating useful 404 pages (Google)

Benmara
May 18, 2011 11:46 AM

@Posted by: Ron at May 17, 2011 9:57 AM

Hey Ron DUDE "I copied the error pages (404, 405, etc) to my pc and revised them to look just like my site pages and added my menu"

from MY web site provided by MY hosting service (which I pay for)...reading comprehension is obviously your strong suit...I TOO am impressed!

Benmara


JustInspired
May 18, 2011 2:49 PM

I just can't believe some of the websites for 'big' companies that I've come across don't even have a 404 page! You just get the default browser 404 page. At the very least, give people a custom 404 with a menu to help them find what they want...

Comments on this entry are closed.
