Websites can do a MUCH better job of helping visitors find what they want; the 404 'not found' capabilities of most web servers are one powerful way.
404
Not Found
The requested URL was not found on this server.
Those are words that no internet user ever wants to see.
So why are websites so quick to show that message for even the smallest error?
Yes, the requested page wasn't found. Fine. Stuff happens.
But did your site even try to see if it could do something better than give up? Did it even try to see if the user's intent could be inferred from what was given?
Did the site even try to give the user what they wanted anyway?
•
Having cute or funny "404 - page not found" pages has become one popular way to try to make light of something that is nothing more than an annoyance to your visitor. They wanted something, they clicked a link to your site, they expected something, and your site didn't give it to them.
Fail.
Throwing up a "Sorry, I can't find that page" message should be your site's absolute last resort.
"So, if the incoming URL isn't correct," I hear you asking, "how is my site supposed to figure out what the visitor intended?"
These are computers, for heaven's sake. Make them do a little work before they give up.

Google's Webmaster Tools for your site is your starting point. That's where you'll find a lengthy list of URLs for other pages out on the internet. On those pages are links back to your site - links that don't work.
Make those URLs work.
Find out what people are linking to and then, even if they're doing it incorrectly, make those links work.
Yes, it's probably their fault for using an incorrect URL. And yes, in an ideal world, you'd push back and have them fix all of those links to your site.
Ain't gonna happen. Out of your control. Blame them if you like, but that doesn't help the people who are clicking on those links to get to your site.
However, what you do have control over is your own site; it has the ability to serve your visitors better by tolerating as much broken cr*p as it can and getting more visitors to where they intended, even if the link that they're using is wrong.
404 handlers are one of the internet's greatest untapped resources to improve user experience.
Most web servers let you supply your own page to be displayed when a page isn't found. This is often referred to as the "404 handler". This site, Ask Leo!, is running on an Apache web server and, in its configuration, it has the following line:
If all that your 404 page does is display an error, then it's not "handling" squat.
It can do more; much more.
It can fix common mistakes or patterns in incoming broken links:
Example: For some reason, I find a number of incoming links to Ask Leo! with a space immediately prior to the ".html". I strip the space and send the visitor to the intended page.
Ask Leo! currently has over a dozen identified patterns that it checks to try to send visitors to the pages that they intended via a 301 redirect.
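To make the pattern idea concrete, here's a minimal sketch in Python. It is not the code that Ask Leo! runs; the page names, the second repair rule, and the function name are all assumptions made up for illustration. Only the "space before .html" rule comes from the example above.

```python
import re
from typing import Optional

# Hypothetical set of pages that really exist on the site.
KNOWN_PAGES = {"/sample_article.html", "/another_article.html"}

# Each rule is (pattern, replacement). The first mirrors the
# "space before .html" example; the second is an assumed extra rule.
REPAIR_RULES = [
    (re.compile(r"(?:\s|%20)+\.html$"), ".html"),  # space (raw or URL-encoded) before ".html"
    (re.compile(r"[.,;:)>\"']+$"), ""),            # punctuation glued onto the end of the link
]

def repair_path(path: str) -> Optional[str]:
    """Return a corrected path if a known pattern fixes it, else None."""
    candidate = path
    for pattern, replacement in REPAIR_RULES:
        candidate = pattern.sub(replacement, candidate)
        # Only accept a repair that lands on a page that really exists.
        if candidate != path and candidate in KNOWN_PAGES:
            return candidate
    return None
```

If `repair_path` returns a path, the handler would answer with a 301 redirect to it; if it returns `None`, the handler moves on to its next trick.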
It can handle one-off errors:
Example: Someone out there has a link to "http://ask-leo.com/w...or_backups.html" - with the actual ellipsis in the link - which is not a page on my site. There's only one page that starts with "w" and ends with "or_backups.html", so visitors are redirected there transparently.
Ask Leo! has over 100 of these special "one-off" redirections for broken incoming links.
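One-off fixes like these don't need anything clever; a lookup table will do. The sketch below is a Python illustration under stated assumptions, not the site's actual code: the target page names are placeholders, and using a 301 for one-offs is my assumption here.

```python
from typing import Optional, Tuple

# Broken incoming path -> page it should have pointed to.
# Both targets are hypothetical placeholders, not real Ask Leo! pages.
ONE_OFF_REDIRECTS = {
    "/w...or_backups.html": "/placeholder_backups_page.html",
    "/old_name.html": "/new_name.html",
}

def one_off_redirect(path: str) -> Optional[Tuple[int, str]]:
    """Return (301, target) for a known broken link, or None if it isn't listed."""
    target = ONE_OFF_REDIRECTS.get(path)
    if target is None:
        return None
    return 301, target
```

Each time a new broken link turns up in the webmaster tools, handling it means adding one more line to the table.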
It can even handle URLs cut short by email programs that wrap lengthy URLs:
Examples: Visit http://ask-leo.com/who_is - you'll be taken to the only page on the site that begins with "who_is". Even ambiguity can be handled - visit http://ask-leo.com/who for a list of possible pages to choose from.
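The logic behind that behavior can be sketched as simple prefix matching. This is a Python illustration only; the page list and function names are hypothetical, and the real handler may well work differently.

```python
from typing import List, Tuple

# Hypothetical list of page file names on the site.
SITE_PAGES = [
    "who_is_example.html",
    "who_made_this.html",
    "why_backups_matter.html",
]

def pages_starting_with(fragment: str, pages: List[str]) -> List[str]:
    """Return every page whose name begins with the truncated fragment."""
    prefix = fragment.lstrip("/")
    if not prefix:
        return []  # an empty fragment would match everything, so match nothing
    return sorted(page for page in pages if page.startswith(prefix))

def resolve_truncated(fragment: str, pages: List[str] = SITE_PAGES) -> Tuple[str, List[str]]:
    """Decide what to do with a URL that looks cut off."""
    matches = pages_starting_with(fragment, pages)
    if len(matches) == 1:
        return "redirect", matches   # exactly one candidate: send the visitor there
    if matches:
        return "choose", matches     # ambiguous: show the list and let the visitor pick
    return "not_found", []           # nothing fits: fall through to the real 404
```

With this sample list, "/who_is" has a single match and becomes a redirect, while "/who" matches two pages and produces a list to choose from.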
Isn't that better than "Sorry, we can't find it"?
Put just a little effort into the software that you put in place to deal with "not found" errors and your site will have a true 404 handler that does whatever it can to actually get your visitor to the page that they intended.
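Structurally, a handler like that boils down to "try each repair in turn, and only give up when all of them fail." Here's a hedged Python sketch of that shape. The two resolver functions are simplified stand-ins invented for this example, and a real handler would also confirm that the repaired page exists before redirecting.

```python
from typing import Callable, Iterable, Optional, Tuple

# A resolver takes the requested path and returns a better one, or None.
Resolver = Callable[[str], Optional[str]]

def handle_not_found(path: str, resolvers: Iterable[Resolver]) -> Tuple[int, Optional[str]]:
    """Try each resolver in order; the 404 is the absolute last resort."""
    for resolve in resolvers:
        target = resolve(path)
        if target is not None:
            return 301, target
    return 404, None

# Simplified stand-in resolvers (assumptions for illustration).
def strip_space_before_extension(path: str) -> Optional[str]:
    fixed = path.replace(" .html", ".html")
    return fixed if fixed != path else None

def known_one_off(path: str) -> Optional[str]:
    return {"/old_name.html": "/new_name.html"}.get(path)

RESOLVERS = [strip_space_before_extension, known_one_off]
```

Adding a new trick later means writing one more resolver and appending it to the list; the rest of the handler stays the same.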
Of course, it is work. It requires someone with a little bit of web server knowledge and some basic web programming skills to put a better infrastructure for handling your 404 errors into place.
After that, it gets pretty routine. Check the webmaster tools every so often for newly created incoming-yet-broken links out on the internet - and add the information to handle 'em. Keep an eye out for patterns and push those back to the programmer as needed.
It is programming.
But it's not rocket science.
Obviously, I believe that it's definitely worth the effort.
I've put a lot of effort into this site's 404 handler with the express intent of getting people to the answers that they're looking for, even if their email program broke a link, someone posted a mangled link in a newsgroup, or who-knows-what happened to make that link less than perfect.
Why?
In a word: reputation.
User reputation and search engine reputation.
Let's face it. You will get broken incoming links and those links will be out of your control. Users will enter a typo or otherwise screw things up, sites on which links are posted will screw things up, or mail programs will screw things up. Being tolerant of those kinds of screw-ups makes your site a more robust, valuable, and accessible resource.
One broken link in a forum post, for example, either tells post readers that your site is broken when you return a "Not Found" ... or it gets you traffic. Make no mistake - visitors will blame your site even if it's the incoming link that's broken. They don't know or care about broken links and they shouldn't have to.
I also have to believe that Google and other search engines notice sites that proactively make that Not Found list on Webmaster Tools shrink rather than continually grow.
It's an indicator that the site owner is working on their site quality.
It's also an indicator that the site owner is trying to help site visitors get to the content that they're actually looking for.
That's a reputation worth aspiring to.
It's one measure of quality that you actually have in your control. Why not use it?
•
PS: I'm not in a position to share actual code for all of the above. A few years ago, however, I did write up the basics along with some PHP snippets in an article, "Tolerate Broken URLs", on my old MT Tips site.
Article C4815 - May 9, 2011
May 17, 2011 12:36 PM
Hey, Leo -- dude! -- You actually don't mind that Benmara has just ripped off your HTM/PHP code and violated your copyright...?!? (Wow! I'm way impressed.)
In that case, how about putting up your code in a more public & more user-friendly location -- and/or releasing it under the GPL?
May 17, 2011 1:31 PM
If you are using Apache (without major rewrite rules), have a look at mod_speling.
Otherwise, I sometimes use regular expressions over a list of file-names (generated by directory scanning) and return the matches. (You might want to consider a caching extension like APC to avoid thrashing your hard disks.)
May 17, 2011 2:16 PM
Google has some JS code that attempts to find the correct page by guessing the URL. It does pretty well if the spelling is similar. No programming required! I tried to put in the code, but your comment system ate it. Here is the URL:
Creating useful 404 pages (Google)
May 18, 2011 11:46 AM
@Posted by: Ron at May 17, 2011 9:57 AM
Hey Ron DUDE "I copied the error pages (404, 405, etc) to my pc and revised them to look just like my site pages and added my menu" from MY web site provided by MY hosting service (which I pay for)...reading comprehension is obviously your strong suit...I TOO am impressed!
Benmara
May 18, 2011 2:49 PM
I just can't believe some of the websites for 'big' companies that I've come across don't even have a 404 page! You just get the default browser 404 page. At the very least, give people a custom 404 with a menu to help them find what they want...