Websites can do a MUCH better job of helping visitors find what they want; the 404 'not found' capability of most web servers is one powerful way to do it.
The requested URL was not found on this server.
Those are words that no internet user ever wants to see.
So why are websites so quick to show that message for even the smallest error?
Yes, the requested page wasn't found. Fine. Stuff happens.
But did your site even try to see if it could do something better than give up? Did it even try to see if the user's intent could be inferred from what was given?
Did the site even try to give the user what they wanted anyway?
Having cute or funny "404 - page not found" pages has become one popular way to try to make light of something that is nothing more than an annoyance to your visitor. They wanted something, they clicked a link to your site, they expected something, and your site didn't give it to them.
Throwing up a "Sorry, I can't find that page" message should be your site's absolute last resort.
"So, if the incoming URL isn't correct," I hear you asking, "how is my site supposed to figure out what the visitor intended?"
These are computers, for heaven's sake. Make them do a little work before they give up.
Google's Webmaster Tools for your site is your starting point. That's where you'll find a lengthy list of URLs to other pages out on the internet. On those pages are links back to your site - links that don't work.
Make those URLs work.
Find out what people are linking to and then, even if they're doing it incorrectly, make those links work.
Yes, it's probably their fault for using an incorrect URL. And yes, in an ideal world, you'd push back and have them fix all of those links to your site.
Ain't gonna happen. Out of your control. Blame them if you like, but that doesn't help the people who are clicking on those links to get to your site.
However, what you do have control over is your own site; it has the ability to serve your visitors better by tolerating as much broken cr*p as it can and getting more visitors to where they intended, even if the link that they're using is wrong.
404 handlers are one of the internet's greatest untapped resources to improve user experience.
Most web servers let you supply your own page to be displayed when a requested page isn't found. This is often referred to as the "404 handler". This site, Ask Leo!, is running on an Apache web server and, in its configuration, it has a line that hands every "not found" request to a handler of its own.
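In Apache, a line of that kind is typically an ErrorDocument directive. This is a minimal sketch, not the actual line from this site's configuration: the handler path /404.php is a hypothetical name chosen for illustration.

```apache
# Hand every "not found" request to a script instead of a static error page.
# /404.php is a hypothetical handler path, not the one Ask Leo! uses.
ErrorDocument 404 /404.php
```

Because the target is a script rather than a static page, it gets a chance to inspect the requested URL and decide what to do before giving up.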
If all that your 404 page does is display an error, then it's not "handling" squat.
It can do more; much more.
It can fix common mistakes or patterns in incoming broken links:
Example: For some reason, I find a number of incoming links to Ask Leo! with a space immediately prior to the ".html". I strip the space and send the visitor to the intended page.
Ask Leo! currently checks for over a dozen identified patterns and, when one matches, sends the visitor to the page that they intended via a 301 redirect.
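To make the idea concrete, here is a rough Python sketch of pattern-based repair. It is not the code that Ask Leo! runs: the function name, the rule list, and the page names are all assumptions. Only the "space before .html" rule comes from the example above; the trailing-punctuation rule is a hypothetical second pattern.

```python
import re

# Hypothetical repair rules, applied in order: (pattern, replacement).
REPAIR_RULES = [
    # Assumed rule: punctuation glued to the end of a link quoted in prose.
    (re.compile(r"[.,;:)]+$"), ""),
    # Rule from the article: a space (raw or encoded as %20) before ".html".
    (re.compile(r"(?:\s|%20)+\.html$"), ".html"),
]

def repair_path(path, known_pages):
    """Apply the rules cumulatively; return the first repaired path that
    is a known page, or None if no rule leads to one."""
    candidate = path
    for pattern, replacement in REPAIR_RULES:
        candidate = pattern.sub(replacement, candidate)
        if candidate != path and candidate in known_pages:
            return candidate  # the caller would answer with a 301 to this path
    return None

# Hypothetical page name, for illustration only.
pages = {"/some_page.html"}
print(repair_path("/some_page%20.html", pages))
print(repair_path("/missing.html", pages))
```

A repaired path is only returned when it names a page that really exists, so a rule that misfires falls through to the next check rather than redirecting the visitor somewhere wrong.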
It can handle one-off errors:
Example: Someone out there has a link to "http://ask-leo.com/w...or_backups.html" - with an actual ellipsis in the link - which is not a page on my site. There's only one page that starts with "w" and ends with "or_backups.html", so visitors are redirected there transparently.
Ask Leo! has over 100 of these special "one-off" redirections for broken incoming links.
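One-off fixes like these need nothing more than a lookup table. The Python sketch below is an illustration under assumptions, not this site's code: the table name, the function name, and every target page are hypothetical placeholders. Only the broken "w...or_backups.html" path is taken from the example above.

```python
# Hypothetical one-off table: broken incoming path -> intended page.
# The targets are placeholders, not real Ask Leo! pages.
ONE_OFF_REDIRECTS = {
    "/w...or_backups.html": "/placeholder_backups_page.html",
    "/some-mangled-link": "/placeholder_other_page.html",
}

def one_off_redirect(path):
    """Return (status, location) for a known broken link, or None."""
    target = ONE_OFF_REDIRECTS.get(path)
    if target is None:
        return None
    return (301, target)  # permanent redirect to the intended page

print(one_off_redirect("/w...or_backups.html"))
print(one_off_redirect("/never-seen-before"))
```

Keeping the table as plain data means that adding a newly discovered broken link is a one-line change, with no new logic to write.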
It can even handle URLs that have been cut short by email programs that wrap lengthy URLs:
Examples: Visit http://ask-leo.com/who_is - you'll be taken to the only page on the site that begins with "who_is". Even ambiguity can be handled - visit http://ask-leo.com/who for a list of possible pages to choose from.
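The behavior in those two examples amounts to prefix matching against the list of pages on the site. Here is a minimal Python sketch of that idea, assuming the handler has a set of known page paths available. The function name, the result labels, and the page names are hypothetical, and this is not the site's actual code.

```python
def resolve_prefix(path, known_pages):
    """Classify a truncated path by how many known pages start with it."""
    matches = sorted(p for p in known_pages if p.startswith(path))
    if len(matches) == 1:
        return ("redirect", matches[0])  # unambiguous: send the visitor there
    if matches:
        return ("choose", matches)       # ambiguous: offer a list of pages
    return ("not_found", None)           # nothing starts with this prefix

# Hypothetical page names, for illustration only.
pages = {"/who_is_example.html", "/who_owns_example.html", "/what_is_example.html"}
print(resolve_prefix("/who_is", pages))
print(resolve_prefix("/who", pages))
```

A real handler would probably want a minimum prefix length as well, so that a request for "/" alone does not produce a list of every page on the site.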
Isn't that better than "Sorry, we can't find it"?
Put just a little effort into the software that you put in place to deal with "not found" errors and your site will have a true 404 handler that does whatever it can to actually get your visitor to the page that they intended.
Of course, this takes some work. It requires someone with a little bit of web server knowledge and some basic web programming skills to put a better infrastructure for handling your 404 errors into place.
After that, it gets pretty routine. Check the webmaster tools every so often for newly created incoming-yet-broken links out on the internet - and add the information to handle 'em. Keep an eye out for patterns and push those back to the programmer as needed.
It is programming.
But it's not rocket science.
Obviously, I believe that it's definitely worth the effort.
I've put a lot of effort into this site's 404 handler with the express intent of getting people to the answers that they're looking for, even if their email program broke a link, someone posted a mangled link in a newsgroup, or who-knows-what happened to make that link less than perfect.
Why is it worth it? In a word: reputation.
User reputation and search engine reputation.
Let's face it. You will get broken incoming links, and those links will be out of your control. Users will enter a typo or otherwise screw things up, sites on which links are posted will screw things up, or mail programs will screw things up. Being tolerant of those kinds of screw-ups makes your site a more robust, valuable, and accessible resource.
One broken link in a forum post, for example, does one of two things: if you return a "Not Found", it tells the readers of that post that your site is broken; if you handle it, it gets you traffic. Make no mistake - visitors will blame your site even if it's the incoming link that's broken. They don't know or care about broken links and they shouldn't have to.
I also have to believe that Google and other search engines notice sites that proactively make that Not Found list on Webmaster Tools shrink rather than continually grow.
It's an indicator that the site owner is working on their site quality.
It's also an indicator that the site owner is trying to help site visitors get to the content that they're actually looking for.
That's a reputation worth aspiring to.
It's one measure of quality that you actually have in your control. Why not use it?
PS: I'm not in a position to share actual code for all of the above. A few years ago, however, I did write up the basics, along with some PHP snippets, in an article titled "Tolerate Broken URLs" on my old MT Tips site.