Spam is no longer confined to email. Website spam, or "comment spam", is on the rise. If you've seen it on your own site, here are some of the things you can do.
I have a personal website to help computer users that's been running for about 6 years. I have a guest book, and people have been signing it for years. Within the past year, though, I've been swamped with spammers signing my book. I get about 6 to 10 spams each day. Each morning I delete them, but it is getting worse by the day.
I tried to "hide" my guest book from the public, sacrificing the ability to have people sign it and my enjoyment in reading their entries. But even this "hidden" page keeps getting spam.
How can I prevent spammers from signing my guest book? I'd appreciate your comments and hopefully a solution to this annoying problem.
Oh, I have plenty of comments and opinions on this topic - it's a problem I face right here with Ask Leo!
But unfortunately, like spam in general, there's no single answer - no magic bullet.
Depending on your server and other specifics, there are several approaches you can take.
Web spam, also known as "blog spam" or "comment spam", is definitely on the rise. Spurred by the popularity of weblogs, or blogs, which allow people to post comments, spammers are using these forms to post links back to their own sites. The links aren't really intended to serve as advertising, per se, but rather to trick the search engines into thinking that the target site is more important than it is, because of all the incoming links.
Regardless of why, it's a mess.
There are two types of comment spam generation techniques: manual and automated. Automated tools scour the web looking for things that look like comment or guest book forms, and automatically post their bogus content to these forms. Manual techniques involve hiring cheap labor overseas to do exactly the same thing by hand.
While it started as comment spam on blogs, it's most definitely no longer limited to that. Almost any form that accepts input on the web is getting hit.
As I said, there are various tools and techniques to combat comment or web spam. Which technique might help you depends on how your form is set up, and on what type of server or publishing platform you're running.
A very common technique is to use what's called a "CAPTCHA" ("Completely Automated Public Turing test to tell Computers and Humans Apart"). You've probably seen them - they're the often-distorted characters that you're asked to re-type into a form before it will be accepted. As the name implies, it's a way to prevent automated tools from posting to your form. Unfortunately, it does nothing to stop actual humans.
If you're running on a content management system like Movable Type, WordPress or others, then CAPTCHA may already be an option - either as a built-in feature, or as a plugin for your platform. Unfortunately, creating and using a CAPTCHA test from scratch is not trivial.
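To show the general shape of the challenge-and-response idea, here's a deliberately simplified sketch in Python. It swaps the distorted image for a plain arithmetic question, which a real CAPTCHA would not do, so treat it as an illustration of the flow rather than a usable CAPTCHA. Every name in it (SECRET_KEY, make_challenge, check_answer) is hypothetical, and it assumes the form carries the token along in a hidden field.

```python
import hashlib
import hmac
import secrets

# Hypothetical server-side secret; in practice this would be a private random value.
SECRET_KEY = b"replace-with-a-private-random-value"


def sign(answer):
    """Return an HMAC-SHA256 signature of the (whitespace-trimmed) answer."""
    message = answer.strip().encode("utf-8")
    return hmac.new(SECRET_KEY, message, hashlib.sha256).hexdigest()


def make_challenge():
    """Build a question for the form plus a token that encodes the right answer."""
    a = secrets.randbelow(9) + 1  # 1..9
    b = secrets.randbelow(9) + 1  # 1..9
    question = f"What is {a} plus {b}?"
    token = sign(str(a + b))      # goes into a hidden form field
    return question, token


def check_answer(submitted, token):
    """Return True if the submitted answer matches the token from the form."""
    expected = sign(submitted).encode("utf-8")
    return hmac.compare_digest(expected, token.encode("utf-8"))
```

When the form is displayed, the script would call make_challenge(); when the form is posted, it would call check_answer() with the visitor's answer and the hidden token, and reject the post on a mismatch. A token like this can be replayed and the range of answers is tiny, which is part of why a real CAPTCHA takes more work than this.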
If you're running an Apache-based web server and you have access to its configuration, the mod_security module might be an option. This module can be configured to watch for specific terms and take action when those terms are posted to your form. It's something I run on Ask Leo!'s server, and as a result attempts to post a comment containing certain four-letter words or certain spam-related phrases are simply rejected.
Another technique I use applies to forms where I control the script that processes the form input. Most notably, my ask a question page has been getting hammered of late with various attempts at web spam. What I've done is simply make note of common strings (typically the websites that are being linked to) and update the code to disallow posts containing those strings. (Apparently, being PHP based, it bypasses mod_security.)
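As a rough illustration of that string-scanning idea, here's a minimal sketch. The script described above is PHP; Python is used here only to show the logic. The list entries and function names are hypothetical placeholders, not the actual strings or code used on Ask Leo!.

```python
# Hypothetical strings noted from earlier spam; real entries would come from your own logs.
BLOCKED_STRINGS = [
    "cheap-pills.example",
    "casino-bonus.example",
]


def find_blocked_string(text, blocked=BLOCKED_STRINGS):
    """Return the first blocked string found in text, or None if there is none."""
    lowered = text.lower()  # compare case-insensitively
    for entry in blocked:
        if entry.lower() in lowered:
            return entry
    return None


def accept_post(text):
    """Return True if the post contains none of the blocked strings."""
    return find_blocked_string(text) is None
```

The form-processing script would call accept_post() on each submitted field and refuse the submission when it returns False. Returning the matched string from find_blocked_string() makes it easy to log which entry triggered the rejection.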
Both techniques that scan for strings require a certain amount of maintenance. As spammers show up promoting new things, those new strings need to be added to the disallowed list. However, if you're willing to completely disallow links in the content posted by legitimate users, then disallowing the string "http:" would stop 99% of this type of spam. Unfortunately that's not something I can do, as many of the questions I get do need to refer to specific web pages.
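If you can live without links, the check itself is tiny. Here's a minimal sketch in Python, assuming you also want to catch "https:" and mixed-case variants, which goes slightly beyond the plain "http:" test described above. The names LINK_PATTERN and contains_link are hypothetical.

```python
import re

# Matches "http:" or "https:" in any letter case.
# Including "https:" is an assumption beyond the plain "http:" check in the text.
LINK_PATTERN = re.compile(r"https?:", re.IGNORECASE)


def contains_link(text):
    """Return True if the text appears to contain a web link."""
    return LINK_PATTERN.search(text) is not None
```

A form script would reject any post for which contains_link() returns True. Note that this does not catch addresses typed without the scheme, such as a bare "www." address.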
If you don't have access to the levels of scripting or server configuration that I've described here, then your next best bet is to investigate the specific publishing platform you're using. The spam problem is widespread, and many of the popular platforms are implementing solutions of various types.