Helping people with computers... one answer at a time.

Spam is no longer just something in email. Web site spam, or "comment spam", is on the rise. If you've seen it on your own site, here are some of the things you can do.

I have a personal web-site to help computer users that's been running for about 6 years. I have a guest book and people have been signing it for years. Within the past year, though, I've been swamped with spammers signing my book. I get about 6 to 10 spams each day. Each morning I delete them, but it is getting worse by the day.

I had tried to "hide" my guest book from the public and sacrifice the ability to have people sign and my enjoyment in reading these. But even this "hidden" page keeps getting spam.

How can I prevent spammers from signing my guest book? I'd appreciate your comments and hopefully a solution to this annoying problem.

Oh, I have plenty of comments and opinions on this topic - it's a problem I face right here with Ask Leo!

But unfortunately, like spam in general, there's no single answer - no magic bullet.

Depending on your server and other specifics, there are several approaches you can take.

Web spam, also known as "blog spam" or "comment spam", is definitely on the rise. Spurred by the popularity of weblogs, or blogs, which allow people to post comments, spammers are using these forms to post links back to their own sites. The links aren't really intended to serve as advertising, per se, but rather to trick the search engines into thinking that the target site is more important than it is, because of all the incoming links.

Regardless of why, it's a mess.

There are two types of comment spam generation techniques: manual and automated. Automated tools scour the web looking for things that look like comment or guest book forms, and automatically post their bogus content to these forms. Manual techniques involve hiring cheap labor overseas to do exactly the same thing by hand.

While it started as comment spam on blogs, it's most definitely no longer limited to that. Almost any form that accepts input on the web is getting hit.

As I said, there are various tools and techniques to combat comment or web spam. Which technique might help you depends on how your form is set up, and on what type of server or publishing platform you might be running.

A very common technique is to use what's called a "CAPTCHA" ("Completely Automated Public Turing test to tell Computers and Humans Apart"). You've probably seen them - they're the often distorted characters that you're asked to re-type into the form before it will be accepted. As the name implies, it's a way to prevent automated tools from posting to your form. Unfortunately it does nothing to stop actual humans.

If you're running on a content management system like MovableType, WordPress or others, then CAPTCHA may already be an option - either as a built-in feature, or as a plugin for your platform. Unfortunately creating and using a CAPTCHA test in the general sense is not all that trivial.


However, if you're using a standard HTML <form> to get your input, I developed a technique that relies on JavaScript to throttle spam. In fact, it's a technique I use here on Ask Leo! with great success. It's developed and described for the MovableType publishing platform, but the technique is in fact valid for any <form> based input. You can read more about it on my MovableType Tips site: Dealing with Comment Spam.

The drawback of this technique is that it requires that JavaScript be enabled in order for people to post to your form. While most people do have it turned on, there's a percentage that do not, and you'll have to decide if that is important enough to you.
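
To illustrate the general idea of a check that depends on JavaScript, here is a minimal Python sketch of what the server side of one such scheme might look like. It is an assumption for illustration only, and not necessarily the technique described on the MovableType Tips site. In this hypothetical variant, the server produces a signed, timestamped token, a script on the page copies it into a hidden form field named `js_token`, and the form handler rejects any post where that field is missing, forged, or stale. The names `SECRET_KEY`, `make_token`, `token_is_valid`, `accept_post`, and `js_token` are all made up for the example.

```python
import hashlib
import hmac
import time

# Hypothetical secret; a real site would use a long random value kept private on the server.
SECRET_KEY = b"change-me"
# Hypothetical limit on how long a page may sit open before its token expires.
MAX_TOKEN_AGE_SECONDS = 3600


def _sign(timestamp):
    """Return the hex HMAC-SHA256 signature of a timestamp string."""
    return hmac.new(SECRET_KEY, timestamp.encode(), hashlib.sha256).hexdigest()


def make_token(now=None):
    """Build the token that the page's script would copy into the hidden field."""
    timestamp = str(int(time.time() if now is None else now))
    return f"{timestamp}:{_sign(timestamp)}"


def token_is_valid(token, now=None):
    """Return True only for a well-formed, correctly signed, unexpired token."""
    current = time.time() if now is None else now
    timestamp, separator, signature = token.partition(":")
    if not separator or not (timestamp.isascii() and timestamp.isdigit()):
        return False
    # Compare as bytes so that odd characters in the submitted value cannot raise an error.
    if not hmac.compare_digest(signature.encode(), _sign(timestamp).encode()):
        return False
    age = current - int(timestamp)
    return 0 <= age <= MAX_TOKEN_AGE_SECONDS


def accept_post(form):
    """Reject any submission whose hidden 'js_token' field is missing or invalid."""
    return token_is_valid(form.get("js_token", ""))
```

An automated tool that posts straight to the form without running the page's script never fills in the hidden field, so `accept_post` turns it away. A visitor with JavaScript turned off is turned away too, which is exactly the drawback described above.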

If you're running an Apache-based web server and you have access to its configuration, the mod_security module might be an option. This module can be configured to monitor for terms and take action when those terms are posted to your form. It's something else I run on Ask Leo!'s server, and as a result attempts to post a comment with certain four-letter-words or certain spam-related phrases will simply be rejected.

Another technique I use applies to forms where I control the script that processes the form input. Most notably, my ask a question page has been getting hammered of late with various attempts at web spam. What I've done is simply make note of common strings (typically the websites that are being linked to) and update the code to disallow posts containing those strings. (Apparently, because that page is PHP-based, it bypasses mod_security.)

Both techniques that scan for strings require a certain amount of maintenance. As spammers arrive attempting to promote new things, those things need to get added to the disallowed list. However, if you're willing to completely disallow links in the content posted from valid users, then disallowing the string "http:" would stop 99% of this type of spam. Unfortunately that's not something I can do, as many of the questions I get do need to refer to specific web pages.
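
As a rough illustration of string scanning in a form-processing script, here is a minimal Python sketch. It is not the actual code behind the ask a question page. The function name `find_disallowed`, the `block_all_links` switch, and the entries in `DISALLOWED_STRINGS` are all hypothetical, and a real list would hold the sites and phrases you actually see in your own spam.

```python
# Hypothetical examples; a real list would be built from the spam you actually receive.
DISALLOWED_STRINGS = ["cheap-pills.example", "casino-bonus.example"]


def find_disallowed(text, disallowed=DISALLOWED_STRINGS, block_all_links=False):
    """Return the first disallowed string found in text, or None if the post looks clean."""
    lowered = text.lower()
    # Optional blanket rule: refuse any post that contains a link at all.
    if block_all_links and "http:" in lowered:
        return "http:"
    # Otherwise, compare case-insensitively against each entry in the list.
    for term in disallowed:
        if term.lower() in lowered:
            return term
    return None
```

The form handler would refuse a post whenever `find_disallowed` returns something other than `None`. Turning on `block_all_links` corresponds to disallowing the string "http:" outright. Note that this sketch checks only for "http:", so a link beginning with "https:" would need its own entry in the list.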

If you don't have access to the levels of scripting or server configuration that I've described here, then your next best bet is to investigate the specific publishing platform you're using. The spam problem is widespread, and many of the popular platforms are implementing solutions of various types.

Article C2777 - September 3, 2006

Leo A. Notenboom has been playing with computers since he was required to take a programming class in 1976. An 18-year career as a programmer at Microsoft soon followed. After "retiring" in 2001, Leo started Ask Leo! in 2003 as a place for answers to common computer and technical questions.


9 Comments
Jon
September 4, 2006 7:49 PM

We had a similar problem with the Guest Book page my wife has on her web site - spammers got in with advertising material. We solved this problem (so far) by signing up for an e-Guestbook (Google them) account. Annual subscription is very low. Posters have to type in a "magic" number which beats automated posters. You are advised when a new post has been sent, and can vet it before allowing the post. As I said, it works so far.

We had a similar situation with the site's discussion forum. We fixed this briefly by setting up a forum with phpBB, but recently, we started getting dozens of signups by "members" who are obviously pushing spam. More maintenance work to be done...

Mary
September 5, 2006 1:38 PM

Leo -
A few months ago I joined an online computer help forum sponsored by a major computer manufacturer. Within just a few days I began receiving (on average) 10 spam emails a day. Now it's up to 20 a day. I've got my firewall, antivirus, and antispyware programs current and running. The spam has been directed to my bulk folder so apparently the filters are working and not sending the spam to my inbox.

But how did the spammers get *MY* email address in the first place? In order to access the forum I have to first go to www.computer company.com and then sign in from there. If members want to communicate privately, they can send messages via a separate link provided on the forum site. We never see each other's actual email address. (Similar to how eBay allows people to communicate.) It's not like my address is being posted by the computer manufacturer or the discussion forum... or is it?
Mary

Greg Bulmash
September 6, 2006 11:07 AM

Either phpBB's built-in CAPTCHA has been cracked by spammers, or human-created phpBB spam is on the rise.

The problem with phpBB is that even if you require answering an e-mail to activate the account, or even if you go so far as to require manual administrator approval to activate an account, the moment someone signs up (BEFORE they're activated), they end up in the member directory.

And if they've specified a homepage link in the form when they signed up, it's linked from two places in the member directory. So they can give a fake e-mail address and never activate their account, but get linky goodness from just barraging your phpBB board with fake accounts.

This is why I have removed all my phpBB installations.

Mike
September 25, 2006 10:59 AM

Maybe we're going about this wrong? How does the spammers' automated form search spider determine a page is a form they want to spam? Maybe there is a way to make the form page NOT look like a form they want to submit to. Are they looking for one which posts to a .cgi? In that case why not make the cgi extension .xyz and change your server .htaccess to execute .xyz like a .cgi?

Holly Wild
February 12, 2007 8:01 AM

I get close to 100 spam e-mails from the comment forms on our site. How can I stop this? We use FrontPage. The site is Http://www.sjlounyinjurylaw.com. It's making us crazy!

Holly Wild
February 12, 2007 8:02 AM

sorry the site is Http://www.ajlounyinjurylaw.com...help please.

Leo Notenboom
February 12, 2007 9:49 AM

The article you commented on has my suggestions.

Good luck!

Leo

David A
October 17, 2008 11:28 AM

My very popular website (over 1 million viewers) is now getting sick sex postings (my site is a family site). How can I prevent it? My site is www.pennypincher.ca

The article you just commented on has my basic suggestions - they all involve modifying the website or website software to put up barriers to this type of thing. Unfortunately there's no simple answer that just works - it depends heavily on the type of software that's running the site.
- Leo
26-May-2008

audy
March 17, 2009 10:06 AM

try spameat.com
it's got a good concept.
everyone help each other to filter the web spam

Comments on this entry are closed.
