I have been struggling with comment spam, but I think I’ve vanquished it with this latest plugin. It makes the user’s browser compute an MD5 hash in JavaScript before submitting a comment. Basically, an MD5 hash is a fixed-length fingerprint of some binary sequence that, in practice, can only be obtained by running the MD5 algorithm on that sequence. The plugin takes the IP address of the client hitting your post page, a site-specific string, the user agent string, and the time down to the hour, and joins them into one string. Then it MD5-hashes that string, which means inspecting the result can’t reveal how it was generated.
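To make the idea concrete, here is a rough sketch in Python of how such a per-page token could be derived. This is not the plugin's actual code (the plugin runs inside WordPress); the function name `page_token`, the `|` separator, and the hour format are assumptions made up for illustration.

```python
import hashlib
from datetime import datetime, timezone


def page_token(ip, site_secret, user_agent, when):
    """Hash the four ingredients described above into one MD5 hex digest.

    The "|" separator and the "%Y%m%d%H" hour format are assumptions;
    the point is only that the token changes per visitor and per hour.
    """
    hour_bucket = when.strftime("%Y%m%d%H")  # time truncated to the hour
    material = "|".join([ip, site_secret, user_agent, hour_bucket])
    return hashlib.md5(material.encode("utf-8")).hexdigest()


# Example with made-up values (documentation IP range, placeholder secret).
token = page_token(
    "203.0.113.7",
    "my-site-specific-string",
    "Mozilla/5.0 (example)",
    datetime.now(timezone.utc),
)
print(token)
```

Because the site-specific string never leaves the server, someone looking at the token alone has nothing to rebuild it from.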

It inserts into each page a randomly named JavaScript function that, upon submission, computes the MD5 of that MD5 hash and uses the result as the name of a hidden form variable, which in turn has a unique value.
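The server can then check a submission by redoing the same arithmetic. The following Python sketch shows one way that check might look; it is an illustration under assumptions, not the plugin's real implementation. The names `md5_hex`, `expected_field_name`, and `looks_legitimate` are hypothetical, and since the rule for the hidden field's value isn't spelled out here, the sketch only checks that a field with the expected name is present.

```python
import hashlib


def md5_hex(text):
    """Return the MD5 hex digest of a string."""
    return hashlib.md5(text.encode("utf-8")).hexdigest()


def expected_field_name(ip, site_secret, user_agent, hour_bucket):
    """Recompute the hidden field name: the MD5 of the per-page MD5 token.

    The "|" separator and the hour_bucket format are assumptions.
    """
    token = md5_hex("|".join([ip, site_secret, user_agent, hour_bucket]))
    return md5_hex(token)  # this second hash is what the browser computes


def looks_legitimate(form, ip, site_secret, user_agent, hour_bucket):
    """Accept the comment only if the form carries the expected hidden field."""
    name = expected_field_name(ip, site_secret, user_agent, hour_bucket)
    return name in form


# Example with made-up values: a browser that ran the script sends the field.
args = ("203.0.113.7", "my-site-specific-string", "Mozilla/5.0 (example)", "2024010110")
good_form = {"comment": "Nice post!", expected_field_name(*args): "some-value"}
bad_form = {"comment": "Buy cheap pills"}
print(looks_legitimate(good_form, *args), looks_legitimate(bad_form, *args))
```

A bot that posts straight to the comment endpoint without running the script has no way to know the field name, so its submission fails the check.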

What this means is this:

  1. A spammer must visit every comment page in order to get the information needed to comment on that page.
  2. A spammer must implement enough JavaScript and DOM support to run the function.
  3. A spammer must compute an MD5 sum.

Taken together, these things are expensive, computationally speaking. Calculating the MD5 sum itself is not so bad, but rendering a DOM tree and loading a JavaScript engine is fairly expensive if you’re trying to spam millions of pages.

This may not be the ultimate fix, but I suspect it may push the “cost” of generating spam too high for most spammers. If you have a WordPress blog, I can’t recommend the WP-Hashcash plugin enough.