Tag Archives: SPAM

Gibberish Last Names Clogging up Subscriptions

I have recently been annoyed by an increasing number of SPAM subscriptions on my Thaidye.com web site. The script behind the form for these subscriptions and rebate requests immediately sends a confirmation email to the address entered (as well as to the admin of the site) and adds the visitor to a database.

Initially there were only a few of those spam entries, and I could easily go into the database and remove them manually. But they became more and more numerous, so my first step was to add a link to the email the admin receives that allows easy removal of the spam entry.

But finally it got so annoying and time-consuming that I tried to think of a more automated way to handle the spam entries. What they all had in common was

  • a good first name
  • a gibberish last name like NLMAkPJpIVyqCkCeuEh, YijhgzswktJTVWqXhmA or MPSVPkfXInMzFYhEOpp
  • and a good email

The email addresses were probably valid, because hardly any of the automatic confirmation emails bounced. That bothered me as well, because these poor recipients got SPAM that apparently came from me.

Now the quest was to find something that all these spam entries had in common so that I could filter them. Unfortunately PHP does not have a ‘gibberish’ function, so I had to come up with one of my own. Meditating over these entries, I finally saw that the spam names often have longer sequences of consonants than would occur in valid names.

With a little bit of help from my friends at Google I came up with the following. In the hope that it might help somebody bothered by the same spammers, here is the code snippet to filter those entries:

// Read the submitted names; the array keys must be quoted strings
$first = $_REQUEST['first'] ?? '';
$last = $_REQUEST['last'] ?? '';
// Four or more consonants in a row count as gibberish
$gibberish = preg_match('/[bcdfghjklmnpqrstvwxz]{4,}/i', $first)
          || preg_match('/[bcdfghjklmnpqrstvwxz]{4,}/i', $last);
if (! $gibberish) {
    // do the regular processing
}
else {
    // pretend everything went fine for the spammer
    // but don't really do anything
}

We will see how many slip through: gibberish with fewer than 4 consonants in a row is not caught.
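To get a feel for what the filter catches, the consonant-run check can be tried on the three spam names quoted earlier and on a couple of ordinary names. This is only a rough sketch: the helper name looksLikeGibberish() and its $minRun parameter are made up for the illustration, and the two ordinary names are my own test values, not entries from the database.

```php
<?php
// Hypothetical helper wrapping the same consonant-run check as the filter.
// The function name and the $minRun parameter are assumptions for this sketch.
function looksLikeGibberish(string $name, int $minRun = 4): bool
{
    // Same consonant class as the filter; note that 'y' is not in it.
    $pattern = '/[bcdfghjklmnpqrstvwxz]{' . $minRun . ',}/i';
    return preg_match($pattern, $name) === 1;
}

$samples = [
    'NLMAkPJpIVyqCkCeuEh', // spam entry quoted in the post
    'YijhgzswktJTVWqXhmA', // spam entry quoted in the post
    'MPSVPkfXInMzFYhEOpp', // spam entry quoted in the post
    'Smith',               // ordinary name: longest consonant run is 2
    'Schmidt',             // ordinary name, but "Schm" is a run of 4
];

foreach ($samples as $name) {
    printf("%-20s %s\n", $name, looksLikeGibberish($name) ? 'flagged' : 'ok');
}
```

All three spam names are flagged and Smith passes, but Schmidt is flagged as well. So besides gibberish slipping through, a threshold of 4 can also reject some real names, which is worth keeping in mind before silently dropping an entry.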

What I am still curious about is ‘WHY’: what does the spammer intend with these spam entries? I don’t see any way they could be beneficial to him/her. Discrediting me because the site sends out spam? But why then use gibberish in the last name? I am really curious.

This Blog does not use rel="nofollow"

I just installed a plugin on this blog to turn off the default behavior of adding the nofollow tag to all URLs that a commenter writes.

The nofollow tag was intended to reduce SPAM comments on blogs by removing the incentive to post them. Google rates a page largely by the number of links to it from other web pages. The nofollow tag, which is added to a link as rel="nofollow", tells Google not to count this link. With that in place, why would a blackhat SEO add a spam comment to a blog if the link would not help him reach his objective of ranking higher?

In theory that worked, but it had a side effect. Real commenters, who would have added value to a blog with good comments, also stayed away because they too had lost the benefit of gaining link popularity. They turned to other methods of getting inbound links, or they went to blogs that did not set the nofollow tag on their commenters’ contributions.

I use the plugin Nofollow Free to remove nofollow tags from comment links. It is configurable: you can choose to remove each and every nofollow tag or only those of registered users, and you can create a blacklist of words that cause the nofollow tag to be added again. The male enhancer pill with a V is probably a good candidate for that list.
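For the curious, the idea behind such a plugin can be sketched in a few lines of PHP. This is not the code of Nofollow Free, which I have not looked at. The function name stripNofollow() and the $blacklist parameter are invented for this example, and the sketch leaves out the registered-users option.

```php
<?php
// Hypothetical sketch of the idea, NOT the Nofollow Free plugin's actual code.
// stripNofollow() and $blacklist are names invented for this example.
function stripNofollow(string $commentHtml, array $blacklist = []): string
{
    // Keep nofollow in place when the comment contains a blacklisted word.
    foreach ($blacklist as $word) {
        if (stripos($commentHtml, $word) !== false) {
            return $commentHtml;
        }
    }

    // Remove the nofollow token from every rel="..." attribute,
    // keeping any other tokens (such as "external") untouched.
    $result = preg_replace_callback(
        '/\srel=(["\'])(.*?)\1/i',
        function (array $m): string {
            $tokens = preg_split('/\s+/', trim($m[2]), -1, PREG_SPLIT_NO_EMPTY);
            $kept = array_filter($tokens, fn($t) => strcasecmp($t, 'nofollow') !== 0);
            // Drop the whole attribute if nofollow was its only token.
            return $kept ? ' rel=' . $m[1] . implode(' ', $kept) . $m[1] : '';
        },
        $commentHtml
    );

    // preg_replace_callback() returns null on a regex error; fall back to the input.
    return $result ?? $commentHtml;
}

echo stripNofollow('<a href="http://example.com" rel="external nofollow">my site</a>'), "\n";
```

A comment that contains a blacklisted word is returned unchanged, so its links keep their nofollow.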