Five years of WordPress!


On April 22, 2009 I started this WP blog. Before that, I used to blog at web-log, and in the meantime I also wrote down my recipes at blogspot. The recipe blog still exists, but I don’t have the password anymore… And of course there was also my “a bug a day” picture blog for my 365days project in 2012 (which “failed” after 184 days and still has to be merged with blog.spiderwebz).

After starting my blog in English, I switched back to Dutch in 2011 and then back to English again after buying my first typewriter in November 2012. I have written 220 blog posts, which received 442 comments (thank you typosphere!). This blog has only had two different themes, but they both had the same color scheme. Guess I like green. ;-)

– – – – – – – – – – – – – – – – – – – – – – – – – – – – – – – –

After looking at a few stats on other blogs recently, I started to wonder: how many of these visits are real people rather than spammers and crawlers? Since I cannot use their blogs and stats, I used my own.

The first thing I needed for my little research was to KEEP ALL THE SPAM COMMENTS!
On Friday the 18th, I caught 239 of them. According to StatPress, I had 21 visitors that day, with 35 pageviews. Onestat kinda agrees, with 17 unique visitors. Saturday the 19th was very similar: 247 spam comments, 20 visitors according to StatPress, 11 unique ones according to Onestat. Sunday the 20th was the same thing all over again: 271 spam comments, 21 visitors according to StatPress, 10 unique ones according to Onestat.
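To put those numbers side by side, here is a small sketch that divides the spam comments by the StatPress visitor count for each day. The figures are the ones quoted above; the function and variable names are just mine for illustration.

```python
# Figures quoted in the post, per day:
# (spam comments, StatPress visitors, Onestat unique visitors)
days = {
    "Friday": (239, 21, 17),
    "Saturday": (247, 20, 11),
    "Sunday": (271, 21, 10),
}


def spam_per_visitor(spam, visitors):
    """Return how many spam comments arrived for every counted visitor."""
    return spam / visitors


ratios = {
    day: spam_per_visitor(spam, statpress)
    for day, (spam, statpress, _onestat) in days.items()
}

for day, ratio in ratios.items():
    print(f"{day}: roughly {ratio:.0f} spam comments per StatPress visitor")
```

On all three days the spam comments outnumber the counted visitors by more than ten to one.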

When I look at the visitors list in Onestat, I mostly see people entering my blog through the typosphere blog roll, other typosphere blogs and search engines. Plausible! But, when I look at StatPress and want to see the same visitors list, I see this:

[Screenshot: StatPress visitors list]

Not once, but from numerous IP addresses and in different styles and shapes. That’s not a very human thing to do… When I follow a couple of these form algorithms, I end up on the pages of my blog that receive a lot of spam comments. Coincidence? I don’t think so! Looking at another graphic about unique visitors and pageviews, StatPress shows me the following numbers:

Notice the high number of visits on the 21st? That’s 2061 visits from only one IP address (that tried to log in).

That’s a whole different number for the same days! So where do they come from? When I look at my “Top IP – Page views” stats, four addresses immediately jump out. After requesting the report on the first IP, it turns out this address is indeed a web crawler, and it identifies itself as jobdigger spider. It enters my blog looking for robots.txt and crawls all my Dutch posts. A very nice thing about this company is that it explains on its own website what it does, why it is crawling and how to get rid of it. I’m a little impressed! But the next one doesn’t, of course. And neither do all the others.
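For those who don’t run StatPress, a “top IP by page views” list can be sketched in a few lines. This is not how StatPress does it; it is a minimal illustration that assumes you already have (IP address, requested path) pairs from somewhere. The hit records below are made up, and the addresses come from the ranges reserved for documentation.

```python
from collections import Counter

# Hypothetical hit records (IP address, requested path). The addresses are
# from documentation-only ranges, not from this blog's real statistics.
hits = [
    ("192.0.2.10", "/robots.txt"),
    ("192.0.2.10", "/2011/some-dutch-post/"),
    ("192.0.2.10", "/2011/another-dutch-post/"),
    ("198.51.100.7", "/wp-login.php"),
    ("198.51.100.7", "/wp-login.php"),
    ("198.51.100.7", "/wp-login.php"),
    ("198.51.100.7", "/wp-login.php"),
    ("203.0.113.5", "/"),
]


def top_ips(records, n=3):
    """Return the n IP addresses with the most page views, busiest first."""
    return Counter(ip for ip, _path in records).most_common(n)


def asked_for_robots(records, ip):
    """Return True if this IP requested /robots.txt at least once."""
    return any(addr == ip and path == "/robots.txt" for addr, path in records)


for ip, views in top_ips(hits):
    if asked_for_robots(hits, ip):
        note = "fetched robots.txt"
    else:
        note = "never asked for robots.txt"
    print(f"{ip}: {views} page views, {note}")
```

An address that asks for robots.txt first is at least pretending to be a polite crawler; one that hammers the login page is not.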

What’s so difficult about getting rid of the spambots is that they use several different IPs every day. So blocking the IP isn’t sufficient. Some of you are using Captcha, but I’d rather delete every spam comment myself than scare off a visitor. I like to keep things open and easy for visitors that aren’t spambots and crawlers. I do run Akismet. It catches the spam comments and puts them in my spam folder. It’s hardly ever wrong!

[Screenshot: Akismet]

So what about that robots.txt file the jobdigger website was talking about? As a website owner, you can use the /robots.txt file to give instructions about your website to web robots; it’s called the Robots Exclusion Protocol. Before a well-behaved robot visits a URL on a website, it first looks for /robots.txt at the root of that site.

Sounds lovely, right? But there are two important considerations when using /robots.txt. First, it’s a publicly available file which anyone can read. And more importantly in this case, robots can ignore your /robots.txt. Malware robots that scan the web for security vulnerabilities, and email address harvesters used by spammers, will pay no attention to it. So this won’t help to fight spambots either. Guess I’m going back to banning the recurring IP addresses!
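To show what such a file looks like and how a polite robot reads it, here is a minimal sketch using Python’s standard urllib.robotparser module. The user-agent token “jobdigger” and the paths are my assumptions for illustration; check the crawler’s own documentation for the name it really listens to.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt. The user-agent token "jobdigger" and the paths
# are assumptions for illustration, not taken from the crawler's own docs.
ROBOTS_TXT = """\
User-agent: jobdigger
Disallow: /

User-agent: *
Disallow: /wp-admin/
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())


def allowed(agent, path):
    """Return True if a robot that obeys robots.txt may fetch this path."""
    return parser.can_fetch(agent, path)


print(allowed("jobdigger", "/2011/some-post/"))     # False: shut out everywhere
print(allowed("SomeOtherBot", "/2011/some-post/"))  # True: posts stay open
print(allowed("SomeOtherBot", "/wp-admin/"))        # False: admin area is off limits
```

Note that the check happens on the robot’s side. A spambot simply never runs it, which is exactly why the file doesn’t keep them out.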

The point I was trying to make by writing this blog post? Well, yes, we get a lot of traffic. But it is not all human. The majority consists of spammers and crawlers. And you cannot get rid of them…

6 thoughts on “Five years of WordPress!”

  1. Bill M

    I trace the spammers and robots at times on my blogs and sites. I’ve found most come from Russia and China.

  2. Scott K

    On the statistics I have on my page, the biggest bot often comes in at about 5th place in referrals. I’d put it at about 20%, but getting that information out of Blogger is difficult. It is very limited compared to WordPress. Blogger, however, does seem to have a fair bit more success with keeping spammers out. But it still isn’t perfect.

    Great to hear you’ve gotten to 5 years! Seasoned pro here.

    1. spider Post author

      That’s probably because Blogger has its own platform on which it can blacklist spammers. I don’t, it’s just me and my WP.

  3. T. Munk

    Get ye to the plugins admin and set up an Akismet account post-haste! In the 3 years I’ve had Akismet running, it’s intercepted 146,145 spam comments with very, very few misses (until last week, it would miss maybe 1 or 2 spams a month, which I just had to “spam” when I got the notification email. I never get notification emails for the ones it intercepts, so it quietly handles spam very effectively). Akismet is a distributed anti-spammer database which keeps a list of known spammer IPs and behaviours and filters based on that distributed list.

    Last week, I had a very dedicated spammer attempt to hand-spam my site, who managed to get a dozen spam posts past Akismet’s filter. That led me to beef it up with a plugin called “WP-Spamshield”, which adds some JavaScript and cookie trickery to the comment forms that defeats most automated submissions.

    Basically, I use every trick in the book other than Captchas (which I refuse to use). The tools available to WordPress for this purpose are very good. (:

    1. spider Post author

      I already have Akismet, Ted. Very neat tool! I didn’t try WP-Spamshield yet, but my r3boot is writing new blacklisting software for his network and I can add annoying spammers to it. So maybe I won’t need it anymore soon. (always getting my hopes up!)
