Working with Spam Analysis

I have a couple of ideas that node.js would be great for. One of them is spam analysis. So I started by importing around 14,000 spam comments from one of my blogs to my local machine, and I am now setting up a little system that analyzes things like which words occur most often in the comment field.
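To make the word counting idea concrete, here is a minimal sketch of how it could look in plain node.js. The function names `countWords` and `topTerms` and the sample comments are made up for illustration, and this is not necessarily how my own code is structured.

```javascript
// Hypothetical helper: count how often each word occurs across comment bodies.
function countWords(comments) {
  const counts = new Map();
  for (const text of comments) {
    // Lowercase, then split on anything that is not a letter, digit or apostrophe.
    const words = text.toLowerCase().split(/[^a-z0-9']+/).filter(Boolean);
    for (const word of words) {
      counts.set(word, (counts.get(word) || 0) + 1);
    }
  }
  return counts;
}

// Hypothetical helper: the n most frequent [word, count] pairs, most frequent first.
// Ties are broken alphabetically so the result is stable.
function topTerms(counts, n) {
  return [...counts.entries()]
    .sort((a, b) => b[1] - a[1] || a[0].localeCompare(b[0]))
    .slice(0, n);
}

// Made-up sample data standing in for real spam comments.
const sample = ["Buy cheap pills now", "Cheap pills, cheap prices"];
console.log(topTerms(countWords(sample), 2));
// [ [ 'cheap', 3 ], [ 'pills', 2 ] ]
```

The split pattern only handles plain ASCII words, which is a simplification. Calling `topTerms(counts, 50)` would give the kind of list a tag cloud of the 50 most common terms needs.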


Obviously, spam bots tried to link to questionable sites for various medications, but many of the most frequent words are ones that show up in almost any ordinary sentence. This project is still at a very early stage. There is no API, only me trying to understand node.js, the Express framework, and how to tie all the ends together.
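One possible way to deal with those everyday words is to drop them before counting. The sketch below is only an assumption about how that could be done. The `STOP_WORDS` list is a tiny made-up sample and `removeStopWords` is a hypothetical helper, not something the project has today.

```javascript
// Tiny illustrative stop-word list; a real one would be much longer.
const STOP_WORDS = new Set(["the", "a", "and", "to", "of", "is", "in", "for", "you"]);

// Hypothetical helper: keep only the words that are not in the stop-word list.
// The comparison is case-insensitive; the original casing of kept words is preserved.
function removeStopWords(words) {
  return words.filter((word) => !STOP_WORDS.has(word.toLowerCase()));
}

console.log(removeStopWords(["The", "best", "pills", "for", "you"]));
// [ 'best', 'pills' ]
```

With a filter like this in front of the counting step, the words that say something about the spam itself should move closer to the top of the list.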

The picture above is a little tag cloud of the 50 most common terms in my spam comments so far.
