Working with Spam Analysis

I have a couple of ideas that node.js would be great for. One of them is spam analysis. So I started by importing around 14,000 spam comments from one of my blogs into a local setup, and I am now building a little system that analyzes things like which words occur most often in the comment field.
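To make the word counting concrete, here is a minimal sketch of how it could look in node.js. It is not the code from my project: the function name `countWords` and the idea of passing the comments in as a plain array of strings are assumptions for illustration, and the simple regular expression only handles ASCII letters and digits.

```javascript
// Count how often each word occurs across a list of comment bodies.
// `comments` is a hypothetical array of strings; how the comments are
// actually stored and loaded is not covered here.
function countWords(comments) {
  const counts = new Map();
  for (const text of comments) {
    // Lowercase, then split on anything that is not an ASCII letter,
    // digit or apostrophe. Empty fragments are dropped.
    const words = text.toLowerCase().split(/[^a-z0-9']+/).filter(Boolean);
    for (const word of words) {
      counts.set(word, (counts.get(word) || 0) + 1);
    }
  }
  return counts;
}

// Made-up sample input, just to show the call.
const sampleComments = ['Buy cheap pills now', 'Cheap pills, cheap prices!'];
const sampleCounts = countWords(sampleComments);
console.log(sampleCounts.get('cheap')); // 3
console.log(sampleCounts.get('pills')); // 2
```

A `Map` keeps the counting simple and avoids surprises with words like "constructor" that would collide with built-in properties on a plain object.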

[Image: comment_spam_analysis]

Obviously, spam bots tried to link to questionable sites for various medications, but many of the most frequent words are ones that are common in almost any ordinary sentence. This project is still at a very, very early stage. There is no API, only me trying to understand node.js, the Express framework and how to tie all the ends together.

The picture above is a little tag cloud of the 50 most common terms in my spam comments so far.
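Picking the most common terms for a tag cloud like this could be sketched roughly as follows. Again, this is an illustration under assumptions, not my actual code: `topTerms`, `fontSize`, the optional `stopWords` set and the pixel sizes are all hypothetical names and values. The stop word set is one possible way to hide everyday words that show up in almost any sentence.

```javascript
// Return the `limit` most frequent terms from a Map of word -> count,
// skipping any word contained in `stopWords`. Ties are broken alphabetically.
function topTerms(counts, limit = 50, stopWords = new Set()) {
  return [...counts.entries()]
    .filter(([word]) => !stopWords.has(word))
    .sort((a, b) => b[1] - a[1] || a[0].localeCompare(b[0]))
    .slice(0, limit)
    .map(([word, count]) => ({ word, count }));
}

// Scale a count linearly to a font size between minPx and maxPx,
// so that more frequent terms are drawn larger in the cloud.
function fontSize(count, minCount, maxCount, minPx = 12, maxPx = 48) {
  if (maxCount === minCount) return minPx;
  const ratio = (count - minCount) / (maxCount - minCount);
  return Math.round(minPx + ratio * (maxPx - minPx));
}

// Made-up counts, just to show the calls.
const exampleCounts = new Map([['the', 40], ['pills', 25], ['cheap', 10]]);
const exampleTerms = topTerms(exampleCounts, 2, new Set(['the']));
console.log(exampleTerms);
// [ { word: 'pills', count: 25 }, { word: 'cheap', count: 10 } ]
console.log(fontSize(25, 10, 25)); // 48
console.log(fontSize(10, 10, 25)); // 12
```

Linear scaling is the simplest choice. With real spam data, where a few words dominate, a logarithmic scale might give a more readable cloud.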
