Crawling an Entire Domain / Website

This post is about crawling an entire domain in Node.js. You can find the first posts of the series here: Web Scraping / Web Crawling Pages with Node.js. For testing purposes I have created a simple set of HTML pages that should resemble a generic website. It has some pages and we […]

Web Crawling with Node.js #2: Building the Page Object

Welcome to part 2 of the series on crawling the web with Node.js. In this article we’re going to look at what valuable content we can grab from a page. Links are obviously an important part of any crawler, because our crawler wouldn’t know where to go next without them. The data I’m going […]