Crawling an entire Domain / Website

This post is going to be about crawling an entire domain in Node.js. You can find the first posts of the series here: Web Scraping / Web Crawling Pages with Node.js.

For testing purposes I have created a simple set of HTML pages that should resemble a generic website. It has several pages, and we want our crawler to go through them and make sure it finds all of them, wherever they’re linked. That means when our crawler hits a page, it should keep track of the links it finds and then only proceed to pages it has not crawled yet.
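As a rough sketch of that idea (not the code from the full post), the crawler can keep a set of visited pages and a queue of pages still to visit. The `site` object and the `getLinks` callback below are hypothetical stand-ins for "fetch the page and extract its links", so the example runs without any network access.

```javascript
// Hypothetical in-memory "website": each path maps to the paths it links to.
const site = {
  '/': ['/about', '/blog'],
  '/about': ['/', '/contact'],
  '/blog': ['/blog/post-1', '/about'],
  '/blog/post-1': ['/blog'],
  '/contact': [],
};

// Visit every page reachable from `start` exactly once.
// `getLinks(page)` is an assumed helper returning the links found on a page.
function crawl(start, getLinks) {
  const visited = new Set(); // pages we have already crawled
  const queue = [start];     // pages we still need to crawl

  while (queue.length > 0) {
    const page = queue.shift();
    if (visited.has(page)) continue; // already crawled, skip it
    visited.add(page);

    for (const link of getLinks(page)) {
      // Only proceed to pages we have not crawled yet.
      if (!visited.has(link)) queue.push(link);
    }
  }
  return [...visited];
}

const crawled = crawl('/', (page) => site[page] || []);
console.log(crawled);
```

Because links such as `/about` pointing back to `/` are skipped once `/` is in the visited set, circular links do not cause an endless loop.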


Web Crawling with Node.js #2: Building the Page Object

Welcome to part 2 of the series on crawling the web with Node.js. In this article we’re going to have a look at what valuable content we can grab from a page. The most important parts when writing a crawler are obviously the links, because our crawler wouldn’t know where to go next without them.

The data I’m going to extract from a page is not necessarily what you’ll want; it really depends on what you want to do with the project. Maybe you only want the content of specific tags, or the status codes. I’ll just put up some examples so you can see what’s possible and what would make sense for your purpose.
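To give an idea of what such a page object could look like, here is a minimal sketch. The function name `buildPage` and the fields `url`, `statusCode`, `title` and `links` are my own assumptions for illustration, not necessarily the ones used in the full post. The regular expressions are a shortcut to keep the example dependency-free; a real crawler would more likely use a proper HTML parser.

```javascript
// Build a simple page object from a URL, a status code and the raw HTML.
// All names here are hypothetical and only meant as an illustration.
function buildPage(url, statusCode, html) {
  const links = new Set(); // a Set removes duplicate links
  const hrefPattern = /<a\s[^>]*?href\s*=\s*["']([^"']+)["']/gi;

  let match;
  while ((match = hrefPattern.exec(html)) !== null) {
    try {
      // Resolve relative links against the URL of the page.
      const resolved = new URL(match[1], url);
      resolved.hash = ''; // "#top" and friends point to the same page
      if (resolved.protocol === 'http:' || resolved.protocol === 'https:') {
        links.add(resolved.href);
      }
    } catch (err) {
      // Not a valid URL, ignore it.
    }
  }

  const titleMatch = /<title[^>]*>([^<]*)<\/title>/i.exec(html);

  return {
    url,
    statusCode,
    title: titleMatch ? titleMatch[1].trim() : null,
    links: [...links],
  };
}

const sampleHtml =
  '<html><head><title> Home </title></head><body>' +
  '<a href="/about">About</a> <a class="x" href="post.html#top">Post</a> ' +
  '<a href="/about">Again</a> <a href="mailto:me@example.test">Mail</a>' +
  '</body></html>';

const page = buildPage('http://example.test/blog/', 200, sampleHtml);
console.log(page);
```

The `links` array is what the crawler would feed back into its queue, while fields like `title` and `statusCode` are examples of data you might want to store.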


Show the diff(erence) between two files [free GUI client]

When working with code, especially front-end code, you might want to see a diff of two files. Maybe you have a build tool that modifies a file, or you simply have two different versions of it. The point is: you want to know whether two files are exactly the same, or get a list of all the differences. I’ll share some of my favourite tools for that.


How to know that you’ve made it

A phrase I’ve come to think about recently is “making it”, as in “Wow, she/he really made it.” For me, the problem is not that people are saying it wrong, but that they are thinking it wrong. For me there is no such thing anymore, or at least it will never be enough. It’s like an insatiable, greedy devil in our heads that drives us further.

The drive is important. It makes us chase a job or clients, pick up the phone late at night, learn an extra skill, or read an extra article or book on top of what we’re expected to do.

The sad part is: You’ll never make it.

Bootstrap 4 Grid only and SASS with Gulp

Bootstrap is a great CSS framework, but what if we only want to use the grid and not all the other features? You can do this if you use either the SASS or the LESS version of the Bootstrap framework. I’ll quickly demonstrate how to include only the necessary parts. I dug into this because I was creating a landing page that used only parts of Bootstrap, in order to improve the page speed.

Web Scraping / Web Crawling Pages with Node.js

This post series is going to discuss and illustrate how to write a web crawler in Node.js. Some of the posts will be database agnostic, while the database part will be split up into separate posts for the different databases you could imagine using.
