In this course, you will learn the most important tools for web scraping in Python, and when to use each one. If you have ever thought about scraping a website but got confused by all the options, or didn’t even know where to start, then this course is for you.
We will cover the following topics:
1. Requests and BeautifulSoup
For most simple scraping, requests is good enough. We will start at the page for my book, grab all the text, search for the price of the course and finally, download all the images.
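To give a flavour of this first topic, here is a minimal sketch of the three steps (grab the text, find a price, list the images). The function names, the sample HTML and the price pattern are all assumptions made for illustration, not the course's actual code, and a real page will need its own selectors.

```python
import re
from urllib.parse import urljoin

from bs4 import BeautifulSoup  # third-party: pip install beautifulsoup4


def fetch(url):
    """Download a page with requests (imported here so parsing works offline)."""
    import requests  # third-party: pip install requests

    response = requests.get(url, timeout=10)
    response.raise_for_status()
    return response.text


def summarize_page(html, base_url):
    """Return the visible text, the first price-like string, and absolute image URLs."""
    soup = BeautifulSoup(html, "html.parser")
    text = soup.get_text(" ", strip=True)
    # Assumption: the price looks like "$19" or "$19.99".
    match = re.search(r"\$\d+(?:\.\d{2})?", text)
    images = [urljoin(base_url, img["src"]) for img in soup.find_all("img", src=True)]
    return text, (match.group(0) if match else None), images


# Made-up HTML standing in for a real product page.
SAMPLE = (
    "<html><body><h1>My Book</h1><p>Price: $19.99</p>"
    '<img src="/img/cover.png"><img src="https://cdn.example.com/a.jpg">'
    "</body></html>"
)

if __name__ == "__main__":
    text, price, images = summarize_page(SAMPLE, "https://example.com/book")
    print(text)
    print(price)
    print(images)
```

To run it against a live page, pass `fetch(url)` as the first argument instead of `SAMPLE`. Saving the images is then one more `requests.get` per URL.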
2. Web API: GitHub Jobs
Scraping carries some legal risk, so you should always use an API when one is provided. I’ve already covered the Reddit API, so here we will look at the GitHub API, and also cover how to read JSON files. We will go over how you can search for particular jobs on the GitHub Jobs website using Python scripts.
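As a rough idea of what working with a JSON API looks like, the sketch below builds a search URL and reads a JSON response. The endpoint, the query parameters and the `title` / `company` fields are hypothetical placeholders, not the real GitHub Jobs API, so check the provider's documentation for the actual names.

```python
import json
from urllib.parse import urlencode

# Hypothetical endpoint, used only for illustration.
API_URL = "https://api.example.com/positions.json"


def build_search_url(description, location=""):
    """Build a query URL for a job search (parameter names are assumptions)."""
    params = {"description": description}
    if location:
        params["location"] = location
    return f"{API_URL}?{urlencode(params)}"


def job_titles(payload):
    """Parse a JSON array of jobs and return 'title at company' strings."""
    jobs = json.loads(payload)
    return [f'{job["title"]} at {job["company"]}' for job in jobs]


# Made-up response body standing in for what an API might return.
SAMPLE = (
    '[{"title": "Python Developer", "company": "Acme"},'
    ' {"title": "Data Engineer", "company": "Initech"}]'
)

if __name__ == "__main__":
    print(build_search_url("python", "new york"))
    for line in job_titles(SAMPLE):
        print(line)
```

In a real script, the response body would come from an HTTP request to the search URL rather than from a string constant.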
Scrapy allows you to build a spider. A spider is a program that can crawl multiple pages. Point it at one page, and it will follow all the links on that page, then the links on each of those pages, and so on, until the whole website has been indexed.
We will use Scrapy to create a spider that crawls my website, makes a list of all the pages it finds and stores it in a file.
Next, we will make our crawler more intelligent, so that it only crawls certain pages. It will start on my blog series on building a Reddit bot, and only crawl the pages in that series. It will then download any source code it finds (Python, Bash, etc.).
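The crawling idea itself can be sketched without Scrapy. The example below is a standard-library stand-in, not Scrapy's API: it follows same-site links breadth-first and returns every page it finds. The `crawl` function, the `fetch` callback and the in-memory `SITE` are assumptions made for illustration.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse


class LinkCollector(HTMLParser):
    """Collect the href of every <a> tag fed to the parser."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.append(href)


def crawl(start_url, fetch):
    """Breadth-first crawl of one site; fetch(url) must return that page's HTML."""
    domain = urlparse(start_url).netloc
    found = [start_url]  # pages in the order they were discovered
    seen = {start_url}
    queue = deque([start_url])
    while queue:
        url = queue.popleft()
        parser = LinkCollector()
        parser.feed(fetch(url))
        for href in parser.links:
            link = urljoin(url, href)
            # Stay on the starting site and skip pages already queued.
            if urlparse(link).netloc == domain and link not in seen:
                seen.add(link)
                found.append(link)
                queue.append(link)
    return found


# A tiny in-memory "website" so the sketch runs without network access.
SITE = {
    "https://example.com/": '<a href="/about">About</a> <a href="/blog">Blog</a>',
    "https://example.com/about": '<a href="/">Home</a>',
    "https://example.com/blog": '<a href="/about">About</a> '
                                '<a href="https://other.example.org/">Elsewhere</a>',
}

if __name__ == "__main__":
    for page in crawl("https://example.com/", lambda url: SITE.get(url, "")):
        print(page)
```

Scrapy handles the queue, duplicate filtering and politeness settings for you, which is why it is the better choice once a crawl grows beyond a handful of pages.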
We will write code that will search for a particular term using my site’s search button, follow the links, click on them and download data.
A brief note on the legal aspects of scraping, especially if you want to build a business with it.
How will the course be delivered?
You will get 3 hours of HD-quality video. The videos are online, though you can download a copy for personal use.
You will also get all the source code, so you don’t have to type anything.
What if I hate the course?
If you are not happy for any reason, you can contact me within 30 days to get a 100% refund. If you are not satisfied, I don’t want your money.