(View Complete Item Description)
Now that you have learned how Wget can be used to mirror or download specific files from websites like ActiveHistory.ca via the command line, it’s time to expand your web-scraping skills through a few more lessons that focus on other uses for Wget’s recursive retrieval function. The following tutorial provides three examples of how Wget can be used to download large collections of documents from archival websites with assistance from the Python programing language. It will teach you how to parse and generate a list of URLs using a simple Python script, and will also introduce you to a few of Wget’s other useful features. Similar functions to the ones demonstrated in this lesson can be achieved using curl, an open-source software capable of performing automated downloads from the command line. For this lesson, however, we will focus on Wget and building your Python skills.
Material Type:
Diagram/Illustration
Author:
Kellen Kurschinski