LINUXMAKER, OpenSource, Tutorials

Website downloads using wget

With wget you can create a complete, static clone of a website. This makes it possible, for example, to provide a website for offline use.

The prerequisite is the command-line tool "wget", which is preinstalled or available as a package on practically every Linux distribution; on macOS it usually has to be installed separately, for example with Homebrew. A complete website can be downloaded with the following input:

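wget --recursive --no-clobber --page-requisites --html-extension --convert-links --domains <domain> <url>

Here <domain> and <url> stand for the domain to be mirrored and the start address. The options have the following meaning: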

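--recursive: Downloads pages recursively by following the links on each page (by default up to five levels deep).

--no-clobber: Skips files that already exist locally, so after an interrupted download, pages that have already been downloaded are not fetched again.

--page-requisites: Also downloads the content (images, style sheets, scripts) required to display the page.

--html-extension: Appends the .html suffix to downloaded HTML pages whose file names lack it, so that all pages are saved as HTML files (newer wget versions call this option --adjust-extension).

--convert-links: Converts the links so that the downloaded files link to each other (instead of to the original source on the Internet).

--domains: Only follows links to the domains specified here (a comma-separated list).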
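
Put together, the options described above might be combined as in the following sketch. The domain example.com and the start URL are placeholders chosen for illustration, and mirror_site is a hypothetical helper name. The function only prints the command, so it can be checked before anything is downloaded.

```shell
#!/bin/sh
# Placeholder values for illustration only (not from the article).
DOMAIN="example.com"
START_URL="https://www.example.com/"

# mirror_site DOMAIN URL
# Hypothetical helper: prints the wget call for review.
# Remove the leading "echo" to actually start the download.
mirror_site() {
    echo wget \
        --recursive \
        --no-clobber \
        --page-requisites \
        --html-extension \
        --convert-links \
        --domains "$1" \
        "$2"
}

mirror_site "$DOMAIN" "$START_URL"
```

By default, wget stores the downloaded files in a directory named after the host, inside the directory from which the command was started.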

Procedure for extensive websites

For particularly large websites, downloading all pages can take a long time and, above all, put a heavy load on the web server or cause the crawling computer to be blacklisted. To avoid this, the following two options can be used:

--wait=20: Waits 20 seconds between requests (can of course also be set lower).

--limit-rate=20K: Limits the download speed to 20 kilobytes per second (which would be very defensive).
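
As a rough sketch, the two throttling options can simply be added to the mirror command. The function name polite_mirror, the domain example.com and the start URL are assumptions for illustration. The wait time and rate limit are passed as optional arguments so they can be tuned, and the function only prints the command instead of running it.

```shell
#!/bin/sh
# polite_mirror DOMAIN URL [WAIT_SECONDS] [RATE]
# Hypothetical helper: prints a throttled wget call for review.
# Remove the leading "echo" to actually start the download.
polite_mirror() {
    wait_seconds="${3:-20}"   # pause between requests, default 20 seconds
    rate="${4:-20K}"          # bandwidth cap, default 20 kilobytes per second
    echo wget \
        --recursive \
        --no-clobber \
        --page-requisites \
        --html-extension \
        --convert-links \
        --wait="$wait_seconds" \
        --limit-rate="$rate" \
        --domains "$1" \
        "$2"
}

# Values from the text: 20 seconds wait, 20K rate limit.
polite_mirror "example.com" "https://www.example.com/"

# A less defensive setting (illustrative values): 2 seconds wait, 200K rate limit.
polite_mirror "example.com" "https://www.example.com/" 2 200K
```

With a 20-second wait, the pauses alone add up to roughly five and a half hours per 1,000 requests, so a lower value may be more practical for very large sites.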