wget on Windows

Allen Chee, June 10th, 2019 (Last Updated: June 10th, 2019)

Overview

This documents the steps I use to download all image (JPG) files, along with PDF and regular HTML files, from a website with a single command (wget) instead of a web browser.

Installation

Use Chocolatey (https://chocolatey.org/). Follow the installation instructions at https://chocolatey.org/install

Then open a command prompt with administrative rights to install wget:

choco install wget
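
To confirm that the install worked, it can help to check that wget is reachable on the PATH before going further. The following is a minimal sketch that assumes a POSIX-style shell on Windows (for example Git Bash or WSL, which the article itself does not use); `check_wget` is a hypothetical helper name. In a plain command prompt, running `wget --version` on its own serves the same purpose.

```shell
# check_wget is a hypothetical helper, not part of wget or Chocolatey.
# It reports whether a wget executable can be found on the PATH.
check_wget() {
  if command -v wget >/dev/null 2>&1; then
    # Print only the first line of the version banner.
    wget --version | head -n 1
  else
    echo "wget not found on PATH" >&2
    return 1
  fi
}

check_wget || echo "Install it first, e.g. with: choco install wget"
```

If the version line is printed, the installation step above succeeded and the commands in the Usage section should be available.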

Usage

My target website (say abc.com) is protected by BASIC authentication. I am only interested in downloading files with the extensions *.jpg, *.pdf and *.html. So I create a directory to hold the files, e.g. c:\abc, and then run the commands below:

cd c:\abc
wget --user-agent="Googlebot/2.1 (+https://www.googlebot.com/bot.html)" --http-user=user123 --http-password=coder4life -A "*.jpg,*.html,*.pdf" -r -l 0 https://www.abc.com/folder123/

where

--user-agent = User agent string that tells the target web server what kind of client/browser is connecting. If not specified, wget identifies itself as "Wget/version", which some web servers may block

--http-user = BASIC username

--http-password = BASIC password (plain text)

-A = Comma-separated list of file name suffixes or patterns to accept (download)

-r = Tells wget to recursively get files (search the website for all possible paths/files)

-l = How "deep" wget should go. The default is 5, meaning that from the URL https://www.abc.com/folder123/, wget can go as far as /folder123/1/2/3/4/5 and then stops looking. The command above uses the value 0, which means "infinite" (until all possible paths are traversed)
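
The options described above also have long forms (`--accept`, `--recursive`, `--level`), which can make the command easier to read back later. The sketch below assembles the same command from variables and, by default, only prints the arguments instead of contacting the server. It assumes a POSIX-style shell such as Git Bash or WSL rather than the command prompt used in this article; `run_wget` and `DRY_RUN` are hypothetical names, and the URL and credentials are the placeholder values from the example above.

```shell
# Placeholder values from the article's example; replace with your own.
URL="https://www.abc.com/folder123/"
HTTP_USER="user123"
HTTP_PASSWORD="coder4life"
ACCEPT_LIST="*.jpg,*.html,*.pdf"
USER_AGENT="Googlebot/2.1 (+https://www.googlebot.com/bot.html)"

# run_wget is a hypothetical helper. With DRY_RUN=1 (the default here) it
# prints the command's arguments, one per line, so they can be reviewed.
# With DRY_RUN=0 it actually runs wget.
run_wget() {
  set -- wget \
    --user-agent="$USER_AGENT" \
    --http-user="$HTTP_USER" \
    --http-password="$HTTP_PASSWORD" \
    --accept="$ACCEPT_LIST" \
    --recursive \
    --level=0 \
    "$URL"
  if [ "${DRY_RUN:-1}" = "1" ]; then
    printf '%s\n' "$@"
  else
    "$@"
  fi
}

run_wget
```

Once the printed arguments look right, running `DRY_RUN=0 run_wget` from inside the target directory would start the actual download.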

Published on System Code Geeks with permission by Allen Chee, partner at our SCG program. See the original article here: wget on Windows

Opinions expressed by System Code Geeks contributors are their own.


1 Comment

Rod

6 years ago

Thanks for the article. As a suggested correction, where it reads –user-agent, it should be --user-agent. The same change is also required for --http-user and --http-password.
