
About Allen Chee

Allen is a software developer working in the banking domain. Apart from hacking code and tinkering with technology, he reads a lot about history, so that mistakes of the past need not be repeated if they are remembered.

wget on Windows

Overview

This post documents the steps I use to download all image (JPG), PDF and regular HTML files from a website with a single command (wget) instead of a web browser.

Installation

Use Chocolatey (https://chocolatey.org/). Follow the installation instructions at https://chocolatey.org/install

Then open a command prompt with administrative rights to install wget:

choco install wget
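
To confirm the install worked, one option is to check that wget is now on the PATH. Running wget --version in a new command prompt does this directly; the short Python sketch below does the same check from a script. Python and the helper name find_wget are assumptions for illustration, not part of the original steps.

```python
import shutil


def find_wget():
    """Return the full path to the wget executable, or None if it is not on PATH."""
    return shutil.which("wget")


path = find_wget()
if path:
    print("wget found at:", path)
else:
    # A freshly installed tool is often only visible in a newly opened prompt.
    print("wget not found on PATH; try opening a new command prompt")
```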

Usage

My target website (say abc.com) is protected by BASIC authentication, and I am only interested in downloading files with the extensions *.jpg, *.pdf and *.html. I first create a directory to hold the files, e.g. c:\abc, and then run the commands below:

cd c:\abc
wget --user-agent="Googlebot/2.1 (+https://www.googlebot.com/bot.html)" --http-user=user123 --http-password=coder4life -A "*.jpg,*.html,*.pdf" -r -l 0 https://www.abc.com/folder123/

where

--user-agent = User-agent string that tells the target web server what kind of client/browser is connecting. If not specified, wget identifies itself as "Wget/<version>", which some web servers block

--http-user = BASIC username

--http-password = BASIC password (plain text)

-A = Accept list: a comma-separated list of file name suffixes or patterns to download

-r = Tells wget to get files recursively (follow links to find all reachable paths/files)

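-l = How "deep" wget should go. Depth is counted in links followed from the start URL, and the default is 5. On a site where each folder links only to its subfolders, that means that from https://www.abc.com/folder123/ wget goes as far as /folder123/1/2/3/4/5 and then stops looking. The command above uses the value 0, which means "infinite" (until all possible paths are traversed)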
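
To make the -A and -l rules more concrete, here is a rough Python sketch of the two checks a recursive crawl applies to each link. It is an illustration under stated assumptions, not wget's actual code: the helper names is_accepted and within_depth are hypothetical, the sample URLs are made up, and matching is case-sensitive.

```python
from fnmatch import fnmatchcase
from urllib.parse import urlparse
import posixpath

# Same patterns as the -A list in the command above.
ACCEPT = ["*.jpg", "*.html", "*.pdf"]


def is_accepted(url, patterns=ACCEPT):
    """True if the file name at the end of the URL matches any accept pattern."""
    name = posixpath.basename(urlparse(url).path)
    return any(fnmatchcase(name, pattern) for pattern in patterns)


def within_depth(depth, max_depth):
    """True if a link found `depth` hops from the start URL may be followed.

    A max_depth of 0 is treated as "no limit", mirroring -l 0.
    """
    return max_depth == 0 or depth <= max_depth


# Hypothetical links paired with the hop count at which they were found.
links = [
    ("https://www.abc.com/folder123/photo.jpg", 1),
    ("https://www.abc.com/folder123/notes.txt", 1),
    ("https://www.abc.com/folder123/1/2/3/4/5/6/report.pdf", 6),
]

for url, depth in links:
    keep_default = is_accepted(url) and within_depth(depth, 5)
    keep_unlimited = is_accepted(url) and within_depth(depth, 0)
    print(url, "| -l 5:", keep_default, "| -l 0:", keep_unlimited)
```

With the default depth of 5 the last link is skipped even though it matches *.pdf, which is why the command uses -l 0.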

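The "plain text" note on the password deserves a second look. HTTP Basic authentication only Base64-encodes the user name and password; it does not encrypt them, so the HTTPS connection is what protects them in transit. A minimal Python sketch of the header a client sends, using the example credentials from the command above (the function names are hypothetical):

```python
import base64


def basic_auth_header(user, password):
    """Build the Authorization header value for HTTP Basic authentication."""
    raw = f"{user}:{password}".encode("utf-8")
    return "Basic " + base64.b64encode(raw).decode("ascii")


def decode_basic_auth(header_value):
    """Recover (user, password) from a Basic header, showing it is not encrypted."""
    _scheme, token = header_value.split(" ", 1)
    user, _, password = base64.b64decode(token).decode("utf-8").partition(":")
    return user, password


header = basic_auth_header("user123", "coder4life")
print("Authorization:", header)
print("Decoded again:", decode_basic_auth(header))
```

Anyone who can read the header can reverse it, so it is safest to use these options only against https:// URLs.
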
Published on System Code Geeks with permission by Allen Chee, partner at our SCG program. See the original article here: wget on Windows

Opinions expressed by System Code Geeks contributors are their own.

Rod

Thanks for the article. As a suggested correction, where it reads –user-agent, it should be --user-agent. The same change is also required for --http-user and --http-password.