This documents the steps I use to download all image (JPG), PDF and regular HTML files from a website with a single wget command instead of a web browser.
First install Chocolatey (https://chocolatey.org/) by following the installation instructions at https://chocolatey.org/install
Then open a command prompt with administrative rights to install wget:
choco install wget
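To confirm that the install step worked, a quick check that wget is reachable on the PATH can help. The snippet below is only a sketch in Python (assuming Python 3 is available, which the steps above do not require); the helper name wget_version is made up for illustration.

```python
import shutil
import subprocess


def wget_version():
    """Return the first line of `wget --version`, or None if wget is not on PATH."""
    path = shutil.which("wget")
    if path is None:
        return None
    result = subprocess.run(
        [path, "--version"],
        capture_output=True,
        text=True,
        errors="replace",  # tolerate odd console encodings
        check=False,
    )
    lines = result.stdout.splitlines()
    return lines[0] if lines else None


# Prints the version banner, or a hint if wget cannot be found.
print(wget_version() or "wget not found on PATH; try opening a new command prompt")
```

If the hint is printed right after installing, opening a fresh command prompt usually picks up the updated PATH.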
My target website (say abc.com) is protected by HTTP Basic authentication. I am only interested in downloading files with the extensions *.jpg, *.pdf and *.html. First create a directory to hold the files, e.g. c:\abc, and change into it, because wget saves into the current directory. Then run the command below:
wget --user-agent="Googlebot/2.1 (+https://www.googlebot.com/bot.html)" --http-user=user123 --http-password=coder4life -A "*.jpg,*.html,*.pdf" -r https://www.abc.com/folder123/ -l 0
--user-agent = User agent string that tells the target web server what kind of client/browser is connecting. If not specified, wget identifies itself as “Wget/<version>”, which some web servers block
--http-user = Basic authentication username
--http-password = Basic authentication password (plain text)
-A = Accept list: comma-separated file name suffixes or patterns to download
-r = Tells wget to get files recursively (it follows the links it finds to reach further paths/files)
-l = How “deep” wget should go. The default is 5, meaning that from the URL https://www.abc.com/folder123/ wget follows links up to 5 levels deep (for example down to /folder123/1/2/3/4/5) and then stops looking. The command above uses the value 0, which means “infinite” (until all reachable paths are traversed)
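The options above can also be assembled programmatically, which avoids quoting problems in the command prompt. This is a minimal sketch in Python, not part of the original steps: the function names build_wget_command and download are hypothetical, and it uses wget's long option names (--accept, --recursive, --level) in place of -A, -r and -l.

```python
import subprocess


def build_wget_command(url, user, password, patterns, depth=0,
                       user_agent="Googlebot/2.1 (+https://www.googlebot.com/bot.html)"):
    """Assemble the wget argument list using long option names."""
    return [
        "wget",
        f"--user-agent={user_agent}",
        f"--http-user={user}",
        f"--http-password={password}",
        "--accept=" + ",".join(patterns),  # same as -A
        "--recursive",                     # same as -r
        f"--level={depth}",                # same as -l; 0 means unlimited depth
        url,
    ]


def download(url, user, password, patterns, target_dir, depth=0):
    """Run wget inside target_dir so the files are saved there."""
    cmd = build_wget_command(url, user, password, patterns, depth)
    # Passing a list (no shell) means the patterns need no extra quoting.
    return subprocess.run(cmd, cwd=target_dir, check=False).returncode


cmd = build_wget_command(
    "https://www.abc.com/folder123/",
    "user123",
    "coder4life",
    ["*.jpg", "*.html", "*.pdf"],
)
print(" ".join(cmd))
```

Calling download(...) with the same arguments plus the directory created earlier would start the actual transfer; the sketch only prints the command so it can be inspected first.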