[ubuntu] How to download all files (but not HTML) from a website using wget?

How do I use wget to get all the files from a website?

I need all files except webpage files like HTML, PHP, ASP, etc.

This question is related to: ubuntu, download, wget

The answers are:


I was trying to download zip files linked from Omeka's themes page, a pretty similar task. This worked for me:

wget -A zip -r -l 1 -nd http://omeka.org/add-ons/themes/
  • -A: only accept zip files
  • -r: recurse
  • -l 1: one level deep (i.e., only files directly linked from this page)
  • -nd: don't create a directory structure, just download all the files into this directory.
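
The same pattern works for other extensions. As a quick sketch, grabbing PDFs linked from a page would look like this (the URL is just a placeholder):

wget -A pdf -r -l 1 -nd http://example.com/papers/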

The answers recommending -k, -K, -E etc. have probably not really understood the question, since those options are for rewriting HTML pages to make a local structure, renaming .php files and so on. Not relevant here.

To literally get all files except .html, .php, etc.:

wget -R html,htm,php,asp,jsp,js,py,css -r -l 1 -nd http://yoursite.com

(Note that wget still has to download the HTML pages in order to parse them for links; with -R it deletes them afterwards.)
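
If you only want files below a particular path, adding --no-parent (-np) keeps wget from climbing up the tree (the URL is again a placeholder):

wget -R html,htm,php,asp -r -l 1 -nd -np http://yoursite.com/files/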

On Windows systems, in order to get wget, you may:

  1. download Cygwin
  2. download GnuWin32
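
Either way, you can check afterwards that wget is available on your PATH:

wget --version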

You may try:

wget --user-agent=Mozilla --content-disposition --mirror --convert-links -E -K -p http://example.com/

You can also add:

-A pdf,ps,djvu,tex,doc,docx,xls,xlsx,gz,ppt,mp4,avi,zip,rar

to accept only specific extensions, or to reject specific extensions:

-R html,htm,asp,php

or to exclude specific directories:

-X "search*,forum*"

If the files are blocked by robots.txt (e.g. to keep out search engines), you also have to add: -e robots=off
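
Putting those pieces together, one possible combined command (example.com is a placeholder, and the extension list is just an illustration) would be:

wget --user-agent=Mozilla --content-disposition --mirror --convert-links -E -K -p -A pdf,zip -e robots=off http://example.com/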


This downloaded the entire website for me:

wget --no-clobber --convert-links --random-wait -r -p -E -e robots=off -U mozilla http://site/path/

wget -m -p -E -k -K -np http://site/path/

The man page will tell you what those options do: -m mirrors the site, -p downloads page requisites (images, CSS, etc.), -E adds .html extensions where needed, -k converts links for local viewing, -K keeps pristine copies of converted files with a .orig suffix, and -np stops wget from ascending to the parent directory.

wget will only follow links: if there is no link to a file from the index page, then wget will not know about its existence, and hence will not download it. In other words, it helps if all files are linked to from web pages or directory indexes.
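
If you already have a list of direct file URLs, you can skip link-following entirely and feed the list to wget (urls.txt is a hypothetical file containing one URL per line):

wget -i urls.txt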


wget -m -A '*' -pk -e robots=off www.mysite.com/

This will download all types of files locally, point to them from the HTML files, and ignore the robots file. (The * is quoted so that the shell doesn't expand it before wget sees it.)


Try this; it always works for me:

wget --mirror -p --convert-links -P ./LOCAL-DIR WEBSITE-URL
