[html] How to download HTTP directory with all files and sub-directories as they appear on the online files/folders list?

There is an online HTTP directory that I have access to. I have tried to download all sub-directories and files via wget. The problem is that when wget downloads a sub-directory, it downloads the index.html file that lists the files in that directory without downloading the files themselves.

Is there a way to download the sub-directories and files without a depth limit (as if the directory I want to download were just a folder I want to copy to my computer)?


This question is related to html http get download wget

The answer is


You can use lftp, the Swiss army knife of downloading. If you have bigger files, you can add --use-pget-n=10 to the command:

lftp -c 'mirror --parallel=100 https://example.com/files/ ;exit'
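For bigger files, both options can be combined in a single invocation (example.com stands in for your own server):

lftp -c 'mirror --use-pget-n=10 --parallel=100 https://example.com/files/ ;exit'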

wget generally works this way, but some sites cause problems and it may create many unnecessary HTML files. To make this job easier and to prevent unnecessary file creation, I am sharing my getwebfolder script, which is the first Linux script I wrote for myself. This script downloads all content of a web folder passed as a parameter.

When you try to download an open web folder with wget that contains more than one file, wget downloads a file named index.html. This file contains the file list of the web folder. My script converts the file names written in the index.html file into web addresses and downloads them cleanly with wget.
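The core idea can be sketched in a few lines of shell. This is a minimal illustration of the approach, not the actual getwebfolder script:

#!/bin/bash
# Sketch: fetch the folder's index.html, extract each href,
# and download the listed files directly with wget.
url="$1"
wget -q -O - "$url" | grep -o 'href="[^"]*"' | sed 's/^href="//;s/"$//' |
while read -r name; do
  case "$name" in
    ../|/*|\?*|http://*|https://*) continue ;;  # skip parent, absolute and sort links
  esac
  # (subdirectory links would need a recursive call; omitted for brevity)
  wget "${url}${name}"
done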

Tested on Ubuntu 18.04 and Kali Linux; it may work on other distros as well.

Usage:

  • extract getwebfolder file from zip file provided below

  • chmod +x getwebfolder (only for first time)

  • ./getwebfolder webfolder_URL

such as ./getwebfolder http://example.com/example_folder/

Download Link

Details on blog


You can use this Firefox addon to download all files in an HTTP directory:

https://addons.mozilla.org/en-US/firefox/addon/http-directory-downloader/


wget is an invaluable tool and something I use myself. However, sometimes there are characters in the address that wget treats as syntax errors. I'm sure there is a fix for that, but as this question did not ask specifically about wget, I thought I would offer an alternative for those people who will undoubtedly stumble upon this page looking for a quick fix with no learning curve.
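One common culprit is the shell rather than wget itself: characters such as & and ? are shell metacharacters, so the URL needs to be quoted. A hypothetical example:

wget -r -np "http://example.com/download?file=a&type=pdf"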

There are a few browser extensions that can do this, but most require installing download managers, which aren't always free, tend to be an eyesore, and use a lot of resources. Here's one that has none of these drawbacks:

"Download Master" is an extension for Google Chrome that works great for downloading from directories. You can choose to filter which file-types to download, or download the entire directory.

https://chrome.google.com/webstore/detail/download-master/dljdacfojgikogldjffnkdcielnklkce

For an up-to-date feature list and other information, visit the project page on the developer's blog:

http://monadownloadmaster.blogspot.com/


wget -r -np -nH --cut-dirs=3 -R index.html http://hostname/aaa/bbb/ccc/ddd/

From man wget

‘-r’ ‘--recursive’ Turn on recursive retrieving. See Recursive Download, for more details. The default maximum depth is 5.

‘-np’ ‘--no-parent’ Do not ever ascend to the parent directory when retrieving recursively. This is a useful option, since it guarantees that only the files below a certain hierarchy will be downloaded. See Directory-Based Limits, for more details.

‘-nH’ ‘--no-host-directories’ Disable generation of host-prefixed directories. By default, invoking Wget with ‘-r http://fly.srk.fer.hr/’ will create a structure of directories beginning with fly.srk.fer.hr/. This option disables such behavior.

‘--cut-dirs=number’ Ignore number directory components. This is useful for getting a fine-grained control over the directory where recursive retrieval will be saved.

Take, for example, the directory at ‘ftp://ftp.xemacs.org/pub/xemacs/’. If you retrieve it with ‘-r’, it will be saved locally under ftp.xemacs.org/pub/xemacs/. While the ‘-nH’ option can remove the ftp.xemacs.org/ part, you are still stuck with pub/xemacs. This is where ‘--cut-dirs’ comes in handy; it makes Wget not “see” number remote directory components. Here are several examples of how ‘--cut-dirs’ option works.

No options       -> ftp.xemacs.org/pub/xemacs/
-nH              -> pub/xemacs/
-nH --cut-dirs=1 -> xemacs/
-nH --cut-dirs=2 -> .

--cut-dirs=1     -> ftp.xemacs.org/xemacs/
...

If you just want to get rid of the directory structure, this option is similar to a combination of ‘-nd’ and ‘-P’. However, unlike ‘-nd’, ‘--cut-dirs’ does not lose with subdirectories; for instance, with ‘-nH --cut-dirs=1’, a beta/ subdirectory will be placed to xemacs/beta, as one would expect.
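Applied to the command at the top of this answer, -nH drops the hostname/ prefix and --cut-dirs=3 drops aaa/bbb/ccc/, so downloaded files land directly under ddd/, while -R index.html rejects the generated listing pages:

http://hostname/aaa/bbb/ccc/ddd/file.txt -> ddd/file.txt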


I was able to get this to work thanks to this post utilizing VisualWGet. It worked great for me. The important part seems to be to check the -recursive flag.

Also found that the -no-parent flag is important; otherwise it will try to download everything.
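For reference, VisualWGet is a front-end for wget, so those two checkboxes correspond to plain wget flags; the equivalent command line would be something like:

wget -r -np http://example.com/folder/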



No Software or Plugin required!

(only usable if you don't need recursive depth)

Use a bookmarklet. Drag this link into your bookmarks, then edit it and paste in this code:

(function(){ var l=document.links; var ext=prompt("Select extension to download (all links containing it will be downloaded).", ".mp3"); for(var i=0; i<l.length; i++) { if(l[i].href.indexOf(ext) !== -1){ l[i].setAttribute("download",l[i].text); l[i].click(); } } })();

Then go to the page you want to download files from and click the bookmarklet.

