Multiple simultaneous downloads using Wget

Question

I'm using wget to download website content, but wget downloads the files one by one. How can I make wget download using 4 simultaneous connections?

User · Answer

Another program that can do this is axel:

    axel -n <NUMBER_OF_CONNECTIONS> URL

For basic HTTP Auth:

    axel -n <NUMBER_OF_CONNECTIONS> "user:password@https://domain.tld/path/file.ext"

See the Ubuntu man page.

User · Answer

Use aria2:

    aria2c -x 16 [url]

Here 16 (the value of -x) is the number of connections.

http://aria2.sourceforge.net

I love it!

User · Answer

Consider using regular expressions or FTP globbing. That way you can start wget multiple times, each with a different group of filename starting characters, chosen according to how often they occur.

This is, for example, how I sync a folder between two NAS:

    wget --recursive --level 0 --no-host-directories --cut-dirs 2 --no-verbose --timestamping --backups 0 --bind-address 10.0.0.10 --user <ftp_user> --password <ftp_password> "ftp://10.0.0.100/foo/bar/[0-9a-hA-H]*" --directory-prefix /volume1/foo &
    wget --recursive --level 0 --no-host-directories --cut-dirs 2 --no-verbose --timestamping --backups 0 --bind-address 10.0.0.11 --user <ftp_user> --password <ftp_password> "ftp://10.0.0.100/foo/bar/[!0-9a-hA-H]*" --directory-prefix /volume1/foo &

The first wget syncs all files/folders starting with 0, 1, 2 ... F, G, H and the second syncs everything else.

This was the easiest way to sync between a NAS with one 10G ethernet port (10.0.0.100) and a NAS with two 1G ethernet ports (10.0.0.10 and 10.0.0.11). I bound the two wget processes to the different ethernet ports with --bind-address and ran them in parallel by putting & at the end of each line. That way I was able to copy huge files at 2 x 100 MB/s = 200 MB/s in total.
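Whether two globs like these split the work evenly depends on the actual names in the folder, so it can help to check the split before starting the transfers. A minimal sketch, assuming the same character class as the two wget commands ([0-9a-hA-H] and its negation); the function name split_by_first_char and the sample names are made up for illustration:

```python
import string

# Characters matched by the first glob, [0-9a-hA-H]; anything else goes to the second.
FIRST_GROUP = set(string.digits + "abcdefgh" + "ABCDEFGH")


def split_by_first_char(names):
    """Partition names the way the two globs would: by their first character."""
    first, second = [], []
    for name in names:
        if name[:1] in FIRST_GROUP:
            first.append(name)
        else:
            second.append(name)
    return first, second


if __name__ == "__main__":
    # Hypothetical directory listing, only to show the split.
    sample = ["2021-logs", "Archive", "holiday.mkv", "music", "Zeta", "_tmp"]
    first, second = split_by_first_char(sample)
    print("wget #1:", first)
    print("wget #2:", second)
```

If one list comes out much longer (or much larger in bytes) than the other, moving the boundary letter in both globs should rebalance the two transfers.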

User · Answer

Try pcurl:

http://sourceforge.net/projects/pcurl/

It uses curl instead of wget and downloads in 10 segments in parallel.

User · Answer

make can be parallelised easily (e.g., make -j 4). For example, here's a simple Makefile I'm using to download files in parallel using wget (recipe lines must be indented with a tab):

    BASE=http://www.somewhere.com/path/to
    FILES=$(shell awk '{printf "%s.ext\n", $$1}' filelist.txt)
    LOG=download.log

    all: $(FILES)
    	echo $(FILES)

    %.ext:
    	wget -N -a $(LOG) $(BASE)/$@

    .PHONY: all
    default: all

User · Answer

A new (but not yet released) tool is Mget. It already has many of the options known from Wget and comes with a library that allows you to easily embed (recursive) downloading into your own application.

To answer your question:

    mget --num-threads=4 [url]

UPDATE: Mget is now developed as Wget2, with many bugs fixed and more features (e.g. HTTP/2 support). --num-threads is now --max-threads.

User · Answer

Use:

    aria2c -x 10 -i websites.txt >/dev/null 2>/dev/null &

In websites.txt put 1 URL per line, for example:

    https://www.example.com/1.mp4
    https://www.example.com/2.mp4
    https://www.example.com/3.mp4
    https://www.example.com/4.mp4
    https://www.example.com/5.mp4
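If the URLs come from a script rather than a hand-edited file, the list file and the command line can be generated together. A rough sketch, assuming only the -x and -i options used above; the helper names write_url_list and aria2c_command are hypothetical, and the 1 to 16 check reflects aria2's documented range for -x:

```python
from pathlib import Path


def write_url_list(urls, path="websites.txt"):
    """Write one URL per line, skipping blanks and duplicates; return the path."""
    unique = []
    for url in urls:
        url = url.strip()
        if url and url not in unique:
            unique.append(url)
    Path(path).write_text("\n".join(unique) + "\n")
    return path


def aria2c_command(list_path, connections=10):
    """Build the argument list for: aria2c -x <connections> -i <list_path>."""
    if not 1 <= connections <= 16:
        # aria2 documents 1-16 as the valid range for -x.
        raise ValueError("connections must be between 1 and 16")
    return ["aria2c", "-x", str(connections), "-i", str(list_path)]
```

The returned list can be handed to subprocess.run as is; discarding output, as the redirections above do, would be a matter of passing stdout=subprocess.DEVNULL and stderr=subprocess.DEVNULL.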

User · Answer

Call Wget for each link and set it to run in the background. I tried this Python code:

    import subprocess

    with open('links.txt', 'r') as f1:       # Open links.txt in read mode
        list_1 = f1.read().splitlines()      # Get every line in links.txt

    for i in list_1:                         # Iterate over each link
        subprocess.run(['wget', '-bq', i])   # Call wget in background mode

In a Jupyter/IPython cell the same call can be written as !wget "$i" -bq.

Parameters:

    b - Run in background
    q - Quiet mode (no output)
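Backgrounding every link at once starts as many wget processes as there are links, which is more than the 4 connections the question asks for. One way to cap it, sketched here with Python's standard thread pool: each worker runs one blocking wget, so at most max_workers downloads are active at a time. The names fetch and fetch_all and the injectable run parameter are illustrative assumptions, not part of wget:

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor


def fetch(url, run=subprocess.run):
    """Run one blocking wget for url and return its exit code."""
    # -q: quiet. No -b here, so the worker thread waits until wget finishes.
    return run(["wget", "-q", url]).returncode


def fetch_all(urls, max_workers=4, run=subprocess.run):
    """Download urls with at most max_workers wget processes at a time."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map keeps the results in the same order as urls.
        return list(pool.map(lambda u: fetch(u, run), urls))
```

With the lines of links.txt read into a list as above, fetch_all(list_1, max_workers=4) returns one exit code per link; a non-zero code means that wget call reported an error.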

User · Answer

I found (probably) a solution:

> In the process of downloading a few thousand log files from one server to the next I suddenly had the need to do some serious multithreaded downloading in BSD, preferably with Wget as that was the simplest way I could think of handling this. A little looking around led me to this little nugget:
>
>     wget -r -np -N [url] &
>     wget -r -np -N [url] &
>     wget -r -np -N [url] &
>     wget -r -np -N [url]
>
> Just repeat the wget -r -np -N [url] for as many threads as you need. Now given this isn't pretty and there are surely better ways to do this, but if you want something quick and dirty it should do the trick.

Note: the option -N makes wget download only "newer" files, which means it won't overwrite or re-download files unless their timestamp changes on the server.

User · Answer

Since GNU parallel has not been mentioned yet, let me give another way:

    cat url.list | parallel -j 8 wget -O {#}.html {}

User · Answer

They always say it depends, but when it comes to mirroring a website, the best tool is httrack. It is super fast and easy to work with. The only downside is its so-called support forum, but you can find your way using the official documentation. It has both a GUI and a CLI interface, and it supports cookies; just read the docs. Be careful with this tool: you can download the whole web onto your hard drive.

    httrack -c8 [url]

By default the maximum number of simultaneous connections is limited to 8 to avoid server overload.

User · Answer

wget can't download over multiple connections; instead you can try another program such as aria2.

User · Answer

Use xargs to make wget work on multiple files in parallel:

    #!/bin/bash

    mywget()
    {
        wget "$1"
    }

    export -f mywget

    # run wget in parallel using 8 threads/connections
    # (-I already passes one line per call, so -n 1 is not needed alongside it)
    xargs -P 8 -I {} bash -c "mywget '{}'" < list_urls.txt

Aria2 options

The right way to work with files smaller than 20 MB:

    aria2c -k 2M -x 10 -s 10 [url]

-k 2M splits the file into 2 MB chunks.

-k or --min-split-size has a default value of 20 MB. If you do not set this option and the file is under 20 MB, it will only run over a single connection no matter what the value of -x or -s is.

User · Answer

I strongly suggest using httrack.

ex:

    httrack -v -w http://example.com/

It will do a mirror with 8 simultaneous connections by default. Httrack has tons of options to play with. Have a look.

User · Answer

Wget does not support multiple socket connections in order to speed up the download of files.

I think we can do a bit better than gmarian's answer.

The correct way is to use aria2:

    aria2c -x 16 -s 16 [url]

Both 16s (the values of -x and -s) set the number of connections.

User · Answer

As other posters have mentioned, I'd suggest you have a look at aria2. From the Ubuntu man page for version 1.16.1:

> aria2 is a utility for downloading files. The supported protocols are HTTP(S), FTP, BitTorrent, and Metalink. aria2 can download a file from multiple sources/protocols and tries to utilize your maximum download bandwidth. It supports downloading a file from HTTP(S)/FTP and BitTorrent at the same time, while the data downloaded from HTTP(S)/FTP is uploaded to the BitTorrent swarm. Using Metalink's chunk checksums, aria2 automatically validates chunks of data while downloading a file like BitTorrent.

You can use the -x flag to specify the maximum number of connections per server (default: 1):

    aria2c -x 16 [url]

If the same file is available from multiple locations, you can choose to download from all of them. Use the -j flag to specify the maximum number of parallel downloads for every static URI (default: 5):

    aria2c -j 5 [url] [url2]

Have a look at http://aria2.sourceforge.net/ for more information. For usage information, the man page is really descriptive and has a section at the bottom with usage examples. An online version can be found at http://aria2.sourceforge.net/manual/en/html/README.html.
