Python equivalent of a given wget command

Question

I'm trying to create a Python function that does the same thing as this wget command:

    wget -c --read-timeout=5 --tries=0 URL

-c - Continue from where you left off if the download is interrupted.

--read-timeout=5 - If there is no new data coming in for over 5 seconds, give up and try again. Given -c, this means it will try again from where it left off.

--tries=0 - Retry forever.

Those three arguments used in tandem result in a download that cannot fail. I want to duplicate those features in my Python script, but I don't know where to begin.

User · Answer

TensorFlow makes life easier. file_path gives us the location of the downloaded file.

    import tensorflow as tf

    file_path = tf.keras.utils.get_file(
        origin='https://storage.googleapis.com/tf-datasets/titanic/train.csv',
        fname='train.csv',
        untar=False, extract=False)

User · Answer

For Windows and Python 3.x, my two cents' contribution about renaming the file on download:

Install the wget module:

    pip install wget

Use wget:

    import wget
    wget.download('Url', 'C:\\PathToMyDownloadFolder\\NewFileName.extension')

A truly working command-line example:

    python -c "import wget; wget.download(\"https://cdn.kernel.org/pub/linux/kernel/v4.x/linux-4.17.2.tar.xz\", \"C:\\Users\\TestName.TestExtension\")"

Note: 'C:\\PathToMyDownloadFolder\\NewFileName.extension' is not mandatory. By default, the file is not renamed, and the download folder is your local path.

User · Answer

I had to do something like this on a version of Linux that didn't have the right options compiled into wget. This example is for downloading the memory analysis tool 'guppy'. I'm not sure if it's important or not, but I kept the target file's name the same as the URL target name.

Here's what I came up with:

    python -c "import requests; r = requests.get('https://pypi.python.org/packages/source/g/guppy/guppy-0.1.10.tar.gz'); open('guppy-0.1.10.tar.gz', 'wb').write(r.content)"

That's the one-liner; here it is in a more readable form:

    import requests

    fname = 'guppy-0.1.10.tar.gz'
    url = 'https://pypi.python.org/packages/source/g/guppy/' + fname
    r = requests.get(url)
    open(fname, 'wb').write(r.content)

This worked for downloading a tarball. I was able to extract the package after downloading it.

EDIT: To address a question, here is an implementation with a progress bar printed to STDOUT. There is probably a more portable way to do this without the clint package, but this was tested on my machine and works fine:

    #!/usr/bin/env python

    from clint.textui import progress
    import requests

    fname = 'guppy-0.1.10.tar.gz'
    url = 'https://pypi.python.org/packages/source/g/guppy/' + fname

    # stream=True fetches the body in chunks instead of loading it all into memory
    r = requests.get(url, stream=True)
    with open(fname, 'wb') as f:
        total_length = int(r.headers.get('content-length'))
        for chunk in progress.bar(r.iter_content(chunk_size=1024), expected_size=(total_length / 1024) + 1):
            if chunk:
                f.write(chunk)
                f.flush()

User · Answer

There is also a nice Python module named wget that is pretty easy to use. Found here.

This demonstrates the simplicity of the design:

    >>> import wget
    >>> url = 'http://www.futurecrew.com/skaven/song_files/mp3/razorback.mp3'
    >>> filename = wget.download(url)
    100% [................................................] 3841532 / 3841532
    >>> filename
    'razorback.mp3'

Enjoy.

However, if wget doesn't work (I've had trouble with certain PDF files), try this solution.

Edit: You can also use the out parameter to use a custom output directory instead of the current working directory.

    >>> output_directory = <directory_name>
    >>> filename = wget.download(url, out=output_directory)
    >>> filename
    'razorback.mp3'

User · Answer

urllib.request should work. Just set it up in a "while (not done)" loop: check if a local file already exists, and if it does, send a GET with a Range header specifying how far you got in downloading the local file. Be sure to use read() to append to the local file until an error occurs.

This is also potentially a duplicate of "Python urllib2 resume download doesn't work when network reconnects".
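The loop described above could be sketched with the standard library roughly as follows. This is only an illustration under stated assumptions, not a tested drop-in replacement for wget: the function names `build_resume_request` and `download_with_resume`, the `pause` and `chunk_size` parameters, and the decision to treat HTTP 416 as "already complete" are all hypothetical choices made for this example. The `timeout` passed to `urlopen` applies to blocking socket operations in general, so it only approximates `--read-timeout`.

```python
import http.client
import os
import time
import urllib.error
import urllib.request


def build_resume_request(url, local_path):
    """Return (request, offset); the request asks only for bytes not yet on disk."""
    offset = os.path.getsize(local_path) if os.path.exists(local_path) else 0
    request = urllib.request.Request(url)
    if offset:
        # Ask the server to start at the first byte we do not have yet.
        request.add_header("Range", "bytes={}-".format(offset))
    return request, offset


def download_with_resume(url, local_path, read_timeout=5, pause=1, chunk_size=8192):
    """Retry forever (like --tries=0), resuming from the partial file (like -c)."""
    while True:
        request, offset = build_resume_request(url, local_path)
        try:
            with urllib.request.urlopen(request, timeout=read_timeout) as response:
                # 206 means the Range header was honoured; a plain 200 means the
                # server sent the whole file, so overwrite instead of appending.
                resumed = offset > 0 and response.status == 206
                expected = response.headers.get("Content-Length")
                received = 0
                with open(local_path, "ab" if resumed else "wb") as f:
                    while True:
                        chunk = response.read(chunk_size)
                        if not chunk:
                            break
                        f.write(chunk)
                        received += len(chunk)
            # Only stop when the body was not cut short.
            if expected is None or received >= int(expected):
                return local_path
        except urllib.error.HTTPError as e:
            if e.code == 416:
                # Assumption: "range not satisfiable" means nothing is left to fetch.
                return local_path
        except (OSError, http.client.HTTPException):
            # Timeouts, connection resets and URLError all end up here.
            pass
        time.sleep(pause)
```

Note that this sketch retries on every error, including ones such as 404 that will never succeed, so a real script would probably want to give up on those.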

User · Answer

Let me improve on an example with threads, in case you want to download many files.

    import math
    import random
    import threading

    import requests
    from clint.textui import progress

    # You must define a proxy list.
    # I suggest https://free-proxy-list.net/
    proxies = {
        0: {'http': 'http://34.208.47.183:80'},
        1: {'http': 'http://40.69.191.149:3128'},
        2: {'http': 'http://104.154.205.214:1080'},
        3: {'http': 'http://52.11.190.64:3128'}
    }

    # You must define the list of files you want to download.
    videos = [
        "https://i.stack.imgur.com/g2BHi.jpg",
        "https://i.stack.imgur.com/NURaP.jpg"
    ]

    downloaderses = list()


    def downloaders(video, selected_proxy):
        print("Downloading file named {} by proxy {}...".format(video, selected_proxy))
        r = requests.get(video, stream=True, proxies=selected_proxy)
        # The file name is the last component of the URL.
        nombre_video = video.split("/")[3]
        with open(nombre_video, 'wb') as f:
            total_length = int(r.headers.get('content-length'))
            for chunk in progress.bar(r.iter_content(chunk_size=1024), expected_size=(total_length / 1024) + 1):
                if chunk:
                    f.write(chunk)
                    f.flush()


    # One thread per file, each with a randomly chosen proxy.
    for video in videos:
        selected_proxy = proxies[math.floor(random.random() * len(proxies))]
        t = threading.Thread(target=downloaders, args=(video, selected_proxy))
        downloaderses.append(t)

    for _downloaders in downloaderses:
        _downloaders.start()

User · Answer

    # Python 2 (urllib2)
    import urllib2
    import time

    max_attempts = 80
    attempts = 0
    sleeptime = 10  # in seconds, no reason to continuously try if network is down

    # while True:  # possibly dangerous
    while attempts < max_attempts:
        time.sleep(sleeptime)
        try:
            response = urllib2.urlopen("http://example.com", timeout=5)
            content = response.read()
            f = open("local/index.html", 'w')
            f.write(content)
            f.close()
            break
        except urllib2.URLError as e:
            attempts += 1
            print(type(e))
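Since urllib2 exists only in Python 2, here is a rough Python 3 sketch of the same bounded-retry idea using urllib.request. The function name `fetch_with_retries` and the `opener` and `sleep` parameters are hypothetical; they are there only so the loop can be exercised without a network. Unlike the snippet above, this version sleeps only after a failed attempt.

```python
import time
import urllib.error
import urllib.request


def fetch_with_retries(url, dest, max_attempts=80, sleeptime=10, timeout=5,
                       opener=urllib.request.urlopen, sleep=time.sleep):
    """Save url to dest; return True on success, False after max_attempts failures."""
    for attempt in range(1, max_attempts + 1):
        try:
            with opener(url, timeout=timeout) as response:
                content = response.read()
            with open(dest, "wb") as f:
                f.write(content)
            return True
        except (urllib.error.URLError, OSError) as e:
            print("attempt {} failed: {!r}".format(attempt, e))
            if attempt < max_attempts:
                # No reason to hammer the server if the network is down.
                sleep(sleeptime)
    return False
```

A call might look like `fetch_with_retries("http://example.com", "index.html")`. This does not resume partial downloads; it only retries the whole request.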

User · Answer

Here's the code adapted from the torchvision library:

    import os
    import urllib.error
    import urllib.request


    def download_url(url, root, filename=None):
        """Download a file from a url and place it in root.

        Args:
            url (str): URL to download file from
            root (str): Directory to place downloaded file in
            filename (str, optional): Name to save the file under. If None, use the basename of the URL
        """

        root = os.path.expanduser(root)
        if not filename:
            filename = os.path.basename(url)
        fpath = os.path.join(root, filename)

        os.makedirs(root, exist_ok=True)

        try:
            print('Downloading ' + url + ' to ' + fpath)
            urllib.request.urlretrieve(url, fpath)
        except (urllib.error.URLError, IOError) as e:
            # Fall back to plain http if the https download failed.
            if url[:5] == 'https':
                url = url.replace('https:', 'http:')
                print('Failed download. Trying https -> http instead.'
                      ' Downloading ' + url + ' to ' + fpath)
                urllib.request.urlretrieve(url, fpath)

If you are OK with taking a dependency on the torchvision library, then you can also simply do:

    from torchvision.datasets.utils import download_url
    download_url('http://something.com/file.zip', '~/my_folder')

User · Answer

Easy as py:

    import subprocess


    class Downloder():
        def download_manager(self, url, destination='Files/DownloderApp/', try_number="10", time_out="60"):
            # To run in the background instead:
            # threading.Thread(target=self._wget_dl, args=(url, destination, try_number, time_out)).start()
            if self._wget_dl(url, destination, try_number, time_out) == 0:
                return True
            else:
                return False

        def _wget_dl(self, url, destination, try_number, time_out):
            command = ["wget", "-c", "-P", destination, "-t", try_number, "-T", time_out, url]
            download_state = None  # stays None if wget could not be started
            try:
                download_state = subprocess.call(command)
            except Exception as e:
                print(e)
            # download_state == 0 => successful download
            return download_state

User · Answer

A solution that I often find simpler and more robust is to simply execute a terminal command within Python. In your case:

    import os

    url = 'https://www.someurl.com'
    os.system(f"""wget -c --read-timeout=5 --tries=0 "{url}" """)
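One possible variation, shown here only as a sketch, is to pass the arguments to subprocess as a list so the URL is never interpreted by a shell. The helper names `build_wget_command` and `run_wget` are hypothetical, and this still assumes a wget binary is available on the PATH.

```python
import subprocess


def build_wget_command(url, read_timeout=5, tries=0):
    """Build the argument list for the wget invocation from the question."""
    return [
        "wget",
        "-c",                                      # continue a partial download
        "--read-timeout={}".format(read_timeout),  # give up on a stalled read
        "--tries={}".format(tries),                # 0 means retry forever
        url,
    ]


def run_wget(url):
    """Run wget and return its exit status (0 means success)."""
    return subprocess.run(build_wget_command(url)).returncode
```

Because no shell is involved, characters such as `&` or quotes in the URL need no escaping.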
