Downloading a picture via urllib and python

Question

So I'm trying to make a Python script that downloads webcomics and puts them in a folder on my desktop. I've found a few programs on here that do something similar, but nothing quite like what I need. The one that I found most similar is right here (http://bytes.com/topic/python/answers/850927-problem-using-urllib-download-images). I tried using this code:

    >>> import urllib
    >>> image = urllib.URLopener()
    >>> image.retrieve("http://www.gunnerkrigg.com//comics/00000001.jpg","00000001.jpg")
    ('00000001.jpg', <httplib.HTTPMessage instance at 0x1457a80>)

I then searched my computer for a file "00000001.jpg", but all I found was the cached picture of it. I'm not even sure it saved the file to my computer.

Once I understand how to get the file downloaded, I think I know how to handle the rest. Essentially just use a for loop and split the string at the '00000000'.'jpg' and increment the '00000000' up to the largest number, which I would have to somehow determine. Any recommendations on the best way to do this or how to download the file correctly?

Thanks!

EDIT 6/15/10

Here is the completed script; it saves the files to any directory you choose. For some odd reason, the files weren't downloading and then they just did. Any suggestions on how to clean it up would be much appreciated. I'm currently working out how to find out how many comics exist on the site so I can get just the latest one, rather than having the program quit after a certain number of exceptions are raised.

    import urllib
    import os

    comicCounter=len(os.listdir('/file'))+1  # reads the number of files in the folder to start downloading at the next comic
    errorCount=0

    def download_comic(url,comicName):
        """
        download a comic in the form of

        url = http://www.example.com
        comicName = '00000000.jpg'
        """
        image=urllib.URLopener()
        image.retrieve(url,comicName)  # download comicName at URL

    while comicCounter <= 1000:  # not the most elegant solution
        os.chdir('/file')  # set where files download to
        try:
            if comicCounter < 10:  # needed to break into 10^n segments because comic names are a set of zeros followed by a number
                comicNumber=str('0000000'+str(comicCounter))  # string containing the eight digit comic number
                comicName=str(comicNumber+".jpg")  # string containing the file name
                url=str("http://www.gunnerkrigg.com//comics/"+comicName)  # creates the URL for the comic
                comicCounter+=1  # increments the comic counter to go to the next comic, must be before the download in case the download raises an exception
                download_comic(url,comicName)  # uses the function defined above to download the comic
                print url
            if 10 <= comicCounter < 100:
                comicNumber=str('000000'+str(comicCounter))
                comicName=str(comicNumber+".jpg")
                url=str("http://www.gunnerkrigg.com//comics/"+comicName)
                comicCounter+=1
                download_comic(url,comicName)
                print url
            if 100 <= comicCounter < 1000:
                comicNumber=str('00000'+str(comicCounter))
                comicName=str(comicNumber+".jpg")
                url=str("http://www.gunnerkrigg.com//comics/"+comicName)
                comicCounter+=1
                download_comic(url,comicName)
                print url
            else:  # quit the program if any number outside this range shows up
                quit
        except IOError:  # urllib raises an IOError for a 404 error, when the comic doesn't exist
            errorCount+=1  # add one to the error count
            if errorCount>3:  # if more than three errors occur during downloading, quit the program
                break
            else:
                print str("comic"+ ' ' + str(comicCounter) + ' ' + "does not exist")  # otherwise say that the certain comic number doesn't exist
    print "all comics are up to date"  # prints if all comics are downloaded
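The three near-identical branches in the script exist only to pad the comic number with zeros. A minimal sketch of that naming step using `str.zfill`, assuming the eight-digit pattern from the question holds for every comic; the helper names `comic_name` and `comic_url` are made up for illustration:

```python
BASE_URL = "http://www.gunnerkrigg.com//comics/"  # URL prefix taken from the question


def comic_name(number, width=8, extension=".jpg"):
    """Return the zero-padded file name, e.g. 1 -> '00000001.jpg'."""
    return str(number).zfill(width) + extension


def comic_url(number, base=BASE_URL):
    """Return the full URL for a comic number (assumes the question's URL layout)."""
    return base + comic_name(number)


if __name__ == "__main__":
    # Show the names produced on each side of the 10**n boundaries.
    for n in (9, 10, 99, 100):
        print(comic_url(n))
```

With helpers like these, one loop body can cover every comic number instead of one branch per digit count.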

User · Answer

Maybe you need a 'User-Agent':

    import urllib2
    opener = urllib2.build_opener()
    opener.addheaders = [('User-Agent', 'Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/34.0.1847.137 Safari/537.36')]
    response = opener.open('http://google.com')
    htmlData = response.read()
    f = open('file.txt','w')
    f.write(htmlData)
    f.close()
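The snippet above is Python 2 (`urllib2`). A rough Python 3 counterpart, sketched with `urllib.request.Request`; the names `build_request` and `download` and the short `USER_AGENT` string are assumptions for illustration, not part of the answer:

```python
import urllib.request

USER_AGENT = "Mozilla/5.0"  # assumption: any browser-like string the site accepts


def build_request(url, user_agent=USER_AGENT):
    """Return a Request that carries a custom User-Agent header."""
    return urllib.request.Request(url, headers={"User-Agent": user_agent})


def download(url, filename, user_agent=USER_AGENT):
    """Fetch url with the custom header and write the body to filename."""
    with urllib.request.urlopen(build_request(url, user_agent)) as response:
        data = response.read()
    # "wb" matters: image data is binary, not text.
    with open(filename, "wb") as out:
        out.write(data)
    return filename
```

Passing the header per request avoids installing a global opener, which may be preferable when only some downloads need it.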

User · Answer

According to the Python 3.9.2 documentation for urllib.request.urlretrieve:

"The function is ported from the Python 2 module urllib (as opposed to urllib2). It might become deprecated at some point in the future."

Because of this, it might be better to use requests.get(url, params=None, **kwargs). Here is a MWE:

    import requests

    url = 'http://example.com/example.jpg'
    filename = 'example.jpg'  # local file name to save as

    response = requests.get(url)

    with open(filename, "wb") as f:
        f.write(response.content)

Refer to "Download Google's WebP Images via Take Screenshots with Selenium WebDriver".

User · Answer

Just for the record, using the requests library:

    import requests
    f = open('00000001.jpg','wb')
    f.write(requests.get('http://www.gunnerkrigg.com//comics/00000001.jpg').content)
    f.close()

Though it should check for a requests.get() error.

User · Answer

Using requests:

    import requests
    import shutil,os

    headers = {
        'user-agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/78.0.3904.108 Safari/537.36'
    }
    currentDir = os.getcwd()
    path = os.path.join(currentDir,'Images')  # saving images to Images folder
    os.makedirs(path, exist_ok=True)  # create the folder if it does not exist yet

    def ImageDl(url):
        attempts = 0
        while attempts < 5:  # retry 5 times
            try:
                filename = url.split('/')[-1]
                r = requests.get(url,headers=headers,stream=True,timeout=5)
                if r.status_code == 200:
                    with open(os.path.join(path,filename),'wb') as f:
                        r.raw.decode_content = True
                        shutil.copyfileobj(r.raw,f)
                print(filename)
                break
            except Exception as e:
                attempts+=1
                print(e)

    if __name__ == '__main__':
        url = 'http://www.gunnerkrigg.com//comics/00000001.jpg'  # example URL from the question
        ImageDl(url)

User · Answer

What about this:

    import urllib.request, urllib.error, os  # a plain "import urllib" does not load these submodules

    def from_url( url, filename = None ):
        '''Store the url content to filename'''
        if not filename:
            filename = os.path.basename( os.path.realpath(url) )

        req = urllib.request.Request( url )
        try:
            response = urllib.request.urlopen( req )
        except urllib.error.URLError as e:
            if hasattr( e, 'reason' ):
                print( 'Fail in reaching the server -> ', e.reason )
                return False
            elif hasattr( e, 'code' ):
                print( 'The server couldn\'t fulfill the request -> ', e.code )
                return False
        else:
            with open( filename, 'wb' ) as fo:
                fo.write( response.read() )
                print( 'Url saved as %s' % filename )
            return True

    ##
    def main():
        test_url = 'http://cdn.sstatic.net/stackoverflow/img/favicon.ico'
        from_url( test_url )

    if __name__ == '__main__':
        main()

User · Answer

I found this answer and edited it to be more reliable (Python 2):

    def download_photo(self, img_url, filename):
        try:
            image_on_web = urllib.urlopen(img_url)
            if image_on_web.headers.maintype == 'image':
                buf = image_on_web.read()
                path = os.getcwd() + DOWNLOADED_IMAGE_PATH
                file_path = "%s%s" % (path, filename)
                downloaded_image = file(file_path, "wb")
                downloaded_image.write(buf)
                downloaded_image.close()
                image_on_web.close()
            else:
                return False
        except:
            return False
        return True

With this you never save any non-image resources or get exceptions while downloading.
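The content-type check above relies on Python 2's `headers.maintype`. A possible Python 3 sketch of the same idea uses `get_content_maintype()` on the response headers; the function name `download_image` is hypothetical, and catching `OSError` (the base class of `URLError` and `HTTPError`) is a narrower stand-in for the bare `except`:

```python
import urllib.request


def download_image(url, filename):
    """Save url to filename only if the response declares an image type.

    Returns True on success, False for non-image content or a failed request.
    """
    try:
        with urllib.request.urlopen(url) as response:
            # The headers object is an email.message.Message, so the
            # main type ("image" in "image/jpeg") is available directly.
            if response.headers.get_content_maintype() != "image":
                return False
            data = response.read()
    except OSError:  # urllib.error.URLError and HTTPError derive from OSError
        return False
    with open(filename, "wb") as out:
        out.write(data)
    return True
```

Note that this trusts the declared Content-Type; it does not inspect the bytes themselves.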

User · Answer

    import urllib
    f = open('00000001.jpg','wb')
    f.write(urllib.urlopen('http://www.gunnerkrigg.com//comics/00000001.jpg').read())
    f.close()

User · Answer

None of the code above preserves the original image name, which is sometimes required. This saves the image to your local drive under its original name:

    IMAGE = URL.rsplit('/',1)[1]
    urllib.urlretrieve(URL, IMAGE)
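One caveat with `rsplit('/', 1)` is that any query string or fragment ends up in the file name. A small sketch that parses the URL first, assuming the last path segment is the name you want; `filename_from_url` and its `default` parameter are hypothetical names:

```python
import posixpath
from urllib.parse import unquote, urlsplit


def filename_from_url(url, default="download"):
    """Return the last path segment of url, ignoring query and fragment."""
    path = urlsplit(url).path          # drops "?query" and "#fragment"
    name = posixpath.basename(unquote(path))  # decode %20 and similar escapes
    return name or default             # URLs ending in "/" have no file name
```

The result can then be passed as the second argument to `urllib.request.urlretrieve` in Python 3.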

User · Answer

For Python 3 you will need to import urllib.request:

    import urllib.request

    urllib.request.urlretrieve(url, filename)

User · Answer

Using urllib, you can get this done instantly.

    import urllib.request

    opener=urllib.request.build_opener()
    opener.addheaders=[('User-Agent','Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/36.0.1941.0 Safari/537.36')]
    urllib.request.install_opener(opener)

    urllib.request.urlretrieve(URL, "images/0.jpg")

User · Answer

If you need proxy support you can do this (Python 2):

    if not needProxy:
        returnCode, urlReturnResponse = urllib.urlretrieve( myUrl, fullJpegPathAndName )
    else:
        proxy_support = urllib2.ProxyHandler({"https":myHttpProxyAddress})
        opener = urllib2.build_opener(proxy_support)
        urllib2.install_opener(opener)
        urlReader = urllib2.urlopen( myUrl ).read()
        with open( fullJpegPathAndName, "wb" ) as f:  # binary mode, since this is image data
            f.write( urlReader )
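In Python 3 the same handler lives in `urllib.request`. A minimal sketch, assuming one proxy address serves both HTTP and HTTPS; `build_proxy_opener` and the address in the comment are placeholders, not real settings:

```python
import urllib.request


def build_proxy_opener(proxy_address):
    """Return an opener that routes http and https requests through a proxy."""
    proxy_support = urllib.request.ProxyHandler(
        {"http": proxy_address, "https": proxy_address}
    )
    return urllib.request.build_opener(proxy_support)


# Hypothetical usage; replace the address with your own proxy:
#   opener = build_proxy_opener("http://proxy.example:8080")
#   with opener.open(my_url) as response, open(path, "wb") as out:
#       out.write(response.read())
```

Using the opener directly, rather than `install_opener`, keeps the proxy setting local to the calls that need it.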

User · Answer

This worked for me using Python 3. It gets a list of URLs from the CSV file and starts downloading them into a folder. If the content or image does not exist, it catches that exception and carries on.

    import urllib.request
    import csv
    import os

    errorCount=0

    # Replace $USER with your user name; Python does not expand it for you
    file_list = "/Users/$USER/Desktop/YOUR-FILE-TO-DOWNLOAD-IMAGES/image_{0}.jpg"

    # The CSV file must be comma-separated
    # urls.csv is expected in your current working directory; cd into it or add the corresponding path
    with open ('urls.csv') as images:
        images = csv.reader(images)
        img_count = 1
        print("Please Wait.. it will take some time")
        for image in images:
            try:
                urllib.request.urlretrieve(image[0],
                file_list.format(img_count))
                img_count += 1
            except IOError:
                errorCount+=1
                # Stop in case you reach 100 errors downloading images
                if errorCount>100:
                    break
                else:
                    print ("File does not exist")

    print ("Done")

User · Answer

Another way to do this is via the fastai library. This worked like a charm for me. I was facing an SSL: CERTIFICATE_VERIFY_FAILED error using urlretrieve, so I tried that.

    url = 'https://www.linkdoesntexist.com/lennon.jpg'
    fastai.core.download_url(url,'image1.jpg', show_progress=False)

User · Answer

It's easiest to just use .read() to read the partial or entire response, then write it into a file you've opened in a known good location.
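As a sketch of that advice in Python 3: open the response, open a file at a path you control, and copy the bytes across. The function name `save_url` and its parameters are made up for illustration; `shutil.copyfileobj` does the reading in chunks, so the whole image need not sit in memory at once:

```python
import os
import shutil
import urllib.request


def save_url(url, folder, filename):
    """Download url into folder/filename and return the absolute path."""
    os.makedirs(folder, exist_ok=True)  # make sure the target folder exists
    destination = os.path.abspath(os.path.join(folder, filename))
    with urllib.request.urlopen(url) as response, open(destination, "wb") as out:
        shutil.copyfileobj(response, out)  # repeated read()/write() in chunks
    return destination
```

Returning the absolute path makes it obvious where the file ended up, which was the sticking point in the question.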

User · Answer

Python 2

Using urllib.urlretrieve:

    import urllib
    urllib.urlretrieve("http://www.gunnerkrigg.com//comics/00000001.jpg", "00000001.jpg")

Python 3

Using urllib.request.urlretrieve (part of Python 3's legacy interface, works exactly the same):

    import urllib.request
    urllib.request.urlretrieve("http://www.gunnerkrigg.com//comics/00000001.jpg", "00000001.jpg")

User · Answer

Python 3 version of @DiGMi's answer:

    from urllib import request
    f = open('00000001.jpg', 'wb')
    f.write(request.urlopen("http://www.gunnerkrigg.com/comics/00000001.jpg").read())
    f.close()

User · Answer

Aside from suggesting you read the docs for retrieve() carefully (http://docs.python.org/library/urllib.html#urllib.URLopener.retrieve), I would suggest actually calling read() on the content of the response, and then saving it into a file of your choosing rather than leaving it in the temporary file that retrieve creates.

User · Answer

A simpler solution may be (Python 3):

    import urllib.request
    import os
    os.chdir("D:\\comic")  # your path
    i=1
    s="00000000"
    while i<1000:
        try:
            urllib.request.urlretrieve("http://www.gunnerkrigg.com//comics/"+ s[:8-len(str(i))]+ str(i)+".jpg",str(i)+".jpg")
        except:
            print("not possible" + str(i))
        i+=1
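The loop above always tries all 999 numbers. One possible variation stops once several numbers in a row fail, which roughly answers the question's wish to find the latest comic. This is only a sketch: `download_sequence`, `fetch_comic`, and the threshold of three misses are assumptions, and the URL layout is the one from the question:

```python
import urllib.request


def download_sequence(fetch, start=1, limit=1000, max_misses=3):
    """Call fetch(number) for start, start+1, ... and return the numbers that worked.

    Stops at limit, or after max_misses consecutive failures.
    """
    downloaded = []
    misses = 0
    number = start
    while number <= limit and misses < max_misses:
        try:
            fetch(number)
        except OSError:  # urllib.error.URLError and HTTPError derive from OSError
            misses += 1
        else:
            misses = 0  # a success resets the run of failures
            downloaded.append(number)
        number += 1
    return downloaded


def fetch_comic(number):
    """Download one comic into the current directory (question's URL layout)."""
    name = str(number).zfill(8) + ".jpg"
    urllib.request.urlretrieve("http://www.gunnerkrigg.com//comics/" + name, name)


# Hypothetical usage: download_sequence(fetch_comic, start=1)
```

Keeping the download step as a separate `fetch` argument makes it easy to swap in a different site or a dry run.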

User · Answer

If you know that the files are located in the same directory dir of the website site and have the following format: filename_01.jpg, ..., filename_10.jpg, then download all of them:

    import requests

    for x in range(1, 11):  # 1 through 10 inclusive
        str1 = 'filename_%2.2d.jpg' % (x)
        str2 = 'http://site/dir/filename_%2.2d.jpg' % (x)

        f = open(str1, 'wb')
        f.write(requests.get(str2).content)
        f.close()
