Download file from web in Python 3

Question

I am creating a program that will download a .jar (Java) file from a web server by reading the URL that is specified in the .jad file of the same game/application. I'm using Python 3.2.1.

I've managed to extract the URL of the JAR file from the JAD file (every JAD file contains the URL to the JAR file), but as you may imagine, the extracted value is of type str.

Here's the relevant function:

    def downloadFile(URL=None):
        import httplib2
        h = httplib2.Http(".cache")
        resp, content = h.request(URL, "GET")
        return content

    downloadFile(URL_from_file)

However, I always get an error saying that the type in the function above has to be bytes, and not str. I've tried using URL.encode('utf-8') and also bytes(URL, encoding='utf-8'), but I always get the same or a similar error.

So basically my question is: how do I download a file from a server when the URL is stored in a str?

User · Answer

I hope I understood the question right, which is: how to download a file from a server when the URL is stored in a string?

I download files and save them locally using the code below:

    import requests

    url = 'https://www.python.org/static/img/python-logo.png'
    fileName = r'D:\Python\dwnldPythonLogo.png'  # raw string, so the backslashes stay literal
    req = requests.get(url)
    file = open(fileName, 'wb')
    # write the response body to disk in chunks of 100000 bytes
    for chunk in req.iter_content(100000):
        file.write(chunk)
    file.close()

User · Answer

Motivation

Sometimes we want to get a picture but do not need to save it to a real file, i.e., we download the data and keep it in memory.

For example, suppose I use a machine learning method to train a model that can recognize an image containing a number (bar code). When I crawl websites that have such images, I can use the model to recognize them without saving the pictures to my disk drive. In that case, you can try the method below to keep the downloaded data in memory.

Points

    import requests
    from io import BytesIO

    response = requests.get(url)
    with BytesIO() as io_obj:  # BytesIO must be instantiated
        for chunk in response.iter_content(chunk_size=4096):
            io_obj.write(chunk)

Basically, this is like @Ranvijay Kumar's answer.

An Example

    import requests
    from typing import NewType, TypeVar
    from io import StringIO, BytesIO
    import matplotlib.pyplot as plt
    import imageio

    URL = NewType('URL', str)
    T_IO = TypeVar('T_IO', StringIO, BytesIO)


    def download_and_keep_on_memory(url: URL, headers=None, timeout=None, **option) -> T_IO:
        chunk_size = option.get('chunk_size', 4096)  # default 4KB
        max_size = 1024 ** 2 * option.get('max_size', -1)  # MB, default will ignore.
        response = requests.get(url, headers=headers, timeout=timeout)
        if response.status_code != 200:
            raise requests.ConnectionError(f'{response.status_code}')

        instance_io = StringIO if isinstance(next(response.iter_content(chunk_size=1)), str) else BytesIO
        io_obj = instance_io()
        cur_size = 0
        for chunk in response.iter_content(chunk_size=chunk_size):
            cur_size += chunk_size
            if 0 < max_size < cur_size:
                break
            io_obj.write(chunk)
        io_obj.seek(0)
        # To save it to a real file instead:
        # with open('temp.png', mode='wb') as out_f:
        #     out_f.write(io_obj.read())
        return io_obj


    def main():
        headers = {
            'Accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3',
            'Accept-Encoding': 'gzip, deflate',
            'Accept-Language': 'zh-TW,zh;q=0.9,en-US;q=0.8,en;q=0.7',
            'Cache-Control': 'max-age=0',
            'Connection': 'keep-alive',
            'Host': 'statics.591.com.tw',
            'Upgrade-Insecure-Requests': '1',
            'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/78.0.3904.87 Safari/537.36'
        }
        io_img = download_and_keep_on_memory(URL('http://statics.591.com.tw/tools/showPhone.php?info_data=rLsGZe4U%2FbphHOimi2PT%2FhxTPqI&type=rLEFMu4XrrpgEw'),
                                             headers,  # You may need this. Otherwise, some websites will send the 404 error to you.
                                             max_size=4)  # max loading < 4MB
        with io_img:
            plt.rc('axes.spines', top=False, bottom=False, left=False, right=False)
            plt.rc(('xtick', 'ytick'), color=(1, 1, 1, 0))  # same as plt.axis('off')
            plt.imshow(imageio.imread(io_img, as_gray=False, pilmode="RGB"))
            plt.show()


    if __name__ == '__main__':
        main()
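
The size cap in the example above counts chunk_size per iteration, which can overestimate the last, shorter chunk. As a minimal sketch of the same idea using only the standard library, the helper below (read_chunks_into_memory and max_bytes are hypothetical names, not part of requests) accepts any iterable of bytes chunks, such as response.iter_content(chunk_size=4096), and counts the real length of each chunk. Unlike the example above, it raises instead of silently truncating; that is a design assumption, not the original author's behaviour.

```python
from io import BytesIO


def read_chunks_into_memory(chunks, max_bytes=None):
    """Collect an iterable of bytes chunks into a BytesIO buffer.

    `chunks` can be any iterable of bytes objects (hypothetical helper;
    with requests it could be response.iter_content(chunk_size=4096)).
    Raises ValueError if more than `max_bytes` bytes arrive.
    """
    buffer = BytesIO()
    total = 0
    for chunk in chunks:
        total += len(chunk)  # count what was actually received
        if max_bytes is not None and total > max_bytes:
            raise ValueError(f"download exceeds {max_bytes} bytes")
        buffer.write(chunk)
    buffer.seek(0)  # rewind so the caller can read from the start
    return buffer
```

The returned buffer can be passed to anything that accepts a binary file-like object.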

User · Answer

I use the requests package whenever I need anything related to HTTP requests because its API is very easy to get started with.

First, install requests:

    $ pip install requests

Then the code:

    from requests import get  # to make GET request


    def download(url, file_name):
        # open in binary mode
        with open(file_name, "wb") as file:
            # get request
            response = get(url)
            # write to file
            file.write(response.content)

User · Answer

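    from urllib import request


    def get(url):
        # return the whole response body as bytes
        with request.urlopen(url) as r:
            return r.read()


    def download(url, file=None):
        # default to the last path segment of the URL as the file name
        if not file:
            file = url.split('/')[-1]
        with open(file, 'wb') as f:
            f.write(get(url))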
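
Taking the last segment after the final slash works for simple URLs, but it keeps any query string and returns an empty name for URLs that end in a slash. A possible refinement, sketched with urllib.parse (filename_from_url and the default name "download" are assumptions for illustration):

```python
from pathlib import PurePosixPath
from urllib.parse import unquote, urlsplit


def filename_from_url(url, default="download"):
    """Guess a local file name from the path component of a URL.

    Hypothetical helper: the query string and fragment are ignored,
    percent-escapes are decoded, and `default` is used when the path
    has no final segment.
    """
    path = unquote(urlsplit(url).path)
    name = PurePosixPath(path).name
    return name or default
```

For example, filename_from_url('https://www.python.org/static/img/python-logo.png') gives 'python-logo.png', and a trailing '?fit=650%2C350' would be dropped.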

User · Answer

Here we can use urllib's legacy interface in Python 3:

The following functions and classes are ported from the Python 2 module urllib (as opposed to urllib2). They might become deprecated at some point in the future.

Example (2 lines of code):

    import urllib.request

    url = 'https://www.python.org/static/img/python-logo.png'
    urllib.request.urlretrieve(url, "logo.png")

User · Answer

If you are using Linux, you can call the wget command-line tool from Python. Here is a sample code snippet:

    import os

    url = 'http://www.example.com/foo.zip'
    os.system('wget %s' % url)
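
One caveat with os.system is that the URL is interpolated into a shell command line, so characters such as & or ; in the URL are interpreted by the shell. A hedged alternative is to pass the arguments as a list to subprocess.run, which does not involve a shell. The function names below are hypothetical, and the sketch assumes the wget executable is installed and on PATH:

```python
import subprocess


def build_wget_command(url, output=None):
    """Build the argument list for wget (hypothetical helper)."""
    command = ["wget"]
    if output is not None:
        command += ["-O", output]  # -O writes the download to this file
    command.append(url)
    return command


def download_with_wget(url, output=None):
    """Run wget without a shell; raises CalledProcessError on failure."""
    subprocess.run(build_wget_command(url, output), check=True)
```

Calling download_with_wget('http://www.example.com/foo.zip') would then behave like the snippet above, without shell quoting concerns.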

User · Answer

If you want to obtain the contents of a web page into a variable, just read the response of urllib.request.urlopen:

    import urllib.request
    ...
    url = 'http://example.com/'
    response = urllib.request.urlopen(url)
    data = response.read()       # a `bytes` object
    text = data.decode('utf-8')  # a `str`; this step can't be used if data is binary

The easiest way to download and save a file is to use the urllib.request.urlretrieve function:

    import urllib.request
    ...
    # Download the file from `url` and save it locally under `file_name`:
    urllib.request.urlretrieve(url, file_name)

    import urllib.request
    ...
    # Download the file from `url`, save it in a temporary directory and get the
    # path to it (e.g. '/tmp/tmpb48zma.txt') in the `file_name` variable:
    file_name, headers = urllib.request.urlretrieve(url)

But keep in mind that urlretrieve is considered legacy and might become deprecated (not sure why, though).

So the most correct way to do this would be to use the urllib.request.urlopen function to return a file-like object that represents an HTTP response and copy it to a real file using shutil.copyfileobj:

    import urllib.request
    import shutil
    ...
    # Download the file from `url` and save it locally under `file_name`:
    with urllib.request.urlopen(url) as response, open(file_name, 'wb') as out_file:
        shutil.copyfileobj(response, out_file)

If this seems too complicated, you may want to go simpler and store the whole download in a bytes object and then write it to a file. But this works well only for small files.

    import urllib.request
    ...
    # Download the file from `url` and save it locally under `file_name`:
    with urllib.request.urlopen(url) as response, open(file_name, 'wb') as out_file:
        data = response.read()  # a `bytes` object
        out_file.write(data)

It is possible to extract .gz (and maybe other formats) compressed data on the fly, but such an operation probably requires the HTTP server to support random access to the file.

    import urllib.request
    import gzip
    ...
    # Read the first 64 bytes of the file inside the .gz archive located at `url`
    url = 'http://example.com/something.gz'
    with urllib.request.urlopen(url) as response:
        with gzip.GzipFile(fileobj=response) as uncompressed:
            file_header = uncompressed.read(64)  # a `bytes` object
            # Or do anything shown above using `uncompressed` instead of `response`.
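
If a download is interrupted, the urlopen plus shutil.copyfileobj approach above can leave a truncated file under the final name. One way around that, sketched here as an assumption rather than as part of the answer, is to write to a temporary ".part" file and rename it only after the copy succeeds. The function name download, the ".part" suffix, and the 30-second timeout are illustrative choices. Note that the URL is passed as a plain str, which is what urlopen expects.

```python
import os
import shutil
import urllib.request


def download(url, file_name, timeout=30):
    """Copy the response for `url` (a str) to `file_name`.

    Data is first written to `file_name + ".part"` (an assumed naming
    scheme) and renamed on success, so a failed download does not
    leave a partial file under the final name.
    """
    part_name = file_name + ".part"
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response, \
                open(part_name, "wb") as out_file:
            shutil.copyfileobj(response, out_file)
        os.replace(part_name, file_name)  # rename only after a full copy
    except BaseException:
        # clean up the partial file, then re-raise the original error
        if os.path.exists(part_name):
            os.remove(part_name)
        raise
    return file_name
```

Since urlopen also understands file:// URLs, the function can be tried out locally without a web server.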

User · Answer

You can use the wget package, a Python module named after the popular shell download tool: https://pypi.python.org/pypi/wget

This is the simplest method, since you do not need to open the destination file yourself. Here is an example:

    import wget

    url = 'https://i1.wp.com/python3.codes/wp-content/uploads/2015/06/Python3-powered.png?fit=650%2C350'
    wget.download(url, '/Users/scott/Downloads/cat4.jpg')

User · Answer

Yes, requests is definitely a great package to use for anything related to HTTP requests, but we also need to be careful with the encoding of the incoming data. Below is an example that explains the difference:

    from requests import get

    # case when the response is a byte array
    url = 'some_image_url'

    response = get(url)
    with open('output', 'wb') as file:
        file.write(response.content)  # raw bytes


    # case when the response is text
    # If the response encoding is reported as iso-8859-1 (unlikely to be the
    # real encoding), we have to override the response encoding
    url = 'some_page_url'

    response = get(url)
    # override encoding with the educated guess provided by chardet
    response.encoding = response.apparent_encoding

    with open('output', 'w', encoding='utf-8') as file:
        file.write(response.text)  # decoded str, not the raw bytes
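
The bytes-versus-text distinction can also be illustrated without any third-party package. The sketch below is only an approximation of the idea, not how requests or chardet work internally: decode_body and its fallback order are assumptions. It tries a declared charset first, then UTF-8, then ISO-8859-1.

```python
def decode_body(data, declared=None, fallbacks=("utf-8", "iso-8859-1")):
    """Decode a bytes payload to str, trying several encodings in order.

    Hypothetical helper: `declared` would typically come from the
    charset in the Content-Type header, if the server sent one.
    """
    candidates = ([declared] if declared else []) + list(fallbacks)
    for encoding in candidates:
        try:
            return data.decode(encoding)
        except (UnicodeDecodeError, LookupError):
            continue  # wrong or unknown encoding, try the next one
    # last resort: keep going, marking undecodable bytes
    return data.decode("utf-8", errors="replace")
```

Binary downloads such as images should skip this step entirely and be written in 'wb' mode.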
