[python] Basic http file downloading and saving to disk in python?

I'm new to Python and I've been going through the Q&A on this site, for an answer to my question. However, I'm a beginner and I find it difficult to understand some of the solutions. I need a very basic solution.

Could someone please explain a simple solution to 'Downloading a file through http' and 'Saving it to disk, in Windows', to me?

I'm not sure how to use shutil and os modules, either.

The file I want to download is under 500 MB and is an .gz archive file.If someone can explain how to extract the archive and utilise the files in it also, that would be great!

Here's a partial solution, that I wrote from various answers combined:

import requests
import os
import shutil

global dump

def download_file():
    global dump
    url = "http://randomsite.com/file.gz"
    file = requests.get(url, stream=True)
    dump = file.raw

def save_file():
    global dump
    location = os.path.abspath("D:\folder\file.gz")
    with open("file.gz", 'wb') as location:
        shutil.copyfileobj(dump, location)
    del dump

Could someone point out errors (beginner level) and explain any easier methods to do this?

Thanks!

This question is related to python file download save

The answer is


Exotic Windows Solution

import subprocess

subprocess.run("powershell Invoke-WebRequest {} -OutFile {}".format(your_url, filename), shell=True)

I started down this path because ESXi's wget is not compiled with SSL and I wanted to download an OVA from a vendor's website directly onto the ESXi host which is on the other side of the world.

I had to disable the firewall(lazy)/enable https out by editing the rules(proper)

created the python script:

import ssl
import shutil
import tempfile
import urllib.request
context = ssl._create_unverified_context()

dlurl='https://somesite/path/whatever'
with urllib.request.urlopen(durl, context=context) as response:
    with open("file.ova", 'wb') as tmp_file:
        shutil.copyfileobj(response, tmp_file)

ESXi libraries are kind of paired down but the open source weasel installer seemed to use urllib for https... so it inspired me to go down this path


Four methods using wget, urllib and request.

#!/usr/bin/python
import requests
from StringIO import StringIO
from PIL import Image
import profile as profile
import urllib
import wget


url = 'https://tinypng.com/images/social/website.jpg'

def testRequest():
    image_name = 'test1.jpg'
    r = requests.get(url, stream=True)
    with open(image_name, 'wb') as f:
        for chunk in r.iter_content():
            f.write(chunk)

def testRequest2():
    image_name = 'test2.jpg'
    r = requests.get(url)
    i = Image.open(StringIO(r.content))
    i.save(image_name)

def testUrllib():
    image_name = 'test3.jpg'
    testfile = urllib.URLopener()
    testfile.retrieve(url, image_name)

def testwget():
    image_name = 'test4.jpg'
    wget.download(url, image_name)

if __name__ == '__main__':
    profile.run('testRequest()')
    profile.run('testRequest2()')
    profile.run('testUrllib()')
    profile.run('testwget()')

testRequest - 4469882 function calls (4469842 primitive calls) in 20.236 seconds

testRequest2 - 8580 function calls (8574 primitive calls) in 0.072 seconds

testUrllib - 3810 function calls (3775 primitive calls) in 0.036 seconds

testwget - 3489 function calls in 0.020 seconds


Another clean way to save the file is this:

import csv
import urllib

urllib.retrieve("your url goes here" , "output.csv")

For Python3+ URLopener is deprecated. And when used you will get error as below:

url_opener = urllib.URLopener() AttributeError: module 'urllib' has no attribute 'URLopener'

So, try:

import urllib.request 
urllib.request.urlretrieve(url, filename)

As mentioned here:

import urllib
urllib.urlretrieve ("http://randomsite.com/file.gz", "file.gz")

EDIT: If you still want to use requests, take a look at this question or this one.


I use wget.

Simple and good library if you want to example?

import wget

file_url = 'http://johndoe.com/download.zip'

file_name = wget.download(file_url)

wget module support python 2 and python 3 versions


import urllib
urllib.request.urlretrieve("https://raw.githubusercontent.com/dnishimoto/python-deep-learning/master/list%20iterators%20and%20generators.ipynb", "test.ipynb")

downloads a single raw juypter notebook to file.


For text files, you can use:

import requests

url = 'https://WEBSITE.com'
req = requests.get(url)
path = "C:\\YOUR\\FILE.html"

with open(path, 'wb') as f:
    f.write(req.content)

Examples related to python

programming a servo thru a barometer Is there a way to view two blocks of code from the same file simultaneously in Sublime Text? python variable NameError Why my regexp for hyphenated words doesn't work? Comparing a variable with a string python not working when redirecting from bash script is it possible to add colors to python output? Get Public URL for File - Google Cloud Storage - App Engine (Python) Real time face detection OpenCV, Python xlrd.biffh.XLRDError: Excel xlsx file; not supported Could not load dynamic library 'cudart64_101.dll' on tensorflow CPU-only installation

Examples related to file

Gradle - Move a folder from ABC to XYZ Difference between opening a file in binary vs text Angular: How to download a file from HttpClient? Python error message io.UnsupportedOperation: not readable java.io.FileNotFoundException: class path resource cannot be opened because it does not exist Writing JSON object to a JSON file with fs.writeFileSync How to read/write files in .Net Core? How to write to a CSV line by line? Writing a dictionary to a text file? What are the pros and cons of parquet format compared to other formats?

Examples related to download

how to download file in react js How do I download a file with Angular2 or greater Unknown URL content://downloads/my_downloads python save image from url How to download a file using a Java REST service and a data stream How to download file in swift? Where can I download Eclipse Android bundle? How to download image from url android download pdf from url then open it with a pdf reader Flask Download a File

Examples related to save

What is the difference between --save and --save-dev? Basic http file downloading and saving to disk in python? Save byte array to file Saving Excel workbook to constant path with filename from two fields PHP - get base64 img string decode and save as jpg (resulting empty image ) How to edit/save a file through Ubuntu Terminal Write and read a list from file How to do a "Save As" in vba code, saving my current Excel workbook with datestamp? How to dump raw RTSP stream to file? Saving images in Python at a very high quality