Use the zipfile module. To extract a file from a URL, you'll need to wrap the result of a urlopen call in a BytesIO object. This is because the result of a web request returned by urlopen doesn't support seeking:
from urllib.request import urlopen
from io import BytesIO
from zipfile import ZipFile

zip_url = 'http://example.com/my_file.zip'

with urlopen(zip_url) as f:
    # Read the whole response into memory so ZipFile gets a seekable object
    with BytesIO(f.read()) as b, ZipFile(b) as myzipfile:
        with myzipfile.open('foo.txt') as foofile:
            print(foofile.read())  # bytes; call .decode('utf-8') for text
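If you want to try the BytesIO part without hitting the network, here is a minimal sketch. It builds a small archive in memory to stand in for the bytes that f.read() would return; the helper name read_member and the sample contents are made up for illustration.

```python
from io import BytesIO
from zipfile import ZipFile


def read_member(zip_bytes, member_name):
    """Return the raw bytes of one member from a zip archive held in memory."""
    # BytesIO gives ZipFile the seekable file object it needs.
    with BytesIO(zip_bytes) as buffer, ZipFile(buffer) as archive:
        with archive.open(member_name) as member:
            return member.read()


# Build a tiny archive in memory; this stands in for a downloaded response body.
payload = BytesIO()
with ZipFile(payload, 'w') as archive:
    archive.writestr('foo.txt', 'hello from foo')

content = read_member(payload.getvalue(), 'foo.txt')
print(content.decode('utf-8'))
```

The same helper should work on the bytes returned by a real urlopen(...).read() call, since it only needs the raw archive bytes.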
If you already have the file downloaded locally, you don't need BytesIO; just open it in binary mode and pass it to ZipFile directly:
from zipfile import ZipFile

zip_filename = 'my_file.zip'

with open(zip_filename, 'rb') as f:
    with ZipFile(f) as myzipfile:
        with myzipfile.open('foo.txt') as foofile:
            # Members are read as bytes, so decode them to get text
            print(foofile.read().decode('utf-8'))
Again, note that you have to open the file in binary ('rb') mode, not as text, or you'll get a zipfile.BadZipFile: File is not a zip file error.
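To see the difference between the two modes for yourself, something like the following sketch should do. It writes a throwaway archive to a temporary directory and tries to open it both ways; can_open_as_zip is a hypothetical helper, not part of zipfile, and the exact behaviour in text mode may depend on your Python version.

```python
import os
import tempfile
from zipfile import BadZipFile, ZipFile


def can_open_as_zip(path, mode):
    """Return True if ZipFile accepts a file object opened with the given mode."""
    try:
        with open(path, mode) as f, ZipFile(f) as archive:
            archive.namelist()
        return True
    except BadZipFile:
        return False


with tempfile.TemporaryDirectory() as tmp_dir:
    # Create a small archive on disk to test against.
    zip_path = os.path.join(tmp_dir, 'my_file.zip')
    with ZipFile(zip_path, 'w') as archive:
        archive.writestr('foo.txt', 'hello')

    binary_ok = can_open_as_zip(zip_path, 'rb')  # binary mode
    text_ok = can_open_as_zip(zip_path, 'r')     # text mode

print(binary_ok, text_ok)
```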
It's good practice to use all these things as context managers with the with statement, so that they'll be closed properly.
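That applies to the member you open inside the archive as well. Here is a rough sketch where every layer (the file on disk, the archive, the member, and a text wrapper around it) is managed by with. The function name read_lines and the sample file are illustrative only, and utf-8 is an assumed encoding.

```python
import os
import tempfile
from io import TextIOWrapper
from zipfile import ZipFile


def read_lines(zip_path, member_name, encoding='utf-8'):
    """Return the lines of a text member, closing every handle on the way out."""
    with open(zip_path, 'rb') as f, ZipFile(f) as archive:
        with archive.open(member_name) as member:
            # TextIOWrapper decodes the member's bytes into str line by line.
            with TextIOWrapper(member, encoding=encoding) as text:
                return [line.rstrip('\n') for line in text]


with tempfile.TemporaryDirectory() as tmp_dir:
    # Create a small archive on disk to read back.
    zip_path = os.path.join(tmp_dir, 'my_file.zip')
    with ZipFile(zip_path, 'w') as archive:
        archive.writestr('foo.txt', 'first\nsecond\n')

    lines = read_lines(zip_path, 'foo.txt')

print(lines)
```

Wrapping the member in TextIOWrapper is just one way to get text; calling .decode() on the bytes works too if the file is small enough to read in one go.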