[python] Given a URL to a text file, what is the simplest way to read the contents of the text file?

In Python, when given the URL for a text file, what is the simplest way to access the contents of the text file and print them out locally line-by-line, without saving a local copy of the file?

TargetURL = "http://www.myhost.com/SomeFile.txt"
#read the file
#print first line
#print second line
#etc

This question is related to python

The answers are:


There's really no need to read line-by-line. You can get the whole thing like this (Python 2):

import urllib
txt = urllib.urlopen(target_url).read()
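
In Python 3 the module layout changed (urllib2 was merged into urllib.request), so the same one-liner becomes the following; note that read() returns bytes there:

import urllib.request
txt = urllib.request.urlopen(target_url).read()  # bytes; call .decode() if you need str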

For me, none of the above responses worked out of the box. Instead, I had to do the following (Python 3):

from urllib.request import urlopen

data = urlopen("[your url goes here]").read().decode('utf-8')

# Do what you need to do with the data.
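
To satisfy the original line-by-line requirement, you can then split the decoded string; a minimal sketch using the question's URL:

from urllib.request import urlopen

target_url = "http://www.myhost.com/SomeFile.txt"
data = urlopen(target_url).read().decode('utf-8')

# splitlines() handles both \n and \r\n line endings
for line in data.splitlines():
    print(line)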

I'm a newbie to Python and the offhand comment about Python 3 in the accepted solution was confusing. For posterity, the code to do this in Python 3 is

import urllib.request
data = urllib.request.urlopen(target_url)

for line in data:
    # note: in Python 3 each line is a bytes object, not str
    ...

or alternatively

from urllib.request import urlopen
data = urlopen(target_url)

Note that just import urllib does not work: the urllib.request submodule must be imported explicitly.
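
Putting it together, a minimal sketch that prints the file line-by-line in Python 3 (each line comes back as bytes; the UTF-8 assumption is mine):

import urllib.request

target_url = "http://www.myhost.com/SomeFile.txt"

for line in urllib.request.urlopen(target_url):
    # each line is a bytes object; decode it and strip the trailing newline
    print(line.decode('utf-8').rstrip('\n'))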


The requests package, as @Andrew Mao suggested, works really well and has a simple interface:

import requests
response = requests.get('http://lib.stat.cmu.edu/datasets/boston')
data = response.text
for i, line in enumerate(data.split('\n')):
    print(f'{i}   {line}')

Output:

0    The Boston house-price data of Harrison, D. and Rubinfeld, D.L. 'Hedonic
1    prices and the demand for clean air', J. Environ. Economics & Management,
2    vol.5, 81-102, 1978.   Used in Belsley, Kuh & Welsch, 'Regression diagnostics
3    ...', Wiley, 1980.   N.B. Various transformations are used in the table on
4    pages 244-261 of the latter.
5   
6    Variables in order:

Check out the Kaggle notebook on how to extract a dataset/dataframe from a URL.
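
If the goal is a dataframe, pandas can read straight from a URL without saving a local copy; a minimal sketch, assuming the URL points to a well-formed CSV file (the URL below is a hypothetical placeholder):

import pandas as pd

# hypothetical CSV URL; pandas fetches it directly, no local copy is saved
df = pd.read_csv("https://example.com/data.csv")
print(df.head())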


I do think requests is the best option. Also note that you can set the encoding manually if needed.

import requests
response = requests.get("http://www.gutenberg.org/files/10/10-0.txt")
# response.encoding = "utf-8"  # uncomment to force a specific encoding
text = response.text
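
If you are unsure which encoding the server really used, requests can also guess one from the body itself via apparent_encoding; a small sketch:

import requests

response = requests.get("http://www.gutenberg.org/files/10/10-0.txt")
# the charset from the HTTP headers can be wrong; apparent_encoding
# is detected from the response body and is often more reliable
response.encoding = response.apparent_encoding
text = response.text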

Another way in Python 3 is to use the urllib3 package.

import urllib3

http = urllib3.PoolManager()
response = http.request('GET', target_url)
data = response.data.decode('utf-8')

This can be a better option than urllib since urllib3 boasts having

  • Thread safety.
  • Connection pooling.
  • Client-side SSL/TLS verification.
  • File uploads with multipart encoding.
  • Helpers for retrying requests and dealing with HTTP redirects (see the sketch after this list).
  • Support for gzip and deflate encoding.
  • Proxy support for HTTP and SOCKS.
  • 100% test coverage.
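
For example, the retry/redirect helpers from the list above can be configured per request; a minimal sketch (the retry counts are illustrative):

import urllib3
from urllib3.util.retry import Retry

http = urllib3.PoolManager()
# retry up to 3 times in total, following at most 2 redirects
response = http.request(
    'GET',
    'http://www.myhost.com/SomeFile.txt',
    retries=Retry(total=3, redirect=2),
)
data = response.data.decode('utf-8')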

This also works in Python 2, where urllib2 provides urlopen (urllib2 was merged into urllib.request in Python 3):

import urllib2
for line in urllib2.urlopen("http://www.myhost.com/SomeFile.txt"):
    print line

The requests library has a simpler interface and works with both Python 2 and 3.

import requests

response = requests.get(target_url)
data = response.text
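
For large files, requests can also stream the response line-by-line instead of loading the whole body at once; a sketch, with decode_unicode=True so lines come back as str:

import requests

target_url = "http://www.myhost.com/SomeFile.txt"

response = requests.get(target_url, stream=True)
for line in response.iter_lines(decode_unicode=True):
    print(line)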

Using urllib2 (Python 2):

import urllib2

f = urllib2.urlopen(target_url)
# readlines() loads the entire file into memory;
# iterating over f directly streams it line-by-line instead
for l in f.readlines():
    print l

Just updating the solution suggested by @ken-kinder for Python 2 so that it works in Python 3:

import urllib.request
urllib.request.urlopen(target_url).read()