[python] Get webpage contents with Python?

I'm using Python 3.1, if that helps.

Anyway, I'm trying to get the contents of this webpage. I Googled for a bit and tried different things, but they didn't work. I'm guessing this should be an easy task, but I can't get it.

Results of urllib, urllib2:

>>> import urllib2
Traceback (most recent call last):
  File "<pyshell#0>", line 1, in <module>
    import urllib2
ImportError: No module named urllib2
>>> import urllib
>>> urllib.urlopen("http://www.python.org")
Traceback (most recent call last):
  File "<pyshell#2>", line 1, in <module>
    urllib.urlopen("http://www.python.org")
AttributeError: 'module' object has no attribute 'urlopen'
>>> 

Python 3 solution

Thank you, Jason. :D.

import urllib.request
page = urllib.request.urlopen('http://services.runescape.com/m=hiscore/ranking?table=0&category_type=0&time_filter=0&date=1519066080774&user=zezima')
print(page.read())

The answer is


A solution that works with both Python 2.x and Python 3.x:

try:
    # For Python 3.0 and later
    from urllib.request import urlopen
except ImportError:
    # Fall back to Python 2's urllib2
    from urllib2 import urlopen

url = 'http://hiscore.runescape.com/index_lite.ws?player=zezima'
response = urlopen(url)
data = str(response.read())
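
Note that in Python 3, response.read() returns bytes, so wrapping it in str() gives you a string that looks like "b'...'" rather than the page text. A small follow-up sketch, assuming the page is UTF-8 encoded, that decodes the body instead:

from urllib.request import urlopen  # Python 3

url = 'http://hiscore.runescape.com/index_lite.ws?player=zezima'
response = urlopen(url)
# decode the raw bytes to text instead of wrapping them in str()
data = response.read().decode('utf-8')
print(data)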

Suppose you want to GET a webpage's content with Python 2. The following code does it:

# -*- coding: utf-8 -*-
# python

# example of getting a web page

from urllib import urlopen
print urlopen("http://xahlee.info/python/python_index.html").read()

If you ask me, try this one (note that urllib2 only exists in Python 2):

import urllib2
resp = urllib2.urlopen('http://hiscore.runescape.com/index_lite.ws?player=zezima')

and read it the normal way, i.e.

page = resp.read()

Good luck, though.


You can use urllib2 and parse the HTML yourself.

Or try Beautiful Soup to do some of the parsing for you.
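
For example, a minimal sketch with Python 3 and the beautifulsoup4 package installed (the URL here is just for illustration):

import urllib.request
from bs4 import BeautifulSoup

html = urllib.request.urlopen('http://www.python.org').read()
soup = BeautifulSoup(html, 'html.parser')
print(soup.title.string)           # the page title
for link in soup.find_all('a'):    # every link target in the document
    print(link.get('href'))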


The best way to do this these days is to use the 'requests' library:

import requests
response = requests.get('http://hiscore.runescape.com/index_lite.ws?player=zezima')
print(response.status_code)
print(response.content)
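
If you want the body decoded to text rather than raw bytes, requests does that for you:

import requests

response = requests.get('http://hiscore.runescape.com/index_lite.ws?player=zezima')
response.raise_for_status()   # raise an exception for 4xx/5xx responses
print(response.text)          # body decoded to str using the detected encoding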

Mechanize is a great package for "acting like a browser", if you want to handle cookie state, etc.

http://wwwsearch.sourceforge.net/mechanize/
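
A minimal sketch of what that looks like, assuming the mechanize package is installed (classic mechanize targets Python 2; a Python 3 compatible release also exists):

import mechanize

br = mechanize.Browser()
br.set_handle_robots(False)   # skip robots.txt handling for this example
response = br.open('http://hiscore.runescape.com/index_lite.ws?player=zezima')
print(response.read())        # cookies picked up along the way stay in the Browser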


You can also use the faster_than_requests package. It's very fast and simple:

import faster_than_requests as r
content = r.get2str("http://test.com/")
