[python] Python, HTTPS GET with basic authentication

Im trying to do a HTTPS GET with basic authentication using python. Im very new to python and the guides seem to use diffrent librarys to do things. (http.client, httplib and urllib). Can anyone show me how its done? How can you tell the standard library to use?

This question is related to python python-3.x

The answer is


using only standard modules and no manual header encoding

...which seems to be the intended and most portable way

the concept of python urllib is to group the numerous attributes of the request into various managers/directors/contexts... which then process their parts:

import urllib.request, ssl

# to avoid verifying ssl certificates
httpsHa = urllib.request.HTTPSHandler(context= ssl._create_unverified_context())

# setting up realm+urls+user-password auth
# (top_level_url may be sequence, also the complete url, realm None is default)
top_level_url = 'https://ip:port_or_domain'
# of the std managers, this can send user+passwd in one go,
# not after HTTP req->401 sequence
password_mgr = urllib.request.HTTPPasswordMgrWithPriorAuth()
password_mgr.add_password(None, top_level_url, "user", "password", is_authenticated=True)

handler = urllib.request.HTTPBasicAuthHandler(password_mgr)
# create OpenerDirector
opener = urllib.request.build_opener(handler, httpsHa)

url = top_level_url + '/some_url?some_query...'
response = opener.open(url)

print(response.read())

A correct way to do basic auth in Python3 urllib.request with certificate validation follows.

Note that certifi is not mandatory. You can use your OS bundle (likely *nix only) or distribute Mozilla's CA Bundle yourself. Or if the hosts you communicate with are just a few, concatenate CA file yourself from the hosts' CAs, which can reduce the risk of MitM attack caused by another corrupt CA.

#!/usr/bin/env python3


import urllib.request
import ssl

import certifi


context = ssl.SSLContext(ssl.PROTOCOL_TLSv1)
context.verify_mode = ssl.CERT_REQUIRED
context.load_verify_locations(certifi.where())
httpsHandler = urllib.request.HTTPSHandler(context = context)

manager = urllib.request.HTTPPasswordMgrWithDefaultRealm()
manager.add_password(None, 'https://domain.com/', 'username', 'password')
authHandler = urllib.request.HTTPBasicAuthHandler(manager)

opener = urllib.request.build_opener(httpsHandler, authHandler)

# Used globally for all urllib.request requests.
# If it doesn't fit your design, use opener directly.
urllib.request.install_opener(opener)

response = urllib.request.urlopen('https://domain.com/some/path')
print(response.read())

Based on the @AndrewCox 's answer with some minor improvements:

from http.client import HTTPSConnection
from base64 import b64encode


client = HTTPSConnection("www.google.com")
user = "user_name"
password = "password"
headers = {
    "Authorization": "Basic {}".format(
        b64encode(bytes(f"{user}:{password}", "utf-8")).decode("ascii")
    )
}
client.request('GET', '/', headers=headers)
res = client.getresponse()
data = res.read()

Note, you should set encoding if you use bytes function instead of b"".


requests.get(url, auth=requests.auth.HTTPBasicAuth(username=token, password=''))

If with token, password should be ''.

It works for me.


Use the power of Python and lean on one of the best libraries around: requests

import requests

r = requests.get('https://my.website.com/rest/path', auth=('myusername', 'mybasicpass'))
print(r.text)

Variable r (requests response) has a lot more parameters that you can use. Best thing is to pop into the interactive interpreter and play around with it, and/or read requests docs.

ubuntu@hostname:/home/ubuntu$ python3
Python 3.4.3 (default, Oct 14 2015, 20:28:29)
[GCC 4.8.4] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import requests
>>> r = requests.get('https://my.website.com/rest/path', auth=('myusername', 'mybasicpass'))
>>> dir(r)
['__attrs__', '__bool__', '__class__', '__delattr__', '__dict__', '__dir__', '__doc__', '__eq__', '__format__', '__ge__', '__getattribute__', '__getstate__', '__gt__', '__hash__', '__init__', '__iter__', '__le__', '__lt__', '__module__', '__ne__', '__new__', '__nonzero__', '__reduce__', '__reduce_ex__', '__repr__', '__setattr__', '__setstate__', '__sizeof__', '__str__', '__subclasshook__', '__weakref__', '_content', '_content_consumed', 'apparent_encoding', 'close', 'connection', 'content', 'cookies', 'elapsed', 'encoding', 'headers', 'history', 'iter_content', 'iter_lines', 'json', 'links', 'ok', 'raise_for_status', 'raw', 'reason', 'request', 'status_code', 'text', 'url']
>>> r.content
b'{"battery_status":0,"margin_status":0,"timestamp_status":null,"req_status":0}'
>>> r.text
'{"battery_status":0,"margin_status":0,"timestamp_status":null,"req_status":0}'
>>> r.status_code
200
>>> r.headers
CaseInsensitiveDict({'x-powered-by': 'Express', 'content-length': '77', 'date': 'Fri, 20 May 2016 02:06:18 GMT', 'server': 'nginx/1.6.3', 'connection': 'keep-alive', 'content-type': 'application/json; charset=utf-8'})

Update: OP uses Python 3. So adding an example using httplib2

import httplib2

h = httplib2.Http(".cache")

h.add_credentials('name', 'password') # Basic authentication

resp, content = h.request("https://host/path/to/resource", "POST", body="foobar")

The below works for python 2.6:

I use pycurl a lot in production for a process which does upwards of 10 million requests per day.

You'll need to import the following first.

import pycurl
import cStringIO
import base64

Part of the basic authentication header consists of the username and password encoded as Base64.

headers = { 'Authorization' : 'Basic %s' % base64.b64encode("username:password") }

In the HTTP header you will see this line Authorization: Basic dXNlcm5hbWU6cGFzc3dvcmQ=. The encoded string changes depending on your username and password.

We now need a place to write our HTTP response to and a curl connection handle.

response = cStringIO.StringIO()
conn = pycurl.Curl()

We can set various curl options. For a complete list of options, see this. The linked documentation is for the libcurl API, but the options does not change for other language bindings.

conn.setopt(pycurl.VERBOSE, 1)
conn.setopt(pycurlHTTPHEADER, ["%s: %s" % t for t in headers.items()])

conn.setopt(pycurl.URL, "https://host/path/to/resource")
conn.setopt(pycurl.POST, 1)

If you do not need to verify certificate. Warning: This is insecure. Similar to running curl -k or curl --insecure.

conn.setopt(pycurl.SSL_VERIFYPEER, False)
conn.setopt(pycurl.SSL_VERIFYHOST, False)

Call cStringIO.write for storing the HTTP response.

conn.setopt(pycurl.WRITEFUNCTION, response.write)

When you're making a POST request.

post_body = "foobar"
conn.setopt(pycurl.POSTFIELDS, post_body)

Make the actual request now.

conn.perform()

Do something based on the HTTP response code.

http_code = conn.getinfo(pycurl.HTTP_CODE)
if http_code is 200:
   print response.getvalue()