[python] How can I login to a website with Python?

How can I do it? I was trying to open a specific link with urllib, but the page requires me to log in first.

I have this source from the site:

<form id="login-form" action="auth/login" method="post">
    <div>
    <!--label for="rememberme">Remember me</label><input type="checkbox" class="remember" checked="checked" name="remember me" /-->
    <label for="email" id="email-label" class="no-js">Email</label>
    <input id="email-email" type="text" name="handle" value="" autocomplete="off" />
    <label for="combination" id="combo-label" class="no-js">Combination</label>
    <input id="password-clear" type="text" value="Combination" autocomplete="off" />
    <input id="password-password" type="password" name="password" value="" autocomplete="off" />
    <input id="sumbitLogin" class="signin" type="submit" value="Sign In" />

Is this possible?


The answer is


Typically you'll need cookies to log into a site, which means cookielib, urllib and urllib2. Here's a class which I wrote back when I was playing Facebook web games:

import cookielib
import urllib
import urllib2

# set these to whatever your fb account is
fb_username = "you@example.com"
fb_password = "secretpassword"

class WebGamePlayer(object):

    def __init__(self, login, password):
        """ Start up... """
        self.login = login
        self.password = password

        self.cj = cookielib.CookieJar()
        self.opener = urllib2.build_opener(
            urllib2.HTTPRedirectHandler(),
            urllib2.HTTPHandler(debuglevel=0),
            urllib2.HTTPSHandler(debuglevel=0),
            urllib2.HTTPCookieProcessor(self.cj)
        )
        self.opener.addheaders = [
            ('User-agent', ('Mozilla/4.0 (compatible; MSIE 6.0; '
                           'Windows NT 5.2; .NET CLR 1.1.4322)'))
        ]

        # need this twice - once to set cookies, once to log in...
        self.loginToFacebook()
        self.loginToFacebook()

    def loginToFacebook(self):
        """
        Handle login. This should populate our cookie jar.
        """
        login_data = urllib.urlencode({
            'email' : self.login,
            'pass' : self.password,
        })
        response = self.opener.open("https://login.facebook.com/login.php", login_data)
        return ''.join(response.readlines())

You won't necessarily need the HTTPS or Redirect handlers, but they don't hurt, and they make the opener much more robust. You also might not need cookies, but it's hard to tell just from the form that you've posted. I suspect that you might, purely from the 'Remember me' input that's been commented out.


import cookielib
import urllib
import urllib2

url = 'http://www.someserver.com/auth/login'
# Keys are the inputs' name attributes, not their ids; the
# "password-clear" input has no name, so a browser never submits it.
values = {'handle' : 'you@example.com',
          'password' : 'mypassword' }

data = urllib.urlencode(values)
cookies = cookielib.CookieJar()

opener = urllib2.build_opener(
    urllib2.HTTPRedirectHandler(),
    urllib2.HTTPHandler(debuglevel=0),
    urllib2.HTTPSHandler(debuglevel=0),
    urllib2.HTTPCookieProcessor(cookies))

response = opener.open(url, data)
the_page = response.read()
http_headers = response.info()
# The login cookies should be contained in the cookies variable

For more information visit: https://docs.python.org/2/library/urllib2.html
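The snippet above is Python 2. As a rough sketch of the same idea on Python 3, where cookielib became http.cookiejar and urllib2 was split into urllib.request and urllib.parse, something like the following should work. The helper names (build_cookie_opener, encode_form, log_in) are made up for illustration, and the URL in the usage comment is the placeholder from the answer, not a real endpoint.

```python
import http.cookiejar
import urllib.parse
import urllib.request


def build_cookie_opener():
    """Return an opener that stores cookies in the returned jar."""
    jar = http.cookiejar.CookieJar()
    opener = urllib.request.build_opener(
        urllib.request.HTTPCookieProcessor(jar)
    )
    return opener, jar


def encode_form(values):
    """Form-encode a dict; Python 3 requires the POST body as bytes."""
    return urllib.parse.urlencode(values).encode("ascii")


def log_in(opener, url, values):
    """POST the form fields; cookies set by the server land in the jar."""
    with opener.open(url, encode_form(values)) as response:
        return response.read()


# Usage (hypothetical server, so left commented out):
# opener, jar = build_cookie_opener()
# page = log_in(opener, "http://www.someserver.com/auth/login",
#               {"handle": "you@example.com", "password": "mypassword"})
```

Reusing the same opener for later requests sends the stored cookies back automatically.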


Websites in general can check authorization in many different ways, but the one you're targeting seems to make it reasonably easy for you.

All you need is to POST a form-encoded blob to the auth/login URL with the named fields you see there (forget the labels, they're decoration for human visitors): handle=whatever&password=pwd. Only inputs with a name attribute are submitted, so the password-clear placeholder field can be left out. As long as you know the values for the handle (AKA email) and password, you should be fine.

Presumably that POST will redirect you to some "you've successfully logged in" page with a Set-Cookie header validating your session (be sure to save that cookie and send it back on further interaction along the session!).
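If you are unsure which fields a form really submits, one way to check is to parse the markup and list the inputs that carry a name attribute. The sketch below does that with the standard library's html.parser on the form from the question (Python 3). The FormFieldParser class is made up for illustration, and it only looks at input tags, not select or textarea.

```python
from html.parser import HTMLParser

# The form markup from the question, shortened to the relevant lines.
FORM_HTML = """
<form id="login-form" action="auth/login" method="post">
    <div>
    <!--label for="rememberme">Remember me</label><input type="checkbox" class="remember" checked="checked" name="remember me" /-->
    <input id="email-email" type="text" name="handle" value="" autocomplete="off" />
    <input id="password-clear" type="text" value="Combination" autocomplete="off" />
    <input id="password-password" type="password" name="password" value="" autocomplete="off" />
    <input id="sumbitLogin" class="signin" type="submit" value="Sign In" />
"""


class FormFieldParser(HTMLParser):
    """Collect the form's action, method and its named input fields."""

    def __init__(self):
        super().__init__()
        self.action = None
        self.method = None
        self.fields = {}

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "form":
            self.action = attrs.get("action")
            self.method = (attrs.get("method") or "get").lower()
        elif tag == "input" and attrs.get("name"):
            # Inputs without a name are never sent by a browser.
            self.fields[attrs["name"]] = attrs.get("value") or ""


parser = FormFieldParser()
parser.feed(FORM_HTML)
parser.close()
print(parser.method, parser.action, sorted(parser.fields))
```

The commented-out checkbox is skipped because the parser treats it as a comment, and password-clear is skipped because it has no name.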


For anything HTTP-related, the current choice should be Requests: HTTP for Humans.
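A minimal sketch of what that could look like for the form in the question, assuming the third-party requests package is installed and using the placeholder server name from the earlier answer. The function names are hypothetical. A Session keeps whatever cookies the server sets, so later requests made through it stay logged in.

```python
import requests

# Placeholder URL taken from the earlier answer, not a real endpoint.
LOGIN_URL = "http://www.someserver.com/auth/login"


def build_login_request(session, login_url, handle, password):
    """Prepare the login POST without sending it (handy for inspection)."""
    request = requests.Request(
        "POST",
        login_url,
        data={"handle": handle, "password": password},
    )
    return session.prepare_request(request)


def log_in(session, login_url, handle, password):
    """Send the login POST; the session stores any cookies it receives."""
    response = session.post(
        login_url,
        data={"handle": handle, "password": password},
    )
    response.raise_for_status()
    return response


# Usage (hypothetical server, so left commented out):
# with requests.Session() as session:
#     log_in(session, LOGIN_URL, "you@example.com", "mypassword")
#     page = session.get("http://www.someserver.com/some/protected/page")
```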


Web page automation? Definitely "webbot".

webbot even works on web pages that have dynamically changing ids and class names, and has more methods and features than selenium or mechanize.

Here's a snippet :)

from webbot import Browser 
web = Browser()
web.go_to('google.com') 
web.click('Sign in')
web.type('you@example.com' , into='Email')
web.click('NEXT' , tag='span')
web.type('mypassword' , into='Password' , id='passwordFieldId') # specific selection
web.click('NEXT' , tag='span') # you are logged in ^_^

The docs are also pretty straightforward and simple to use: https://webbot.readthedocs.io


Let me try to make it simple. Suppose the site is www.example.com and you need to log in with a username and password. Open the login page, say http://www.example.com/login.php, view its source code, and look for the action URL. It will be in the form tag, something like

 <form name="loginform" method="post" action="userinfo.php">

Now resolve userinfo.php against the login page's URL to get the absolute URL, 'http://www.example.com/userinfo.php', and run a simple Python script:

import requests
url = 'http://www.example.com/userinfo.php'
# Keys must match the name attributes of the form's input fields
values = {'username': 'user',
          'password': 'pass'}

r = requests.post(url, data=values)
print(r.content)
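The step of turning the form's action into an absolute URL can be done with urljoin from the standard library (Python 3) rather than by hand. This is a small sketch using the example page and action from this answer; the second call uses a made-up path to show how a root-relative action behaves.

```python
from urllib.parse import urljoin

login_page = "http://www.example.com/login.php"
action = "userinfo.php"

# A relative action is resolved against the page that contains the form.
post_url = urljoin(login_page, action)
print(post_url)

# An action starting with "/" is resolved against the site root instead
# (hypothetical paths, for illustration only).
rooted_url = urljoin("http://www.example.com/account/login.php", "/auth/login")
print(rooted_url)
```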

I Hope that this helps someone somewhere someday.

