How to automate browsing using Python

Question

Suppose I need to perform a set of procedures on a particular website: fill in some forms, click the submit button, send the data back to the server, receive the response, then do something based on that response and send data back to the website's server again. I know there is a webbrowser module in Python, but I want to do this without invoking any web browser. It has to be a pure script. Is there a module available in Python that can help me do that? Thanks.

User · Answer

There are plenty of built-in Python modules that would help with this, for example urllib and htmllib.

The problem will be simpler if you change the way you're approaching it. You say you want to "fill some forms, click submit button, send the data back to server, receive the response", which sounds like a four-stage process.

In fact, what you need to do is post some data to a webserver and get a response.

This is as simple as:

>>> import urllib
>>> params = urllib.urlencode({'spam': 1, 'eggs': 2, 'bacon': 0})
>>> f = urllib.urlopen("http://www.musi-cal.com/cgi-bin/query", params)
>>> print f.read()

(example taken from the urllib docs).
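The snippet above uses the Python 2 urllib API. A rough Python 3 equivalent might look like the sketch below. The helper names build_post_request and post_form are made up for illustration and are not part of the standard library; the timeout and the UTF-8 fallback are assumptions as well.

```python
from urllib.parse import urlencode
from urllib.request import Request, urlopen


def build_post_request(url, fields):
    """Return a Request that POSTs `fields` as a URL-encoded form body."""
    # In Python 3 the request body must be bytes, not str.
    body = urlencode(fields).encode("ascii")
    return Request(url, data=body, method="POST")


def post_form(url, fields, timeout=10):
    """Send the form and return the response body decoded to text."""
    with urlopen(build_post_request(url, fields), timeout=timeout) as response:
        # Fall back to UTF-8 when the server does not declare a charset.
        charset = response.headers.get_content_charset() or "utf-8"
        return response.read().decode(charset, errors="replace")
```

Calling post_form("http://www.musi-cal.com/cgi-bin/query", {"spam": 1, "eggs": 2, "bacon": 0}) would then mirror the original example, provided that URL is still reachable.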

What you do with the response depends on how complex the HTML is and what you want to do with it. You might get away with parsing it using a regular expression or two, or you can use the htmllib.HTMLParser class, or maybe a higher-level, more flexible parser like Beautiful Soup.
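As one possible illustration of the parser route, here is a small sketch using html.parser.HTMLParser, the Python 3 standard-library parser (htmllib exists only in Python 2). It collects form actions and named input fields so they can be posted back. The class FormFieldParser, the function extract_form, and the SAMPLE markup are all invented for this example.

```python
from html.parser import HTMLParser


class FormFieldParser(HTMLParser):
    """Collect each form's action and the name/value of named <input> tags."""

    def __init__(self):
        super().__init__()
        self.actions = []
        self.fields = {}

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "form":
            self.actions.append(attrs.get("action", ""))
        elif tag == "input" and "name" in attrs:
            # Inputs without a value attribute are stored as empty strings.
            self.fields[attrs["name"]] = attrs.get("value") or ""


def extract_form(html):
    """Return (list of form actions, dict of input names to values)."""
    parser = FormFieldParser()
    parser.feed(html)
    parser.close()
    return parser.actions, parser.fields


# Hypothetical response body, used only to demonstrate the parser.
SAMPLE = """
<form action="/login" method="post">
  <input type="hidden" name="token" value="abc123">
  <input type="text" name="user">
  <input type="submit" value="Go">
</form>
"""
```

The returned dictionary can be updated with your own values and sent in the next request, which is usually all that "filling in a form" amounts to at the HTTP level.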

User · Answer

I think the best solution is a mix of requests and BeautifulSoup. I just wanted to add this so the question stays up to date.

User · Answer

For automation, you might want to check out webbot.

It is based on Selenium and offers a lot more features with very little code, such as automatically finding the elements to click or type into based on the parameters you pass.

It even works for sites with dynamically changing class names and IDs.

Here is the documentation: https://webbot.readthedocs.io/

User · Answer

You may want to look at these slides from the last Italian PyCon (PDF): the author lists most of the libraries for scraping and automated browsing in Python.

I very much like twill (which has already been suggested). It was developed by one of the authors of nose and is specifically aimed at testing websites.

User · Answer

You can also take a look at mechanize. It's meant to handle "stateful programmatic web browsing" (as per their site).

User · Answer

Selenium (http://www.seleniumhq.org) is the best solution for me. You can code it with Python, Java, or any programming language you like with ease, and it is easy to turn a simulated session into a program.

User · Answer

Internet Explorer specific, but rather good: http://pamie.sourceforge.net/. The advantage compared to urllib/BeautifulSoup is that it executes JavaScript as well, since it uses IE.

User · Answer

I have found the iMacros Firefox plugin (which is free) to work very well.

It can be automated with Python using Windows COM object interfaces. Here's some example code from http://wiki.imacros.net/Python. It requires Python Windows Extensions:

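import win32com.client  # provided by the Python Windows Extensions (pywin32)


def Hello():
    # Connect to the iMacros COM object and initialise it
    w = win32com.client.Dispatch("imacros")
    w.iimInit("", 1)
    # Play the FillForm demo macro
    w.iimPlay("Demo\\FillForm")


if __name__ == '__main__':
    Hello()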

User · Answer

Do not forget zope.testbrowser, which is a wrapper around mechanize. "zope.testbrowser provides an easy-to-use programmable web browser with special focus on testing."

User · Answer

HtmlUnit is the package if you're a Java developer: http://htmlunit.sourceforge.net/apidocs/index.html

User · Answer

You likely want urllib2. It can handle things like HTTPS, cookies, and authentication. You will probably also want BeautifulSoup to help parse the HTML pages.
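For the cookie handling mentioned here, a minimal sketch follows. It uses urllib.request and http.cookiejar, the Python 3 successors to urllib2 and cookielib. The helper names make_session and submit are hypothetical, and the timeout and UTF-8 decoding are assumptions rather than requirements.

```python
from http.cookiejar import CookieJar
from urllib.parse import urlencode
from urllib.request import HTTPCookieProcessor, build_opener


def make_session():
    """Return (opener, jar); the opener stores and resends cookies via jar."""
    jar = CookieJar()
    opener = build_opener(HTTPCookieProcessor(jar))
    return opener, jar


def submit(opener, url, fields, timeout=10):
    """POST `fields` through the cookie-aware opener and return the body text."""
    body = urlencode(fields).encode("ascii")
    with opener.open(url, data=body, timeout=timeout) as response:
        return response.read().decode("utf-8", errors="replace")
```

Reusing the same opener for every call is what makes a multi-step flow work: a login request sets the session cookie, and later calls to submit send it back automatically.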

User · Answer

Selenium 2 includes WebDriver, which has Python bindings and allows one to use the headless HtmlUnit driver, or switch to Firefox or Chrome for graphical debugging.

User · Answer

The best solution that I have found, and am currently implementing, is:

- scripts in Python using Selenium WebDriver
- the PhantomJS headless browser (if Firefox is used, you will have a GUI and it will be slower)

User · Answer

All answers are old. I recommend requests, and I am a big fan of it. From the homepage:

"Python's standard urllib2 module provides most of the HTTP capabilities you need, but the API is thoroughly broken. It was built for a different time, and a different web. It requires an enormous amount of work (even method overrides) to perform the simplest of tasks. Things shouldn't be this way. Not in Python."

User · Answer

Selenium will do exactly what you want, and it handles JavaScript.

User · Answer

httplib2 + BeautifulSoup. Use Firefox + Firebug + httpreplay to see what the JavaScript passes to and from the browser from the website. Using httplib2 you can essentially do the same via POST and GET.
