Python Selenium accessing HTML source

Question

How can I get the HTML source in a variable using the Selenium module with Python?

I wanted to do something like this:

from selenium import webdriver

browser = webdriver.Firefox()
browser.get("http://example.com")
if "whatever" in html_source:
    # Do something
else:
    # Do something else

How can I do this? I don't know how to access the HTML source.

User · Answer

The page source gives you the entire HTML of the page.
If you only need part of it, first decide which block or tag holds the data you want to read or the element you want to click.

# Recent Selenium 4 releases removed this helper; there, use driver.find_elements(By.NAME, "XXX").
options = driver.find_elements_by_name("XXX")
for option in options:
    if option.text == "XXXXXX":
        print(option.text)
        option.click()

You can find elements by name, XPath, id, link text, or CSS selector.
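The loop in this answer clicks every element whose text matches. Clicking can navigate away and leave the remaining element references stale, so a variant that stops at the first match may be safer. The sketch below assumes only that each element has a `text` attribute and a `click()` method, as Selenium's WebElement does; `click_first_with_text` and `FakeElement` are hypothetical names used for illustration.

```python
def click_first_with_text(elements, wanted_text):
    """Click the first element whose text equals `wanted_text`.

    Returns the clicked element, or None when nothing matches.
    """
    for element in elements:
        if element.text == wanted_text:
            element.click()
            return element  # stop here: later references may go stale
    return None


class FakeElement:
    """Hypothetical stand-in for a WebElement, used only for this demo."""

    def __init__(self, text):
        self.text = text
        self.clicked = False

    def click(self):
        self.clicked = True


if __name__ == "__main__":
    options = [FakeElement("other"), FakeElement("XXXXXX")]
    hit = click_first_with_text(options, "XXXXXX")
    print(hit is options[1], options[1].clicked)
```

With a real session, pass in the list returned by the driver, for example `driver.find_elements(By.NAME, "XXX")` after `from selenium.webdriver.common.by import By`.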

User · Answer

You need to access the page_source property:

from selenium import webdriver

browser = webdriver.Firefox()
browser.get("http://example.com")

html_source = browser.page_source
if "whatever" in html_source:
    pass  # do something
else:
    pass  # do something else
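Here is a minimal sketch of the same check wrapped in a function, so the logic can be exercised without launching a browser. `page_contains` and `FakeBrowser` are hypothetical names, not part of Selenium; the only thing assumed about a real driver is its `page_source` attribute.

```python
def page_contains(browser, needle):
    """Return True if `needle` occurs in the browser's current HTML source."""
    # page_source is a plain str, so the ordinary `in` operator works.
    html_source = browser.page_source
    return needle in html_source


class FakeBrowser:
    """Hypothetical stand-in for a WebDriver; exposes only page_source."""

    def __init__(self, html):
        self.page_source = html


if __name__ == "__main__":
    fake = FakeBrowser("<html><body><p>whatever</p></body></html>")
    if page_contains(fake, "whatever"):
        print("found")
    else:
        print("not found")
```

With a real session, pass the object returned by `webdriver.Firefox()` after calling `browser.get(...)`.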

User · Answer

I'd recommend getting the source with urllib and, if you're going to parse it, using something like Beautiful Soup.

import urllib.request  # Python 3; in Python 2 this was urllib.urlopen

url = urllib.request.urlopen("http://example.com")  # Open the URL
content = url.readlines()  # Read the source (a list of bytes lines) and save it to a variable
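Here is a sketch of the urllib route on Python 3, returning the page as one decoded string rather than a list of lines. `fetch_html` is a hypothetical helper name, and the UTF-8 fallback is an assumption for responses that declare no charset. Note that urllib fetches the raw response only: markup that JavaScript generates after load will not appear, unlike in Selenium's `page_source`.

```python
import urllib.request


def fetch_html(url, timeout=10):
    """Download `url` and return the response body decoded to a str."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        raw = response.read()
        # Use the charset from the Content-Type header when one is declared;
        # falling back to UTF-8 is an assumption, not a guarantee.
        charset = response.headers.get_content_charset() or "utf-8"
    return raw.decode(charset, errors="replace")
```

Usage would look like `html_source = fetch_html("http://example.com")`, after which `"whatever" in html_source` works as in the Selenium answers.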

User · Answer

With Selenium2Library you can use get_source():

import Selenium2Library

s = Selenium2Library.Selenium2Library()
s.open_browser("localhost:7080", "firefox")
source = s.get_source()

User · Answer

driver.page_source will give you the page source code. You can check whether some text is present in it:

from selenium import webdriver

driver = webdriver.Firefox()
driver.get("some url")
if "your text here" in driver.page_source:
    print("Found it!")
else:
    print("Did not find it.")

If you want to store the page source in a variable, add the line below after driver.get:

var_pgsource = driver.page_source

and change the if condition to:

if "your text here" in var_pgsource:

User · Answer

To answer your question about getting the URL to use for urllib, just execute this JavaScript code:

url = browser.execute_script("return window.location.href;")

User · Answer

from bs4 import BeautifulSoup
from selenium import webdriver

driver = webdriver.Chrome()
driver.get("http://example.com")  # load a page first; URL taken from the question
html_source_code = driver.execute_script("return document.body.innerHTML;")
html_soup: BeautifulSoup = BeautifulSoup(html_source_code, "html.parser")

Now you can apply BeautifulSoup functions to extract data.
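As an illustration of what "extract data" can look like once the HTML is in a string, here is a small sketch that collects link targets. It deliberately uses the standard library's `html.parser` instead of BeautifulSoup so that it runs without extra packages. `LinkCollector` and `extract_links` are hypothetical names, and the sample markup stands in for whatever the `execute_script` call or `driver.page_source` returns.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collect the href attribute of every <a> tag fed to the parser."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs; names arrive lowercased.
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value is not None:
                    self.links.append(value)


def extract_links(html_source):
    """Return all link targets found in an HTML string, in document order."""
    collector = LinkCollector()
    collector.feed(html_source)
    collector.close()
    return collector.links


if __name__ == "__main__":
    sample = '<p><a href="/a">A</a> and <a href="https://example.com/b">B</a></p>'
    print(extract_links(sample))
```

With BeautifulSoup installed, roughly the same result should come from `[a["href"] for a in html_soup.find_all("a", href=True)]`.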

User · Answer

You can simply use the WebDriver object and access the page source through its page_source property.

Try this code snippet:

from selenium import webdriver

driver = webdriver.Firefox('path/to/executable')
driver.get('https://some-domain.com')
source = driver.page_source
if 'stuff' in source:
    print('found...')
else:
    print('not in source...')
