[python] Get HTML source of WebElement in Selenium WebDriver using Python

I'm using the Python bindings to run Selenium WebDriver:

from selenium import webdriver
wd = webdriver.Firefox()

I know I can grab a webelement like so:

elem = wd.find_element_by_css_selector('#my-id')

And I know I can get the full page source with...

wd.page_source

But is there a way to get the "element source"?

elem.source   # <-- returns the HTML as a string

The Selenium WebDriver documentation for Python are basically non-existent and I don't see anything in the code that seems to enable that functionality.

What is the best way to access the HTML of an element (and its children)?

The answer is


Java with Selenium 2.53.0

driver.getPageSource();

Using the attribute method is, in fact, easier and more straightforward.

Using Ruby with the Selenium and PageObject gems, to get the class associated with a certain element, the line would be element.attribute(Class).

The same concept applies if you wanted to get other attributes tied to the element. For example, if I wanted the string of an element, element.attribute(String).


This works seamlessly for me.

element.get_attribute('innerHTML')

WebElement element = driver.findElement(By.id("foo"));
String contents = (String)((JavascriptExecutor)driver).executeScript("return arguments[0].innerHTML;", element); 

This code really works to get JavaScript from source as well!


Sure we can get all HTML source code with this script below in Selenium Python:

elem = driver.find_element_by_xpath("//*")
source_code = elem.get_attribute("outerHTML")

If you you want to save it to file:

with open('c:/html_source_code.html', 'w') as f:
    f.write(source_code.encode('utf-8'))

I suggest saving to a file because source code is very very long.


In Ruby, using selenium-webdriver (2.32.1), there is a page_source method that contains the entire page source.


It looks outdated, but let it be here anyway. The correct way to do it in your case:

elem = wd.find_element_by_css_selector('#my-id')
html = wd.execute_script("return arguments[0].innerHTML;", elem)

or

html = elem.get_attribute('innerHTML')

Both are working for me (selenium-server-standalone-2.35.0).


InnerHTML will return the element inside the selected element and outerHTML will return the inside HTML along with the element you have selected

Example:

Now suppose your Element is as below

<tr id="myRow"><td>A</td><td>B</td></tr>

innerHTML element output

<td>A</td><td>B</td>

outerHTML element output

<tr id="myRow"><td>A</td><td>B</td></tr>

Live Example:

http://www.java2s.com/Tutorials/JavascriptDemo/f/find_out_the_difference_between_innerhtml_and_outerhtml_in_javascript_example.htm

Below you will find the syntax which require as per different binding. Change the innerHTML to outerHTML as per required.

Python:

element.get_attribute('innerHTML')

Java:

elem.getAttribute("innerHTML");

If you want whole page HTML, use the below code:

driver.getPageSource();

If you are interested in a solution for Selenium Remote Control in Python, here is how to get innerHTML:

innerHTML = sel.get_eval("window.document.getElementById('prodid').innerHTML")

I hope this could help: http://selenium.googlecode.com/svn/trunk/docs/api/java/org/openqa/selenium/WebElement.html

Here is described Java method:

java.lang.String    getText() 

But unfortunately it's not available in Python. So you can translate the method names to Python from Java and try another logic using present methods without getting the whole page source...

E.g.

 my_id = elem[0].get_attribute('my-id')

The method to get the rendered HTML I prefer is the following:

driver.get("http://www.google.com")
body_html = driver.find_element_by_xpath("/html/body")
print body_html.text

However, the above method removes all the tags (yes, the nested tags as well) and returns only text content. If you interested in getting the HTML markup as well, then use the method below.

print body_html.getAttribute("innerHTML")

There is not really a straightforward way of getting the HTML source code of a webelement. You will have to use JavaScript. I am not too sure about python bindings, but you can easily do like this in Java. I am sure there must be something similar to JavascriptExecutor class in Python.

 WebElement element = driver.findElement(By.id("foo"));
 String contents = (String)((JavascriptExecutor)driver).executeScript("return arguments[0].innerHTML;", element);

And in PHPUnit Selenium test it's like this:

$text = $this->byCssSelector('.some-class-nmae')->attribute('innerHTML');

The other answers provide a lot of details about retrieving the markup of a WebElement. However, an important aspect is, modern websites are increasingly implementing JavaScript, ReactJS, jQuery, Ajax, Vue.js, Ember.js, GWT, etc. to render the dynamic elements within the DOM tree. Hence there is a necessity to wait for the element and its children to completely render before retrieving the markup.


Python

Hence, ideally you need to induce WebDriverWait for the visibility_of_element_located() and you can use either of the following Locator Strategies:

  • Using get_attribute("outerHTML"):

    element = WebDriverWait(driver, 20).until(EC.visibility_of_element_located((By.CSS_SELECTOR, "#my-id")))
    print(element.get_attribute("outerHTML"))
    
  • Using execute_script():

    element = WebDriverWait(driver, 20).until(EC.visibility_of_element_located((By.CSS_SELECTOR, "#my-id")))
    print(driver.execute_script("return arguments[0].outerHTML;", element))
    
  • Note: You have to add the following imports:

    from selenium.webdriver.support.ui import WebDriverWait
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC
    

Examples related to python

programming a servo thru a barometer Is there a way to view two blocks of code from the same file simultaneously in Sublime Text? python variable NameError Why my regexp for hyphenated words doesn't work? Comparing a variable with a string python not working when redirecting from bash script is it possible to add colors to python output? Get Public URL for File - Google Cloud Storage - App Engine (Python) Real time face detection OpenCV, Python xlrd.biffh.XLRDError: Excel xlsx file; not supported Could not load dynamic library 'cudart64_101.dll' on tensorflow CPU-only installation

Examples related to selenium

SessionNotCreatedException: Message: session not created: This version of ChromeDriver only supports Chrome version 81 session not created: This version of ChromeDriver only supports Chrome version 74 error with ChromeDriver Chrome using Selenium Selenium: WebDriverException:Chrome failed to start: crashed as google-chrome is no longer running so ChromeDriver is assuming that Chrome has crashed WebDriverException: unknown error: DevToolsActivePort file doesn't exist while trying to initiate Chrome Browser Class has been compiled by a more recent version of the Java Environment How to configure ChromeDriver to initiate Chrome browser in Headless mode through Selenium? How to make Firefox headless programmatically in Selenium with Python? element not interactable exception in selenium web automation Selenium Web Driver & Java. Element is not clickable at point (x, y). Other element would receive the click How do you fix the "element not interactable" exception?

Examples related to selenium-webdriver

SessionNotCreatedException: Message: session not created: This version of ChromeDriver only supports Chrome version 81 Selenium: WebDriverException:Chrome failed to start: crashed as google-chrome is no longer running so ChromeDriver is assuming that Chrome has crashed WebDriverException: unknown error: DevToolsActivePort file doesn't exist while trying to initiate Chrome Browser How to configure ChromeDriver to initiate Chrome browser in Headless mode through Selenium? How to make Firefox headless programmatically in Selenium with Python? Selenium Web Driver & Java. Element is not clickable at point (x, y). Other element would receive the click How do you fix the "element not interactable" exception? Scrolling to element using webdriver? Only local connections are allowed Chrome and Selenium webdriver Check if element is clickable in Selenium Java

Examples related to webdriver

WebDriverException: unknown error: DevToolsActivePort file doesn't exist while trying to initiate Chrome Browser Selenium Web Driver & Java. Element is not clickable at point (x, y). Other element would receive the click How do you fix the "element not interactable" exception? ImportError: No module named 'selenium' how to use List<WebElement> webdriver How to get attribute of element from Selenium? Selenium Finding elements by class name in python Open web in new tab Selenium + Python When running WebDriver with Chrome browser, getting message, "Only local connections are allowed" even though browser launches properly How can I start InternetExplorerDriver using Selenium WebDriver

Examples related to automated-tests

Message "Async callback was not invoked within the 5000 ms timeout specified by jest.setTimeout" Select a date from date picker using Selenium webdriver How can I scroll a web page using selenium webdriver in python? How to find specific lines in a table using Selenium? Debugging "Element is not clickable at point" error Running Selenium WebDriver python bindings in chrome Get HTML source of WebElement in Selenium WebDriver using Python Selenium C# WebDriver: Wait until element is present Random "Element is no longer attached to the DOM" StaleElementReferenceException Scroll Element into View with Selenium