In the HTML which you have shared:
<div id="a">This is some
<div id="b">text</div>
</div>
The text This is some
is within a text node. To depict the text node in a structured way:
<div id="a">
This is some
<div id="b">text</div>
</div>
To extract and print the text This is some
from the text node using Selenium's python client you have 2 ways as follows:
Using splitlines()
: You can identify the parent element i.e. <div id="a">
, extract the innerHTML
and then use splitlines()
as follows:
using xpath:
print(driver.find_element_by_xpath("//div[@id='a']").get_attribute("innerHTML").splitlines()[0])
using xpath:
print(driver.find_element_by_css_selector("div#a").get_attribute("innerHTML").splitlines()[0])
Using execute_script()
: You can also use the execute_script()
method which can synchronously execute JavaScript in the current window/frame as follows:
using xpath and firstChild:
parent_element = driver.find_element_by_xpath("//div[@id='a']")
print(driver.execute_script('return arguments[0].firstChild.textContent;', parent_element).strip())
using xpath and childNodes[n]:
parent_element = driver.find_element_by_xpath("//div[@id='a']")
print(driver.execute_script('return arguments[0].childNodes[1].textContent;', parent_element).strip())