Selenium Python how to get text(html source) from <div> - python

I'm trying to get the text $27.50 inside a <div> tag. I located the element by id and stored it in a variable called "price".
The snippet of html is as follows:
<div id="PPP,BOSSST,NYCPAS,2015-04-26T01:00:00-04:00,2015-04-26T05:20:00-04:00,_price" class="price inlineBlock strong mediumText">$27.50</div>
Here is what I've tried
price.text
price.get_attribute('value')
Neither of the above works.
Update:
Thanks to everyone who tried to help.
I combined your answers and got the solution:
price = driver.find_element_by_xpath("//div[@class='price inlineBlock strong mediumText']")
price_content = price.get_attribute('innerHTML')
print(price_content.strip())

Can't you use a regular expression or Beautiful Soup to find the contents of the element in the HTML?
re.search(r'<div.*?>(.*?)</div>', price.get_attribute('outerHTML')).group(1)
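A minimal offline sketch of the regular-expression idea, assuming the element's outer HTML is already available as a string (for example from get_attribute('outerHTML')). The markup is copied from the question; the variable names are illustrative only. Note that the capture group is written (.*?), a non-greedy match.

```python
import re

# Outer HTML of the element, copied from the question.
outer_html = (
    '<div id="PPP,BOSSST,NYCPAS,2015-04-26T01:00:00-04:00,'
    '2015-04-26T05:20:00-04:00,_price" '
    'class="price inlineBlock strong mediumText">$27.50</div>'
)

# First non-greedy group skips the attributes, second one captures the body.
match = re.search(r'<div.*?>(.*?)</div>', outer_html)
price_text = match.group(1).strip() if match else None
print(price_text)
```

A regular expression is fine for a single flat tag like this one; for nested markup an HTML parser is usually the safer choice.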

Your element is hidden; the last time I worked with Selenium, you could not get the text of hidden elements. That said, you can always execute JavaScript. I don't usually write Python, but it should be something like:
val = driver.execute_script("return document.getElementById('locator').innerHTML")
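If the element is indeed hidden, one common workaround is to fall back to the DOM textContent property when .text comes back empty. The helper below is a sketch under that assumption; element_text and FakeElement are hypothetical names, and FakeElement only imitates the two WebElement members the helper touches so the snippet runs without a browser.

```python
def element_text(element):
    """Return the rendered text, falling back to textContent for hidden elements."""
    text = element.text
    if text:
        return text.strip()
    # Selenium's .text is empty for elements that are not displayed,
    # while the DOM textContent property is still populated.
    content = element.get_attribute("textContent")
    return content.strip() if content else ""


class FakeElement:
    """Hypothetical stand-in for a WebElement, used only to exercise the helper."""

    def __init__(self, text, text_content):
        self.text = text
        self._text_content = text_content

    def get_attribute(self, name):
        return self._text_content if name == "textContent" else None


hidden = FakeElement(text="", text_content=" $27.50 ")
visible = FakeElement(text="$27.50", text_content="$27.50")
print(element_text(hidden))
print(element_text(visible))
```

With a real driver, the same helper would be called on the element returned by the locator, for example element_text(price).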

Change the css selector to
div[id$='_price']
Complete code
price = fltright.find_element(By.CSS_SELECTOR, "div[id$='_price']")
price.text

I tried your edited solution, but it only returns one div with that class. So I tried the following to print a list of all divs with the same class.
Changing element to elements returns a list:
price = driver.find_elements_by_xpath('//div[@class = "price inlineBlock strong mediumText"]')
Use for ... in range() to print the list:
num = len(price)
for i in range(num):
    print(price[i].text)

browser.find_element_by_xpath("//form[@id='workQueueTaskListForm']/div[1]/p").text

Related

Selenium Python - how to get deeply nested element

I am exploring Selenium Python and trying to grab a name property from Linkedin page in order to get its index later.
This is the HTML:
Here is how I try to do it:
all_span = driver.find_elements(By.TAG_NAME, "span")
all_span = [s for s in all_span if s.get_attribute("aria-hidden") == "true"]
counter = 1
for i in all_span:
    print(counter)
    print(i.text)
    counter += 1
The problem is there are other spans on the same page that also have aria-hidden=true attribute, but not relevant and that messes up the index.
So I need to reach the span that contains the name from one of its parent divs, but I don't know how.
Looking at the documentation here: https://selenium-python.readthedocs.io/locating-elements.html# I can't seem to find how to target deeply nested elements.
I need to get the name that is in span element.
The link
The best way would be to use xpath. https://selenium-python.readthedocs.io/locating-elements.html#locating-by-xpath
Let's say you have this:
<div id="this-div-contains-the-span-i-want">
<span aria-hidden="true">
<!--
...
//-->
</span>
</div>
Then, using xpath:
xpath = "//div[@id='this-div-contains-the-span-i-want']/span[@aria-hidden='true']"
span_i_want = driver.find_element(By.XPATH, xpath)
So, in your example, you could use:
xpath = "//a[@class='app-aware-link']/span[@dir='ltr']/span[@aria-hidden='true']"
span_i_want = driver.find_element(By.XPATH, xpath)
print(span_i_want.text)
There are no typos, but
print(span_i_want) returns [], an empty list.
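The parent-then-child XPath pattern can be tried offline with the standard library's xml.etree.ElementTree, which understands a subset of XPath. This is only a sketch: the markup and the name "Jane Doe" are invented for illustration, and the real page structure may differ. It also shows that a path matching nothing yields an empty list, which is what find_elements returns when the locator does not match the page as loaded at that moment.

```python
import xml.etree.ElementTree as ET

# Hypothetical markup shaped like the example above.
markup = """
<body>
  <span aria-hidden="true">unrelated</span>
  <div id="this-div-contains-the-span-i-want">
    <span aria-hidden="true">Jane Doe</span>
  </div>
</body>
""".strip()

root = ET.fromstring(markup)

# Anchor on the parent div first, then step down to the span.
path = ".//div[@id='this-div-contains-the-span-i-want']/span[@aria-hidden='true']"
names = [span.text for span in root.findall(path)]
print(names)

# A path that matches nothing gives an empty list rather than an error.
missing = root.findall(".//div[@id='no-such-div']/span")
print(missing)
```

An empty result from a path that looks correct often means the parent chain differs from what was assumed, or that the content had not been rendered yet when the lookup ran.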

Python & Selenium: what's the best way to hierarchically select data from html elements?

As an exercise for learning Python and Selenium, I'm trying to write a script that checks a web page with all kinds of commercial deals, finds all the food deals (class name 'tag-food'), puts them in a list (elem), checks which ones contain the text 'sushi', and for those elements extracts the HTML element that contains the price, then prints the results.
I have:
elem = driver.find_elements_by_class_name('tag-food')
i = 0
while i < len(elem):
    source_code = elem[i].get_attribute("innerHTML")
    # ?? how to check if source_code contains 'sushi'?
    # ?? if true how to extract price data?
    i = i + 1
driver.quit()
What's the best and most direct way to do these checks? Thanks! 🙏
I don't think you need a while loop for this. Also, you would be looking for a text value, not innerHTML
You can make it more simple like this:
for row in driver.find_elements_by_class_name('tag-food'):
    if "sushi" in row.get_attribute("innerText"):
        print("Yes this item has sushi")
        # find element to grab price, store in variable to do something else with
    else:
        print("No sushi in this item")
Or even just this, depending on how the text in the HTML is structured:
for row in driver.find_elements_by_class_name('tag-food'):
    if "sushi" in row.text:
        print("Yes this item has sushi")
        # find element to grab price, store in variable to do something else with
    else:
        print("No sushi in this item")
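The price lookup left as a comment in the loops above could be filled in roughly as follows. This is a sketch, not the page's actual structure: the ".price" selector is an assumption, and sushi_prices and FakeElement are hypothetical names. FakeElement imitates just enough of a WebElement for the snippet to run without a browser.

```python
def sushi_prices(rows, price_selector=".price"):
    """Collect the price text of every row whose text mentions sushi."""
    results = []
    for row in rows:
        # Compare in lower case so "Sushi" and "SUSHI" match too.
        if "sushi" not in row.text.lower():
            continue
        # "css selector" is the string value of Selenium's By.CSS_SELECTOR.
        # The search is scoped to this row, not the whole page.
        price = row.find_element("css selector", price_selector)
        results.append(price.text.strip())
    return results


class FakeElement:
    """Hypothetical stand-in for a WebElement, for illustration only."""

    def __init__(self, text, children=None):
        self.text = text
        self._children = children or {}

    def find_element(self, by, value):
        return self._children[value]


rows = [
    FakeElement("Sushi set for two 19.90", {".price": FakeElement(" 19.90 ")}),
    FakeElement("Pizza night 9.50", {".price": FakeElement("9.50")}),
]
print(sushi_prices(rows))
```

With a real driver, rows would be the list returned by the 'tag-food' lookup. A row without a matching price element would raise NoSuchElementException in Selenium, which this sketch does not handle.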

Having trouble getting some text. Python. Selenium

I'm trying to get the finance data from this div. The div has no unique identifier, so I collect all 3-4 divs and check whether the word FINANSE appears in the text; if it does, I get the inner div's text. However, it doesn't seem to work. Is there another approach, or what am I missing here? Thanks in advance.
link = https://rejestr.io/krs/882875/fortuna-cargo
fin_divs = driver.find_elements_by_css_selector('div.card.mb-4')
for div in fin_divs:
    if 'FINANSE' in div.text:
        finances = div.find_element_by_css_selector('div.card-body').text
    else:
        finances = "Finance Data Not Available"
You can simplify your code to select exact element instead of looping through list of elements:
finances = driver.find_element_by_xpath('//div[div="Finanse"]/div[@class="card-body"]').text
print(finances)
>>>Kapitał zakładowy
>>>5 tys. zł
You are doing everything correctly; just add a break inside the if block so that finances is not overwritten with "Finance Data Not Available" after the correct div has been found:
fin_divs = driver.find_elements_by_css_selector('div.card.mb-4')
for div in fin_divs:
    if 'FINANSE' in div.text:
        finances = div.find_element_by_css_selector('div.card-body').text
        break
    else:
        finances = "Finance Data Not Available"
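A related option is Python's for ... else, where the else branch runs only if the loop finished without break. The sketch below uses plain (title, body) tuples instead of Selenium elements so it runs on its own; pick_finances and the sample card titles are hypothetical.

```python
def pick_finances(cards, default="Finance Data Not Available"):
    """Return the body of the first card whose title mentions FINANSE."""
    for title, body in cards:
        if "FINANSE" in title.upper():
            finances = body
            break
    else:
        # Reached only when the loop ended without hitting break,
        # including the case where cards is empty.
        finances = default
    return finances


# Hypothetical card data; the finance body mirrors the output shown above.
cards = [
    ("Card A", "first body"),
    ("Finanse", "Kapitał zakładowy\n5 tys. zł"),
    ("Card C", "third body"),
]
print(pick_finances(cards))
print(pick_finances([("Card A", "first body")]))
```

Compared with assigning the default inside the loop, this form sets it exactly once and also covers an empty list, where the loop body never runs.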

Finding all child elements in an html page using python selenium webdriver

I want to extract all h2 elements of the div element. The code that I've used is this:
browser = webdriver.Chrome()
browser.get("https://www.mmorpg.com/play-now")
time.sleep(2)
item_list_new=[]
link = browser.find_element_by_xpath("//div[@class='freegamelist']")
names = link.find_element_by_tag_name('h2')
x = names.text
item_list_new.append(x)
print(item_list_new)
But when I run this, I only get the first 'h2' element of the div element.
Can somebody tell me what I am doing wrong and guide me to the correct way of doing it?
Thanks in advance.
You need to write names = link.find_elements_by_tag_name('h2')
Your code should be
import time
from selenium import webdriver
browser = webdriver.Chrome()
browser.get("https://www.mmorpg.com/play-now")
time.sleep(2)
item_list_new=[]
link = browser.find_element_by_xpath("//div[@class='freegamelist']")
names = link.find_elements_by_tag_name('h2')
# names is now a list, so read .text from each element in turn
for name in names:
    item_list_new.append(name.text)
print(item_list_new)
find_element_by_tag_name gives the first element and find_elements_by_tag_name gives all the matching elements
Try to get all header values as below:
link = browser.find_element_by_xpath("//div[@class='freegamelist']")
names = link.find_elements_by_tag_name('h2')
item_list_new = [x.text for x in names]
print(item_list_new)
or you can simplify
names = browser.find_elements_by_xpath("//div[@class='freegamelist']//h2")
item_list_new = [x.text for x in names]
print(item_list_new)
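The first-match versus all-matches distinction has a direct analogue in the standard library: ElementTree's find returns the first matching element and findall returns every match. The sketch below uses that as an offline illustration; the markup and game names are invented, and the real page will differ.

```python
import xml.etree.ElementTree as ET

# Hypothetical markup with two h2 headings inside the target div.
page = ET.fromstring(
    "<body>"
    "<div class='freegamelist'><h2>Game A</h2><p>text</p><h2>Game B</h2></div>"
    "<div class='other'><h2>Not a game</h2></div>"
    "</body>"
)

path = ".//div[@class='freegamelist']//h2"

first = page.find(path)        # like find_element: first match only
all_h2 = page.findall(path)    # like find_elements: every match

first_title = first.text
titles = [h.text for h in all_h2]
print(first_title)
print(titles)
```

The h2 in the second div is excluded because the path is scoped to the div with class 'freegamelist'.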
You actually want the function find_elements_by_tag_name (note the plural), whose name is almost identical.

xpath preceding-sibling solution

Now I need to get the content of class odd, or just the text from <td> 161.5 </td>,
so I wrote:
element = driver.find_elements_by_xpath('//td[span[@class=" odds-wrap " and @eu="1.90"]]/preceding-sibling::td')
and it works.
My question is: is it possible to get the same content using one more condition, for example title="bet365"? That is, I want the same result, but with an additional condition on another sibling element.
Edit:
element = driver.find_elements_by_xpath('//tr[preceding-sibling::td/span[@class=" odds-wrap " and @eu="1.90"] and following-sibling::td/div/a[@title="bet365"]]')
for ele in element:
    print(ele.text)
This does not find or print anything, and I do not know why.
You can combine preceding-sibling and following-sibling:
//td[following-sibling::td/span[@class=" odds-wrap " and @eu="1.90"] and preceding-sibling::td/a[@title="bet365"]]
Use the following xpath
//span[@class=" odds-wrap " and @eu="1.90"]/preceding::a[@title='bet365']/following::td[1]
