Parsing nested elements using selenium not working - python


The attached picture shows the structure of the HTML page that I am trying to scrape.
First I retrieve the league-item element, and then I look inside it for the i element with the class name 'ds-icon-material league-toggle-icon'.
Selenium tells me that it cannot find any element with that name.
Here is my code:
path = r"""chromedriver.exe"""
driver = webdriver.Chrome(executable_path=path)
driver.get(_1bet)
time.sleep(5)
#a = driver.find_element_by_class_name('box-content.box-bordered.box-stick.box-bordered-last')
league1 = driver.find_elements_by_class_name('league-list')[0]
league1.find_element_by_class_name("ds-icon-material league-toggle-icon")
Can you please help me? I don't understand why it isn't working.
Thanks
NB: The website I'm scraping is: https://1bet.com/ca/sports/tennis?time_range=all

I can't access that web page, so I can only guess what is going on there.
I can see two likely problems:
To select an element inside another element with XPath, start the expression with a dot (.); without it, // searches the whole document.
The element you are trying to access has two class names. Use a CSS selector or XPath to locate an element by multiple class names.
So I suggest trying this:
league1 = driver.find_elements_by_class_name('league-list')[0]
league1.find_element_by_xpath(".//i[@class='ds-icon-material league-toggle-icon']")
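As a minimal sketch of the same idea with a CSS selector instead of XPath: a search started from an element only looks at that element's descendants, and chaining the classes with dots requires both of them. `find_toggle_icon` is a hypothetical helper name, and the string "css selector" is the value of Selenium's `By.CSS_SELECTOR` constant, spelled out so the sketch runs without Selenium installed.

```python
# Value of selenium.webdriver.common.by.By.CSS_SELECTOR, written out so the
# sketch has no imports. With Selenium installed, pass By.CSS_SELECTOR instead.
CSS_SELECTOR = "css selector"


def find_toggle_icon(league):
    """Return the toggle icon inside one league element (hypothetical helper).

    `league` is any object with Selenium's find_element(by, value) method,
    for example a WebElement taken from driver.find_elements(...).
    """
    # Both classes are chained with dots; the search is scoped to `league`.
    return league.find_element(CSS_SELECTOR, "i.ds-icon-material.league-toggle-icon")
```

With a real driver this could be called as `find_toggle_icon(league1)`, assuming the icon really is an `i` element carrying both classes, as the question describes.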

Selenium expects a single class name, and it adds a dot at the beginning to turn it into a CSS selector.
But "ds-icon-material league-toggle-icon" is two classes; the dot is added before the first class but not before the second, and this causes the problem.
You can use a CSS selector directly, with all the dots:
.find_element_by_css_selector(".ds-icon-material.league-toggle-icon")
or you can trick Selenium by adding the missing dot between the classes yourself:
.find_element_by_class_name("ds-icon-material.league-toggle-icon")
I can't connect to this page to confirm that this is the whole problem.
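To see why the space matters, here is a small model of the conversion described above. `class_name_to_css` is a hypothetical stand-in for what the answer says Selenium does, not Selenium's actual code.

```python
def class_name_to_css(class_name):
    """Model of the conversion described above: prefix one dot, nothing else."""
    return "." + class_name


# Two classes separated by a space: only the first one gets a dot.
print(class_name_to_css("ds-icon-material league-toggle-icon"))
# .ds-icon-material league-toggle-icon

# The "trick": supply the missing dot yourself.
print(class_name_to_css("ds-icon-material.league-toggle-icon"))
# .ds-icon-material.league-toggle-icon
```

In CSS a space is the descendant combinator, so the first result asks for a tag named league-toggle-icon inside an element with class ds-icon-material, which is not what was intended.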

Related

How to find an Element by index in selenium Webdriver for python

This is the HTML code of that page.
I want to access the second element with the class name "maxbutton-1". The page has three identical buttons, and I can't use XPath or any other constant selector, so I want to index into the elements matched by class, but I can't find a way to do that in Python.
I also tried the method used in Java for the same thing, but it didn't work.
Link to that page
I'm just trying to automate the download process for any movie.
Thank you.

To click the first, second or third button, change the index at the end of the XPath:
el1 = driver.find_element_by_xpath("(//a[@class='maxbutton-1 maxbutton maxbutton-download-links'])[1]")
el2 = driver.find_element_by_xpath("(//a[@class='maxbutton-1 maxbutton maxbutton-download-links'])[2]")
el3 = driver.find_element_by_xpath("(//a[@class='maxbutton-1 maxbutton maxbutton-download-links'])[3]")
Then you can read the link from an element's href attribute:
link = el2.get_attribute('href')
or click it:
el2.click()
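The indexed XPath above can be built by a small helper; this is a sketch, and `nth_match` is a hypothetical name. The parentheses are what make the index count matches across the whole document; without them, `//a[...][2]` would pick an `a` that is the second such element among its own siblings.

```python
def nth_match(xpath, n):
    """Wrap an XPath so it selects only the n-th match in document order.

    XPath positions are 1-based, unlike Python list indexes.
    """
    if n < 1:
        raise ValueError("XPath positions start at 1")
    return f"({xpath})[{n}]"


buttons = "//a[@class='maxbutton-1 maxbutton maxbutton-download-links']"
print(nth_match(buttons, 2))
# (//a[@class='maxbutton-1 maxbutton maxbutton-download-links'])[2]
```

An alternative is to fetch every match with the plural find_elements call and take index 1 from the returned list, since Python lists count from 0.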

Selenium Python - Store XPath in var and extract deeper hierarchy XPath from var

Sadly, I couldn't find any resources online for my problem. I'm trying to store elements found by XPath in a list and then loop over those elements to search inside each one. But instead of searching inside the given element, Selenium seems to search the whole page again every time.
Anyone with good knowledge about this? I've seen that:
// Selects nodes in the document from the current node that matches the selection no matter where they are
But I've also tried "/" and it didn't work either.
Instead of giving me the text for each div, it gives me the text from all divs.
My Code:
from selenium import webdriver
driver = webdriver.Chrome()
result_text = []
# I'm looking for all divs with a specific class and store them in a list
divs_found = driver.find_elements_by_xpath("//div[@class='a-fixed-right-grid-col a-col-left']")
# Here seems to be the problem: instead of "divs_found[1]" it behaves like "driver" and looks at the whole site
hrefs_matching_in_div = divs_found[1].find_elements_by_xpath("//a[contains(@href, '/gp/product/')]")
# Now I'm looking in the found href matches to store the text from it
for href in hrefs_matching_in_div:
    result_text.append(href.text)
print(result_text)

You need to start the XPath with a dot (.) so that the search is relative to the current element instead of the whole document. Try this:
hrefs_matching_in_div = divs_found[1].find_elements_by_xpath(".//a[contains(@href, '/gp/product/')]")
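A sketch of how the relative XPath could be applied to every container, giving one list of link texts per div, which is what the question was after. `link_texts_per_div` is a hypothetical helper, and the string "xpath" is the value of Selenium's `By.XPATH` constant, written out so the sketch has no imports.

```python
# Value of selenium.webdriver.common.by.By.XPATH.
XPATH = "xpath"

# The leading dot keeps the search inside the element it is called on.
LINKS_INSIDE = ".//a[contains(@href, '/gp/product/')]"


def link_texts_per_div(divs):
    """Return one list of link texts for each container element in `divs`.

    Each item must offer Selenium's find_elements(by, value) method,
    for example the WebElements returned for the div XPath above.
    """
    results = []
    for div in divs:
        links = div.find_elements(XPATH, LINKS_INSIDE)
        results.append([link.text for link in links])
    return results
```

Called as `link_texts_per_div(divs_found)`, this keeps the texts of each div separate instead of merging everything on the page.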

Selenium Python - Finding Elements by Class Name With Spaces

I'm writing a Python program that uses Selenium to navigate to and enter information into search boxes on an advanced search page. This website uses Javascript, and the IDs and Names for each search box change slightly each time the website is loaded, but the Class Names remain consistent. Class names are frequently reused though, so my goal is to use find_elements_by_class_name(classname) and then index through that list.
One box, for example, has the class name x-form-text x-form-field x-form-num-field x-form-empty-field, but I can't use this because selenium considers it a compound class name and throws an error. If I use just a portion of it, such as x-form-text, it can't find the element. My hope is to either find a way to allow the spaces or, if that can't be done, find a way to search for all elements whose class name contains a section of text without spaces, such as x-form-text.
Any help or thoughts would be greatly appreciated!
Edit:
I tried this code:
quantminclass = 'x-form-text.x-form-field.x-form-num-field.x-form-empty-field'
quantmin = '25'
browser.find_elements_by_css_selector(quantminclass)[0].send_keys(quantmin)
But got an error that the list index was out of range, implying that it can't find anything. I inspected the element and that is definitely its class name, so I'm not sure how to proceed.

Those are multiple classes, not a single class with spaces. Just use all the classes together:
driver.find_element_by_css_selector('.class1.class2.class3')
In a CSS selector, a dot (.) marks a class, and you can chain any number of class names.

Try converting class name to a CSS selector.
With a CSS selector, a class named x-form-text x-form-field x-form-num-field
turns into .x-form-text.x-form-field.x-form-num-field
So basically just replace spaces with dots and you're good to go.
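The replacement can be done by a small helper; this is a sketch, and `classes_to_css` is a hypothetical name. Note that every class needs a dot, including the first one, which is the part missing from the attempt in the question's edit.

```python
def classes_to_css(class_attr):
    """Turn the value of a class attribute into a CSS selector.

    split() with no argument also copes with repeated or leading spaces.
    """
    return "".join("." + name for name in class_attr.split())


selector = classes_to_css("x-form-text x-form-field x-form-num-field x-form-empty-field")
print(selector)
# .x-form-text.x-form-field.x-form-num-field.x-form-empty-field
```

The result can be passed to a CSS selector lookup, and the list returned by the plural form can then be indexed as the question intended.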

Since Selenium 4, the find_element_by_* methods are deprecated (and they were removed in later 4.x releases), so you need to use
find_element() instead [Selenium-doc]
from selenium.webdriver.common.by import By
# By CLASS_NAME
driver.find_element(By.CLASS_NAME, "x-form-text.x-form-field.x-form-num-field.x-form-empty-field")
# By CSS_SELECTOR
driver.find_element(By.CSS_SELECTOR, ".x-form-text.x-form-field.x-form-num-field.x-form-empty-field")
# By XPATH
driver.find_element(By.XPATH, "//*[@class='x-form-text x-form-field x-form-num-field x-form-empty-field']")

If you have a class attribute (or any other attribute) whose value contains spaces, for example:
<div class="target with space or maybe another-long-text">Test 123</div>
This will work:
driver.find_element_by_xpath("//div[@class='target with space or maybe another-long-text']")
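An exact comparison on @class only matches when the whole attribute string is identical, including the order of the classes. As a sketch of a common alternative, the hypothetical helper below builds an XPath that matches one class regardless of which other classes are present.

```python
def xpath_has_class(tag, class_name):
    """Build an XPath matching `tag` elements that carry `class_name`.

    Padding the normalized attribute with spaces means only whole class
    names match, so "target" does not match "target-wide".
    """
    return (
        f"//{tag}[contains(concat(' ', normalize-space(@class), ' '), "
        f"' {class_name} ')]"
    )


print(xpath_has_class("div", "target"))
# //div[contains(concat(' ', normalize-space(@class), ' '), ' target ')]
```

This also covers the case from the question of finding every element that has x-form-text among its classes.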

Using selenium to get access class info on website

I am using the following code using Python 3.6 and selenium:
element = driver.find_element_by_class_name("first_result_price")
print(element)
On the website it looks like this:
<span class="first_result_price">712</span>
However, if I print element I get a completely different number.
Any suggestions?
Many thanks!

"element" is a type of object called WebElement that Selenium adds. If you want to find the text inside that element, you have to say
element.text
Which should return what you're looking for, '712', albeit in string form.
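Since the text comes back as a string, a conversion is needed before doing arithmetic with it. A minimal sketch, assuming the element's text is a plain whole number such as 712; `element_number` is a hypothetical helper.

```python
def element_number(element):
    """Return the element's visible text as an int.

    Assumes the text is a plain whole number; int() raises ValueError
    for anything else, such as "712 USD" or an empty string.
    """
    return int(element.text.strip())
```

With the element from the question this would be called as `element_number(element)`.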

In Selenium, how do I include a specific node [1] using find_elements_by_css_selector()

If I want the first occurrence of an element so that I don't have to guess an XPath for find_elements_by_xpath(), what are my options? The goal is to write less code and to make it easy to fix the scraper when the source I am scraping changes. Is it possible to do something like
find_elements_by_css_selector('source[1]')
This code does not work as is, though.
I am using Selenium with Python and will likely use PhantomJS as the webdriver (Firefox for testing).

In CSS selectors, square brackets select attributes, so your sample code is trying to select a 'source' element with an attribute named 1, e.g.
<source 1="your_element" />
Whereas I gather you're trying to find the first in a list that looks like this:
<source>Blah</source>
<source>Rah</source>
If you just want the first matching element, you can use the singular form:
element = driver.find_element_by_css_selector("source")
The form you were using returns a list, so you can also take the element at index n-1 to get the nth instance on the page (lists index from 0):
element = driver.find_elements_by_css_selector("source")[0]
Finally, if you want the CSS selector itself to state which element it finds, you can use the nth-of-type selector, which matches a source element that is the first of its type among its siblings:
element = driver.find_element_by_css_selector("source:nth-of-type(1)")
You might find some other helpful information at this blog post from Sauce Labs to help you write flexible selectors to replace your XPath.
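The list-indexing option can be wrapped so that a missing element produces a readable error instead of a bare "list index out of range". This is a sketch: `nth_element` is a hypothetical helper, and the string "css selector" is the value of Selenium's `By.CSS_SELECTOR` constant, written out so the sketch has no imports.

```python
# Value of selenium.webdriver.common.by.By.CSS_SELECTOR.
CSS_SELECTOR = "css selector"


def nth_element(root, css, n):
    """Return the n-th element (counting from 1) matching `css` under `root`.

    `root` is a driver or WebElement, or anything else that offers
    Selenium's find_elements(by, value) method.
    """
    matches = root.find_elements(CSS_SELECTOR, css)
    if not 1 <= n <= len(matches):
        raise IndexError(f"wanted match {n} of {css!r}, found {len(matches)}")
    return matches[n - 1]
```

For the first source element on the page this would be called as `nth_element(driver, "source", 1)`.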
