I am exploring Selenium with Python and trying to grab a name from a LinkedIn page in order to get its index later.
This is the HTML:
Here is how I try to do it:
all_span = driver.find_elements(By.TAG_NAME, "span")
all_span = [s for s in all_span if s.get_attribute("aria-hidden") == "true"]
counter = 1
for i in all_span:
    print(counter)
    print(i.text)
    counter += 1
The problem is that other spans on the same page also have the aria-hidden="true" attribute but are not relevant, and that messes up the index.
So I need to reach the span that contains the name from one of its parent divs, but I don't know how.
Looking at the documentation here: https://selenium-python.readthedocs.io/locating-elements.html# I can't seem to find how to target deeply nested elements.
I need to get the name that is in the span element.
The link
The best way would be to use xpath. https://selenium-python.readthedocs.io/locating-elements.html#locating-by-xpath
Let's say you have this:
<div id="this-div-contains-the-span-i-want">
  <span aria-hidden="true">
    <!--
    ...
    //-->
  </span>
</div>
Then, using xpath:
xpath = "//div[@id='this-div-contains-the-span-i-want']/span[@aria-hidden='true']"
span_i_want = driver.find_element(By.XPATH, xpath)
So, in your example, you could use:
xpath = "//a[@class='app-aware-link']/span[@dir='ltr']/span[@aria-hidden='true']"
span_i_want = driver.find_element(By.XPATH, xpath)
print(span_i_want.text)
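The difference between the unscoped span search and the scoped XPath can be checked without a browser. This is only a sketch: it uses the standard library's xml.etree.ElementTree instead of Selenium (its XPath subset covers the attribute predicates used here), and the markup, the "unrelated" div, and the names are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical markup modelled on the example above; the second span is a decoy.
html = """
<body>
  <div id="this-div-contains-the-span-i-want">
    <span aria-hidden="true">Jane Doe</span>
  </div>
  <div id="unrelated">
    <span aria-hidden="true">not relevant</span>
  </div>
</body>
"""

root = ET.fromstring(html)

# Unscoped: every aria-hidden span in the document, which is what breaks the index.
all_hidden = root.findall(".//span[@aria-hidden='true']")

# Scoped: only spans that are direct children of the div we care about.
scoped = root.findall(
    ".//div[@id='this-div-contains-the-span-i-want']/span[@aria-hidden='true']"
)

print(len(all_hidden))
print([s.text for s in scoped])
```

With Selenium the same scoped expression (starting with // instead of .//) would be passed to driver.find_element(By.XPATH, ...).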
There are no typos, but print(span_i_want) returns [], an empty list.
I have a group of elements on the page that looks like:
<div class="course-lesson__course-wrapper" data-qa-level="z1">
<div class="course-lesson__course-title">
<section class="course-lesson__wrap" data-qa-lesson="trial">
<section class="course-lesson__wrap" data-qa-lesson="trial">
There are several pages with this layout. I want to get a list of all the elements in z1 and then click on each one that has data-qa-lesson="trial".
I have this code:
#finds all the elements for z1 - ...etc
listofA1 = driver.find_element(By.CSS_SELECTOR, "div.course-lesson__course-wrapper:nth-child(1)")
for elemen in listofA1:
    #checks for the attribute i need to see if it's clickable
    elementcheck = elemen.getAttribute("data-qa-lesson")
    if elementcheck == "objective":
        elemen.click()
        #do some stuff then go back to main and begin again on the next element
        driver.get(home_link)
But it does not seem to work
To avoid StaleElementReferenceException, you can try this approach:
count = len(driver.find_elements(By.XPATH, '//div[@data-qa-level="z1"]//section[@data-qa-lesson="trial"]'))  # Get count of elements
for i in range(count):
    # Re-locate on every pass: references found before driver.get() go stale
    driver.find_elements(By.XPATH, '//div[@data-qa-level="z1"]//section[@data-qa-lesson="trial"]')[i].click()  # click current element
    # Do what you need
    driver.get(home_link)
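The re-locate-by-index idea can be wrapped in a small helper. This is a sketch under assumptions, not a definitive implementation: the function name click_each_fresh and the visit hook are hypothetical, and the locator strategy is passed as the plain string "xpath", which is the value of Selenium's By.XPATH, so the helper itself needs no import.

```python
def click_each_fresh(driver, xpath, home_link, visit=lambda d: None):
    """Click every match of `xpath`, looking the elements up again before each click."""
    # "xpath" is the string value of selenium's By.XPATH.
    count = len(driver.find_elements("xpath", xpath))
    clicked = 0
    for i in range(count):
        # References obtained before driver.get() are stale, so search again.
        elements = driver.find_elements("xpath", xpath)
        if i >= len(elements):
            break  # the page now has fewer matches than when we counted
        elements[i].click()
        visit(driver)          # hypothetical hook for "do what you need"
        driver.get(home_link)  # go back to the listing page
        clicked += 1
    return clicked
```

It would be called with the driver, the XPath for the trial lessons, and home_link; the return value is the number of elements actually clicked.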
I want to store elements in a list and click on each of them one by one. But I have to reach the <u> element, because .click() only works when I click on the XPath of that <u> element and not on the element with the class.
<a href="javascript:doQuery(...);" class="report">
<u>Test1</u>
<a href="javascript:doQuery(...);" class="report">
<u>Test2</u>
<a href="javascript:doQuery(...);" class="report">
<u>Test3</u>
Any tips? Thanks.
You can use a CSS selector for this.
selector = '.report > u'
elements = driver.find_elements(By.CSS_SELECTOR, selector)
for elem in elements:
    elem.click()
The selector .report > u selects all <u> elements whose parent is an element with the report class.
reference
How would I get the text "Premier League (ENG 1)" extracted from this HTML tree? (marked part)
I tried to get the text with XPath, a CSS selector, the class... but I can't seem to get this text extracted.
Basically I want to create a list, go over all "class=with icon" elements that include a text (League), and append the text to that list.
This was my last attempt:
def scrape_test():
    alleligen = []
    # click the dropdown menu to open the folder with all the leagues
    league_dropdown_menue = driver.find_element_by_xpath('/html/body/main/section/section/div[2]/div/div[2]/div/div[1]/div[1]/div[7]/div')
    liga_dropdown_menue.click()
    time.sleep(1)
    # get text from all elements that contain a league as text
    leagues = driver.find_elements_by_css_selector('body > main > section > section > div.ut-navigation-container-view--content > div > div.ut-pinned-list-container.ut-content-container > div > div.ut-pinned-list > div.ut-item-search-view > div.inline-list-select.ut-search-filter-control.has-default.has-image.is-open.active > div > ul > li:nth-child(3)')
    # append to list
    alleligen.append(leagues)
    print(alleligen)
But I don't get any output.
What am I missing here?
(I am new to coding)
Try this:
path = "//ul[@class='inline-list']//li[2]"
element = WebDriverWait(driver, 5).until(EC.presence_of_element_located((By.XPATH, path))).text
print(element)
The path specifies the element you want to target. The leading // means the element can sit anywhere in the page rather than directly at the document root. li[2] states that you are interested in the second li, that is, the li after the first one.
WebDriverWait waits up to the specified number of seconds (in this case, 5) for the element to be present and raises a TimeoutException if it never appears, so you might want to put the WebDriverWait call inside a try block.
The .text at the end returns the text of the tag. In this case it is the text you want: Premier League (ENG 1).
Can you try:
leagues = driver.find_elements(By.XPATH, "//li[@class='with-icon' and contains(text(), 'League')]")
for league in leagues:
    alleligen.append(league.text)
print(alleligen)
If you know that your element will stay at the same position in that list, you can use the following, where the li element is selected by its index:
locator = "//ul[@class='inline-list']//li[2]"
element = WebDriverWait(driver, 10).until(EC.presence_of_element_located((By.XPATH, locator))).text
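Both ideas from the answers above, picking an li by position and keeping only the items whose text mentions "League", can be tried offline. This sketch uses the standard library's xml.etree.ElementTree rather than Selenium, and the list markup is hypothetical because the real page's HTML is not shown in the question. XPath positions start at 1, so li[2] is the second item.

```python
import xml.etree.ElementTree as ET

# Hypothetical dropdown markup; only the class names come from the answers above.
html = """
<div>
  <ul class="inline-list">
    <li class="with-icon">Any</li>
    <li class="with-icon">Premier League (ENG 1)</li>
    <li class="with-icon">Bundesliga (GER 1)</li>
  </ul>
</div>
"""

root = ET.fromstring(html)

# Positional predicate: the second <li> child of the list.
second = root.find(".//ul[@class='inline-list']/li[2]")
print(second.text)

# Text filter done in Python, since ElementTree has no contains() function.
leagues = [
    li.text
    for li in root.findall(".//li[@class='with-icon']")
    if li.text and "League" in li.text
]
print(leagues)
```

The positional version is shorter but breaks if the list order changes; the text filter is slower to write but does not depend on position.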
The html snippet is like this:
<div class="busi-attr">
<p><span class="attrName">Min. Order: </span>1 Piece</p>
<p><span class="attrName">Supply Ability: </span>10,000 Piece/Pieces per Month</p>
</div>
I only want the value from the second <p> element, that is, the amount 10,000 and nothing else. How do I do that? Thanks.
Try XPath with a list comprehension that checks the text value of an element (the .xpath() and .itertext() calls below are lxml element methods, not Selenium WebElement methods):
span = [x for x in yourwebelement.xpath('//span[@class="attrName"]') if 'supply ability' in x.text.lower()][0]
This is only the span of course, but all you need now is the parent:
p = span.xpath('./parent::p')[0]
I think the span block stops you from getting the text value of p, so let's get all the text minus whatever is in the span.
text = [x for x in p.itertext() if x != span.text]
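A runnable variant of the same idea, as a sketch: it swaps lxml for the standard library's xml.etree.ElementTree and reads the text that follows the span through the span's .tail attribute instead of filtering itertext(). The helper name value_for is hypothetical; the markup is the snippet from the question.

```python
import xml.etree.ElementTree as ET

html = """
<div class="busi-attr">
<p><span class="attrName">Min. Order: </span>1 Piece</p>
<p><span class="attrName">Supply Ability: </span>10,000 Piece/Pieces per Month</p>
</div>
"""

root = ET.fromstring(html)

def value_for(label):
    """Return the text that follows the span whose label matches, or None."""
    for p in root.findall("p"):
        span = p.find("span[@class='attrName']")
        if span is not None and label.lower() in (span.text or "").lower():
            # .tail is the text after the span's closing tag, still inside <p>.
            return (span.tail or "").strip()
    return None

supply = value_for("supply ability")
amount = supply.split()[0]  # first whitespace-separated token
print(supply)
print(amount)
```

Matching on the label rather than on "the second <p>" keeps working if the order of the attributes changes.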
I'm trying to get the text $27.50 inside a <div> tag. I located the element by id, and the element is called "price".
The snippet of html is as follows:
<div id="PPP,BOSSST,NYCPAS,2015-04-26T01:00:00-04:00,2015-04-26T05:20:00-04:00,_price" class="price inlineBlock strong mediumText">$27.50</div>
Here is what I've tried
price.text
price.get_attribute('value')
Neither of the above works.
Update:
Thanks to everyone who tried to help.
I combined your answers together and got the solution :)
price = driver.find_element(By.XPATH, "//div[@class='price inlineBlock strong mediumText']")
price_content = price.get_attribute('innerHTML')
print(price_content.strip())
Can't you use a regular expression or Beautiful Soup to find the contents of the element in the HTML?
re.search(r'<div.*?>(.*?)</div>', price.get_attribute('outerHTML')).group(1)
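The regular-expression route can be checked on the snippet from the question. In this sketch the string outer_html stands in for what price.get_attribute('outerHTML') would return (an assumption; with innerHTML the surrounding div tags would be missing and the pattern would not match).

```python
import re

# Stand-in for price.get_attribute("outerHTML"), copied from the question.
outer_html = (
    '<div id="PPP,BOSSST,NYCPAS,2015-04-26T01:00:00-04:00,'
    '2015-04-26T05:20:00-04:00,_price" '
    'class="price inlineBlock strong mediumText">$27.50</div>'
)

# Non-greedy capture of everything between the opening and closing div tags.
match = re.search(r"<div[^>]*>(.*?)</div>", outer_html, re.DOTALL)
price_text = match.group(1).strip() if match else None
print(price_text)
```

Checking match before calling group(1) avoids an AttributeError when the pattern does not match.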
Your element is hidden; the last time I worked with Selenium, you were not able to get the text of hidden elements. That said, you can always execute JavaScript. I don't usually write Python, but it should be something like:
val = driver.execute_script("return document.getElementById('locator').innerHTML")
Change the css selector to
div[id$='_price']
Complete code
price = fltright.find_element(By.CSS_SELECTOR, "div[id$='_price']")
price.text
I tried your edited solution, but it only gets one div with that class. So I tried the following to print a list of all the divs with the same class.
Changing find_element to find_elements returns a list:
price = driver.find_elements(By.XPATH, '//div[@class = "price inlineBlock strong mediumText"]')
Use for ... in range() to print the list:
num = len(price)
for i in range(num):
    print(price[i].text)
browser.find_element(By.XPATH, "//form[@id='workQueueTaskListForm']/div[1]/p").text