Hey all, trust that you're well. I'm trying to find elements by class name and loop through them; however, they all have the same class name.
I've discovered that they carry different index values, and I'm trying to use that to loop through them.
Example of the elements and their index attributes:
<div class="member-2gU6Ar container-1oeRFJ clickable-28SzVr" aria-controls="popout_4188" aria-expanded="false" tabindex="-1" colorroleid="987314373729067059" index="0" role="listitem" data-list-item-id="members-987320208253394947___0">
<div class="member-2gU6Ar container-1oeRFJ clickable-28SzVr" aria-controls="popout_4184" aria-expanded="false" tabindex="-1" colorroleid="987324577870929940" index="1" role="listitem" data-list-item-id="members-987320208253394947___1">
My code:
users = bot.find_elements(By.CLASS_NAME, 'member-2gU6Ar')
time.sleep(5)
try:
    for user in users:
        user.click()
        message = bot.find_element(By.XPATH, '//body[1]/div[1]/div[2]/div[1]/div[3]/div[1]/div[1]/div[1]/div[5]/div[1]/input[1]')
        time.sleep(5)
        message.send_keys('Automated' + Keys.ENTER)
except NoSuchElementException:
    pass
The value you see here, member-2gU6Ar container-1oeRFJ clickable-28SzVr, is not a single class; it is a combination of multiple classes separated by spaces.
By.CLASS_NAME takes one class name only, so passing that whole string to it would not work as expected.
You can replace the spaces with . to make a CSS selector, though:
div.member-2gU6Ar.container-1oeRFJ.clickable-28SzVr
I would not really recommend that, since the class names contain alphanumeric suffixes that may change over time.
Here is an XPath I have written:
//div[starts-with(@class,'member') and contains(@class, 'container') and @index]
This should match all the divs that have these attributes.
You could use it like this:
users = bot.find_elements(By.XPATH, "//div[starts-with(@class,'member') and contains(@class, 'container') and @index]")
i = 0  # the index attribute starts at 0 in the HTML shown in the question
time.sleep(5)
try:
    for user in users:
        # Re-locate the element by its index attribute on every iteration
        ele = bot.find_element(By.XPATH, f"//div[starts-with(@class,'member') and contains(@class, 'container') and @index='{i}']")
        ele.click()
        message = bot.find_element(By.XPATH, '//body[1]/div[1]/div[2]/div[1]/div[3]/div[1]/div[1]/div[1]/div[5]/div[1]/input[1]')
        time.sleep(5)
        message.send_keys('Automated' + Keys.ENTER)
        i = i + 1
except NoSuchElementException:
    pass
However, I would recommend using a relative XPath rather than the absolute XPath //body[1]/div[1]/div[2]/div[1]/div[3]/div[1]/div[1]/div[1]/div[5]/div[1]/input[1].
Hope this helps.
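A minimal sketch of the same idea with explicit waits instead of fixed sleeps. The names member_xpath and message_each_member, and the message_xpath parameter, are hypothetical (the thread only shows the absolute path); bot is assumed to be a Selenium WebDriver, and the index attributes are assumed to run 0, 1, 2, ... without gaps. Selenium is imported inside the function so that the XPath helper can be checked on its own.

```python
def member_xpath(index=None):
    """Build the XPath for member rows; with an index, target a single row."""
    base = "//div[starts-with(@class,'member') and contains(@class,'container')"
    if index is None:
        return base + " and @index]"
    return base + f" and @index='{index}']"


def message_each_member(bot, message_xpath, text="Automated", timeout=10):
    """Click every member row in index order and send `text` to each one.

    `message_xpath` is a placeholder for a relative XPath to the message input.
    """
    from selenium.webdriver.common.by import By
    from selenium.webdriver.common.keys import Keys
    from selenium.webdriver.support.ui import WebDriverWait
    from selenium.webdriver.support import expected_conditions as EC

    wait = WebDriverWait(bot, timeout)
    # Assumption: indices are contiguous, so the count gives the index range.
    count = len(bot.find_elements(By.XPATH, member_xpath()))
    for i in range(count):
        # Re-locate by index each time, since the page may re-render after a click.
        wait.until(EC.element_to_be_clickable((By.XPATH, member_xpath(i)))).click()
        box = wait.until(EC.element_to_be_clickable((By.XPATH, message_xpath)))
        box.send_keys(text + Keys.ENTER)
```

If a row never becomes clickable, wait.until raises TimeoutException, which the caller can catch in place of NoSuchElementException.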
Related
I am trying to automate adding items to the cart in an online shop; however, I got stuck on a loop that should check whether an item is available or not.
Here's the loop:
while True:
    # if ???
        WebDriverWait(driver, 5).until(EC.element_to_be_clickable((By.XPATH, "//*[text()='" + size.get() + "']"))).click()
        sleep(1)
        WebDriverWait(driver, 5).until(EC.element_to_be_clickable((By.XPATH, "//*[text()='Add to cart']"))).click()
        sleep(1)
        print("Success!")
        break
    else:
        driver.refresh()
        sleep(3)
If the size is available, the button is active:
<div class="styles__ArticleSizeItemWrapper-sc-dt4c4z-4 eQqdpu">
<button aria-checked="false" role="radio" class="styles__ArticleSizeButton-sc-1n1fwgw-0 jIVZOs">
<span class="styles__StyledText-sc-cia9rt-0 styles__StyledText-sc-1n1fwgw-2 styles__ArticleSizeItemTitle-sc-1n1fwgw-3 gnSCRf cLhSqA bipwfD">XL</span>
<span class="styles__StyledText-sc-cia9rt-0 ffGzxX">
</span>
</button>
</div>
If not, the button is inactive:
<div class="styles__ArticleSizeItemWrapper-sc-dt4c4z-4 eQqdpu">
<button disabled="" aria-checked="false" role="radio" class="styles__ArticleSizeButton-sc-1n1fwgw-0 fBeTLI">
<span class="styles__StyledText-sc-cia9rt-0 styles__StyledText-sc-1n1fwgw-2 styles__ArticleSizeItemTitle-sc-1n1fwgw-3 gnSCRf cLhSqA bipwfD">XXL</span>
<span class="styles__StyledText-sc-cia9rt-0 styles__StyledText-sc-1n1fwgw-2 kQJTJc cLhSqA">
</span>
</button>
</div>
The question is: what should be the condition for this loop?
I have tried something like this:
if (driver.find_elements(By.XPATH, "//*[contains(@class='styles__ArticleSizeButton-sc-1n1fwgw-0 jIVZOs') and text()='" + e2.get() + "']")):
EDIT: Replaced "=" with "," in the above code as follows:
if (driver.find_elements(By.XPATH, "//*[contains(@class, 'styles__ArticleSizeButton-sc-1n1fwgw-0 jIVZOs') and text()='" + e2.get() + "']")):
but I keep getting invalid xpath expression error.
EDIT: The error is gone, but the browser keeps refreshing with the else statement (element not found).
I believe your error is in the use of the contains function, which expects two parameters, a string and a substring, whereas you're passing it a single boolean expression (@class='styles__ArticleSizeButton-sc-1n1fwgw-0 jIVZOs').
I expect this is just a typo and you actually meant to type contains(@class, 'styles__ArticleSizeButton-sc-1n1fwgw-0 jIVZOs') (NB: a comma instead of an equals sign after @class).
Also, you are looking for a button element that has a child text node (text() refers to a text node) equal to the size you're looking for, but that text node is actually a child of a span, which in turn is a child of the button. You can compare your size to the text value of that span instead.
Try something like this:
"//*[contains(@class, 'styles__ArticleSizeButton-sc-1n1fwgw-0 jIVZOs') and span='"
+ e2.get()
+ "']"
e3 = "Some value"
# Match an enabled button with that class whose child span contains the size text
x = f"//button[contains(@class,'styles__ArticleSizeButton-sc-1n1fwgw-0 jIVZOs') and not(@disabled) and ./span[contains(text(),'{e3}')]]"
print(x)
Try looking for the button that has that class and contains that span, and also check whether the button is disabled.
I managed to get it working using this condition:
if (driver.find_elements(By.XPATH,
        "//*[contains(@class, 'styles__ArticleSizeButton-sc-1n1fwgw-0 jIVZOs') "
        "and .//*[text()='" + e2.get() + "']]")):
It is quite similar to the original approach; however, adding .//* before text() did the trick.
Without .//*, the text() test was applied to the button node itself, which has no matching text node, so nothing was found. .//* makes the expression look in the descendant nodes, where the text actually lives.
Important: the text condition is wrapped in an additional pair of [] brackets.
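The pieces from this thread can be put together as a rough sketch of the loop condition. It assumes the markup shown in the question (a button with role="radio", a disabled attribute when the size is unavailable, and the size label in a child span) and deliberately avoids the generated class names. The function names are hypothetical, and driver is assumed to be a Selenium WebDriver.

```python
def enabled_size_button_xpath(size):
    """XPath for a size button that is not disabled and whose span reads `size`."""
    return (
        "//button[@role='radio' and not(@disabled) "
        f"and .//span[normalize-space()='{size}']]"
    )


def add_to_cart_when_available(driver, size, refresh_pause=3, timeout=5):
    """Refresh the page until the size button is enabled, then add to cart."""
    from time import sleep
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support.ui import WebDriverWait
    from selenium.webdriver.support import expected_conditions as EC

    while True:
        # find_elements returns an empty list (falsy) when nothing matches,
        # so it can serve directly as the loop condition.
        buttons = driver.find_elements(By.XPATH, enabled_size_button_xpath(size))
        if buttons:
            buttons[0].click()
            WebDriverWait(driver, timeout).until(
                EC.element_to_be_clickable((By.XPATH, "//*[text()='Add to cart']"))
            ).click()
            return
        driver.refresh()
        sleep(refresh_pause)
```

Like the loop in the question, this keeps refreshing indefinitely while the size is unavailable; a retry limit could be added if that is not wanted.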
I am trying to search a table for a specific value (Document ID) and then press a button that is next to that column (Retire). I should add that the 'Retire' button is only visible while the mouse hovers over the row, but I have built that into my code, which I'll share further down.
So for example:
My Document ID would be 0900766b8001b6a3, and I would want to click the button called 'Retire'. The issue I'm having is obtaining the XPaths for the retire buttons, which need to be dynamic. I got it working for some Document IDs that share a common part with the button id, for example:
A700000007201082 has the XPath //*[@id="retire-7201082"] (you can see the commonality here: the Document ID ends in the same number, 7201082, as the XPath). In the first example, however, the XPath for '0900766b8001b6a3' is //*[@id="retire-251642"]; the retire number is unrelated to the Document ID, so the XPath is hard to build manually.
Here is my code:
before_XPath = "//*[@class='wp-list-table widefat fixed striped table-view-list pages']/tbody/tr["
aftertd_XPath_1 = "]/td[1]"
aftertd_XPath_2 = "]/td[2]"
aftertd_XPath_3 = "]/td[3]"
before_XPath_1 = "//*[@class='wp-list-table widefat fixed striped table-view-list pages']/tbody/tr[1]/th["
before_XPath_2 = "//*[@class='wp-list-table widefat fixed striped table-view-list pages']/tbody/tr[2]/td["
aftertd_XPath = "]/td["
after_XPath = "]"
aftertr_XPath = "]"
search_text = "0900766b8001af05"
time.sleep(10)
num_rows = len(driver.find_elements_by_xpath("//*[@class='wp-list-table widefat fixed striped table-view-list pages']/tbody/tr"))
num_columns = len(driver.find_elements_by_xpath("//*[@class='wp-list-table widefat fixed striped table-view-list pages']/tbody/tr[2]/td"))
elem_found = False
for t_row in range(2, (num_rows + 1)):
    for t_column in range(1, (num_columns + 1)):
        FinalXPath = before_XPath + str(t_row) + aftertd_XPath + str(t_column) + aftertr_XPath
        cell_text = driver.find_element_by_xpath(FinalXPath).text
        if ((cell_text.casefold()) == (search_text.casefold())):
            print("Search Text "+ search_text +" is present at row " + str(t_row) + " and column " + str(t_column))
            elem_found = True
            achains = ActionChains(driver)
            move_to = driver.find_element_by_xpath("/html/body/div[1]/div[2]/div[2]/div[1]/div[3]/form[1]/table/tbody/tr[" + str(t_row) + "]/td[2]")
            achains.move_to_element(move_to).perform()
            retire_xpath = driver.find_element_by_xpath("//*[@id='retire-"+ str(search_text[-7:])+"']")
            time.sleep(6)
            driver.execute_script("arguments[0].click();", move_to)
            time.sleep(6)
            driver.switch_to.alert.accept()
            break
if (elem_found == False):
    print("Search Text "+ search_text +" not found")
This particular bit of code lets me handle any Document IDs such as 'A700000007201082', as I can just cut off the part I need and build it into an XPath:
retire_xpath = driver.find_element_by_xpath("//*[@id='retire-"+ str(search_text[-7:])+"']")
I've tried to replicate the above for the Doc IDs starting with 09007, but I can't find how to pull that unique number as it isn't anywhere accessible in the table.
I am wondering if there's something I can do to build it the same way I have above or perhaps focus on the index? Any advice is much appreciated, thanks.
EDIT:
This is the HTML code for the RETIRE button for Document ID: 0900766b8001b6a3
<span class="retire"><button id="retire-251642" data-document-id="251642" rel="bookmark" aria-label="Retire this document" class="rs-retire-link">Retire</button></span>
You can see the retire button id is completely different to the Document ID. Here is some HTML code just above it which I think could be useful:
<div class="hidden" id="inline_251642">
<div class="post_title">General Purpose Test Kit Lead</div><div class="post_name">0900766b8001b6a3</div>
<div class="post_author">4</div>
<div class="comment_status">closed</div>
<div class="ping_status">closed</div>
<div class="_status">publish</div>
<div class="jj">30</div>
<div class="mm">03</div>
<div class="aa">2001</div>
<div class="hh">15</div>
<div class="mn">43</div>
<div class="ss">03</div>
<div class="post_password"></div><div class="post_parent">0</div><div class="page_template">default</div><div class="tags_input" id="rs-language-code_251642">de, en, fr, it</div><div class="tags_input" id="rs-current-state_251642">public</div><div class="tags_input" id="rs-doc-class-code_251642">rs_instruction_sheet</div><div class="tags_input" id="rs-restricted-countries_251642"></div></div>
Would it be possible to look up the div with class "post_name", as this has the correct doc ID, and then press the RETIRE button for that specific Doc ID?
Thank you.
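One possible approach, sketched under the assumption that every row has a hidden div with id inline_<n> containing a post_name div, and that the matching button has id retire-<n>, as in the HTML above. The function names are hypothetical, and driver is assumed to be a Selenium WebDriver. The idea is to find the inline div by its post_name text, read the number from its id, and build the button id from that.

```python
def inline_div_xpath(doc_id):
    """XPath for the inline_<n> div whose post_name child equals the Document ID."""
    return (
        "//div[starts-with(@id,'inline_') "
        f"and div[@class='post_name' and normalize-space()='{doc_id}']]"
    )


def retire_button_id(inline_id):
    """Map an id such as 'inline_251642' to the button id 'retire-251642'."""
    prefix = "inline_"
    if not inline_id.startswith(prefix):
        raise ValueError(f"unexpected id: {inline_id!r}")
    return "retire-" + inline_id[len(prefix):]


def click_retire(driver, doc_id):
    """Click the Retire button that belongs to `doc_id` and accept the alert."""
    from selenium.webdriver.common.by import By

    inline = driver.find_element(By.XPATH, inline_div_xpath(doc_id))
    # Read the id attribute rather than .text, which is empty for hidden elements.
    button = driver.find_element(By.ID, retire_button_id(inline.get_attribute("id")))
    # The button only appears on hover, so click it through JavaScript,
    # the same workaround the question already uses.
    driver.execute_script("arguments[0].click();", button)
    driver.switch_to.alert.accept()
```

The XPath test runs against the DOM text, so it still matches although the div is not displayed. With this lookup the table no longer has to be scanned cell by cell.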
I am trying to scrape data from this page: https://www.oddsportal.com/soccer/chile/primera-division/curico-unido-o-higgins-CtsLggl6/#over-under;2
Here I am trying to expand all of the "compare odds" fields, which are contained in this HTML:
<div class="table-container">
<div class="table-header-light even"><strong>Over/Under +1 </strong><span class="avg chunk-odd-payout">93.4%</span><span class="avg chunk-odd nowrp">5.63</span>
<span
class="avg chunk-odd nowrp">1.12</span><span class="odds-cnt">(3)</span><span class="odds-co"><a class="more" href="" onclick="page.togleTableContent('P-1.00-0-0',this);return false;">Compare odds</a></span></div>
</div>
<div class="table-container" style="display: none;">
<div class="table-header-light"><strong>Over/Under +1.25 </strong><span class="avg chunk-odd-payout"></span><span class="avg chunk-odd nowrp"></span><span class="avg chunk-odd nowrp"></span>
<span
class="odds-cnt">(0)</span><span class="odds-co"><a class="more" href="" onclick="page.togleTableContent('P-1.25-0-0',this);return false;">Compare odds</a></span></div>
</div>
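The part I am trying to access is the following:
<span class="odds-co"><a class="more" href="" onclick="page.togleTableContent('P-1.00-0-0',this);return false;">Compare odds</a></span>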
I have tried all of the following:
# odds_rows = browser.find_elements_by_class_name('more')
# odds_rows = browser.find_elements_by_css_selector(".more")
# odds_rows = WebDriverWait(browser, 10).until(EC.visibility_of_all_elements_located((By.XPATH, "//*[@class='more']")))
odds_rows = WebDriverWait(browser, 10).until(EC.visibility_of_all_elements_located((By.CSS_SELECTOR, ".more")))
# odds_rows = WebDriverWait(browser, 10).until(EC.visibility_of_all_elements_located((By.CLASS_NAME, "more")))
In order to subsequently loop through the identified fields and click each one:
for i in odds_rows:
    # browser.execute_script("arguments[0].click();", i)
    i.click()
However, already at the step of identifying the fields, I get a timeout error on all WebDriverWait attempts except for
odds_rows=WebDriverWait(browser, 10).until(EC.visibility_of_all_elements_located((By.CLASS_NAME, "more")))
This option yields only one result:
[<selenium.webdriver.remote.webelement.WebElement (session="7cbc57173a57aadbc115264dff8ca620", element="3654f928-bca4-4033-9566-da9e6aa6294b")>]
However, this result is not clicked subsequently.
What am I doing wrong?
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC

driver = webdriver.Chrome()
driver.implicitly_wait(10)
driver.get("https://www.oddsportal.com/soccer/chile/primera-division/curico-unido-o-higgins-CtsLggl6/#over-under;2;0.50;0")
driver.maximize_window()
odds_rows = WebDriverWait(driver, 10).until(
    EC.presence_of_all_elements_located((By.CSS_SELECTOR, '.table-header-light')))
for i in odds_rows:
    count = i.find_element_by_xpath('./span[@class="odds-cnt"]')
    elem = i.find_elements_by_xpath('.//*[contains(text(),"Compare")]')
    txt = count.text
    # Skip rows whose count is empty (they are hidden) or that have no link
    if txt != '' and len(elem):
        elem = elem[0]
        driver.execute_script("arguments[0].scrollIntoView();", elem)
        elem.click()
The issue is that a row whose count is '' is not visible, so you cannot click it.
If you click on 'Compare odds', you can see the URL change from
https://www.oddsportal.com/soccer/chile/primera-division/curico-unido-o-higgins-CtsLggl6/#over-under;2
to
https://www.oddsportal.com/soccer/chile/primera-division/curico-unido-o-higgins-CtsLggl6/#over-under;2;0.50;0
If you keep clicking, you will see that the last part,
2;0.50;0, increases by 0.5 each time. Next are
https://www.oddsportal.com/soccer/chile/primera-division/curico-unido-o-higgins-CtsLggl6/#over-under;2;1.00;0
https://www.oddsportal.com/soccer/chile/primera-division/curico-unido-o-higgins-CtsLggl6/#over-under;2;1.50;0
and so on.
Alternatively, the data is in the element with class "table-main detail-odds sortable", which is hidden by default, so you DON'T need to click; you can scrape that class directly.
I hope this helps.
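The URL pattern described above could be generated along these lines. This is only a sketch: the range of lines (first, last) is an assumption, the function name is hypothetical, and the thread does not confirm how the page reacts when such a URL is opened directly.

```python
BASE = "https://www.oddsportal.com/soccer/chile/primera-division/curico-unido-o-higgins-CtsLggl6/"


def over_under_urls(first=0.5, last=2.5, step=0.5):
    """Yield one URL per line, e.g. ...#over-under;2;0.50;0, ...;1.00;0, ..."""
    n = int(round((last - first) / step)) + 1
    for k in range(n):
        # Two decimals, matching the fragments seen in the browser address bar.
        yield f"{BASE}#over-under;2;{first + k * step:.2f};0"
```

Each generated URL could then be passed to driver.get(), in the same way as the URL used in the first answer.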
So I am trying to improve my code and my XPaths, and I am trying to handle a field like this:
<div class="h3">Login</div> <div class="Is3WC-spot Is3WC-spotInfo icon xdm icon_information_16 dialogInfoSpot" style="display: none;" aria-hidden="true"><div class="Is3WC-spotRelativeDivPos"> <div class="Is3WC-spotTextMargins">User Name (login) must:<br>- contain at least 3 characters<br>- be shorter than 60 characters</div> <div class="Is3WC-spotClose icon hover xdm icon_delete_minor_white"></div> </div></div> <div class="floatLeft"><div><input type="text" class="dialogTextBox error"></div> <div class="Is3WC-errorLabel">Login is required.</div></div> <div class="dialogInformationIcon iconInformation clickable"></div>
How can I write a better XPath for it?
It works now with "/html/body/div[8]/div/table/tbody/tr[2]/td[2]/div/div/div/div[2]/div[3]/div[1]/input", and I'm not proud of it at all...
And a question about WebDriverWait: in the topic How to click on GWT enabled elements using Selenium and Python, one person recommended that I use WebDriverWait, but I wrote my own method:
@staticmethod
def clickPerform(driver, xpath):
    counter = 0
    while True:
        try:
            driver.find_element_by_xpath(xpath).click()
            break
        except ElementClickInterceptedException:
            time.sleep(1)
            if counter == 10:
                Verify.verifyAssert(True, False, "Not able to click ElementClickInterceptedException: " + xpath)
                break
            counter = counter + 1
        except ElementNotInteractableException:
            time.sleep(1)
            if counter == 10:
                Verify.verifyAssert(True, False, "Not able to click ElementNotInteractableException: " + xpath)
                break
            counter = counter + 1
Is it better to use the mentioned WebDriverWait instead of my method? (I think it is :), but I'm looking for opinions.)
There are multiple ways to write an XPath relative to the text here. You can try
//div[text()='Login is required.']/preceding-sibling::div/input
OR
//div[text()='Login is required.']/parent::div//input[contains(@class,'TextBox')]
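On the WebDriverWait part of the question, here is a hedged sketch of how the hand-written retry loop might be expressed with WebDriverWait. The function name click_when_ready is hypothetical, and the True/False return value stands in for the Verify.verifyAssert call, whose behaviour the question does not show. LOGIN_INPUT_XPATH is the first relative XPath suggested above.

```python
LOGIN_INPUT_XPATH = "//div[text()='Login is required.']/preceding-sibling::div/input"


def click_when_ready(driver, xpath, timeout=10):
    """Poll once per second until the element can be clicked; False on timeout."""
    from selenium.common.exceptions import (
        ElementClickInterceptedException,
        ElementNotInteractableException,
        TimeoutException,
    )
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support.ui import WebDriverWait
    from selenium.webdriver.support import expected_conditions as EC

    def try_click(drv):
        # element_to_be_clickable returns the element once it is visible
        # and enabled, otherwise a falsy value.
        element = EC.element_to_be_clickable((By.XPATH, xpath))(drv)
        if not element:
            return False
        element.click()
        return True

    wait = WebDriverWait(
        driver,
        timeout,
        poll_frequency=1,
        # Retried in addition to NoSuchElementException, which is ignored by default.
        ignored_exceptions=[
            ElementClickInterceptedException,
            ElementNotInteractableException,
        ],
    )
    try:
        wait.until(try_click)
        return True
    except TimeoutException:
        return False
```

The counter, the sleep and the two except branches all collapse into the wait object, and the timeout is stated in seconds in one place. A call might look like click_when_ready(driver, LOGIN_INPUT_XPATH).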
I am trying to get names and affiliations of authors from a series of articles from this page (you'll need to have access to Proquest to visualise it). What I want to do is to open all the tooltips present at the top of the page, and extract some HTML text from them. This is my code:
from selenium import webdriver
from selenium.webdriver.common.action_chains import ActionChains

browser = webdriver.Firefox()
url = 'http://search.proquest.com/econlit/docview/56607849/citation/2876523144F544E0PQ/3?accountid=13042'
browser.get(url)
#insert your username and password here
n_authors = browser.find_elements_by_class_name('zoom') #zoom is the class name of the three tooltips that I want to open in my loop
author = []
institution = []
for a in n_authors:
    print(a)
    ActionChains(browser).move_to_element(a).click().perform()
    html_author = browser.find_element_by_xpath('//*[@id="authorResolveLinks"]/li/div/a').get_attribute('innerHTML')
    html_institution = browser.find_element_by_xpath('//*[@id="authorResolveLinks"]/li/div/p').get_attribute('innerHTML')
    author.append(html_author)
    institution.append(html_institution)
Although n_authors has three entries that are apparently different from one another, selenium fails to get the info from all tooltips, instead returning this:
author
#['Nuttall, William J.',
#'Nuttall, William J.',
#'Nuttall, William J.']
And the same happens for the institution. What am I getting wrong? Thanks a lot
EDIT:
The array containing the xpaths of the tooltips:
n_authors
#[<selenium.webdriver.remote.webelement.WebElement (session="277c8abc-3883-
#43a8-9e93-235a8ded80ff", element="{008a2ade-fc82-4114-b1bf-cc014d41c40f}")>,
#<selenium.webdriver.remote.webelement.WebElement (session="277c8abc-3883-
#43a8-9e93-235a8ded80ff", element="{c4c2d89f-3b8a-42cc-8570-735a4bd56c07}")>,
#<selenium.webdriver.remote.webelement.WebElement (session="277c8abc-3883-
#43a8-9e93-235a8ded80ff", element="{9d06cb60-df58-4f90-ad6a-43afeed49a87}")>]
Which has length 3, and the three elements are different, which is why I don't understand why selenium won't distinguish them.
EDIT 2:
Here is the relevant HTML
<span class="titleAuthorETC small">
<span style="display:none" class="title">false</span>
Jamasb, Tooraj
<a class="zoom" onclick="return false;" href="#">
<img style="margin-left:4px; border:none" alt="Visualizza profilo" id="resolverCitation_previewTrigger_0" title="Visualizza profilo" src="/assets/r20161.1.0-4/ctx/images/scholarUniverse/ar_button.gif">
</a><script type="text/javascript">Tips.images = '/assets/r20161.1.0-4/pqc/javascript/prototip/images/prototip/';</script>; Nuttall, William J
<a class="zoom" onclick="return false;" href="#">
<img style="margin-left:4px; border:none" alt="Visualizza profilo" id="resolverCitation_previewTrigger_1" title="Visualizza profilo" src="/assets/r20161.1.0-4/ctx/images/scholarUniverse/ar_button.gif">
</a>; Pollitt, Michael G
<a class="zoom" onclick="return false;" href="#">
<img style="margin-left:4px; border:none" alt="Visualizza profilo" id="resolverCitation_previewTrigger_2" title="Visualizza profilo" src="/assets/r20161.1.0-4/ctx/images/scholarUniverse/ar_button.gif">
</a>.
UPDATE:
@parishodak's answer for some reason does not work in Firefox unless I manually hover over the tooltips first. It works with chromedriver, but only if I first hover over the tooltips and only if I allow time.sleep(), as in
for i in itertools.count():
    try:
        tooltip = browser.find_element_by_xpath('//*[@id="resolverCitation_previewTrigger_' + str(i) + '"]')
        print(tooltip)
        ActionChains(browser).move_to_element(tooltip).perform()
    except NoSuchElementException:
        break
    time.sleep(2)
elements = browser.find_elements_by_xpath('//*[@id="authorResolveLinks"]/li/div/a')
author = []
for e in elements:
    print(e)
    attribute = e.get_attribute('innerHTML')
    author.append(attribute)
It returns the same element because the XPath does not change between loop iterations.
There are two ways to deal with this:
Use index notation in the XPath, as shown below:
browser.find_element_by_xpath('//*[@id="authorResolveLinks"]/li/div/a[1]').get_attribute('innerHTML')
browser.find_element_by_xpath('//*[@id="authorResolveLinks"]/li/div/a[2]').get_attribute('innerHTML')
browser.find_element_by_xpath('//*[@id="authorResolveLinks"]/li/div/a[3]').get_attribute('innerHTML')
Or
Instead of find_element_by_xpath, use find_elements_by_xpath:
elements = browser.find_elements_by_xpath('//*[@id="authorResolveLinks"]/li/div/a')
Then loop over elements and call get_attribute('innerHTML') on each element in the loop.
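Both options can be sketched as small helpers; the function names are hypothetical, and browser is assumed to be a Selenium WebDriver. One detail about index notation: in XPath, //x/a[2] selects every a that is the second a child of its own parent, whereas (//x/a)[2] selects the second node of the whole result, which is usually what is meant when the links sit in separate parent elements.

```python
def nth_match_xpath(xpath, n):
    """Wrap an XPath so that [n] indexes the whole result set (1-based)."""
    if n < 1:
        raise ValueError("XPath positions start at 1")
    return f"({xpath})[{n}]"


def collect_inner_html(browser, xpath):
    """Return innerHTML for every element matching `xpath`, in the order found."""
    from selenium.webdriver.common.by import By

    return [
        element.get_attribute("innerHTML")
        for element in browser.find_elements(By.XPATH, xpath)
    ]
```

For instance, collect_inner_html(browser, '//*[@id="authorResolveLinks"]/li/div/a') would gather all author links in one call once the tooltips have been opened.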