Thanks in advance. I'm new to Python and trying to automate value entry into a website via Selenium and the respective XPath values.
How it is supposed to work: I send the keys of a dynamic 'ID' into the input box, the ID pops up, and I select it. This works. The problem is that when the ID does not exist, the tool stalls out. I know I need an if branch for when the ID exists and an else for when it does not, but I am lost when it comes to writing these kinds of conditions with XPaths and need a little guidance.
I have the XPath of the pop-up element that appears when the ID does not exist:
<li class="vv_list_no_items vv_item_indent listNoItems">No results match "1234567890123456"</li>
The confusing part is that the IDs are dynamic: "1234567890123456" can be any ID.
My current code is below; apologies for the indentation, as this was pulled out of a larger set of scripts.
try:
    wait = WebDriverWait(browser, 10)
    # Inputs Legal Entity
    elem = wait.until(EC.element_to_be_clickable((By.XPATH,
        "//*[@id='di3Form']/div[2]/div[2]/div/div[1]/div[3]/div/div[2]/div/div[1]/input"))).send_keys(LE)
    elem = wait.until(
        EC.element_to_be_clickable((By.XPATH, "//*[@id='veevaBasePage']/ul[3]/li/a"))).click()
    LE = None
    # Inputs WWID
    elem = wait.until(EC.element_to_be_clickable((By.XPATH,
        "//*[@id='di3Form']/div[2]/div[2]/div/div[1]/div[4]/div/div[2]/div/div[1]/input"))).send_keys(ID)
    elem = wait.until(
        EC.element_to_be_clickable((By.XPATH, "//*[@id='veevaBasePage']/ul[4]/li[2]/a/em"))).click()
    # Inputs Country
    elem = wait.until(EC.element_to_be_clickable((By.XPATH,
        "//*[@id='di3Form']/div[2]/div[2]/div/div[1]/div[5]/div/div[2]/div/div[1]/input"))).send_keys(Country)
    elem = wait.until(
        EC.element_to_be_clickable((By.XPATH, "//*[@id='veevaBasePage']/ul[5]/li/a"))).click()
    # Save
    elem = wait.until(EC.element_to_be_clickable((By.XPATH,
        "//a[@class='docInfoSaveButton save vv_button vv_primary']/span[@class='vv_button_text vv_ellipsis' and text()='Save']")))
    browser.execute_script("arguments[0].click();", elem)
    wait = WebDriverWait(browser, 15)
    # Click dropdown menu arrow
    elem = wait.until(EC.element_to_be_clickable(
        (By.XPATH, "//*[@id='di3Header']/div[3]/div/div[2]/div[1]/div/div[2]/div/div/button")))
    browser.execute_script("arguments[0].click();", elem)
    wait = WebDriverWait(browser, 100)
    # Click "Publish"
    elem = wait.until(EC.element_to_be_clickable((By.XPATH, "/html/body/div[6]/div/ul/li")))
    browser.execute_script("arguments[0].click();", elem)
    # Confirm Publish
    elem = wait.until(EC.element_to_be_clickable((By.XPATH,
        "//a[@class='save vv_button vv_primary']/span[@class='vv_button_text' and text()='Yes']")))
    browser.execute_script("arguments[0].click();", elem)
You can use an XPath contains() check with find_elements, which returns a list; if its length is > 0, the "No results match" string is present in the UI.
try:
    no_match = 'No results match "{}"'.format(WWID)
    if len(driver.find_elements(By.XPATH, "//li[contains(text(),'{}')]".format(no_match))) > 0:
        print('Found the "No results match" string')
        # do whatever you want to do here
    else:
        print('There is no locator that contains "No results match"')
except:
    print("Something went wrong")
    pass
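To tie this back to the flow in the question: if the no-match item is present, clear the input and skip the record; otherwise click the suggestion as before. A rough sketch, untested against the actual Veeva page, reusing the browser variable and the WWID input/suggestion XPaths from the question:

wait = WebDriverWait(browser, 10)
if len(browser.find_elements(By.XPATH, "//li[contains(text(), 'No results match')]")) > 0:
    # ID not found: clear the WWID input and move on to the next record
    browser.find_element(By.XPATH,
        "//*[@id='di3Form']/div[2]/div[2]/div/div[1]/div[4]/div/div[2]/div/div[1]/input").clear()
else:
    # ID found: select the suggestion as before
    wait.until(EC.element_to_be_clickable(
        (By.XPATH, "//*[@id='veevaBasePage']/ul[4]/li[2]/a/em"))).click()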
Related
I am trying to scrape information from an automobile blog, but I can't loop through the div tag containing the paragraph tags that hold the info.
driver.get("https://www.autocar.co.uk/car-news")
driver.maximize_window()
for i in range(3):
i+=1
info = WebDriverWait(driver, 10).until(
EC.presence_of_element_located((By.XPATH, f'//*[#id="page"]/div[2]/div[1]/div[1]/div[2]/div/div[1]/div/div[1]/div[1]/div[{i}]/div')))
heading = info.find_element_by_tag_name('h2')
clickable = heading.find_element_by_tag_name('a')
driver.execute_script("arguments[0].click();", clickable)
# the code starts to fail around here
try:
body_info = WebDriverWait(driver, 10).until(
EC.presence_of_element_located((By.CLASS_NAME, 'field-item even')))
main_text = []
for j in range(3):
j+=1
text = body_info.find_element_by_tag_name('p')
main_text.append(text)
for t in main_text:
t_info = t.text
print(f'{heading.text}\n{t_info}')
except:
print("couldn't find tag")
driver.back()
There's an issue with your code: (By.CLASS_NAME, 'field-item even').
Selenium does not support multiple classes, or class names containing a space, with By.CLASS_NAME.
Simply replace the space with a . and use the result as a CSS_SELECTOR.
Try something like this:
try:
    body_info = WebDriverWait(driver, 10).until(
        EC.presence_of_element_located((By.CSS_SELECTOR, '.field-item.even')))
It must be '.field-item.even' and not 'field-item even' if you are using By.CSS_SELECTOR for presence_of_element_located().
So replace
body_info = WebDriverWait(driver, 10).until(
    EC.presence_of_element_located((By.CLASS_NAME, 'field-item even')))
with
body_info = WebDriverWait(driver, 10).until(
    EC.presence_of_element_located((By.CSS_SELECTOR, '.field-item.even')))
Official docs: https://selenium-python.readthedocs.io/api.html#locate-elements-by
To select by multiple classes, you must use a CSS class selector: put a dot before each class and leave no space between them. You can't separate the classes with a space, because in CSS a space means a descendant selector.
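A minimal sketch of the difference, using the same 'field-item even' element from the question:

from selenium.webdriver.common.by import By

# By.CLASS_NAME accepts exactly one class name, no spaces:
driver.find_element(By.CLASS_NAME, 'field-item')           # works
# driver.find_element(By.CLASS_NAME, 'field-item even')    # will not find the element (may raise an invalid-selector error)

# By.CSS_SELECTOR can require both classes at once: a dot before each, no space:
driver.find_element(By.CSS_SELECTOR, '.field-item.even')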
I've looked all through Stack Overflow to try and find the answer to this but couldn't. The problem is that my code clicks the first element and gets the 'href' I want, but stops right after that and throws an error at box[x].click():
selenium.common.exceptions.StaleElementReferenceException: Message: stale element reference: element is not attached to the page document
Here's the code:
box = driver.find_elements_by_class_name("info-section.info-primary")
x = 0
#for x in range(0, len(box)):
while True:
    while x <= len(box):
        #if box[x].is_displayed():
        driver.implicitly_wait(2)
        # error is happening here
        box[x].click()
        x += 1
        try:
            website = WebDriverWait(driver, 10).until(
                EC.presence_of_element_located((By.CLASS_NAME, "primary-btn.website-link"))
            )
            print(website.get_attribute('href'))
            driver.back()
        except:
            driver.back()
    if not driver.find_element_by_class_name('ajax-page'):
        break
    else:
        driver.find_element_by_class_name('ajax-page').click()
You are getting the StaleElementReferenceException because you define box, navigate to another page, and then try to use the box variable again. The quickest fix is to re-locate the element on each pass through the loop instead of reusing the stored list:
box = driver.find_elements_by_class_name("info-section.info-primary")
x = 0
while True:
    while x < len(box):  # x < len(box), not <=, or the last index raises an IndexError
        driver.implicitly_wait(2)
        # re-locate the elements each iteration instead of reusing the stale list
        driver.find_elements_by_class_name("info-section.info-primary")[x].click()
        x += 1
        try:
            website = WebDriverWait(driver, 10).until(
                EC.presence_of_element_located((By.CLASS_NAME, "primary-btn.website-link"))
            )
            print(website.get_attribute('href'))
            driver.back()
        except:
            driver.back()
    if not driver.find_element_by_class_name('ajax-page'):
        break
    else:
        driver.find_element_by_class_name('ajax-page').click()
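If the listings render slowly after driver.back(), indexing right after re-locating can still race the DOM. A more defensive sketch, assuming the same class name, that waits for the boxes to be present again before clicking:

from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.support.ui import WebDriverWait

def click_box(driver, index, timeout=10):
    # wait until the listing boxes are back in the DOM, then click the index-th one
    boxes = WebDriverWait(driver, timeout).until(
        EC.presence_of_all_elements_located((By.CLASS_NAME, "info-section.info-primary")))
    boxes[index].click()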
This is the link:
https://www.unibet.eu/betting/sports/filter/football/matches
Using the Selenium driver, I access this link. The actual task is to click on each of the match links. I found all the matches with
elems = driver.find_elements_by_class_name('eb700')
When I did this:
for elem in elems:
    elem.click()
    time.sleep(2)
    driver.execute_script("window.history.go(-1)")
    time.sleep(2)
the first time through it clicked, loaded the new page, went back to the previous page, and then gave the following error:
StaleElementReferenceException: Message: stale element reference: element is not attached to the page document
I also tried getting the href attribute from elem, but it gave None. Is it possible to open the page in a new tab instead of clicking the elem?
You can retry the click on the element, since the original reference is no longer present in the DOM.
Code:
driver = webdriver.Chrome("C:\\Users\\**\\Inc\\Desktop\\Selenium+Python\\chromedriver.exe")
driver.maximize_window()
wait = WebDriverWait(driver, 30)
driver.get("https://www.unibet.eu/betting/sports/filter/football/matches")
wait.until(EC.element_to_be_clickable((By.PARTIAL_LINK_TEXT, "OK"))).click()
sleep(2)
elements = driver.find_elements(By.XPATH, "//div[contains(@class,'_')]/div[@data-test-name='accordionLevel1']")
element_len = len(elements)
print(element_len)
counter = 0
while counter < element_len:
    attempts = 0
    while attempts < 2:
        try:
            ActionChains(driver).move_to_element(elements[counter]).click().perform()
        except:
            pass
        attempts = attempts + 1
    sleep(2)
    # driver.execute_script("window.history.go(-1)")  # maybe get the team name
    # using the //div[@data-test-name='teamName'] xpath
    sleep(2)
    # driver.refresh()
    sleep(2)
    counter = counter + 1
Since you move to the next page, the elements no longer exist in the DOM, so you get a StaleElementReferenceException. What you can do is re-fetch all the links (elems) when coming back to the page, and use a while loop instead of a for loop:
elems = driver.find_elements_by_class_name('eb700')
i = 0
while i < len(elems):
    elems[i].click()
    time.sleep(2)
    driver.execute_script("window.history.go(-1)")
    time.sleep(2)
    # re-fetch the links after navigating back, so the references are fresh
    elems = driver.find_elements_by_class_name('eb700')
    i += 1
The other solution is to remain on the same page, save all the href attributes in a list, and then open each match link with driver.get:
matchLinks = []
elems = driver.find_elements_by_class_name('eb700')
for elem in elems:
    matchLinks.append(elem.get_attribute('href'))
for match in matchLinks:
    driver.get(match)
    # do whatever you want to do on the match page
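As for the "new tab" part of the question: you can open each saved href in a fresh tab via JavaScript and switch to it, which leaves the match list page untouched. A sketch, assuming the collected hrefs are absolute URLs:

for match in matchLinks:
    # open the link in a new tab and move Selenium's focus to it
    driver.execute_script("window.open(arguments[0]);", match)
    driver.switch_to.window(driver.window_handles[-1])
    # ... scrape the match page here ...
    driver.close()                                      # close the match tab
    driver.switch_to.window(driver.window_handles[0])   # back to the list tab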
I want to click through to the next page until there are no more pages, but it does not click. It returns the error:
raise exception_class(message, screen, stacktrace)
StaleElementReferenceException: stale element reference: element is not attached to the page document
My code is below. Thanks in advance!
driver.get('http://www.chinamoney.com.cn/chinese/zjfxzx/?tbnm=%E6%9C%80%E6%96%B0&tc=null&isNewTab=1')
driver.implicitly_wait(10)
driver.refresh()
driver.implicitly_wait(10)
wait = WebDriverWait(driver, 5)
# start date
datefield_st = wait.until(EC.element_to_be_clickable((By.ID, "pdbp-date-1")))
datefield_st.click()
select_st = Select(wait.until(EC.element_to_be_clickable((By.CLASS_NAME, "ui-datepicker-year"))))
select_st.select_by_visible_text("2021")
select2 = Select(wait.until(EC.element_to_be_clickable((By.CLASS_NAME, "ui-datepicker-month"))))
select2.select_by_value("1")
day = 1
wait.until(EC.element_to_be_clickable((By.XPATH, "//td[@data-handler='selectDay']/a[text()='{}']".format(str(day))))).click()
# end date
datefield_ed = wait.until(EC.element_to_be_clickable((By.ID, "pdbp-date-2")))
datefield_ed.click()
select_ed = Select(wait.until(EC.element_to_be_clickable((By.CLASS_NAME, "ui-datepicker-year"))))
select_ed.select_by_visible_text("2021")
select2 = Select(wait.until(EC.element_to_be_clickable((By.CLASS_NAME, "ui-datepicker-month"))))
select2.select_by_value("1")
day = 1
wait.until(EC.element_to_be_clickable((By.XPATH, "//td[@data-handler='selectDay']/a[text()='{}']".format(str(day))))).click()
driver.find_element_by_link_text("查询").click()  # 查询 = "Search"
while True:
    driver.implicitly_wait(10)
    # 同业存单 = interbank certificates of deposit; 申购说明 = subscription notes; 公告 = announcements
    links = [link.get_attribute('href') for link in driver.find_elements_by_xpath("//a[contains(@title,'同业存单') and not(contains(@title,'申购说明')) and not(contains(@title,'公告'))]")]
    # 中期票据 = medium-term notes
    titles = [title.text for title in driver.find_elements_by_xpath("//a[contains(@title,'中期票据') and not(contains(@title,'申购说明')) and not(contains(@title,'公告'))]")]
    dates = [date.text for date in driver.find_elements_by_xpath('//*[@class="san-grid-r text-date"]')]
    driver.implicitly_wait(10)
    for link, title, date in zip(links, titles, dates):
        dataframe = pd.DataFrame({'col1': date, 'col2': title, 'col3': link}, index=[0])
        dataframe.to_csv('Chinamoney.csv', mode='a+', header=False, index=False, encoding='utf-8-sig')
        print(link, title, date)
    try:
        driver.find_element_by_xpath('//*[contains(@class, "page-next")]').click()
    except:
        print('No more pages')
You passed two class names into the selector, which is not allowed when searching by class name. Either try
(By.CLASS_NAME, 'page-next')
or
(By.CSS_SELECTOR, '.page-btn.page-next')
Also, your element and icon locators select the same element, so you don't need to define icon; simply use element.click().
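Note also that the loop's except prints 'No more pages' but never exits, so it loops forever. A sketch of how the tail of the pagination loop could look with that selector and a break (TimeoutException is what the wait raises when it gives up):

from selenium.common.exceptions import TimeoutException

while True:
    # ... scrape the current page ...
    try:
        next_btn = WebDriverWait(driver, 5).until(
            EC.element_to_be_clickable((By.CSS_SELECTOR, '.page-btn.page-next')))
        next_btn.click()
    except TimeoutException:
        print('No more pages')
        break  # leave the loop once the next-page button is gone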
You are using:
driver.find_element_by_xpath('//*[contains(@class, "page-next")]').click()
Try:
element = driver.find_element_by_xpath('//*[contains(@class, "page-next")]')
driver.execute_script("arguments[0].click();", element)
If this doesn't work, you can obtain the URL of the link, store it, and then navigate to it directly instead of clicking it.
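A minimal sketch of that fallback, assuming the next-page control is an <a> with a real href (on some sites it is a JavaScript-driven button, in which case this won't apply):

element = driver.find_element_by_xpath('//*[contains(@class, "page-next")]')
next_url = element.get_attribute('href')  # may be None for JS-driven buttons
if next_url:
    driver.get(next_url)  # navigate directly instead of clicking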
I am trying to scrape all job postings for the last 24 hours from Glassdoor and save them to a dictionary.
binary = FirefoxBinary('path_to_firefox_binary.exe')
cap = DesiredCapabilities().FIREFOX
cap["marionette"] = True
driver = webdriver.Firefox(firefox_binary=binary, capabilities=cap, executable_path=GeckoDriverManager().install())
base_url = 'https://www.glassdoor.com/Job/jobs.htm?suggestCount=0&suggestChosen=false&clickSource=searchBtn' \
           '&typedKeyword=data+sc&sc.keyword=data+scientist&locT=C&locId=1154532&jobType= '
driver.get(url=base_url)
driver.implicitly_wait(20)
driver.maximize_window()
WebDriverWait(driver, 20).until(
    EC.element_to_be_clickable((By.CSS_SELECTOR, "div#filter_fromAge>span"))).click()
WebDriverWait(driver, 20).until(EC.element_to_be_clickable((
    By.XPATH, "//div[@id='PrimaryDropdown']/ul//li//span[@class='label' and contains(., 'Last Day')]"))).click()
# find job listing elements on web page
listings = driver.find_elements_by_class_name("jl")
n_listings = len(listings)
results = {}
for index in range(n_listings):
    driver.find_elements_by_class_name("jl")[index].click()  # runs into error
    print("clicked listing {}".format(index + 1))
    info = driver.find_element_by_class_name("empInfo.newDetails")
    emp = info.find_element_by_class_name("employerName")
    results[index] = {'title': title, 'company': emp_name, 'description': description}
I keep running into the error message
selenium.common.exceptions.StaleElementReferenceException: Message: The element reference of is stale; either the element is no longer attached to the DOM, it is not in the current frame context, or the document has been refreshed
on the first line inside my for loop. Even when the loop runs for a number of iterations, it eventually raises this exception. I am new to Selenium and web scraping and would appreciate any help.
Every time a new post is selected, the clicked element is modified and the DOM is refreshed. The change is slow compared to the actions in your loop, so instead of a fixed sleep you should wait for the change to occur. Specifically, every time you select a posting, a new class, selected, is added to it and its style attribute loses its content. Wait for that to happen, get the information, then click the next post:
wait = WebDriverWait(driver, 20)
for index in range(n_listings - 1):
    wait.until(EC.visibility_of_element_located((By.CSS_SELECTOR, '.selected:not([style="border-bottom:0"])')))
    print("clicked listing {}".format(index + 1))
    info = driver.find_element_by_class_name('empInfo.newDetails')
    emp = info.find_element_by_class_name('employerName')
    if index < n_listings - 1:
        driver.find_element_by_css_selector('.selected + .jl').click()
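The '.selected + .jl' selector uses the CSS adjacent-sibling combinator: it matches the .jl listing that immediately follows the currently selected one, so each click advances exactly one card without indexing into a possibly stale list.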
This error means the element reference you are trying to click is no longer valid: the element was found earlier, but the DOM has changed since it was located. Re-locate the target element before calling click(), or wrap the click in a try/except block:
# ...
results = {}
for index in range(n_listings):
    try:
        driver.find_elements_by_class_name("jl")[index].click()  # re-locate on every iteration
    except:
        print('Listing not found, retrying in 1 second ...')
        time.sleep(1)
        continue
    print("clicked listing {}".format(index + 1))
    info = driver.find_element_by_class_name("empInfo.newDetails")
    emp = info.find_element_by_class_name("employerName")
# ...
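One refinement worth noting: a bare except also swallows unrelated failures. This loop can realistically raise a StaleElementReferenceException (the list re-rendered) or an IndexError (fewer listings than before), so catching just those is safer. A sketch:

from selenium.common.exceptions import StaleElementReferenceException
import time

for index in range(n_listings):
    try:
        driver.find_elements_by_class_name("jl")[index].click()
    except (StaleElementReferenceException, IndexError):
        print('Listing not found, retrying in 1 second ...')
        time.sleep(1)
        continue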