Why is it that when I add time.sleep(2) I get my desired output, but if I instead wait for a specific XPath I get fewer results?
Output with time.sleep(2) (also desired):
Adelaide Utd
Tottenham
Dundee Fc
...
Count: 145 names
Output with time.sleep removed:
Adelaide Utd
Tottenham
Dundee Fc
...
Count: 119 names
I have added:
clickMe = wait(driver, 13).until(EC.element_to_be_clickable((By.CSS_SELECTOR, ("#page-container > div:nth-child(4) > div > div.ubet-sports-section-page > div > div:nth-child(2) > div > div > div:nth-child(1) > div > div > div.page-title-new > h1"))))
As this element is present on all pages.
The count is significantly lower. How can I get around this issue?
Script:
import csv
from selenium import webdriver
from selenium.common.exceptions import (NoSuchElementException,
                                        StaleElementReferenceException,
                                        TimeoutException)
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.support.ui import WebDriverWait as wait

driver = webdriver.Chrome()
driver.set_window_size(1024, 600)
driver.maximize_window()
driver.get('https://ubet.com/sports/soccer')

clickMe = wait(driver, 10).until(EC.element_to_be_clickable(
    (By.XPATH, '//select[./option="Soccer"]/option')))
options = driver.find_elements_by_xpath('//select[./option="Soccer"]/option')

for index in range(len(options)):
    try:
        try:
            zz = wait(driver, 10).until(EC.element_to_be_clickable(
                (By.XPATH, '(//select/optgroup/option)[%s]' % str(index + 1))))
            zz.click()
        except StaleElementReferenceException:
            pass

        clickMe = wait(driver, 10).until(EC.element_to_be_clickable((By.CSS_SELECTOR,
            "#page-container > div:nth-child(4) > div > div.ubet-sports-section-page > div > div:nth-child(2) > div > div > div:nth-child(1) > div > div > div.page-title-new > h1")))
        langs0 = driver.find_elements_by_css_selector(
            "div > div > div > div > div > div > div > div > div.row.collapse > div > div > div:nth-child(2) > div > div > div > div > div > div.row.small-collapse.medium-collapse > div:nth-child(1) > div > div > div > div.lbl-offer > span")
        langs0_text = []
        for lang in langs0:
            try:
                langs0_text.append(lang.text)
            except StaleElementReferenceException:
                pass

        directory = 'C:\\A.csv'
        with open(directory, 'a', newline='', encoding="utf-8") as outfile:
            writer = csv.writer(outfile)
            for row in zip(langs0_text):
                writer.writerow(row)
    except StaleElementReferenceException:
        pass
If you cannot access the page, you may need a VPN.
Updating...
Perhaps that element loads before the others, so I changed the wait to target the data being scraped (though not all pages have data to scrape).
Add:
try:
    clickMe = wait(driver, 13).until(EC.element_to_be_clickable((By.CSS_SELECTOR,
        "div > div > div > div > div > div > div > div > div.row.collapse > div > div > div:nth-child(2) > div > div > div > div > div > div.row.small-collapse.medium-collapse > div:nth-child(3) > div > div > div > div.lbl-offer > span")))
except TimeoutException as ex:
    pass
The same issue is still present.
Manual steps:
# Load driver.get('https://ubet.com/sports/soccer')
# Click a drop-down option (//select/optgroup/option)
# Wait for the page elements so they can be scraped
Scrape:
div > div > div > div > div > div > div > div > div.row.collapse > div > div > div:nth-child(2) > div > div > div > div > div > div.row.small-collapse.medium-collapse > div:nth-child(1) > div > div > div > div.lbl-offer > span
Repeat the loop.
The website is built on angularjs, so your best bet would be to wait until angular has finished processing of all AJAX requests (I won't go into the underlying mechanics, but there are plenty of materials on that topic throughout the web). For this, I usually define a custom expected condition to check while waiting:
class NgReady:
    js = ('return (window.angular !== undefined) && '
          '(angular.element(document).injector() !== undefined) && '
          '(angular.element(document).injector().get("$http").pendingRequests.length === 0)')

    def __call__(self, driver):
        return driver.execute_script(self.js)

# NgReady does not have any internal state, so one instance
# can be reused for waiting multiple times
ng_ready = NgReady()
Now use it to wait after zz.click():
zz.click()
wait(driver, 10).until(ng_ready)
Tests
Your original code, unmodified (without sleeping or waiting with ng_ready):
$ python so-47954604.py && wc -l out.csv && rm out.csv
86 out.csv
Using time.sleep(10) after zz.click():
$ python so-47954604.py && wc -l out.csv && rm out.csv
101 out.csv
Same result when using wait(driver, 10).until(ng_ready) after zz.click():
$ python so-47954604.py && wc -l out.csv && rm out.csv
101 out.csv
Credits
NgReady is not my invention; I just ported it to Python from the expected condition implemented in Java that I found here, so all credit goes to the author of that answer.
@hoefling's idea is absolutely correct, but here is an addition to the "wait for Angular" part.
The logic inside NgReady only checks that angular is defined and that no pending requests are left to be processed. Even though it works for this website, it is not a definitive check that Angular is ready to work with.
If we look at what Protractor - the Angular end-to-end testing framework - does to "sync" with Angular, it uses the "Testability" API built into Angular.
There is also the pytractor package, which extends selenium webdriver instances with a WebDriverMixin that keeps the driver and Angular in sync automatically on every interaction.
You could use pytractor directly (though it is abandoned as a package), or we can apply the ideas implemented there to always keep our webdriver synced with Angular. For that, let's create this waitForAngular.js script (we'll use only the Angular 1 and 2 support logic; we can always extend it with the relevant Protractor client-side scripts):
try {
  return (function (rootSelector, callback) {
    var el = document.querySelector(rootSelector);
    try {
      if (!window.angular) {
        throw new Error('angular could not be found on the window');
      }
      if (angular.getTestability) {
        angular.getTestability(el).whenStable(callback);
      } else {
        if (!angular.element(el).injector()) {
          throw new Error('root element (' + rootSelector + ') has no injector.' +
            ' this may mean it is not inside ng-app.');
        }
        angular.element(el).injector().get('$browser').
          notifyWhenNoOutstandingRequests(callback);
      }
    } catch (err) {
      callback(err.message);
    }
  }).apply(this, arguments);
} catch (e) {
  throw (e instanceof Error) ? e : new Error(e);
}
Then, let's inherit from webdriver.Chrome and patch the execute() method - so that every time there is an interaction, we additionally check if Angular is ready before the interaction:
import csv
from selenium import webdriver
from selenium.webdriver.remote.command import Command
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support.ui import WebDriverWait as wait
from selenium.webdriver.common.by import By
from selenium.common.exceptions import StaleElementReferenceException
from selenium.webdriver.support import expected_conditions as EC

COMMANDS_NEEDING_WAIT = [
    Command.CLICK_ELEMENT,
    Command.SEND_KEYS_TO_ELEMENT,
    Command.GET_ELEMENT_TAG_NAME,
    Command.GET_ELEMENT_VALUE_OF_CSS_PROPERTY,
    Command.GET_ELEMENT_ATTRIBUTE,
    Command.GET_ELEMENT_TEXT,
    Command.GET_ELEMENT_SIZE,
    Command.GET_ELEMENT_LOCATION,
    Command.IS_ELEMENT_ENABLED,
    Command.IS_ELEMENT_SELECTED,
    Command.IS_ELEMENT_DISPLAYED,
    Command.SUBMIT_ELEMENT,
    Command.CLEAR_ELEMENT,
]

class ChromeWithAngular(webdriver.Chrome):
    def __init__(self, root_element, *args, **kwargs):
        self.root_element = root_element
        with open("waitForAngular.js") as f:
            self.script = f.read()
        super(ChromeWithAngular, self).__init__(*args, **kwargs)

    def wait_for_angular(self):
        self.execute_async_script(self.script, self.root_element)

    def execute(self, driver_command, params=None):
        if driver_command in COMMANDS_NEEDING_WAIT:
            self.wait_for_angular()
        return super(ChromeWithAngular, self).execute(driver_command, params=params)

driver = ChromeWithAngular(root_element='body')
# the rest of the code as is with what you had
Again, this is heavily inspired by the pytractor and protractor projects.
Related
I'm using selenium to automatically go to various webpages and download an XML file.
This has been working fine but it has suddenly stopped working, and it's due to selenium not being able to find the element.
I've tried using SelectorGadget with CSS and XPath, directly copying from the inspect panel, and using Selenium's wait function until the element fully shows, and I'm getting no luck.
This is the html of where the download button is
<div class="video-playlist-xml" data-reactid=".2.0"><i class="icon-download-xml-green" data-toggle="tooltip" data-placement="top" title="" data-original-title="Download xml file of the match" data-reactid=".2.0.0.0"></i></div>
The download button is just above the video, above the scoreline.
for x in range(number_of_clicks):
    driver = webdriver.Chrome(executable_path=r'C:\Users\James\OneDrive\Desktop\webdriver\chromedriver.exe', options=options)
    driver.get('https://football.instatscout.com/teams/978/matches')
    time.sleep(10)
    print("Page Title is : %s" % driver.title)
    WebDriverWait(driver, 20).until(EC.element_to_be_clickable((By.CSS_SELECTOR, "#team-table1 > div.table-scroll-inner > div.team-stats-wrapper.team-stats-wrapper_no-vertical-scroll > table > tbody > tr:nth-child(" + str(x + 1) + ") > td:nth-child(1) > div > div.styled__MatchPlay-sc-10ytjn2-1.hkIvhi > i"))).click()
    WebDriverWait(driver, 20).until(EC.element_to_be_clickable((By.CSS_SELECTOR, "#root > div > article > section.player-details > div > div.OutsideClickWrapper-sc-ktqo9u.cTxKts > div > a:nth-child(1) > span > span"))).click()
    WebDriverWait(driver, 20).until(EC.element_to_be_clickable((By.CSS_SELECTOR, "div.video-playlist-xml > a[href] > i.icon-download-xml-green[data-original-title='Download xml file of the match']"))).click()
    chks = driver.find_elements_by_css_selector("#players > div.control-block > div.control-block__container.control-block__container--large > button")
    for chk in chks:
        chk.click()
        time.sleep(15)
    driver.quit()
error Traceback
---------------------------------------------------------------------------
TimeoutException Traceback (most recent call last)
~\AppData\Local\Temp/ipykernel_26436/2420202332.py in <module>
6 WebDriverWait(driver,20).until(EC.element_to_be_clickable((By.CSS_SELECTOR,"#team-table1 > div.table-scroll-inner > div.team-stats-wrapper.team-stats-wrapper_no-vertical-scroll > table > tbody > tr:nth-child(" +str(x+1) + ") > td:nth-child(1) > div > div.styled__MatchPlay-sc-10ytjn2-1.hkIvhi > i"))).click()
7 WebDriverWait(driver,20).until(EC.element_to_be_clickable((By.CSS_SELECTOR,"#root > div > article > section.player-details > div > div.OutsideClickWrapper-sc-ktqo9u.cTxKts > div > a:nth-child(1) > span > span"))).click()
----> 8 WebDriverWait(driver, 20).until(EC.element_to_be_clickable((By.CSS_SELECTOR, "div.video-playlist-xml > a[href] > i.icon-download-xml-green[data-original-title='Download xml file of the match']"))).click()
9 #WebDriverWait(driver,20).until(EC.element_to_be_clickable((By.XPATH,'//*[contains(concat( " ", @class, " " ), concat( " ", "video-playlist-xml", " " ))]//a | //*[contains(concat( " ", @class, " " ), concat( " ", "icon-download-xml-green", " " ))]'))).click()
10 chks = driver.find_elements_by_css_selector("#players > div.control-block > div.control-block__container.control-block__container--large > button")
~\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.9_qbz5n2kfra8p0\LocalCache\local-packages\Python39\site-packages\selenium\webdriver\support\wait.py in until(self, method, message)
78 if time.time() > end_time:
79 break
---> 80 raise TimeoutException(message, screen, stacktrace)
81
82 def until_not(self, method, message=''):
TimeoutException: Message:
To locate a clickable element, instead of presence_of_element_located() you need to induce WebDriverWait for element_to_be_clickable(), and you can use either of the following locator strategies:
Using CSS_SELECTOR:
WebDriverWait(driver, 20).until(EC.element_to_be_clickable((By.CSS_SELECTOR, "div.video-playlist-xml > a[href] > i.icon-download-xml-green[data-original-title='Download xml file of the match']"))).click()
Using XPATH:
WebDriverWait(driver, 20).until(EC.element_to_be_clickable((By.XPATH, "//div[@class='video-playlist-xml']/a[@href]/i[@class='icon-download-xml-green' and @data-original-title='Download xml file of the match']"))).click()
Note: you have to add the following imports:
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
Stupidly, I've just realised that clicking one of the elements was opening a new window. Issue solved. Thanks, guys.
I need help selecting an element on a webpage with Selenium. I have been using Selenium on this website for about 3 weeks and so far, I can usually find an element by css selector or XPath. However, this specific section of the website is giving me a very hard time.
After I click on "reset office 365 password", a window comes up and I want to programmatically put in the new password, but I can't find anything in the popup window.
Here is what the page looks like:
(I am too low of score to post pictures here) https://cdn.discordapp.com/attachments/768594779344470022/845811910577881098/unknown.png
Here is the whole element’s information:
<input type="password" tabindex="1" name="password" class="m-third pass ng-pristine ng-empty ng-invalid ng-invalid-required ng-touched" ng-model="password.value" ng-blur="password.check = false" ng-focus="password.check = true" required="" autofocus="" ng-disabled="!active">
Here is what I tried: (I tried a lot of things)
Tried clicking on the password box by using css selector – failed: Invalid selector
im_blacklistaddbutton = browser_options.browser.find_element_by_css_selector('#ng-app > div.page-container > div > div > div.vertical-tabs.j-vertical-tabs.ng-scope > div.vertical-tabs-panes.p0 > div > div > div.page-content.ng-scope > div > div > form > div > div > div.ng-isolate-scope > div.modal > div.modal-body.ng-transclude > div > reset:password > ng-form > div:nth-child(1) > div > div.validation-input > input')
im_blacklistaddbutton.send_keys(email_pd.pd)
selenium.common.exceptions.InvalidSelectorException: Message: invalid selector: An invalid or illegal selector was specified
Tried clicking on the password box by using an XPath selector – failed: NamespaceError
im_blacklistaddbutton = browser_options.browser.find_element_by_xpath('//*[@id="ng-app"]/div[2]/div/div/div[3]/div[2]/div/div/div[2]/div/div/form/div/div/div[3]/div[1]/div[2]/div/reset:password/ng-form/div[1]/div/div[1]/input')
im_blacklistaddbutton.send_keys(email_pd.pd)
NamespaceError: Failed to execute 'evaluate' on 'Document': The string '//*[@id="ng-app"]/div[2]/div/div/div[3]/div[2]/div/div/div[2]/div/div/form/div/div/div[3]/div[1]/div[2]/div/reset:password/ng-form/div[1]/div/div[1]/input' contains unresolvable namespaces.
Tried waiting for the element by partial link text: It timed out
wait.until(EC.visibility_of_element_located((By.PARTIAL_LINK_TEXT, 'Generate password')))
selenium.common.exceptions.TimeoutException: Message:
Tried waiting for the element by ID name text value: It timed out
wait.until(EC.text_to_be_present_in_element((By.CLASS_NAME, 'btn m-link'), "Generate Password"))
selenium.common.exceptions.TimeoutException: Message:
Tried to switch to a window or iframe but it said that the div class of "model" is not a window or an iframe.
From here I am completely lost as to why this stupid window is not accessible. Text window - why are you so mean to me?
Here is my specific function in total:
def reset_im_oa_password():
    browser_options.browser.get('https://cpx.intermedia.net/ControlPanel/Menu/AccountMenu/?frameUrl=https://cpx.intermedia.net/aspx/Office365/Home/licenses#/installed/users')
    wait = WebDriverWait(browser_options.browser, 10)
    try:
        wait.until(EC.element_to_be_clickable((By.XPATH, 'player')))
    except exceptions.TimeoutException as e:
        pass
    browser_options.browser.switch_to_frame('mainFrame')
    wait.until(EC.element_to_be_clickable((By.CSS_SELECTOR, '#ng-app > div.page-container > div > div > div.vertical-tabs.j-vertical-tabs.ng-scope > div.vertical-tabs-panes.p0 > div > div > div.page-content.ng-scope > div > div > form > div > div > div:nth-child(2) > div.table-wrap.table-fixed.j-table-wrap.s-wide.ng-isolate-scope > div.table-filter > div.table-filter-search.searchbox.ng-isolate-scope > div > span:nth-child(3) > input')))
    im_blacklistaddbutton = browser_options.browser.find_element_by_css_selector('#ng-app > div.page-container > div > div > div.vertical-tabs.j-vertical-tabs.ng-scope > div.vertical-tabs-panes.p0 > div > div > div.page-content.ng-scope > div > div > form > div > div > div:nth-child(2) > div.table-wrap.table-fixed.j-table-wrap.s-wide.ng-isolate-scope > div.table-filter > div.table-filter-search.searchbox.ng-isolate-scope > div > span:nth-child(3) > input')
    im_blacklistaddbutton.send_keys(email_or_user_selection.email_select)
    im_blacklistaddbutton = browser_options.browser.find_element_by_css_selector('#ng-app > div.page-container > div > div > div.vertical-tabs.j-vertical-tabs.ng-scope > div.vertical-tabs-panes.p0 > div > div > div.page-content.ng-scope > div > div > form > div > div > div:nth-child(2) > div.table-wrap.table-fixed.j-table-wrap.s-wide.ng-isolate-scope > div.table-filter > div.table-filter-search.searchbox.ng-isolate-scope > div > span:nth-child(3) > button')
    im_blacklistaddbutton.send_keys(Keys.ENTER)
    wait.until(EC.element_to_be_clickable((By.XPATH, "//*[starts-with(@id, 'btnResetPassword')]")))
    im_blacklistaddbutton = browser_options.browser.find_element_by_xpath("//*[starts-with(@id, 'btnResetPassword')]")
    im_blacklistaddbutton.send_keys(Keys.ENTER)
    try:
        wait.until(EC.visibility_of_element_located((By.PARTIAL_LINK_TEXT, 'Generate password')))
    except exceptions.TimeoutException as e:
        pass
    browser_options.browser.switch_to_window('model')  # anything past this section will fail
    wait.until(EC.visibility_of_element_located((By.CLASS_NAME, 'model')))
    im_blacklistaddbutton = browser_options.browser.find_element_by_xpath('//*[@id="ng-app"]/div[2]/div/div/div[3]/div[2]/div/div/div[2]/div/div/form/div/div/div[3]/div[1]/div[2]/div/reset:password/ng-form/div[1]/div/div[1]/input')
    im_blacklistaddbutton.send_keys(email_pd.pd)
    return
If anyone needs the full code from the webpage, let me know. Thanks.
If this element is not really inside an iframe, as you write, then wait for it to become clickable, like this:
wait.until(EC.element_to_be_clickable((By.CSS_SELECTOR, "input[type='password']")))
im_blacklistaddbutton = browser.find_element_by_css_selector("input[type='password']")
im_blacklistaddbutton.send_keys("new_password")
But make sure that the CSS selector input[type='password'] is unique.
If not, try this one: .validation-input>input[type='password']
(Check that the validation-input class name is correct, as it is cut off in your screenshot.)
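A quick way to check that a selector is unique is to count the matches before typing into the field. The helper below is a minimal sketch of mine, not part of Selenium's API; it works on any list of matched elements:

```python
# Sketch: refuse to use a locator that matches more or fewer than one element.
def assert_unique(elements, selector):
    """Return the single matched element, or raise if the locator is ambiguous."""
    if len(elements) != 1:
        raise ValueError("selector %r matched %d elements, expected exactly 1"
                         % (selector, len(elements)))
    return elements[0]

# With Selenium (not run here):
# matches = driver.find_elements_by_css_selector("input[type='password']")
# field = assert_unique(matches, "input[type='password']")
# field.send_keys("new_password")
```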
If the input field is inside an iframe, nothing will work until you switch to that iframe.
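A minimal sketch of that switch-interact-switch-back pattern (the frame locator below is hypothetical, since the real page structure isn't shown):

```python
# Run an action inside an iframe, then restore the top document so that
# later lookups are not accidentally scoped to the frame.
def within_frame(driver, frame_element, action):
    """Run action(driver) inside frame_element, restoring the top document after."""
    driver.switch_to.frame(frame_element)
    try:
        return action(driver)
    finally:
        driver.switch_to.default_content()

# With Selenium (not run here):
# frame = driver.find_element_by_css_selector("iframe")
# within_frame(driver, frame, lambda d: d.find_element_by_css_selector(
#     "input[type='password']").send_keys("new_password"))
```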
Without the webpage code I can't say right now why the element is not detectable by Selenium, but you can try one thing. Right-click the element (the input tag in the DOM shown in the picture), go to the "Copy to" option and select "Copy JS Path". Then go to the Console tab in dev tools and paste it. Then try to set its value to some dummy text and see if it sets the password.
jsPath.value="some password" //this should set the password
If this works, then you can set the value using Selenium's JavaScript executor in the same way.
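For example, here is a sketch of driving the same console experiment through execute_script. The helper name and snippet string are mine; the snippet also dispatches an 'input' event, since setting .value directly does not notify Angular's ng-model binding:

```python
# Set an input's value via JS and fire an 'input' event so framework
# bindings (e.g. Angular's ng-model) see the change.
SET_VALUE_JS = ("arguments[0].value = arguments[1];"
                "arguments[0].dispatchEvent(new Event('input', {bubbles: true}));")

def set_value_via_js(driver, element, value):
    driver.execute_script(SET_VALUE_JS, element, value)

# With Selenium (not run here):
# el = driver.find_element_by_css_selector("input[type='password']")
# set_value_via_js(driver, el, "some password")
```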
I tried to follow along with some YouTube tutorials to make my code do what I want, but I still haven't found an answer anywhere on the internet...
Here I tried to make the script using BeautifulSoup:
import bs4
import requests

result = requests.get("https://www.instagram.com/kyliejenner/following/")
src = result.content
soup = bs4.BeautifulSoup(src, "lxml")
links = soup.find_all("a")
print(links)
print("\n")
for link in links:
    if "FPmhX notranslate _0imsa " in link.text:
        print(link)
And here I tried to do the same thing with Selenium, but the problem is that I don't know the next steps to make my code copy the usernames a user is following:
import time
from selenium import webdriver

PATH = r"C:\Program Files (x86)\chromedriver.exe"
driver = webdriver.Chrome(PATH)
driver.get("https://www.instagram.com/")
time.sleep(2)

username = driver.find_element_by_css_selector("#loginForm > div > div:nth-child(1) > div > label > input")
username.send_keys("my_username")
password = driver.find_element_by_css_selector("#loginForm > div > div:nth-child(2) > div > label > input")
password.send_keys("password")
loginButton = driver.find_element_by_css_selector("#loginForm > div > div:nth-child(3)")
loginButton.click()
time.sleep(3)

saveinfoButton = driver.find_element_by_css_selector("#react-root > section > main > div > div > div > section > div > button")
saveinfoButton.click()
time.sleep(3)

notnowButton = driver.find_element_by_css_selector("body > div.RnEpo.Yx5HN > div > div > div > div.mt3GC > button.aOOlW.HoLwm")
notnowButton.click()
I would really appreciate it if someone could solve this problem. Again, all that I want my script to do is to copy the usernames from the "following" section of someones profile.
On the following website: https://www1.hkexnews.hk/search/titlesearch.xhtml?lang=en
I am trying to select the following dropdown list option via selenium:
Under Headline Category and Document Type, on the first dropdown list I select Headline Category; then on the second list I want to select Announcements and Notices -> New Listings (Listed Issuers/New Applicants) -> Allotment Results.
I have realised you have to use driver.find_element_by_css_selector(), as none of the items on the list have unique IDs.
I have also realised you have to scroll the page when the option is not in view so that the CSS selector can pick it up.
What I have should work, but it doesn't. Can someone help me resolve this, please?
```python
# Select dropdown list
driver.find_element_by_css_selector('#rbAfter2006 > div > div > div').click()
# Select Announcements and Notices
driver.find_element_by_css_selector('#rbAfter2006 > div ~ div > div > div > div > ul > li ~ li ').click()
# Scroll down so that New Listings (Listed Issuers/New Applicants) is in view
element = driver.find_element_by_css_selector('#rbAfter2006 > div ~ div > div > div > div > ul > li ~ li > a '
'~ div > div > ul > li ~ li ~ li ~ li ~ li ~ li ~ li ~ li ~ li')
actions = ActionChains(driver)
actions.move_to_element(element).perform()
# Click New Listings (Listed Issuers/New Applicants)
driver.find_element_by_css_selector('#rbAfter2006 > div ~ div > div > div > div > ul > li ~ li > a '
'~ div > div > ul > li ~ li ~ li ~ li ~ li ~ li ~ li ').click()
# THIS IS WHAT FAILS, Can't find element? I am currently printing the box so I know what is selected
print(driver.find_element_by_css_selector('#rbAfter2006 > div ~ div > div > div > div > ul > li ~ li > a '
'~ div > div > ul > li ~ li ~ li ~ li ~ li ~ li ~ li > a ~ div > ul > li ~ li').text)
```
I currently get an element-not-found error.
Can you try getting a list of the dropdown values, probably using a CSS selector like
ul li a
and then loop through it to find a matching value and click it?
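A sketch of that loop-and-match approach, assuming the "ul li a" selector from above. The helper works on any objects exposing .text and .click(), so the matching logic is shown standalone:

```python
# Click the first option whose (stripped) text equals the target label.
def click_matching_option(options, target_text):
    """Return True if a matching option was clicked, False otherwise."""
    for option in options:
        if option.text.strip() == target_text:
            option.click()
            return True
    return False

# With Selenium (not run here):
# options = driver.find_elements_by_css_selector("ul li a")
# click_matching_option(options, "Announcements and Notices")
```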
Ideally we should not select an element with such a complex CSS selector; try to shorten it.
Then try hovering the mouse over the first element in the dropdown:
driver.action.move_to(first_element_in_dropdown).perform
Then try to scroll to the element you need using:
driver.execute_script("arguments[0].scrollIntoView({behavior: 'smooth', block: 'center', inline: 'nearest'});",element_to_be_selected)
Note: this code is in Ruby; try translating it to Python and using it. This may help.
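A possible Python translation of the two Ruby snippets above (a sketch; the element variables are placeholders for elements you have already located):

```python
# Hover over the first dropdown entry, then scroll the target option into view.
SCROLL_JS = ("arguments[0].scrollIntoView("
             "{behavior: 'smooth', block: 'center', inline: 'nearest'});")

def hover_then_scroll(actions, driver, first_element_in_dropdown, element_to_be_selected):
    actions.move_to_element(first_element_in_dropdown).perform()  # hover opens/keeps the dropdown
    driver.execute_script(SCROLL_JS, element_to_be_selected)      # bring the option into view

# With Selenium (not run here):
# from selenium.webdriver.common.action_chains import ActionChains
# hover_then_scroll(ActionChains(driver), driver, first_el, target_el)
```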
To click an item from the drop-down list, induce WebDriverWait with presence_of_element_located(), then use location_once_scrolled_into_view and click on the element using the following XPaths.
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
driver = webdriver.Chrome()
driver.get('https://www1.hkexnews.hk/search/titlesearch.xhtml?lang=en')
driver.maximize_window()
wait = WebDriverWait(driver,40)
wait.until(EC.element_to_be_clickable((By.CSS_SELECTOR,'div.combobox-input-wrap a[data-value="rbAll"]'))).click()
wait.until(EC.element_to_be_clickable((By.XPATH,'//div[@class="droplist-item"]/a[contains(.,"Headline Category")]'))).click()
wait.until(EC.element_to_be_clickable((By.XPATH,'//div[@id="rbAfter2006"]//div[@class="combobox-input-wrap"]/a[contains(.,"ALL")]'))).click()
wait.until(EC.element_to_be_clickable((By.XPATH,'//div[@class="droplist-group"]//ul[@class="droplist-items"]//li/a[contains(.,"Announcements and Notices")]'))).click()
ele=wait.until(EC.presence_of_element_located((By.XPATH,'//div[@class="droplist-group droplist-submenu level2"]//ul//li/a[contains(.,"New Listings (Listed Issuers/New Applicants)")]')))
ele.location_once_scrolled_into_view
ele.click()
ele2=wait.until(EC.presence_of_element_located((By.XPATH,'//div[@class="droplist-group droplist-submenu level3"]//ul//li/a[contains(.,"Allotment Results")]')))
ele2.location_once_scrolled_into_view
ele2.click()
Browser Snapshot
I have a list of 9 search results from this site, and I'd like to get the href link for each item in the search results.
Here is the xpath and selectors of the 1st, 2nd, and 3rd items' links:
'//*[@id="search-results"]/div[4]/div/ctl:cache/div[3]/div[1]/div/div[2]/div[2]/div[2]/p[4]/a'
#search-results > div.c_408104 > div > ctl:cache > div.product-list.grid > div:nth-child(8) > div > div.thumbnail > div.caption.link-behavior > div.caption > p.description > a
'//*[@id="search-results"]/div[4]/div/ctl:cache/div[3]/div[2]/div/div[2]/div[2]/div[2]/p[4]/a'
#search-results > div.c_408104 > div > ctl:cache > div.product-list.grid > div:nth-child(13) > div > div.thumbnail > div.caption.link-behavior > div.caption > p.description > a
'//*[@id="search-results"]/div[4]/div/ctl:cache/div[3]/div[4]/div/div[2]/div[2]/div[2]/p[2]/a'
#search-results > div.c_408104 > div > ctl:cache > div.product-list.grid > div:nth-child(14) > div > div.thumbnail > div.caption.link-behavior > div.caption > p.description > a
I've tried:
browser.find_elements_by_xpath("//a[@href]")
but this returns all links on the page, not just the search results. I've also tried using the id, but I'm not sure what the proper syntax is.
browser.find_elements_by_xpath('//*[@id="search-results"]//a')
What you want is the href attribute of all the results, so I'll show you an example:
from selenium import webdriver
from selenium.webdriver.chrome.options import Options

url = 'https://www.costco.com/sofas-sectionals.html'
chrome_options = Options()
chrome_options.add_argument("--start-maximized")
browser = webdriver.Chrome(r"C:\workspace\TalSolutionQA\general_func_class\chromedriver.exe",
                           chrome_options=chrome_options)
browser.get(url)
result_xpath = '//*[@class="caption"]//a'
all_results = browser.find_elements_by_xpath(result_xpath)
for i in all_results:
    print(i.get_attribute('href'))
So what I'm doing here is just getting all the elements that I know contain the links and saving them to all_results; Selenium then provides the get_attribute method to extract the required attribute.
Hope you find this helpful!