I am scraping an angular.js site. My initial link has a search button. I find by xpath and click with no issues. After I click search, I want to be able to click each of the athletes in the table to go to their info pages, but I am not having success with the click method. The links are attached to their names.
from selenium import webdriver
from selenium.common.exceptions import TimeoutException
TIMEOUT = 5
driver = webdriver.Firefox()
driver.set_page_load_timeout(TIMEOUT)
url = 'https://n.rivals.com/search#?formValues=%7B%22sport%22:%22Football%22,%22recruit_year%22:2021,%22offer_and_visit_type%22:%5B%22Offer%22%5D,%22prospect_profiles.prospect_colleges.offer%22:true,%22page_number%22:1,%22page_size%22:50%7D'
try:
    driver.get(url)
except TimeoutException:
    pass
search_button = driver.find_element_by_xpath('//*[@id="articles"]/div/div[2]/div/div/div[1]/form/div[2]/div[5]/button')
search_button.click()
# below is where I tried, but could not get it to click
first_athlete = driver.find_element_by_xpath('//*[@id="content_"]/td[1]/div[2]/a')
first_athlete.click()
It works if you remove the trailing /a from the XPath:
first_athlete = driver.find_element_by_xpath('//*[@id="content_"]/td[1]/div[2]')
first_athlete.click()
If you want to find a particular athlete and you already know their name, you can use a CSS selector as well:
athlete = driver.find_element_by_css_selector('#content_ > td > div > a[href*="donovan-jackson"]')
athlete.click()
Because the href contains the player's name, this selector gives you a unique web element for each player.
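To reuse that selector for other athletes, a small helper can build it from a name. This is only a sketch: the function name `athlete_link_selector` is hypothetical, and it assumes that every profile URL contains the lowercased, hyphen-separated name, as in the "donovan-jackson" example.

```python
import re


def athlete_link_selector(name):
    """Return a CSS selector for the profile link of the athlete called `name`.

    Assumption: profile URLs contain the lowercased name with hyphens,
    e.g. "Donovan Jackson" -> "donovan-jackson".
    """
    slug = re.sub(r"[^a-z0-9]+", "-", name.lower()).strip("-")
    return f'#content_ > td > div > a[href*="{slug}"]'


print(athlete_link_selector("Donovan Jackson"))
```

The returned string can then be passed to the CSS selector lookup in place of the hard-coded one.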
I am trying to get search results from Yahoo search using Python with Selenium and bs4. I have been able to get the links successfully, but I am not able to click the button at the bottom to go to the next page. I tried one way, but it couldn't identify the button after the second page.
Here is the link:
https://in.search.yahoo.com/search;_ylt=AwrwSY6ratRgKEcA0Bm6HAx.;_ylc=X1MDMjExNDcyMzAwMgRfcgMyBGZyAwRmcjIDc2ItdG9wLXNlYXJjaARncHJpZANidkhMeWFsMlJuLnZFX1ZVRk15LlBBBG5fcnNsdAMwBG5fc3VnZwMxMARvcmlnaW4DaW4uc2VhcmNoLnlhaG9vLmNvbQRwb3MDMARwcXN0cgMEcHFzdHJsAzAEcXN0cmwDMTQEcXVlcnkDc3RhY2slMjBvdmVyZmxvdwR0X3N0bXADMTYyNDUzMzY3OA--?p=stack+overflow&fr=sfp&iscqry=&fr2=sb-top-search
This is what I'm doing to get data from the page, but I need to put it in a loop that changes pages:
page = BeautifulSoup(driver.page_source, 'lxml')
lnks = page.find('div', {'id': 'web'}).find_all('a', href=True)
for i in lnks:
    print(i['href'])
You don't need to scroll down to the bottom. The next button is accessible without scrolling. Suppose you want to navigate 10 pages. The Python script could look like this:
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.support.ui import WebDriverWait
driver = webdriver.Chrome()
driver.get('Yahoo Search URL')
# Loop over the pages. On each one, wait for the next button
# to be clickable (the locator is passed as a tuple), then click it.
for i in range(10):
    next_button = WebDriverWait(driver, 5).until(EC.element_to_be_clickable((By.XPATH, '//a[@class="next"]')))
    next_button.click()
The next page button is at the bottom of the page, so you first need to scroll to that element and then click it. Like this:
import time
from selenium.webdriver.common.action_chains import ActionChains
actions = ActionChains(driver)
next_page_btn = driver.find_element_by_css_selector("a.next")
# Python's ActionChains has no build() step; perform() runs the queued actions
actions.move_to_element(next_page_btn).perform()
time.sleep(0.5)
next_page_btn.click()
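A variation on the same idea, as a rough sketch: instead of ActionChains, it scrolls with the browser's own JavaScript `scrollIntoView` through `execute_script`. The helper name `scroll_and_click` is made up for illustration, and `driver` and `element` are assumed to be a Selenium WebDriver and WebElement.

```python
def scroll_and_click(driver, element):
    """Scroll `element` into the middle of the viewport, then click it.

    `driver` is assumed to be a Selenium WebDriver and `element` a WebElement.
    """
    # arguments[0] is how Selenium exposes the element to the injected script
    driver.execute_script("arguments[0].scrollIntoView({block: 'center'});", element)
    element.click()
```

It would be called as `scroll_and_click(driver, next_page_btn)` once the button has been located.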
I am trying to scrape information from a website where the information is not immediately present. When you click a certain button, the page begins to load new content at the bottom of the page, and once it has finished loading, red text appears that reads "Assists (At Least)". I am able to find the first button, "Go to Prop builder", which doesn't immediately show up on the page. But after the script clicks that button, it times out when trying to find the "Assists (At Least)" text, even though the script sleeps first and the text is visible on the screen.
from selenium import webdriver
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.common.by import By
import time
from bs4 import BeautifulSoup
driver = webdriver.Chrome()
driver.get('https://www.bovada.lv/sports/basketball/nba')
# this part succeeds
element = WebDriverWait(driver, 10).until(
    EC.presence_of_element_located(
        (By.XPATH, "//span[text()='Go to Prop builder']")
    )
)
element.click()
time.sleep(5)
# this part fails
element2 = WebDriverWait(driver, 6).until(
    EC.visibility_of_element_located(
        (By.XPATH, "//*[text()='Assists (At Least)']")
    )
)
time.sleep(2)
innerHTML = driver.execute_script('return document.body.innerHTML')
driver.quit()
soup = BeautifulSoup(innerHTML, 'html.parser')
The problem is the Assist element is under a frame. You need to switch to the frame like this:
frame = WebDriverWait(driver, 10).until(EC.presence_of_element_located((By.CLASS_NAME,"player-props-frame")))
driver.switch_to.frame(frame)
Increase the timeout to check whether the current value is simply too short. You can also confirm this by stepping through in debug mode. If the issue still persists, check that the "Assists (At Least)" element does not sit inside a frame.
If the issue is still not resolved, please share the DOM and the exact error message.
I have a couple of suggestions you could try:
Make sure that the content loaded at the bottom of the page is not in a frame. If it is, you need to switch to that particular frame.
Check that the XPath is correct by testing whether it matches in the Developer Console.
Inspect the element in the browser. Once the Developer Console is open, press CTRL+F and try your XPath. If nothing is highlighted, check for frames.
Check whether there are any iframes in the page: search for iframe in the page source, and if the field you are looking for is inside one, switch to that frame first.
driver.switch_to.frame("name of the iframe")
Try adding retry logic with a timeout, and click the page's refresh button between attempts if it has one.
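st = time.time()
while st + 180 > time.time():
    try:
        element2 = WebDriverWait(driver, 6).until(
            EC.visibility_of_element_located(
                (By.XPATH, "//*[text()='Assists (At Least)']")
            )
        )
        break  # element found, stop retrying
    except Exception:
        pass  # not visible yet, try again until 180 seconds have passed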
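The same retry idea can be pulled out into a reusable helper. This is a minimal sketch using only the standard library: `retry_until` is a hypothetical name, and it retries on any exception, whereas Selenium code would more likely catch `TimeoutException` specifically.

```python
import time


def retry_until(action, timeout=180.0, interval=1.0):
    """Call `action` until it returns without raising, or `timeout` seconds pass.

    Returns whatever `action` returns. Raises TimeoutError if every attempt fails.
    """
    deadline = time.monotonic() + timeout
    last_error = None
    while time.monotonic() < deadline:
        try:
            return action()
        except Exception as error:  # assumption: any exception means "not ready yet"
            last_error = error
            time.sleep(interval)
    raise TimeoutError(f"gave up after {timeout} seconds") from last_error
```

With Selenium, `action` could be a lambda or small function wrapping the `WebDriverWait(driver, 6).until(...)` call.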
The content you want is in an iFrame. You can access it by switching to it first, like this:
iframe=driver.find_element_by_css_selector('iframe[class="player-props-frame"]')
driver.switch_to.frame(iframe)
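After switching into a frame, lookups stay scoped to it until you switch back with `driver.switch_to.default_content()`. One way to make that hard to forget is a small context manager. This is a sketch, and the name `inside_frame` is hypothetical:

```python
from contextlib import contextmanager


@contextmanager
def inside_frame(driver, frame):
    """Switch into `frame` for the duration of the block, then return
    to the top-level document, even if the block raises."""
    driver.switch_to.frame(frame)
    try:
        yield driver
    finally:
        driver.switch_to.default_content()
```

Usage would look like `with inside_frame(driver, iframe): ...`, with the search for "Assists (At Least)" placed inside the `with` block.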
The round brackets in the text are the issue here (at least in some cases). If possible, match the text with the XPath contains() function instead:
//*[contains(text(),'Assists ') and contains(text(),'At Least')]
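If you need this pattern for several labels, the expression can be generated. The helper below is a sketch with a hypothetical name, `xpath_contains_all`. It assumes the fragments contain no single quotes, since XPath 1.0 string literals cannot escape them.

```python
def xpath_contains_all(fragments, tag="*"):
    """Build an XPath matching elements whose text contains every fragment.

    Assumption: no fragment contains a single quote.
    """
    for fragment in fragments:
        if "'" in fragment:
            raise ValueError("fragments must not contain single quotes")
    conditions = " and ".join(f"contains(text(),'{f}')" for f in fragments)
    return f"//{tag}[{conditions}]"


print(xpath_contains_all(["Assists ", "At Least"]))
```

Called with the two fragments shown, it produces the same expression as the hand-written one.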
I am using Selenium to go to a URL and click search. This part works fine. After searching, I want to scrape all the href URLs associated with the athletes on the page and then click through to the next page. I have tried several class and XPath locators without success.
Goal:
1) Go to the URL listed
2) Click the search button
3) Scrape all the URLs that go to each athlete's profile page
4) Click the next page button at the bottom
5) Repeat this process through all the pages
from selenium import webdriver
from selenium.common.exceptions import TimeoutException
TIMEOUT = 5
driver = webdriver.Firefox()
driver.set_page_load_timeout(TIMEOUT)
url = 'https://n.rivals.com/search#?formValues=%7B%22sport%22:%22Football%22,%22recruit_year%22:2021,%22offer_and_visit_type%22:%5B%22Offer%22%5D,%22prospect_profiles.prospect_colleges.offer%22:true,%22page_number%22:1,%22page_size%22:50%7D'
try:
    driver.get(url)
except TimeoutException:
    pass
# this click method works
search_button = driver.find_element_by_xpath('//*[@id="articles"]/div/div[2]/div/div/div[1]/form/div[2]/div[5]/button')
search_button.click()
# I cannot find/get the href links below to print:
profile_page = driver.find_elements_by_xpath('//*[@id="content_"]/td[1]/div[2]/div/a')
profile_page = [home.get_attribute("href") for home in profile_page]
print(profile_page)
# I cannot get it to click the next button to do the same thing on the next page:
next_button = driver.find_element_by_xpath('//*[@id="content_"]/td[1]/div[2]/div/a')
next_button.click()
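One possible shape for the loop described in the goal list is sketched below. It makes several assumptions. The function name `collect_links` is hypothetical. The CSS selectors for the profile links and the next button are passed in, because the correct ones for this site are not known here. And a real run on an Angular page would probably need an explicit wait after each click so the new rows have time to load.

```python
CSS = "css selector"  # the string value of selenium's By.CSS_SELECTOR


def collect_links(driver, link_selector, next_selector, max_pages=10):
    """Collect href values page by page, clicking the next button in between.

    Stops after `max_pages` pages or when no next button is found.
    """
    links = []
    for _ in range(max_pages):
        for anchor in driver.find_elements(CSS, link_selector):
            href = anchor.get_attribute("href")
            if href and href not in links:
                links.append(href)
        next_buttons = driver.find_elements(CSS, next_selector)
        if not next_buttons:
            break  # assumption: no next button means this was the last page
        next_buttons[0].click()
    return links
```

Using `find_elements` for the next button returns an empty list on the last page instead of raising, which keeps the stop condition simple.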
I am trying to run a script with Selenium WebDriver in Python. I am trying to click on the search field, but it always raises the exception "An element could not be located on the page using the given search parameters."
Here is the script:
from selenium import webdriver
from selenium.webdriver.common.by import By
class Exercise:
    def safari(self):
        driver = webdriver.Safari()
        driver.maximize_window()
        url = "https://www.airbnb.com"
        driver.implicitly_wait(15)
        Title = driver.title
        driver.get(url)
        CurrentURL = driver.current_url
        print("Current URL is " + CurrentURL)
        SearchButton = driver.find_element(By.XPATH, "//*[@id='GeocompleteController-via-SearchBarV2-SearchBarV2']")
        SearchButton.click()


note = Exercise()
note.safari()
Please tell me where I am going wrong.
There appear to be two elements that match your XPath.
The one that matches the search bar is actually the second one. So you'd edit your XPath as follows:
SearchButton = driver.find_element(By.XPATH, "(//*[@id='GeocompleteController-via-SearchBarV2-SearchBarV2'])[2]")
Or simply:
SearchButton = driver.find_element_by_xpath("(//*[@id='GeocompleteController-via-SearchBarV2-SearchBarV2'])[2]")
You can check this by loading the same website in Google Chrome, opening the Inspector tool with F12 (or right-click anywhere and click "Inspect"), and pasting your XPath into its search box. This gives you the matching elements. If you scroll to 2 of 2, it highlights the search bar. Therefore, we want the second result. XPath indices start at 1, unlike most languages (which usually start at 0), so to get the second match, wrap the entire original XPath in parentheses and then add [2] after it.
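The wrapping step can be expressed as a tiny helper, shown here as a sketch. The name `nth_match` is made up for illustration.

```python
def nth_match(xpath, n):
    """Wrap `xpath` in parentheses and select its n-th match.

    Positions are 1-based, as in XPath itself.
    """
    if n < 1:
        raise ValueError("XPath positions start at 1")
    return f"({xpath})[{n}]"


search_bar_xpath = nth_match("//*[@id='GeocompleteController-via-SearchBarV2-SearchBarV2']", 2)
print(search_bar_xpath)
```

The parentheses matter: without them, `[2]` would apply to the position of the element among its siblings rather than to the list of all matches.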
I'm trying to click on this button to move to the login page.
My code is:
from selenium import webdriver
driver = webdriver.Chrome()
driver.get('http://moodle.tau.ac.il/')
That works fine, but I can only find the form, by using
loginform = driver.find_element_by_xpath("//form[@id='login']")
I don't know how to get to the button. It's very basic stuff, but I didn't find any good example.
This will click on the login button on the moodle.tau.ac.il page.
The line driver.find_element_by_xpath(".//*[@id='login']/div/input").click() finds the login button on the page and clicks it. XPath is just one selector type that you can use with Selenium to find web elements on a page. You can also use an ID, a class name, or a CSS selector.
from selenium import webdriver
driver = webdriver.Chrome()
driver.get('http://moodle.tau.ac.il/')
# This will take you to the login page.
driver.find_element_by_xpath(".//*[@id='login']/div/input").click()
# Fills out the login page
elem = driver.find_element_by_xpath("html/body/form/table/tbody/tr[2]/td/table/tbody/tr[1]/td[2]/input")
elem.send_keys('Your Username')
elem = driver.find_element_by_xpath("html/body/form/table/tbody/tr[2]/td/table/tbody/tr[3]/td[2]/input")
elem.send_keys('Your ID Number')
elem = driver.find_element_by_xpath("html/body/form/table/tbody/tr[2]/td/table/tbody/tr[1]/td[2]/input")
elem.send_keys('Your Password')
driver.find_element_by_xpath("html/body/form/table/tbody/tr[2]/td/table/tbody/tr[7]/td[2]/input").click()
The page has two identical login forms and your XPath returns the hidden one.
So with the visible one:
from selenium import webdriver
driver = webdriver.Chrome()
driver.get(r"http://moodle.tau.ac.il/")
driver.find_element_by_css_selector("#page-content #login input[type=submit]").click()
Or with an XPath:
from selenium import webdriver
driver = webdriver.Chrome()
driver.get(r"http://moodle.tau.ac.il/")
driver.find_element_by_xpath("id('page-content')//form[@id='login']//input[@type='submit']").click()
You could find it using XPath, as mentioned by @ChrisP.
You could find it by CSS selector: "#login input[type='text']"
Or you could also just submit the form... loginForm.submit()
Ideally, you'd have a unique id for that button which would make it very easy to find.