How to scrape in Python Selenium? - python

If you visit https://www.premierleague.com/match/66686 and click the Stats tab, you will see several statistics about the match. How can I scrape the Possession value for both teams?
This did not work.
stats = driver.find_element(By.XPATH, '//*[@id="mainContent"]/div/section[2]/div[2]/div[2]/div[1]/div/div/ul/li[3]')
stats.click()
posHome = driver.find_element(By.XPATH,'//body[1]/main[1]/div[1]/section[2]/div[2]/div[2]/div[2]/section[3]/div[2]/div[2]/table[1]/tbody[1]/tr[1]/td[1]')
print(posHome.text)
posAway = driver.find_element(By.XPATH,'//*[@id="mainContent"]/div/section[2]/div[2]/div[2]/div[2]/section[3]/div[2]/div[2]/table/tbody/tr[1]/td[3]')
print(posAway.text)
Please let me know how to solve this issue and thanks!

To print the Possession for both teams you need to induce WebDriverWait for the visibility_of_element_located() and you can use the following locator strategies:
Code Block:
driver.get("https://www.premierleague.com/match/66686")
WebDriverWait(driver, 20).until(EC.element_to_be_clickable((By.XPATH, "//button[text()='Accept All Cookies']"))).click()
WebDriverWait(driver, 20).until(EC.element_to_be_clickable((By.XPATH, "//li[text()='Stats']"))).click()
print(WebDriverWait(driver, 20).until(EC.visibility_of_element_located((By.CSS_SELECTOR, "tbody.matchCentreStatsContainer>tr>td>p"))).text)
print(WebDriverWait(driver, 20).until(EC.visibility_of_element_located((By.CSS_SELECTOR, "tbody.matchCentreStatsContainer>tr>td:nth-child(3)>p"))).text)
driver.quit()
Console Output:
33.9
66.1
Note : You have to add the following imports :
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
You can find a relevant discussion in How to retrieve the text of a WebElement using Selenium - Python
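The two values printed above are strings. A minimal sketch, assuming the cells keep returning plain decimal text such as 33.9 and 66.1, of converting them to numbers before storing them. The helper name parse_possession is hypothetical and not part of Selenium:

```python
def parse_possession(home_text, away_text):
    """Convert the two scraped possession strings into floats.

    Assumes each string is a plain decimal, optionally followed by '%'.
    """
    home = float(home_text.strip().rstrip("%"))
    away = float(away_text.strip().rstrip("%"))
    # The two shares should add up to roughly 100; anything else suggests
    # that the locators matched the wrong table cells.
    if abs(home + away - 100.0) > 0.5:
        raise ValueError(f"unexpected possession values: {home}, {away}")
    return home, away


home, away = parse_possession("33.9", "66.1")
print(home, away)
```

The sanity check is a design choice: a scraper that silently stores values from the wrong row is harder to debug than one that fails early.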

Related

How to click on button with text Show more results using Selenium and Python?

For this button:
I used these pieces of code, but it did not work.
WebDriverWait(wd, 1).until(EC.element_to_be_clickable((By.XPATH, "//input[contains(., 'Show more results')]"))).click()
and
button = wd.find_elements_by_xpath("//*[contains(text(), 'My Button')]")
button.click()
You were almost there. Show more results isn't the innerText but the value of the value attribute.
Solution
To click on the element with text as Show more results you need to induce WebDriverWait for the element_to_be_clickable() and you can use either of the following locator strategies:
Using CSS_SELECTOR:
WebDriverWait(driver, 20).until(EC.element_to_be_clickable((By.CSS_SELECTOR, "input[value='Show more results']"))).click()
Using XPATH:
WebDriverWait(driver, 20).until(EC.element_to_be_clickable((By.XPATH, "//input[@value='Show more results']"))).click()
Note: You have to add the following imports :
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
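The difference between an element's text and its value attribute can be checked without a browser. The sketch below uses only the standard library; the markup is hypothetical, modelled on the button described in the question, whose actual HTML is not shown:

```python
from html.parser import HTMLParser


class InputInspector(HTMLParser):
    """Record the value attribute and the text content of <input> tags."""

    def __init__(self):
        super().__init__()
        self.values = []  # value attributes of every <input>
        self.text = []    # non-empty text nodes found anywhere in the markup

    def handle_starttag(self, tag, attrs):
        if tag == "input":
            self.values.append(dict(attrs).get("value"))

    def handle_data(self, data):
        if data.strip():
            self.text.append(data.strip())


# Hypothetical markup: the label is stored in the value attribute.
markup = '<form><input type="button" value="Show more results"></form>'

inspector = InputInspector()
inspector.feed(markup)
print(inspector.values)  # the label lives in the attribute
print(inspector.text)    # no text node, so a text-based locator finds nothing
```

This is why a locator built on contains(., '...') or text() misses such a button, while one that tests the value attribute finds it.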

How can I adjust this line of selenium code to get the status info of this item?

I have this line of code which I am trying to use to obtain the status of an item. Here is the line of code:
item_status = driver.findElement(By.className("status-info")).getText();
I'm not sure how I can adjust this to retrieve the text seen here:
from selenium import webdriver
from selenium.webdriver.chrome.options import Options
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
options=Options()
driver=webdriver.Chrome(options=options)
#Directing to site
driver.get("https://www.amazon.co.uk/Nintendo-Switch-OLED-Model-Neon/dp/B098TNW7NM/ref=sr_1_3?keywords=Nintendo+Switch&qid=1651147043&sr=8-3");
WebDriverWait(driver, 20).until(EC.element_to_be_clickable((By.XPATH, "/html/body/div[2]/span/form/div[3]/span[1]/span/input"))).click()
When you are doing this
WebDriverWait(driver, 20).until(EC.element_to_be_clickable((By.XPATH, "/html/body/div[1]/header/div/div[1]/div[2]/div/form/div[3]/div/span/input"))).click()
it clicks the search icon. On the results page, the XPath //span[@class='a-size-base a-color-success a-text-bold'] is not present, so nothing is printed to the console and you are likely to get a TimeoutException.
However, looking at the screenshot that you've shared, I would suggest using this XPath:
//div[@id='availability']//span[contains(text(),'In stock.')]
If you want to print the text and the tags:
print(WebDriverWait(driver, 20).until(EC.visibility_of_element_located((By.XPATH, "//div[@id='availability']//span[contains(text(),'In stock.')]"))).get_attribute("innerHTML"))
If you want only the text:
print(WebDriverWait(driver, 20).until(EC.visibility_of_element_located((By.XPATH, "//div[@id='availability']//span[contains(text(),'In stock.')]"))).get_attribute("innerText"))
driver.findElement(By.className("status-info")) is the Java syntax and getText() is a Java method. Possibly you need Python syntax and method.
Solution
To print the text In stock. you need to induce WebDriverWait for the visibility_of_element_located() and you can use either of the following locator strategies:
Using CSS_SELECTOR:
print(WebDriverWait(driver, 20).until(EC.visibility_of_element_located((By.CSS_SELECTOR, "span.a-size-base.a-color-success.a-text-bold"))).text)
Using XPATH:
print(WebDriverWait(driver, 20).until(EC.visibility_of_element_located((By.XPATH, "//span[@class='a-size-base a-color-success a-text-bold']"))).get_attribute("innerHTML"))
Note : You have to add the following imports :
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
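The CSS and XPath variants are not strictly equivalent. An XPath test on the class attribute compares the whole attribute string, while a CSS class selector only requires each class name to be present. The following is a simplified pure-Python model of the two rules, not Selenium's or the browser's implementation; the function names and the "varied" class string are hypothetical:

```python
def xpath_class_equals(class_attr, expected):
    """Model of an XPath class test: the whole attribute string must match."""
    return class_attr == expected


def css_has_classes(class_attr, *names):
    """Model of CSS .a.b.c: every name must be a whitespace-separated token."""
    return set(names) <= set(class_attr.split())


wanted = ("a-size-base", "a-color-success", "a-text-bold")

# Same classes in the same order: both styles match.
same = "a-size-base a-color-success a-text-bold"
# Hypothetical variation: reordered tokens plus one extra class.
varied = "a-text-bold a-size-base a-color-success a-extra"

print(xpath_class_equals(same, " ".join(wanted)), css_has_classes(same, *wanted))
print(xpath_class_equals(varied, " ".join(wanted)), css_has_classes(varied, *wanted))
```

If the site reorders or adds classes, the CSS form is therefore the more tolerant of the two.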

Selenium not able to click on Get Data button on using Python

I am scraping data from this website using geckodriver. The element is below:
<img class="getdata-button" style="float:right;" src="/common/images/btn-get-data.gif" id="get" onclick="document.getElementById('submitMe').click()">
but I can't get Selenium to click it. I even tried XPath and id, with no luck.
Is there any fix or workaround to get it done?
To click on the element Get Data you can use either of the following Locator Strategies:
Using css_selector:
driver.find_element(By.CSS_SELECTOR, "img.getdata-button#get").click()
Using xpath:
driver.find_element(By.XPATH, "//img[@class='getdata-button' and @id='get']").click()
Ideally, to click on the element you need to induce WebDriverWait for the element_to_be_clickable() and you can use either of the following Locator Strategies:
Using CSS_SELECTOR:
WebDriverWait(driver, 20).until(EC.element_to_be_clickable((By.CSS_SELECTOR, "img.getdata-button#get"))).click()
Using XPATH:
WebDriverWait(driver, 20).until(EC.element_to_be_clickable((By.XPATH, "//img[@class='getdata-button' and @id='get']"))).click()
Note: You have to add the following imports :
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
You should probably try by id
driver.find_element(By.ID, 'get').click()

How to click on the webelement with in the highlighted script using Selenium and Python

I tried:
WebDriverWait(driver, 10).until(EC.element_to_be_clickable((By.XPATH,"//*[@value='Sign Out']")))
but no luck. Please see the image for the HTML.
The desired element is a JavaScript enabled element so to click on the element you have to induce WebDriverWait for the element_to_be_clickable() and you can use either of the following Locator Strategies:
Using CSS_SELECTOR and click():
WebDriverWait(driver, 20).until(EC.element_to_be_clickable((By.CSS_SELECTOR, "form[action*='Logoff']>li>input[value='Sign Out']"))).click()
Using XPATH and submit():
WebDriverWait(driver, 20).until(EC.element_to_be_clickable((By.XPATH, "//form[contains(@action, 'Logoff')]/li/input[@value='Sign Out']"))).submit()
Note : You have to add the following imports :
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC

Retrieving dynamic value with selenium webdriver, python

I am aware that there already exists similar threads about this. However, when trying previously suggested methods to retrieve my specific dynamic table value, all I am getting is either a nbsp value or something cryptic like "1a207feb-8080-4ff0-..."
What I am trying to do:
Get the current table value for euro/oz value for gold from here. I "inspected" the page and got the xpath (//*[@id="bullionPriceTable"]/div/table/tbody/tr[3]/td[3]/span)
My code:
driver = webdriver.Chrome("path/to/chromedriver")
driver.get("https://www.bullionvault.com/gold-price-chart.do")
xpath = '//*[@id="bullionPriceTable"]/div/table/tbody/tr[3]/td[3]/span'
select=driver.find_element_by_xpath(xpath)
print(select)
This prints:
<selenium.webdriver.remote.webelement.WebElement (session="3ade114e9f0907e4eb13deac6a264fc8", element="3a670af5-8594-4504-908a-a9bfcbac7342")>
which obviously is not the number I was looking for.
I've also experimented with using get_attribute('innerHtml') and .text on the webElement, but to no avail. What am I missing here? Am I just not encoding this value correctly, or am I extracting from the wrong source?
To extract the table value for euro/oz value for gold i.e. the text €1,452.47 you have to induce WebDriverWait for the visibility_of_element_located() and you can use either of the following Locator Strategies:
Using XPATH and get_attribute():
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
driver.get('https://www.bullionvault.com/gold-price-chart.do#')
WebDriverWait(driver, 20).until(EC.element_to_be_clickable((By.XPATH, "//div[@class='cookies-warning-buttons']//a[text()='Accept']"))).click()
driver.execute_script("return arguments[0].scrollIntoView(true);", WebDriverWait(driver, 20).until(EC.visibility_of_element_located((By.XPATH, "//strong[text()='Live Gold Price']"))))
print(WebDriverWait(driver, 20).until(EC.visibility_of_element_located((By.XPATH, "//th[text()='Gold Price per Ounce']//following-sibling::td[3]/span[@data-currency='EUR']"))).get_attribute("innerHTML"))
Console Output:
€1,456.30
Using XPATH and text attribute:
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
driver.get('https://www.bullionvault.com/gold-price-chart.do#')
WebDriverWait(driver, 20).until(EC.element_to_be_clickable((By.XPATH, "//div[@class='cookies-warning-buttons']//a[text()='Accept']"))).click()
driver.execute_script("return arguments[0].scrollIntoView(true);", WebDriverWait(driver, 20).until(EC.visibility_of_element_located((By.XPATH, "//strong[text()='Live Gold Price']"))))
print(WebDriverWait(driver, 20).until(EC.visibility_of_element_located((By.XPATH, "//th[text()='Gold Price per Ounce']//following-sibling::td[3]/span[@data-currency='EUR']"))).text)
Console Output:
€1,456.30
Wait for the page to load, then read the innerHTML, as in the following example:
import time
from selenium import webdriver
chrome_browser = webdriver.Chrome(
executable_path=r"chromedriver.exe")
chrome_browser.get("https://www.bullionvault.com/gold-price-chart.do")
time.sleep(2)
select = chrome_browser.find_element_by_xpath(
"//*[@id='bullionPriceTable']/div/table/tbody/tr[3]/td[3]/span"
).get_attribute("innerHTML")
print(select)
€1,450.98
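Whichever approach retrieves the text, the result is still a formatted string such as €1,456.30, and the question mentions stray nbsp values. A minimal sketch, assuming a leading currency symbol, ',' as the thousands separator and '.' as the decimal point, of turning that string into a number. The helper name parse_price is hypothetical:

```python
from decimal import Decimal


def parse_price(text):
    """Turn a scraped price such as '€1,456.30' into a Decimal.

    Assumes an optional currency symbol, ',' as the thousands separator
    and '.' as the decimal point.
    """
    # Non-breaking spaces (&nbsp;) are common in scraped table cells.
    cleaned = text.replace("\xa0", " ").strip()
    # Keep only digits, the decimal point and a minus sign.
    digits = "".join(ch for ch in cleaned if ch.isdigit() or ch in ".-")
    if not digits:
        raise ValueError(f"no number found in {text!r}")
    return Decimal(digits)


print(parse_price("€1,456.30"))
```

Decimal is used instead of float so that the price keeps its two decimal places exactly.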
