I am trying to scrape this website
https://script.google.com/a/macros/cprindia.org/s/AKfycbzgfCVNciFRpcpo8P7joP1wTeymj9haAQnNEkNJJ2fQ4FBXEco/exec
I am using selenium and python.I am not able to view entire page source,Basically i have to scrape the table inside it and click on next button,but the code of next and table not visible on page source.Here is my code
from selenium import webdriver
from selenium.webdriver.support.ui import Select
from selenium.webdriver.common.keys import Keys
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
from selenium.common.exceptions import TimeoutException
from bs4 import BeautifulSoup
from selenium import webdriver
browser = webdriver.PhantomJS()
browser.get(link)
pass1 = browser.find_element_by_xpath("/html/body/div[2]/table[2]/tbody/tr[1]/td/div/div/div[2]/div[2]")
pass1.click()
time.sleep(30)
I am getting this error,NoSuchElementException.
There are two iframes present on the page, so you need to first switch on those iframe and then you need to click on the element.
And you can apply explicit wait on the element so that the script waits until the element is visible on the page.
You can do it like:
browser = webdriver.PhantomJS()
browser.get(link)
browser.switch_to.frame(driver.find_element_by_id('sandboxFrame'))
browser.switch_to.frame(driver.find_element_by_id('userHtmlFrame'))
WebDriverWait(browser, 20).until(EC.presence_of_element_located((By.XPATH, "//div[contains(#class,'charts-custom-button-collapse-left')]//div"))).click()
Note: You have to add the following imports:
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
Related
Here is the code I'm trying to use to scrape data from FRED website to download the time series data in CSV format but it is redirecting me top another page
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.common.by import By
from selenium.webdriver import ActionChains
url='https://fred.stlouisfed.org/series/TERMCBAUTO48NS'
driver=webdriver.Chrome(executable_path=r'D:\\Workspace\\Python\\automation\\chromedriver.exe')
driver.get(url)
element=driver.find_element_by_id('download-button')
element.click()
wait1=WebDriverWait(driver,20)
result1=wait1.until(
EC.element_to_be_clickable((By.ID,'fg-download-menu')))
print('Result 1: ',result1)
menu=driver.find_element_by_id('fg-download-menu')
wait1=WebDriverWait(driver,20)
result2=wait1.until(
EC.element_to_be_clickable((By.ID,'download-data-csv')))
print('Result 2: ',result2)
hidden_submenu=driver.find_element_by_id('download-data-csv')
actions=ActionChains(driver)
actions.move_to_element(menu)
actions.click(hidden_submenu)
actions.perform()
driver.quit()
The locators you are using are not unique. Like there are several tags with the same id download-button.
Its important to find unique locators for Elements. You can refer This link
Try like below and confirm:
driver.get("https://fred.stlouisfed.org/series/TERMCBAUTO48NS")
wait = WebDriverWait(driver,30)
# Click on Download button
wait.until(EC.element_to_be_clickable((By.XPATH,"//div[#id='page-title']//button[#id='download-button']"))).click()
# Click on CSV data
wait.until(EC.element_to_be_clickable((By.XPATH,"//ul[#id='fg-download-menu']//a[#id='download-data-csv']"))).click()
This should work:
from selenium import webdriver
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
driver = webdriver.Chrome("D:/chromedriver/94/chromedriver.exe")
driver.get("https://fred.stlouisfed.org/series/TERMCBAUTO48NS")
# wait 60 seconds
wait = WebDriverWait(driver,60)
wait.until(EC.element_to_be_clickable((By.ID, "download-button"))).click()
wait.until(EC.element_to_be_clickable((By.ID, "download-data-csv"))).click()
I've been trying to scrape some information of this E-commerce website with selenium. However when I access the website I need to accept cookies to continue. This only happens when the bot accesses the website, not when I do it manually. When I try to find the corresponding element either by xpath, as I find it when I inspect the page manually I always get this error message:
selenium.common.exceptions.StaleElementReferenceException: Message: stale element reference: element is not attached to the page document
My code is mentined below.
import time
import pandas
pip install selenium
from selenium import webdriver
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.common.by import By
from selenium.common.exceptions import TimeoutException
from bs4 import BeautifulSoup #pip install beautifulsoup4
PATH = "/Users/Ziye/Desktop/Python/chromedriver"
delay = 15
driver = webdriver.Chrome(PATH)
driver.implicitly_wait(10)
driver.get("https://en.zalando.de/women/?q=michael+michael+kors+taschen")
driver.find_element_by_xpath('//*[#id="uc-btn-accept-banner"]').click()
This is the HTML corresponding to the button "That's ok". The XPATH is as above.
<button aria-label="" id="uc-btn-accept-banner" class="uc-btn uc-btn-primary">
That’s OK <span id="uc-optin-timer-display"></span>
</button>
Does anyone know where my mistake lies?
You should add explicit wait for this button:
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.support.wait import WebDriverWait
driver = webdriver.Chrome(executable_path='/snap/bin/chromium.chromedriver')
driver.implicitly_wait(10)
driver.get("https://en.zalando.de/women/?q=michael+michael+kors+taschen")
wait = WebDriverWait(driver, 15)
wait.until(EC.element_to_be_clickable((By.XPATH, '//*[#id="uc-btn-accept-banner"]')))
driver.find_element_by_xpath('//*[#id="uc-btn-accept-banner"]').click()
Your locator is correct.
As css selector, you can use .uc-btn-footer-container .uc-btn.uc-btn-primary
Here is my code so far, note that the web page loads up with a captcha. I countered this with adding a time.sleep so I can run tests. When I try to submit the form by submitting "Create new account", I get an error saying the element has no attribute for 'submit'. I tried finding the element using xpath, css_selectos, tags, class names, etc.
from selenium import webdriver
from selenium.webdriver.support.ui import Select
import time
browser = webdriver.Chrome()
browser.get('https://www.bstn.com/en/register/address')
time.sleep(35)
elam = browser.find_element_by_css_selector("[value='Create new account']")
elam.Submit()
If you are trying to click on Create new account button after filling information then please find below xpath to click on it
from selenium import webdriver
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.common.keys import Keys
from selenium.webdriver.common.by import By
WebDriverWait(driver, 10).until(
EC.element_to_be_clickable((By.XPATH, "//input[#class='button radius charcheck-submit']"))).click()
Another solution with action class
from selenium.webdriver.common.action_chains import ActionChains
from selenium.webdriver import ActionChains
actionChains = ActionChains(driver)
submit = WebDriverWait(driver, 10).until(
EC.visibility_of_element_located((By.XPATH, "//input[#class='button radius charcheck-submit']")))
actionChains.move_to_element(submit).click().perform()
Working code
from selenium import webdriver
from selenium.webdriver.support.ui import Select
import time
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
browser = webdriver.Chrome(executable_path=r"C:\New folder\chromedriver.exe")
# browser = webdriver.Chrome()
browser.get('https://www.bstn.com/en/register/address')
time.sleep(35)
WebDriverWait(browser, 10).until(
EC.element_to_be_clickable((By.XPATH, "//input[#class='button radius charcheck-submit']"))).click()
On the website the selenium script cannot find the login and password fields. I tried to search by xpath, css selector, name and class name. But nothing worked.
from selenium import webdriver
from time import sleep
driver = webdriver.Firefox()
driver.get("https://login.aliexpress.com/")
driver.find_element_by_id("fm-login-id").send_keys("test_id")
driver.find_element_by_id("fm-login-password").clear()
driver.find_element_by_id("fm-login-password").send_keys("test_pass")
driver.find_element_by_id("fm-login-submit").click()`
I tried to do this with the help of Selenium IDE, and everything worked in the GUI. But after I exported the code to python and ran it, the program gave an error that it could not find the element.
The login form is inside of a frame, you need to switch to it first.
from selenium import webdriver
from time import sleep
driver = webdriver.Firefox()
driver.get("https://login.aliexpress.com/")
frame = driver.find_element_by_id("alibaba-login-box")
driver.switch_to.frame(frame)
driver.find_element_by_id("fm-login-id").send_keys("test_id")
driver.find_element_by_id("fm-login-password").clear()
driver.find_element_by_id("fm-login-password").send_keys("test_pass")
driver.find_element_by_id("fm-login-submit").click()
However as the the desired elements are within an <iframe> so you have to:
Induce WebDriverWait for the desired frame to be available and switch to it.
Induce WebDriverWait for the desired elements to be clickable.
You can use the following solution:
Using CSS_SELECTOR:
from selenium import webdriver
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
driver = webdriver.Firefox(executable_path=r'C:\Utility\BrowserDrivers\geckodriver.exe')
driver.get("https://login.aliexpress.com/")
WebDriverWait(driver, 10).until(EC.frame_to_be_available_and_switch_to_it((By.CSS_SELECTOR,"iframe#alibaba-login-box[src^='https://passport.aliexpress.com/mini_login.htm?']")))
WebDriverWait(driver, 20).until(EC.element_to_be_clickable((By.CSS_SELECTOR, "input.fm-text#fm-login-id"))).send_keys("test_id")
driver.find_element_by_css_selector("input.fm-text#fm-login-password").send_keys("test_pass")
driver.find_element_by_css_selector("input.fm-button#fm-login-submit").click()
Interim Broswer Snapshot:
Note : You have to add the following imports :
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
Reference
You can find a relevant discussion in
Ways to deal with #document under iframe
I have a page which loads dynamic content with ajax and then redirects after a certain amount of time (not fixed). How can I force Selenium Webdriver to wait for the page to redirect then go to a different link immediately after?
import time
from selenium import webdriver
from selenium.webdriver.common.keys import Keys
from selenium.webdriver.chrome.options import Options
from selenium.webdriver.support import expected_conditions as EC
from selenium.common.exceptions import TimeoutException
from selenium.webdriver.support.ui import WebDriverWait
driver = webdriver.Chrome();
driver.get("http://www.website.com/wait.php")
You can create a custom Expected Condition to wait for the URL to change:
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
wait = WebDriverWait(driver, 10)
wait.until(lambda driver: driver.current_url != "http://www.website.com/wait.php")
The Expected Condition is basically a callable - you can wrap it into a class overwriting the __call__() magic method as the built-in conditions are implemented.
There are several ExpectedConditions that can be used along with ExplicitWait to handle page redirection:
from selenium.webdriver.support import expected_conditions as EC
from selenium import webdriver
from selenium.webdriver.support.ui import WebDriverWait
driver = webdriver.Chrome()
wait = WebDriverWait(driver, 10)
Wait for new page title with title_is(title_name):
wait.until(EC.title_is('New page Title'))
Wait for current URL to be changed to any other URL with url_changes(exact_url):
wait.until(EC.url_changes('https://current_page.com'))
Wait for navigating to exact URL with url_to_be(exact_url):
wait.until(EC.url_to_be('https://new_page.com'))
Also
url_contains(partial_url) could be used to wait for URL that contains specified substring or url_matches(pattern) - to wait for URL matched by passed regex pattern or title_contains(substring) - to wait for new title that contains specified substring
It's a good practice to "wait" for unique (for new page) element to appear.
You may use module expected_conditions.
For example, you could wait for user logo on signing in:
from selenium import webdriver
from selenium.common.exceptions import TimeoutException
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.common.by import By
driver = webdriver.Firefox()
driver.get('http://yourapp.url')
timeout = 5
try:
logo_present = EC.presence_of_element_located((By.ID, 'logo_id'))
WebDriverWait(driver, timeout).until(logo_present)
except TimeoutException:
print "Timed out waiting for page to load"