I know this type of question has been answered many times before, but none of the answers helped me. I don't know much about this, so I need your help!
My problem:
I am scraping a website that requires a CAPTCHA for every search. I use Firefox because it asks for the CAPTCHA once and doesn't change it. My code asks the user for the CAPTCHA once, clicks the search button, and tries to scrape the data. But when it clicks the search button again (it is in a loop), it raises this error:
selenium.common.exceptions.StaleElementReferenceException:
Message: The element reference of <input id="txt_ALPHA_NUMERIC" class="ui-inputfield ui-inputtext ui-widget ui-state-default ui-corner-all" name="txt_ALPHA_NUMERIC" type="text"> is stale;
either the element is no longer attached to the DOM, it is not in the current frame context, or the document has been refreshed
My old code:
from selenium import webdriver # Import module
from selenium.webdriver.common.by import By
from selenium.webdriver.common.keys import Keys # For keyboard keys
import time
import pandas as pd
URL = 'https://vahan.nic.in/nrservices/faces/user/searchstatus.xhtml' # Define URL
browser = webdriver.Firefox(executable_path=r'C:\Users\intel\Downloads\Setups\geckodriver.exe')
browser.get(URL)
vehicle_no = browser.find_element_by_xpath("""//*[@id="regn_no1_exact"]""")
vehicle_no.send_keys('RJ14CX3238')
captcha_input = input("enter your captcha ")
captcha = browser.find_element_by_xpath("""//*[@id="txt_ALPHA_NUMERIC"]""")
captcha.send_keys(captcha_input)
button_click = browser.find_element_by_xpath("/html/body/form/div[1]/div[3]/div/div[2]/div/div/div[2]/div[5]/div/button/span").click()
i = 111
attempt = 1
max_attempts = 2
while True:
    i = i + 1
    time.sleep(4)
    reg_no = browser.find_element_by_xpath("/html/body/form/div[1]/div[3]/div/div[2]/div/div/div[2]/div[6]/div/div/div/table/tbody/tr[2]/td[2]/span").text
    date = browser.find_element_by_xpath("/html/body/form/div[1]/div[3]/div/div[2]/div/div/div[2]/div[6]/div/div/div/table/tbody/tr[2]/td[4]").text
    vehicle_no = browser.find_element_by_xpath("""//*[@id="regn_no1_exact"]""")
    vehicle_no.send_keys('RJ14CX3' + str(i))
    captcha.send_keys(captcha_input)
    button_click = browser.find_element_by_xpath("/html/body/form/div[1]/div[3]/div/div[2]/div/div/div[2]/div[5]/div/button/span").click()
    browser.execute_script("return arguments[0].scrollIntoView(true);", button_click)
Updated new code now:
from selenium import webdriver # Import module
from selenium.webdriver.common.by import By
from selenium.webdriver.common.keys import Keys # For keyboard keys
import time
import pandas as pd
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
URL = 'https://vahan.nic.in/nrservices/faces/user/searchstatus.xhtml' # Define URL
browser = webdriver.Firefox(executable_path=r'C:\Users\intel\Downloads\Setups\geckodriver.exe')
browser.get(URL)
vehicle_no = browser.find_element_by_xpath("""//*[@id="regn_no1_exact"]""")
vehicle_no.send_keys('RJ14CX3238')
captcha_input = input("enter your captcha ")
captcha = browser.find_element_by_xpath("""//*[@id="txt_ALPHA_NUMERIC"]""")
captcha.send_keys(captcha_input)
button_click = browser.find_element_by_xpath("/html/body/form/div[1]/div[3]/div/div[2]/div/div/div[2]/div[5]/div/button/span").click()
i = 111
while True:
    button_click = browser.find_element_by_xpath("/html/body/form/div[1]/div[3]/div/div[2]/div/div/div[2]/div[5]/div/button/span")
    WebDriverWait(browser, 10).until_not(EC.visibility_of_element_located((By.ID, "overley")))
    browser.execute_script("return arguments[0].scrollIntoView(true);", button_click)
    i = i + 1
    #reg_no = browser.find_element_by_xpath("/html/body/form/div[1]/div[3]/div/div[2]/div/div/div[2]/div[6]/div/div/div/table/tbody/tr[2]/td[2]/span").text
    #date = browser.find_element_by_xpath("/html/body/form/div[1]/div[3]/div/div[2]/div/div/div[2]/div[6]/div/div/div/table/tbody/tr[2]/td[4]").text
    time.sleep(5)
    vehicle_no.send_keys('RJ14CX3' + str(i))
    WebDriverWait(browser, 10).until_not(EC.visibility_of_element_located((By.ID, "overley")))
    captcha.send_keys(captcha_input)
Please also fix any other problems you see in my code. Any help would be appreciated!
Thanks in advance.
Simply re-find the elements inside the loop, each time, rather than once before the loop starts. Any time the DOM mutates, references to the nodes that were replaced become stale, and you have to look the elements up again. Interacting with the CAPTCHA mutates the DOM, so the references obtained before the loop (the captcha input named in your error message, for example) no longer point at elements attached to the page, which is what Selenium reports as "staleness".
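As a rough illustration of "re-find each time", here is a minimal sketch of a retry helper. The name retry_on and its parameters are made up for this example and are not part of Selenium. The key point is that the lookup happens inside the callable, so every attempt gets a fresh reference.

```python
import time


def retry_on(exc_type, action, attempts=3, pause=0.5):
    """Call action(); if it raises exc_type, pause and call it again.

    The action should look the element up itself, so that each retry
    works with a fresh reference instead of reusing a stale one.
    """
    for attempt in range(1, attempts + 1):
        try:
            return action()
        except exc_type:
            if attempt == attempts:
                raise  # out of retries: let the caller see the error
            time.sleep(pause)


# Intended use with Selenium (not executed here; `browser` and
# `captcha_input` are the names from the question):
#
#   from selenium.common.exceptions import StaleElementReferenceException
#   from selenium.webdriver.common.by import By
#
#   retry_on(
#       StaleElementReferenceException,
#       lambda: browser.find_element(By.ID, "txt_ALPHA_NUMERIC").send_keys(captcha_input),
#   )
```

Retrying is only a safety net. The main fix is still to call find_element inside the loop body rather than before it.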
from selenium import webdriver
from selenium.webdriver.common.keys import Keys
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.chrome.options import Options
import time
from twilio.rest import Client
from datetime import datetime
import datefinder
import os
chrome_options = Options()
chrome_options.add_experimental_option("detach", True)
url = 'https://www.recreation.gov/permits/233262/registration/detailed-availability?type=overnight-permit'
driver = webdriver.Chrome()
driver.get(url)
title = ""
text = ""
campsites = ""
waiter = webdriver.support.wait.WebDriverWait(driver, 30)
waiter.until(EC.visibility_of_element_located((By.XPATH, '//*[@id="per-availability-main"]/div[1]/div[1]/div/div/div/div/div[3]/div/div[2]/button[2]')))
element = driver.find_element(By.XPATH, '//*[@id="per-availability-main"]/div[1]/div[1]/div/div/div/div/div[3]/div/div[2]/button[2]')
element.click()
time.sleep(0.3)
element.click()
time.sleep(0.4)
waiter.until(EC.visibility_of_element_located((By.XPATH, '//*[@id="per-availability-main"]/div[1]/div[3]/div/fieldset/div/div[2]/label/span')))
element = driver.find_element(By.XPATH, '//*[@id="per-availability-main"]/div[1]/div[3]/div/fieldset/div/div[2]/label/span')
element.click()
time.sleep(0.5)
waiter.until(EC.visibility_of_element_located((By.CLASS_NAME, 'rec-grid-grid-cell available')))
elements = driver.find_elements(By.CLASS_NAME, 'rec-grid-grid-cell available')
time.sleep(4)
This code is meant to eventually compile a list of available permits for a given date so I can quickly decide which one I want. It selects 2 users and chooses "no" for the guided trip. This reveals a grid that shows the available sites. The first 2 steps work completely fine. It stops working when it tries to work with the grid.
I'm trying to locate available sites with the class name "rec-grid-grid-cell available".
I have also tried locating anything on that grid by XPATH, and it can't seem to find anything. Is there a special way to deal with grids that appear after a few clicks?
If you need more information, please ask.
Unfortunately you cannot pass multiple css class names to By.CLASS_NAME.
So you can do either:
available_cells_css = ".rec-grid-grid-cell.available"
available_cells = waiter.until(EC.visibility_of_all_elements_located((By.CSS_SELECTOR, available_cells_css)))
or
available_cells_xpath = "//div[@class='rec-grid-grid-cell available']"
available_cells = waiter.until(EC.visibility_of_all_elements_located((By.XPATH, available_cells_xpath)))
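If you have the class attribute as a single space-separated string, the CSS selector can be built from it. This is a small sketch: css_for_classes is a hypothetical helper, not a Selenium function, and it assumes the class names contain no characters that would need CSS escaping.

```python
def css_for_classes(class_attr, tag=""):
    """Turn a class attribute such as 'a b c' into the selector '.a.b.c'.

    An optional tag name is prepended, e.g. 'div.a.b.c'.
    """
    names = class_attr.split()
    if not names:
        raise ValueError("expected at least one class name")
    return tag + "".join("." + name for name in names)


selector = css_for_classes("rec-grid-grid-cell available")
print(selector)  # .rec-grid-grid-cell.available
```

The resulting string can be passed with By.CSS_SELECTOR. Unlike the exact-match XPath on the class attribute, it also matches elements that carry additional classes or list the classes in a different order.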
I am trying to use this Python code from here, with the Firefox geckodriver instead. I get an IndexError from line 43, which is log_in[0].click(). Here is the code for convenience:
# importing necessary classes
# from different modules
from lib2to3.pgen2 import driver
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.chrome.options import Options
from selenium.common.exceptions import TimeoutException
from selenium.webdriver.common.keys import Keys
import time
browser = webdriver.Firefox()
prefs = {"profile.default_content_setting_values.notifications": 2}
# open facebook.com using get() method
browser.get('https://www.facebook.com/')
# user_name or e-mail id
username = "argleblargle@gmail.com"
# getting password from text file
with open('test.txt', 'r') as myfile:
    password = myfile.read().replace('\n', '')
print("Let's Begin")
element = browser.find_elements_by_xpath('//*[@id ="email"]')
element[0].send_keys(username)
print("Username Entered")
element = browser.find_element_by_xpath('//*[@id ="pass"]')
element.send_keys(password)
print("Password Entered")
# logging in
log_in = browser.find_elements_by_id('loginbutton')
log_in[0].click()
print("Login Successful")
browser.get('https://www.facebook.com/events/birthdays/')
feed = 'Hap Borth! Hope you have an amazing day!'
element = browser.find_elements_by_xpath("//*[@class ='enter_submit\
uiTextareaNoResize uiTextareaAutogrow uiStreamInlineTextarea\
inlineReplyTextArea mentionsTextarea textInput']")
cnt = 0
for el in element:
    cnt += 1
    element_id = str(el.get_attribute('id'))
    XPATH = '//*[@id ="' + element_id + '"]'
    post_field = browser.find_element_by_xpath(XPATH)
    post_field.send_keys(feed)
    post_field.send_keys(Keys.RETURN)
    print("Birthday Wish posted for friend" + str(cnt))
# Close the browser
browser.close()
As you can see from the code, it prints out when a step is completed. It passed username entered, passed password entered, but did not pass login successful. I get an IndexError: line 43, in <module> log_in[0].click()
Is that because the login button is somewhere different from when the code was first written? Is it 2FA shenanigans? I am doing this for fun, thanks for reading.
EDIT: the original error was because of the s in find_elements_by_id. There is one element. Oops.
The error is now selenium.common.exceptions.NoSuchElementException: Message: Unable to locate element: [id="loginbutton"]
The log_in variable is empty because many websites, including Facebook, first load a bare page and only then build the layout and all the elements with JavaScript.
Your code tries to interact with Facebook before it is fully loaded and therefore cannot find the login button.
You can do something like this to resolve your problem, or just write a while loop that checks whether the button has been found.
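The "while loop that checks" idea can be sketched as a small polling helper. wait_until is a name invented for this example. Selenium's own WebDriverWait, which other answers here use, works along similar lines. The locator in the usage comment is the one from the question, and it is an assumption that the page still has an element with that id.

```python
import time


def wait_until(condition, timeout=10.0, poll=0.5):
    """Call condition() repeatedly until it returns a truthy value.

    Returns that value, or raises TimeoutError once `timeout` seconds
    have passed without success.
    """
    deadline = time.monotonic() + timeout
    while True:
        result = condition()
        if result:
            return result
        if time.monotonic() >= deadline:
            raise TimeoutError(f"condition not met within {timeout} seconds")
        time.sleep(poll)


# Intended use with Selenium (not executed here). find_elements returns an
# empty list while nothing matches, so the loop keeps polling until the
# button exists:
#
#   from selenium.webdriver.common.by import By
#
#   buttons = wait_until(lambda: browser.find_elements(By.ID, "loginbutton"))
#   buttons[0].click()
```

Compared with a fixed time.sleep, this returns as soon as the element appears and fails with a clear error if it never does.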
I am trying to click on the first result on this page, but none of the options I tried worked.
First I log in to the website with email: kocianlukyluk@gmail.com and password: Redfinpython06. Here is the code for it:
driver = webdriver.Chrome("C:\\Users\\kocia\\OneDrive\\Plocha\\Python\\nastaveni\\chromedriver.exe")
driver.get('https://www.redfin.com/myredfin/favorites')
email = 'kocianlukyluk@gmail.com'
password = 'Redfinpython06'
time.sleep(3)
driver.find_element_by_xpath(
    '//*[@id="content"]/div[6]/div/div[2]/div/div/form/span[1]/span/div/input').send_keys(email)
time.sleep(3)
driver.find_element_by_xpath(
    '//*[@id="content"]/div[6]/div/div[2]/div/div/form/span[2]/span/div/input').send_keys(password)
time.sleep(3)
sing_up = driver.find_element_by_css_selector('button[type=submit]')
sing_up.click()
But the problem is that after logging in I can't click on the first result on the page.
Here is what I tried:
result = driver.find_elements_by_xpath('//*[@id="content"]/div[10]/div/div[5]/div/div[2]/div/div')[0]
result.find_element_by_xpath('//*[@id="content"]/div[10]/div/div[5]/div/div[2]/div/div/div[1]').click()
or
result = driver.find_elements_by_xpath('//*[@id="content"]/div[10]/div/div[5]/div/div[2]/div/div')[0]
result.click()
or
result = driver.find_element_by_xpath('//*[@id="content"]/div[10]/div/div[5]/div/div[2]/div/div/div[1]')
result.click()
Thank you so much for your help.
I hope that is a dummy email and password that you are just using for testing purposes :)
The code below clicks on the first house picture in the list. I also cleaned up your email and password XPath locators; you can see how much easier it is to grab them by name.
Also, you may want to put proper wait methods around these find calls. Using sleep is generally not recommended.
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
from time import sleep
driver = webdriver.Chrome()
driver.get('https://www.redfin.com/myredfin/favorites')
email = 'kocianlukyluk@gmail.com'
password = 'Redfinpython06'
sleep(3)
# Locate the login fields by their name attribute instead of long XPaths
driver.find_element(By.NAME, 'emailInput').send_keys(email)
sleep(3)
driver.find_element(By.NAME, 'passwordInput').send_keys(password)
sleep(3)
sign_up = driver.find_element(By.CSS_SELECTOR, 'button[type=submit]')
sign_up.click()
sleep(3)
# Click the image inside the first favorites card
first_house = driver.find_element(By.XPATH, "//div[@class='FavoritesHome'][1]//img")
first_house.click()
I was trying to get the email address and click on the refresh button from the following screenshot. But I am getting errors.
My code for this is like the following:
from selenium import webdriver
url = 'http://od.obagg.com/ '
driver = webdriver.Chrome(executable_path='chromedriver')
driver.get(url)
s = driver.find_element_by_id('//*[@id="shortid"]').get_attribute('placeholder')
print(s)
Based on the inspector, I tried many ways to get that email field's value and click the refresh button, but still no luck.
Does anybody know any tricks to share?
It may be because the element is disabled. Also, find_element_by_id('//*[@id="shortid"]') is incorrect: it passes an XPath expression where an id is expected. It should be either:
find_element_by_xpath('//*[@id="shortid"]')
or
find_element_by_id("shortid")
The following works for me:
from selenium import webdriver
from selenium.webdriver.support import expected_conditions as ec
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
driver = webdriver.Chrome(executable_path='chromedriver')
driver.maximize_window()
driver.get('http://od.obagg.com/')
wait = WebDriverWait(driver, 10)
el = wait.until(ec.visibility_of_element_located((By.ID, "shortid")))
placeholder = el.get_attribute("placeholder")
email = el.get_attribute('value')
print(placeholder, email)
# 请等待分配临时邮箱 -_ylp06tc@xxx.xxx  (the placeholder means "Please wait for a temporary mailbox to be assigned")
If you need 10 different emails, you can use:
from time import sleep
for x in range(10):
    driver.find_element_by_id("refreshShortid").click()
    sleep(0.15) # you may have to increase this value to give enough time to generate the new email
    new_email = driver.find_element_by_id("shortid").get_attribute('value')
    print(new_email)
I am trying to scroll down through the comments on a Twitter status and save the page with all the comments (or at least the first 5 pages). I am using the Selenium driver for this, but the scrolling part does not work, so I have to scroll manually and then extract. I am using Python 3.6.5.
For example, this tweet: https://twitter.com/TeamYouTube/status/1012415985184206848
Can anyone help me with the code?
My code:
from selenium import webdriver
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.common.keys import Keys
import time
driver = webdriver.Chrome(executable_path="...../chromedriver")
driver.get('https://twitter.com/TeamYouTube/status/1012415985184206848')
for i in range(1,10):
    driver.execute_script("window.scrollTo(0, document.body.scrollHeight);")
    time.sleep(3)
ip = input("Enter y to proceed: ")
if(ip == 'y'):
    page = driver.page_source
    filename = input('Enter file name : ')
    path = 'D:/page_'+filename+'.html'
    f = open(path,'w',encoding='utf-8')
    f.write(page)
    f.close()
driver.close()
Try this:
driver.execute_script("arguments[0].scrollTo(0, arguments[0].scrollHeight);", driver.find_element_by_id("permalink-overlay-dialog"))
Explanation: you have to scroll a particular div, not the window. To do that, find this element on the page and scroll only that element to its end.
Second suggestion is to use:
from selenium.webdriver.common.keys import Keys
# locate element and simulate 'END' button press
driver.find_element_by_id("permalink-overlay-dialog").send_keys(Keys.END)
If that doesn't work, try extending it with ActionChains:
from selenium.webdriver.common.action_chains import ActionChains
element = driver.find_element_by_id("permalink-overlay-dialog")
action = ActionChains(driver)
action.move_to_element(element).perform()
element.send_keys(Keys.END)
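Since the comments load lazily, one scroll is usually not enough. A common pattern is to keep scrolling the container until its height stops growing. The following is a sketch under assumptions: scroll_until_stable is a made-up helper name, and it relies only on the driver having an execute_script method, as a Selenium WebDriver does. Whether Twitter still uses the permalink-overlay-dialog element is not something this sketch can confirm.

```python
import time

# JavaScript that scrolls the given element to its bottom and reports
# its current scrollHeight back to Python.
SCROLL_TO_END = (
    "arguments[0].scrollTop = arguments[0].scrollHeight;"
    "return arguments[0].scrollHeight;"
)


def scroll_until_stable(driver, element, pause=2.0, max_rounds=20):
    """Scroll `element` to its end until its height stops growing.

    Returns the number of scroll calls made. Stops early when two
    consecutive calls report the same height, or after max_rounds.
    """
    last_height = -1
    for rounds in range(1, max_rounds + 1):
        height = driver.execute_script(SCROLL_TO_END, element)
        if height == last_height:
            return rounds  # nothing new was loaded since the last scroll
        last_height = height
        time.sleep(pause)  # give the page time to load more comments
    return max_rounds


# Intended use with Selenium (not executed here):
#
#   from selenium.webdriver.common.by import By
#
#   dialog = driver.find_element(By.ID, "permalink-overlay-dialog")
#   scroll_until_stable(driver, dialog, pause=3)
#   page = driver.page_source
```

The pause has to be long enough for new content to arrive; if it is too short, the height looks stable and the loop stops too early.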