How can I make Selenium (Python) wait for a second until AngularJS finishes parsing the page and loads what it needs?
Or how can I make Selenium wait for 1 second after a button click that triggers an Ajax request to the server, handled by AngularJS? I need the server-side actions to take place before navigating to another page.
If some asynchronous behavior in your application is important enough that a real user should wait for it, then you should tell them. Similarly, your script can wait for that same indication before proceeding.
For example, if a user clicks a button that triggers an API call to create a record, and the user needs to wait for that record to be created, you should show them a message indicating when it completes successfully, e.g., "Record created successfully." Your script can then wait for that same text to appear, just as a user would.
Importantly, it shouldn't matter how your application is implemented. What matters is that your users can use your application—not that it calls certain AngularJS APIs or React APIs, etc.
1. Using Selenium
Selenium includes WebDriverWait and the expected_conditions module to help you wait for particular conditions to be met:
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.support.ui import WebDriverWait
TIMEOUT = 5
# ...
WebDriverWait(driver, TIMEOUT).until(
    EC.text_to_be_present_in_element(
        (By.CLASS_NAME, "alert"),
        "Record created successfully"))
2. Using Capybara (which uses Selenium)
As you can see above, bare Selenium is complicated and finicky. capybara-py abstracts most of it away:
from capybara.dsl import page
# ...
page.assert_text("Record created successfully")
Access the AngularJS scope through Selenium; most likely that state is already held in the Scope/IsolatedScope.
I built a few extensions (in C#) to help with this that can be translated to Python.
webDriver.NgWaitFor(productDiv, "scope.Data.Id != 0");
webDriver.NgWaitFor(partialElement, "scope.IsBusyLoadingTemplate == false");
https://github.com/leblancmeneses/RobustHaven.IntegrationTests/blob/master/NgExtensions/NgWebDriverExtensions.cs
To deal with Ajax requests when you're working with both AngularJS $http and jQuery, I use:
webDriver.WaitFor("window.isBrowserBusy() == false");
This requires you to set up interceptors in both AngularJS and jQuery that keep a count of the in-flight XHR requests.
Here is the framework we are using in our project: (you might want to extract more pieces from it)
https://github.com/leblancmeneses/RobustHaven.IntegrationTests
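The C# helpers above are not available in Python, but the same idea can be sketched there. The following is a minimal sketch, not a port of the linked library: `wait_for_js` is a hypothetical helper name, and it assumes only that the driver exposes Selenium's `execute_script` method. `window.isBrowserBusy` is the page-side function mentioned above and has to be defined by your own interceptors.

```python
import time


def wait_for_js(driver, expression, timeout=10.0, poll=0.25):
    """Poll a JavaScript expression in the browser until it is truthy.

    `driver` is any object with Selenium's `execute_script(script)` method.
    Raises TimeoutError if the expression is still falsy after `timeout` seconds.
    """
    deadline = time.monotonic() + timeout
    while True:
        try:
            if driver.execute_script(f"return ({expression});"):
                return
        except Exception:
            # The page may still be loading (e.g. the function is not defined
            # yet). Real code would catch selenium's WebDriverException here.
            pass
        if time.monotonic() >= deadline:
            raise TimeoutError(f"condition not met within {timeout}s: {expression}")
        time.sleep(poll)
```

A call would then look like `wait_for_js(driver, "window.isBrowserBusy() == false")`.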
I had the same problem; this is the way I solved it:
from datetime import datetime, timedelta
from time import sleep
from selenium import webdriver
from selenium.common.exceptions import WebDriverException
class MyDriver(webdriver.Chrome):
    def __init__(self, executable_path="chromedriver", port=0,
                 chrome_options=None, service_args=None,
                 desired_capabilities=None, service_log_path=None):
        super().__init__(executable_path, port, chrome_options, service_args,
                         desired_capabilities, service_log_path)

    def wait_until_angular(self, seconds: int = 10) -> None:
        # True once AngularJS has no pending $http requests
        java_script_to_load_angular = "var injector = window.angular.element('body').injector(); " \
                                      "var $http = injector.get('$http');" \
                                      "return ($http.pendingRequests.length === 0);"
        end_time = datetime.utcnow() + timedelta(seconds=seconds)
        print("wait for Angular Elements....")
        while datetime.utcnow() < end_time:
            try:
                if self.execute_script(java_script_to_load_angular):
                    return
            except WebDriverException:
                # Angular may not be loaded yet; retry after the pause below
                pass
            sleep(1)
        raise TimeoutError("waiting for angular elements for too long")
It worked for me
Hope this helps you!!!
I have a website I want to crawl. To access the search results, you must first solve a Recaptcha V2 with a callback function (see screenshot below)
Recaptcha V2 with a callback function
I am using a dedicated captcha solver called 2captcha. The service provides me with a token, which I then plug into the callback function to bypass the captcha. I found the callback function using the code in this GitHub Gist and I am able to invoke the function successfully in the Console of Chrome Dev Tools
The function can be invoked by typing either of these two commands:
window[___grecaptcha_cfg.clients[0].o.o.callback]('captcha_token')
or
verifyAkReCaptcha('captcha_token')
However, when I invoke these functions using the driver.execute_script() method in Python Selenium, I get an error. I also tried executing other standard JavaScript functions with this method (e.g., scrolling down a page), and I keep getting errors. It's likely because the domain I am trying to crawl prevents me from executing any JavaScript with automation tools.
So, my question is, how can I invoke the callback function after I obtain the token from the 2captcha service? Would appreciate all the help I could get. Thank you in advance to hero(in) who will know his/her way around this tough captcha. Cheers!!
Some extra info to help with my question:
Automation framework used --> Python Selenium or scrapy. Both are fine by me
Error messages -->
Error message 1 and Error message 2
Code
from selenium import webdriver
from selenium.webdriver.chrome.options import Options
from twocaptcha import TwoCaptcha
from dotenv import load_dotenv
import os
# Load environment variables
load_dotenv()
# Instantiate a solver object
solver = TwoCaptcha(os.getenv("CAPTCHA_API_KEY"))
sitekey = "6Lfwdy4UAAAAAGDE3YfNHIT98j8R1BW1yIn7j8Ka"
url = "https://suchen.mobile.de/fahrzeuge/search.html?dam=0&isSearchRequest=true&ms=8600%3B51%3B%3B&ref=quickSearch&sb=rel&vc=Car"
# Set chrome options
chrome_options = Options()
chrome_options.add_argument('start-maximized') # Required for a maximized Viewport
chrome_options.add_experimental_option('excludeSwitches', ['enable-logging', 'enable-automation'])
chrome_options.add_experimental_option("detach", True)
chrome_options.add_experimental_option('prefs', {'intl.accept_languages': 'en,en_US'})
# Instantiate a browser object and navigate to the URL
driver = webdriver.Chrome(chrome_options=chrome_options)
driver.get(url)
driver.maximize_window()
def solve(sitekey, url):
    try:
        result = solver.recaptcha(sitekey=sitekey, url=url)
    except Exception as e:
        exit(e)
    return result.get('code')
captcha_key = solve(sitekey=sitekey, url=url)
print(captcha_key)
# driver.execute_script(f"window[___grecaptcha_cfg.clients[0].o.o.callback]('{captcha_key}')") # This step fails in Python but runs successfully in the console
# driver.execute_script(f"verifyAkReCaptcha('{captcha_key}')") # This step fails in Python but runs successfully in the console
To solve the captcha we can use pyautogui. To install the package, run pip install pyautogui. With it we can interact with whatever appears on the screen, which means that the browser window must be visible while the Python script runs. This is a big drawback compared with other methods, but on the other hand it is very reliable.
In our case we need to click on this box to solve the captcha, so we will tell pyautogui to locate this box on the screen and then click on it.
So save the image on your computer and call it box.png. Then run this code (replace ... with your missing code).
import pyautogui
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
...
driver.get(url)
driver.maximize_window()
# html of the captcha is inside an iframe, selenium cannot see it if we first don't switch to the iframe
WebDriverWait(driver, 9).until(EC.frame_to_be_available_and_switch_to_it((By.ID, "sec-cpt-if")))
# wait until the captcha is visible on the screen
WebDriverWait(driver, 9).until(EC.visibility_of_element_located((By.CSS_SELECTOR, '#g-recaptcha')))
# find captcha on page
checkbox = pyautogui.locateOnScreen('box.png')
if checkbox:
    # compute the coordinates (x,y) of the center
    center_coords = pyautogui.center(checkbox)
    pyautogui.click(center_coords)
else:
    print('Captcha not found on screen')
Based on @sound wave's answer, I was able to invoke the callback function and bypass the captcha without pyautogui. The key was to switch to the captcha's frame using the frame_to_be_available_and_switch_to_it method. Thanks a mil to @sound wave for the amazing hint.
Here's the full code for anyone who's interested. Keep in mind that you will need a 2captcha API key for it to work.
The thing that I am still trying to figure out is how to operate this script in headless mode because the WebDriverWait object needs Selenium to be in non-headless mode to switch to the captcha frame. If anyone knows how to switch to the captcha frame while working with Selenium in headless mode, please share your knowledge :)
from selenium import webdriver
from selenium.webdriver.chrome.options import Options
from twocaptcha import TwoCaptcha
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
from dotenv import load_dotenv
import os
import time
# Load environment variables
load_dotenv()
# Instantiate a solver object
solver = TwoCaptcha(os.getenv("CAPTCHA_API_KEY"))
sitekey = "6Lfwdy4UAAAAAGDE3YfNHIT98j8R1BW1yIn7j8Ka"
url = "https://suchen.mobile.de/fahrzeuge/search.html?dam=0&isSearchRequest=true&ms=8600%3B51%3B%3B&ref=quickSearch&sb=rel&vc=Car"
# Set chrome options
chrome_options = Options()
chrome_options.add_argument('start-maximized') # Required for a maximized Viewport
chrome_options.add_experimental_option('excludeSwitches', ['enable-logging', 'enable-automation'])
chrome_options.add_experimental_option("detach", True)
chrome_options.add_experimental_option('prefs', {'intl.accept_languages': 'en,en_US'})
# Instantiate a browser object and navigate to the URL
driver = webdriver.Chrome(chrome_options=chrome_options)
driver.get(url)
driver.maximize_window()
# Solve the captcha using the 2captcha service
def solve(sitekey, url):
    try:
        result = solver.recaptcha(sitekey=sitekey, url=url)
    except Exception as e:
        exit(e)
    return result.get('code')
captcha_key = solve(sitekey=sitekey, url=url)
print(captcha_key)
# html of the captcha is inside an iframe, selenium cannot see it if we first don't switch to the iframe
WebDriverWait(driver, 9).until(EC.frame_to_be_available_and_switch_to_it((By.ID, "sec-cpt-if")))
# Inject the token into the inner HTML of g-recaptcha-response and invoke the callback function
driver.execute_script(f'document.getElementById("g-recaptcha-response").innerHTML="{captcha_key}"')
driver.execute_script(f"verifyAkReCaptcha('{captcha_key}')") # Works from Python once the driver has switched to the captcha frame
# Wait for 3 seconds until the "Accept Cookies" window appears. Can also do that with WebDriverWait.until(EC)
time.sleep(3)
# Click on "Einverstanden"
driver.find_element(by=By.XPATH, value="//button[@class='sc-bczRLJ iBneUr mde-consent-accept-btn']").click()
# Wait for 0.5 seconds until the page is loaded
time.sleep(0.5)
# Print the top title of the page
print(driver.find_element(by=By.XPATH, value="//h1[@data-testid='result-list-headline']").text)
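The two `execute_script` calls above build their JavaScript with f-strings, which breaks if the token ever contains a quote character. A minimal sketch of an alternative, assuming the driver has already switched into the captcha iframe: Selenium's `execute_script` accepts extra positional arguments and exposes them to the script as `arguments[0]`, `arguments[1]`, and so on. `submit_captcha_token` and `INJECT_TOKEN_JS` are hypothetical names; `verifyAkReCaptcha` and `g-recaptcha-response` come from the page in question.

```python
# JavaScript run inside the captcha iframe. `arguments[0]` is the token passed
# from Python, so no quoting or escaping of the token is needed.
INJECT_TOKEN_JS = """
var field = document.getElementById('g-recaptcha-response');
if (field) { field.innerHTML = arguments[0]; }
verifyAkReCaptcha(arguments[0]);
"""


def submit_captcha_token(driver, token):
    """Write the token into the response field and invoke the page's callback."""
    driver.execute_script(INJECT_TOKEN_JS, token)
```

It would be called as `submit_captcha_token(driver, captcha_key)` in place of the two f-string calls.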
I have created a browser scraping script that sends a message on WhatsApp Web using Selenium in Python, but yesterday I noticed that it is sending only half of the message, or not sending messages at all. I debugged it and found that the browser window must be active for messages to be sent. My send-message code is below.
def send_message(msg):
    whatsapp_msg = driver.find_element_by_class_name(send_messageClass)
    for part in msg.split('\n'):
        whatsapp_msg.send_keys(part)
        ActionChains(driver).key_down(Keys.SHIFT).key_down(Keys.ENTER).key_up(Keys.SHIFT).key_up(Keys.ENTER).perform()
        time.sleep(1)
    ActionChains(driver).send_keys(Keys.RETURN).perform()
    time.sleep(1)
find_element_by_class_name simply retrieves the element from the DOM. It does not ensure that the element is visible.
For this, use an explicit wait with visibility of the element as the expected condition:
selenium.webdriver.support.expected_conditions.visibility_of(element)
This will wait for the element to be visible until the specified timeout is reached. Here is an example with timeout of 60 seconds:
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EXP_CON
...
wait = WebDriverWait(driver, 60)
whatsapp_msg = driver.find_element_by_class_name(send_messageClass)
visible_whatsapp_msg = wait.until(EXP_CON.visibility_of(whatsapp_msg))
I have a problem automating the PayPal sandbox with Selenium in Python.
Generally, I wrote explicit waits for each action method, like send_keys() or click() on a button, but they just don't work. I tried almost all of the explicit waits that are available.
I tried to adapt a method that executes JavaScript to wait until the Angular scripts are fully loaded, but it doesn't work at all, because this app is based on Angular v1.
For example:
while self.context.browser.execute_script(
        "return angular.element(document).injector().get('$http').pendingRequests.length === 0"):
    sleep(0.5)
The only thing that works is a static Python sleep, which is totally inappropriate! When I add a 2-second sleep before the first action on each page, the test passes without any problems. But when I try to replace the sleep with, for example, WebDriverWait(self.context.browser, timeout=15).until(EC.visibility_of_all_elements_located), the test stops once all elements are visible on the page.
Can someone help with this?
My code, with sleeps between the page objects:
context.pages.base_page.asert_if_url_contain_text("sandbox.paypal.com")
context.pages.paypal_login_page.login_to_pp_as(**testPP)
sleep(2)
context.pages.choose_payment_page.pp_payment_method("paypal")
sleep(2)
context.pages.pay_now_page.click_pay_now()
sleep(2)
context.pages.finish_payment_page.click_return_to_seller()
sleep(5)
context.pages.base_page.open()
Example method with an explicit wait:
def click_pay_now(self):
    WebDriverWait(self.context.browser, timeout=15).until(EC.visibility_of_all_elements_located)
    self.pay_now_button.is_element_visible()
    self.pay_now_button.click()
visibility_of_all_elements_located() returns a list; instead, you need to use visibility_of_element_located(), which returns a WebElement.
Ideally, if your use case is to invoke click(), you need to induce WebDriverWait for element_to_be_clickable(), and you can use either of the following Locator Strategies:
Using CSS_SELECTOR:
WebDriverWait(self.context.browser, timeout=15).until(EC.element_to_be_clickable((By.CSS_SELECTOR, "element_css"))).click()
Using XPATH:
WebDriverWait(self.context.browser, timeout=15).until(EC.element_to_be_clickable((By.XPATH, "element_xpath"))).click()
Note : You have to add the following imports :
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
Reference
You can find a relevant detailed discussion in:
Do we have any generic function to check if page has completely loaded in Selenium
Selenium waits sometimes do not work well with AngularJS- or ReactJS-based apps, which is why Protractor is the best tool for such apps. Still, you can try the solution below; it may work because it is based on JavaScript.
This function checks whether the page has fully loaded:
def page_has_loaded():
    page_state = browser.execute_script(
        'return document.readyState;'
    )
    return page_state == 'complete'
Then combine the wait with a very short sleep, so that it returns as soon as the page has loaded:
def wait_for(condition_function):
    start_time = time.time()
    while time.time() < start_time + 2:
        if condition_function():
            return True
        else:
            time.sleep(0.1)
    raise Exception(
        'Timeout waiting for {}'.format(condition_function.__name__)
    )
And you can call it as mentioned below:
wait_for(page_has_loaded)
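`document.readyState` only reports that the initial document has loaded; it says nothing about Ajax calls an AngularJS app makes afterwards. A possible condition function that also checks AngularJS's pending `$http` requests is sketched below. It rests on assumptions: `angular_page_ready` is a hypothetical name, it takes the driver as a parameter instead of using the global `browser`, and it assumes the app's injector is reachable from `document` (as in the question's snippet).

```python
PAGE_READY_JS = """
if (document.readyState !== 'complete') { return false; }
if (!window.angular) { return true; }  // not an AngularJS page
var injector = window.angular.element(document).injector();
if (!injector) { return false; }       // app not bootstrapped (yet)
return injector.get('$http').pendingRequests.length === 0;
"""


def angular_page_ready(driver):
    """Return True once the document is loaded and no $http request is pending."""
    return bool(driver.execute_script(PAGE_READY_JS))
```

With the `wait_for` helper above, this could be called as `wait_for(lambda: angular_page_ready(browser))`.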
I've written a script in Python with Selenium to click on each of the signs available on a map. However, when I execute my script, it throws a timeout exception upon reaching the line wait.until(EC.staleness_of(item)).
Before reaching that line, the script should have clicked once, but it did not. How can I click on all the signs on that map one after another?
This is the site link.
This is my code so far (perhaps, I'm trying with the wrong selectors):
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
link = "https://www.findapetwash.com/"
driver = webdriver.Chrome()
driver.get(link)
wait = WebDriverWait(driver, 15)
for item in wait.until(EC.visibility_of_all_elements_located((By.CSS_SELECTOR, "#map .gm-style"))):
    item.click()
    wait.until(EC.staleness_of(item))
driver.quit()
Signs visible on that map like:
Post script: I know that this is their API https://www.findapetwash.com/api/locations/getAll/ using which I can get the JSON content but I would like to stick to the Selenium way. Thanks.
I know you wrote that you don't want to use the API, but using Selenium to get the locations from the map markers seems like overkill. Instead, why not call their web service using requests and parse the returned JSON?
Here is a working script:
import requests
import json
api_url='https://www.findapetwash.com/api/locations/getAll/'
class Location:
    def __init__(self, json):
        self.id=json['id']
        self.user_id=json['user_id']
        self.name=json['name']
        self.address=json['address']
        self.zipcode=json['zipcode']
        self.lat=json['lat']
        self.lng=json['lng']
        self.price_range=json['price_range']
        self.photo='https://www.findapetwash.com' + json['photo']

def get_locations():
    locations = []
    response = requests.get(api_url)
    if response.ok:
        result_json = json.loads(response.text)
        for location_json in result_json['locations']:
            locations.append(Location(location_json))
        return locations
    else:
        print('Error loading locations')
        return False

if __name__ == '__main__':
    locations = get_locations()
    for l in locations:
        print(l.name)
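Separating the parsing from the HTTP call makes it possible to check the parsing without hitting the site. A minimal sketch under assumptions: the field names are the ones used in the script above, the value types are not known (so they are left as `Any`), and the sample payload is invented purely for illustration; it is not real data from the API. `parse_locations` and `SITE_URL` are hypothetical names.

```python
from dataclasses import dataclass
from typing import Any, Dict, List

SITE_URL = "https://www.findapetwash.com"


@dataclass
class Location:
    id: Any
    name: str
    address: str
    zipcode: str
    lat: Any
    lng: Any
    photo_url: str


def parse_locations(payload: Dict[str, Any]) -> List[Location]:
    """Build Location objects from the decoded JSON payload."""
    return [
        Location(
            id=item["id"],
            name=item["name"],
            address=item["address"],
            zipcode=item["zipcode"],
            lat=item["lat"],
            lng=item["lng"],
            photo_url=SITE_URL + item["photo"],
        )
        for item in payload.get("locations", [])
    ]


# Made-up sample payload, only to show the expected shape.
sample = {
    "locations": [
        {"id": 1, "name": "Example Pet Wash", "address": "1 Example St",
         "zipcode": "00000", "lat": "0.0", "lng": "0.0",
         "photo": "/img/example.jpg"},
    ]
}
for location in parse_locations(sample):
    print(location.name)
```

In the script above, `get_locations` could then return `parse_locations(response.json())` when the response is OK.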
Selenium
If you still want to go the Selenium way, instead of waiting until all the elements are loaded, you could simply pause the script for some seconds, or even a minute, to make sure everything has loaded. This should fix the timeout exception:
import time
driver.get(link)
# Wait 20 seconds
time.sleep(20)
For other possible workarounds, see the accepted answer here: Make Selenium wait 10 seconds
You can click them one by one using Selenium if, for some reason, you cannot use the API. It is also possible to extract the information for each sign with Selenium without clicking on it.
Here is code to click them one by one:
signs = wait.until(EC.presence_of_all_elements_located((By.CSS_SELECTOR, "li.marker.marker--list")))
for sign in signs:
    driver.execute_script("arguments[0].click();", sign)
    #do something
Also try it without the wait; it will probably work.
So, I have a web application at work from which I need to gather information to build some reports and run some basic data analysis.
The thing is that I'm a complete newbie to HTML, Ajax (Asynchronous JavaScript and XML), Python and Selenium.
What I have gathered so far is this:
Ajax by nature performs asynchronous web browser activities; in my case, sending server requests to push/pull some data
Selenium handles asynchronous events by performing wait actions such as:
time.sleep(seconds) # 'seconds' is the time to wait; this uses the time library, so not REALLY Selenium;
Explicit Waits: you define to wait for a certain condition to occur before proceeding further in the code;
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
delay_time = 10  # seconds to wait before Selenium raises TimeoutException
driver = webdriver.Firefox()
driver.get("http://somedomain/url_that_delays_loading")
WebDriverWait(driver, delay_time)\
    .until(EC.presence_of_element_located((By.ID, 'IdOfMyElement')))
EC stands for expected conditions represented by:
title_is
title_contains
presence_of_element_located
visibility_of_element_located
visibility_of
presence_of_all_elements_located
text_to_be_present_in_element
text_to_be_present_in_element_value
frame_to_be_available_and_switch_to_it
invisibility_of_element_located
element_to_be_clickable
staleness_of
element_to_be_selected
element_located_to_be_selected
element_selection_state_to_be
element_located_selection_state_to_be
alert_is_present
Implicit Waits: tell WebDriver to poll the DOM (Document Object Model) for a certain amount of time when trying to find an element or elements if they are not immediately available;
driver.implicitly_wait(10)
Executing JavaScript and applying a wait: jQuery keeps a count of how many Ajax calls are active in its jQuery.active variable;
FluentWait: FluentWait option to handle uncertain waits;
WebdriverWait: use ExpectedCondition and WebDriverWait strategy.
What to use since I have the following situation:
Button to send a clear request via Ajax.
<div id="div_39_1_3" class="Button CoachView CPP BPMHSectionChild CoachView_show" data-type="com.ibm.bpm.coach.Snapshot_b24acf10_7ca3_40fa_b73f_782cddfd48e6.Button" data-binding="local.clearButton" data-bindingtype="boolean" data-config="config175" data-viewid="GhostClear" data-eventid="boundaryEvent_42" data-ibmbpm-layoutpreview="horizontal" control-name="/GhostClear">
<button class="btn btn-labeled"><span class="btn-label icon fa fa-times"></span>Clear</button></div>
This is the event of the button:
function(a) {!e._instance.btn.disabled &&
c.ui.executeEventHandlingFunction(e, e._proto.EVT_ONCLICK) &&
(e._instance.multiClicks || (e._instance.btn.disabled = !0,
f.add(e._instance.btn, "disabled")), e.context.binding &&
e.context.binding.set("value", !0), e.context.trigger(function(a) {
e._instance.btn.disabled = !1;
f.remove(e._instance.btn, "disabled");
setTimeout(function() {
c.ui.executeEventHandlingFunction(e, e._proto.EVT_ONBOUNDARYEVT,
a.status)
})
}, {
callBackForAll: !0
}))
}
Then, my network informs that the ajaxCoach proceeds to the following requests
Is it possible for Selenium, in Python, to detect whether an Ajax action has finished updating the page?
If you have jQuery on the page, you can select the button with jQuery and wait until its event handler is attached.
For your question:
driver.execute_script('button = $("#div_39_1_3");')
events = driver.execute_script('return $._data(button[0], "events");')
Now you need to wait until the events variable is not None.
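The waiting step itself is not shown above. A minimal sketch of it, assuming jQuery is loaded on the page and that the handler is bound through jQuery: handlers attached with plain `addEventListener` will not show up in `$._data`, which is an internal jQuery function and may change between versions. `wait_for_events` and `HAS_EVENTS_JS` are hypothetical names.

```python
import time

# Returns true once jQuery has event data stored for the first element that
# matches the selector in arguments[0].
HAS_EVENTS_JS = """
if (!window.jQuery) { return false; }
var el = window.jQuery(arguments[0])[0];
if (!el) { return false; }
return !!window.jQuery._data(el, 'events');
"""


def wait_for_events(driver, selector, timeout=10.0, poll=0.25):
    """Block until jQuery reports event handlers on `selector`, or time out."""
    deadline = time.monotonic() + timeout
    while not driver.execute_script(HAS_EVENTS_JS, selector):
        if time.monotonic() >= deadline:
            raise TimeoutError(f"no jQuery events on {selector!r} after {timeout}s")
        time.sleep(poll)
```

For the button in the question this would be `wait_for_events(driver, "#div_39_1_3")`.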