Asynchronous refresh with Selenium + Python

I'm currently working on a project where I need Selenium to refresh two tabs at an exact time, and I don't want to wait for the site to finish loading. I have tried every method described in multiple posts and still can't get it right. The snippet below does nothing, and other methods such as browser.refresh() seem to be synchronous. browser.execute_script("location.reload();") is also erratic: sometimes it behaves synchronously and sometimes it does not. The goal is to reload multiple tabs and then check whether a button is present; if it is, it should be clicked.
import os
from selenium import webdriver
from selenium.webdriver.chrome.options import Options
from selenium.webdriver.common.action_chains import ActionChains
from selenium.webdriver.common.keys import Keys
import time
import datetime
browser = webdriver.Chrome()
browser.get('https://google.com')
time.sleep(10)
browser.find_element_by_css_selector("body").send_keys(Keys.F5)
I've also thought about executing something like the following directly in the browser, but that seems to be impossible. Is it?
function ready(callback){
    // in case the document is already rendered
    if (document.readyState!='loading') callback();
    // modern browsers
    else if (document.addEventListener) document.addEventListener('DOMContentLoaded', callback);
    // IE <= 8
    else document.attachEvent('onreadystatechange', function(){
        if (document.readyState=='complete') callback();
    });
}
ready(function(){
    console.log("DOM fully loaded and parsed");
    var result = document.evaluate("//*[text()='5544']/../../td[@class='action']/form/input[@type='submit' and not(@disabled)]", document, null, XPathResult.ANY_TYPE, null);
    result.click();
});
UPDATE:
I figured out that:
browser.get('https://google.com')
time.sleep(5)
username = browser.find_element_by_xpath("//input[@type='text']")
username.click()
username.clear()
username.send_keys(Keys.ENTER)
works just fine. But Keys.F5 does not refresh the page. Is this a bug?
What I mainly want to know is an asynchronous workaround for this send_keys(Keys.F5) operation.
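One possible workaround, sketched here under a few assumptions, is to skip the F5 key press and let each tab reload itself from a JavaScript timer, so that the WebDriver call returns immediately instead of waiting for the page. The helper name schedule_reload_in_all_tabs and its clock parameter are hypothetical; only window_handles, switch_to.window() and execute_script() are Selenium calls. Browsers may throttle timers in background tabs, so the timing is approximate rather than exact.

```python
import time

# The script only arms a timer and returns at once; the reload happens later.
RELOAD_JS = "setTimeout(function () { location.reload(); }, arguments[0]);"


def schedule_reload_in_all_tabs(driver, reload_at, clock=time.time):
    """Ask every open tab to reload itself at the epoch time `reload_at`.

    `driver` is any Selenium WebDriver instance. Returns a list of
    (window handle, delay in milliseconds) pairs for inspection.
    """
    scheduled = []
    for handle in driver.window_handles:
        driver.switch_to.window(handle)
        # Recompute the remaining delay per tab so all timers aim at the same instant.
        delay_ms = max(0, int((reload_at - clock()) * 1000))
        driver.execute_script(RELOAD_JS, delay_ms)
        scheduled.append((handle, delay_ms))
    return scheduled
```

With a real driver this could be called as schedule_reload_in_all_tabs(browser, time.time() + 2). In Selenium 4, setting options.page_load_strategy = 'none' before creating the driver is another way to keep get() and refresh() from blocking; the button can then be looked up with an explicit wait.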

Related

How to download PDF from url in python

Note: This is a very different problem from the ones covered by other SO answers to similar questions (Selenium Webdriver: How to Download a PDF File with Python?).
This is because the URL https://webice.ongc.co.in/pay_adv?TRACKNO=8262# does not return the PDF directly; instead it makes several other calls, one of which goes to the URL that returns the PDF file.
I want to call the URL with a variable value for the query parameter TRACKNO and save the PDF file using Python.
I was able to do this using Selenium, but my code fails when the browser runs in headless mode, and I need it to work in headless mode. The code that I wrote is as follows:
import requests
from urllib3.exceptions import InsecureRequestWarning
from selenium import webdriver
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.common.by import By
import time
def extract_url(driver):
    # Collect every network call the browser has recorded so far
    advice_requests = driver.execute_script("var performance = window.performance || window.mozPerformance || window.msPerformance || window.webkitPerformance || {}; var network = performance.getEntries() || {}; return network;")
    print(advice_requests)
    for request in advice_requests:
        if(request.get('initiatorType',"") == 'object' and request.get('entryType',"") == 'resource'):
            link_split = request['name'].split('-')
            if(link_split[-1] == 'filedownload=X'):
                print("..... Successful")
                return request['name']
    print("..... Failed")
def save_advice(advice_url,tracking_num):
    requests.packages.urllib3.disable_warnings(category=InsecureRequestWarning)
    response = requests.get(advice_url,verify=False)
    with open(f'{tracking_num}.pdf', 'wb') as f:
        f.write(response.content)
def get_payment_advice(tracking_nums):
    options = webdriver.ChromeOptions()
    # options.add_argument('headless') # DOES NOT WORK IN HEADLESS MODE SO COMMENTED OUT
    driver = webdriver.Chrome(options=options)
    for num in tracking_nums:
        print(num,end=" ")
        driver.get(f'https://webice.ongc.co.in/pay_adv?TRACKNO={num}#')
        try:
            WebDriverWait(driver, 10).until(EC.presence_of_element_located((By.ID, 'ls-highlight-domref')))
            time.sleep(0.1)
            advice_url = extract_url(driver)
            save_advice(advice_url,num)
        except:
            pass
    driver.quit()
get_payment_advice(['8262'])
As can be seen, the first line of the extract_url function gets all the network calls that the browser makes, and I then parse each request to find the correct one. However, this does not work in headless mode.
Is there any other way of doing this, as this seems like a workaround? If not, can this be fixed to work in headless mode?
I fixed it by changing only one function. The correct URL is in the driver's page_source (BeautifulSoup can parse HTML, XML, etc.):
from bs4 import BeautifulSoup
def extract_url(driver):
    soup = BeautifulSoup(driver.page_source, "html.parser")
    object_element = soup.find("object")
    data = object_element.get("data")
    return f"https://webice.ongc.co.in{data}"
The hostname part could probably be extracted from the driver.
I don't think I changed anything else, but if it does not work for you, I can paste the full code.
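As a small sketch of taking the hostname from the driver rather than hard-coding it, the standard library's urljoin can resolve the object element's data attribute against the page URL. The helper name absolute_pdf_url and the path /example/advice.pdf are made up for illustration; the real value of the data attribute is whatever the site returns.

```python
from urllib.parse import urljoin


def absolute_pdf_url(page_url, data_attr):
    """Resolve the <object> element's data attribute against the page URL."""
    return urljoin(page_url, data_attr)


# Hypothetical relative path standing in for the real data attribute.
example = absolute_pdf_url(
    "https://webice.ongc.co.in/pay_adv?TRACKNO=8262#",
    "/example/advice.pdf",
)
print(example)
```

Inside extract_url this would amount to returning urljoin(driver.current_url, data), which also keeps working if the data attribute already holds an absolute URL.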
Old Answer:
If you print the text of the returned page (print(driver.page_source)), I think you will get a message that says something like:
"Because of your system configuration the pdf can't be loaded"
This is because the requested site checks some preferences to decide whether you are a robot or not. Changing some arguments (screen size, user agent) may help to fix this. There is information available about how to detect a headless browser.
Next time, please paste all the relevant code (including the imports) into the question to make it easier to test.

how to detect unexpected url change python webdriver selenium?

I am automating a browser process, but the same credentials are used by everyone (only one user can access the portal at a time), so whenever somebody else logs in, the current user is automatically kicked out and the URL changes to "http://172.17.3.248:8889/ameyoreports/?acpMode=false#loggedOut".
Is there any way to constantly check for a URL change while my automation script is running, and to end the script when a logout is detected?
I am using the Python Selenium WebDriver.
In Java we can get help from the event listener https://seleniumhq.github.io/selenium/docs/api/java/org/openqa/selenium/support/events/WebDriverEventListener.html. For example, if you implement it:
public class Test2 implements WebDriverEventListener {
    @Override
    public void beforeFindBy(By arg0, WebElement arg1, WebDriver driver) {
        if (driver.getCurrentUrl().equals("http://172.17.3.248:8889/ameyoreports/?acpMode=false#loggedOut")) {
            // do what you want.
        }
    }
    // the remaining WebDriverEventListener methods are omitted here
}
We then have to register it as shown below so that the URL is checked before any action (in the example above, the URL is checked before finding an element):
FirefoxDriver driver = new FirefoxDriver();
EventFiringWebDriver eventDriver = new EventFiringWebDriver(driver);
Test2 handler = new Test2();
eventDriver.register(handler);
eventDriver.get("url");
For Java, http://toolsqa.com/selenium-webdriver/event-listener/ helps; for Python, see http://selenium-python.readthedocs.io/api.html#module-selenium.webdriver.support.abstract_event_listener
There is a current_url attribute on the Selenium webdriver object, so you can fetch the changed URL from the driver's current_url.
Keep checking it and you can stop your script whenever you want.
You can test it with the following code:
# using the Chrome webdriver
from selenium import webdriver
from selenium.webdriver.chrome.options import Options
browser = Options()
instance = webdriver.Chrome(webdriver_path, options=browser)  # webdriver_path: path to your chromedriver
instance.get(url)  # url: the page to open
instance.current_url  # this gives the URL currently open in the browser
# manually enter another URL in the browser, then check again
instance.current_url
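The current_url check can be wrapped around the script's own steps so that it stops as soon as the logout URL appears. The following is only a sketch: the names LoggedOutError, is_logged_out and run_steps are hypothetical, and it assumes the script can be expressed as a list of callables that each take the driver.

```python
from urllib.parse import urlsplit

LOGGED_OUT_FRAGMENT = "loggedOut"  # fragment of the logout URL quoted in the question


class LoggedOutError(RuntimeError):
    """Raised when the portal has redirected the session to the logout URL."""


def is_logged_out(url):
    """Return True if `url` carries the #loggedOut fragment."""
    return urlsplit(url).fragment == LOGGED_OUT_FRAGMENT


def run_steps(driver, steps):
    """Run each step in order, checking driver.current_url before every one."""
    completed = 0
    for step in steps:
        if is_logged_out(driver.current_url):
            raise LoggedOutError(f"logged out after {completed} step(s)")
        step(driver)
        completed += 1
    return completed
```

Checking before each step means a logout is noticed at the next action rather than instantly; a background thread polling current_url would react sooner, but WebDriver sessions are generally not meant to be shared between threads.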

How can I click on a checkbox in a webpage using selenium mimicking manual approach?

I've written a script in python using selenium to tick a checkbox and hit the submit button. When I follow the steps manually, I can do it without solving any captcha. In fact, I do not face any captcha challenge. However, the site throws captchas as soon as I initiate a click on that checkbox using the script below.
website address
This is what I've tried so far:
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
driver = webdriver.Chrome()
wait = WebDriverWait(driver, 10)
driver.get('https://www.truepeoplesearch.com/results?name=John%20Smithers')
WebDriverWait(driver,10).until(EC.frame_to_be_available_and_switch_to_it(driver.find_element_by_css_selector("iframe")))
WebDriverWait(driver,10).until(EC.presence_of_element_located((By.CSS_SELECTOR, "span#recaptcha-anchor"))).click()
How can I click on a checkbox in a webpage using selenium without triggering captchas?
You can use the PyMouse Python package to move to the (x, y) position of the object on the webpage and simulate a mouse click.
from pymouse import PyMouse
mouse = PyMouse()
def click(x, y):
    """Mouse event click for webdriver"""
    mouse.click(x, y, 1)  # 1 = left mouse button
CAPTCHA is used to stop website automation, and that's why it cannot be automated using Selenium. For the same reason, you are not able to select the CAPTCHA tick box. Please refer to this link for more info: https://sqa.stackexchange.com/questions/17022/how-to-fill-captcha-using-test-automation
Here is the sample code to select the check box that will trigger the recaptcha images.
url = "https://www.google.com/recaptcha/api2/demo"
driver.get(url)
driver.switch_to.frame(driver.find_element_by_xpath("//iframe[starts-with(@name,'a-')]"))
# update the class name based on the UAT implementation (if it's different)
driver.find_element_by_class_name("recaptcha-checkbox-border").click()
But you still have to complete the image selection or use a voice-to-text API to resolve the captcha.
The possible options are to use third-party APIs or to check whether truepeoplesearch offers APIs that return the required information as a response.
Edit 1: Using the API and an HTML parser.
import requests
import lxml.html
url = "https://www.truepeoplesearch.com/results?name=John%20Smithers"
payload = {}
headers = {}
response = requests.request("GET", url, headers=headers, data=payload)
# now you can load this content into the lxml.html parser and get the information
html_content = response.text.encode('utf8')
root = lxml.html.document_fromstring(html_content)
content = root.xpath("//div[@class='h4']")  # here I get the names
for name in content:
    print(name.text_content() + '\n')
If you are working on the team that develops this site, you can agree with the developers on an efficient way to work around the captcha.
For example, they could add a case in the code so that the captcha is not shown if there is a cookie with a hard-to-guess name, known only to you and them. Someone could potentially guess that cookie, but if you have no other choice, this is an option.
You can also use a separate key for testing environments.

Sending Keys Using Splinter

I want to test an autocomplete box using Splinter. I need to send the 'down' and 'enter' keys through to the browser but I'm having trouble doing this.
I am currently finding an input box and typing 'tes' into that box successfully
context.browser.find_by_xpath(\\some\xpath\).first.type('tes')
What I want to do next is to send some keys to the browser, specifically the 'down' key (to select the first autocomplete suggestion) then send the 'enter' key to select that autocomplete element.
I've searched extensively and can't figure out how to do this.
I even tried some JavaScript:
script = 'var press = jQuery.Event("keypress"); press.keyCode = 34; press.keyCode = 13;'
context.browser.execute_script(script)
but unfortunately that didn't do anything.
packages I'm using:
django 1.6
django-behave==0.1.2
splinter 0.6
current config is:
from splinter.browser import Browser
from django.test.client import Client
context.browser = Browser('chrome')
context.client = Client()
You can send keys by switching to the active element:
from selenium.webdriver.common.keys import Keys
context.browser.find_by_xpath('//input[@name="username"]').first.type('test')
active_web_element = context.browser.driver.switch_to_active_element()
active_web_element.send_keys(Keys.DOWN)  # down arrow: move to the first suggestion
active_web_element.send_keys(Keys.ENTER)
The active element will be the last element you interacted with, so in this case the field you typed in.
switch_to_active_element() returns a selenium.webdriver.remote.webelement.WebElement, not a splinter.driver.webdriver.WebDriverElement, so unfortunately you cannot call send_keys on the return value of find_by_*(...) directly.
From the documentation this should work:
from splinter import Browser
from selenium.webdriver.common.keys import Keys
browser = Browser()
browser.type(Keys.RETURN)

Python - Selenium - Print Webpage

How do I print a webpage using Selenium?
import time
from selenium import webdriver
# Initialise the webdriver
chromeOps=webdriver.ChromeOptions()
chromeOps._binary_location = "C:\\Program Files\\Google\\Chrome\\Application\\chrome.exe"
chromeOps._arguments = ["--enable-internal-flash"]
browser = webdriver.Chrome("C:\\Program Files\\Google\\Chrome\\Application\\chromedriver.exe", port=4445, chrome_options=chromeOps)
time.sleep(3)
# Login to Webpage
browser.get('http://www.webpage.com')
Note: I am using what is currently the latest version of Google Chrome: Version 32.0.1700.107 m
While it's not directly printing the webpage, it is easy to take a screenshot of the entire current page:
browser.save_screenshot("screenshot.png")
Then the image can be printed using any image printing library. I haven't personally used any such library so I can't necessarily vouch for it, but a quick search turned up win32print which looks promising.
The key "trick" is that we can execute JavaScript in the Selenium browser window using the execute_script method of the Selenium webdriver; executing the JavaScript command "window.print();" activates the browser's print function.
Getting it to work elegantly requires setting a few preferences to print silently, remove print progress reporting, and so on. Here is a small but functional example that loads and prints whatever website you put in the last line (where 'http://www.cnn.com/' is now):
import time
from selenium import webdriver
import os
class printing_browser(object):
    def __init__(self):
        self.profile = webdriver.FirefoxProfile()
        self.profile.set_preference("services.sync.prefs.sync.browser.download.manager.showWhenStarting", False)
        self.profile.set_preference("pdfjs.disabled", True)
        self.profile.set_preference("print.always_print_silent", True)
        self.profile.set_preference("print.show_print_progress", False)
        self.profile.set_preference("browser.download.show_plugins_in_list",False)
        self.driver = webdriver.Firefox(self.profile)
        time.sleep(5)
    def get_page_and_print(self, page):
        self.driver.get(page)
        time.sleep(5)
        self.driver.execute_script("window.print();")
if __name__ == "__main__":
    browser_that_prints = printing_browser()
    browser_that_prints.get_page_and_print('http://www.cnn.com/')
The key command you were probably missing was self.driver.execute_script("window.print();"), but some of the setup in __init__ is needed to make it run smoothly, so I thought I'd give a fuller example. I think the trick itself is in a comment above, so some credit should go there too.
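If a PDF file is acceptable instead of sending the page to a physical printer, Selenium 4 drivers also offer print_page(), which returns the rendered page as a base64-encoded PDF. The sketch below assumes such a driver is already open on the page; the helper name save_page_as_pdf is hypothetical, and depending on the browser version this may only work in headless mode.

```python
import base64


def save_page_as_pdf(driver, pdf_path):
    """Write the page currently open in `driver` to `pdf_path` as a PDF.

    Assumes a Selenium 4 driver, whose print_page() returns base64 text.
    Returns the number of bytes written.
    """
    pdf_base64 = driver.print_page()
    pdf_bytes = base64.b64decode(pdf_base64)
    with open(pdf_path, "wb") as pdf_file:
        pdf_file.write(pdf_bytes)
    return len(pdf_bytes)
```

The resulting file can then be handed to whatever printing tool the operating system provides.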
