According to the Selenium documentation, interaction between the WebDriver client and a browser is done via the JSON Wire Protocol. Basically the client, written in Python, Ruby, Java, whatever, sends JSON messages to the web browser, and the web browser responds with JSON too.
Is there a way to view/catch/log these JSON messages while running a selenium test?
For example (in Python):
from selenium import webdriver
driver = webdriver.Chrome()
driver.get('http://google.com')
driver.close()
I want to see what JSON messages are going between the python selenium webdriver client and a browser when I instantiate the driver (in this case Chrome): webdriver.Chrome(), when I'm getting a page: driver.get('http://google.com') and when I'm closing it: driver.close().
FYI, in the #SFSE: Stripping Down Remote WebDriver tutorial, it is done by capturing the network traffic between the local machine where the script is running and the remote Selenium server.
I'm tagging the question as Python specific, but really would be happy with any pointers.
When you use Chrome, you can direct the chromedriver instance that drives Chrome to log more information than what is available through the logging package. This information includes the commands sent to the browser and the responses it gets. Here's an example:
from selenium import webdriver
driver = webdriver.Chrome(service_log_path="/tmp/log")
driver.get("http://www.google.com")
driver.find_element_by_css_selector("input")
driver.quit()
The code above will output the log to /tmp/log. The part of the log that corresponds to the find_element_... call looks like this:
[2.389][INFO]: COMMAND FindElement {
"sessionId": "b6707ee92a3261e1dc33a53514490663",
"using": "css selector",
"value": "input"
}
[2.389][INFO]: Waiting for pending navigations...
[2.389][INFO]: Done waiting for pending navigations
[2.398][INFO]: Waiting for pending navigations...
[2.398][INFO]: Done waiting for pending navigations
[2.398][INFO]: RESPONSE FindElement {
"ELEMENT": "0.3367185448296368-1"
}
As far as I know, the commands and responses faithfully represent what is going on between the client and the server. I've submitted bug reports and fixes to the Selenium project on the basis of what I saw in these logs.
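If you want those COMMAND/RESPONSE pairs programmatically rather than by eyeballing the file, the verbose log is plain text and easy to filter. A minimal sketch, using the log format from the excerpt above (the sample string here is fabricated to match it):

```python
import re

# Filter COMMAND/RESPONSE lines out of a chromedriver verbose log.
# The sample mirrors the log excerpt shown above.
sample = """\
[2.389][INFO]: COMMAND FindElement {
[2.389][INFO]: Waiting for pending navigations...
[2.398][INFO]: RESPONSE FindElement {
"""

# Each interesting line starts with a [timestamp][INFO]: prefix followed by
# COMMAND or RESPONSE and the command name.
pattern = re.compile(r"\[[\d.]+\]\[INFO\]: (COMMAND|RESPONSE) (\w+)")
events = pattern.findall(sample)
print(events)  # [('COMMAND', 'FindElement'), ('RESPONSE', 'FindElement')]
```

In a real run you would read the file written via service_log_path instead of the sample string.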
Found one option that almost fits my needs.
Just piping the logger to stdout allows you to see the underlying requests being made:
import logging
import sys
from selenium import webdriver
# pipe logs to stdout
logger = logging.getLogger()
logger.addHandler(logging.StreamHandler(sys.stdout))
logger.setLevel(logging.NOTSET)
# selenium specific code
driver = webdriver.Chrome()
driver.get('http://google.com')
driver.close()
It prints:
POST http://127.0.0.1:56668/session {"desiredCapabilities": {"platform": "ANY", "browserName": "chrome", "version": "", "javascriptEnabled": true, "chromeOptions": {"args": [], "extensions": []}}}
Finished Request
POST http://127.0.0.1:56668/session/5b6875595143b0b9993ed4f66f1f19fc/url {"url": "http://google.com", "sessionId": "5b6875595143b0b9993ed4f66f1f19fc"}
Finished Request
DELETE http://127.0.0.1:56668/session/5b6875595143b0b9993ed4f66f1f19fc/window {"sessionId": "5b6875595143b0b9993ed4f66f1f19fc"}
Finished Request
I don't see the responses, but this is already progress.
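The reason this works is ordinary logging propagation: Selenium's remote connection module logs each request to a named logger, and those records bubble up to whatever handlers are attached to the root logger. The mechanism can be demonstrated without launching a browser at all (the logger name below matches Selenium's module path, but the record itself is fabricated):

```python
import io
import logging

# Capture root-logger output in a buffer instead of stdout.
buf = io.StringIO()
root = logging.getLogger()
handler = logging.StreamHandler(buf)
root.addHandler(handler)
root.setLevel(logging.NOTSET)

# Selenium logs its HTTP requests to a logger named after its module;
# any record emitted there propagates up to the root handler attached above.
lib = logging.getLogger("selenium.webdriver.remote.remote_connection")
lib.debug("POST http://127.0.0.1:56668/session/.../url")

print(buf.getvalue().strip())
```

So setting the root logger to NOTSET and attaching a StreamHandler, as in the snippet above, is enough to surface the request lines.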
I am using Selenium WebDriver to try to scrape information from realestate.com.au; here is my code:
from selenium.webdriver import Chrome
from bs4 import BeautifulSoup
path = r'C:\Program Files (x86)\Google\Chrome\Application\chromedriver.exe'
url = 'https://www.realestate.com.au/buy'
url2 = 'https://www.realestate.com.au/property-house-nsw-castle+hill-134181706'
webdriver = Chrome(path)
webdriver.get(url)
soup = BeautifulSoup(webdriver.page_source, 'html.parser')
print(soup)
It works fine with url, but when I try to do the same to open url2, it opens up a blank page, and when I check the console I get the following:
"Failed to load resource: the server responded with a status of 429 ()
about:blank:1 Failed to load resource: net::ERR_UNKNOWN_URL_SCHEME
149e9513-01fa-4fb0-aad4-566afd725d1b/2d206a39-8ed7-437e-a3be-862e0f06eea3/fingerprint:1 Failed to load resource: the server responded with a status of 404 ()"
Also, while url is open, if I try to search for anything, it likewise leads to a blank page, just like url2.
It looks like the www.realestate.com.au website is using an Akamai security tool.
A quick DNS lookup shows that www.realestate.com.au resolves to dualstack.realestate.com.au.edgekey.net.
They are most likely using the Bot Manager product (https://www.akamai.com/us/en/products/security/bot-manager.jsp). I have encountered this on another website recently.
Typically rotating user agents and IP addresses (ideally using residential proxies) should do the trick. You want to load up the site with a "fresh" browser profile each time. You should also check out https://github.com/67-6f-64/akamai-sensor-data-bypass
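A minimal sketch of the rotation idea. The user-agent strings below are shortened placeholders, and in a real run the returned argument would be passed to Chrome per session via options.add_argument(...):

```python
import random

# Illustrative pool only; real rotation would use full, current UA strings and
# pair each fresh browser profile with a different (ideally residential) proxy.
USER_AGENTS = [
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36",
    "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36",
]

def pick_user_agent(pool, rng=random):
    """Return a user-agent argument suitable for ChromeOptions.add_argument()."""
    return "user-agent=" + rng.choice(pool)

arg = pick_user_agent(USER_AGENTS)
print(arg.startswith("user-agent=Mozilla/5.0"))  # True
```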
I think you should try adding driver.implicitly_wait(10) before your get line, as this will add an implicit wait in case the page loads too slowly for the driver to pull the site. Also, you should consider trying out the Firefox WebDriver, since this bug appears to only affect Chromium-based browsers.
I'm trying to read telegram messages from https://web.telegram.org with selenium.
When I open https://web.telegram.org in Firefox I'm already logged in, but when opening the same page from the Selenium WebDriver (Firefox) I get the login page.
I saw that Telegram Web is not using cookies for the auth but rather saves values in local storage. I can access the local storage with Selenium and have keys there such as "dc2_auth_key", "dc2_server_salt", "dc4_auth_key", ... but I'm not sure what to do with them in order to log in (and if I do need to do something with them, then why? It's the same browser, so why won't it work the same when opening without Selenium?)
To reproduce:
Open Firefox and log in to https://web.telegram.org, then run this code:
from selenium import webdriver
driver = webdriver.Firefox()
driver.get("https://web.telegram.org")
# my code goes here, but it is irrelevant since I'm at the login page
driver.close()
When you open https://web.telegram.org manually using Firefox, the default Firefox profile is used. As you log in and browse through the website, the website stores authentication cookies on your system. Since the cookies get stored within the local storage of the default Firefox profile, you are automatically authenticated even on reopening the browser.
But when GeckoDriver initiates a new browsing session for your tests, a temporary new mozprofile is created every time while launching Firefox, which is evident from the following log:
mozrunner::runner INFO Running command: "C:\\Program Files\\Mozilla Firefox\\firefox.exe" "-marionette" "-profile" "C:\\Users\\ATECHM~1\\AppData\\Local\\Temp\\rust_mozprofile.fDJt0BIqNu0n"
You can find a detailed discussion in Is it Firefox or Geckodriver, which creates “rust_mozprofile” directory
Once the test execution completes and quit() is invoked, the temporary mozprofile is deleted, as the following log shows:
webdriver::server DEBUG -> DELETE /session/f84dbafc-4166-4a08-afd3-79b98bad1470
geckodriver::marionette TRACE -> 37:[0,3,"quit",{"flags":["eForceQuit"]}]
Marionette TRACE 0 -> [0,3,"quit",{"flags":["eForceQuit"]}]
Marionette DEBUG New connections will no longer be accepted
Marionette TRACE 0 <- [1,3,null,{"cause":"shutdown"}]
geckodriver::marionette TRACE <- [1,3,null,{"cause":"shutdown"}]
webdriver::server DEBUG Deleting session
geckodriver::marionette DEBUG Stopping browser process
So, when you open the same page using Selenium, GeckoDriver and Firefox, the cookies which were stored within the local storage of the default Firefox profile aren't accessible, and hence you are redirected to the login page.
To store and use the cookies within the local storage and get authenticated automatically, you need to create and use a custom Firefox profile.
Here you can find a relevant discussion on webdriver.FirefoxProfile(): Is it possible to use a profile without making a copy of it?
You can authenticate using your current data from local storage.
Example:
driver.get(TELEGRAM_WEB_URL);
LocalStorage localStorage = ((ChromeDriver) DRIVER).getLocalStorage();
localStorage.clear();
localStorage.setItem("dc2_auth_key","<YOUR AUTH KEY>");
localStorage.setItem("user_auth","<YOUR USER INFO>");
driver.get(TELEGRAM_WEB_URL);
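In Python the same localStorage approach can be driven with driver.execute_script(). A helper to build the injection script might look like this (the key names are taken from the question; the values are yours to supply, and the script would be executed after first loading the target origin, followed by a reload):

```python
import json

def local_storage_setter_js(items):
    """Build a JS snippet that writes the given keys into window.localStorage.
    Intended for driver.execute_script(...) once the target origin is loaded;
    json.dumps safely quotes both keys and values."""
    return "\n".join(
        "window.localStorage.setItem({}, {});".format(json.dumps(k), json.dumps(v))
        for k, v in items.items()
    )

js = local_storage_setter_js({"dc2_auth_key": "<YOUR AUTH KEY>",
                              "user_auth": "<YOUR USER INFO>"})
print(js)
```

Usage would then be roughly: driver.get(url); driver.execute_script(js); driver.get(url) again to pick up the stored values.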
I will start by describing the infrastructure I am working within. It contains multiple proxy servers behind a load balancer that forwards user authentications to the appropriate proxy, which is directly tied to an Active Directory. The authentication uses the credentials and the source IP that were used to log into the computer the request is coming from. The server caches the IP and credentials for 60 minutes. I am using a test account specifically for this process, and it is only used on the unit testing server.
I am working on some automation with Selenium WebDriver on a remote server using a Docker container. I am using Python as the scripting language. I am trying to run tests on both internal and external webpages/applications. I was able to get a basic test running against an internal website with the following script:
Note: 10.1.54.118 is the server hosting the docker container with the selenium web driver
from selenium import webdriver
from selenium.webdriver.common.desired_capabilities import DesiredCapabilities
browser = webdriver.Remote(command_executor='http://10.1.54.118:4444/wd/hub', desired_capabilities=DesiredCapabilities.CHROME)
browser.get("http://10.0.0.2")
print (browser.find_element_by_tag_name('body').text)
bodyText = browser.find_element_by_tag_name('body').text
print (bodyText)
if 'Hello' in bodyText:
    print('Found hello in body')
else:
    print('Hello not found in body')
browser.quit()
The script is able to access the internal webpage and print all the text on it.
However, I am experiencing problems trying to run test scripts against external websites.
I have tried the following articles and tutorials, and none of them seem to work for me.
The articles and tutorials I have tried:
https://www.seleniumhq.org/docs/04_webdriver_advanced.jsp
Pass driver ChromeOptions and DesiredCapabilities?
https://www.programcreek.com/python/example/100023/selenium.webdriver.Remote
https://github.com/webdriverio/webdriverio/issues/324
https://www.programcreek.com/python/example/96010/selenium.webdriver.common.desired_capabilities.DesiredCapabilities.CHROME
Running Selenium Webdriver with a proxy in Python
how do i set proxy for chrome in python webdriver
https://docs.proxymesh.com/article/4-python-proxy-configuration
I have tried creating 4 versions of a script to access an external site, i.e. google.com, and simply print the text off of it. Every script returns a timeout error. I apologize for posting a lot of code, but maybe the community is able to see where I am going wrong with the coding aspect.
Code 1:
from selenium import webdriver
from selenium.webdriver.common.desired_capabilities import DesiredCapabilities
PROXY = "10.32.51.169:3128" # IP:PORT or HOST:PORT
desired_capabilities = webdriver.DesiredCapabilities.CHROME.copy()
desired_capabilities['proxy'] = {
    "httpProxy": PROXY,
    "ftpProxy": PROXY,
    "sslProxy": PROXY,
    "socksUsername": "myusername",
    "socksPassword": "mypassword",
    "noProxy": None,
    "proxyType": "MANUAL",
    "class": "org.openqa.selenium.Proxy",
    "autodetect": False
}
browser = webdriver.Remote('http://10.1.54.118:4444/wd/hub', desired_capabilities)
browser.get("https://www.google.com/")
print (browser.find_element_by_tag_name('body').text)
bodyText = browser.find_element_by_tag_name('body').text
print (bodyText)
if 'Hello' in bodyText:
    print('Found hello in body')
else:
    print('Hello not found in body')
browser.quit()
Is my code incorrect in any way? Am I able to pass configuration parameters to the dockerized Chrome Selenium WebDriver, or do I need to build the Docker container with the proxy settings preconfigured? I look forward to your replies and any help that can point me in the right direction.
A little late on this one, but a couple of ideas and improvements:
Remove the user/pass from the SOCKS proxy config and add them to your proxy connection URI.
Use the Selenium Proxy object to help abstract some of the other bits of the proxy capability.
Add the scheme to the proxy connection string.
Use a try/finally block to make sure the browser quits despite any failures.
Note... I'm using Python3, selenium version 3.141.0, and I'm leaving out the FTP config for brevity/simplicity:
from selenium import webdriver
from selenium.webdriver.common.desired_capabilities import DesiredCapabilities
from selenium.webdriver.common.proxy import Proxy
# Note the addition of the scheme (http) and the user/pass into the connection string.
PROXY = 'http://myusername:mypassword@10.32.51.169:3128'
# Use the selenium Proxy object to add proxy capabilities
proxy_config = {'httpProxy': PROXY, 'sslProxy': PROXY}
proxy_object = Proxy(raw=proxy_config)
capabilities = DesiredCapabilities.CHROME.copy()
proxy_object.add_to_capabilities(capabilities)
browser = webdriver.Remote('http://10.1.54.118:4444/wd/hub', desired_capabilities=capabilities)
# Use try/finally so the browser quits even if there is an exception
try:
    browser.get("https://www.google.com/")
    print(browser.find_element_by_tag_name('body').text)
    bodyText = browser.find_element_by_tag_name('body').text
    print(bodyText)
    if 'Hello' in bodyText:
        print('Found hello in body')
    else:
        print('Hello not found in body')
finally:
    browser.quit()
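One easy thing to verify is that the credentials really are embedded where the driver expects them. The user:pass@host:port form of the URI can be sanity-checked with the standard library (the host and port below are the ones from the question):

```python
from urllib.parse import urlparse

# Sanity-check a scheme://user:pass@host:port proxy URI before handing it
# to the Proxy object.
proxy = "http://myusername:mypassword@10.32.51.169:3128"
parts = urlparse(proxy)
print(parts.scheme, parts.username, parts.hostname, parts.port)
# http myusername 10.32.51.169 3128
```

If any of these components come back as None, the connection string is malformed and the proxy capability will silently misbehave.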
I want to use the headless Chrome driver to download a PDF. Everything works fine when I download the PDF without headless Chrome. Here is part of my driver setup code:
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC

download_dir = '/path/to/download/dir'  # placeholder; set to your download directory
options = webdriver.ChromeOptions()
prefs = {'profile.default_content_settings.popups': 0,
         "plugins.plugins_list": [{"enabled": False, "name": "Chrome PDF Viewer"}],  # Disable Chrome's PDF Viewer
         'download.default_directory': download_dir,
         "download.extensions_to_open": "applications/pdf"}
options.add_argument('--headless')
options.add_argument('--disable-gpu')
options.add_argument('--no-sandbox')
options.add_argument('--ignore-certificate-errors')
options.add_experimental_option('prefs', prefs)
driver = webdriver.Chrome(chrome_options=options)  # driver construction was omitted in the question
driver.command_executor._commands["send_command"] = ("POST", '/session/$sessionId/chromium/send_command')
params = {'cmd': 'Page.setDownloadBehavior', 'params': {'behavior': 'allow', 'downloadPath': download_dir}}
driver.execute("send_command", params)
driver.get(url)
try:
    WebDriverWait(driver, 10).until(
        EC.presence_of_element_located((By.CLASS_NAME, 'check-pdf')))
finally:
    driver.find_element_by_class_name('check-pdf').click()
The error shows up when I run this file in cmd.
[0623/130628.966:INFO:CONSOLE(7)] "A parser-blocking, cross site (i.e. different eTLD+1) script, http://s11.cnzz.com/z_stat.php?id=1261865322, is invoked via document.write. The network request for this script MAY be blocked by the browser in this or a future page load due to poor network connectivity. If blocked in this page load, it will be confirmed in a subsequent console message. See https://www.chromestatus.com/feature/5718547946799104 for more details.", source: http://utrack.hexun.com/dp/hexun_uweb.js (7)
[0623/130628.968:INFO:CONSOLE(7)] "A parser-blocking, cross site (i.e. different eTLD+1) script, http://s11.cnzz.com/z_stat.php?id=1261865322, is invoked via document.write. The network request for this script MAY be blocked by the browser in this or a future page load due to poor network connectivity. If blocked in this page load, it will be confirmed in a subsequent console message. See https://www.chromestatus.com/feature/5718547946799104 for more details.", source: http://utrack.hexun.com/dp/hexun_uweb.js (7)
[0623/130628.974:INFO:CONSOLE(16)] "A parser-blocking, cross site (i.e. different eTLD+1) script, http://c.cnzz.com/core.php?web_id=1261865322&t=z, is invoked via document.write. The network request for this script MAY be blocked by the browser in this or a future page load due to poor network connectivity. If blocked in this page load, it will be confirmed in a subsequent console message. See https://www.chromestatus.com/feature/5718547946799104 for more details.", source: https://s11.cnzz.com/z_stat.php?id=1261865322 (16)
[0623/130628.976:INFO:CONSOLE(16)] "A parser-blocking, cross site (i.e. different eTLD+1) script, http://c.cnzz.com/core.php?web_id=1261865322&t=z, is invoked via document.write. The network request for this script MAY be blocked by the browser in this or a future page load due to poor network connectivity. If blocked in this page load, it will be confirmed in a subsequent console message. See https://www.chromestatus.com/feature/5718547946799104 for more details.", source: https://s11.cnzz.com/z_stat.php?id=1261865322 (16)
[0623/130629.038:INFO:CONSOLE(8)] "Uncaught ReferenceError: jQuery is not defined", source: http://img.hexun.com/zl/hx/index/js/appDplus.js (8)
[0623/130629.479:WARNING:render_frame_host_impl.cc(2750)] OnDidStopLoading was called twice
I am wondering what the error message means and how I can fix it.
Any idea would be helpful!
I had a test script that was working, and it stopped working 2 weeks ago. The test is to log in to Hotmail, click on new mail, fill in the email address, subject, and text in the body, and send the email. Currently I can't enter text into the body of the mail. I tried with ID, CSS, and XPath. I also tried using select frame, but to no avail. I have attached the Python code and would appreciate help...
The aim of the script is to capture the traffic via Wireshark specifically for Hotmail send mail, with the current Hotmail protocol.
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import Select
from selenium.common.exceptions import NoSuchElementException
import unittest, time, re
class HotmailloginpythonWebdriver(unittest.TestCase):
    def setUp(self):
        self.driver = webdriver.Firefox()
        self.driver.implicitly_wait(30)
        self.base_url = "https://login.live.com/"
        self.verificationErrors = []

    def test_hotmailloginpython_webdriver(self):
        driver = self.driver
        driver.get(self.base_url + "/login.srf?wa=wsignin1.0&rpsnv=11&ct=1321965448&rver=6.1.6206.0&wp=MBI&wreply=http:%2F%2Fmail.live.com%2Fdefault.aspx&lc=1033&id=64855&mkt=en-us&cbcxt=mai&snsc=1")
        driver.find_element_by_id("i0116").clear()
        driver.find_element_by_id("i0116").send_keys("address@hotmail.com")
        driver.find_element_by_id("i0118").clear()
        driver.find_element_by_id("i0118").send_keys("password")
        driver.find_element_by_id("idSIButton9").click()
        driver.find_element_by_id("h_inboxCount").click()
        driver.find_element_by_id("NewMessage").click()
        driver.find_element_by_id("AutoCompleteTo$InputBox").clear()
        driver.find_element_by_id("AutoCompleteTo$InputBox").send_keys("address@hotmail.com")
        driver.find_element_by_id("fSubject").clear()
        driver.find_element_by_id("fSubject").send_keys("testsubject")
        driver.find_element_by_css_selector("body..RichText").clear()
        driver.find_element_by_css_selector("body..RichText").send_keys("gggggggggggg")
        driver.find_element_by_id("SendMessage").click()
        driver.find_element_by_id("c_signout").click()

    def is_element_present(self, how, what):
        try:
            self.driver.find_element(by=how, value=what)
        except NoSuchElementException:
            return False
        return True

    def tearDown(self):
        self.driver.quit()
        self.assertEqual([], self.verificationErrors)

if __name__ == "__main__":
    unittest.main()
It is very much possible that Microsoft is blocking automated services (like Selenium) that try to access the Hotmail or live.com page. According to the Microsoft Terms of Service (TOS), you must not use automated services to log in, etc. Here is what the TOS (point number 2) says:
You must not use the service to harm others or the service. For example, you must not use the service to harm, threaten, or harass another person, organization, or Microsoft. You must not: damage, disable, overburden, or impair the service (or any network connected to the service); resell or redistribute the service or any part of it; use any unauthorized means to modify, reroute, or gain access to the service or attempt to carry out these activities; or use any automated process or service (such as a bot, a spider, periodic caching of information stored by Microsoft, or metasearching) to access or use the service.
Full text is available here: http://windows.microsoft.com/en-US/windows-live/microsoft-service-agreement.
I had a similar experience myself once, testing something with the Twitter UI. Maybe you can look for a third-party approach that works via SMTP or POP3 etc. to measure network traffic, instead of using the frontend UI.
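For example, the same test mail could be composed with the standard library and sent over SMTP, letting Wireshark capture the SMTP/TLS traffic instead of the web UI. A sketch (the Outlook SMTP host and port in the comment are assumptions to verify against current Microsoft documentation):

```python
from email.message import EmailMessage

# Compose the same test message without driving the web UI. Sending would be:
#   import smtplib
#   with smtplib.SMTP("smtp-mail.outlook.com", 587) as s:  # host/port assumed
#       s.starttls()
#       s.login(user, password)
#       s.send_message(msg)
msg = EmailMessage()
msg["From"] = "address@hotmail.com"
msg["To"] = "address@hotmail.com"
msg["Subject"] = "testsubject"
msg.set_content("gggggggggggg")
print(msg["Subject"])  # testsubject
```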
I suspect this has something to do with cookies. Maybe you removed the cookies from your browser?
Try debugging the script up to the password typing, or up to
driver.find_element_by_id("idSIButton9").click()
to see if that part works fine. Perhaps MS changed their UI, so it would be worthwhile to debug your script from that point on to see if you have to update the object IDs.
Try to use XPath, not id. In XPath you can use following-sibling. It will work.
System.setProperty("webdriver.chrome.driver", "F:\\batch230\\chromedriver.exe");
WebDriver driver = new ChromeDriver();
//open hotmail site
driver.get("http://www.hotmail.com/");
Thread.sleep(5000);
driver.manage().window().maximize();
Thread.sleep(5000);
//do login
driver.switchTo().activeElement().sendKeys("mail id");
driver.findElement(By.id("idSIButton9")).click();
Thread.sleep(5000);
driver.switchTo().activeElement().sendKeys("password");
driver.findElement(By.id("idSIButton9")).click();
Thread.sleep(5000);
//compose mail
driver.findElement(By.xpath("//*[contains(@title,'new message')]")).click();
Thread.sleep(5000);
driver.findElement(By.xpath("(//*[@role='textbox'])[1]"))
    .sendKeys("er.anil900@gmail.com", Keys.TAB, "selenium",
        Keys.TAB, "Hi", Keys.ENTER, "How are you");
Thread.sleep(5000);
//send mail
driver.findElement(By.xpath("(//*[@title='Send'])[1]")).click();
Thread.sleep(10000);
//do logout
WebElement e = driver.findElement(By.xpath("(//*[@role='menuitem'])[11]"));
Actions a = new Actions(driver);
a.click(e).build().perform();
Thread.sleep(5000);
WebElement e1 = driver.findElement(By.xpath("//*[text()='Sign out']"));
a.click(e1).build().perform();
Thread.sleep(10000);
driver.close();