Selenium Python - Quit if Basic Auth requested

Selenium Python - Quit if Basic Auth requested - python

I have a script that collects a screenshot of a web site using Selenium. My issue is that if a site requests basic authentication I would like the script to just error and quit.
At the moment it just sits there for about a minute and then takes a blank screen shot.
The code I am using is below.
#!/usr/bin/env python
from pyvirtualdisplay import Display
from selenium import webdriver
display = Display(visible=0, size=(800, 600))
display.start()
browser = webdriver.Firefox()
browser.get('http://www.google.com')
browser.save_screenshot('screenshot.png')
browser.quit()
display.stop()
I am hoping that there is an easy way of making the script after the browser.get command to error if asked for authentication.
Thanks for your help.

May be not loaded this page after then selenium do screenshot. Could you use explicit waits function? For example:
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait # available since 2.4.0
from selenium.webdriver.support import expected_conditions as EC # available since 2.26.0
ff = webdriver.Firefox()
ff.get("http://somedomain/url_that_delays_loading")
try:
element = WebDriverWait(ff, 10).until(EC.presence_of_element_located((By.ID, "myDynamicElement")))
finally:
ff.quit()
This example quoted from this page.

Related

Why doesn't my webscrape code accept cookies

I'm a beginner in web scrapping and I've followed a few YouTube videos about how to do this, but regardless to what I try I can't have my code accept the cookies.
This is the code I have so far:
from selenium import webdriver
import time
driver = webdriver.Safari()
URL = "https://www.zoopla.co.uk/new-homes/property/london/?q=London&results_sort=newest_listings&search_source=new-homes&page_size=25&pn=1&view_type=list"
driver.get(URL)
time.sleep(2) # Wait a couple of seconds, so the website doesn't suspect you are a bot
try:
driver.switch_to_frame('gdpr-consent-notice') # This is the id of the frame
accept_cookies_button = driver.find_element_by_xpath('//*[#id="save"]')
accept_cookies_button.click()
except AttributeError: # If you have the latest version of selenium, the code above won't run because the "switch_to_frame" is deprecated
driver.switch_to.frame('gdpr-consent-notice') # This is the id of the frame
accept_cookies_button = driver.find_element_by_xpath('//*[#id="save"]')
accept_cookies_button.click()
except:
pass # If there is no cookies button, we won't find it, so we can pass

I don't have safari webdriver but chrome webdriver, but I think they works similar. On chrome you close the cookie banner with this code
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
driver.get(URL)
# wait no more than 20 seconds for the `iframe` with id `gdpr-consent-notice` to appear, then switch to it
WebDriverWait(driver, 20).until(EC.frame_to_be_available_and_switch_to_it((By.ID, "gdpr-consent-notice")))
# click accept cookies button
driver.find_element(By.CSS_SELECTOR, '#save').click()

Pressing ctrl+t doesn't work in Selenium Webdriver using ActionChains

I need to open a new browser tab in my test and I've read that the best approach is to simply send the appropriate keys to the browser. I'm using windows so I use ActionChains(driver).send_keys(Keys.CONTROL, "t").perform(), however, this does nothing.
I tried the following to test that Keys.CONTROL is working properly:
from selenium import webdriver
from selenium.webdriver.common.action_chains import ActionChains
from selenium.webdriver.common.keys import Keys
def test_trial():
driver = webdriver.Chrome()
driver.get("https://www.google.com/")
ActionChains(driver).send_keys(Keys.CONTROL, "v").perform()
This indeed passes whatever I have copied in the clipboard to the Google search box that is in focus by default.
This is what I want to use but that is not working:
from selenium import webdriver
from selenium.webdriver.common.action_chains import ActionChains
from selenium.webdriver.common.keys import Keys
def test_trial():
driver = webdriver.Chrome()
driver.get("https://www.google.com/")
ActionChains(driver).send_keys(Keys.CONTROL, "t").perform()
Nothing seems to happen to the browser, no new tab opened, no dialog box, no notification. Does anyone know why this is?

Try this java Script Executor it should work.
link="https://www.google.com"
driver.execute_script("window.open('{}');".format(link))
Edited code with window handle.
driver=webdriver.Chrome()
driver.get("https://www.google.com")
window_before = driver.window_handles[0]
link="https://www.google.com"
driver.execute_script("window.open('{}');".format(link))
window_after = driver.window_handles[1]
driver.switch_to.window(window_after)
driver.find_element_by_name("q").send_keys("test")

try executing this script:
driver.execute_script("window.open('https://www.google.com');")
for example
myURL = 'https://www.google.com'
driver.execute_script("window.open('" + myURL + "');")

You’ve gotten some good answers utilizing JavaScript execution, but I am curious why your example doesn’t work in the first place.
It’s possible that your ActionChains line is executed before the page has fully loaded; you could try adding a wait as follows:
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.common.by
from selenium import webdriver
from selenium.webdriver.common.action_chains import ActionChains
from selenium.webdriver.common.keys import Keys
def test_trial():
driver = webdriver.Chrome()
driver.get("https://www.google.com/")
try:
element = WebDriverWait(driver, 10).until(
EC.presence_of_element_located(By.TAG_NAME("body")))
ActionChains(driver).send_keys(Keys.CONTROL, "t").perform()

Headless webdriver returns error but non-headless works

I am doing a simple experiment with Amazon and Webdriver. However, using Webdriver Headless cannot find elements and errors out, but non-headless works.
Any suggestions how to get it working headless?
There is a comment right above the --headless flag.
from selenium import webdriver
import sys
import os
from selenium import webdriver
from selenium.webdriver.common.keys import Keys
from selenium.webdriver.chrome.options import Options
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
def get_inventory(url):
chrome_options = Options()
# Making it headless breaks. Commenting
# this line, making it non-headless works.
chrome_options.add_argument("--headless")
chrome_options.add_experimental_option(
"prefs", {'profile.managed_default_content_settings.javascript': 2})
chrome_options.binary_location = '/Applications/Google Chrome Canary.app/Contents/MacOS/Google Chrome Canary'
driver = webdriver.Chrome(executable_path=os.path.abspath(
'chromedriver'), chrome_options=chrome_options)
driver.set_window_size(1200, 1000)
try:
driver.get(url)
add_to_cart_button_xp = '//*[#id="add-to-cart-button"]'
add_to_cart_button = driver.find_element_by_xpath(add_to_cart_button_xp)
add_to_cart_button.click()
driver.get('https://www.amazon.com/gp/cart/view.html/ref=lh_cart')
qty_field_xp = '//div/input[starts-with(#name, "quantity.") and #type="text"]'
qty_field = driver.find_element_by_xpath(qty_field_xp)
qty_field.clear()
qty_field.send_keys("2")
update_link_xp = f'//input[#value="Update" and #type="submit"]'
update_link = driver.find_element_by_xpath(update_link_xp)
update_link.click()
url = 'https://www.amazon.com/Pexio-Professional-Stainless-Food-Safe-Dishwasher/dp/B07BGBSY9F'
get_inventory(url)

I think you just had some selector issues. I checked the elements and updated the quantity setting; everything else should be pretty much the same, aside from the binary locations.
from selenium import webdriver
import sys
import os
from selenium import webdriver
from selenium.webdriver.common.keys import Keys
from selenium.webdriver.chrome.options import Options
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
def get_inventory(url):
chrome_options = Options()
chrome_options.add_argument("--headless")
driver = webdriver.Chrome(
executable_path='/usr/bin/chromedriver',
chrome_options=chrome_options,
)
chrome_options.add_experimental_option(
"prefs",
{'profile.managed_default_content_settings.javascript': 2},
)
chrome_options.binary_location = '/usr/bin'
driver.set_window_size(1200, 1000)
try:
driver.get(url)
driver.save_screenshot("/tmp/x1.png")
driver.find_element_by_xpath('//*[#id="add-to-cart-button"]').click()
driver.get('https://www.amazon.com/gp/cart/view.html/ref=lh_cart')
driver.find_element_by_xpath("//span[#data-action='a-dropdown-button']").click()
driver.find_element_by_xpath("//*[#class='a-dropdown-link'][text()[contains(., '2')]]").click()
driver.find_element_by_class_name("nav-logo-base").click()
driver.save_screenshot("/tmp/confirm.png")
driver.close()
except Exception as e:
print(e)
url = 'https://www.amazon.com/Pexio-Professional-Stainless-Food-Safe-Dishwasher/dp/B07BGBSY9F'
get_inventory(url)
I've run this with and without --headless and it's working fine for me. I navigated to the homepage at the end so you can confirm the quantity change worked (hence the screenshot).

What is the behavior you see?
When I enabled headless, scripts started failing because running headless slows execution down.
I currently run chrome with these options:
'--no-sandbox', '--headless', '--window-size=1920,1080', '--proxy-server="direct://"', '--proxy-bypass-list=*'
The last two options supposedly help with the slowness, but I didn't see any difference.
Hope this helps.

I verified your claim on my Mac (using /Applications/Google Chrome.app/Contents/MacOS/Google Chrome).
My guess is that, since you are moving from an item page to the cart page of Amazon, the cookies are lost, so that the cart page won't show any item, and therefore won't contain any text input with a name starting with “quantity”, which is what the exception is about.
Googling for headless chrome cookies yields this page, which in turn points to this page, the content of which could also be about your problem. Be it this, or be it a particularly smart behavior of the Amazon website, the fact remains: the cookie that stores the cart (or a key thereof, but the result is the same) is not read by the cart page when in headless mode.

Selenium webdriver and GDPR consent

I wrote a simple scraper sometime ago which opens up chrome browser and scrapes some data from a website. However now everytime I run that script it does not open the url I provide but instead redirects to GDPR consent website. I removed --incognito mode from options but it is still the same. The chrome opens then the script crashes because it is atuomatically redirected to that GDPR consent webpage.
How can I go around this issue?
Here is the code to reproduce the error.
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
from selenium.common.exceptions import TimeoutException
option = webdriver.ChromeOptions()
#option.add_argument("--incognito")
browser = webdriver.Chrome(executable_path='chromedriverpath', chrome_options=option)
rval=[]
browser.get("https://finance.yahoo.com/quote/AAPL/key-statistics?p=AAPL")
timeout = 10
WebDriverWait(browser, timeout)
values_element = browser.find_elements_by_xpath("//td[#class='Fz(s) Fw(500) Ta(end)']")
print(browser)
values = [x.text for x in values_element]
rval.append(values[8])
for title, value in zip(stockname, rval):
print(title + ': ' + value)
evdict=dict(zip(stockname, rval))

So to to bypass such popup window which block selenium from scraping the data I needed to add:
browser.find_element_by_xpath("//input[#type='submit' and #value='OK']").click()
Which would click on the proper button and close the window for me. Then Selenium works without further problems.

Unable to submit keys using selenium with python

Following is the code which im trying to run
from selenium import webdriver
from selenium.webdriver.common.keys import Keys
import os
import time
#Create a new firefox session
browser=webdriver.Firefox()
browser.maximize_window()
#navigate to app's homepage
browser.get('http://demo.magentocommerce.com/')
#get searchbox and clear and enter details.
browser.find_element_by_css_selector("a[href='/search']").click()
search=browser.find_element_by_class_name('search-input')
search.click()
time.sleep(5)
search.click()
search.send_keys('phones'+Keys.RETURN)
However, im unable to submit the phones using send_keys.
Am i going wrong somewhere?
Secondly is it possible to always use x-path to locate an element and not rely on id/class/css-selections etc ?

The input element you are interested in has the search_query class name. To make it work without using hardcoded time.sleep() delays, use an Explicit Wait and wait for the search input element to be visible before sending keys to it. Working code:
from selenium import webdriver
from selenium.webdriver.common.keys import Keys
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
browser = webdriver.Firefox()
browser.maximize_window()
wait = WebDriverWait(browser, 10)
browser.get('http://demo.magentocommerce.com/')
browser.find_element_by_css_selector("a[href='/search']").click()
search = wait.until(EC.visibility_of_element_located((By.CLASS_NAME, "search-query")))
search.send_keys("phones" + Keys.RETURN)

We Keep Coding

Python is a programming language that lets you work quickly and integrate systems more effectively.

Selenium Python - Quit if Basic Auth requested - python

Related

Why doesn't my webscrape code accept cookies

Pressing ctrl+t doesn't work in Selenium Webdriver using ActionChains

Headless webdriver returns error but non-headless works

Selenium webdriver and GDPR consent

Unable to submit keys using selenium with python

Categories

Resources