driver.get() stopped working in Headless -mode (Chrome) - python

Recently a scraper I made stopped working in headless mode. I've tried both Firefox and Chrome. Notably, I am using seleniumwire to access API requests and ChromeDriverManager to get the driver. The current Chrome version is 93.0.4577.63.
I've tried modifying the User-Agent manually, as shown in the code below, in case the website added some checks blocking HeadlessChrome/93.0.4577.63, which is the original User-Agent. This did not help.
When running the script in regular mode, it works. When running in headless mode, the code below outputs [], meaning that no requests are captured after driver.get(url). I run this code daily, and I think it stopped working on 8.9.2021, during the day.
from selenium.webdriver.chrome.options import Options as chromeOptions
from seleniumwire import webdriver
from webdriver_manager.chrome import ChromeDriverManager
options = {
'suppress_connection_errors': False,
'connection_timeout': None
}
chrome_options = chromeOptions()
chrome_options.add_argument("--start-maximized")
chrome_options.add_argument("--incognito")
chrome_options.add_argument('--log-level=2')
chrome_options.add_argument("--window-size=1920,1080")
chrome_options.add_argument("--disable-extensions")
chrome_options.add_argument('--allow-running-insecure-content')
chrome_options.add_argument('--headless')
driver = webdriver.Chrome(ChromeDriverManager().install(), seleniumwire_options=options, chrome_options=chrome_options)
userAgent = driver.execute_script("return navigator.userAgent;")
userAgent = userAgent.replace('Headless', '')
driver.execute_cdp_cmd('Network.setUserAgentOverride', {"userAgent": userAgent})
url = 'my URL goes here'
driver.get(url)
print(driver.requests)
Same issue with Firefox: headless does not work but regular browsing does. Any idea what might cause this problem and what could solve it? I've also tried adding the following arguments to the Chrome options, without any luck:
chrome_options.add_argument("--proxy-server='direct://'")
chrome_options.add_argument("--proxy-bypass-list=*")
chrome_options.add_argument('--disable-gpu')
chrome_options.add_argument('--disable-dev-shm-usage')
chrome_options.add_argument('--no-sandbox')
chrome_options.add_argument('--ignore-certificate-errors')
chrome_options.add_argument('--headless')
chrome_options.add_argument('--ignore-certificate-errors-spki-list')
chrome_options.add_argument('--ignore-ssl-errors')

This may have been solved. I noticed that I first told the window to start maximized and after that set its size to 1920,1080. When I removed the maximize argument, chrome_options.add_argument("--start-maximized"), the problem disappeared and the script now works again.
I'm not sure whether this actually solved it or whether it was something else, since Selenium is a bit finicky and sometimes data just won't load the same way for the same web page, but at least it works now.
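One way to avoid passing both window flags by accident is to build the argument list in one place and drop --start-maximized whenever an explicit --window-size is present. The sketch below is only an illustration of that idea, not a confirmed fix: the helper names resolve_window_flags and build_chrome_options are hypothetical, and Selenium is imported only inside the function that needs it.

```python
def resolve_window_flags(args):
    """Return args without --start-maximized when --window-size is also set.

    Hypothetical helper: it only filters the list, it does not launch Chrome.
    """
    has_size = any(a.startswith("--window-size=") for a in args)
    if not has_size:
        return list(args)
    return [a for a in args if a != "--start-maximized"]


def build_chrome_options(args):
    """Create a Chrome Options object from the filtered argument list."""
    # Imported here so the filtering helper works without Selenium installed.
    from selenium.webdriver.chrome.options import Options

    options = Options()
    for arg in resolve_window_flags(args):
        options.add_argument(arg)
    return options


requested = ["--start-maximized", "--window-size=1920,1080", "--headless"]
print(resolve_window_flags(requested))
```

The resulting options object can then be passed to webdriver.Chrome() in place of the hand-built chrome_options.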

Related

selenium in python, how to ignore errors without closing the browser

I'm doing an automation using Selenium in Python.
How can I simply ignore an error that happens on the site without the browser closing?
I looked in several places and didn't find anything that helped.
You can do:
# Needed libs
from selenium import webdriver
from selenium.webdriver.chrome.options import Options
chrome_options = Options()
chrome_options.add_experimental_option("detach", True)
driver = webdriver.Chrome(options=chrome_options)
Then, if something fails, your browser will not be closed.
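Note that detach only keeps the window open; it does not stop an exception from ending the script. To keep going after a failing step, the step itself can be wrapped in try/except. A minimal sketch, assuming a hypothetical helper run_step; with Selenium you would typically narrow the caught type to selenium.common.exceptions.WebDriverException rather than Exception:

```python
def run_step(action, default=None, catch=Exception):
    """Call action() and return its result; on error, report it and return default.

    run_step is a hypothetical helper, not part of Selenium.
    """
    try:
        return action()
    except catch as exc:
        print(f"step failed, continuing: {exc!r}")
        return default


# Stand-in for a Selenium call such as driver.find_element(...).click()
def failing_step():
    raise RuntimeError("element not found")


result = run_step(failing_step, default="skipped")
print(result)
```

Swallowing every error can hide real problems, so it is probably worth logging the exception (as above) and only wrapping the steps that are allowed to fail.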

Chrome webdriver closes automatically

I am trying to use selenium for scraping and when I try to start the WebDriver it automatically closes. I've tried everything and it still doesn't work.
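from selenium import webdriver
from selenium.webdriver.chrome.service import Service
from webdriver_manager.chrome import ChromeDriverManager
def launchBrowser():
    # "detach" asks Chrome to stay open after chromedriver exits
    ch_options = webdriver.ChromeOptions()
    ch_options.add_experimental_option("detach", True)
    ch_options.add_experimental_option('excludeSwitches', ['enable-logging'])
    ch_driver = webdriver.Chrome(service=Service(ChromeDriverManager().install()), options=ch_options)
    ch_driver.get(url)  # url is defined elsewhere in the script
launchBrowser()
I tried to implement some solutions I have seen, but they don't work.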

How does one implement the headless option in the Selenium 4 WebDriver-Manager?

I have but one hurdle to overcome before I can truly call my first bot complete and that is to figure out where to put the options class(?) in order to run ChromeDriverManager in headless mode, and so it stops opening chrome instances! The way the driver is called is:
from selenium import webdriver
from selenium.webdriver.chrome.service import Service
from selenium.webdriver.chrome.options import Options
from webdriver_manager.chrome import ChromeDriverManager
options = Options()
options.headless = True
driver = webdriver.Chrome(service=Service(ChromeDriverManager().install()))
Since the old method of calling webdriver by path hasn't been entirely deprecated yet I don't think there have been very many questions pertaining to the new webdriver-manager. I've found only one or two methods that didn't work, like adding ,options=options after .install() or .options somewhere in the mix. In any case, any suggestions would be appreciated.
Try this:
from selenium import webdriver
from webdriver_manager.chrome import ChromeDriverManager
options = webdriver.ChromeOptions()
options.add_argument("--headless")
driver = webdriver.Chrome(ChromeDriverManager().install(), options=options)
I typed out this comment and never finished it, so my apologies. The correct code to run Selenium 4 with WebDriver-Manager in headless mode is indeed:
options = Options()
options.headless = True
driver = webdriver.Chrome(service=Service(ChromeDriverManager().install()), options=options)
# as opposed to what I was trying, which passes options to Service() by mistake:
driver = webdriver.Chrome(service=Service(ChromeDriverManager().install(), options=options))
The options object needs to be a direct keyword argument of webdriver.Chrome(), not of Service(). Note that the following variant does not work, because it passes the Service class instead of an instance and calls the options object as if it were a function:
driver = webdriver.Chrome(service=Service,options=options(ChromeDriverManager().install()))
Also, I think headless mode makes it easier for websites to flag you as a bot and prompt you with captchas. After running for some time, my bot could no longer use the search function while headless was enabled (because of either captchas or an error introduced by a code change), but it performed perfectly with headless disabled.
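Putting the pieces together, a Selenium 4 setup could look like the sketch below. It assumes the selenium and webdriver-manager packages are installed; make_driver and chrome_args are hypothetical names. It uses the --headless argument from the earlier answer rather than options.headless since, as far as I know, that attribute was deprecated in later Selenium 4 releases.

```python
def chrome_args(headless=True):
    """Return the Chrome command-line arguments for this run."""
    args = []
    if headless:
        args.append("--headless")
    return args


def make_driver(headless=True):
    """Start Chrome with a driver binary resolved by webdriver-manager."""
    # Imported lazily so chrome_args() can be used without Selenium installed.
    from selenium import webdriver
    from selenium.webdriver.chrome.options import Options
    from selenium.webdriver.chrome.service import Service
    from webdriver_manager.chrome import ChromeDriverManager

    options = Options()
    for arg in chrome_args(headless):
        options.add_argument(arg)
    # options is a keyword argument of webdriver.Chrome(), not of Service()
    service = Service(ChromeDriverManager().install())
    return webdriver.Chrome(service=service, options=options)
```

Calling make_driver(headless=False) gives the same setup with a visible window, which is handy for checking whether a site behaves differently in headless mode.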

Headless chrome authentication and ssl error in linux

I am trying to access our internal company site to take a screenshot of it using headless Chrome on Red Hat Linux.
For this I am using Python, Selenium, Poppler and Chromedriver.
It works perfectly on Windows. On non-GUI Linux, however, without options.add_argument('--ignore-certificate-errors') it returns a blank white page, and with the ('ignore-certificate-errors') option added it gives a 401 error.
import os
from selenium import webdriver
from selenium.webdriver.chrome.options import Options
# The next three lines are Java, not Python, and are commented out so the script parses:
# DesiredCapabilities handlSSLErr = DesiredCapabilities.chrome ()
# handlSSLErr.setCapability (CapabilityType.ACCEPT_SSL_CERTS, true)
# WebDriver driver = new ChromeDriver (handlSSLErr);
options = webdriver.ChromeOptions()
options.add_argument('--ignore-certificate-errors')
options.add_argument('--headless')
options.add_argument('--no-sandbox')
# FLASK_STATIC_FOLDER and facemapperid are defined elsewhere in the application
driver = webdriver.Chrome(executable_path=os.path.join(FLASK_STATIC_FOLDER,'chromedriver'),options=options)
URL = '"our internal webpage/"%s' %int(facemapperid)
driver.get(URL)
Any suggestions would be appreciated.
The option to ignore certificate errors is
options.add_argument('--ignore-certificate-errors')
You missed the leading --.
I was able to achieve what I wanted by doing the following.
First I made a connection to let it cache my cookie:
driver.get("https://username:password@mywebsite")
and then did it again:
URL = 'https://username:password@mywebsite'
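If the goal is to embed HTTP basic-auth credentials in the URL, the general form is scheme://user:password@host. A small sketch, assuming a hypothetical helper build_auth_url, that also percent-encodes characters such as @ or : in the credentials so they cannot be confused with the separators:

```python
from urllib.parse import quote


def build_auth_url(scheme, host, username, password, path=""):
    """Build scheme://user:password@host/path with percent-encoded credentials."""
    user = quote(username, safe="")
    secret = quote(password, safe="")
    return f"{scheme}://{user}:{secret}@{host}{path}"


print(build_auth_url("https", "mywebsite", "username", "password"))
print(build_auth_url("https", "mywebsite", "username", "p@ss:word"))
```

The returned string can be passed straight to driver.get(). Whether the browser accepts credentials in the URL depends on the browser and the site, so treat this as something to try rather than a guaranteed fix.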

how can i remove notifications and alerts from browser? selenium python 2.7.7

I am trying to submit information in a webpage, but selenium throws this error:
UnexpectedAlertPresentException: Alert Text: This page is asking you
to confirm that you want to leave - data you have entered may not be
saved.
It's not a leave notification; here is a pic of the notification: [image]
If I click "never show this notification again", my choice doesn't get saved. Is there a way to save it or to disable all notifications?
Edit: I'm using Firefox.
You can disable browser notifications using Chrome options. Sample code below:
chrome_options = webdriver.ChromeOptions()
prefs = {"profile.default_content_setting_values.notifications" : 2}
chrome_options.add_experimental_option("prefs",prefs)
driver = webdriver.Chrome(chrome_options=chrome_options)
With the latest version of Firefox the above preferences didn't work.
Below is a solution that disables notifications using a Firefox profile object:
_browser_profile = webdriver.FirefoxProfile()
_browser_profile.set_preference("dom.webnotifications.enabled", False)
webdriver.Firefox(firefox_profile=_browser_profile)
To disable notifications when using a Remote object:
webdriver.Remote(desired_capabilities=_desired_caps, command_executor=_url, options=_custom_options, browser_profile=_browser_profile)
selenium==3.11.0
Usually with browser settings like this, any changes you make are going to get thrown away the next time Selenium starts up a new browser instance.
Are you using a dedicated Firefox profile to run your Selenium tests? If so, set this setting to what you want in that Firefox profile and then close the browser. That should save it properly for its next use. You will need to tell Selenium to use this profile, though; that's done by SetCapabilities when you start the driver session.
This will do it:
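from selenium import webdriver
from selenium.webdriver.firefox.options import Options
options = Options()
options.set_preference("dom.webnotifications.enabled", False)
# firefox_options= is the older, deprecated keyword; newer Selenium releases use options=
browser = webdriver.Firefox(options=options)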
For Google Chrome and v3 of Selenium you may receive "DeprecationWarning: use options instead of chrome_options", so you will want to do the following:
from selenium import webdriver
from webdriver_manager.chrome import ChromeDriverManager
options = webdriver.ChromeOptions()
options.add_argument('--disable-notifications')
driver = webdriver.Chrome(ChromeDriverManager().install(), options=options)
Note: I am using webdriver-manager, but this also works with specifying the executable_path.
This answer is an improvement on TH Todorov's code snippet, based on what works as of Chrome version 80.0.3987.163.
In the line lk = os.path.join(os.getcwd(), "chromedriver",) you provide the path to the chromedriver executable, which you can download from the ChromeDriver download page.
import os
from selenium import webdriver
lk = os.path.join(os.getcwd(), "chromedriver",)
chrome_options = webdriver.ChromeOptions()
prefs = {"profile.default_content_setting_values.notifications" : 2}
chrome_options.add_experimental_option("prefs",prefs)
driver = webdriver.Chrome(lk, options=chrome_options)
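The answers above come down to one preference per browser. The sketch below collects them in a single place; notification_prefs and make_options are hypothetical helper names, and Selenium is imported only when an options object is actually built.

```python
# Preference names taken from the answers above.
CHROME_PREFS = {"profile.default_content_setting_values.notifications": 2}
FIREFOX_PREFS = {"dom.webnotifications.enabled": False}


def notification_prefs(browser):
    """Return a copy of the preferences that block notifications for a browser."""
    table = {"chrome": CHROME_PREFS, "firefox": FIREFOX_PREFS}
    try:
        return dict(table[browser.lower()])
    except KeyError:
        raise ValueError(f"unsupported browser: {browser!r}") from None


def make_options(browser):
    """Build a Selenium options object with notifications disabled."""
    prefs = notification_prefs(browser)
    if browser.lower() == "chrome":
        # Imported lazily so notification_prefs() works without Selenium.
        from selenium.webdriver.chrome.options import Options

        options = Options()
        options.add_experimental_option("prefs", prefs)
    else:
        from selenium.webdriver.firefox.options import Options

        options = Options()
        for name, value in prefs.items():
            options.set_preference(name, value)
    return options
```

The result is meant to be passed as options= to webdriver.Chrome() or webdriver.Firefox(), as in the snippets above.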
