Headless chrome authentication and ssl error in linux - python

I am trying to access our internal company site to take a screenshot of it using headless Chrome on Red Hat Linux.
For this I am using Python, Selenium, Poppler and ChromeDriver.
It works perfectly on Windows. On non-GUI Linux, however, it returns a blank white page without options.add_argument('--ignore-certificate-errors'), and with that option added it gives a 401 error.
import os
from selenium import webdriver
from selenium.webdriver.chrome.options import Options
# Java lines from the original post, kept as comments because they are not valid Python:
# DesiredCapabilities handlSSLErr = DesiredCapabilities.chrome ()
# handlSSLErr.setCapability (CapabilityType.ACCEPT_SSL_CERTS, true)
# WebDriver driver = new ChromeDriver (handlSSLErr);
options = webdriver.ChromeOptions()
options.add_argument('--ignore-certificate-errors')
options.add_argument('--headless')
options.add_argument('--no-sandbox')
driver = webdriver.Chrome(executable_path=os.path.join(FLASK_STATIC_FOLDER,'chromedriver'),options=options)
URL = '"our internal webpage/"%s' %int(facemapperid)
driver.get(URL)
Any suggestions would be appreciated.

The option to ignore certificate error is
options.add_argument('--ignore-certificate-errors')
You left out the leading --.

I was able to achieve what I wanted by doing the following.
First I opened the site with the credentials in the URL, to let it cache my cookie:
driver.get("https://username:password@mywebsite")
and then requested it again:
URL = 'username:password@mywebsite'
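The credentials-in-URL approach above can be sketched as a small helper. This is a minimal sketch, assuming the site uses HTTP Basic authentication and that special characters in the username or password need percent-encoding. The function name `url_with_credentials` and the sample host are hypothetical, not part of the original answer.

```python
from urllib.parse import quote, urlsplit, urlunsplit


def url_with_credentials(url, username, password):
    """Return url with 'username:password@' inserted before the host.

    Hypothetical helper: both values are percent-encoded so that
    characters such as '@' or ':' in a password do not break the URL.
    """
    parts = urlsplit(url)
    userinfo = "{}:{}".format(quote(username, safe=""), quote(password, safe=""))
    netloc = "{}@{}".format(userinfo, parts.netloc)
    return urlunsplit((parts.scheme, netloc, parts.path, parts.query, parts.fragment))


# Placeholder values only; this is not a real site or account.
print(url_with_credentials("https://intranet.example/page/42", "user", "p@ss:word"))
```

The resulting string can then be passed to driver.get() in place of the hand-written URL.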

Related

Is there any way to bypass Google proxy block in Selenium webdriver?

I'm making an app (using Selenium WebDriver in Chrome) that searches Google for a specified query (http://www.google.com/search?query), but every time I search I want to change my IP, so I'm using proxies.
The problem is that Google blocks EVERY proxy I use. Is there any way to bypass it? Maybe I'm using the wrong type of proxies? (I've tried HTTP and HTTPS proxies; they still get blocked every time.)
Maybe my code is wrong?
from selenium import webdriver
from selenium.webdriver import Chrome
from selenium.webdriver.chrome.options import Options
options = Options()
options.binary_location = "C:/Program Files (x86)/Google/Chrome/Application/chrome.exe"
options.add_argument("disable-extensions")
options.add_argument("start-maximized")
options.add_experimental_option("excludeSwitches", ["enable-automation"])
options.add_experimental_option("useAutomationExtension", False)
options.add_argument(f"--proxy-server=ip:port")
driver = Chrome(options=options, executable_path="C:/WebDriver/bin/chromedriver.exe")
driver.get("http://www.google.com/search?query")
Can it be a matter of the proxies quality?
Google removed proxy support for FTP entirely in Google Chrome versions 76 and newer. You can use Firefox or Edge instead. I tried with Firefox and was able to launch it:
options = Options()
options.binary_location = r"C:\Program Files\Mozilla Firefox\Firefox.exe"  # raw string keeps the backslashes literal
options.add_argument("disable-extensions")
options.add_argument("start-maximized")
options.add_argument(f"--proxy-server=ip:port")
driver = webdriver.Firefox(executable_path=r'..\drivers\geckodriver.exe', options=options)
Import:
from selenium import webdriver
from selenium.webdriver.firefox.options import Options
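The Firefox snippet above passes --proxy-server, which is a Chrome switch. As far as I know, Firefox normally takes its proxy settings through network.proxy.* preferences instead. The following is a minimal sketch under that assumption; the helper names `firefox_http_proxy_prefs` and `make_firefox_options` are hypothetical.

```python
def firefox_http_proxy_prefs(host, port):
    """Build Firefox preferences for a manual HTTP/HTTPS proxy.

    network.proxy.type = 1 selects manual proxy configuration.
    """
    return {
        "network.proxy.type": 1,
        "network.proxy.http": host,
        "network.proxy.http_port": int(port),
        "network.proxy.ssl": host,
        "network.proxy.ssl_port": int(port),
    }


def make_firefox_options(host, port):
    """Apply the proxy preferences to a Selenium Firefox Options object."""
    # Imported lazily so the pure helper above works without Selenium installed.
    from selenium.webdriver.firefox.options import Options

    options = Options()
    for name, value in firefox_http_proxy_prefs(host, port).items():
        options.set_preference(name, value)
    return options
```

The options object returned by make_firefox_options("ip", port) would be passed to webdriver.Firefox(options=...) as in the answer above.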

How to initiate a Tor Browser 9.5 which uses the default Firefox to 68.9.0esr using GeckoDriver and Selenium through Python

I'm trying to initiate a tor browsing session through Tor Browser 9.5 which uses the default Firefox v68.9.0esr using GeckoDriver and Selenium through Python on a windows-10 system. But I'm facing an error as:
Code Block:
from selenium import webdriver
from selenium.webdriver.firefox.firefox_profile import FirefoxProfile
import os
torexe = os.popen(r'C:\Users\username\Desktop\Tor Browser\Browser\TorBrowser\Tor\tor.exe')
profile = FirefoxProfile(r'C:\Users\username\Desktop\Tor Browser\Browser\TorBrowser\Data\Browser\profile.default')
profile.set_preference('network.proxy.type', 1)
profile.set_preference('network.proxy.socks', '127.0.0.1')
profile.set_preference('network.proxy.socks_port', 9050)
profile.set_preference("network.proxy.socks_remote_dns", False)
profile.update_preferences()
firefox_options = webdriver.FirefoxOptions()
firefox_options.binary_location = r'C:\Users\username\Desktop\Tor Browser\Browser\firefox.exe'
driver = webdriver.Firefox(firefox_profile= profile, options = firefox_options, executable_path=r'C:\WebDrivers\geckodriver.exe')
driver.get("https://www.tiktok.com/")
Whereas the same code block works through Firefox and Firefox Nightly using the respective binaries.
Do I need any additional settings? Can someone help me out?
I managed to resolve this by updating to v9.5.1 and implementing the following changes:
Note that although the code is in C# the same changes to the Tor browser and how it is launched should be applied.
FirefoxProfile profile = new FirefoxProfile(profilePath);
profile.SetPreference("network.proxy.type", 1);
profile.SetPreference("network.proxy.socks", "127.0.0.1");
profile.SetPreference("network.proxy.socks_port", 9153);
profile.SetPreference("network.proxy.socks_remote_dns", false);
FirefoxDriverService firefoxDriverService = FirefoxDriverService.CreateDefaultService(geckoDriverDirectory);
firefoxDriverService.FirefoxBinaryPath = torPath;
firefoxDriverService.BrowserCommunicationPort = 2828;
var firefoxOptions = new FirefoxOptions
{
Profile = null,
LogLevel = FirefoxDriverLogLevel.Trace
};
firefoxOptions.AddArguments("-profile", profilePath);
FirefoxDriver driver = new FirefoxDriver(firefoxDriverService, firefoxOptions);
driver.Navigate().GoToUrl("https://www.google.com");
Important notes:
The following TOR configs need to be changed in about:config :
marionette.enabled: true
marionette.port: set to an unused port, and set this value to firefoxDriverService.BrowserCommunicationPort in your code. This was set to 2828 in my example.
note:
I am not sure whether this really is the definite answer (thus, I'd really appreciate feedback)
solution:
I've managed to send a get request to the check tor page (https://check.torproject.org/) and it displayed an unknown IP to me (additionally, IPs differ if you repeat the request after a time)
Essentially, I've set up the chrome driver to run TOR. Here's the code:
from selenium import webdriver
from selenium.webdriver.chrome.options import Options
tor_proxy = "127.0.0.1:9150"
chrome_options = Options()
chrome_options.add_argument("--test-type")
chrome_options.add_argument('--ignore-certificate-errors')
chrome_options.add_argument('--disable-extensions')
chrome_options.add_argument('disable-infobars')
chrome_options.add_argument("--incognito")
chrome_options.add_argument('--proxy-server=socks5://%s' % tor_proxy)
driver = webdriver.Chrome(options=chrome_options)
driver.get('https://check.torproject.org/')
Because the driver is not in headless mode you can inspect the resulting page yourself. It should read:
"Congratulations. This browser is configured to use Tor. [IP Info]. However, it does not appear to be Tor Browser. Click here to go to the download page"
Make sure that the chromedriver.exe file is on the PATH, or provide the path to the file as an argument to the webdriver.Chrome() function.
Edit: make sure Tor Browser is running in the background; thanks @Abhishek Rai for pointing that out.
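Since the approach above only works while Tor is listening locally, it can help to check the SOCKS port before starting the driver. This is a minimal sketch using only the standard library; the helper name `is_port_open` is hypothetical, and 9150 is simply the port used in the snippet above.

```python
import socket

TOR_SOCKS_HOST = "127.0.0.1"
TOR_SOCKS_PORT = 9150  # port taken from the snippet above; adjust if yours differs


def is_port_open(host, port, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(timeout)
        # connect_ex returns 0 on success and an error code otherwise.
        return sock.connect_ex((host, port)) == 0
```

Calling is_port_open(TOR_SOCKS_HOST, TOR_SOCKS_PORT) before webdriver.Chrome(...) lets the script fail early with a clear message instead of loading an error page.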

How to configure ChromeDriver to initiate Chrome browser in Headless mode through Selenium?

I'm working on a Python script to web-scrape and have gone down the path of using ChromeDriver as one of the packages. I would like this to operate in the background without any pop-up windows. I'm using the 'headless' option on ChromeDriver and it seems to do the job in terms of not showing the browser window; however, I still see the .exe file running. [Screenshot not available]
This is the code I am using to initiate ChromeDriver:
options = webdriver.ChromeOptions()
options.add_experimental_option("excludeSwitches",["ignore-certificate-errors"])
options.add_argument('headless')
options.add_argument('window-size=0x0')
chrome_driver_path = r"C:\Python27\Scripts\chromedriver.exe"
I've tried altering the window size in the options to 0x0, but I'm not sure that did anything, as the .exe file still popped up.
Any ideas on how I can do this?
I am using Python 2.7, FYI.
It should look like this:
from selenium import webdriver
from selenium.webdriver.chrome.options import Options
options = Options()
options.add_argument('--headless')
options.add_argument('--disable-gpu') # Last I checked this was necessary.
driver = webdriver.Chrome(CHROMEDRIVER_PATH, chrome_options=options)
This works for me using Python 3.6, I'm sure it'll work for 2.7 too.
Update 2018-10-26: These days you can just do this:
from selenium import webdriver
from selenium.webdriver.chrome.options import Options
options = Options()
options.headless = True
driver = webdriver.Chrome(CHROMEDRIVER_PATH, options=options)
Answer update of 13-October-2018
To initiate a google-chrome-headless browsing context using Selenium driven ChromeDriver now you can just set the --headless property to true through an instance of Options() class as follows:
Effective code block:
from selenium import webdriver
from selenium.webdriver.chrome.options import Options
options = Options()
options.headless = True
driver = webdriver.Chrome(options=options, executable_path=r'C:\path\to\chromedriver.exe')
driver.get("http://google.com/")
print ("Headless Chrome Initialized")
driver.quit()
Answer update of 23-April-2018
Invoking google-chrome in headless mode programmatically has become much easier with the availability of the method set_headless(headless=True), as follows:
Documentation :
set_headless(headless=True)
Sets the headless argument
Args:
headless: boolean value indicating to set the headless option
Sample Code :
from selenium import webdriver
from selenium.webdriver.chrome.options import Options
options = Options()
options.set_headless(headless=True)
driver = webdriver.Chrome(options=options, executable_path=r'C:\path\to\chromedriver.exe')
driver.get("http://google.com/")
print ("Headless Chrome Initialized")
driver.quit()
Note : --disable-gpu argument is implemented internally.
Original Answer of Mar 30 '2018
While working with Selenium Client 3.11.x, ChromeDriver v2.38 and Google Chrome v65.0.3325.181 in Headless mode you have to consider the following points :
You need to add the argument --headless to invoke Chrome in headless mode.
For Windows OS systems you need to add the argument --disable-gpu
As per "Headless: make --disable-gpu flag unnecessary", the --disable-gpu flag is not required on Linux systems and macOS.
As per "SwiftShader fails an assert on Windows in headless mode", the --disable-gpu flag will become unnecessary on Windows systems too.
Argument start-maximized is required for a maximized Viewport.
Here is the link to details about Viewport.
You may require to add the argument --no-sandbox to bypass the OS security model.
Effective windows code block :
from selenium import webdriver
from selenium.webdriver.chrome.options import Options
options = Options()
options.add_argument("--headless") # Runs Chrome in headless mode.
options.add_argument('--no-sandbox') # Bypass OS security model
options.add_argument('--disable-gpu') # applicable to windows os only
options.add_argument('start-maximized') # Maximized viewport
options.add_argument('disable-infobars')
options.add_argument("--disable-extensions")
driver = webdriver.Chrome(chrome_options=options, executable_path=r'C:\path\to\chromedriver.exe')
driver.get("http://google.com/")
print ("Headless Chrome Initialized on Windows OS")
Effective linux code block :
from selenium import webdriver
from selenium.webdriver.chrome.options import Options
options = Options()
options.add_argument("--headless") # Runs Chrome in headless mode.
options.add_argument('--no-sandbox') # Bypass OS security model
options.add_argument('start-maximized')
options.add_argument('disable-infobars')
options.add_argument("--disable-extensions")
driver = webdriver.Chrome(chrome_options=options, executable_path='/path/to/chromedriver')
driver.get("http://google.com/")
print ("Headless Chrome Initialized on Linux OS")
Update August 20, 2020 -- now it is simple!
chrome_options = webdriver.ChromeOptions()
chrome_options.headless = True
self.driver = webdriver.Chrome(executable_path=DRIVER_PATH, chrome_options=chrome_options)
UPDATED
It works fine in my case:
from selenium import webdriver
options = webdriver.ChromeOptions()
options.headless = True
driver = webdriver.Chrome(CHROMEDRIVER_PATH, options=options)
Just changed in 2020. Works fine for me.
So after correcting my code to:
options = webdriver.ChromeOptions()
options.add_experimental_option("excludeSwitches",["ignore-certificate-errors"])
options.add_argument('--disable-gpu')
options.add_argument('--headless')
chrome_driver_path = r"C:\Python27\Scripts\chromedriver.exe"
The .exe file still came up when running the script. Although this did get rid of some extra output telling me "Failed to launch GPU process".
What ended up working is running my Python script using a .bat file
So basically,
Save the Python script in a folder
Open a text editor and paste in the following line (edited to point at your own script, of course)
c:\python27\python.exe c:\SampleFolder\ThisIsMyScript.py %*
Save the .txt file and change the extension to .bat
Double click this to run the file
So this just opened the script in Command Prompt and ChromeDriver seems to be operating within this window without popping out to the front of my screen and thus solving the problem.
The .exe would be running anyway. According to Google - "Run in headless mode, i.e., without a UI or display server dependencies."
Better prepend 2 dashes to command line arguments, i.e. options.add_argument('--headless')
In headless mode, it is also suggested to disable the GPU, i.e. options.add_argument('--disable-gpu')
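The advice about prepending two dashes can be applied mechanically to a list of switches. This is a minimal sketch; the helper name `with_double_dash` is hypothetical and is not a Selenium function.

```python
def with_double_dash(argument):
    """Prefix a Chrome switch with '--' unless it already starts with a dash."""
    return argument if argument.startswith("-") else "--" + argument


# Mixed input, as seen in several snippets in this thread.
switches = ["headless", "--disable-gpu", "window-size=1280,1696"]
normalized = [with_double_dash(s) for s in switches]
print(normalized)
```

Each normalized string can then be passed to options.add_argument().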
Try using ChromeDriverManager
from selenium import webdriver
from webdriver_manager.chrome import ChromeDriverManager
from selenium.webdriver.chrome.options import Options
chrome_options = Options()
chrome_options.set_headless()
browser =webdriver.Chrome(ChromeDriverManager().install(),chrome_options=chrome_options)
browser.get('https://google.com')
# capture the screen
browser.get_screenshot_as_file("capture.png")
Solutions above don't work with websites with cloudflare protection, example: https://paxful.com/fr/buy-bitcoin.
Modify agent as follows:
options.add_argument("user-agent=Mozilla/5.0 (Windows NT 6.1; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/84.0.4147.125 Safari/537.36")
Fix found here:
What is the difference in accessing Cloudflare website using ChromeDriver/Chrome in normal/headless mode through Selenium Python
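A likely reason the user-agent override helps is that headless Chrome has historically identified itself with a "HeadlessChrome" token in its user-agent string, which some sites appear to check. The following minimal sketch assumes that token is the only difference. The helper names `strip_headless_token` and `user_agent_argument` are hypothetical.

```python
def strip_headless_token(user_agent):
    """Replace the 'HeadlessChrome' product token with plain 'Chrome'."""
    return user_agent.replace("HeadlessChrome", "Chrome")


def user_agent_argument(user_agent):
    """Build the command-line switch to pass to options.add_argument()."""
    return "--user-agent=" + strip_headless_token(user_agent)


# Example headless user agent; the version number matches the answer above.
headless_ua = (
    "Mozilla/5.0 (Windows NT 6.1; Win64; x64) AppleWebKit/537.36 "
    "(KHTML, like Gecko) HeadlessChrome/84.0.4147.125 Safari/537.36"
)
print(user_agent_argument(headless_ua))
```

Whether this is enough for a given site depends on what else that site checks, so treat it as a starting point.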
from selenium import webdriver
from chromedriver_py import binary_path
chrome_options = webdriver.ChromeOptions()
chrome_options.add_argument('--headless')
chrome_options.add_argument('--no-sandbox')
chrome_options.add_argument('--disable-gpu')
chrome_options.add_argument('--window-size=1280x1696')
chrome_options.add_argument('--user-data-dir=/tmp/user-data')
chrome_options.add_argument('--hide-scrollbars')
chrome_options.add_argument('--enable-logging')
chrome_options.add_argument('--log-level=0')
chrome_options.add_argument('--v=99')
chrome_options.add_argument('--single-process')
chrome_options.add_argument('--data-path=/tmp/data-path')
chrome_options.add_argument('--ignore-certificate-errors')
chrome_options.add_argument('--homedir=/tmp')
chrome_options.add_argument('--disk-cache-dir=/tmp/cache-dir')
chrome_options.add_argument('user-agent=Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/61.0.3163.100 Safari/537.36')
driver = webdriver.Chrome(executable_path = binary_path,options=chrome_options)
System.setProperty("webdriver.chrome.driver",
"D:\\Lib\\chrome_driver_latest\\chromedriver_win32\\chromedriver.exe");
ChromeOptions chromeOptions = new ChromeOptions();
chromeOptions.addArguments("--allow-running-insecure-content");
chromeOptions.addArguments("--window-size=1920x1080");
chromeOptions.addArguments("--disable-gpu");
chromeOptions.setHeadless(true);
ChromeDriver driver = new ChromeDriver(chromeOptions);
With Robot Framework, passing the options like this:
chromeoptions=add_argument("--no-sandbox");
add_argument("--ignore-certificate-errors");
add_argument("--disable-dev-shm-usage")
fails with an "is not a supported browser" error.
solution:
Open Browser    ${event_url}    ${BROWSER}    options=add_argument("--no-sandbox"); add_argument("--ignore-certificate-errors"); add_argument("--disable-dev-shm-usage")
Don't forget to add spaces between ${BROWSER} and options.
There is an option to hide the chromeDriver.exe window in alpha and beta versions of Selenium 4.
from selenium import webdriver
from selenium.webdriver.chrome.service import Service as ChromeService # Similar thing for firefox also!
from subprocess import CREATE_NO_WINDOW # This flag will only be available in windows
chrome_service = ChromeService('chromedriver', creationflags=CREATE_NO_WINDOW)
driver = webdriver.Chrome(service=chrome_service) # No console window is opened any more, and chromedriver output is not shown either
You can check it out from here. To pip install beta or alpha versions, you can do "pip install selenium==4.0.0.a7" or "pip install selenium==4.0.0.b4" (a7 means alpha-7 and b4 means beta-4 so for other versions you want, you can modify the command.) To import a specific version of a library in python you can look here.
RECENT UPDATE
Chrome's headless mode was recently updated. The --headless flag now takes a value and can be used as follows:
For Chrome version 109 and above, the --headless=new flag gives access to the full functionality of the Chrome browser in headless mode.
For Chrome versions 96 through 108, the --headless=chrome option provides the headless Chrome browser.
So, let's add
options.add_argument("--headless=new")
for newer version of Chrome in headless mode as mentioned above.
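The version rules above can be expressed as a small lookup. This is a minimal sketch: the helper name `headless_argument` is hypothetical, and falling back to plain --headless for versions below 96 is my assumption, since the answer does not cover them.

```python
def headless_argument(chrome_major_version):
    """Pick the headless switch for a given Chrome major version."""
    if chrome_major_version >= 109:
        return "--headless=new"
    if chrome_major_version >= 96:
        return "--headless=chrome"
    # Assumption: older versions only understand the plain switch.
    return "--headless"


for version in (110, 108, 96, 90):
    print(version, headless_argument(version))
```

The returned string would be passed to options.add_argument() in place of a hard-coded flag.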
The below works fine for me with Chrome version 110.0.5481.104
from selenium import webdriver
chrome_driver_path = r"E:\driver\chromedriver.exe"
options = webdriver.ChromeOptions()
options.add_argument('--disable-gpu')
# New update
options.add_argument("--headless=new")
options.binary_location = r"C:\Chrome\Application\chrome.exe"
browser = webdriver.Chrome(chrome_driver_path, options=options)
browser.get('https://www.google.com')
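Update August 2021:
The fastest way is probably:
from selenium import webdriver
options = webdriver.ChromeOptions()
options.add_argument("--headless")  # assigning options.set_headless = True only sets an attribute and does not enable headless mode
driver = webdriver.Chrome(options=options)
options.headless = True is deprecated in recent Selenium releases.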

Force Selenium Chrome Driver to use QUIC instead of TCP

I am working on downloading HAR from Chrome for YouTube through Selenium Python Script.
Code Snippet:
chrome_options = webdriver.ChromeOptions()
chrome_options.add_argument("--proxy-server={0}".format(url))
chrome_options.add_argument("--enable-quic")
self.driver = webdriver.Chrome(chromedriver,chrome_options = chrome_options)
self.proxy.new_har(args['url'], options={'captureHeaders': True})
self.driver.get(args['url'])
result = json.dumps(self.proxy.har, ensure_ascii=False)
I want QUIC to be used whenever I download the HAR, but when I look at the packets in Wireshark, the Selenium driver is using TCP only. Is there a way to force ChromeDriver to use QUIC? Or is there an alternative to BMP (BrowserMob Proxy)?
A similar thing has been asked for Firefox in this question How to capture all requests made by page in webdriver? Is there any alternative to Browsermob? and there was a solution with Selenium alone without need of any BMP. So is it possible for Chrome?
Workaround for this problem could be: start Chrome normally (with your default profile or create another profile) and enable quic manually. Then start chromedriver with your profile loaded.
from selenium import webdriver
from selenium.webdriver.chrome.options import Options
options = webdriver.ChromeOptions()
options.add_argument("user-data-dir=/home/user/.config/google-chrome")
driver = webdriver.Chrome(executable_path="/home/user/Downloads/chromedriver", chrome_options=options)

how can i remove notifications and alerts from browser? selenium python 2.7.7

I am trying to submit information in a webpage, but selenium throws this error:
UnexpectedAlertPresentException: Alert Text: This page is asking you to confirm that you want to leave - data you have entered may not be saved.
It's not a leave notification. [Screenshot of the notification not available]
If I click "never show this notification again", my action doesn't get saved. Is there a way to save it, or to disable all notifications?
edit: I'm using firefox.
You can disable the browser notifications, using chrome options. Sample code below:
chrome_options = webdriver.ChromeOptions()
prefs = {"profile.default_content_setting_values.notifications" : 2}
chrome_options.add_experimental_option("prefs",prefs)
driver = webdriver.Chrome(chrome_options=chrome_options)
With the latest version of Firefox the above preferences didn't work.
Below is a solution which disables notifications using a Firefox profile object:
_browser_profile = webdriver.FirefoxProfile()
_browser_profile.set_preference("dom.webnotifications.enabled", False)
webdriver.Firefox(firefox_profile=_browser_profile)
Disable notifications when using Remote Object:
webdriver.Remote(desired_capabilities=_desired_caps, command_executor=_url, options=_custom_options, browser_profile=_browser_profile)
selenium==3.11.0
Usually with browser settings like this, any changes you make are going to get thrown away the next time Selenium starts up a new browser instance.
Are you using a dedicated Firefox profile to run your Selenium tests? If so, in that Firefox profile, set this setting to what you want and then close the browser. That should properly save it for its next use. You will need to tell Selenium to use this profile, though; that's done by SetCapabilities when you start the driver session.
This will do it:
from selenium import webdriver
from selenium.webdriver.firefox.options import Options
options = Options()
options.set_preference("dom.webnotifications.enabled", False)
browser = webdriver.Firefox(firefox_options=options)
For Google Chrome and v3 of Selenium you may receive "DeprecationWarning: use options instead of chrome_options", so you will want to do the following:
from selenium import webdriver
from webdriver_manager.chrome import ChromeDriverManager
options = webdriver.ChromeOptions()
options.add_argument('--disable-notifications')
driver = webdriver.Chrome(ChromeDriverManager().install(), options=options)
Note: I am using webdriver-manager, but this also works with specifying the executable_path.
This answer is an improvement on TH Todorov's code snippet, based on what works as of Chrome version 80.0.3987.163.
In the line lk = os.path.join(os.getcwd(), "chromedriver") you provide the path to the chromedriver executable, which you can download from the ChromeDriver download page.
import os
from selenium import webdriver
lk = os.path.join(os.getcwd(), "chromedriver",)
chrome_options = webdriver.ChromeOptions()
prefs = {"profile.default_content_setting_values.notifications" : 2}
chrome_options.add_experimental_option("prefs",prefs)
driver = webdriver.Chrome(lk, options=chrome_options)
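The prefs dictionary used in the Chrome answers can be built by a small helper so the magic number is documented in one place. This is a minimal sketch; the helper names are hypothetical, and the meaning of the values (1 allows, 2 blocks) is my understanding of Chrome's content settings rather than something stated in the answers.

```python
NOTIFICATIONS_PREF = "profile.default_content_setting_values.notifications"


def chrome_notification_prefs(allow=False):
    """Return the Chrome prefs dict that allows (1) or blocks (2) notifications."""
    return {NOTIFICATIONS_PREF: 1 if allow else 2}


def make_chrome_options(allow_notifications=False):
    """Build ChromeOptions with the notification preference applied."""
    # Imported lazily so the pure helper above works without Selenium installed.
    from selenium import webdriver

    options = webdriver.ChromeOptions()
    options.add_experimental_option("prefs", chrome_notification_prefs(allow_notifications))
    return options
```

The object returned by make_chrome_options() is passed to webdriver.Chrome(options=...) exactly as in the snippets above.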
