How to avoid verification page when using selenium chrome driver with python? - python

I want to scrap some data with selenium python. I have this type of screen sometimes :
Do you know to proceed in order to remove this type of verification ? Here my code :
from selenium import webdriver
from selenium.webdriver.chrome.options import Options
from selenium.webdriver.chrome.service import Service
from webdriver_manager.chrome import ChromeDriverManager
options = Options()
options.add_argument("start-maximized")
options.add_argument("--disable-web-security")
options.add_argument("--disable-site-isolation-trials")
options.add_argument("--allow-running-insecure-content")
driver = webdriver.Chrome(service=Service(ChromeDriverManager().install()), options=options)
driver.get('THE_WEBSITE_COM')

Selenium specifically and other automation tools have certain user agents and other identifiers which indicate that it's automated. So maybe have a play around with things like that. Some websites use anti bot tools to analyze browsing behaviours and patterns so try to slow it and randomize it eg. random time between page requests
Another trick is to look at the website and try to find if there are any alternative routes to get the information. For example: is there a public API you can use to bypass it? Is there a mobile version of the website? Sometimes mobile versions have less aggressive Captcha enforcement.

What I do most of the time is to launch the browser separately and connect to it using Dev port which is beautifully explained in this article.
To enable Chrome to open a port for remote debugging, we need to launch it with a custom flag –
chrome.exe --remote-debugging-port=9222 --user-data-dir="C:\selenum\ChromeProfile"
Then connect to the browser using this
from selenium import webdriver
from selenium.webdriver.chrome.options import Options
chrome_options = Options()
chrome_options.add_experimental_option("debuggerAddress", "127.0.0.1:9222")
#Change chrome driver path accordingly
chrome_driver = "C:\chromedriver.exe"
driver = webdriver.Chrome(chrome_driver, chrome_options=chrome_options)
print driver.title

Related

How to use a existing chrome installation with selenium webdriver?

I would like to use an existing installation of chrome (or firefox or brave browser) with selenium. Like that I could set prespecified settings / extensions (e.g. start nord-vpn when opening a new instance) that are active when the browser is opened with selenium.
I know there is selenium.webdriver.service with the "executeable-path" option, but it doesn't seem to work when you specify a specific chrome.exe, the usage seems to be for the chrome-driver only and then it still opens a "fresh" installation of chrome.
Starting selenium with extension-file I think is also not an option to use with the nord-vpn extension, as I have two-factor authentication active and login every single time would take too much time and effort, if possible at all.
Firefox profile
To use the existing installation of firefox you have to pass the profile path through set_preference() method using an instance of Option from selenium.webdriver.common.options as follows:
from selenium.webdriver import Firefox
from selenium import webdriver
from selenium.webdriver.firefox.service import Service
from selenium.webdriver.firefox.options import Options
profile_path = r'C:\Users\Admin\AppData\Roaming\Mozilla\Firefox\Profiles\s8543x41.default-release'
options=Options()
options.set_preference('profile', profile_path)
service = Service('C:\\BrowserDrivers\\geckodriver.exe')
driver = Firefox(service=service, options=options)
driver.get("https://www.google.com")
You can find a relevant detailed discussion in Error update preferences in Firefox profile: 'Options' object has no attribute 'update_preferences'
Chrome profile
Where as to use an existing installation of google-chrome you have to pass the user profile path through add_argument() using the user-data-dir key through an instance of Option from selenium.webdriver.common.options as follows:
from selenium import webdriver
from selenium.webdriver.chrome.options import Options
from selenium.webdriver.chrome.service import Service
options = Options()
options.add_argument("user-data-dir=C:\\Users\\username\\AppData\\Local\\Google\\Chrome\\User Data\\Default")
s = Service('C:\\BrowserDrivers\\chromedriver.exe')
driver = webdriver.Chrome(service=s, options=options)
driver.get("https://www.google.com/")
You can find a relevant detailed discussion in How to open a Chrome Profile through Python

Is there any way to bypass Google proxy block in Selenium webdriver?

I'm making an app (using Selenium webdriver in Chrome) that searches Google for a specified query (http://www.google.com/search?query) but everytime I search for it I want to change my IP so I'm using proxies.
The problem is Google blocks EVERY proxy I use. Is there anyway to bypass it? Maybe I'm using wrong type of proxies? (I've tried HTTP and HTTPS proxies, still they get blocked everytime)
Maybe my code is wrong?:
from selenium import webdriver
from selenium.webdriver import Chrome
from selenium.webdriver.chrome.options import Options
options = Options()
options.binary_location = "C:/Program Files (x86)/Google/Chrome/Application/chrome.exe"
options.add_argument("disable-extensions")
options.add_argument("start-maximized")
options.add_experimental_option("excludeSwitches", ["enable-automation"])
options.add_experimental_option("useAutomationExtension", False)
options.add_argument(f"--proxy-server=ip:port")
driver = Chrome(options=options, executable_path="C:/WebDriver/bin/chromedriver.exe")
driver.get("http://www.google.com/search?query")
Can it be a matter of the proxies quality?
Google has removed the proxy support for FTP entirely in Google Chrome versions 76 and newer. You can use firefox or edge. I tried with firefox and able to launch:
options = Options()
options.binary_location = "C:\Program Files\Mozilla Firefox\Firefox.exe"
options.add_argument("disable-extensions")
options.add_argument("start-maximized")
options.add_argument(f"--proxy-server=ip:port")
driver = webdriver.Firefox(executable_path=r'..\drivers\geckodriver.exe', options=options)
Import:
from selenium import webdriver
from selenium.webdriver.firefox.options import Options

Loading ublock with custom filter in Chromium Selenium and Python3

I'm developing a scraping script to collect some data which is behind an authwall, I've got a custom filter in ublock which gets me past the authwall however when i load chromium with ublock using Selenium it doesn't have the filters. I'm using Linux if that helps.
I've tried getting it to pause before getting the information to allow me to check the filters in place and it is blank.
Here is a portion of the code
from selenium import webdriver
from selenium.webdriver.chrome.options import Options
chrome_option_settings = Options()
chrome_option_settings.add_argument('--window-size=1920x1080')
extension_path = r'/home/user/.config/chromium/Default/Extensions/cjpalhdlnbpafiamejdnhcphjbkeiagm/1.20.0_0'
chrome_option_settings.add_argument('load-extension='+extension_path)
chrome_driver = "/usr/bin/chromedriver"
driver = webdriver.Chrome(chrome_options=chrome_option_settings, executable_path=chrome_driver)
driver.get(url)
I've also tried to load the Chrome profile using either however neither help.
chrome_options.add_argument("user-data-dir=/home/user/.config/chromium/Default")
or
chrome_options.add_argument("--profile-directory=/home/user/.config/chromium/Default")
Any help would be greatly appreciated
You could try using Options() and invoking add_extension with the path to ublock, hope this helps
from selenium import webdriver
from selenium.webdriver.chrome.options import Options
chrome_options = Options()
executable_path = "path_to_webdriver"
extension_path = r'/home/user/.config/chromium/Default/Extensions/cjpalhdlnbpafiamejdnhcphjbkeiagm/1.20.0_0'
chrome_options.add_extension(extension_path)
driver = webdriver.Chrome(executable_path=executable_path, chrome_options=chrome_options)

Force Selenium Chrome Driver to use QUIC instead of TCP

I am working on downloading HAR from Chrome for YouTube through Selenium Python Script.
Code Snippet:
chrome_options = webdriver.ChromeOptions()
chrome_options.add_argument("--proxy-server={0}".format(url))
chrome_options.add_argument("--enable-quic")
self.driver = webdriver.Chrome(chromedriver,chrome_options = chrome_options)
self.proxy.new_har(args['url'], options={'captureHeaders': True})
self.driver.get(args['url'])
result = json.dumps(self.proxy.har, ensure_ascii=False)
I want QUIC to be used whenever I download HAR but when I look at the packets through Wireshark Selenium driver is using TCP only. Is there a way to force Chrome Driver to use QUIC? Or Is there an alternate to BMP?
A similar thing has been asked for Firefox in this question How to capture all requests made by page in webdriver? Is there any alternative to Browsermob? and there was a solution with Selenium alone without need of any BMP. So is it possible for Chrome?
Workaround for this problem could be: start Chrome normally (with your default profile or create another profile) and enable quic manually. Then start chromedriver with your profile loaded.
from selenium import webdriver
from selenium.webdriver.chrome.options import Options
options = webdriver.ChromeOptions()
options.add_argument("user-data-dir=/home/user/.config/google-chrome")
driver = webdriver.Chrome(executable_path="/home/user/Downloads/chromedriver", chrome_options=options)

how can i remove notifications and alerts from browser? selenium python 2.7.7

I am trying to submit information in a webpage, but selenium throws this error:
UnexpectedAlertPresentException: Alert Text: This page is asking you
to confirm that you want to leave - data you have entered may not be
saved. ,
>
It's not a leave notification; here is a pic of the notification -
.
If I click in never show this notification again, my action doesn't get saved; is there a way to save it or disable all notifications?
edit: I'm using firefox.
You can disable the browser notifications, using chrome options. Sample code below:
chrome_options = webdriver.ChromeOptions()
prefs = {"profile.default_content_setting_values.notifications" : 2}
chrome_options.add_experimental_option("prefs",prefs)
driver = webdriver.Chrome(chrome_options=chrome_options)
With the latest version of Firefox the above preferences didn't work.
Below is the solution which disable notifications using Firefox object
_browser_profile = webdriver.FirefoxProfile()
_browser_profile.set_preference("dom.webnotifications.enabled", False)
webdriver.Firefox(firefox_profile=_browser_profile)
Disable notifications when using Remote Object:
webdriver.Remote(desired_capabilities=_desired_caps, command_executor=_url, options=_custom_options, browser_profile=_browser_profile)
selenium==3.11.0
Usually with browser settings like this, any changes you make are going to get throws away the next time Selenium starts up a new browser instance.
Are you using a dedicated Firefox profile to run your selenium tests? If so, in that Firefox profile, set this setting to what you want and then close the browser. That should properly save it for its next use. You will need to tell Selenium to use this profile though, thats done by SetCapabilities when you start the driver session.
This will do it:
from selenium.webdriver.firefox.options import Options
options = Options()
options.set_preference("dom.webnotifications.enabled", False)
browser = webdriver.Firefox(firefox_options=options)
For Google Chrome and v3 of Selenium you may receive "DeprecationWarning: use options instead of chrome_options", so you will want to do the following:
from selenium import webdriver
from webdriver_manager.chrome import ChromeDriverManager
options = webdriver.ChromeOptions()
options.add_argument('--disable-notifications')
driver = webdriver.Chrome(ChromeDriverManager().install(), options=options)
Note: I am using webdriver-manager, but this also works with specifying the executable_path.
This answer is an improvement on TH Todorov code snippet, based on what is working as of Chrome (Version 80.0.3987.163).
lk = os.path.join(os.getcwd(), "chromedriver",) --> in this line you provide the link to the chromedriver, which you can download from chromedrive link
import os
from selenium import webdriver
lk = os.path.join(os.getcwd(), "chromedriver",)
chrome_options = webdriver.ChromeOptions()
prefs = {"profile.default_content_setting_values.notifications" : 2}
chrome_options.add_experimental_option("prefs",prefs)
driver = webdriver.Chrome(lk, options=chrome_options)

Categories