I want to write a Python script that uses Selenium to scrape a website. Following the Real Python article on the topic, I copied and pasted the following code into a .py file:
from selenium.webdriver import Firefox
from selenium.webdriver.firefox.options import Options
opts = Options()
opts.set_headless()
assert opts.headless # Operating in headless mode
browser = Firefox(options=opts)
browser.get('https://duckduckgo.com')
Running the script I get the following error:
opts.set_headless()
AttributeError: 'Options' object has no attribute 'set_headless'
I attempted to follow this article: I commented out the opts.set_headless() call and added opts.headless = True, but now I get the following error:
Traceback (most recent call last):
File "/home/usr/local/folder/scraper.py", line 10, in <module>
browser = Firefox(options=opts)
File "/home/usr/local/folder/scraper/venv/lib/python3.10/site-packages/selenium/webdriver/firefox/webdriver.py", line 192, in __init__
self.service.start()
File "/home/usr/local/folder/scraper/venv/lib/python3.10/site-packages/selenium/webdriver/common/service.py", line 106, in start
self.assert_process_still_running()
File "/home/usr/local/folder/scraper/venv/lib/python3.10/site-packages/selenium/webdriver/common/service.py", line 119, in assert_process_still_running
raise WebDriverException(f"Service {self.path} unexpectedly exited. Status code was: {return_code}")
selenium.common.exceptions.WebDriverException: Message: Service geckodriver unexpectedly exited. Status code was: -6
I verified that geckodriver is on my $PATH, so I have no idea why this isn't working. I am using Selenium v4.7.2.
After much hair pulling, I was able to determine that almost all articles on the internet dealing with Selenium use deprecated methods and attributes. Hopefully this answer will help many others who have been trying to use this library.
First, the .set_headless() method has been removed, which is why calling it raises AttributeError. The Python Forums had a helpful discussion around it. To run a headless browser, pass the flag with .add_argument("--headless"); that is the approach that worked for me.
Second, there is now a Service class that needs to be imported and used to pass the geckodriver path (what used to be the executable_path= argument) and any other paths, such as the log file. These two posts helped on this matter: stackoverflow_1 and stackoverflow_2.
Third, after fixing the code and using the correct modules, attributes, methods and arguments, it was still getting hung up. The logs pointed to a socket timeout and to an issue that the dev team was dealing with in Sep 2022. This helped me realize that the geckodriver version linked in the original Real Python article was long outdated and needed to be updated to the latest version, which is v0.32.0 at the time of writing.
However, that wasn't why it was getting hung up. I commented out the headless argument, which showed that the Firefox browser itself was the issue. Apparently, on Ubuntu 22.04, Firefox is installed by default as a snap and needs to be installed from a .deb package instead. Here is a good article explaining it.
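The checks described above (is geckodriver on the PATH, which version is it, and does the Firefox binary come from snap) can be sketched with the standard library alone. This is a minimal sketch rather than a definitive diagnostic: the helper names are hypothetical, and the snap test is only a path heuristic that assumes snap-managed binaries resolve to a path containing a `snap` component, so it may miss wrapper scripts.

```python
import os
import shutil
import subprocess
from pathlib import PurePosixPath


def locate(binary_name):
    """Return the absolute path of a binary found on PATH, or None."""
    return shutil.which(binary_name)


def is_snap_path(path):
    """Heuristic (assumption): a path with a 'snap' component is snap-managed."""
    return "snap" in PurePosixPath(path).parts


def first_version_line(binary_path):
    """Return the first line printed by `<binary> --version`, or '' if none."""
    result = subprocess.run(
        [binary_path, "--version"], capture_output=True, text=True, check=False
    )
    lines = result.stdout.splitlines()
    return lines[0] if lines else ""


def describe_environment():
    """Collect the facts that mattered in the answer above."""
    report = {}
    for name in ("geckodriver", "firefox"):
        found = locate(name)
        # Resolve symlinks so the real location of the binary is reported.
        report[name] = None if found is None else os.path.realpath(found)
    firefox = report["firefox"]
    report["firefox_looks_like_snap"] = bool(firefox) and is_snap_path(firefox)
    if report["geckodriver"]:
        report["geckodriver_version"] = first_version_line(report["geckodriver"])
    return report
```

Printing the result of `describe_environment()` before starting the browser shows at a glance whether the driver was found and whether the Firefox path hints at a snap install.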
So ultimately there were several different issues. The library is constantly being updated, and the older features that most articles on the internet rely on are now deprecated. The Selenium documentation isn't the greatest either. Here is my final code with the previous issues commented out:
# from selenium import webdriver
from selenium.webdriver import Firefox
from selenium.webdriver.common.by import By
from selenium.webdriver.firefox.service import Service
from selenium.webdriver.firefox.options import Options
# Setup--
options = Options()
options.add_argument("--headless")
service = Service(executable_path="/home/$PATH/location/geckodriver.exe", log_path="/home/file/location/log/geckodriver.log")
browser = Firefox(service=service, options=options)
### Deprecated
# caps = webdriver.DesiredCapabilities().FIREFOX
# caps["marionette"] = True
# browser = webdriver.Firefox(firefox_profile=options, capabilities=caps, executable_path="~/bin/geckodriver.exe")
# Parse--
browser.get('https://duckduckgo.com')
# find_element returns a single WebElement, so read .text directly
logo = browser.find_element(by=By.ID, value='logo_homepage_link')
print(logo.text)
browser.quit()
This should work:
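from selenium.webdriver import Firefox
from selenium.webdriver.firefox.options import Options
opts = Options()
opts.headless = True
# Raw string so the backslashes in the Windows path are kept literally.
# Note: executable_path is deprecated in Selenium 4 in favor of a Service object.
browser = Firefox(options=opts, executable_path=r'C:\TheActualPathToThe\geckodriver.exe')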
browser.get('https://duckduckgo.com')
When I run this code, the page opens and then closes immediately, so I can't interact with it. Everything is on the latest version. I am a Windows user.
I was trying to open the Instagram site with this code (posted as an image), and the site was closed as soon as it was opened.
I'm not very familiar with Selenium, but as far as I know
get() loads the page in the browser window; it does not by itself keep the window open once the script finishes.
You have to add the below Chrome Option:
options.add_experimental_option("detach", True)
Full code:
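from selenium import webdriver
from selenium.webdriver.chrome.service import Service
from selenium.webdriver.chrome.options import Options
chrome_driver_path = "<path to chromedriver>"  # placeholder: set your own path
url = "<url>"  # placeholder: the page you want to open
options = Options()
options.add_experimental_option("detach", True)
driver = webdriver.Chrome(service=Service(chrome_driver_path), options=options)
driver.get(url)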
Next time, don't post the image, post the code and explain your issue clearly.
First of all, please do not use images when you want a code review or help; it is much easier to help with raw code.
Selenium 4 uses a Service object to start the webdriver.
To run what you want, you need to fix some parts of your code, like this:
from selenium import webdriver
# I'm using selenium==4.5.0; in this version the driver path has to be
# passed through selenium.webdriver.chrome.service
from selenium.webdriver.chrome.service import Service as ChromeService
# For a Windows path that contains backslashes, use a raw string: r"C:\..."
# Remember to include the executable file name too
chrome_driver_path = r"C:\pythondriver\chromedriver\chromedriver.exe"
# You can also use forward slashes, as on Linux, instead of backslashes
# chrome_driver_path = "C:/pythondriver/chromedriver/chromedriver.exe"
# Instead of:
# webdriver.Chrome()
# driver = webdriver.Chrome(chrome_driver_path)
# use this:
chrome_service = ChromeService(chrome_driver_path)
driver = webdriver.Chrome(service=chrome_service)
url = "https://instagram.com"
driver.get(url)
I tested with the configuration below and it worked for me:
Chromedriver - 108.0.5359.71
Chrome - 108.0.5359.125 - 64 bits
If you need more help using the Chrome Service, take a look at the Selenium docs.
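The two versions listed above share the major version 108. ChromeDriver is generally expected to match the browser's major version, so a quick comparison can rule out a mismatch before debugging anything else. A minimal sketch with hypothetical helper names, working on version strings you would supply yourself:

```python
def major_version(version):
    """Return the leading numeric component of a dotted version string."""
    return int(version.strip().split(".")[0])


def same_major(driver_version, browser_version):
    """True when the driver and the browser share the same major version."""
    return major_version(driver_version) == major_version(browser_version)


# The versions reported in the answer above.
print(same_major("108.0.5359.71", "108.0.5359.125"))  # prints True
```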
I'm trying to write several tests using selenium, but I'm seeing the following strange behavior.
When I run the tests like this:
from selenium.webdriver import Firefox, FirefoxOptions
from selenium.webdriver.firefox.service import Service
options = FirefoxOptions()
service = Service()
brow = Firefox(service=service, options=options)
brow.execute("get", {'url': 'https://python.org'})
I get the result I expected, the python.org website is opened in Firefox browser.
But if I make a mistake in URL, I'm getting the following error:
from selenium.webdriver import Firefox, FirefoxOptions
from selenium.webdriver.firefox.service import Service
options = FirefoxOptions()
service = Service()
brow = Firefox(service=service, options=options)
brow.execute("get", {'url': 'qwerty'})
selenium.common.exceptions.InvalidArgumentException: Message: Malformed URL: URL constructor: qwerty is not a valid URL.
Stacktrace:
WebDriverError@chrome://remote/content/shared/webdriver/Errors.jsm:186:5
InvalidArgumentError@chrome://remote/content/shared/webdriver/Errors.jsm:315:5
GeckoDriver.prototype.navigateTo@chrome://remote/content/marionette/driver.js:804:11
I just want to understand why I see WebDriverError@chrome here, and not WebDriverError@firefox or something like that.
Is this a bug, or am I doing something wrong?
These error messages...
WebDriverError@chrome://remote/content/shared/webdriver/Errors.jsm:186:5
InvalidArgumentError@chrome://remote/content/shared/webdriver/Errors.jsm:315:5
GeckoDriver.prototype.navigateTo@chrome://remote/content/marionette/driver.js:804:11
containing the phrase @chrome may give the impression of strange behavior when using the GeckoDriver and Firefox combination.
However, as per @AutomatedTester's comment in the GitHub discussion Selenium 3.4.0-GeckoDriver 0.17.0 : GeckoDriver producing logs through Chromium/Chrome modules #787:
These errors are nothing to worry about. Mozilla uses different open source projects to build Firefox for different reasons. It showing Chrome errors means nothing in the big picture.
So you can ignore them safely. The chrome:// prefix is Firefox's own internal URL scheme for its user interface and privileged code; it does not refer to the Google Chrome browser.
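The exception itself comes from passing a string that is not an absolute URL. Below is a sketch of a guard that rejects such strings before they reach the driver, using only the standard library. `is_absolute_url` and `safe_get` are hypothetical helper names, the guard calls the public `driver.get(url)` method, and the check is deliberately stricter than the browser (it rejects `file:` URLs without a host, for example).

```python
from urllib.parse import urlsplit


def is_absolute_url(url):
    """True if the string has both a scheme and a network location."""
    parts = urlsplit(url)
    return bool(parts.scheme and parts.netloc)


def safe_get(driver, url):
    """Navigate only when the URL looks absolute; otherwise fail early."""
    if not is_absolute_url(url):
        raise ValueError(f"not an absolute URL: {url!r}")
    driver.get(url)
```

With this in place, `safe_get(brow, 'https://python.org')` navigates as before, while `safe_get(brow, 'qwerty')` raises a ValueError in Python instead of an InvalidArgumentException with a browser-side stack trace.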
I'm trying to open a Firefox browser with undetected_chromedriver.
But only getting a default Firefox browser instead of getting the url I provided.
What did I miss or do wrong?
Here is the code I made so far.
import undetected_chromedriver as uc
import time
if __name__ == '__main__':
    driver_path = "C:/Users/jay/Desktop/py/geckodriver.exe"
    Firefox_path = "C:/Program Files/Mozilla Firefox/firefox.exe"
    option = uc.ChromeOptions()
    option.binary_location = Firefox_path
    driver = uc.Chrome(executable_path=driver_path, options=option)
    driver.get('https://google.com')
    time.sleep(10)
I'd appreciate it if you could help me with this.
undetected_chromedriver is ONLY for chromedriver.
It modifies values directly inside the binary file chromedriver.exe, and it doesn't know how to modify values inside geckodriver.exe.
See also the GitHub repo undetected-chromedriver.
It says:
"Works ... on .... Chromium based browsers".
"Automatically downloads the driver binary and patches it."
This means it automatically uses chromedriver.exe and can't use geckodriver.exe at all. And chromedriver.exe doesn't know how to communicate with Firefox, so it can't open the page.
You can use it only with browsers built on the Chromium engine, such as Brave and maybe Opera or Microsoft Edge (but I didn't test those).
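Since undetected_chromedriver cannot drive Firefox, one option is plain Selenium with geckodriver. This is a minimal sketch, assuming Selenium 4 is installed and reusing the two paths from the question; `open_in_firefox` is a hypothetical helper name, and none of the patching that undetected_chromedriver performs applies here.

```python
def open_in_firefox(url, driver_path, firefox_path):
    """Start Firefox through geckodriver and load the given URL."""
    # Imported inside the function so this module can be loaded
    # even where Selenium is not installed.
    from selenium.webdriver import Firefox
    from selenium.webdriver.firefox.options import Options
    from selenium.webdriver.firefox.service import Service

    options = Options()
    options.binary_location = firefox_path  # the Firefox executable
    service = Service(executable_path=driver_path)  # the geckodriver executable
    driver = Firefox(service=service, options=options)
    driver.get(url)
    return driver
```

It would be called as `open_in_firefox('https://google.com', driver_path, Firefox_path)` with the variables from the question; call `quit()` on the returned driver when finished.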
I'm relatively new to coding and Python. I'm trying to automate logging into LinkedIn to send messages to my connections, and I'm using Selenium WebDriver for this. I haven't been able to log in yet with the automated process because I'm getting the error: dict object has no attribute send_keys.
I checked, and in this code 'username' is a dict, which is why the error says it has no attribute 'send_keys'. I understand what the message says, that the attribute does not exist, but I don't know how to fix it. I'd also like to ask whether I can name the variable 'username' anything I like. I know calling it username is probably best, but I'm asking for my own understanding.
The following code is what I have so far. I know it's not complete, but I like to work through issues one line at a time.
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.common.keys import Keys
from selenium.webdriver.chrome.service import Service
import time
s = Service("/usr/local/bin/chromedriver")
driver = webdriver.Chrome(service=s)
driver.get("https://linkedin.com/login")
time.sleep(2)
username = driver.find_element(By.ID,"username")
username.send_keys("my email address goes here")
I'm also attaching an image so it can be seen what part of the LinkedIn page and tags I'm using to try to log in.
Linkedin inspect element code on signing page
I hope I haven't left anything out, I tried to be as descriptive as possible.
Thanks in advance!
This is a bug!
The method webdriver.find_element() is supposed to return a webdriver.remote.webelement.WebElement object, not a dictionary.
Hence, this behaviour is most likely a bug, as documented here, and not a coding error on your part.
You might be using an old version of Chromium in combination with the newer Selenium 4.0.
How to fix it
Option A — Software Update.
Make sure you have the latest version installed for your web browser, web driver and selenium.
Option B — Code Patch.
In case you can't update (I had the problem on my Raspberry Pi, where I can't update Chromium because it is no longer supported):
You have to activate the w3c option for your webdriver.
import time
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.common.keys import Keys
from selenium.webdriver.chrome.service import Service
from selenium.webdriver.chrome.options import Options # [!]
s = Service("/usr/local/bin/chromedriver")
opts = Options() # [!]
opts.add_experimental_option('w3c', True) # [!]
driver = webdriver.Chrome(service=s, options=opts) # [!]
driver.get("https://linkedin.com/login")
time.sleep(2)
username = driver.find_element(By.ID,"username")
username.send_keys("my email address goes here")
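One way to confirm this diagnosis is to look at the dict that came back. As far as I know, drivers speaking the legacy JSON Wire protocol return an element reference under the key `ELEMENT`, while W3C-compliant drivers use the key `element-6066-11e4-a52e-4f735466cecf`, which is the form Selenium 4 turns into a WebElement. The sketch below is a hypothetical debugging helper (not part of Selenium) that classifies whatever find_element returned:

```python
W3C_ELEMENT_KEY = "element-6066-11e4-a52e-4f735466cecf"
LEGACY_ELEMENT_KEY = "ELEMENT"


def classify_find_result(found):
    """Describe what find_element() returned, for debugging purposes."""
    if not isinstance(found, dict):
        return "element object"
    if W3C_ELEMENT_KEY in found:
        return "raw W3C reference"
    if LEGACY_ELEMENT_KEY in found:
        return "legacy (non-W3C) reference"
    return "unrecognized dict"
```

If `classify_find_result(username)` reports a legacy reference, the driver is not answering in W3C mode, which is consistent with the two fixes described above.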
I'm trying to run a script that runs several tests using the Selenium Firefox webdriver.
It works flawlessly on a local machine, but fails miserably when running under xvfb.
The machine is a CentOS release 6.8 (Final)
Firefox version 45.6.0
I'm using Python/Marionette
The command is similar to this:
xvfb-run --server-args="-screen 0, 1920x1080x24" MyProgram
Running it this way, I get several errors related to pages not loading.
So I took a few screenshots, and all I see is Firefox's "Unable to connect" screen.
At first I thought it could be proxy related... I was not explicitly disabling the proxy, although a simple "wget" worked as expected.
So I then forced the Firefox preference in the code so that it definitely doesn't use the proxy:
profile = webdriver.FirefoxProfile()
profile.set_preference("network.proxy.type", 0)
Same result.
So I googled for similar situations and found some answers suggesting that I add the display number to the command line.
So I changed the command line to this:
export DISPLAY=:1
xvfb-run --server-args=":1 -screen 0, 1920x1080x24" MyProgram
Then I got a different error, but still not working:
ERROR: WebDriverException: connection refused Traceback (most recent call last):
I have also tried to log more information by adding the -e parameter to xvfb-run, but all I get is an empty file.
Any idea what else I can try to make it work?
* UPDATE *
Here's a small script to reproduce the issue
from pyvirtualdisplay import Display
from selenium import webdriver
from selenium.webdriver.common.desired_capabilities import DesiredCapabilities
from selenium.webdriver.common.proxy import *
display = Display(visible=0, size=(1920, 1080))
display.start()
profile = webdriver.FirefoxProfile()
profile.set_preference("network.http.phishy-userpass-length", 255)
profile.set_preference("network.proxy.type", 0)
capabilities = None
# Marionette not necessary as it's Firefox 45
# capabilities = DesiredCapabilities.FIREFOX
# capabilities["marionette"] = True
print("Getting webdriver...")
browser = webdriver.Firefox(firefox_profile=profile, capabilities=capabilities)
print("Requesting URL...")
browser.get('https://www.google.com')
print("TITLE:", browser.title)
browser.quit()
display.stop()
The output:
Getting webdriver...
Requesting URL...
TITLE: Problem loading page