Selenium WebDriverException 'chromedriver.exe' needs to be in PATH - python

I'm trying to use selenium for a python web scraper but when I try to run the program I get the following error:
/usr/local/bin/python3 /Users/xxx/Documents/Python/hello.py
Traceback (most recent call last):
File "/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/site-packages/selenium/webdriver/common/service.py", line 72, in start
self.process = subprocess.Popen(cmd, env=self.env,
File "/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/subprocess.py", line 854, in __init__
self._execute_child(args, executable, preexec_fn, close_fds,
File "/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/subprocess.py", line 1702, in _execute_child
raise child_exception_type(errno_num, err_msg, err_filename)
FileNotFoundError: [Errno 2] No such file or directory: '/Users/xxx/Documents/Python/chromedriver.exe'
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/Users/xxx/Documents/Python/hello.py", line 9, in <module>
wd = webdriver.Chrome(executable_path=DRIVER_PATH)
File "/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/site-packages/selenium/webdriver/chrome/webdriver.py", line 73, in __init__
self.service.start()
File "/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/site-packages/selenium/webdriver/common/service.py", line 81, in start
raise WebDriverException(
selenium.common.exceptions.WebDriverException: Message: 'chromedriver.exe' executable needs to be in PATH. Please see https://sites.google.com/a/chromium.org/chromedriver/home
Here is the python code:
from urllib.request import urlopen
from bs4 import BeautifulSoup
import re
from selenium import webdriver
DRIVER_PATH = '/Users/xxx/Documents/Python/chromedriver.exe'
wd = webdriver.Chrome(executable_path=DRIVER_PATH)
I think the problem is that I'm not specifying the file path in the variable DRIVER_PATH properly but I'm not sure
I am using a Mac

On Windows, DRIVER_PATH must include the drive root, which is usually C:\ (on a Mac, as in the question, the path already starts at the root /):
DRIVER_PATH = 'C:/Users/xxx/Documents/Python/chromedriver.exe'
Alternatively, you can follow this tutorial to add the folder containing chromedriver.exe (usually the chromedriver_win32 folder) to your Path environment variable:
https://docs.telerik.com/teststudio/features/test-runners/add-path-environment-variables

I would try this out (Just adding the 'r'):
wd = webdriver.Chrome(executable_path=r'/Users/xxx/Documents/Python/chromedriver.exe')
If you think the file path is the problem, check whether the file actually exists:
import os.path
print(os.path.exists(DRIVER_PATH))
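Also, BeautifulSoup works well with urllib2 (urllib.request on Python 3):
https://www.pythonforbeginners.com/beautifulsoup/beautifulsoup-4-python
from urllib.request import urlopen  # urllib2 became urllib.request in Python 3
from bs4 import BeautifulSoup
url = "https://www.URL.com"
content = urlopen(url).read()
soup = BeautifulSoup(content, "html.parser")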

You have a mistake in the name of the file.
"chromedriver.exe" is for Windows.
If you use macOS and chromedriver for Mac, then the file name should be "chromedriver" without ".exe".
I had the same problem, but this solved it.
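As a rough illustration of the point above, the driver file name can be derived from the platform instead of being hard-coded. This is only a sketch: the helper name driver_filename and the idea of passing the platform name in as an argument are assumptions made for the example and are not part of Selenium.

```python
import platform


def driver_filename(system=None):
    """Return the expected chromedriver file name for a platform.

    `system` is a value like the ones platform.system() returns
    ("Windows", "Darwin", "Linux"); it defaults to the current platform.
    """
    if system is None:
        system = platform.system()
    # Only the Windows build of chromedriver carries the ".exe" suffix.
    return "chromedriver.exe" if system == "Windows" else "chromedriver"


print(driver_filename("Darwin"))   # chromedriver
print(driver_filename("Windows"))  # chromedriver.exe
```

The returned name would then be joined onto whatever folder the driver was actually saved in.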

Related

Trying to do web scraping with Python, but it doesn't work well

I'm doing a school project and I am trying to scrape data from websites. Basically I'm following a tutorial in edureka - https://www.edureka.co/blog/web-scraping-with-python/#demo
The sample code is like this
from selenium import webdriver
from bs4 import BeautifulSoup
import pandas as pd
driver = webdriver.Chrome("/usr/lib/chromium-browser/chromedriver")
products=[] #List to store name of the product
prices=[] #List to store price of the product
ratings=[] #List to store rating of the product
driver.get("""https://www.flipkart.com/laptops/~buyback-guarantee-on-laptops-/pr?sid=6bo%2Cb5g&uniq""")
content = driver.page_source
soup = BeautifulSoup(content)
for a in soup.findAll('a',href=True, attrs={'class':'_31qSD5'}):
    name=a.find('div', attrs={'class':'_3wU53n'})
    price=a.find('div', attrs={'class':'_1vC4OE _2rQ-NK'})
    rating=a.find('div', attrs={'class':'hGSR34 _2beYZw'})
    products.append(name.text)
    prices.append(price.text)
    ratings.append(rating.text)
df = pd.DataFrame({'Product Name':products,'Price':prices,'Rating':ratings})
df.to_csv('products.csv', index=False, encoding='utf-8')
I simply copied and pasted the sample code into Python to see how it works, and this is what I got:
PS D:\COSC2625_Team_Blue> & C:/Users/meowg/AppData/Local/Programs/Python/Python310/python.exe d:/COSC2625_Team_Blue/test.py
d:\COSC2625_Team_Blue\test.py:5: DeprecationWarning: executable_path has been deprecated, please pass in a Service object
driver = webdriver.Chrome("/usr/lib/chromium-browser/chromedriver")
Traceback (most recent call last):
File "C:\Users\meowg\AppData\Local\Programs\Python\Python310\lib\site-packages\selenium\webdriver\common\service.py", line 71, in start
self.process = subprocess.Popen(cmd, env=self.env,
File "C:\Users\meowg\AppData\Local\Programs\Python\Python310\lib\subprocess.py", line 969, in __init__
self._execute_child(args, executable, preexec_fn, close_fds,
File "C:\Users\meowg\AppData\Local\Programs\Python\Python310\lib\subprocess.py", line 1438, in _execute_child
hp, ht, pid, tid = _winapi.CreateProcess(executable, args,
FileNotFoundError: [WinError 2] The system cannot find the file specified
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "d:\COSC2625_Team_Blue\test.py", line 5, in <module>
driver = webdriver.Chrome("/usr/lib/chromium-browser/chromedriver")
File "C:\Users\meowg\AppData\Local\Programs\Python\Python310\lib\site-packages\selenium\webdriver\chrome\webdriver.py", line 69, in __init__
super().__init__(DesiredCapabilities.CHROME['browserName'], "goog",
File "C:\Users\meowg\AppData\Local\Programs\Python\Python310\lib\site-packages\selenium\webdriver\chromium\webdriver.py", line 89, in __init__
self.service.start()
File "C:\Users\meowg\AppData\Local\Programs\Python\Python310\lib\site-packages\selenium\webdriver\common\service.py", line 81, in start
raise WebDriverException(
selenium.common.exceptions.WebDriverException: Message: 'chromedriver' executable needs to be in PATH. Please see https://chromedriver.chromium.org/home
Does anyone know what went wrong? I have no idea what happened.
It looks like you didn't download the chromedriver file that the tutorial expects at /usr/lib/chromium-browser/chromedriver. That is a Linux path, and your traceback shows you are running on Windows, so the file cannot be found there. We can't really help beyond that: you have to download chromedriver and point the script at wherever you saved it.
I would recommend Python Playwright instead of Selenium, as it is a more modern library with a slightly gentler learning curve, in my opinion, but that's just a recommendation.
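If you stay with Selenium, the DeprecationWarning in the question asks for a Service object instead of a bare path. A minimal sketch of that style, assuming Selenium 4 is installed; the helper name make_chrome and the early file check are additions for illustration, not something the tutorial or Selenium prescribes:

```python
import os


def make_chrome(driver_path):
    """Start Chrome through an explicit chromedriver path (Selenium 4 style)."""
    # Fail early with a clear message if the file is not where we think it is.
    if not os.path.isfile(driver_path):
        raise FileNotFoundError(f"chromedriver not found at {driver_path!r}")

    # Imported here so the path check above runs even without Selenium installed.
    from selenium import webdriver
    from selenium.webdriver.chrome.service import Service

    return webdriver.Chrome(service=Service(executable_path=driver_path))
```

You would call make_chrome with the location where you actually saved the downloaded driver; on Windows that file is named chromedriver.exe.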

Using Selenium webdriver.Chrome results in PermissionError: [WinError 5] Access is denied [duplicate]

I'm trying to web scrape reviews from (https://boxes.mysubscriptionaddiction.com/box/boxycharm?ratings=true#review-update-create) but when I run the code:
from selenium import webdriver
chrome_path = r"C:\Users\Sara Jitkresorn\AppData\Local\Programs\Python\Python37\Scripts"
driver = webdriver.Chrome(chrome_path)
driver.get("https://boxes.mysubscriptionaddiction.com/box/boxycharm?ratings=true#review-update-create")
review = driver.find_element_by_class_name("comment-body")
for post in review:
    print(post.text)
I got the following error(s). What do I need to do to fix this?
"C:\Users\Sara Jitkresorn\AppData\Local\Programs\Python\Python37\python.exe" "C:/Users/Sara Jitkresorn/PycharmProjects/untitled/venv/SubsAddict.py"
Traceback (most recent call last):
File "C:\Users\Sara Jitkresorn\AppData\Local\Programs\Python\Python37\lib\site-packages\selenium\webdriver\common\service.py", line 76, in start
stdin=PIPE)
File "C:\Users\Sara Jitkresorn\AppData\Local\Programs\Python\Python37\lib\subprocess.py", line 756, in __init__
restore_signals, start_new_session)
File "C:\Users\Sara Jitkresorn\AppData\Local\Programs\Python\Python37\lib\subprocess.py", line 1155, in _execute_child
startupinfo)
PermissionError: [WinError 5] Access is denied
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "C:/Users/Sara Jitkresorn/PycharmProjects/untitled/venv/SubsAddict.py", line 3, in <module>
driver = webdriver.Chrome(chrome_path)
File "C:\Users\Sara Jitkresorn\AppData\Local\Programs\Python\Python37\lib\site-packages\selenium\webdriver\chrome\webdriver.py", line 73, in __init__
self.service.start()
File "C:\Users\Sara Jitkresorn\AppData\Local\Programs\Python\Python37\lib\site-packages\selenium\webdriver\common\service.py", line 88, in start
os.path.basename(self.path), self.start_error_message)
selenium.common.exceptions.WebDriverException: Message: 'Scripts' executable may have wrong permissions. Please see https://sites.google.com/a/chromium.org/chromedriver/home
You should replace all "\" with "/".
Your chrome_path is wrong: it points to the Scripts folder, not to an executable, which is why the error message names 'Scripts'. You need the full path to the driver executable itself (chromedriver.exe), including the file name.
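The error message in the question names 'Scripts', the last component of chrome_path, which suggests the path ends at a folder rather than at a file. A small check like the following can tell these cases apart before Selenium is involved. It is a sketch only; describe_driver_path is a made-up helper name, and the executable-bit test is mostly meaningful on macOS and Linux.

```python
import os
import tempfile


def describe_driver_path(path):
    """Return a short diagnosis of a path meant to point at a driver executable."""
    if not os.path.exists(path):
        return "missing"
    if os.path.isdir(path):
        return "directory, not an executable"
    if not os.access(path, os.X_OK):
        return "file is not executable"
    return "ok"


# Any folder reproduces the situation; the temp directory is used as a stand-in.
print(describe_driver_path(tempfile.gettempdir()))  # directory, not an executable
```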

selenium - unable to add browsec extension to firefox profile

I want to use Selenium to scrape a website. I can't access the website through my own internet connection, so I need the Browsec add-on for Firefox.
I am unable to launch Firefox from Selenium with the add-on enabled.
Here is what I have tried:
import selenium
from selenium import webdriver
url = "http://url"
profile = webdriver.FirefoxProfile()
profile.add_extension('browsec#browsec.com.xpi')
#profile.add_extension("C:\Users\urs\AppData\Roaming\Mozilla\Firefox\Profiles\abc.default\extensions\browsec#browsec.com.xpi")
driver = webdriver.Firefox(firefox_profile=profile)
if __name__ == "__main__":
    driver.get(url)
    driver.wait(5)
    driver.quit()
I have tried putting the extension in the same directory where my script is and using the following
profile.add_extension('browsec#browsec.com.xpi')
which gives me this error when I run:
Traceback (most recent call last):
File "C:\Python36\lib\site-packages\selenium\webdriver\firefox\firefox_profile.py", line 346, in _addon_details
with open(os.path.join(addon_path, 'install.rdf'), 'r') as f:
FileNotFoundError: [Errno 2] No such file or directory: 'C:\Users\Usr\AppData\Local\Temp\tmp0hny31u3.browsec#browsec.com.xpi\install.rdf'
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "test.py", line 7, in <module>
profile.add_extension("browsec#browsec.com.xpi")
File "C:\Python36\lib\site-packages\selenium\webdriver\firefox\firefox_profile.py", line 95, in add_extension
self._install_extension(extension)
File "C:\Python36\lib\site-packages\selenium\webdriver\firefox\firefox_profile.py", line 274, in _install_extension
addon_details = self._addon_details(addon)
File "C:\Python36\lib\site-packages\selenium\webdriver\firefox\firefox_profile.py", line 351, in _addon_details
raise AddonFormatError(str(e), sys.exc_info()[2])
selenium.webdriver.firefox.firefox_profile.AddonFormatError: ("[Errno 2] No such file or directory: 'C:\\Users\\Usr\\AppData\\Local\\Temp\\tmp0hny31u3.browsec#browsec.com.xpi\\install.rdf'", )
I also tried giving the path to the extension:
profile.add_extension("C:\Users\urs\AppData\Roaming\Mozilla\Firefox\Profiles\abc.default\extensions\browsec#browsec.com.xpi")
And I ran into this error:
profile.add_extension("C:\Users\Hassan\AppData\Roaming\Mozilla\Firefox\Profiles\n5jwlj9l.default\extensions\browsec#browsec.com.xpi")
^
SyntaxError: (unicode error) 'unicodeescape' codec can't decode bytes in position 2-3: truncated \UXXXXXXXX escape
Formatting the path string like below doesn't help either.
profile.add_extension(r"C:\Users\urs\AppData\Roaming\Mozilla\Firefox\Profiles\abc.default\extensions\browsec#browsec.com.xpi")
I get the following:
Traceback (most recent call last):
File "test.py", line 7, in <module>
profile.add_extension(r"C:\Users\Hassan\AppData\Roaming\Mozilla\Firefox\Profiles\n5jwlj9l.default\extensions\browsec#browsec.com.xpi")
File "C:\Python36\lib\site-packages\selenium\webdriver\firefox\firefox_profile.py", line 95, in add_extension
self._install_extension(extension)
File "C:\Python36\lib\site-packages\selenium\webdriver\firefox\firefox_profile.py", line 274, in _install_extension
addon_details = self._addon_details(addon)
File "C:\Python36\lib\site-packages\selenium\webdriver\firefox\firefox_profile.py", line 351, in _addon_details
raise AddonFormatError(str(e), sys.exc_info()[2])
selenium.webdriver.firefox.firefox_profile.AddonFormatError: ("[Errno 2] No such file or directory: 'C:\\Users\\usr\\AppData\\Local\\Temp\\tmp1he0fym_.browsec#browsec.com.xpi\\install.rdf'", )
How do I configure selenium to run firefox with browsec enabled by default?
I found this article rather helpful.
Instead of adding the extension to the profile, you install it after the browser has been created:
from selenium import webdriver
driver = webdriver.Firefox()
# This installs adblock plus
driver.install_addon("/home/your_username/coding/Project/seleniumTest/adblock.xpi", temporary=True)
driver.get('https://www.stackoverflow.com')
Be sure to add the .xpi to your project folder!
You can try creating a profile in Firefox itself. On Windows, open Run and type
"firefox.exe -P"
This opens the profile manager. Create a new profile, start Firefox with that profile, and add the plugins. Then use that same profile in your code. This has sometimes worked for me.
Sorry for my English))
Most likely you are using a newer version of Firefox (Quantum, version 57 and later). In these versions, extension metadata is stored in manifest.json rather than in install.rdf. Selenium does not know this yet (not in version 3.11; it only learns it in 3.14), so when it tries to load an extension it still looks for install.rdf out of habit.
Here the author wrote a class that slightly changes how the extension is loaded: when install.rdf cannot be read, Selenium looks for the metadata in manifest.json instead.
What you need to do:
# Add Import
import json
import os
import sys
from selenium.webdriver.firefox.firefox_profile import AddonFormatError
# Add class
class FirefoxProfileWithWebExtensionSupport(webdriver.FirefoxProfile):
    def _addon_details(self, addon_path):
        try:
            return super()._addon_details(addon_path)
        except AddonFormatError:
            try:
                with open(os.path.join(addon_path, 'manifest.json'), 'r') as f:
                    manifest = json.load(f)
                    return {
                        'id': manifest['applications']['gecko']['id'],
                        'version': manifest['version'],
                        'name': manifest['name'],
                        'unpack': False,
                    }
            except (IOError, KeyError) as e:
                raise AddonFormatError(str(e), sys.exc_info()[2])
# Declare Firefox_profile written class
profile = FirefoxProfileWithWebExtensionSupport()
Further as usual)))
Good luck)))
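To see which of the two metadata files a given extension actually ships, you can look inside the .xpi, which is an ordinary zip archive. The sketch below is illustrative only: addon_metadata_file is a made-up helper, and the demo builds a throwaway archive rather than reading a real extension.

```python
import os
import tempfile
import zipfile


def addon_metadata_file(xpi_path):
    """Report which metadata file an .xpi archive carries, if any."""
    with zipfile.ZipFile(xpi_path) as xpi:
        names = set(xpi.namelist())
    if "manifest.json" in names:
        return "manifest.json"  # WebExtension layout
    if "install.rdf" in names:
        return "install.rdf"    # legacy layout
    return None


# Demo with a throwaway archive standing in for a real extension.
with tempfile.TemporaryDirectory() as tmp:
    sample = os.path.join(tmp, "sample.xpi")
    with zipfile.ZipFile(sample, "w") as xpi:
        xpi.writestr("manifest.json", "{}")
    print(addon_metadata_file(sample))  # manifest.json
```

If the function reports manifest.json, the add-on is a WebExtension and the install.rdf lookup described above will fail.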

Traceback Error while using selenium with python beautifulsoup library

I'm using this code to scrape some data from https://website.grader.com/results/www.dubizzle.com. Because the script that produces the tags I want to extract only finishes about 15 seconds after the page loads, someone recommended Selenium so that I could introduce a delay in the code. Hence I use this code.
The code is as below:
#!/usr/bin/python
import urllib
import time
from selenium import webdriver
from selenium.webdriver.support.ui import WebDriverWait
from bs4 import BeautifulSoup
from dateutil.parser import parse
from datetime import timedelta
import MySQLdb
import re
import pdb
import sys
import string
driver = webdriver.Firefox()
driver.get('https://website.grader.com/results/dubizzle.com')
time.sleep(25)
html = driver.page_source
soup = BeautifulSoup(html)
# print soup
Sizeofweb=""
try:
    Sizeofweb= soup.find('span', {'data-reactid': ".0.0.3.0.0.3.$0.1.1.0"}).text
    print Sizeofweb.get_text().encode("utf-8")
except StandardError as e:
    converted_date="Error was {0}".format(e)
    print converted_date
The part of the html which i am extracting is as below
Snap: https://www.dropbox.com/s/7dwbaiyizwa36m6/5.PNG?dl=0
<div class="result-value" data-reactid=".0.0.3.0.0.3.$0.1.1">
<span data-reactid=".0.0.3.0.0.3.$0.1.1.0">1.1</span>
<span class="result-value-unit" data-reactid=".0.0.3.0.0.3.$0.1.1.1">MB</span>
</div>
I installed geckodriver by downloading it from here, extracting it to the /home directory, and then adding it to the path with export PATH=$PATH:/home/geckodriver, as recommended by someone named #Ahn Smith here.
Now when i run the program, it gives this error
Traceback (most recent call last):
File "ahmed.py", line 17, in <module>
driver = webdriver.Firefox()
File "/usr/local/lib/python2.7/dist-packages/selenium/webdriver/firefox/webdriver.py", line 140, in __init__
self.service.start()
File "/usr/local/lib/python2.7/dist-packages/selenium/webdriver/common/service.py", line 74, in start
stdout=self.log_file, stderr=self.log_file)
File "/usr/lib/python2.7/subprocess.py", line 710, in __init__
errread, errwrite)
File "/usr/lib/python2.7/subprocess.py", line 1327, in _execute_child
raise child_exception
OSError: [Errno 20] Not a directory
There are two ways to point Selenium to the appropriate webdriver. You can pass it as a parameter:
driver = webdriver.Firefox(executable_path='/path/to/geckodriver')
Or you can add the folder that contains it to the PATH shell variable:
$ export PATH=$PATH:/path/to/
I think your problem is that you added the geckodriver file itself to PATH rather than the folder containing it.
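The difference can be shown with a simplified imitation of a PATH search: each entry is treated as a folder and the program name is appended to it. This is a sketch, not the shell's real lookup (it ignores the executable bit), and find_on_path and the file name fakedriver are made up for the demo.

```python
import os
import tempfile


def find_on_path(name, path_value):
    """Search a PATH-style string for `name`, treating each entry as a folder."""
    for entry in path_value.split(os.pathsep):
        candidate = os.path.join(entry, name)
        if os.path.isfile(candidate):
            return candidate
    return None


with tempfile.TemporaryDirectory() as folder:
    driver = os.path.join(folder, "fakedriver")
    with open(driver, "w") as f:
        f.write("stand-in for a real driver binary\n")

    # PATH entry is the file itself: the search looks for fakedriver/fakedriver.
    print(find_on_path("fakedriver", driver))  # None
    # PATH entry is the containing folder: the file is found.
    print(find_on_path("fakedriver", folder) == driver)  # True
```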

Selenium - 'Service' object has no attribute 'process'

I am attempting to run a simple program on an Ubuntu 16.04 instance using Python 3.5. The program is below:
from bs4 import BeautifulSoup
from selenium import webdriver
driver = webdriver.PhantomJS("p/phantomjs")
driver.get("http://www.bbc.co.uk")
s = BeautifulSoup(driver.page_source, "lxml")
print(s.findAll("a"))
try:
    driver.close()
except AttributeError:
    pass
All the modules are installed correctly. However, when I run the program, I receive the following errors:
Traceback (most recent call last):
File "t.py", line 4, in <module>
driver = webdriver.PhantomJS("p/phantomjs")
File "/usr/local/lib/python3.5/dist-packages/selenium/webdriver/phantomjs/webdriver.py", line 52, in __init__
self.service.start()
File "/usr/local/lib/python3.5/dist-packages/selenium/webdriver/common/service.py", line 64, in start
stdout=self.log_file, stderr=self.log_file)
File "/usr/lib/python3.5/subprocess.py", line 947, in __init__
restore_signals, start_new_session)
File "/usr/lib/python3.5/subprocess.py", line 1551, in _execute_child
raise child_exception_type(errno_num, err_msg)
OSError: [Errno 8] Exec format error
Exception ignored in: <bound method Service.__del__ of <selenium.webdriver.phantomjs.service.Service object at 0x7fb05cd964a8>>
Traceback (most recent call last):
File "/usr/local/lib/python3.5/dist-packages/selenium/webdriver/common/service.py", line 163, in __del__
self.stop()
File "/usr/local/lib/python3.5/dist-packages/selenium/webdriver/common/service.py", line 135, in stop
if self.process is None:
AttributeError: 'Service' object has no attribute 'process'
It seems as though it is an issue with Selenium rather than with PhantomJS. However, I have no idea how to make the program work properly.
In other questions similar to this, the issue seems to be with closing the headless instance. However, this error is received as soon as I try to instantiate PhantomJS.
How can this be fixed?
If the p folder (as you've mentioned) is located in the same directory as your script, then you might need to start your code with something like
from bs4 import BeautifulSoup
from selenium import webdriver
import os
# Build an absolute path so the result does not depend on the current working directory
path_to_phantom_js = os.path.join(os.path.dirname(os.path.abspath(__file__)), 'p', 'phantomjs')
driver = webdriver.PhantomJS(path_to_phantom_js)
P.S. If that doesn't work, tell me the output of print(path_to_phantom_js)
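The same idea can be written with pathlib, which may be easier to read. This is a sketch under the assumption that the binary sits in a p folder next to the script; path_next_to is a hypothetical helper name, and the literal t.py only stands in for __file__.

```python
from pathlib import Path


def path_next_to(script_file, *parts):
    """Build an absolute path relative to the folder that holds `script_file`."""
    return Path(script_file).resolve().parent.joinpath(*parts)


# In a real script you would pass __file__; a literal is used here for the demo.
print(path_next_to("t.py", "p", "phantomjs"))
```

Because the path is resolved to an absolute one first, the result stays the same no matter which directory the script is launched from.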
