I am following this tutorial on web scraping: https://www.linkedin.com/pulse/how-easy-scraping-data-from-linkedin-profiles-david-craven/. The Python script is generating errors. I have already tried adding the directory to my PATH, and it shows up when I echo the PATH, but it now contains "/Users/owner/Users/owner" where there should be only one "Users/owner".
I'm using bash on macOS High Sierra. I'm a data science major, so DevOps is a challenge for me, as is learning how to post code on StackOverflow, but I'm trying to document my steps so this is easier to troubleshoot.
I pip installed selenium
I downloaded chromedriver to the directory for my webscraping script file and double clicked it to run
I thought I added the directory to my PATH with 'export PATH=$PATH:~opt/bin:~/Users/owner/sbox/test/pandas_sqlite_dbase/chromedriver' which are the directions I found from http://osxdaily.com/2014/08/14/add-new-path-to-path-command-line/
I updated PIP
The directory I want to run the script from is '/Users/owner/sbox/test/pandas_sqlite_dbase'
Another SO post, "Can a website detect when you are using selenium with chromedriver?", discussed how chromedriver with selenium is now automatically detected and disabled. So am I trying to scrape with an outdated code base?
I can post my whole PATH or give other info.
from selenium import webdriver
driver = webdriver.Chrome('~/Users/owner/sbox/test/pandas_sqlite_dbase/googlechrome')
driver.get('https://www.linkedin.com')
Now I am getting a traceback error
Traceback (most recent call last):
File "/Users/owner/anaconda3/lib/python3.7/site-packages/selenium/webdriver/common/service.py", line 76, in start
stdin=PIPE)
File "/Users/owner/anaconda3/lib/python3.7/subprocess.py", line 775, in __init__
restore_signals, start_new_session)
File "/Users/owner/anaconda3/lib/python3.7/subprocess.py", line 1522, in _execute_child
raise child_exception_type(errno_num, err_msg, err_filename)
FileNotFoundError: [Errno 2] No such file or directory: '~/Users/owner/sbox/test/pandas_sqlite_dbase/googlechrome': '~/Users/owner/sbox/test/pandas_sqlite_dbase/googlechrome'
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/Users/owner/sbox/test/pandas_sqlite_dbase/scraping_tutorial.py", line 7, in <module>
driver = webdriver.Chrome('~/Users/owner/sbox/test/pandas_sqlite_dbase/googlechrome')
File "/Users/owner/anaconda3/lib/python3.7/site-packages/selenium/webdriver/chrome/webdriver.py", line 73, in __init__
self.service.start()
File "/Users/owner/anaconda3/lib/python3.7/site-packages/selenium/webdriver/common/service.py", line 83, in start
os.path.basename(self.path), self.start_error_message)
selenium.common.exceptions.WebDriverException: Message: 'googlechrome' executable needs to be in PATH. Please see https://sites.google.com/a/chromium.org/chromedriver/home
[Finished in 0.7s with exit code 1]
[shell_cmd: python -u "/Users/owner/sbox/test/pandas_sqlite_dbase/scraping_tutorial.py"]
[dir: /Users/owner/sbox/test/pandas_sqlite_dbase]
[path: /usr/bin:/bin:/usr/sbin:/sbin]
I would check what ~ actually refers to, since you seem to have misunderstood it. It is usually your home directory, which for your user is "/Users/owner". That is why you end up with "Users/owner/Users/owner" when you write "~/Users/owner/...".
To check this, you can run:
$>cd ~
$>pwd
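A minimal sketch of the same point from the Python side: the shell expands ~, but Python does not expand it inside a plain string, which is probably why the traceback in the question shows the literal '~/Users/...' path. The helper name resolve_driver_path is hypothetical, and the directory is the one given in the question.

```python
import os

def resolve_driver_path(path):
    """Expand a leading ~ and return an absolute path (hypothetical helper)."""
    return os.path.abspath(os.path.expanduser(path))

# '~' already stands for the home directory (e.g. /Users/owner),
# so repeating Users/owner after it doubles that part of the path.
wrong = resolve_driver_path("~/Users/owner/sbox/test/pandas_sqlite_dbase/chromedriver")
right = resolve_driver_path("~/sbox/test/pandas_sqlite_dbase/chromedriver")

print(wrong)
print(right)
```

The expanded string is what you would pass to webdriver.Chrome instead of a path that starts with ~.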
I am trying to learn how to use Selenium with Python, and I am following this video:
https://www.youtube.com/watch?v=Xjv1sY630Uc&ab_channel=TechWithTim
This is the code I have:
from selenium import webdriver
PATH = "/Users/fuadhafiz/Documents\chromedriver.exec"
driver = webdriver.Chrome(PATH)
driver.get("https://stackoverflow.com")
But this is what keeps coming up in the terminal (I am using VS Code on a Mac):
/Library/Frameworks/Python.framework/Versions/3.9/bin/python3 "/Users/fuadhafiz/Documents/Python Projects/Selenium Automation /Web Scraping (1)/web_scraping_attempt.py"
fuadhafiz#Fuads-iMac Web Scraping (1) % /Library/Frameworks/Python.framework/Versions/3.9/bin/python3 "/Users/fuadhafiz/Documents/Python Projects/Selenium Automation /Web Scraping (1)/web_scraping_attempt.py"
Traceback (most recent call last):
File "/Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/site-packages/selenium/webdriver/common/service.py", line 72, in start
self.process = subprocess.Popen(cmd, env=self.env,
File "/Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/subprocess.py", line 947, in __init__
self._execute_child(args, executable, preexec_fn, close_fds,
File "/Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/subprocess.py", line 1819, in _execute_child
raise child_exception_type(errno_num, err_msg, err_filename)
FileNotFoundError: [Errno 2] No such file or directory: '/Users/fuadhafiz/Documents\\chromedriver.exec'
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/Users/fuadhafiz/Documents/Python Projects/Selenium Automation /Web Scraping (1)/web_scraping_attempt.py", line 4, in <module>
driver = webdriver.Chrome(PATH)
File "/Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/site-packages/selenium/webdriver/chrome/webdriver.py", line 73, in __init__
self.service.start()
File "/Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/site-packages/selenium/webdriver/common/service.py", line 81, in start
raise WebDriverException(
selenium.common.exceptions.WebDriverException: Message: 'Documents\chromedriver.exec' executable needs to be in PATH. Please see https://sites.google.com/a/chromium.org/chromedriver/home
And in the "Problems" panel:
Anomalous backslash in string: '\c'. String constant might be missing an r prefix.
This is where the ChromeDriver is saved:
You were almost there. On macOS the ChromeDriver binary has no file extension, and path components are separated by forward slashes, not backslashes. So effectively your code block will be:
from selenium import webdriver
PATH = "/Users/fuadhafiz/Documents/chromedriver"
driver = webdriver.Chrome(PATH)
driver.get("https://stackoverflow.com")
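As a rough illustration of why the original path failed, the sketch below uses pathlib's PurePosixPath, which follows the macOS/Linux rules on any platform. A backslash is not a separator there, so it becomes part of the file name. The two paths are copied from the question and the answer above.

```python
from pathlib import PurePosixPath

bad = PurePosixPath("/Users/fuadhafiz/Documents\\chromedriver.exec")
good = PurePosixPath("/Users/fuadhafiz/Documents/chromedriver")

# The backslash is kept as part of the last path component.
print(bad.parent)   # /Users/fuadhafiz
print(bad.name)     # Documents\chromedriver.exec
print(good.parent)  # /Users/fuadhafiz/Documents
print(good.name)    # chromedriver
```

The name of the bad path matches the 'Documents\chromedriver.exec' that appears in the WebDriverException message.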
References
You can find a detailed relevant discussion in:
FileNotFoundError: [Errno 2] No such file or directory: 'geckodriver': 'geckodriver' with GeckoDriver and Python in MAC OS
Look at the error message: you are using the wrong file extension, .exec instead of .exe.
FileNotFoundError: [Errno 2] No such file or directory: '/Users/fuadhafiz/Documents\\chromedriver.exec'
Try this instead (with a forward slash before the file name, since the backslash is not a path separator on a Mac):
from selenium import webdriver
PATH = "/Users/fuadhafiz/Documents/chromedriver.exe"
driver = webdriver.Chrome(PATH)
driver.get("https://stackoverflow.com")
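I am also on a Mac. Installing chromedriver into a directory that is already on the default PATH saves the trouble of specifying the path each time:
brew install --cask chromedriver
(Older Homebrew versions used "brew cask install chromedriver".) Now you can simply call: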
driver = webdriver.Chrome()
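Before relying on the no-argument call, it may help to confirm that the driver really is on PATH. The sketch below uses only the standard library; find_driver is a hypothetical helper name, not part of Selenium.

```python
import shutil

def find_driver(name="chromedriver"):
    """Return the full path of `name` if it is on PATH, else None (hypothetical helper)."""
    return shutil.which(name)

location = find_driver()
if location is None:
    print("chromedriver is not on PATH; webdriver.Chrome() would likely fail")
else:
    print("chromedriver found at", location)
```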
I'm following a tutorial on using Selenium and I'm having trouble getting started. When I try to run the code below, I get the error below. I have seen other users with the same problem and tried their solutions, but they did not work.
These solutions include:
running PyCharm as administrator,
setting permissions to full access for all groups/usernames on subprocess.py, service.py, and site-packages (and pretty much every file/folder within).
from selenium import webdriver
driver = webdriver.Chrome(r"C:\Users\User\AppData\Local\Programs\Python\Python37-32\Lib\site-packages\selenium\webdriver\chrome")
driver.get("http://python.org")
Here is the full error message:
Traceback (most recent call last):
File "C:\Users\User\AppData\Local\Programs\Python\Python37-32\lib\site-packages\selenium\webdriver\common\service.py", line 76, in start
stdin=PIPE)
File "C:\Users\User\AppData\Local\Programs\Python\Python37-32\lib\subprocess.py", line 775, in __init__
restore_signals, start_new_session)
File "C:\Users\User\AppData\Local\Programs\Python\Python37-32\lib\subprocess.py", line 1178, in _execute_child
startupinfo)
PermissionError: [WinError 5] Access is denied
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "C:/Users/User/PycharmProjects/PythonProject/DataCollection", line 2, in <module>
driver = webdriver.Chrome(r"C:\Users\User\AppData\Local\Programs\Python\Python37-32\Lib\site-packages\selenium\webdriver\chrome")
File "C:\Users\User\AppData\Local\Programs\Python\Python37-32\lib\site-packages\selenium\webdriver\chrome\webdriver.py", line 73, in __init__
self.service.start()
File "C:\Users\User\AppData\Local\Programs\Python\Python37-32\lib\site-packages\selenium\webdriver\common\service.py", line 88, in start
os.path.basename(self.path), self.start_error_message)
selenium.common.exceptions.WebDriverException: Message: 'chrome' executable may have wrong permissions. Please see https://sites.google.com/a/chromium.org/chromedriver/home
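The path you pass has to point to the chromedriver executable itself, not to a folder. First, replace all \ with / (optional when you use a raw string),
and then add the executable filename to the end of the file location: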
driver = webdriver.Chrome(r'C:/Users/User/AppData/Local/Programs/Python/Python37-32/Lib/site-packages/selenium/webdriver/chrome/chromedriver.exe')
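A small sketch for checking what a path points to before handing it to Selenium. Passing a folder rather than a file is a likely cause of the "Access is denied" error in the question. The helper diagnose_driver_path is hypothetical and uses only the standard library.

```python
import os

def diagnose_driver_path(path):
    """Return a short description of what `path` points to (hypothetical helper)."""
    if os.path.isdir(path):
        return "directory"        # a folder cannot be launched as a program
    if not os.path.isfile(path):
        return "missing"          # nothing exists at this location
    if not os.access(path, os.X_OK):
        return "not executable"   # file exists but lacks execute permission
    return "ok"

# The current working directory is a folder, so this prints "directory".
print(diagnose_driver_path(os.getcwd()))
```

Only a result of "ok" means the path is worth passing to webdriver.Chrome.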
I did not know how to create an executable Python program before I asked here. Thankfully I received a fast answer and was able to convert my script to an executable program. The executable works perfectly, but only on my computer.
These are the two errors I am receiving. I feel like I need to modify the script so it can locate the chromedriver, but I am not sure where PyInstaller saved everything.
Exception in Tkinter callback
Traceback (most recent call last):
File "site-packages\selenium\webdriver\common\service.py", line 76, in start
File "subprocess.py", line 775, in __init__
File "subprocess.py", line 1178, in _execute_child
FileNotFoundError: [WinError 2] The system cannot find the file specified
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "tkinter\__init__.py", line 1705, in __call__
File "MarijuanaDoctors.py", line 25, in search
File "site-packages\selenium\webdriver\chrome\webdriver.py", line 68, in __init__
File "site-packages\selenium\webdriver\common\service.py", line 83, in start
selenium.common.exceptions.WebDriverException: Message: 'chromedriver' executable needs to be in PATH. Please see https://sites.google.com/a/chromium.org/chromedriver/home
You can bundle your "chromedriver.exe" along with your script using Pyinstaller like this:
pyinstaller --add-binary="localpathtochromedriver;." myscript.py
This will copy the "chromedriver.exe" file into the same folder as your main .exe. (With PyInstaller's single-file option, the file is instead extracted to a temp folder when the exe runs.)
In your script you can check whether you are running normally or from the bundled (exe) mode, and choose the path to chromedriver.exe accordingly. (The same change works for both the single-file and the folder bundle option of PyInstaller.)
import os
import sys
from selenium import webdriver

if getattr(sys, 'frozen', False):
    # Running from the bundled exe: PyInstaller stores the bundle folder in sys._MEIPASS
    chrome_driver = os.path.join(sys._MEIPASS, "chromedriver.exe")
else:
    # Running as a normal script: use the local path to the driver
    chrome_driver = 'localpathtochromedriver.exe'
driver = webdriver.Chrome(executable_path=chrome_driver)
You can read about this in the PyInstaller documentation (see the section on run-time information).
Limitation:
The user of your .exe should have Chrome installed on their system and Chrome version should work with the chromedriver which is bundled.
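If several bundled files need to be located, the frozen check above could be wrapped in one function. This is only a sketch: bundled_path and its dev_dir parameter are hypothetical names, and it assumes, as in the answer, that PyInstaller sets sys.frozen and sys._MEIPASS in the bundled program.

```python
import os
import sys

def bundled_path(filename, dev_dir="."):
    """Locate `filename` in the PyInstaller bundle, or in `dev_dir` when run as a script."""
    if getattr(sys, "frozen", False) and hasattr(sys, "_MEIPASS"):
        base = sys._MEIPASS   # folder where PyInstaller placed the bundled files
    else:
        base = dev_dir        # assumed location of the driver during development
    return os.path.join(base, filename)

chrome_driver = bundled_path("chromedriver.exe")
print(chrome_driver)
```

The returned string is what you would pass as the driver path when creating the Chrome driver.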
Saw a lot of people have had problems like this, but in all my searches I saw a lot of conflicting and confusing information that I didn't understand - this is all a bit out of my newbie depth.
I installed Selenium in PyCharm and was attempting to run this code from the book Automate The Boring Stuff with Python:
from selenium import webdriver
browser = webdriver.Firefox()
browser.get('http://inventwithpython.com')
linkElem = browser.find_element_by_link_text('Read It Online')
type(linkElem)
linkElem.click() # follows the "Read It Online" link
Running it throws the following exceptions:
Traceback (most recent call last):
File "C:\Users\LB\AppData\Local\Programs\Python\Python36-32\lib\site-packages\selenium\webdriver\common\service.py", line 74, in start
stdout=self.log_file, stderr=self.log_file)
File "C:\Users\LB\AppData\Local\Programs\Python\Python36-32\lib\subprocess.py", line 707, in __init__
restore_signals, start_new_session)
File "C:\Users\LB\AppData\Local\Programs\Python\Python36-32\lib\subprocess.py", line 990, in _execute_child
startupinfo)
FileNotFoundError: [WinError 2] The system cannot find the file specified
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "C:/Users/LB/Desktop/PythonProjects/AutomateTheBoringStuffProjects/generalTestingFile.py", line 2, in <module>
browser = webdriver.Firefox()
File "C:\Users\LB\AppData\Local\Programs\Python\Python36-32\lib\site-packages\selenium\webdriver\firefox\webdriver.py", line 142, in __init__
self.service.start()
File "C:\Users\LB\AppData\Local\Programs\Python\Python36-32\lib\site-packages\selenium\webdriver\common\service.py", line 81, in start
os.path.basename(self.path), self.start_error_message)
selenium.common.exceptions.WebDriverException: Message: 'geckodriver' executable needs to be in PATH.
I've downloaded the latest geckodriver.exe (specifically the 64bit version since I'm on 64bit OS but I'm running 32bit Firefox if that's okay?), but I'm not sure where to put it.
I've looked up how to change a PATH but I don't know what exactly I'm supposed to change or where it's supposed to point to. (Firefox folders? Python folders?)
Followed someone's advice to put geckodriver.exe in C:\Users\LB\ and edit the System Path to ADD that location to the Variable called Path. And now the code works! (You can apparently put geckodriver.exe anywhere, as long as you point the path to that specific folder.)
Source: https://www.howtogeek.com/118594/how-to-edit-your-system-path-for-easy-command-line-access/
This answer was posted as a comment to the question How to set up Python 3 Selenium/Geckodriver for Firefox on Windows 10? by the OP LBoot.
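As an alternative to editing the system Path by hand, the folder can be added to PATH for the current Python process only. This is a sketch under that assumption: add_to_path is a hypothetical helper, and the folder is the one mentioned above.

```python
import os
import shutil

def add_to_path(directory):
    """Append `directory` to this process's PATH if it is not already listed."""
    entries = [e for e in os.environ.get("PATH", "").split(os.pathsep) if e]
    if directory not in entries:
        entries.append(directory)
    os.environ["PATH"] = os.pathsep.join(entries)

driver_dir = r"C:\Users\LB"         # folder that holds geckodriver.exe in the answer above
add_to_path(driver_dir)
print(shutil.which("geckodriver"))  # full path if the driver is found, otherwise None
```

The change lasts only as long as the script runs, so it has to happen before webdriver.Firefox() is called.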
I'm somewhat of a beginner with Python and recently stumbled across the Selenium module. I would appreciate it if someone could help me.
I can't seem to get the selenium module working at all with python3.
I have downloaded the geckodriver for Firefox but still no luck. Maybe I am installing it incorrectly?
I'm using this code:
from selenium import webdriver
browser = webdriver.Firefox()
And seem to be receiving this error:
'OSError: [Errno 8] Exec format error'
A copy of the whole error message is pasted below.
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/home/chron/.local/lib/python3.5/site-packages/selenium/webdriver/firefox/webdriver.py", line 140, in __init__
self.service.start()
File "/home/chron/.local/lib/python3.5/site-packages/selenium/webdriver/common/service.py", line 74, in start
stdout=self.log_file, stderr=self.log_file)
File "/usr/lib/python3.5/subprocess.py", line 947, in __init__
restore_signals, start_new_session)
File "/usr/lib/python3.5/subprocess.py", line 1551, in _execute_child
raise child_exception_type(errno_num, err_msg)
OSError: [Errno 8] Exec format error
OSError: [Errno 8] Exec format error
This looks like it is failing to start geckodriver because you are using a binary that was compiled for the wrong architecture. Make sure you download the correct version for your architecture from https://github.com/mozilla/geckodriver/releases
For example, if you are running 64-bit Linux (amd64), you need to download the geckodriver tarball that ends with "linux64.tar.gz".
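To see which architecture your Python reports, something like the sketch below may help. The mapping table is an assumption based on how the release archives are usually named; only the linux64.tar.gz entry is confirmed by the answer above. expected_suffix is a hypothetical helper.

```python
import platform

# Assumed mapping from (system, machine) to the ending of the release archive name.
ASSET_SUFFIX = {
    ("Linux", "x86_64"): "linux64.tar.gz",
    ("Linux", "i686"): "linux32.tar.gz",
    ("Linux", "aarch64"): "linux-aarch64.tar.gz",
}

def expected_suffix(system=None, machine=None):
    """Return the archive suffix for a platform, or None if it is not in the table."""
    system = system or platform.system()
    machine = machine or platform.machine()
    return ASSET_SUFFIX.get((system, machine))

print(platform.system(), platform.machine(), "->", expected_suffix())
```

If the printed machine type does not match the archive you downloaded, that mismatch would explain the Exec format error.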