I am attempting to run a simple program on an Ubuntu 16.04 instance using Python 3.5. The program is below:
from bs4 import BeautifulSoup
from selenium import webdriver
driver = webdriver.PhantomJS("p/phantomjs")
driver.get("http://www.bbc.co.uk")
s = BeautifulSoup(driver.page_source, "lxml")
print(s.findAll("a"))
try:
    driver.close()
except AttributeError:
    pass
All the modules are installed correctly. However, when I run the program, I receive the following errors:
Traceback (most recent call last):
File "t.py", line 4, in <module>
driver = webdriver.PhantomJS("p/phantomjs")
File "/usr/local/lib/python3.5/dist-packages/selenium/webdriver/phantomjs/webdriver.py", line 52, in __init__
self.service.start()
File "/usr/local/lib/python3.5/dist-packages/selenium/webdriver/common/service.py", line 64, in start
stdout=self.log_file, stderr=self.log_file)
File "/usr/lib/python3.5/subprocess.py", line 947, in __init__
restore_signals, start_new_session)
File "/usr/lib/python3.5/subprocess.py", line 1551, in _execute_child
raise child_exception_type(errno_num, err_msg)
OSError: [Errno 8] Exec format error
Exception ignored in: <bound method Service.__del__ of <selenium.webdriver.phantomjs.service.Service object at 0x7fb05cd964a8>>
Traceback (most recent call last):
File "/usr/local/lib/python3.5/dist-packages/selenium/webdriver/common/service.py", line 163, in __del__
self.stop()
File "/usr/local/lib/python3.5/dist-packages/selenium/webdriver/common/service.py", line 135, in stop
if self.process is None:
AttributeError: 'Service' object has no attribute 'process'
It seems as though it is an issue with Selenium rather than with PhantomJS. However, I have no idea how to make the program work properly.
In other questions similar to this, the issue seems to be with closing the headless instance. However, this error is received as soon as I try to instantiate PhantomJS.
How can this be fixed?
If the p folder (as you've mentioned) is located in the same directory as your script, then you might need to start your code with something like this:
from bs4 import BeautifulSoup
from selenium import webdriver
import os
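# abspath() matters: dirname(__file__) is empty when the script is run as "python t.py"
path_to_phantom_js = os.path.join(os.path.dirname(os.path.abspath(__file__)), 'p', 'phantomjs')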
driver = webdriver.PhantomJS(path_to_phantom_js)
P.S. If that does not work, tell me the output of print(path_to_phantom_js).
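Errno 8 (Exec format error) is raised by the operating system when it cannot execute the file at all, which typically points to a binary built for a different platform, or to a file that is not an executable (for example a still-compressed download). As a rough sketch, the first bytes of the file can be inspected to see what it actually is. The helper name sniff_executable and the labels it returns are made up for illustration; the path is the one from the question.

```python
import os

# Well-known magic numbers found at the start of common file formats.
_MAGIC = [
    (b"\x7fELF", "ELF (Linux)"),
    (b"\xcf\xfa\xed\xfe", "Mach-O 64-bit (macOS)"),
    (b"MZ", "PE (Windows .exe)"),
    (b"#!", "script with shebang"),
    (b"\x1f\x8b", "gzip archive (not executable)"),
    (b"BZh", "bzip2 archive (not executable)"),
    (b"PK", "zip archive (not executable)"),
]


def sniff_executable(path):
    """Return a rough label for the file format, based on its first bytes."""
    with open(path, "rb") as fh:
        head = fh.read(4)
    for magic, label in _MAGIC:
        if head.startswith(magic):
            return label
    return "unknown"


if __name__ == "__main__":
    path = "p/phantomjs"  # path used in the question
    if os.path.isfile(path):
        print(path, "->", sniff_executable(path))
        print("executable bit set:", os.access(path, os.X_OK))
    else:
        print("not a file:", path)
```

On Ubuntu the expected label is "ELF (Linux)"; anything else suggests the wrong download or an archive that was never unpacked.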
I'm doing a school project and I am trying to scrape data from websites. Basically I'm following a tutorial in edureka - https://www.edureka.co/blog/web-scraping-with-python/#demo
The sample code is like this
from selenium import webdriver
from bs4 import BeautifulSoup
import pandas as pd
driver = webdriver.Chrome("/usr/lib/chromium-browser/chromedriver")
products=[] #List to store name of the product
prices=[] #List to store price of the product
ratings=[] #List to store rating of the product
driver.get("""https://www.flipkart.com/laptops/~buyback-guarantee-on-laptops-/pr?sid=6bo%2Cb5g&uniq""")
content = driver.page_source
soup = BeautifulSoup(content, "html.parser")
for a in soup.findAll('a',href=True, attrs={'class':'_31qSD5'}):
    name=a.find('div', attrs={'class':'_3wU53n'})
    price=a.find('div', attrs={'class':'_1vC4OE _2rQ-NK'})
    rating=a.find('div', attrs={'class':'hGSR34 _2beYZw'})
    products.append(name.text)
    prices.append(price.text)
    ratings.append(rating.text)
df = pd.DataFrame({'Product Name':products,'Price':prices,'Rating':ratings})
df.to_csv('products.csv', index=False, encoding='utf-8')
I simply copied and pasted the sample code into Python to see how it works, and this is what I got:
PS D:\COSC2625_Team_Blue> & C:/Users/meowg/AppData/Local/Programs/Python/Python310/python.exe d:/COSC2625_Team_Blue/test.py
d:\COSC2625_Team_Blue\test.py:5: DeprecationWarning: executable_path has been deprecated, please pass in a Service object
driver = webdriver.Chrome("/usr/lib/chromium-browser/chromedriver")
Traceback (most recent call last):
File "C:\Users\meowg\AppData\Local\Programs\Python\Python310\lib\site-packages\selenium\webdriver\common\service.py", line 71, in start
self.process = subprocess.Popen(cmd, env=self.env,
File "C:\Users\meowg\AppData\Local\Programs\Python\Python310\lib\subprocess.py", line 969, in __init__
self._execute_child(args, executable, preexec_fn, close_fds,
File "C:\Users\meowg\AppData\Local\Programs\Python\Python310\lib\subprocess.py", line 1438, in _execute_child
hp, ht, pid, tid = _winapi.CreateProcess(executable, args,
FileNotFoundError: [WinError 2] The system cannot find the file specified
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "d:\COSC2625_Team_Blue\test.py", line 5, in <module>
driver = webdriver.Chrome("/usr/lib/chromium-browser/chromedriver")
File "C:\Users\meowg\AppData\Local\Programs\Python\Python310\lib\site-packages\selenium\webdriver\chrome\webdriver.py", line 69, in __init__
super().__init__(DesiredCapabilities.CHROME['browserName'], "goog",
File "C:\Users\meowg\AppData\Local\Programs\Python\Python310\lib\site-packages\selenium\webdriver\chromium\webdriver.py", line 89, in __init__
self.service.start()
File "C:\Users\meowg\AppData\Local\Programs\Python\Python310\lib\site-packages\selenium\webdriver\common\service.py", line 81, in start
raise WebDriverException(
selenium.common.exceptions.WebDriverException: Message: 'chromedriver' executable needs to be in PATH. Please see https://chromedriver.chromium.org/home
Does anyone know what went wrong? I have no idea what happened.
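It looks like you have not downloaded chromedriver, which the tutorial expects at /usr/lib/chromium-browser/chromedriver. That is a Linux path, and your traceback shows you are running on Windows, so the file is not found. We can't do much more from here: download the chromedriver build that matches your installed Chrome version and pass its actual location on your machine.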
I would also recommend Playwright for Python instead of Selenium: it is a more modern library with, in my opinion, a slightly gentler learning curve. That is only a recommendation, though.
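The DeprecationWarning in the traceback asks for a Service object instead of a bare path. A minimal sketch of that, assuming Selenium 4 is installed and chromedriver has already been downloaded: the helper resolve_driver and the location C:/tools/chromedriver.exe are hypothetical and not part of Selenium or the tutorial.

```python
import os
import shutil


def resolve_driver(name="chromedriver", explicit_path=None):
    """Return a usable driver path, or raise FileNotFoundError with a clear message."""
    if explicit_path is not None:
        if os.path.isfile(explicit_path):
            return explicit_path
        raise FileNotFoundError("no driver at " + explicit_path)
    found = shutil.which(name)  # searches the directories listed in PATH
    if found is None:
        raise FileNotFoundError(name + " is not on PATH")
    return found


def make_chrome(driver_path):
    """Start Chrome through a Service object (Selenium 4 style)."""
    # Imported here so the path helper above works without Selenium installed.
    from selenium import webdriver
    from selenium.webdriver.chrome.service import Service

    return webdriver.Chrome(service=Service(executable_path=driver_path))


if __name__ == "__main__":
    try:
        # Hypothetical location; use wherever chromedriver.exe was unpacked.
        path = resolve_driver(explicit_path="C:/tools/chromedriver.exe")
    except FileNotFoundError as exc:
        print("chromedriver not found:", exc)
    else:
        driver = make_chrome(path)
        driver.quit()
```

Checking the path first turns the long nested traceback into a single readable message.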
I have a simple Python script that runs Selenium, and I have tried using Torsocks (as usual) simply like this: torsocks python script.py. However, it failed with this error:
Traceback (most recent call last):
File "script.py", line 21, in <module>
browser = webdriver.Firefox(options=options)
File "/usr/local/lib/python3.6/dist-packages/selenium/webdriver/firefox/webdriver.py", line 163, in __init__
log_path=service_log_path)
File "/usr/local/lib/python3.6/dist-packages/selenium/webdriver/firefox/service.py", line 47, in __init__
self, executable_path, port=port, log_file=log_file, env=env)
File "/usr/local/lib/python3.6/dist-packages/selenium/webdriver/common/service.py", line 42, in __init__
self.port = utils.free_port()
File "/usr/local/lib/python3.6/dist-packages/selenium/webdriver/common/utils.py", line 37, in free_port
free_socket.listen(5)
PermissionError: [Errno 1] Operation not permitted
Is it actually possible to use Torsocks like this?
I realize that I could send the request with SOCKS5 proxy, but I wonder if it could run using Torsocks, and if not, it would be great to get an explanation.
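The traceback shows the failure inside Selenium's own free_port() helper, which opens a listening socket before the browser is started, so the error comes from the script's local sockets rather than from page traffic. For the SOCKS5 route mentioned above, a hedged sketch follows. It assumes Tor's SOCKS listener is at 127.0.0.1:9050 (a common default, but check your own configuration); the function names are made up, while the network.proxy.* names are Firefox preferences.

```python
def tor_socks_prefs(host="127.0.0.1", port=9050):
    """Firefox preferences that send browser traffic through a SOCKS5 proxy."""
    return {
        "network.proxy.type": 1,  # 1 = manual proxy configuration
        "network.proxy.socks": host,
        "network.proxy.socks_port": port,
        "network.proxy.socks_version": 5,
        "network.proxy.socks_remote_dns": True,  # resolve DNS through the proxy
    }


def make_firefox_over_tor():
    """Start Firefox with the proxy preferences applied (needs Selenium and geckodriver)."""
    from selenium import webdriver
    from selenium.webdriver.firefox.options import Options

    options = Options()
    for name, value in tor_socks_prefs().items():
        options.set_preference(name, value)
    return webdriver.Firefox(options=options)


if __name__ == "__main__":
    for name, value in tor_socks_prefs().items():
        print(name, "=", value)
```

With this approach the script itself runs without torsocks, and only the browser's traffic goes through the proxy.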
I'm trying to use selenium for a python web scraper but when I try to run the program I get the following error:
/usr/local/bin/python3 /Users/xxx/Documents/Python/hello.py
Traceback (most recent call last):
File "/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/site-packages/selenium/webdriver/common/service.py", line 72, in start
self.process = subprocess.Popen(cmd, env=self.env,
File "/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/subprocess.py", line 854, in __init__
self._execute_child(args, executable, preexec_fn, close_fds,
File "/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/subprocess.py", line 1702, in _execute_child
raise child_exception_type(errno_num, err_msg, err_filename)
FileNotFoundError: [Errno 2] No such file or directory: '/Users/xxx/Documents/Python/chromedriver.exe'
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/Users/xxx/Documents/Python/hello.py", line 9, in <module>
wd = webdriver.Chrome(executable_path=DRIVER_PATH)
File "/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/site-packages/selenium/webdriver/chrome/webdriver.py", line 73, in __init__
self.service.start()
File "/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/site-packages/selenium/webdriver/common/service.py", line 81, in start
raise WebDriverException(
selenium.common.exceptions.WebDriverException: Message: 'chromedriver.exe' executable needs to be in PATH. Please see https://sites.google.com/a/chromium.org/chromedriver/home
Here is the python code:
from urllib.request import urlopen
from bs4 import BeautifulSoup
import re
from selenium import webdriver
DRIVER_PATH = '/Users/xxx/Documents/Python/chromedriver.exe'
wd = webdriver.Chrome(executable_path=DRIVER_PATH)
I think the problem is that I'm not specifying the file path in the variable DRIVER_PATH properly, but I'm not sure.
I am using a Mac.
If you are on Windows, you need to update DRIVER_PATH to include your root directory, which is usually C:\ (on a Mac the path already starts at the root, /):
DRIVER_PATH = 'C:/Users/xxx/Documents/Python/chromedriver.exe'
Alternatively, you can follow this tutorial to add the folder containing chromedriver.exe (usually the chromedriver_win32 folder) to your Path environment variable:
https://docs.telerik.com/teststudio/features/test-runners/add-path-environment-variables
I would try this out (just adding the 'r' to make it a raw string):
wd = webdriver.Chrome(executable_path=r'/Users/xxx/Documents/Python/chromedriver.exe')
If you think it's the file path, check whether the file exists:
import os.path
print(os.path.exists(DRIVER_PATH))
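Also, BeautifulSoup works well with urllib (urllib2 in Python 2, urllib.request in Python 3):
https://www.pythonforbeginners.com/beautifulsoup/beautifulsoup-4-python
from urllib.request import urlopen  # Python 3 replacement for urllib2
from bs4 import BeautifulSoup
url = "https://www.URL.com"
content = urlopen(url).read()
soup = BeautifulSoup(content, "html.parser")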
You have a mistake in the name of the file.
"chromedriver.exe" is for Windows.
If you use macOS and chromedriver for Mac, then the file name should be "chromedriver" without ".exe".
I had the same problem, but this solved it.
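If the same script has to run on both Windows and macOS, the file name can be chosen from sys.platform instead of being hard-coded. This is only a sketch: driver_filename is a hypothetical helper, and the directory is the one from the question.

```python
import os
import sys


def driver_filename(base="chromedriver", platform=None):
    """Return the driver file name for the given sys.platform value."""
    platform = sys.platform if platform is None else platform
    # Only Windows builds carry the .exe suffix.
    return (base + ".exe") if platform.startswith("win") else base


if __name__ == "__main__":
    # Directory taken from the question; adjust to where the driver was unpacked.
    DRIVER_PATH = os.path.join("/Users/xxx/Documents/Python", driver_filename())
    print(DRIVER_PATH, "exists:", os.path.exists(DRIVER_PATH))
```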
I am using Python 3.5, Firefox 45 (also tried 49) and Selenium 3.0.1.
I tried:
from selenium import webdriver
from selenium.webdriver.common.keys import Keys
driver = webdriver.Firefox()
Then I got the error message:
C:\Users\A\AppData\Local\Programs\Python\Python35\python.exe
C:/Users/A/Desktop/car/test.py
Traceback (most recent call last):
File "C:\Users\A\AppData\Local\Programs\Python\Python35\lib\site-packages\selenium\webdriver\common\service.py", line 64, in start
stdout=self.log_file, stderr=self.log_file)
File "C:\Users\A\AppData\Local\Programs\Python\Python35\lib\subprocess.py", line 950, in __init__
restore_signals, start_new_session)
File "C:\Users\A\AppData\Local\Programs\Python\Python35\lib\subprocess.py", line 1220, in _execute_child
startupinfo)
FileNotFoundError: [WinError 2] The system cannot find the file specified
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "C:/Users/A/Desktop/car/test.py", line 4, in <module>
driver = webdriver.Firefox()
File "C:\Users\A\AppData\Local\Programs\Python\Python35\lib\site-packages\selenium\webdriver\firefox\webdriver.py", line 135, in __init__
self.service.start()
File "C:\Users\A\AppData\Local\Programs\Python\Python35\lib\site-packages\selenium\webdriver\common\service.py", line 71, in start
os.path.basename(self.path), self.start_error_message)
selenium.common.exceptions.WebDriverException: Message: 'geckodriver' executable needs to be in PATH.
Exception ignored in: <bound method Service.__del__ of <selenium.webdriver.firefox.service.Service object at 0x0000000000EB8278>>
Traceback (most recent call last):
File "C:\Users\A\AppData\Local\Programs\Python\Python35\lib\site-packages\selenium\webdriver\common\service.py", line 163, in __del__
self.stop()
File "C:\Users\A\AppData\Local\Programs\Python\Python35\lib\site-packages\selenium\webdriver\common\service.py", line 135, in stop
if self.process is None:
AttributeError: 'Service' object has no attribute 'process'
What can I do? Any help is much appreciated!
If you are using a Firefox version newer than 47.0.1, you need to have the geckodriver executable in your system path. For earlier versions you want to turn marionette off. You can do so like this:
from selenium import webdriver
from selenium.webdriver.common.keys import Keys
from selenium.webdriver.common.desired_capabilities import DesiredCapabilities
capabilities = DesiredCapabilities.FIREFOX.copy()
capabilities['marionette'] = False
driver = webdriver.Firefox(capabilities=capabilities)
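To check whether geckodriver is visible on the PATH that Python actually sees, and to add its folder for the current process only, something like the following could be used. The "drivers" folder under the home directory is a placeholder, and prepend_to_path is a hypothetical helper rather than anything Selenium provides.

```python
import os
import shutil


def prepend_to_path(directory, env=None):
    """Put `directory` at the front of PATH in `env` (defaults to os.environ)."""
    env = os.environ if env is None else env
    parts = [p for p in env.get("PATH", "").split(os.pathsep) if p]
    if directory not in parts:
        parts.insert(0, directory)
    env["PATH"] = os.pathsep.join(parts)
    return env["PATH"]


if __name__ == "__main__":
    print("before:", shutil.which("geckodriver"))
    # Placeholder folder; point this at the folder that holds geckodriver.
    prepend_to_path(os.path.join(os.path.expanduser("~"), "drivers"))
    print("after: ", shutil.which("geckodriver"))
```

The change lasts only for the running Python process; a permanent fix still means editing the system PATH.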
I am using Python Selenium WebDriver in our framework, and of late we have been seeing errors like the one below:
(<type 'exceptions.AttributeError'>,
AttributeError("'unicode' object has no attribute 'text'",)
,<traceback object at 0x00000000037D8D48>)
We are also seeing this error in our product, which has Unicode characters in it. It doesn't always happen; it happens about once in 15 runs.
Stacktrace:
self.compare_units_test('Singapore')
line 391, in compare_units_test
self.assertTrue(home_page.is_loaded(), Errors.HOMEPAGE_LOGO_ERROR)
File "", line 32, in is_loaded
"Logo did not load results in 10 seconds")
File "\venv\lib\site-packages\pscore\core\support\ps_wait.py", line 20,in until_visible
self._wait_until_visible(locator, timeout, message, True)
File "venv\lib\site-packages\pscore\core\support\ps_wait.py", line 15, in _wait_until_visible
wait.until(EC.visibility_of_element_located(locator), message=message)
File "\venv\lib\site-packages\selenium\webdriver\support\wait.py", line 66, in until
value = method(self._driver)
File "\venv\lib\site-packages\selenium\webdriver\support\expected_conditions.py", line 72, in __call__
return _element_if_visible(_find_element(driver, self.locator))
File "\venv\lib\site-packages\selenium\webdriver\support\expected_conditions.py", line 90, in _element_if_visible
return element if element.is_displayed() else False
AttributeError: 'unicode' object has no attribute 'is_displayed'
2015-09-10 15:56:36 - INFO wd_testcase.py:111 in run : Test Runner: Tearing down test: test_compare_units (tests.test.TestFlightsSearch)
2015-09-10 15:56:36 - INFO wd_testcase.py:48 in tearDown : Test Runner: Attempting to teardown.
2015-09-10 15:56:36 - ERROR wd_testcase.py:125 in run : Traceback (most recent call last):
File "\venv\lib\site-packages\pscore\core\wd_testcase.py", line 112, in run
self.tearDown()
File "\venv\lib\site-packages\pscore\core\wd_testcase.py", line 49, in tearDown
WebDriverFinalizer.finalize(self.driver, self.has_failed(), self.logger, self.test_context)
File "\venv\lib\site-packages\pscore\core\finalizers.py", line 28, in finalize
WebDriverFinalizer.finalize_skygrid(driver, test_failed, test_context)
File "\lib\site-packages\pscore\core\finalizers.py", line 152, in finalize_skygrid
WebDriverFinalizer.finalise_skygrid_driver_failure(driver, test_context)
File "\venv\lib\site-packages\pscore\core\finalizers.py", line 168, in finalise_skygrid_driver_failure
final_url = driver.current_url
File "venv\lib\site-packages\selenium\webdriver\support\event_firing_webdriver.py", line 201, in __getattr__
raise AttributeError(name)
AttributeError: current_url
We are using Selenium 2.45 and Python 2.7.7.
When I dug into the WebDriver source code, I found this:
try:
    str = basestring
except NameError:
    pass
which specifically addresses the unicode problem which webdriver was running into.
Any ideas what might be causing this? Help would be greatly appreciated.
Found the root cause of the problem: one of the tests was spawning a new browser window, and the subsequent tests were using that dialog window instead of the main browser window for their Selenium commands. At some point one of the tests called driver.quit(), and when the subsequent tests tried to access attributes of driver they failed, since the driver had already been killed.
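One way to guard against this kind of leakage between tests is to remember the main window handle and, in teardown, close every other window and switch back. The sketch below is hedged: close_extra_windows is a hypothetical helper that relies only on the window_handles, switch_to.window() and close() members of the WebDriver API, and FakeDriver is a stand-in so the example runs without a browser.

```python
def close_extra_windows(driver, main_handle):
    """Close every window except `main_handle`, then focus the main window."""
    closed = []
    for handle in list(driver.window_handles):
        if handle != main_handle:
            driver.switch_to.window(handle)
            driver.close()  # closes only the currently focused window
            closed.append(handle)
    driver.switch_to.window(main_handle)
    return closed


class FakeDriver:
    """Tiny stand-in that mimics the handful of WebDriver members used above."""

    def __init__(self, handles):
        self.window_handles = list(handles)
        self.current = self.window_handles[0]
        self.switch_to = self  # so driver.switch_to.window(...) works

    def window(self, handle):
        self.current = handle

    def close(self):
        self.window_handles.remove(self.current)


if __name__ == "__main__":
    driver = FakeDriver(["main", "popup"])
    driver.window("popup")  # a test left the focus on the popup
    print(close_extra_windows(driver, "main"))
    print(driver.current, driver.window_handles)
```

Using close() for the extra windows and reserving quit() for the very end of the run keeps the shared driver alive for the tests that follow.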