I am running a Python script on a remote server that periodically scrapes a webpage, using PhantomJS as the webdriver in Selenium.
The script stops unexpectedly after running for a few hours and throws the following error:
Traceback (most recent call last):
File "long.py", line 74, in <module>
data = scrape_page_long()
File "long.py", line 19, in scrape_page_long
driver = webdriver.PhantomJS(service_args=['--ignore-ssl-errors=true', '--ssl-protocol=any'])
File "/usr/local/lib/python2.7/dist-packages/selenium/webdriver/phantomjs/webdriver.py", line 52, in __init__
self.service.start()
File "/usr/local/lib/python2.7/dist-packages/selenium/webdriver/common/service.py", line 96, in start
self.assert_process_still_running()
File "/usr/local/lib/python2.7/dist-packages/selenium/webdriver/common/service.py", line 109, in assert_process_still_running
% (self.path, return_code)
selenium.common.exceptions.WebDriverException: Message: Service phantomjs unexpectedly exited. Status code was: -6
At first I thought it had to do with SSL errors (hence the arguments), but that does not seem to be related.
Any ideas on what causes this issue?
Your script is never able to scrape the web page, because PhantomJS is not working at all on the server.
If you log into the server and run phantomjs --version you'll see this:
QXcbConnection: Could not connect to display
PhantomJS has crashed. Please read the bug reporting guide at
<http://phantomjs.org/bug-reporting.html> and file a bug report.
Aborted
You can fix this by adding export QT_QPA_PLATFORM=offscreen to your user account's .bashrc, or by adding QT_QPA_PLATFORM=offscreen to the server's /etc/environment.
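If editing .bashrc or /etc/environment is not convenient, the same variable can probably be set from inside the script before PhantomJS is started. The following is a minimal sketch, not a tested fix: the helper name ensure_offscreen_qt is hypothetical, and webdriver.PhantomJS only exists in older Selenium releases such as the one shown in the traceback.

```python
import os


def ensure_offscreen_qt(env=None):
    """Set QT_QPA_PLATFORM=offscreen unless a value is already present.

    `ensure_offscreen_qt` is a hypothetical helper name, not a Selenium API.
    """
    env = os.environ if env is None else env
    env.setdefault("QT_QPA_PLATFORM", "offscreen")
    return env


def make_driver():
    # Imported lazily so the helper above works without Selenium installed.
    from selenium import webdriver

    # Child processes started afterwards should inherit os.environ.
    ensure_offscreen_qt()
    return webdriver.PhantomJS(
        service_args=["--ignore-ssl-errors=true", "--ssl-protocol=any"]
    )
```

Calling make_driver() in place of the direct webdriver.PhantomJS(...) call keeps the setting with the script, which may help when the script is started from cron or another non-login shell that never reads .bashrc.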
I'll start with the fact that I am a total beginner at Python scripts and programming in general. I want to automate a Raspberry Pi to boot up, open Chromium in full screen, go to a web page, and log in, all automatically, so that non-technical people can just turn it on and it is there, with no keyboard or anything else needed, just a display. I've been working on this for about a week and have learned a ton, but I have hit a wall and can't get past it.
Running on a Ras Pi 4+, with Raspbian 10
Installed Selenium and Chromedriver
I have the Selenium script done in Python:
from selenium import webdriver
from selenium.webdriver.chrome.options import Options
chrome_options = webdriver.ChromeOptions()
chrome_options.add_argument('--no-sandbox')
chrome_options.add_argument('--disable-dev-shm-usage')
chrome_options.add_argument('--start-fullscreen')
chrome_options.add_argument('user-data-dir=/home/pi/Documents/website_login/Chromium_user_data')
chrome_options.add_argument('disable-infobars')
chrome_options.add_argument("--disable-extensions")
chrome_options.add_argument("--disable-gpu")
driver = webdriver.Chrome('/usr/lib/chromium-browser/chromedriver', options=chrome_options)
driver.get("https://www.mywebsite.com")
driver.implicitly_wait(10)
#driver.find_element_by_id('rcc-confirm-button').click()  # commented out: after the first run, the cookie banner choice is saved in the user data directory
driver.find_element_by_link_text('Log in').click()
delay = 5
driver.find_element_by_id('ddlsubsciribers').send_keys('agency')
driver.find_element_by_id('memberfname').send_keys('user')
driver.find_element_by_id('memberpwd').send_keys('password')
driver.find_element_by_id('login').click()
It works fine when I run it from the terminal. Opens chromium, goes to the page, does the Selenium magic, logs in and is exactly what I want.
I then try to get it to launch on startup with crontab -e on the Pi using this command:
@reboot sleep 20; /usr/bin/python3 /home/pi/Documents/website_login/iar_login.py > /home/pi/Documents/website_login/iar_errorlog.err 2>&1
Nothing happens when I reboot and I get the following error messages in the log file:
Traceback (most recent call last):
File "/home/pi/Documents/website_login/iar_login.py", line 16, in <module>
driver = webdriver.Chrome('/usr/lib/chromium-browser/chromedriver', options=chrome_options)
File "/usr/local/lib/python3.7/dist-packages/selenium/webdriver/chrome/webdriver.py", line 81, in __init__
desired_capabilities=desired_capabilities)
File "/usr/local/lib/python3.7/dist-packages/selenium/webdriver/remote/webdriver.py", line 157, in __init__
self.start_session(capabilities, browser_profile)
File "/usr/local/lib/python3.7/dist-packages/selenium/webdriver/remote/webdriver.py", line 252, in start_session
response = self.execute(Command.NEW_SESSION, parameters)
File "/usr/local/lib/python3.7/dist-packages/selenium/webdriver/remote/webdriver.py", line 321, in execute
self.error_handler.check_response(response)
File "/usr/local/lib/python3.7/dist-packages/selenium/webdriver/remote/errorhandler.py", line 242, in check_response
raise exception_class(message, screen, stacktrace)
selenium.common.exceptions.WebDriverException: Message: unknown error: Chrome failed to start: exited abnormally.
(unknown error: DevToolsActivePort file doesn't exist)
(The process started from chrome location /usr/bin/chromium-browser is no longer running, so ChromeDriver is assuming that Chrome has crashed.)
I have searched and read everything I can on the errors. I think it may be a permissions error of some type, but I am out of ideas to try.
Fixed: adding 'export DISPLAY=:0' to the crontab file solved it. Selenium was trying to run, but it could not find a display.
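An alternative to changing the crontab may be to have the script supply a default display itself. This is a sketch under the assumption that the Pi's desktop session runs on display :0, as in the fix above; env_with_display is a hypothetical helper name.

```python
import os


def env_with_display(base_env, display=":0"):
    """Return a copy of base_env with DISPLAY set, keeping any existing value."""
    env = dict(base_env)
    if not env.get("DISPLAY"):
        env["DISPLAY"] = display  # assumption: the desktop session is on :0
    return env


if __name__ == "__main__":
    # Apply this before webdriver.Chrome(...) is created so Chromium inherits it.
    os.environ.update(env_with_display(os.environ))
    print("DISPLAY =", os.environ["DISPLAY"])
```

Because an existing DISPLAY value is preserved, the script should behave the same when run from a desktop terminal and only fall back to :0 when started from cron.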
I have 2 servers. I am attempting to set up one of them for continuous integration for my main website server.
Web server 1(cloud-hosting):
Python3.6
Django3.1
Ubuntu16.04
Webserver 2(VPS):
Python3.7
Django3.1
Ubuntu16.04
Jenkins
--ShiningPanda(plugin)
I'm new to web development, so if my choice of server types seems odd, that is why. I have been following along in the book Test-Driven Development with Python. My issue arises when running python manage.py test [app]. The tests in [app] inherit from the class StaticLiveServerTestCase to generate a testing environment. On web server 1, this works fine. On web server 2, where Jenkins builds the environment, I get OSError: [Errno 99] Cannot assign requested address. I don't understand why this happens, because the same commands run fine on web server 1. On web server 2, though, the commands are run by Jenkins, which is configured to use Python 3.7.
Full Traceback(Main Issue)
Traceback (most recent call last):
File "/var/lib/jenkins/shiningpanda/jobs/ddc1aed1/virtualenvs/d41d8cd9/lib/python3.7/site-packages/django/test/testcases.py", line 1449, in setUpClass
raise cls.server_thread.error
File "/var/lib/jenkins/shiningpanda/jobs/ddc1aed1/virtualenvs/d41d8cd9/lib/python3.7/site-packages/django/test/testcases.py", line 1374, in run
self.httpd = self._create_server()
File "/var/lib/jenkins/shiningpanda/jobs/ddc1aed1/virtualenvs/d41d8cd9/lib/python3.7/site-packages/django/test/testcases.py", line 1389, in _create_server
return ThreadedWSGIServer((self.host, self.port), QuietWSGIRequestHandler, allow_reuse_address=False)
File "/var/lib/jenkins/shiningpanda/jobs/ddc1aed1/virtualenvs/d41d8cd9/lib/python3.7/site-packages/django/core/servers/basehttp.py", line 67, in __init__
super().__init__(*args, **kwargs)
File "/usr/lib/python3.7/socketserver.py", line 452, in __init__
self.server_bind()
File "/usr/lib/python3.7/wsgiref/simple_server.py", line 50, in server_bind
HTTPServer.server_bind(self)
File "/usr/lib/python3.7/http/server.py", line 137, in server_bind
socketserver.TCPServer.server_bind(self)
File "/usr/lib/python3.7/socketserver.py", line 466, in server_bind
self.socket.bind(self.server_address)
OSError: [Errno 99] Cannot assign requested address
After hardcoded host
Traceback (most recent call last):
File "/var/lib/jenkins/workspace/Superlists/functional_tests/base.py", line 47, in setUp
self.browser = webdriver.Firefox()
File "/var/lib/jenkins/shiningpanda/jobs/ddc1aed1/virtualenvs/d41d8cd9/lib/python3.7/site-packages/selenium/webdriver/firefox/webdriver.py", line 164, in __init__
self.service.start()
File "/var/lib/jenkins/shiningpanda/jobs/ddc1aed1/virtualenvs/d41d8cd9/lib/python3.7/site-packages/selenium/webdriver/common/service.py", line 100, in start
self.assert_process_still_running()
File "/var/lib/jenkins/shiningpanda/jobs/ddc1aed1/virtualenvs/d41d8cd9/lib/python3.7/site-packages/selenium/webdriver/common/service.py", line 113, in assert_process_still_running
% (self.path, return_code)
selenium.common.exceptions.WebDriverException: Message: Service geckodriver unexpectedly exited. Status code was: 69
geckodriver.log w/ hardcoded host ip in LiveTestServer
geckodriver: error: Address not available (os error 99)
geckodriver 0.27.0 (7b8c4f32cdde 2020-07-28 18:16 +0000)
WebDriver implementation for Firefox
USAGE:
geckodriver [FLAGS] [OPTIONS]
[...]
Hopefully the tracebacks above are not too confusing. What I ultimately noticed is that on web server 2, if I open Django's testcases.py module, which contains LiveServerThread, and hardcode host=0.0.0.0 instead of host=localhost, the first traceback goes away and the server binds. The problem then moves to geckodriver, which fails with the same error (second traceback). So I need to hardcode the IP 0.0.0.0 to establish a connection, but then geckodriver is, I assume, listening at a completely different address (no error.log shown here).
So first I'd like to at least be able to make a connection so that the LiveServerThread class runs properly, and then try to resolve the issue with geckodriver. I am also not sure whether the type of servers I'm running on is the problem.
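Since both the Django live server and geckodriver fail with error 99, it may help to check outside of Django whether the machine can bind a socket to localhost at all. The following is a small diagnostic sketch using only the standard library; can_bind is a hypothetical helper name, and the list of hosts to try is an assumption.

```python
import socket


def can_bind(host, port=0):
    """Try to bind a TCP socket to (host, port); return (ok, detail).

    Port 0 asks the OS for any free port, which is also what Django's
    live server does by default.
    """
    try:
        infos = socket.getaddrinfo(host, port, type=socket.SOCK_STREAM)
    except socket.gaierror as exc:
        return False, "cannot resolve %r: %s" % (host, exc)
    family, socktype, proto, _canonname, sockaddr = infos[0]
    sock = socket.socket(family, socktype, proto)
    try:
        sock.bind(sockaddr)
    except OSError as exc:
        return False, "cannot bind %r: %s" % (sockaddr, exc)
    finally:
        sock.close()
    return True, sockaddr


if __name__ == "__main__":
    for name in ("localhost", "127.0.0.1", "0.0.0.0"):
        print(name, can_bind(name))
```

If localhost fails here while 0.0.0.0 succeeds, the cause is likely in the server's network configuration (for example, how localhost resolves or whether the loopback interface is up) rather than in Django or geckodriver. Recent Django versions also appear to expose a host class attribute on LiveServerTestCase, so overriding it in a subclass may be an alternative to editing testcases.py; treat that as an assumption to verify against your Django version.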
On my Linux system I use Firefox. When I run my program, I get this error:
Traceback (most recent call last):
File "shenma_diff_main_v2.py", line 90, in <module>
browser = webdriver.Firefox(profile)
File "/usr/local/lib/python2.7/site-packages/selenium/webdriver/firefox/webdriver.py", line 59, in __init__
self.binary, timeout),
File "/usr/local/lib/python2.7/site-packages/selenium/webdriver/firefox/extension_connection.py", line 47, in __init__
self.binary.launch_browser(self.profile)
File "/usr/local/lib/python2.7/site-packages/selenium/webdriver/firefox/firefox_binary.py", line 66, in launch_browser
self._wait_until_connectable()
File "/usr/local/lib/python2.7/site-packages/selenium/webdriver/firefox/firefox_binary.py", line 100, in _wait_until_connectable
raise WebDriverException("The browser appears to have exited "
selenium.common.exceptions.WebDriverException: Message: The browser appears to have exited before we could connect. If you specified a log_file in the FirefoxBinary constructor, check it for details.
If I run my program as root, it works.
This is because your environment is not set up correctly to run Firefox without a GUI.
This tutorial might be helpful:
Selenium Headless Automated Testing in Ubuntu
I think the reason is that you need to specify the display number; Xvfb is probably running on a different display than the one Firefox is using.
In one terminal:
Xvfb :99 -ac
In another terminal:
export DISPLAY=:99
and then run your scrapy program.
With this I succeeded without using root.
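The two terminal steps above can also be sketched from Python, so that the script starts the virtual display itself. This is an illustration only: xvfb_command and start_xvfb are hypothetical helper names, display :99 is taken from the answer above, and Xvfb must already be installed.

```python
import os
import shutil
import subprocess


def xvfb_command(display=":99"):
    """Build the same command line as `Xvfb :99 -ac`."""
    return ["Xvfb", display, "-ac"]


def start_xvfb(display=":99"):
    """Start Xvfb in the background and point DISPLAY at it."""
    if shutil.which("Xvfb") is None:
        raise RuntimeError("Xvfb is not installed or not on PATH")
    proc = subprocess.Popen(xvfb_command(display))
    # Browsers launched after this point inherit the new DISPLAY value.
    os.environ["DISPLAY"] = display
    return proc
```

A caller would run proc = start_xvfb() before creating the Firefox driver and proc.terminate() once the browser has been closed. Xvfb may need a moment to become ready, so a short wait or retry before launching the browser could be necessary.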
I am running Fedora 19 XFCE on EC2, and I get this error when I run the Python Selenium script:
E
======================================================================
ERROR: test_PROG (__main__.TEST_PROG)
----------------------------------------------------------------------
Traceback (most recent call last):
File "selenium_asda.py", line 24, in setUp
self.driver = webdriver.Firefox()
File "/usr/lib/python2.7/site-packages/selenium-2.36.0-py2.7.egg/selenium/webdriver/firefox/webdriver.py", line 60, in __init__
self.binary, timeout),
File "/usr/lib/python2.7/site-packages/selenium-2.36.0-py2.7.egg/selenium/webdriver/firefox/extension_connection.py", line 47, in __init__
self.binary.launch_browser(self.profile)
File "/usr/lib/python2.7/site-packages/selenium-2.36.0-py2.7.egg/selenium/webdriver/firefox/firefox_binary.py", line 61, in launch_browser
self._wait_until_connectable()
File "/usr/lib/python2.7/site-packages/selenium-2.36.0-py2.7.egg/selenium/webdriver/firefox/firefox_binary.py", line 100, in _wait_until_connectable
self._get_firefox_output())
WebDriverException: Message: 'The browser appears to have exited before we could connect. The output was: \n(process:22490): GLib-CRITICAL **: g_slice_set_config: assertion `sys_page_size == 0\' failed\nGtk-Message: Failed to load module "canberra-gtk-module"\n*** LOG addons.xpi: startup\n*** LOG addons.xpi: checkForChanges\n*** LOG addons.xpi: No changes found\n/usr/lib/firefox/firefox: relocation error: /tmp/tmpxzNZAo/extensions/fxdriver#googlecode.com/platform/Linux_x86-gcc3/components/libwebdriver-firefox-latest.so: symbol _Znwj, version xul24.0 not defined in file libxul.so with link time reference\n'
----------------------------------------------------------------------
Ran 1 test in 5.193s
FAILED (errors=1)
The script works fine on my local machine, and I think this is related to the desktop settings for XFCE.
I connect to the instance via VNC and get the full XFCE desktop with no issues.
Any hints?
SOLVED:
I downgraded Selenium from 2.36 to 2.35 and the tests run OK.
That error is due to a browser incompatibility. I had the same problem while trying to run FF24. Try the tutorial here to get it running with FF22.
Firefox browser issues with Selenium
I installed the Selenium Python bindings and am trying to create an instance of the Firefox web driver, as below:
>>> from selenium import webdriver
>>> driver = webdriver.Firefox()
I don't know what's wrong here. It displays the following error:
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/usr/lib/python2.7/site-packages/selenium-2.21.3-py2.7.egg/selenium/webdriver/firefox/webdriver.py", line 51, in __init__
self.binary, timeout),
File "/usr/lib/python2.7/site-packages/selenium-2.21.3-py2.7.egg/selenium/webdriver/firefox/extension_connection.py", line 47, in __init__
self.binary.launch_browser(self.profile)
File "/usr/lib/python2.7/site-packages/selenium-2.21.3-py2.7.egg/selenium/webdriver/firefox/firefox_binary.py", line 44, in launch_browser
self._wait_until_connectable()
File "/usr/lib/python2.7/site-packages/selenium-2.21.3-py2.7.egg/selenium/webdriver/firefox/firefox_binary.py", line 81, in _wait_until_connectable
self._get_firefox_output())
selenium.common.exceptions.WebDriverException: Message: 'The browser appears to have exited before we could connect. The output was: Error: cannot open display: :1100\n'
Can anyone please let me know how to solve this?
The error says "cannot open display: :1100". Are you running it on a remote terminal? Make sure you can type "firefox" at the prompt and have the browser open (that is what WebDriver does: it opens Firefox on your system and then tries to connect to it). If you are running it on a remote system, do a web search on connecting to an X display remotely.
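The manual check described above can be approximated with a small pre-flight function that runs before the driver is created. This is a sketch only: preflight is a hypothetical helper name, and it only verifies that the binary is on PATH and that DISPLAY is set, not that the display is actually reachable.

```python
import os
import shutil


def preflight(binary="firefox", env=None):
    """Return a list of problems that would stop a browser from opening."""
    env = os.environ if env is None else env
    problems = []
    if shutil.which(binary) is None:
        problems.append("%r not found on PATH" % binary)
    if not env.get("DISPLAY"):
        problems.append("DISPLAY is not set")
    return problems


if __name__ == "__main__":
    issues = preflight()
    print("\n".join(issues) if issues else "basic checks passed")
```

An empty list does not guarantee success: a DISPLAY value such as :1100 can be set and still point at an X server that the current session cannot open, which is the case the error message describes.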