Splinter/Selenium running on Flask / Uwsgi does not see headless display - python

So here's my setup:
I'm running a Flask server under uWSGI. A controller action calls a Python script that uses splinter (which uses selenium) to automate the GUI. The web server doesn't have a display, so I'm using Xvfb.
SSHing into the machine, starting Xvfb, exporting DISPLAY=:99, and then running the Python script works great. But running it through a controller action does not work. I get the following error:
WebDriverException: Message: The browser appears to have exited before we could connect.
(this is the same error that is returned when xvfb isn't running)
ps aux shows that Xvfb is running as the same user as the web server. I've isolated everything, and have a separate controller action that executes:
p = subprocess.Popen("Xvfb :99 &", stdout=fstdout, stderr=fstderr, shell=True)
DISPLAY is set to :99 for both root and the web server user.
I could install vncserver and try that, but I suspect I will end up with the same problem. I've also tried to avoid calling xvfb directly and using PyVirtualDisplay instead, but same problem.
edit: it errors on this line (if using splinter):
browser = Browser()
or, if selenium:
with pyvirtualdisplay.Display(visible=True):
    binary = FirefoxBinary()
    driver = webdriver.Firefox(None, binary)
(it errors on the last line there)
Any ideas?
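One thing that may be worth checking here (a sketch, not a confirmed fix): a uWSGI worker does not necessarily inherit variables exported in an SSH session, so the DISPLAY seen inside the controller action can differ from the one in the shell. The helper names below (display_seen_by_process, env_with_display) are hypothetical and only illustrate inspecting DISPLAY and passing it explicitly to a child process.

```python
import os


def display_seen_by_process(env=None):
    """Return the DISPLAY value this process sees, or None if it is unset."""
    env = os.environ if env is None else env
    return env.get("DISPLAY")


def env_with_display(display=":99", base_env=None):
    """Return a copy of the environment with DISPLAY set explicitly.

    ":99" matches the Xvfb display number used in the question.
    """
    env = dict(os.environ if base_env is None else base_env)
    env["DISPLAY"] = display
    return env
```

Logging display_seen_by_process() from inside the controller action, and passing env=env_with_display() to subprocess.Popen when launching the script, would show whether the environment is what differs between the two runs.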

Related

Connect to undetected-chromedriver docker image

I have been using https://hub.docker.com/r/selenium/standalone-chrome on my Synology NAS to use Selenium Webdriver to perform automated requests.
I don't remember the command I ran but I started the container and run driver = webdriver.Remote("http://127.0.0.1:4444/wd/hub") in Python to connect to the selenium chrome image.
However I have a use case that requires me to use undetected-chromedriver. How do I install something like https://hub.docker.com/r/bruvv/undetected_chromedriver and connect to it from my NAS' python terminal?
Beware, everyone can publish on docker hub and so there are numerous undetected-chromedriver's. So what you are trying to install is someone else's (failed) attempt.
official: https://hub.docker.com/r/ultrafunk/undetected-chromedriver
As per @nnhthuan's comment, some more detail.
undetected-chromedriver will start the Chrome binary, but will do it from Python instead of letting the chromedriver binary run Chrome. As undetected-chromedriver does not officially support headless mode, you'll need a way to run "windowed" Chrome in Docker. To make this happen, you could use Xvfb to emulate an X server desktop. If you forget this step, you won't be able to connect to Chrome: it shuts itself down (no screens found) before undetected-chromedriver is able to connect, and so it crashes.
To ensure xvfb keeps running, you could use for example something like this in your entrypoint:
#!/bin/bash
export DISPLAY=:1
# Restart Xvfb whenever it is not running.
function keepUpScreen() {
    echo "running keepUpScreen()"
    while true; do
        sleep .25
        if [ -z "$(pidof Xvfb)" ]; then
            Xvfb $DISPLAY -screen $DISPLAY 1280x1024x16 &
        fi
    done
}
keepUpScreen &
echo "running: ${@}"
# Hand control over to the container's command.
exec "$@"
Once your image is running stably, you could set your chromedriver debug_host to your internal IP address instead of 127.0.0.1, and debug_port to a static value. This would enable connections from remote hosts.
Don't forget to forward those ports in Docker.
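The same keep-alive idea can be sketched in Python, for cases where the entrypoint is a Python script rather than bash. This is only an illustration: xvfb_command and ensure_xvfb are hypothetical names, and screen number 0 is an assumption rather than something taken from the script above.

```python
import subprocess


def xvfb_command(display=":1", geometry="1280x1024x16"):
    """Build the Xvfb argument list (screen number 0 is an assumption)."""
    return ["Xvfb", display, "-screen", "0", geometry]


def ensure_xvfb(proc, display=":1", launcher=subprocess.Popen):
    """Return a live Xvfb process handle.

    If proc is None or has already exited (poll() returns an exit code),
    a new process is started through launcher; otherwise proc is returned.
    """
    if proc is None or proc.poll() is not None:
        return launcher(xvfb_command(display))
    return proc
```

Calling proc = ensure_xvfb(proc) in a loop with a short time.sleep() between iterations would mirror the behaviour of keepUpScreen().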

I'm trying to start the chrome webdriver on linux, but it hangs, then closes

I'm making a program with selenium (Python). It was working, and out of nowhere, the webdriver no longer works. I'm developing in a Windows environment (where it works fine), but once I upload the code to the production server (Ubuntu) and try to open the web driver, it only displays data:, and then the driver hangs and closes. No code after that runs.
Example:
print("Starting web driver")
driver = webdriver.Chrome(driver_path, options=opt)
print("Opening URL") # This code doesn't run
driver.get(config.url) # This code doesn't run
Things I've tried:
Running it on Windows (it works properly)
Updating the webdriver
Running it outside of the venv
Running the driver in a new, isolated environment
Wrapping the entire code within try-except (no errors output)
Running with the argument --headless
Edit: I'm running python3.7, chrome webdriver V 75.0.3770.90
Edit2: the driver_path variable is the relative path to the chromedriver file. opt holds my Chrome options:
from selenium.webdriver.chrome.options import Options

opt = Options()
opt.add_argument('--no-sandbox')
opt.add_argument('--disable-dev-shm-usage')
# Download preferences passed to Chrome
profile = {"plugins.always_open_pdf_externally": True,
           "download.default_directory": download_directory,
           "download.prompt_for_download": False,
           "download.directory_upgrade": True}
opt.add_experimental_option("prefs", profile)
I'm also using gunicorn as my web server, but I still encounter the issue when running with the default (Flask) web server. I'm executing functions I've written for selenium through a Flask-based web application (it's a web app for work). The Ubuntu machine that's running the script has a desktop environment installed.
After 15-25 seconds, the window closes with no output in the terminal. After ~90 seconds, I get a message in the terminal saying:
Message: session not created
from disconnected: unable to connect to renderer
(Session info: chrome=75.0.3770.90)
I've also noticed that the chrome driver takes more time to open than usual.
Edit3: I've literally deleted the entire virtual machine and reinstalled it from scratch, and I'm still running into the same issue. I've reverted to an older version and it still doesn't run, which makes no logical sense. My only thought is that there is some configuration error or that something is interfering with it.
Edit4: I was able to get the log from the webdriver by adding the argument --verbose
Here's the log:
[1562179109.454][INFO]: resolved localhost to ["::1","127.0.0.1"]
[1562179111.454][WARNING]: Timed out connecting to Chrome, retrying...
[1562179111.454][INFO]: resolved localhost to ["::1","127.0.0.1"]
[1562179115.454][WARNING]: Timed out connecting to Chrome, retrying...
[1562179115.455][INFO]: resolved localhost to ["::1","127.0.0.1"]
[1562179123.454][WARNING]: Timed out connecting to Chrome, retrying...
[1562179123.455][INFO]: resolved localhost to ["::1","127.0.0.1"]
[1562179139.455][WARNING]: Timed out connecting to Chrome, giving up.
[1562179139.506][INFO]: [42e538ee02eb06b9ac776969dddf01d1] RESPONSE InitSession ERROR session not created
from disconnected: unable to connect to renderer
(Session info: chrome=75.0.3770.90)
[1562179139.506][DEBUG]: Log type 'driver' lost 9 entries on destruction
[1562179139.506][DEBUG]: Log type 'browser' lost 0 entries on destruction
I'm not too familiar with Linux, but from what I've seen in the past, I have a feeling it's something to do with /etc/hosts (I'm not sure).
Edit5:
I've noticed this started happening after I installed windscribe (VPN), which makes me think that windscribe is interfering with the connection somehow.
It turns out that windscribe (VPN) interferes with the connection to the Chrome webdriver. I believe it has something to do with its built-in firewall. After uninstalling it, running sudo apt autoremove -y, and rebooting, it works properly!
Edit: I re-installed the VPN (windscribe) and de-activated the included firewall and it worked properly after that.
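Since the log shows chromedriver timing out on localhost, a quick loopback check can help tell a firewall problem from a Chrome problem. The following is a minimal sketch; loopback_reachable is a hypothetical helper, and it only tests that a TCP connection to 127.0.0.1 works at all, not that Chrome itself is healthy.

```python
import socket


def loopback_reachable(host="127.0.0.1", timeout=2.0):
    """Return True if a TCP connection to a local listening socket succeeds."""
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        # Port 0 lets the OS pick a free ephemeral port.
        server.bind((host, 0))
        server.listen(1)
        port = server.getsockname()[1]
        try:
            with socket.create_connection((host, port), timeout=timeout):
                return True
        except OSError:
            # Refused, filtered, or timed out.
            return False
    finally:
        server.close()
```

If this returns False while the VPN firewall is active and True once it is disabled, that would be consistent with the firewall blocking local connections.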

PhantomJS path on Heroku

I have a node app running on Heroku. I am scraping a website using selenium in python and calling the python script from my node app whenever I need to. I installed PhantomJS on my mac and when I run the app locally (node index.js), everything works just fine.
path_to_phantom = '/Users/govind/Desktop/phantomjs-2.1.1-macosx/bin/phantomjs'
browser = webdriver.PhantomJS(executable_path = path_to_phantom)
However, nothing seems to work on Heroku. I also added the PhantomJS buildpack to my node app but it just doesn't call the python script. The problem I think is the path to PhantomJS buildpack. What path should I add? Or is there any other aspect I'm missing here?
I managed to use Selenium with PhantomJS in my Python application deployed to Heroku following these steps:
1) Switch to using the Cedar-14 stack on my Heroku application
$ heroku stack:set cedar-14
2) Install a PhantomJS buildpack
$ heroku buildpacks:add https://github.com/stomita/heroku-buildpack-phantomjs
With these changes I could then use Selenium to fetch websites
from selenium import webdriver
browser = webdriver.PhantomJS()
browser.get("http://www.google.com") # This does not throw an exception if it got a 404
html = browser.page_source
print(html) # If this outputs more than just '<html><head></head><body></body></html>' you know that it worked
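To avoid hard-coding a machine-specific path such as the macOS one in the question, the binary can be looked up at runtime. This is a sketch for Python 3 (shutil.which is not available in Python 2); find_phantomjs and the PHANTOMJS_PATH variable are hypothetical names, and it assumes the buildpack puts phantomjs on PATH.

```python
import os
import shutil


def find_phantomjs(env_var="PHANTOMJS_PATH", name="phantomjs"):
    """Return a path to the phantomjs binary, or None if it cannot be found.

    An explicit override in env_var (a hypothetical variable name) wins;
    otherwise the binary is searched for on PATH.
    """
    candidate = os.environ.get(env_var)
    if candidate and os.path.isfile(candidate):
        return candidate
    return shutil.which(name)
```

The result could then be passed as executable_path to webdriver.PhantomJS when it is not None.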

Cannot create browser process when using selenium from python on RHEL5

I'm trying to use selenium from python but I'm having a problem running it on a RHEL5.5 server. I don't seem to be able to really start firefox.
from selenium import webdriver
b = webdriver.Firefox()
On my laptop with Ubuntu this works fine: it starts and brings up a Firefox window. When I log in to the server with ssh I can run firefox from the command line and get it displayed on my laptop. It is clearly Firefox from the server since it has the RHEL5.5 home page.
When I run the Python script above on the server (or run it in ipython), the script hangs at webdriver.Firefox()
I have also tried
from selenium import webdriver
fb = webdriver.FirefoxProfile()
fb.native_events_enabled=True
b=webdriver.Firefox(fb)
Which also hangs on the final line there.
I'm using python2.7 installed in /opt/python2.7. I installed selenium with /opt/python2.7/pip-2.7.
I can see the firefox process on the server with top and it is using a lot of CPU. I can also see from /proc/#/environ that the DISPLAY is set to localhost:10.0 which seems right.
How can I get a browser started with selenium on RHEL5.5? How can I figure out why Firefox is not starting?
It looks like the problem I'm encountering is this selenium bug:
http://code.google.com/p/selenium/issues/detail?id=2852
I used the fix described in comment #9 http://code.google.com/p/selenium/issues/detail?id=2852#c9
That worked for me.

Selenium issues with IE tests

When I change my test browser to IE using the following line of code:
self.selenium = selenium("localhost", 4444, "*iexplore", "http://www.mydomain.net/")
I get the following error:
Exception: Failed to start new browser session: java.lang.RuntimeException: SystemRoot apparently not set!
It works perfectly fine using firefox and Chrome. This is running on an Ubuntu server.
How could the Selenium RC server (which is what I guess you are using) possibly start an IE instance on an Ubuntu machine?! IIRC all browser instances started by the Selenium RC server have to be local to the server. So if you want to test with IE, you have to run the SRC on a Windows box. Makes sense?!
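A small guard in the test setup can make this failure clearer than the SystemRoot message. This is only a sketch: check_rc_browser is a hypothetical helper, and server_os must describe the machine running the Selenium RC server, which is not necessarily the machine running the tests.

```python
def check_rc_browser(browser, server_os):
    """Return browser unchanged, or raise if IE is requested on a non-Windows server.

    server_os is the OS name of the Selenium RC host, e.g. "Linux" or "Windows".
    """
    if browser.startswith("*iexplore") and server_os.lower() != "windows":
        raise ValueError(
            "%s needs a Selenium RC server on Windows, not %s" % (browser, server_os)
        )
    return browser
```

For example, check_rc_browser("*iexplore", "Linux") raises, while check_rc_browser("*firefox", "Linux") returns the string for use in the selenium(...) call.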
