No permission for chromedriver.exe in Colab - python

I am trying to run Selenium's webdriver (Python) with Chrome in Google Colab. At first I had trouble passing the chromedriver.exe path in the call selenium.webdriver.Chrome('/chromedriver.exe'). After getting past that, I keep hitting a permission failure when chromedriver.exe is run, even though the version is correct. Does anyone know what might be wrong?
WebDriverException: Message: 'chromedriver.exe' executable may have wrong permissions.

You can do it by installing the Chromium webdriver and adjusting some options so that it does not crash in Google Colab:
!pip install selenium
!apt-get update # to update ubuntu to correctly run apt install
!apt install chromium-chromedriver
!cp /usr/lib/chromium-browser/chromedriver /usr/bin
import sys
sys.path.insert(0,'/usr/lib/chromium-browser/chromedriver')
from selenium import webdriver
chrome_options = webdriver.ChromeOptions()
chrome_options.add_argument('--headless')
chrome_options.add_argument('--no-sandbox')
chrome_options.add_argument('--disable-dev-shm-usage')
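# chromedriver was copied to /usr/bin above, so it is found on PATH;
# `options=` replaces the deprecated `chrome_options=` keyword
wd = webdriver.Chrome(options=chrome_options)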
wd.get("https://www.webite-url.com")
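The "may have wrong permissions" message typically appears when the operating system refuses to execute the driver file. Below is a minimal sketch for checking and fixing the execute bit on Linux. `ensure_executable` is a hypothetical helper name, not part of Selenium. Note that a Windows `chromedriver.exe` will not run on Colab's Ubuntu machine even with the bit set, which is why the apt package above is the safer route.

```python
import os
import stat


def ensure_executable(path):
    """Add the execute bits to `path` if any of them are missing.

    Returns True if the mode was changed, False if the file was
    already executable for user, group and others.
    """
    mode = os.stat(path).st_mode
    exec_bits = stat.S_IXUSR | stat.S_IXGRP | stat.S_IXOTH
    if mode & exec_bits == exec_bits:
        return False
    os.chmod(path, mode | exec_bits)  # same effect as `chmod a+x path`
    return True
```

For example, `ensure_executable('/usr/bin/chromedriver')` could be called once before creating the driver.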

Related

Selenium Python Not Working In Google Colab

I used the code below on Google Colab and installed the selenium package:
# !pip install selenium
# !apt-get update # to update ubuntu to correctly run apt install
# !apt install chromium-chromedriver
# !cp /usr/lib/chromium-browser/chromedriver /usr/bin
import sys
sys.path.insert(0,'/usr/lib/chromium-browser/chromedriver')
from selenium import webdriver
chrome_options = webdriver.ChromeOptions()
chrome_options.add_argument('--headless')
chrome_options.add_argument('--no-sandbox')
chrome_options.add_argument('--disable-dev-shm-usage')
wd = webdriver.Chrome('chromedriver',chrome_options=chrome_options)
wd.get("https://pooya.um.ac.ir/gateway/PuyaAuthenticate.php")
But the page just loads for a while and nothing works.
How can I fix this so that I can access any website?
The following function works.
The !apt... steps are the important part for using Selenium in Google Colab.
!apt update
!apt install chromium-chromedriver
!pip install selenium
from selenium import webdriver
def driversetup():
    options = webdriver.ChromeOptions()
    # run Selenium in headless mode
    options.add_argument('--headless')
    options.add_argument('--no-sandbox')
    # overcome limited resource problems
    options.add_argument('--disable-dev-shm-usage')
    options.add_argument("lang=en")
    # open Browser in maximized mode
    options.add_argument("start-maximized")
    # disable infobars
    options.add_argument("disable-infobars")
    # disable extension
    options.add_argument("--disable-extensions")
    options.add_argument("--incognito")
    options.add_argument("--disable-blink-features=AutomationControlled")
    options.add_experimental_option("excludeSwitches", ["enable-automation"])
    options.add_experimental_option('useAutomationExtension', False)
    driver = webdriver.Chrome(options=options)
    driver.execute_script("Object.defineProperty(navigator, 'webdriver', {get: () => undefined});")
    return driver
driver = driversetup()
driver.get('https://www.google.com')
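Before calling driversetup(), it can help to confirm that the apt step really put a chromedriver on PATH. This is a small sketch using only the standard library. `find_driver` is a hypothetical helper name, and the default name `chromedriver` is assumed from the install commands above.

```python
import shutil


def find_driver(name="chromedriver", search_path=None):
    """Return the full path of the driver executable, or None if not found.

    `search_path` is an os.pathsep-separated string of directories;
    None means "use the PATH environment variable".
    """
    return shutil.which(name, path=search_path)


# In Colab, after `!apt install chromium-chromedriver`:
# print(find_driver())  # a path if the install worked, otherwise None
```

If this returns None, the webdriver.Chrome(...) call is unlikely to find the driver either.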

Convert HTML files to png in Colab

I want to convert HTML files created with Folium to PNG images so that I can combine them into a single GIF.
I'm stuck on converting the HTML to images. Here is what I've tried so far (on Colab):
Code1:
!pip install selenium
!apt-get update # to update ubuntu to correctly run apt install
!apt install chromium-chromedriver
!cp /usr/lib/chromium-browser/chromedriver /usr/bin
import sys
sys.path.insert(0,'/usr/lib/chromium-browser/chromedriver')
from selenium import webdriver
chrome_options = webdriver.ChromeOptions()
chrome_options.add_argument('--headless')
chrome_options.add_argument('--no-sandbox')
chrome_options.add_argument('--disable-dev-shm-usage')
wd = webdriver.Chrome('chromedriver',chrome_options=chrome_options)
wd.get("https://www.webite-url.com")
import os
import imageio
import webbrowser
Error1:
WebDriverException: Message: unknown error: Chrome failed to start: exited abnormally.
(unknown error: DevToolsActivePort file doesn't exist)
(The process started from chrome location /usr/bin/chromium-browser is no longer running, so ChromeDriver is assuming that Chrome has crashed.)
Code 2:
import os
import subprocess
url="osm1.html"
outfn = "outfig.png"
subprocess.check_call(["{}".format(url), "--out={}".format(outfn)])
Error2:
FileNotFoundError: [Errno 2] No such file or directory: 'osm1.html': 'osm1.html'
(osm1.html is in the root of Colab)
Code 3:
!pip install bokeh
import bokeh
from bokeh.io import export_png
url="osm1.html"
export_png(url, filename="plot.png")
Error3:
ValueError: OutputDocumentFor expects a sequence of Models
Code 4:
!pip install imgkit
!apt-get install xvfb
!apt-get install wkhtmltopdf
import imgkit
imgkit.from_file('osm1.html', 'out.jpg')
Error4:
OSError: wkhtmltoimage exited with non-zero code 1. error:
QStandardPaths: XDG_RUNTIME_DIR not set, defaulting to '/tmp/runtime-root'
qt.qpa.screen: QXcbConnection: Could not connect to display
Could not connect to any X display.
You need to install xvfb(sudo apt-get install xvfb, yum install xorg-x11-server-Xvfb, etc), then add option: {"xvfb": ""}.
Can any of the approaches above be fixed easily? I don't know of any other ways to do it.
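Two observations on the attempts above. Code 2 fails because subprocess.check_call treats its first list element as the program to run, so it tries to execute osm1.html itself. Code 1 only opens a remote placeholder URL and never loads the local file. One possible approach, sketched below, is to load the file through a file:// URL in a headless driver and take a screenshot. It assumes a driver that starts successfully, so the DevToolsActivePort error from Code 1 has to be resolved first. `html_to_png` and `wait_seconds` are hypothetical names, and the pause for map tiles is a guess, not a measured value.

```python
import time
from pathlib import Path


def html_to_png(driver, html_path, png_path, wait_seconds=2.0):
    """Render a local HTML file in `driver` and save a screenshot as PNG.

    `driver` is assumed to be an already-started Selenium WebDriver;
    only its get() and save_screenshot() methods are used.
    Returns the file:// URI that was loaded.
    """
    # Browsers need a URL, not a bare file name, so build a file:// URI.
    uri = Path(html_path).resolve().as_uri()
    driver.get(uri)
    # Assumption: map tiles load asynchronously, so pause before capturing.
    time.sleep(wait_seconds)
    driver.save_screenshot(str(png_path))
    return uri
```

Called once per map, for example `html_to_png(wd, 'osm1.html', 'osm1.png')`, this would leave a set of PNG files that imageio could then combine into a GIF.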

Is there a way we can use Selenium on Google Colab like in Jupyter Notebook?

I've been using Selenium with Jupyter notebook (with Chrome webdriver) for a while. Using it on Jupyter NB, a new automated window is opened and I can see my code at work which is a great utility when the automation includes selecting an option from a drop down or such.
But while using Selenium (with Chrome web driver) on Google Colab, a new automated tab/window is not opened and I can't really see what my code is doing. Feels like being in a dark cave without a Torch.
Can anyone tell me how I can see an automated tab running the code while using Selenium on Colab?
This is what I have tried till now:
!pip install selenium
!apt-get update # to update ubuntu to correctly run apt install
!apt install chromium-chromedriver
!cp /usr/lib/chromium-browser/chromedriver /usr/bin
import sys
sys.path.insert(0,'/usr/lib/chromium-browser/chromedriver')
from selenium import webdriver
chrome_options = webdriver.ChromeOptions()
chrome_options.add_argument('--headless')
chrome_options.add_argument('--no-sandbox')
chrome_options.add_argument('--disable-dev-shm-usage')
wd = webdriver.Chrome('chromedriver',chrome_options=chrome_options)
wd.get("https://www.python.org")
and this too
!pip install selenium
!apt-get update
!apt install chromium-chromedriver
from selenium import webdriver
chrome_options = webdriver.ChromeOptions()
chrome_options.add_argument('--headless')
chrome_options.add_argument('--no-sandbox')
chrome_options.add_argument('--disable-dev-shm-usage')
wd = webdriver.Chrome('chromedriver',chrome_options=chrome_options)
driver =webdriver.Chrome('chromedriver',chrome_options=chrome_options)
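As far as I know, Colab runs the code on a remote machine with no display attached, which is why both snippets pass --headless and why no window appears. A possible workaround is to save a screenshot after each step and look at the images in the notebook. The sketch below assumes a started driver. `StepRecorder` and the `steps` directory are hypothetical names.

```python
from pathlib import Path


class StepRecorder:
    """Save a numbered screenshot after each step of a headless session."""

    def __init__(self, driver, out_dir="steps"):
        self.driver = driver
        self.out_dir = Path(out_dir)
        self.out_dir.mkdir(parents=True, exist_ok=True)
        self.count = 0

    def snap(self, label):
        """Write <out_dir>/<NN>_<label>.png and return its path."""
        self.count += 1
        path = self.out_dir / f"{self.count:02d}_{label}.png"
        self.driver.save_screenshot(str(path))
        return path


# Usage in a Colab cell (assumes `wd` from the code above):
# rec = StepRecorder(wd)
# wd.get("https://www.python.org")
# shot = rec.snap("home")
# from IPython.display import Image
# Image(filename=str(shot))
```

This is not a live view, but it shows what the page looked like at each point where snap() was called.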

OSError: [Errno 8] Exec format error: 'chromedriver' using Chromedriver on Ubuntu server

I'm trying to use Chromedriver with Ubuntu (AWS instance). I've gotten Chromedriver to work without problems on a local instance, but I'm having many issues doing so on a remote instance.
I'm using the following code:
options = Options()
options.add_argument('--no-sandbox')
options.add_argument('--headless')
options.add_argument('--disable-dev-shm-usage')
options.add_argument("--remote-debugging-port=9222")
driver = webdriver.Chrome(executable_path='/usr/bin/chromedriver', chrome_options=options)
However, I keep getting this error:
Traceback (most recent call last):
File "test.py", line 39, in <module>
driver = webdriver.Chrome()
File "/home/ubuntu/.local/lib/python3.6/site-packages/selenium/webdriver/chrome/webdriver.py", line 73, in __init__
self.service.start()
File "/home/ubuntu/.local/lib/python3.6/site-packages/selenium/webdriver/common/service.py", line 76, in start
stdin=PIPE)
File "/usr/lib/python3.6/subprocess.py", line 729, in __init__
restore_signals, start_new_session)
File "/usr/lib/python3.6/subprocess.py", line 1364, in _execute_child
raise child_exception_type(errno_num, err_msg, err_filename)
OSError: [Errno 8] Exec format error: 'chromedriver'
I believe I'm using the most updated version of Selenium, Chrome, and Chromedriver.
Chrome version is:Version 78.0.3904.70 (Official Build) (64-bit)
Selenium:
ubuntu@ip-172-31-31-200:/usr/bin$ pip3 show selenium
Name: selenium
Version: 3.141.0
Summary: Python bindings for Selenium
Home-page: https://github.com/SeleniumHQ/selenium/
Author: UNKNOWN
Author-email: UNKNOWN
License: Apache 2.0
Location: /home/ubuntu/.local/lib/python3.6/site-packages
Requires: urllib3
And, finally, for Chromedriver, I'm almost certain I downloaded the most recent version here: https://chromedriver.storage.googleapis.com/index.html?path=78.0.3904.70/. It's the mac_64 version (I'm using Ubuntu on a Mac). I then placed chromedriver in /usr/bin , as I read that's common practice.
I have no idea why this isn't working. A few options I can think of:
some sort of access issue? I'm a beginner with command line and ubuntu - should I be running this as "root" user?
mis-match between Chromedriver and Chrome versions? Is there a way to tell which chromedriver version I have for certain?
I see that Chromedriver and Selenium are in different locations. Selenium is in: Location: /home/ubuntu/.local/lib/python3.6/site-packages and I've moved chromedriver to: /usr/bin . Could this be causing problems?
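Errno 8 means the kernel did not recognize the file as something it can execute, which points at the binary format rather than at permissions or install location. One way to check is to look at the first bytes of the file. In the sketch below, `binary_format` is a hypothetical helper; the magic numbers are the standard ones for ELF, Mach-O and PE files.

```python
def binary_format(path):
    """Guess the executable format of `path` from its first four bytes."""
    with open(path, "rb") as f:
        magic = f.read(4)
    if magic == b"\x7fELF":
        return "ELF (Linux)"
    # Mach-O: 32/64-bit in either byte order, plus universal binaries.
    # (0xCAFEBABE is also used by Java class files.)
    if magic in (b"\xcf\xfa\xed\xfe", b"\xce\xfa\xed\xfe",
                 b"\xfe\xed\xfa\xcf", b"\xfe\xed\xfa\xce",
                 b"\xca\xfe\xba\xbe"):
        return "Mach-O (macOS)"
    if magic[:2] == b"MZ":
        return "PE (Windows)"
    return "unknown"
```

On an Ubuntu server, `binary_format('/usr/bin/chromedriver')` should report ELF. A mac_64 download would show up as Mach-O, which Linux cannot run.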
Ubuntu Server 18.04 LTS (64-bit Arm):
Download Chrome: wget https://dl.google.com/linux/direct/google-chrome-stable_current_amd64.deb
Install Chrome: sudo dpkg -i google-chrome-stable_current_amd64.deb
If you get an error, run: sudo apt-get -f install
Check Chrome: google-chrome --version
Download chromedriver for Linux: wget https://chromedriver.storage.googleapis.com/78.0.3904.70/chromedriver_linux64.zip
Unzip chromedriver (install unzip first with sudo apt install unzip if required): unzip chromedriver_linux64.zip
Move chromedriver to /usr/bin: sudo mv chromedriver /usr/bin/chromedriver
Check chromedriver, run command: chromedriver
Install Java: sudo apt install default-jre
Install Selenium: sudo pip3 install selenium
Create a test file with nano test.py and the content below. Press CTRL+X to exit and then Y to save. Run your script with python3 test.py
#!/usr/bin/python3
from selenium import webdriver
from selenium.webdriver.chrome.options import Options
options = Options()
options.add_argument('--no-sandbox')
options.add_argument('--headless')
options.add_argument('--disable-dev-shm-usage')
options.add_argument("--remote-debugging-port=9222")
driver = None
try:
    driver = webdriver.Chrome(chrome_options=options)
    driver.get("https://www.google.com")
    s = driver.find_element_by_name("q")
    assert s.is_displayed() is True
    print("ok")
except Exception as ex:
    print(ex)
finally:
    # Only quit if the driver was actually created
    if driver is not None:
        driver.quit()
Example of using Docker and selenium/standalone-chrome-debug:
Install docker, installation steps are here
Start container, using sudo docker run -d -p 4444:4444 -v /dev/shm:/dev/shm selenium/standalone-chrome:3.141.59-xenon command, different options are here
Open Security Group of your instance in AWS and add TCP rule to be able to connect. You can add only your own IP and port 4444 for Selenium
Run test from local
options = webdriver.ChromeOptions()
options.add_argument('--no-sandbox')
options.add_argument('--headless')
options.add_argument('--disable-dev-shm-usage')
options.add_argument("--remote-debugging-port=9222")
driver = webdriver.Remote(command_executor="http://your_instance_ip:4444/wd/hub",
                          desired_capabilities=options.to_capabilities())
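Before running the remote test, it may be worth checking that the container is reachable through the security group and reports itself as ready. The sketch below assumes the server answers GET requests at `<hub url>/status` with a JSON body containing `value.ready`, as the WebDriver status command describes. `hub_is_ready` and `fetch_hub_status` are hypothetical helper names, and the exact payload of this image has not been verified here.

```python
import json
from urllib.request import urlopen


def hub_is_ready(status_payload):
    """Return True if a decoded /status response reports the server as ready."""
    value = status_payload.get("value") or {}
    return bool(value.get("ready"))


def fetch_hub_status(base_url, timeout=5):
    """GET <base_url>/status and decode the JSON body (needs network access)."""
    with urlopen(base_url.rstrip("/") + "/status", timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


# Usage (replace the host with your instance address):
# status = fetch_hub_status("http://your_instance_ip:4444/wd/hub")
# print(hub_is_ready(status))
```

A timeout or connection error here usually means the port is not open, which is easier to diagnose than a failure inside webdriver.Remote.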
I am running the following on ec2-ubuntu:
from selenium import webdriver
from selenium.webdriver.chrome.options import Options
options = Options()
options.headless = True
driver = webdriver.Chrome("/usr/lib/chromium-browser/chromedriver", chrome_options=options) #Give the full path to chromedriver
Try it. In case it doesn't work, I will look for more of the settings.

Python Selenium Geckodriver Connection refused

I spent hours trying to make Selenium work with Python, with no luck.
I get this error message:
selenium.common.exceptions.WebDriverException: Message: connection refused
This is the example I used:
from pyvirtualdisplay import Display
from selenium import webdriver
display = Display(visible=0, size=(800, 600))
display.start()
browser = webdriver.Firefox()
browser.get('http://www.python.org')
browser.close()
These are the dependencies I installed:
apt-get install -y xorg xvfb dbus-x11 xfonts-100dpi xfonts-75dpi xfonts-cyrillic
This is /root/geckodriver.log output
1493938773101 geckodriver INFO Listening on 127.0.0.1:40876
1493938774156 geckodriver::marionette INFO Starting browser
/usr/lib/firefox/firefox.sh with args ["-marionette"] (firefox:3128):
GLib-GObject-CRITICAL **: g_object_ref: assertion 'object->ref_count >
0' failed
I'm running Selenium on Ubuntu 14.04 64-bit VPS remote server with 128MB RAM
I can't figure out what is preventing Selenium from communicating with the browser drivers, for both Chrome and Firefox.
Please start by checking your Firefox browser version.
I found this very confusing at one point. I'm using Raspbian, and the "Iceweasel" package installed with apt-get was Firefox 52, which doesn't work with geckodriver 0.19 (that release requires Firefox 55 or greater).
What worked for me was downloading geckodriver v0.16, which resolved the problem.
What's more, you probably don't need xorg to make it work; the only packages I needed were xvfb and iceweasel.
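The version check can be scripted. The sketch below parses text of the form printed by `firefox --version` (assumed to look like "Mozilla Firefox 52.9.0") and compares it with the minimum mentioned above. `firefox_major_version`, `is_compatible` and the `MIN_FIREFOX` table are hypothetical names, and the table holds only the one pairing stated in this answer.

```python
import re


def firefox_major_version(version_output):
    """Extract the major version from text like 'Mozilla Firefox 52.9.0'."""
    match = re.search(r"(\d+)(?:\.\d+)*", version_output)
    if match is None:
        raise ValueError(f"no version number in {version_output!r}")
    return int(match.group(1))


# Minimum Firefox major version per geckodriver release, as stated above.
MIN_FIREFOX = {"0.19": 55}


def is_compatible(geckodriver_version, firefox_output):
    """True if the Firefox version meets the minimum for that geckodriver."""
    required = MIN_FIREFOX[geckodriver_version]
    return firefox_major_version(firefox_output) >= required
```

With the numbers from this answer, `is_compatible("0.19", "Mozilla Firefox 52.0")` is False, which matches the failure described.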
Ok, I gave up on Geckodriver and I use PhantomJS as my webdriver.
from pyvirtualdisplay import Display  # needed for Display() below
from selenium import webdriver
display = Display(visible=0, size=(800, 600))
display.start()
driver = webdriver.PhantomJS()
driver.get('http://www.python.org')
html_source = driver.page_source
print ("html_source:",html_source)
driver.quit()
Here are the steps I used to install PhantomJS :
cd ~
export PHANTOM_JS="phantomjs-2.1.1-linux-x86_64"
wget https://bitbucket.org/ariya/phantomjs/downloads/phantomjs-2.1.1-linux-x86_64.tar.bz2
tar xvjf $PHANTOM_JS.tar.bz2
mv $PHANTOM_JS /usr/local/share
ln -sf /usr/local/share/$PHANTOM_JS/bin/phantomjs /usr/local/bin
Python Selenium
apt-get install python-pip -y
pip uninstall pyvirtualdisplay
apt-get install x11vnc xvfb fluxbox
Xvfb :99 -ac
xvfb-run -a python 99.py
pip uninstall selenium
pip install selenium==2.53.1
See also How to install PhantomJS on Ubuntu.
