I am running selenium webdriver (firefox) using python on a headless server. I am using pyvirtualdisplay to start and stop the Xvnc display to grab the image of the sites I am visiting. This is working great except flash content is not loading on the pages (I can tell because I am taking screenshots of the pages and I just see empty space where flash content should be on the screenshots).
When I run the same program on my local unix machine, the flash content loads just fine. I have installed flash on my server, and have libflashplayer.so in /usr/lib/mozilla/plugins. The only difference seems to be that I am using the Xvnc display on the server (unless flash wasn't installed properly? But I believe it was, since I used to get a message asking me to install flash when I viewed a site that had flash content, and since installing flash I don't get that message anymore).
Does anyone have any ideas or experience with this? Is there a trick to getting flash to load using a firefox webdriver on a headless server? Thanks
It turns out, I needed to use selenium to scroll down the page to load all the content.
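A minimal sketch of what such scrolling could look like. The helper name scroll_to_bottom and its pause and max_rounds parameters are hypothetical; only execute_script is a real WebDriver method, and the right pause depends on how quickly the page loads its content.

```python
import time


def scroll_to_bottom(driver, pause=1.0, max_rounds=20):
    """Scroll until the page height stops growing; return the number of scrolls."""
    last_height = driver.execute_script("return document.body.scrollHeight")
    rounds = 0
    while rounds < max_rounds:
        # Jump to the current bottom of the document
        driver.execute_script("window.scrollTo(0, document.body.scrollHeight);")
        rounds += 1
        time.sleep(pause)  # give lazily loaded content time to appear
        new_height = driver.execute_script("return document.body.scrollHeight")
        if new_height == last_height:
            break  # nothing new was added, so we are at the real bottom
        last_height = new_height
    return rounds
```

It would be called as scroll_to_bottom(driver) after driver.get(url) and before taking the screenshot.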
I have a Python GUI application that uses Selenium and Chromedriver to crawl sites, interact with elements, download files, etc. The application has been packaged as a standalone .exe (produced using PyInstaller) and has performed well in tests across a few different Windows and Mac machines. However, on one machine it is producing WinError 10061, screenshot below:
A few other details:
The Web Crawler application appears to work fine and hit all targets when run in headless mode
Directly before this error, the crawler successfully 1) opened the Chromedriver browser (outside of headless mode, so the webpage was visible) and 2) accessed the start URL and performed automated tasks on the page (i.e., filling out and completing a login page, clicking the 'Submit' button, refreshing the page). It's only when accessing subsequent URLs that Chromedriver quits and produces this error. I'm not sure why it is able to successfully initiate the browser, get the start URL and perform tasks, but fails upon getting another URL on the same site.
The URL it fails upon is https://econtent.hogrefe.com/toc/prx/current, but the error has been seen on completely different sites that similarly do not use the headless browser.
Any ideas as to what's happening here?
I am trying to create a python application while using eel to create a user interface in html. My operating system is Ubuntu Linux and I'm using Firefox to display the web interface.
The problem I'm having is every time I run the python code, Firefox opens a blank page saying "Unable to connect" followed by "Firefox can't establish a connection to the server at localhost:8000". However, if I click the "Try Again" button once, twice, or three times, my interface is displayed.
Once open, I can navigate to different pages but I also noticed that once I navigate to a different page, some of my javascript stops working (specifically a window.close() function). I don't know if this is related but I thought I would mention it just in case.
Any advice on the matter would be greatly appreciated.
Thank you.
I changed my browser from firefox to chromium and now my interface loads on startup the first time. I know some documentation says it can be used with firefox, and it can, but it seems to be kind of buggy and works better with other browsers.
However, I'm still having trouble with my javascript not running but that will be another question.
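If switching browsers is not an option, one thing worth checking is whether the browser is simply opened before the local server is listening, which would match the "Try Again" behaviour. This is an assumption, not something confirmed in this thread. The sketch below uses a hypothetical helper, wait_for_port, built only on the standard library:

```python
import socket
import time


def wait_for_port(host, port, timeout=10.0, interval=0.2):
    """Return True once a TCP connection to (host, port) succeeds, False on timeout."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            # A successful connect means something is listening on the port
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            time.sleep(interval)  # not listening yet, retry shortly
    return False
```

The idea would be to start the server without opening a window, call wait_for_port("localhost", 8000), and only then open the page, for example with the standard webbrowser module. Whether this removes the error with Firefox has not been tested here.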
When using Selenium ChromeDriver in headless mode, I first need to log in to the website I'm trying to scrape, so that the browser data gets stored via
option.add_argument(r'--user-data-dir=.\UserData').
Unfortunately, the cookies don't work in headless mode if I enter the login information in non-headless mode.
The easiest way around that for me was to display the headless browser via option.add_argument('--remote-debugging-port=9222').
This worked like a charm on my previous PC; however, on my new, current one it just displays a white screen of death on localhost.
python: 3.7.9
I pasted the code from my previous PC, where it worked perfectly fine, so it's not a code-side error.
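For reference, the flag combination described above can be collected in one place. This is only a sketch: build_chrome_args and make_driver are hypothetical helper names, and resolving the profile directory to an absolute path is an assumption that the original code did not make.

```python
from pathlib import Path


def build_chrome_args(user_data_dir, debug_port=9222, headless=True):
    """Return the Chrome flags from the post as a list of strings (hypothetical helper)."""
    # Assumption: resolve to an absolute path so the profile location does not
    # depend on the current working directory; the original used a relative path.
    profile = Path(user_data_dir).resolve()
    args = [
        f"--user-data-dir={profile}",
        f"--remote-debugging-port={debug_port}",
    ]
    if headless:
        args.append("--headless")
    return args


def make_driver(user_data_dir="UserData", headless=True):
    """Start Chrome with those flags; needs selenium and a matching chromedriver."""
    from selenium import webdriver  # imported lazily so the helper above works alone

    options = webdriver.ChromeOptions()
    for arg in build_chrome_args(user_data_dir, headless=headless):
        options.add_argument(arg)
    return webdriver.Chrome(options=options)
```

Logging in once with make_driver(headless=False) and then reusing the same directory with headless=True mirrors the workflow in the post. It does not by itself explain the white page on the new machine.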
When I try to open opensea.io with Selenium, it shows a Cloudflare captcha. Even if I solve the captcha, the captcha page does not redirect to opensea.io.
Update: Installing a VPN solved this, but there must be other ways.
driver.get("https://opensea.io")
Error screenshot given below.
Edited:
There are several possible causes for this kind of problem:
Cloudflare blocked your IP address. Try a new IP through a proxy (or a VPN, or another ISP) and see whether it works. (https://community.cloudflare.com/t/cant-bypass-cloudflare-captcha/200335/8)
Depending on the Selenium and browser versions, the automated browser can reveal that it is being driven by automation (for example through the navigator.webdriver property), which lets websites detect Selenium, so Cloudflare then blocks the request.
The browser is the problem. Try a different browser like Firefox.
Cloudflare or the website you are trying to reach depends on special cookies that are not available in a fresh Selenium browser session (this was my wild guess, but it's not the case).
P.S.: I have tried to connect to this URL (https://opensea.io), and interestingly, it worked fine for me.
Here is some information about the environment I performed this action on:
Operating System: CentOS 7, Linux
Selenium Standalone Version: 4.0.0
Java Version: jre-8u311-linux-x64
The browser I used: Firefox
I am developing a web page crawler. Unfortunately, the website generates its results via AJAX. Following some coders' suggestions, I tried Selenium, a browser automation tool with Python bindings.
As in the example given in the documentation:
from selenium import webdriver
driver = webdriver.Firefox()
This code opens the Firefox browser, after which the script can fill in forms, submit them, and so on.
Frankly speaking, this example works well on my PC (Ubuntu 12.10), but my project will eventually be moved to a CentOS server.
What I am wondering is whether code that needs to open a browser GUI can run successfully on the CentOS server over SSH, because no desktop environment such as GNOME or KDE is provided on that machine.
And if the code cannot work without a browser GUI, are there any other solutions?
Any reply would be appreciated.
You can probably use the HtmlUnit driver if you enable JavaScript. The only way to be sure, though, is to test it out. Another option would be to try running with an X framebuffer (Xvfb).
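A rough sketch of the X framebuffer option in Python, assuming the Xvfb binary, the pyvirtualdisplay package, Selenium and Firefox with its driver are all installed on the server. The function name title_via_xvfb and the display size are illustrative choices, not something from the original answer.

```python
def title_via_xvfb(url, size=(1280, 1024)):
    """Open `url` in Firefox inside an Xvfb display and return the page title."""
    # Imported lazily so this module loads even where the packages are missing
    from pyvirtualdisplay import Display
    from selenium import webdriver

    display = Display(visible=0, size=size)  # visible=0 selects the Xvfb backend
    display.start()
    try:
        driver = webdriver.Firefox()
        try:
            driver.get(url)
            return driver.title
        finally:
            driver.quit()  # always close the browser
    finally:
        display.stop()  # always tear down the virtual display
```

Because the display exists only in memory, this should work in an SSH session with no desktop environment, although that still needs to be verified on the target CentOS machine.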