Selenium cannot open opensea.io - python

When I try to open opensea.io with Selenium, it presents a Cloudflare captcha, and even when I solve the captcha, the page does not redirect to opensea.io.
Update: Installing a VPN solved this, but there must be other ways.
driver.get("https://opensea.io")
Error screenshot (not shown): Cloudflare challenge page.

Edited:
There might be several reasons causing this kind of problem:
Cloudflare blocked your IP. Try using a new IP through a proxy (or a VPN, or another ISP) and see whether it works. (https://community.cloudflare.com/t/cant-bypass-cloudflare-captcha/200335/8)
Depending on the Selenium version and edition, the browser may be explicitly flagged as automated, letting websites detect that it is Selenium, so Cloudflare blocks the request (see the sketch after this list).
The browser is the problem. Try a different browser, such as Firefox.
Cloudflare or the website you are trying to reach depends on cookies that are not available in a fresh Selenium browser profile (this was my wild guess, but it's not the case).
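To test the second point, a common workaround is to hide the most obvious automation fingerprints via Chrome options. This is only a minimal sketch of that idea, not a guaranteed Cloudflare bypass:

from selenium import webdriver

options = webdriver.ChromeOptions()
# Remove the "Chrome is being controlled by automated test software"
# banner and the automation switch that some sites check for.
options.add_experimental_option("excludeSwitches", ["enable-automation"])
options.add_experimental_option("useAutomationExtension", False)
# Stop Blink from setting navigator.webdriver = true.
options.add_argument("--disable-blink-features=AutomationControlled")

driver = webdriver.Chrome(options=options)
driver.get("https://opensea.io")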
P.S.: I have tried to connect to this URL (https://opensea.io), and interestingly, it worked fine for me.
Here is some information about the environment I performed this action on:
Operating System: CentOS 7, Linux
Selenium Standalone Version: 4.0.0
Java Version: jre-8u311-linux-x64
The browser I used: Firefox

Related

ERROR:gcm_channel_status_request.cc(145)] GCM channel request failed message shows in terminal with python project

When I use Python + Selenium to run my code, the error message below shows in the terminal while the program is running. It has no effect on my program, but it is sort of annoying. Does anyone have a solution, or can tell me why the error shows and how to disable the message?
My network is located in China, and our network policy blocks access to Google; may that be the cause?
[21792:15920:1230/144009.402:ERROR:gcm_channel_status_request.cc(145)] GCM channel request failed.
Anyway, thanks in advance.
Note: This is not a definitive answer; it is very rough conclusions drawn from a quick investigation.
GCM is Google Cloud Messaging. It appears that the Selenium Chrome extension uses GCM. See for example https://pushwizard.com/chrome-gcm-messaging.
I see these messages when my Python Selenium Chrome application is sleeping and I hibernate the machine. It may occur at other times.
I surmise that the Selenium Python backend uses GCM with a ping or keep-alive type of message to find out whether the Chrome browser is still there. I further surmise that since my Python application takes much less memory than the Chrome browser, my app wakes up first, pings the browser, and reports that there was no response.
Since it's not causing a problem, this is enough for me.
Google Cloud Messaging is deprecated - you should use Firebase Cloud Messaging instead.
You may need to update your Selenium WebDriver for Chrome.
Since your access to Google (and, I think, Google services too) is disabled, this could also be the issue. To test this, implement the Selenium application with Firefox (or any other non-Chromium-based browser), which shouldn't use Google services inherently (though I am not 100% sure about that).
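If the goal is simply to silence such messages, a commonly cited workaround is to suppress Chrome's console logging. This is a sketch of that idea; it hides the noise rather than fixing the underlying GCM request:

from selenium import webdriver

options = webdriver.ChromeOptions()
# 'enable-logging' is the switch that lets Chrome write these
# messages to the console; excluding it silences them.
options.add_experimental_option("excludeSwitches", ["enable-logging"])
options.add_argument("--log-level=3")  # only log fatal errors

driver = webdriver.Chrome(options=options)
driver.get("https://example.com")
driver.quit()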

Selenium cannot get webpage content on Linux but work well on Windows for a specific website

I am web-scraping Bitcoin quotations from Coinsuper. It is a JavaScript page. When I first developed my code on Windows using Python 3.7, Selenium, and Chromium, it worked well.
I want to deploy this code on my server to fetch data continuously. However, it doesn't work under Linux.
I am sure my code can work, at least on most websites, including Apple, Google, Baidu, Xueqiu, etc.
For the OS system, I have tried Debian 9 and Ubuntu 18.04.
For webdriver, I have tried both Chrome and Firefox.
For webdriver parameters, I have tried:
Adding headers, including fake-useragent
Ignoring the SSL certificate
Disabling the GPU
These made no difference.
I think it might be because Coinsuper has some anti-scraping strategy. But I am also confused why similar code works on Windows but not on Linux. Are there any differences that might cause this situation?
The code:
from selenium import webdriver
chrome_options = webdriver.ChromeOptions()
chrome_options.add_argument('--headless')
chrome_options.add_argument('--disable-gpu') # Only included in Linux version
chrome_options.add_argument('--no-sandbox') # Only included in Linux version
driver = webdriver.Chrome(options=chrome_options)
driver.get('https://www.coinsuper.com/trade')
print(driver.page_source)
driver.quit()
I am the one who asked this question. Thank you all for helping me! Finally, I have solved this problem.
@furas showed that my code could actually get responses from Coinsuper.
@Dalvenjia suggested that this might be caused by an IP blacklist, which is most likely for cloud servers. And yes, I am using a cloud server.
Here is the solution:
Start a Shadowsocks server from my home IP address, or use any proxy you have.
Start the Shadowsocks client on the server.
Add one more argument to ChromeDriver in the Python script:
chrome_options.add_argument('--proxy-server=socks5://127.0.0.1:xxxx')
Now I can get the contents by bypassing the IP blacklist.
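Putting it together, a minimal sketch of the working setup; the local port 1080 is an assumption, so use whatever port your Shadowsocks client actually listens on:

from selenium import webdriver

chrome_options = webdriver.ChromeOptions()
chrome_options.add_argument('--headless')
chrome_options.add_argument('--no-sandbox')
# Route all browser traffic through the local SOCKS5 proxy.
chrome_options.add_argument('--proxy-server=socks5://127.0.0.1:1080')

driver = webdriver.Chrome(options=chrome_options)
driver.get('https://www.coinsuper.com/trade')
print(driver.page_source)
driver.quit()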
I recommend you use the WebDriverManager dependency:
https://github.com/bonigarcia/webdrivermanager
With WebDriverManager, you don't need to download drivers or manage driver paths in code.
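The linked project is a Java library; for Python the same idea exists as the webdriver-manager package (pip install webdriver-manager). A hedged sketch of its Selenium 3-style usage:

from selenium import webdriver
from webdriver_manager.chrome import ChromeDriverManager

# install() downloads a chromedriver matching the installed Chrome
# and returns its path, so there is nothing to manage by hand.
driver = webdriver.Chrome(ChromeDriverManager().install())
driver.get('https://www.coinsuper.com/trade')
driver.quit()

With Selenium 4 you would pass the path via a Service object instead of positionally.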

ChromeDriver Does Not Open Websites

I am experiencing very strange behaviour when testing Chrome via Selenium WebDriver.
Instead of navigating to pages as one would expect, the get command leads only to the download of tiny files (with no type, or .aspx files) from the target site.
Importantly, this behavior only occurs when I pass chrome_options as an argument to the Chrome driver.
The same testing scripts work flawlessly with the Firefox driver.
Code:
from selenium import webdriver
from selenium.webdriver.chrome.options import Options # tried with and without
proxy = '127.0.0.1:9951' # connects to a proxy application
chrome_options = webdriver.ChromeOptions()
chrome_options.add_argument('--proxy-server=%s' % proxy)
driver = webdriver.Chrome(chrome_options=chrome_options)
driver.get('http://whatismyip.com')  # full URL including the scheme
This leads to the automatic download of a file called download (no file extension, size 2 bytes).
Calling other sites results in the download of small .aspx files.
This all happens while the browser page remains blank and no interaction with elements happens, i.e. the site is not loaded at all.
No error message is thrown, except element not found.
This is really strange.
Additional info:
I run Debian Wheezy 32 bit and use Python 2.7.
Any suggestions how to solve this issue?
I tried your code and captured the traffic on localhost using a SOCKS v5 proxy through SSH. It is definitely sending data through the proxy, but no data is coming back. I have confirmed the proxy was working using Firefox.
I'm running Google Chrome on Ubuntu 14.04 LTS 64-bit. My Chrome browser gives me the following message when I try to configure a proxy in its settings menu:
When running Google Chrome under a supported desktop environment, the system proxy settings will be used. However, either your system is not supported or there was a problem launching your system configuration. But you can still configure via the command line. Please see man google-chrome-stable for more information on flags and environment variables.
Unfortunately I don't have a man page for google-chrome-stable.
I also discovered that, according to the Selenium documentation, Chrome uses the system-wide proxy settings, and that it is unknown how to set the proxy in Chrome programmatically: http://docs.seleniumhq.org/docs/04_webdriver_advanced.jsp#using-a-proxy
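That said, the --proxy-server flag generally does work when the proxy scheme is given explicitly; a bare host:port may be treated as an HTTP proxy even if the local application speaks SOCKS, which could explain the garbage downloads. A hedged sketch, assuming the application on port 9951 is a SOCKS v5 proxy:

from selenium import webdriver

proxy = 'socks5://127.0.0.1:9951'  # explicit scheme; use http:// for an HTTP proxy
chrome_options = webdriver.ChromeOptions()
chrome_options.add_argument('--proxy-server=%s' % proxy)

driver = webdriver.Chrome(options=chrome_options)
driver.get('http://whatismyip.com')  # full URL including the scheme
print(driver.title)
driver.quit()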

Run Selenium headlessly on a server over SSH

I am now developing a webpage crawler; unfortunately, the website generates its results via AJAX. Following some coders' suggestions, I tried Selenium, a test automation tool for Python.
As the example given in the documentation:
driver = webdriver.Firefox()
This code opens the Firefox browser. Then you can do things such as filling in a form, submitting it, and so on.
Frankly speaking, this example works well on my PC (Ubuntu 12.10), but my project will eventually be transferred to a CentOS server.
What I am wondering is whether this code (which needs to open a browser GUI) can successfully run on the CentOS server over SSH, since no desktop environment such as GNOME or KDE is provided on that machine.
And if the code cannot work without a browser GUI, is there any other solution?
Any reply would be appreciated.
You can probably use the HtmlUnit driver if you enable JavaScript. The only way to be sure, though, is to test it out. Another option would be to try running with an X framebuffer.
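A minimal sketch of the X framebuffer approach, assuming Xvfb is installed on the server and using the pyvirtualdisplay package (pip install pyvirtualdisplay) as a convenience wrapper:

from pyvirtualdisplay import Display
from selenium import webdriver

# Start a virtual X display so Firefox has somewhere to render.
display = Display(visible=0, size=(1024, 768))
display.start()

driver = webdriver.Firefox()
driver.get('http://example.com')
print(driver.title)

driver.quit()
display.stop()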

Selenium webdriver support for the latest versions of Firefox and Chrome

I am using selenium-2.35.0 and Python-2.7.
Testcases are written in python.
My Python code to create the driver object:
from selenium import webdriver
driver = webdriver.Remote(desired_capabilities={
    "browserName": "firefox"
})
And run the Selenium server with:
java -jar selenium-server-standalone-2.35.0.jar
I had my code working in Firefox 22: the Selenium server was running, I was able to run scripts in Python, and so on. So I'm confident the code works.
Recently, I updated Firefox to 23, and now all I get is
"[Errno 10061] No connection could be made because the target machine actively refused it."
I thought maybe I needed to restart the server, or something, but that seems to do nothing. Is this issue related to Selenium WebDriver's support for the latest browser version?
But according to this link, http://selenium.googlecode.com/git/java/CHANGELOG , Selenium supports Firefox 23. If it is supported, code that ran in Firefox 22 should also run in Firefox 23 without any code change.
And how can I make the same code work for Chrome?
I have found that the newest version of Firefox routinely doesn't work well with Selenium right away. Check out this Firefox support matrix on GitHub that someone made. Unfortunately, the only thing you can do is stop Firefox from auto-updating and keep your Selenium tests running on Firefox's newest version minus 1 or 2. Chrome tends to work out of the box with Selenium; sometimes the Beta channel has fixed particular Selenium issues, so try that if you hit one (on the other hand, it may introduce other bugs). So in the end you need to be constantly wary of browser updates and routinely check how they work with the current version of Selenium.
Check out this guide on how to get Selenium working with rolled back versions of firefox:
http://inkhorn.ca/selenium-python-on-ubuntu-using-firefox/
It will also fix any errors that have to do with “version xul**.0 not defined in file libxul.so”
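As for the Chrome part of the question: with the Selenium 2.x Remote setup, the same code can target Chrome by changing the capabilities, assuming the standalone server can find chromedriver (e.g., it was started with -Dwebdriver.chrome.driver=/path/to/chromedriver). A hedged sketch:

from selenium import webdriver

driver = webdriver.Remote(
    command_executor='http://127.0.0.1:4444/wd/hub',
    desired_capabilities={"browserName": "chrome"},
)
driver.get('http://example.com')
driver.quit()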
