Is it possible to write a Python program that automatically browses a given website with mechanize and detects whether popup windows appear (for example, advertisements or download prompts)? I would appreciate any hint, such as a library that handles this task.
Mechanize cannot handle javascript and popup windows:
How do I use Mechanize to process JavaScript?
Mechanize and Javascript
To accomplish this, you need a real browser, headless or not. This is where Selenium helps: it has built-in support for popup dialogs:
Selenium WebDriver has built-in support for handling popup dialog
boxes. After you've triggered an action that would open a popup, you
can access the alert with the following:
alert = driver.switch_to.alert
Example (using this jsfiddle):
from selenium import webdriver
from selenium.webdriver.common.by import By

url = "http://fiddle.jshell.net/ebkXh/show/"
driver = webdriver.Firefox()
driver.get(url)
# Locate the button that triggers the JavaScript alert
button = driver.find_element(By.XPATH, '//button[@type="submit"]')
# dismiss
button.click()
driver.switch_to.alert.dismiss()
# accept
button.click()
driver.switch_to.alert.accept()
See also:
Handle Popup Windows
Click the javascript popup through webdriver
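The example above handles JavaScript alerts. For the popup windows the question asks about (ads or download pages opened in a new window or tab), one possible approach is to compare the driver's window handles before and after loading the page. The following is only a sketch: the helper name open_and_find_popups and the example URL are made up for illustration, and popups that open after a delay, or that the browser's popup blocker suppresses, would not be seen by it.

```python
def open_and_find_popups(driver, url):
    """Load `url` and return the handles of windows opened as a side effect."""
    handles_before = set(driver.window_handles)
    driver.get(url)
    # Any handle that was not present before the page load belongs to a
    # window or tab that the page opened on its own.
    return [h for h in driver.window_handles if h not in handles_before]


def demo_with_firefox():
    # Needs Selenium and a Firefox driver installed; the URL is a placeholder.
    from selenium import webdriver

    driver = webdriver.Firefox()
    try:
        popups = open_and_find_popups(driver, "http://example.com/")
        print("popup windows opened:", len(popups))
    finally:
        driver.quit()
```

The helper only relies on driver.window_handles and driver.get, so it should work with any WebDriver browser. Call demo_with_firefox() to try it against a real page.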
Unfortunately, Mechanize's browser seems to skip the pop-ups so the title, URL, and HTML are identical for both pop-ups and normal pages.
Frankly, Python is not the right tool for this job and is lagging behind in this respect, IMHO. Having spent months crawling sites that use JavaScript extensively (and there are more of them every day), I find that JavaScript-based environments like PhantomJS or SlimerJS are simply better suited to what you're trying to do.
If you have the luxury of using a JavaScript-based environment, I'd say go right ahead. However, you can still use Python. PhantomJS embeds Ghost Driver. You can use Ghost.py to utilize the power of PhantomJS, or you can use Selenium with Python as illustrated here.
Related
For my Python Selenium script I have to complete a lot of CAPTCHAs. I noticed that the CAPTCHAs I get in my regular browser are much easier and quicker to solve. Is there a way to hide the fact that I'm using a web automation bot so that I get the easier CAPTCHAs?
I already tried randomizing the User Agent, but without success.
Open the website in your regular browser and inspect the page. Go to the Network tab, reload the page, and select the page you are accessing from the list of requests. If you scroll down to the request headers, you can see the user agent your browser sends. Use that user agent in your scraper to mimic your browser exactly.
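As a rough sketch of that last step, assuming Chrome is the automated browser: Chrome accepts a --user-agent command-line switch, which can be added to the Selenium options object. The helper name set_user_agent and the user agent string below are placeholders, not values taken from the question.

```python
def set_user_agent(options, user_agent):
    """Add Chrome's --user-agent switch to `options` and return it."""
    options.add_argument(f"--user-agent={user_agent}")
    return options


def demo_with_chrome():
    # Needs Selenium and chromedriver installed. Replace the placeholder
    # string with the user agent copied from your own browser.
    from selenium import webdriver
    from selenium.webdriver.chrome.options import Options

    options = set_user_agent(Options(), "PASTE YOUR BROWSER'S USER AGENT HERE")
    driver = webdriver.Chrome(options=options)
    try:
        # Confirm what the page actually sees.
        print(driver.execute_script("return navigator.userAgent;"))
    finally:
        driver.quit()
```

Matching the user agent changes only one signal, so it may not be enough on its own.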
From a generic perspective there are no proven ways to hide the fact that you are using a Selenium driven web automation bot.
You can find a relevant detailed discussion in Can a website detect when you are using Selenium with chromedriver?
However, modifying the navigator.webdriver flag sometimes helps to prevent detection.
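One commonly cited way to change that flag in Chrome is through startup options rather than page scripts. The sketch below assumes Chrome with chromedriver; the helper name apply_automation_hiding is made up for illustration, and nothing here is guaranteed to defeat any particular site's detection.

```python
def apply_automation_hiding(options):
    """Apply Chrome settings that are often used to hide navigator.webdriver."""
    # Stops Blink from exposing the automation flag as navigator.webdriver = true.
    options.add_argument("--disable-blink-features=AutomationControlled")
    # Drops the "enable-automation" switch that chromedriver passes by default.
    options.add_experimental_option("excludeSwitches", ["enable-automation"])
    return options


def demo_with_chrome():
    # Needs Selenium and chromedriver installed.
    from selenium import webdriver
    from selenium.webdriver.chrome.options import Options

    driver = webdriver.Chrome(options=apply_automation_hiding(Options()))
    try:
        # Prints what page scripts would read from the flag.
        print(driver.execute_script("return navigator.webdriver;"))
    finally:
        driver.quit()
```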
References
You can find a couple of relevant detailed discussions in:
Is there a way to use Selenium WebDriver without informing the document that it is controlled by WebDriver?
Selenium Chrome gets detected
How does recaptcha 3 know I'm using selenium/chromedriver?
I want to be able to use pure selenium webdriver to open a zoom link in Chrome and then redirect me to the zoom.us application.
When I execute this:
from selenium import webdriver
def main():
    driver = webdriver.Chrome()
    driver.get("https://zoom.us/j/000-000-000")

main()
I receive a pop-up saying
https://zoom.us wants to open this application.
and I must press a button titled open zoom.us to open the app.
Is there a way to press this pop-up button through Selenium? Or is there some other way to open Zoom from chromedriver?
NOTE: I only want to use Selenium. I have been able to use pyautogui to click the button, but that is not what I am looking for.
Solution for Java:
driver.switchTo().alert().accept();
Solution for Python:
driver.switch_to.alert.accept()
There are a lot of duplicate questions about this issue; here is one of them. It is fairly certain that Selenium cannot do this on its own, since it only interacts with the Chrome page. I ran into this issue as well, and here is my workaround. It might look unprofessional, but fortunately it works.
The idea is to change a Chrome setting so that the popup is skipped and the application opens directly. However, the Chrome team removed this option in later versions for some reason, so we need to restore it manually. Also, every time Selenium starts, it opens a new Chrome window with no customized settings, much like an incognito window. We therefore need Selenium to open Chrome with our customized profile, so that the popup we configured to be skipped is actually skipped.
Type the following command in your terminal:
defaults write com.google.Chrome ExternalProtocolDialogShowAlwaysOpenCheckbox -bool true
This restores the option to skip the popup, which is the feature the Chrome team removed.
Restart Chrome, and open the Zoom (or other application) page so that the popup appears. If you did the first step correctly, you will see a checkbox next to "Open Zoom.us"; if you check it, Chrome will open this application without asking, that is, it will skip the popup for this application.
Now we need Selenium to open Chrome with our customized settings. To do this, type "chrome://version" in the address bar of your ordinary Chrome (not the automated window opened by Selenium). Find "Profile Path" and copy the path without its last component, "Default". For example:
/Users/MYNAME/Library/Application Support/Google/Chrome/Default
This is my profile path, but I copy everything except the last component, Default, so this is what I need:
/Users/MYNAME/Library/Application Support/Google/Chrome/
This is for Mac users. On Windows only the path is different (it starts with C:\ or similar); the steps are the same.
from selenium import webdriver
from selenium.webdriver.chrome.options import Options

option = Options()
# Point Chrome at the user data directory copied above
option.add_argument('--user-data-dir=THE PATH YOU JUST COPIED')
# executable_path is deprecated in Selenium 4 and removed in later releases;
# there, pass a Service object via service=... instead
driver = webdriver.Chrome(executable_path='YOUR PATH TO CHROMEDRIVER', options=option)
driver.get("https://google.com")  # Or anything else
We use "options" so that Selenium opens Chrome with our customized profile. You will see that Selenium opens a Chrome window with your account, profile, and settings, and it looks just like your ordinary Chrome window.
Run your code. But before that, remember to quit ALL Chrome sessions manually. On a Mac, make sure there is no dot under the Chrome icon, which confirms that Chrome is not running. THIS STEP IS CRITICAL; otherwise Selenium will open a Chrome window and simply hang there.
Those are all the steps. Again, this solution is very informal, and I personally don't consider it a real "solution" to the problem. I will try to find a better way in the future, but I posted this as an alternative because it might be helpful to somebody like me. Hope it works for you, and good luck.
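The path-trimming step described above can be sketched as a small helper. It assumes Chrome is pointed at a profile through its --user-data-dir switch, and the function name user_data_dir_argument is made up for illustration.

```python
from pathlib import PurePath


def user_data_dir_argument(profile_path):
    """Turn a "Profile Path" from chrome://version into a --user-data-dir switch."""
    # The profile path ends in the profile folder (e.g. "Default");
    # Chrome's user data directory is its parent.
    user_data_dir = PurePath(profile_path).parent
    return f"--user-data-dir={user_data_dir}"


# Example with the macOS path from the answer (MYNAME is a placeholder):
argument = user_data_dir_argument(
    "/Users/MYNAME/Library/Application Support/Google/Chrome/Default"
)
print(argument)
```

The resulting string is what would be passed to option.add_argument(...) before the driver is created.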
This is the site that I want to log in to: https://nid.naver.com/nidlogin.login
When I tried to log in to this site using Selenium WebDriver, it showed a CAPTCHA.
But when I typed the id/pw myself on the keyboard, the CAPTCHA didn't show up!
How can the Selenium driver be detected?
It depends on your driver. Chromedriver does set specific js variables when it starts the browser. I'm sure other driver vendors have something similar. So, in short, yes. There are different ways it can determine that you are running via webdriver.
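One signal a page can read is the standard navigator.webdriver property, which WebDriver-controlled browsers are expected to set to true. The sketch below checks it from the Python side; the helper name reports_webdriver is made up for illustration, and a site may well rely on other signals that this does not cover.

```python
def reports_webdriver(driver):
    """Return True if page scripts would see navigator.webdriver set to true."""
    return bool(driver.execute_script("return navigator.webdriver === true;"))


def demo_with_chrome():
    # Needs Selenium and chromedriver installed; the URL is a placeholder.
    from selenium import webdriver

    driver = webdriver.Chrome()
    try:
        driver.get("http://example.com/")
        print("page can see automation flag:", reports_webdriver(driver))
    finally:
        driver.quit()
```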
Is there a way to detect using python selenium webdriver if a website is restricting viewing or navigating their site until you accept an alert or pop up?
This won't be one URL; it will be many, implemented in various ways.
If not, is there a way to accept the pop-up or alert?
You could probably monitor window_handles. If you are on a page, can't find the element you want, and suspect that an alert or popup is blocking you, check whether the number of window_handles has increased, switch to the latest handle, and deal with it.
for handle in driver.window_handles:
    driver.switch_to.window(handle)
and you may end up using:
driver.switch_to.alert
which has a bunch of methods you can use.
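For the alert case, a minimal sketch might look like the following. It assumes that reading driver.switch_to.alert (or its text) raises NoAlertPresentException when nothing is open, which is how the Selenium Python bindings behave as far as I know. The helper name accept_alert_if_present is made up, and the import fallback exists only so the snippet can be read and exercised without Selenium installed.

```python
try:
    from selenium.common.exceptions import NoAlertPresentException
except ImportError:
    # Fallback so the sketch can be exercised without Selenium installed.
    class NoAlertPresentException(Exception):
        pass


def accept_alert_if_present(driver):
    """Accept an open JavaScript alert; return its text, or None if there was none."""
    try:
        alert = driver.switch_to.alert
        text = alert.text
    except NoAlertPresentException:
        return None
    alert.accept()
    return text
```

Calling this after each navigation returns None on pages without an alert, so it can be used across many sites without knowing in advance which ones show one. It does not cover HTML overlays such as cookie banners, which are ordinary page elements rather than alerts.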