Selenium log in through automation - python

Is selenium only for testing?
I created a script to log in to Canvas, a website that my university uses for class material.
However, it only logs in on the browser window generated by the driver, and I still have to log in manually on my regular browser.
Is there a way for me to make it so that I won't have to log in on the actual browser after running my script?
from selenium import webdriver
from selenium.webdriver.common.keys import Keys
from selenium.webdriver.common.by import By
import time
PATH = r"C:the path \msedgedriver.exe"
driver = webdriver.Edge(PATH)
driver.get("the website")
driver.maximize_window()
#sign in page
user = driver.find_element(By.ID,"username")
user.send_keys("username")
pw = driver.find_element(By.ID,"password")
pw.send_keys("password")
pw.send_keys(Keys.RETURN)
driver.implicitly_wait(3)
# authentication
driver.switch_to.frame(driver.find_element(By.XPATH,"//iframe[@id='duo_iframe']"))
remember_me = driver.find_element(By.XPATH, "//input[@type='checkbox']")
remember_me.click()
duo = driver.find_element(By.XPATH, '//button[text()="Send Me a Push "]')
duo.click()

As per Selenium's Homepage
Selenium automates browsers. That's it! What you do with that power is
entirely up to you.
Primarily it is for automating web applications for testing purposes,
but is certainly not limited to just that.
Boring web-based administration tasks can (and should) also be
automated as well.
No, you won't be able to reconnect to an already opened browsing context.
Even if you can extract the ChromeDriver and Chrome session attributes (e.g. session ID, cookies, user agent and other session attributes) from the already initiated ChromeDriver and Chrome browsing session, you still won't be able to change the set of attributes of that ChromeDriver.
A cleaner way would be to spawn a new ChromeDriver and Chrome browser instance with the desired configuration.

Related

Cannot Sign In Google using Selenium

I'm trying to sign in to Google using Selenium. Unfortunately, it stops me right after I enter the e-mail address. I've read online that this is a common issue, and I've tried everything I could find, such as:
use Firefox
disable 2-step verification
allow less secure apps
and others (I cannot remember because this problem has already taken me 6 hours with no solution, I'm going crazy).
HELP ME!
What can I do?
this is my code:
from selenium import webdriver
from selenium.webdriver.firefox.service import Service
from selenium.webdriver.common.by import By
from selenium.webdriver.common.keys import Keys
# set the firefox webdriver (THIS WORKS)
FIREFOX_DRIVER_PATH = "C:\\Development\\geckodriver.exe"
s = Service(executable_path=FIREFOX_DRIVER_PATH)
driver = webdriver.Firefox(service=s)
driver.get("*my_site*")
# google log-in (THIS WORKS)
GOOGLE_BUTTON_XPATH = '*my_google_button_xpath*'
google_button = driver.find_element(By.XPATH, GOOGLE_BUTTON_XPATH)
google_button.click()
# email input (THIS WORKS)
EMAIL_INPUT_XPATH = '*my_email_xpath*'
email_input = driver.find_element(By.XPATH, EMAIL_INPUT_XPATH)
email_input.send_keys("*sample_mail@gmail.com*")
email_input.send_keys(Keys.ENTER)
# XXX--------AND NOW I'M STUCK:--------XXX
# COULDN'T SIGN YOU IN
# this browser or app may not be secure
(I'VE ALSO TRIED WITH CHROME and CHROME DRIVER, even though in the code above I use Firefox)

Running Python 3.7 and Selenium: How do I reply to the Firefox Password Manager Popup when running a script?

I am grinding through day 2 of me learning Python 3.7 with Selenium.
I am accessing a web page using WebDriver. I have been making progress, but am stymied now. Though I can easily disable the Firefox password manager popup in my normal browser (Options/Privacy and Security/Location/Settings), the browser that my script drives does not pick up that configuration, and the Firefox popup still shows up.
The script can ignore the popup and navigate the target site until the very last page that I need to access. At that point, the HTML for that page is inaccessible, until I manually click on the Firefox popup, dismissing it. As soon as I do that, the HTML code for that web page lights up in Firefox Web Developer Inspector.
Now, that HTML code may be inaccessible for other reasons (like I said, day 2 of the learning curve), but is there some library or command within WebDriver that allows me to automate the dismissal of that Firefox popup? It is not part of the HTML of any page, so I am at a loss.
Edit: I should also mention that the bulk of that last page's content is blank until I manually dismiss the Firefox popup.
I have added the following code, but still am getting the same popup:
from selenium import webdriver
#Using Firefox to access the Web
options = webdriver.FirefoxOptions()
options.set_preference("dom.webnotifications.enabled", False)
driver = webdriver.Firefox(options=options)
driver.maximize_window()
Second Edit: This is the current code section defining the profile, and I am still getting the pop up password manager.
import datetime
import time
from selenium import webdriver
from selenium.webdriver.common.action_chains import ActionChains
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.common.by import By
#Using Firefox to access the Web
profile = webdriver.FirefoxProfile()
#profile.set_preference("dom.push.enabled", False)
profile.set_preference("dom.webnotifications.enabled", False)
profile.update_preferences()
driver = webdriver.Firefox(firefox_profile=profile)
driver.maximize_window()
In about:config, set the preference security.insecure_field_warning.contextual.enabled to false.
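A minimal sketch of setting preferences like this from the script instead of through about:config. The `signon.rememberSignons` entry is my own addition, on the assumption that the popup is Firefox's save-password prompt, which that preference controls. `PASSWORD_PROMPT_PREFS` and `quiet_firefox_options` are hypothetical names used only for illustration.

```python
# Preferences to apply to the browser that Selenium starts.
# "signon.rememberSignons" is an assumption about which popup is shown;
# the second entry is the preference named in the text.
PASSWORD_PROMPT_PREFS = {
    "signon.rememberSignons": False,
    "security.insecure_field_warning.contextual.enabled": False,
}


def quiet_firefox_options(prefs=PASSWORD_PROMPT_PREFS):
    """Return FirefoxOptions with the given preferences applied."""
    # Imported lazily so this module can be loaded without Selenium installed.
    from selenium import webdriver

    options = webdriver.FirefoxOptions()
    for name, value in prefs.items():
        options.set_preference(name, value)
    return options
```

The driver would then be created with `webdriver.Firefox(options=quiet_firefox_options())`. Because the script's browser starts from a fresh profile, settings changed in your everyday Firefox are not carried over, which is why they have to be passed in like this.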

Unable to load the webpage https://www.riachuelo.com.br/feminino/colecao-feminino using Selenium and Python

I've been trying to scrape this page (https://www.riachuelo.com.br/feminino/colecao-feminino) with Selenium, but I can't manage to access the HTML because it never loads. I've tried using random user agents and other browsers, but the problem persists. Any idea why this is happening?
Here is the code:
from selenium import webdriver
from selenium.webdriver.chrome.options import Options
from fake_useragent import UserAgent
URL = "https://www.riachuelo.com.br/feminino/colecao-feminino"
options = Options()
ua = UserAgent()
userAgent = ua.random
options.add_argument(f'user-agent={userAgent}')
driver = webdriver.Chrome(chrome_options=options,executable_path=r"C:\Program Files (x86)\chromedriver.exe")
driver.get(URL)
I ran your use case, loading the webpage at https://www.riachuelo.com.br/feminino/colecao-feminino with Selenium, as follows:
from selenium import webdriver
options = webdriver.ChromeOptions()
options.add_argument("start-maximized")
options.add_experimental_option("excludeSwitches", ["enable-automation"])
options.add_experimental_option('useAutomationExtension', False)
driver = webdriver.Chrome(options=options, executable_path=r'C:\WebDrivers\chromedriver.exe')
driver.get('https://www.riachuelo.com.br/feminino/colecao-feminino')
As you observed, I hit the same roadblock: the webpage never loads.
Analysis
While inspecting the DOM tree of the webpage, you will find that some of the <iframe> and <script> tags refer to the keyword dist. As an example:
src="https://dtbot.directtalk.com.br/1.0/staticbot/dist/js/../index.html#!/?token=c243ce95-db6c-4ab6-9f2b-bf60d69c2d3d&widget=true&top=40&text=Alguma%20d%C3%BAvida%3F&textcolor=ffffff&bgcolor=4E1D3A&from=bottomRigth"
<script id="dtbot-script" src="https://dtbot.directtalk.com.br/1.0/staticbot/dist/js/dtbot.js?token=c243ce95-db6c-4ab6-9f2b-bf60d69c2d3d&widget=true&top=40&text=Alguma%20d%C3%BAvida%3F&textcolor=ffffff&bgcolor=4E1D3A&from=bottomRigth"></script>
This is a clear indication that the website is protected by the bot management service provider Distil Networks, and that navigation by ChromeDriver is detected and subsequently blocked.
Distil
As per the article There Really Is Something About Distil.it...:
Distil protects sites against automatic content scraping bots by observing site behavior and identifying patterns peculiar to scrapers. When Distil identifies a malicious bot on one site, it creates a blacklisted behavioral profile that is deployed to all its customers. Something like a bot firewall, Distil detects patterns and reacts.
Further,
"One pattern with Selenium was automating the theft of Web content", Distil CEO Rami Essaid said in an interview last week. "Even though they can create new bots, we figured out a way to identify Selenium the tool they're using, so we're blocking Selenium no matter how many times they iterate on that bot. We're doing that now with Python and a lot of different technologies. Once we see a pattern emerge from one type of bot, then we work to reverse engineer the technology they use and identify it as malicious".
Reference
You can find a couple of detailed discussions in:
Is there a way to use Selenium WebDriver without informing the document that it is controlled by WebDriver?
Selenium webdriver: Modifying navigator.webdriver flag to prevent selenium detection
Akamai Bot Manager detects WebDriver driven Chrome Browsing Context
Is there a version of selenium webdriver that is not detectable?

logging into a website and downloading a file

I would like to log into a website and download a file. I'm using selenium and the chromedriver. Would like to know if there is a better way. It currently opens up a chrome browser window and sends the info. I don't want to see the browser window opened up and the data being sent. Just want to send it and return the data into a variable.
from selenium import webdriver
from selenium.webdriver.common.by import By
driver = webdriver.Chrome()
def site_login(URL,ID_username,ID_password,ID_submit,name,pas):
    # open the login page, fill in the credentials and submit the form
    driver.get(URL)
    driver.find_element(By.ID, ID_username).send_keys(name)
    driver.find_element(By.ID, ID_password).send_keys(pas)
    driver.find_element(By.ID, ID_submit).click()
URL = "www.mywebsite.com/login"
ID_username = "name"
ID_password = "password"
ID_submit = "submit"
name = "myemail@mail.com"
pas = "mypassword"
resp=site_login(URL,ID_username,ID_password,ID_submit,name,pas)
You can run Chrome in headless mode. In that case the Chrome UI won't show up, but the task is still performed. I found an article on this: https://intoli.com/blog/running-selenium-with-headless-chrome/. Hope this helps.
First option: if you are able to change the driver, you can use PhantomJS, which is a headless browser that can be used with Selenium.
Second option: if the site is not dynamic (i.e. not an SPA), or you are able to trace the requests it makes (which can be done in Chrome DevTools), you can use requests directly, with the help of BeautifulSoup if you need to extract data from the page.
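The second option could look roughly like the sketch below. It assumes the login form is a plain HTTP POST and that the site keeps you logged in through cookies. `login_and_fetch` is an illustrative helper, not part of any library, and the `session` argument is expected to behave like `requests.Session`.

```python
def login_and_fetch(session, login_url, file_url, form_fields):
    """Log in with a form POST, then download a file using the same session.

    `session` is expected to behave like requests.Session: cookies set by the
    login response are sent automatically with the follow-up GET.
    """
    login_response = session.post(login_url, data=form_fields)
    login_response.raise_for_status()  # raises on an HTTP error status

    file_response = session.get(file_url)
    file_response.raise_for_status()
    return file_response.content  # raw bytes of the downloaded file
```

With the requests package installed, you would pass `requests.Session()` as the first argument, together with the real login URL, file URL and a dict mapping the form's field names to your credentials. Note that a rejected login can still come back with status 200, so you may want to check the returned content as well.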
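Just add these lines and pass the options to the driver:
from selenium.webdriver.chrome.options import Options
chrome_options = Options()
chrome_options.add_argument("--headless")
driver = webdriver.Chrome(options=chrome_options)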
This should make chrome run in the background.
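Putting the headless suggestion together with the original goal of getting the data into a variable, a sketch could look like this. It assumes chromedriver can be found by Selenium; `fetch_page_source` and `HEADLESS_ARGS` are hypothetical names chosen for the example.

```python
# Command-line switches passed to Chrome; "--headless" hides the window.
HEADLESS_ARGS = ["--headless"]


def fetch_page_source(url, extra_args=()):
    """Open `url` in headless Chrome and return the rendered HTML as a string."""
    # Imported lazily so this module can be loaded without Selenium installed.
    from selenium import webdriver
    from selenium.webdriver.chrome.options import Options

    options = Options()
    for arg in [*HEADLESS_ARGS, *extra_args]:
        options.add_argument(arg)

    driver = webdriver.Chrome(options=options)
    try:
        driver.get(url)
        return driver.page_source
    finally:
        # Always close the browser, even if loading the page fails.
        driver.quit()
```

Calling `html = fetch_page_source(some_url)` then leaves the page's HTML in `html` without a browser window appearing. The login steps from the question would go between `driver.get(url)` and the `return`.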

Get the href generated from some javascript attached to a button with selenium?

I'm using selenium to pull some automated phone reporting from our phone system (Barracuda Cudatel, very nice small business system but it doesn't have an API for what I need). There is a button on the report page that has some javascript attached to it on a click listener that then tells the browser to download the file.
Obviously selenium isn't really designed to pull files like this, however all I'm trying to do is get the href of the url that would have been sent to the browser. I can then turn around and use the session credentials with requests to pull the file and do processing on it.
How do I do the following (In Python):
Query for the event listener for 'click'
Fire off the javascript
Get the resulting URL
Edit: I'm aware download location can be configured on the browser in selenium however I'm not interested in completing this task in that fashion. This is running against a selenium grid of 20 machines and the request could be routed to any of them. Since I can't pull the file through selenium I'm going to just pull it directly with requests.
Code I'm twiddling with is below.
from selenium import webdriver
from selenium.webdriver.common.keys import Keys
from selenium.webdriver.support.ui import Select
from time import sleep
dcap = webdriver.DesiredCapabilities.CHROME
driver = webdriver.Remote(command_executor='http://gridurl:4444/wd/hub', desired_capabilities=dcap)
driver.get("http://cudatelurl")
driver.find_element_by_name("__auth_user").send_keys("user")
driver.find_element_by_name("__auth_pass").send_keys("password")
driver.find_element_by_id("manage").click()
driver.get("http://cudatelurl/#!/cudatel/cdrs")
sleep(5)
date_dropdown = Select(driver.find_element_by_xpath('//*[@id="cui-content-inner"]/div[3]/div/div/div/div[2]/div/div/div[1]/div[2]/div/select'))
date_dropdown.select_by_value("last_week")
# This is the element that has javascript attached to it the click register is
# button.widgetType.csvButtonWidget.widgetized.actionButtonType.table-action
# but I'd prefer to not hard code it
driver.find_element_by_xpath('//*[@id="cui-content-inner"]/div[3]/div/div/div/div[2]/div/div/div[1]/div[2]/button[1]')
print(driver.get_cookies())
print(driver.title)
sleep(10)
driver.close()
driver.quit()
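The hand-off described in the question, from the Selenium session to requests, might be sketched as follows. `driver.get_cookies()` returns a list of dicts that include `name` and `value` keys. `cookies_for_requests` is a hypothetical helper name, and whether cookies alone are enough to authenticate depends on the site.

```python
def cookies_for_requests(selenium_cookies):
    """Convert Selenium's cookie list into a {name: value} dict.

    The result can be passed as the `cookies=` argument of requests.get().
    Domain, path and expiry information is dropped, which is usually fine
    when every request goes to the same host.
    """
    return {cookie["name"]: cookie["value"] for cookie in selenium_cookies}


# Shape of the data returned by driver.get_cookies() (values are made up).
sample = [
    {"name": "session_id", "value": "abc123", "domain": "example.com", "path": "/"},
    {"name": "lang", "value": "en", "domain": "example.com", "path": "/"},
]
print(cookies_for_requests(sample))
```

In the script above, the conversion has to happen before `driver.quit()`. The resulting dict is then passed along with the download URL, for example `requests.get(download_url, cookies=cookie_dict)`, where `download_url` stands for the URL you managed to recover.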
You can still approach it with selenium by configuring the target directory for the file of a specific mime-type to be automatically downloaded (without the Save As dialog), see:
Download PDF files automatically in Firefox using Selenium WebDriver
Access to file download dialog in Firefox
Firefox + Selenium WebDriver and download a csv file automatically
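As a rough sketch of what those answers describe: the three Firefox preferences below select a custom download folder and skip the Save As dialog for one MIME type. The `text/csv` MIME type is an assumption about what the server sends, and `csv_download_prefs` and `firefox_with_auto_download` are hypothetical helper names.

```python
def csv_download_prefs(download_dir, mime_type="text/csv"):
    """Firefox preferences that save files of `mime_type` without a dialog."""
    return {
        "browser.download.folderList": 2,  # 2 = use the custom directory below
        "browser.download.dir": download_dir,
        "browser.helperApps.neverAsk.saveToDisk": mime_type,
    }


def firefox_with_auto_download(download_dir):
    """Start Firefox configured to download CSV files into `download_dir`."""
    # Imported lazily so this module can be loaded without Selenium installed.
    from selenium import webdriver

    options = webdriver.FirefoxOptions()
    for name, value in csv_download_prefs(download_dir).items():
        options.set_preference(name, value)
    return webdriver.Firefox(options=options)
```

On a grid, the file ends up on whichever node ran the browser, so this fits best when the download directory is a shared location that every node can write to.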
