Downloading multiple files using Selenium click()? - python

Using Firefox/Python/Selenium-- I am able to use click() on a file link on a webpage to download it, and the file downloads to my Downloads folder as expected.
However, when I add more lines to click() on more than 1 link, the script no longer runs as expected. Instead of the files being downloaded, they are all opening in separate browser windows, which all close after the script completes.
Is this by design or is there a way around it or a better way to download multiple files on a webpage?
This is the website in question: https://www.treasury.gov/about/organizational-structure/ig/Pages/igdeskbook.aspx
I am trying to download the links to the Introduction and all parts of Volumes 1-4.
I have a dictionary of the locators:
IgDeskbookPageMap = dict(
    IgDeskbookBannerXpath="//div[contains(text(), 'The Inspector General Deskbook')]",
    IgDeskbookIntroId="anch_202",
    IgDeskbookVol1Part1Id="anch_203",
    IgDeskbookVol1Part2Id="anch_204",
    IgDeskbookVol1Part3Id="anch_205",
    IgDeskbookVol1Part4Id="anch_206",
    IgDeskbookVol2Id="anch_207",
    IgDeskbookVol3Id="anch_208",
    IgDeskbookVol4Part1Id="anch_209",
    IgDeskbookVol4Part2Id="anch_210",
    IgDeskbookVol4Part3Id="anch_211",
)
This is the method:
def click(self, waitTime, locatorMode, Locator):
    self.wait_until_element_clickable(waitTime, locatorMode, Locator).click()
These are the click() calls (there are more than 3, but I'm truncating here for space):
self.click(10,
"id",
IgDeskbookPageMap['IgDeskbookIntroId']
)
self.click(10,
"id",
IgDeskbookPageMap['IgDeskbookVol1Part1Id']
)
self.click(10,
"id",
IgDeskbookPageMap['IgDeskbookVol1Part2Id']
)

I added the following code for launching Firefox and now the download behavior works as expected when clicking on each file:
profile = webdriver.FirefoxProfile()
profile.set_preference('browser.download.folderList', 2)
profile.set_preference('browser.download.manager.showWhenStarting', False)
profile.set_preference('browser.helperApps.alwaysAsk.force', False)
profile.set_preference('browser.helperApps.neverAsk.saveToDisk', 'application/pdf,application/x-pdf')
profile.set_preference("plugin.disable_full_page_plugin_for_types", "application/pdf")
profile.set_preference("pdfjs.disabled", True)
self.driver = webdriver.Firefox(profile)

If the files open in separate tabs, one way to download them is to follow these steps, in whatever language you are using:
for each such link:
    click() the PDF link
    findElement the download element
    click() the download link
    close the tab
    switch back to the previous tab  // ideally done together with the previous step
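The steps above could be sketched in Python roughly as follows. This is only an illustration: `download_from_tabs`, the `timeout` parameter, and `download_locator` (a `(By, value)` pair for whatever download control the viewer tab exposes) are assumptions, not something the question or the page defines. It relies only on the standard `window_handles`, `switch_to.window`, `find_element`, and `close` driver calls.

```python
import time


def download_from_tabs(driver, links, download_locator, timeout=10.0):
    """Click each link, then use the download control in the tab it opens."""
    main_tab = driver.current_window_handle
    for link in links:
        known = set(driver.window_handles)
        link.click()  # expected to open the file in a new tab
        # Wait until the new tab's handle shows up.
        deadline = time.monotonic() + timeout
        while True:
            new_tabs = set(driver.window_handles) - known
            if new_tabs:
                break
            if time.monotonic() > deadline:
                raise TimeoutError("no new tab opened after clicking the link")
            time.sleep(0.2)
        driver.switch_to.window(new_tabs.pop())
        # download_locator is hypothetical, e.g. (By.ID, "download").
        driver.find_element(*download_locator).click()
        driver.close()  # closes only the current (new) tab
        driver.switch_to.window(main_tab)
```

Capturing the handle of the original tab before the loop makes the final "switch back" step unambiguous even if the handle order changes.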

Related

Python Selenium is not downloading data when I click link

I wrote a script that reaches the download link through a series of clicks: first on the settings gear icon, then on the "Export data" tab, and finally on the "Click here to download your data file." link.
However, when I click the final link, the data is not downloaded to my specified default directory.
Ideally I would like to load the data directly into a variable, but I couldn't even figure out why the regular download wasn't working.
I have tried getting the href from the download link and opening that URL in a new tab, but it still gives me nothing.
import time
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.support.ui import WebDriverWait
URL = 'https://edap.epa.gov/public/single/?appid=73b2b6a5-70c6-4820-b3fa-186ac094f10d&sheet=1e76b65b-dd6c-41fd-9143-ba44874e1f9d'
DELAY = 10
def init_driver(url):
    options = webdriver.chrome.options.Options()
    path = '/Users/X/Applications/chromedriver'
    options.add_argument("--headless")
    options.add_argument("download.default_directory=Users/X/Python/data_scraper/epa_data")
    driver = webdriver.Chrome(chrome_options= options, executable_path=path)
    driver.implicitly_wait(20)
    driver.get(url)
    return driver
def find_settings(web_driver):
    #find the settings gear
    #time.sleep(10)
    try:
        driver_wait = WebDriverWait(web_driver,10)
        ng_scope = driver_wait.until(EC.visibility_of_element_located((By.CLASS_NAME,"ng-scope")))
        settings = web_driver.find_element_by_css_selector("span.cl-icon.cl-icon--cogwheel.cl-icon-right-align")
        print(settings)
        settings.click()
        #export_data = web_driver.find_elements_by_css_selector("span.lui-list__text.ng-binding")
        #print(web_driver.page_source)
    except Exception as e:
        print(e)
        print(web_driver.page_source)
def get_settings_list(web_driver):
    #find the export button and download data
    menu_item_list = {}
    find_settings(web_driver)
    #print(web_driver.page_source)
    try:
        time.sleep(8)
        print("got menu_items")
        menu_items = web_driver.find_elements_by_css_selector("span.lui-list__text.ng-binding")
        for i in menu_items:
            print(i.text)
            menu_item_list[i.text] = i
    except Exception as e:
        print(e)
    return menu_item_list
def get_export_data(web_driver):
    menu_items = get_settings_list(web_driver)
    print(menu_items)
    export_data = menu_items['Export data']
    export_data.click()
    web_driver.execute_script("window.open();")
    print(driver.window_handles)
    main_window = driver.window_handles[0]
    temp_window = driver.window_handles[1]
    driver.switch_to_window(main_window)
    time.sleep(8)
    download_data = driver.find_element_by_xpath("//a[contains(text(), 'Click here to download your data file.')]")
    download_href = download_data.get_attribute('href')
    print(download_href)
    download_data.click()
    driver.switch_to_window(temp_window)
    driver.get("https://edap.epa.gov"+download_href)
    print(driver.page_source)
driver = init_driver(URL)
#get_settings_list(driver)
get_export_data(driver)
I would like this code to emulate the manual steps of clicking the settings gear icon, then "Export data", then the download link, which downloads the data as a CSV. (Ideally I want to skip the file and load the data into a pandas DataFrame, but that's an issue for another time.)
For security reasons, Chrome will not allow downloads while running headless. Here's a link to some more information and a possible workaround.
Unless you need to use Chrome, Firefox will allow downloads while headless - albeit with some tweaking.
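As a rough illustration of one commonly cited workaround (not necessarily the one the answer linked to), the sketch below sets the download directory through Chrome's "prefs" experimental option rather than `add_argument`, and then explicitly allows downloads with the DevTools command `Page.setDownloadBehavior`. The helper names (`chrome_download_prefs`, `cdp_download_params`, `make_headless_chrome`) are made up for this example, and it assumes a Selenium version that provides `execute_cdp_cmd` on the Chrome driver.

```python
import os


def chrome_download_prefs(download_dir):
    """Chrome preferences that send downloads to download_dir without prompting."""
    return {
        "download.default_directory": os.path.abspath(download_dir),
        "download.prompt_for_download": False,
    }


def cdp_download_params(download_dir):
    """Parameters for the DevTools command Page.setDownloadBehavior."""
    return {"behavior": "allow", "downloadPath": os.path.abspath(download_dir)}


def make_headless_chrome(download_dir):
    """Start headless Chrome set up for downloads (needs selenium and chromedriver)."""
    # Imported lazily so the two helpers above can be used without selenium installed.
    from selenium import webdriver

    options = webdriver.ChromeOptions()
    options.add_argument("--headless")
    # Preferences are passed via the "prefs" experimental option, not add_argument().
    options.add_experimental_option("prefs", chrome_download_prefs(download_dir))
    driver = webdriver.Chrome(options=options)
    # Explicitly allow downloads for this session over the DevTools protocol.
    driver.execute_cdp_cmd("Page.setDownloadBehavior", cdp_download_params(download_dir))
    return driver
```

Whether the DevTools call is needed depends on the Chrome version in use, so treat it as something to try rather than a guaranteed fix.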

Save complete web page (incl css, images) using python/selenium

I am using Python/Selenium to submit genetic sequences to an online database, and want to save the full page of results I get back. Below is the code that gets me to the results I want:
import time
from selenium import webdriver
URL = 'https://blast.ncbi.nlm.nih.gov/Blast.cgi?PROGRAM=blastx&PAGE_TYPE=BlastSearch&LINK_LOC=blasthome'
SEQUENCE = 'CCTAAACTATAGAAGGACAGCTCAAACACAAAGTTACCTAAACTATAGAAGGACAGCTCAAACACAAAGTTACCTAAACTATAGAAGGACAGCTCAAACACAAAGTTACCTAAACTATAGAAGGACAGCTCAAACACAAAGTTACCTAAACTATAGAAGGACA' #'GAGAAGAGAAGAGAAGAGAAGAGAAGAGAAGAGAAGAGAAGAGAAGAGAAGAGAAGAGAAGAGAAGAGAAGAGAAGAGAAGAGAAGAGAAGAGAAGAGAAGAGAAGAGAAGAGAAGAGAAGAGAAGAGAAGA'
CHROME_WEBDRIVER_LOCATION = '/home/max/Downloads/chromedriver' # update this for your machine
# open page with selenium
# (first need to download Chrome webdriver, or a firefox webdriver, etc)
driver = webdriver.Chrome(executable_path=CHROME_WEBDRIVER_LOCATION)
driver.get(URL)
time.sleep(5)
# enter sequence into the query field and hit 'blast' button to search
seq_query_field = driver.find_element_by_id("seq")
seq_query_field.send_keys(SEQUENCE)
blast_button = driver.find_element_by_id("b1")
blast_button.click()
time.sleep(60)
At that point I have a page that I can manually click "save as," and get a local file (with a corresponding folder of image/js assets) that lets me view the whole returned page locally (minus content which is generated dynamically from scrolling down the page, which is fine). I assumed there would be a simple way to mimic this 'save as' function in python/selenium but haven't found one. The code to save the page below just saves html, and does not leave me with a local file that looks like it does in the web browser, with images, etc.
content = driver.page_source
with open('webpage.html', 'w') as f:
    f.write(content)
I've also found this question/answer on SO, but the accepted answer just brings up the 'save as' box, and does not provide a way to click it (as two commenters point out)
Is there a simple way to 'save [full page] as' using python? Ideally I'd prefer an answer using selenium since selenium makes the crawling part so straightforward, but I'm open to using another library if there's a better tool for this job. Or maybe I just need to specify all of the images/tables I want to download in code, and there is no shortcut to emulating the right-click 'save as' functionality?
UPDATE - Follow up question for James' answer
So I ran James' code to generate a page.html (and associated files) and compared it to the html file I got from manually clicking save-as. The page.html saved via James' script is great and has everything I need, but when opened in a browser it also shows a lot of extra formatting text that's hidden in the manually save'd page. See attached screenshot (manually saved page on the left, script-saved page with extra formatting text shown on right).
This is especially surprising to me because the raw html of the page saved by James' script seems to indicate those fields should still be hidden. See e.g. the html below, which appears the same in both files, but the text at issue only appears in the browser-rendered page on the one saved by James' script:
<p class="helpbox ui-ncbitoggler-slave ui-ncbitoggler" id="hlp1" aria-hidden="true">
These options control formatting of alignments in results pages. The
default is HTML, but other formats (including plain text) are available.
PSSM and PssmWithParameters are representations of Position Specific Scoring Matrices and are only available for PSI-BLAST.
The Advanced view option allows the database descriptions to be sorted by various indices in a table.
</p>
Any idea why this is happening?
As you noted, Selenium cannot interact with the browser's context menu to use Save as..., so you could instead use an external automation library like pyautogui.
pyautogui.hotkey('ctrl', 's')
time.sleep(1)
pyautogui.typewrite(SEQUENCE + '.html')
pyautogui.hotkey('enter')
This code opens the Save as... window through its keyboard shortcut CTRL+S and then saves the webpage and its assets into the default downloads location by pressing enter. This code also names the file as the sequence in order to give it a unique name, though you could change this for your use case. If needed, you could additionally change the download location through some extra work with the tab and arrow keys.
Tested on Ubuntu 18.10; depending on your OS you may need to modify the key combination sent.
Full code, in which I also added conditional waits to improve speed:
import time
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.expected_conditions import visibility_of_element_located
from selenium.webdriver.support.ui import WebDriverWait
import pyautogui
URL = 'https://blast.ncbi.nlm.nih.gov/Blast.cgi?PROGRAM=blastx&PAGE_TYPE=BlastSearch&LINK_LOC=blasthome'
SEQUENCE = 'CCTAAACTATAGAAGGACAGCTCAAACACAAAGTTACCTAAACTATAGAAGGACAGCTCAAACACAAAGTTACCTAAACTATAGAAGGACAGCTCAAACACAAAGTTACCTAAACTATAGAAGGACAGCTCAAACACAAAGTTACCTAAACTATAGAAGGACA' #'GAGAAGAGAAGAGAAGAGAAGAGAAGAGAAGAGAAGAGAAGAGAAGAGAAGAGAAGAGAAGAGAAGAGAAGAGAAGAGAAGAGAAGAGAAGAGAAGAGAAGAGAAGAGAAGAGAAGAGAAGAGAAGAGAAGA'
# open page with selenium
# (first need to download Chrome webdriver, or a firefox webdriver, etc)
driver = webdriver.Chrome()
driver.get(URL)
# enter sequence into the query field and hit 'blast' button to search
seq_query_field = driver.find_element_by_id("seq")
seq_query_field.send_keys(SEQUENCE)
blast_button = driver.find_element_by_id("b1")
blast_button.click()
# wait until results are loaded
WebDriverWait(driver, 60).until(visibility_of_element_located((By.ID, 'grView')))
# open 'Save as...' to save html and assets
pyautogui.hotkey('ctrl', 's')
time.sleep(1)
pyautogui.typewrite(SEQUENCE + '.html')
pyautogui.hotkey('enter')
This is not a perfect solution, but it will get you most of what you need. You can replicate the behavior of "save as full web page (complete)" by parsing the html and downloading any loaded files (images, css, js, etc.) to their same relative path.
Most of the javascript won't work due to cross origin request blocking. But the content will look (mostly) the same.
This uses requests to save the loaded files, lxml to parse the html, and os for the path legwork.
from selenium import webdriver
import chromedriver_binary
from lxml import html
import requests
import os
driver = webdriver.Chrome()
URL = 'https://blast.ncbi.nlm.nih.gov/Blast.cgi?PROGRAM=blastx&PAGE_TYPE=BlastSearch&LINK_LOC=blasthome'
SEQUENCE = 'CCTAAACTATAGAAGGACAGCTCAAACACAAAGTTACCTAAACTATAGAAGGACAGCTCAAACACAAAGTTACCTAAACTATAGAAGGACAGCTCAAACACAAAGTTACCTAAACTATAGAAGGACAGCTCAAACACAAAGTTACCTAAACTATAGAAGGACA'
base = 'https://blast.ncbi.nlm.nih.gov/'
driver.get(URL)
seq_query_field = driver.find_element_by_id("seq")
seq_query_field.send_keys(SEQUENCE)
blast_button = driver.find_element_by_id("b1")
blast_button.click()
content = driver.page_source
# write the page content
os.mkdir('page')
with open('page/page.html', 'w') as fp:
    fp.write(content)
# download the referenced files to the same path as in the html
sess = requests.Session()
sess.get(base) # sets cookies
# parse html
h = html.fromstring(content)
# get css/js files loaded in the head. skip anything loaded from outside sources
for hr in h.xpath('head//@href'):
    if hr.startswith('http'):
        continue
    local_path = 'page/' + hr
    hr = base + hr
    res = sess.get(hr)
    if not os.path.exists(os.path.dirname(local_path)):
        os.makedirs(os.path.dirname(local_path))
    with open(local_path, 'wb') as fp:
        fp.write(res.content)
# get image/js files from the body. skip anything loaded from outside sources
for src in h.xpath('//@src'):
    if not src or src.startswith('http'):
        continue
    local_path = 'page/' + src
    print(local_path)
    src = base + src
    res = sess.get(src)
    if not os.path.exists(os.path.dirname(local_path)):
        os.makedirs(os.path.dirname(local_path))
    with open(local_path, 'wb') as fp:
        fp.write(res.content)
You should have a folder called page with a file called page.html in it with the content you are after.
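The script builds each asset URL by concatenating `base` with the `href`/`src` value, which gets awkward with leading slashes or protocol-relative links. A possible alternative, shown here only as a sketch, is to let the standard library's `urljoin` resolve the reference against the page URL. The function name `resolve_asset` and the `out_dir` parameter are illustrative, not part of the answer's code.

```python
import posixpath
from urllib.parse import urljoin, urlparse


def resolve_asset(page_url, ref, out_dir="page"):
    """Return (absolute_url, local_path) for an href/src value, or None if off-site."""
    absolute = urljoin(page_url, ref)
    if urlparse(absolute).netloc != urlparse(page_url).netloc:
        return None  # hosted elsewhere; skipped, as in the script above
    relative = urlparse(absolute).path.lstrip("/")
    return absolute, posixpath.join(out_dir, relative)


page = "https://blast.ncbi.nlm.nih.gov/Blast.cgi?PROGRAM=blastx"
print(resolve_asset(page, "css/main.css"))        # hypothetical asset path
print(resolve_asset(page, "/images/logo.png"))    # hypothetical asset path
print(resolve_asset(page, "//cdn.example.com/x.js"))
```

The first two calls map to files under `page/`, while the protocol-relative CDN reference returns `None` and would be skipped.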
Inspired by FThompson's answer above, I came up with the following tool that can download full/complete html for a given page url (see: https://github.com/markfront/SinglePageFullHtml)
UPDATE - follow up with Max's suggestion, below are steps to use the tool:
Clone the project, then run maven to build:
$> git clone https://github.com/markfront/SinglePageFullHtml.git
$> cd ~/git/SinglePageFullHtml
$> mvn clean compile package
Find the generated jar file in target folder: SinglePageFullHtml-1.0-SNAPSHOT-jar-with-dependencies.jar
Run the jar in command line like:
$> java -jar ./target/SinglePageFullHtml-1.0-SNAPSHOT-jar-with-dependencies.jar <page_url>
The result file name will have the prefix "FP", followed by the hash code of the page URL, with the file extension ".html". It will be written to the temp folder, typically "/tmp" (the value of System.getProperty("java.io.tmpdir") in Java). If it is not there, look in your home directory (System.getProperty("user.home")).
The result file will be a big fat self-contained html file that includes everything (css, javascript, images, etc.) referred to by the original html source.
I'd suggest trying SikuliX, an image-based automation tool that can operate any widget on a desktop OS. It supports Python syntax, can be run from the command line, and may be the simplest way to solve your problem.
All you need to do is give it a screenshot and call the SikuliX script from your Python automation script (with os.system("xxxx") or subprocess...).

Cannot use chrome cast button with selenium

I am trying to use Selenium to cast youtube videos to my chromecast. When I open youtube in chrome normally I see the cast button and it works fine. When I open it with Selenium the cast button is missing, and when I select Cast from menu it gives me the error "No Cast destinations found. Need help?"
I am using python, and have tried lots of combinations of flags with webdriver. Here is what I have
options = webdriver.ChromeOptions()
options.add_argument('--user-data-dir=./ChromeProfile')
options.add_argument('--disable-session-crashed-bubble')
options.add_argument('--disable-save-password-bubble')
options.add_argument('--disable-permissions-bubbles')
options.add_argument('--bwsi')
options.add_argument('--load-media-router-component-extension')
options.add_argument('--enable-video-player-chromecast-support');
excludeList = ['disable-component-update',
'ignore-certificate-errors',
]
options.add_experimental_option('excludeSwitches', excludeList)
chromedriverPath = '/my/path/to/chromedriver'
driver = webdriver.Chrome(chromedriverPath, chrome_options=options)
path = 'https://www.youtube.com/watch?v=Bz9Lza059NU'
driver.get(path);
time.sleep(60) # Let the user actually see something!
driver.quit()
I figured out how to get it working. It seemed to require two steps. Copying my default profile over to somewhere selenium could use it, and figuring out the correct flags to use when opening chrome. The key being selenium automatically added a bunch of flags that I didn't want, so I had to exclude one.
First to find out where my profile is stored, I opened up chrome to this url chrome://version/.
This gave me lots of information, but the important ones were
Command Line: /usr/lib/chromium-browser/chromium-browser --enable-pinch --flag-switches-begin --flag-switches-end
Profile Path: /home/mdorrell/.config/chromium/Default
First I copied my profile to some directory that Selenium could use
cp -R /home/mdorrell/.config/chromium/Default/* /home/mdorrell/ChromeProfile
Then I opened this same page in the browser opened by selenium and got the list of flags that selenium added from the Command Line row. The one that ended up giving me the problems was --disable-default-apps
In the end the code that I needed to add ended up looking like this
options = webdriver.ChromeOptions()
# Set the user data directory
options.add_argument('--user-data-dir=/home/mdorrell/ChromeProfile')
# get list of flags selenium adds that we want to exclude
excludeList = [
'disable-default-apps',
]
options.add_experimental_option('excludeSwitches', excludeList)
chromedriverPath = '/my/path/to/chromedriver'
driver = webdriver.Chrome(chromedriverPath, chrome_options=options)
path = 'https://www.youtube.com/watch?v=Bz9Lza059NU'
driver.get(path);
time.sleep(60) # Let the user actually see something!
driver.quit()
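The comparison described above (the Command Line row of chrome://version in a normal browser versus the Selenium-launched one) can be done by eye, but a small helper makes the added switches easier to spot. This is a minimal sketch using only the standard library. `added_switches` is a made-up name, and the `automated` string is a hypothetical example, not the actual command line from the answer.

```python
import shlex


def added_switches(normal_cmd, automated_cmd):
    """Return the switch names that appear only in the automated browser's command line."""
    def names(cmd):
        return {
            token.lstrip("-").split("=", 1)[0]
            for token in shlex.split(cmd)
            if token.startswith("--")
        }
    return sorted(names(automated_cmd) - names(normal_cmd))


normal = ("/usr/lib/chromium-browser/chromium-browser --enable-pinch "
          "--flag-switches-begin --flag-switches-end")
# Hypothetical command line of a Selenium-launched browser.
automated = normal + " --disable-default-apps --user-data-dir=/home/mdorrell/ChromeProfile"
print(added_switches(normal, automated))
```

With this sample input it reports `disable-default-apps` and `user-data-dir`. The names come back without the leading dashes, which is the form the `excludeSwitches` list above uses.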
Thanks @MikeD for sharing your answer.
I was having the same issue when I wanted to chrome cast a R Shiny Dashboard via a selenium browser (with RSelenium). If I'd click on Cast it would show me "No Cast destinations found. Need help?", whereas from a normal browser it works fine.
In my case, it worked after excluding two switches (including the ChromeProfile was not necessary), which in R can be done with:
library(RSelenium)
options <- list()
options$chromeOptions$excludeSwitches <- list('disable-background-networking',
'disable-default-apps')
rD <- rsDriver(verbose = FALSE, port = 4570L, extraCapabilities = options)

Python Selenium : Can't ignore Save Dialog while trying to download a file

My scripts run on Python 3.6, Selenium 2.48 and Firefox 41 (I can't upgrade; I'm on a company machine).
I want to download some XML files from a website using Python and Selenium Webdriver. I use a Firefox profile to avoid the dialog frame and save the file in a specific location.
profile = webdriver.firefox.firefox_profile.FirefoxProfile()
profile.set_preference("browser.download.folderList", 2)
profile.set_preference("browser.download.manager.showWhenStarting", False)
profile.set_preference("browser.download.panel.shown", False)
profile.set_preference("browser.download.dir", dloadPath)
profile.set_preference("browser.helperApps.neverAsk.openFile","application/xml,text/xml")
profile.set_preference("browser.helperApps.neverAsk.saveToDisk", "application/xml,text/xml")
browser = webdriver.Firefox(firefox_profile=profile)
The program finds all downloadable links (tested: works):
def find_links(browser):  # function name assumed; the original post showed only the body
    links = []
    elements = browser.find_elements_by_xpath("//a[contains(@href,'reception/')]")
    for elem in elements:
        href = elem.get_attribute("href")
        links.append(href)
    return links
To download the file I use get() from Selenium
browser.get(fileUrl)
The files I'm looking for have a very specific URL, which means that I can't use Requests or urllib (2 or 3): I need to log in to the website and navigate through it, and I can't do that with those modules.
The url is like :
https://www.example.com/cft/cft/reception/filename.xml?user=xxxxxxxx&password=xxxxxxxx
Here is the html link :
filename.xml
With my script I can access the website and navigate through it, but when I get the file URL the save dialog pops up, for no reason that I could find.
The script works very well on other websites, I think the problem is the url.
Thanks for your help

How can I download a file on a click event using selenium?

I am working with Python and Selenium. I want to download a file that is only available through a click event. I wrote the following code.
from selenium import webdriver
from selenium.common.exceptions import NoSuchElementException
from selenium.webdriver.common.keys import Keys
browser = webdriver.Firefox()
browser.get("http://www.drugcite.com/?q=ACTIMMUNE")
browser.close()
I want to download both files from the links named "Export Data" at the given URL. How can I achieve this, given that the download is only triggered by a click event?
Find the link using find_element(s)_by_*, then call click method.
from selenium import webdriver
# To prevent download dialog
profile = webdriver.FirefoxProfile()
profile.set_preference('browser.download.folderList', 2) # custom location
profile.set_preference('browser.download.manager.showWhenStarting', False)
profile.set_preference('browser.download.dir', '/tmp')
profile.set_preference('browser.helperApps.neverAsk.saveToDisk', 'text/csv')
browser = webdriver.Firefox(profile)
browser.get("http://www.drugcite.com/?q=ACTIMMUNE")
browser.find_element_by_id('exportpt').click()
browser.find_element_by_id('exporthlgt').click()
Added profile manipulation code to prevent download dialog.
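Because the script returns right after the two clicks, it can help to wait until the files have actually landed in the download directory before closing the browser. Below is a minimal sketch that polls the directory. It assumes the browser marks in-progress downloads with a `.part` (Firefox) or `.crdownload` (Chrome) suffix, and that a dedicated, initially empty directory is used for `browser.download.dir` rather than `/tmp`. The function name and its parameters are made up for this example.

```python
import os
import time


def wait_for_downloads(directory, expected_count, timeout=30.0, poll=0.5):
    """Block until directory holds at least expected_count finished files."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        names = os.listdir(directory)
        # Temp-file suffixes used while a download is still being written.
        in_progress = [n for n in names if n.endswith((".part", ".crdownload"))]
        finished = [n for n in names if n not in in_progress]
        if not in_progress and len(finished) >= expected_count:
            return sorted(finished)
        time.sleep(poll)
    raise TimeoutError("downloads did not finish in %s" % directory)
```

In the script above this would be called after the second click(), for example `wait_for_downloads(download_dir, 2)`, with `download_dir` being whatever directory was passed to `browser.download.dir`.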
I'll admit this solution is a little more "hacky" than the Firefox Profile saveToDisk alternative, but it works across both Chrome and Firefox, and doesn't rely on a browser-specific feature which could change at any time. And if nothing else, maybe this will give someone a little different perspective on how to solve future challenges.
Prerequisites: Ensure you have selenium and pyvirtualdisplay installed...
Python 2: sudo pip install selenium pyvirtualdisplay
Python 3: sudo pip3 install selenium pyvirtualdisplay
The Magic
import pyvirtualdisplay
import selenium
import selenium.webdriver
import time
import base64
import json
root_url = 'https://www.google.com'
download_url = 'https://www.google.com/images/branding/googlelogo/2x/googlelogo_color_272x92dp.png'
print('Opening virtual display')
display = pyvirtualdisplay.Display(visible=0, size=(1280, 1024,))
display.start()
print('\tDone')
print('Opening web browser')
driver = selenium.webdriver.Firefox()
#driver = selenium.webdriver.Chrome() # Alternately, give Chrome a try
print('\tDone')
print('Retrieving initial web page')
driver.get(root_url)
print('\tDone')
print('Injecting retrieval code into web page')
driver.execute_script("""
window.file_contents = null;
var xhr = new XMLHttpRequest();
xhr.responseType = 'blob';
xhr.onload = function() {
var reader = new FileReader();
reader.onloadend = function() {
window.file_contents = reader.result;
};
reader.readAsDataURL(xhr.response);
};
xhr.open('GET', %(download_url)s);
xhr.send();
""".replace('\r\n', ' ').replace('\r', ' ').replace('\n', ' ') % {
'download_url': json.dumps(download_url),
})
print('Looping until file is retrieved')
downloaded_file = None
while downloaded_file is None:
    # Returns the file retrieved base64 encoded (perfect for downloading binary)
    downloaded_file = driver.execute_script('return (window.file_contents !== null ? window.file_contents.split(\',\')[1] : null);')
    print(downloaded_file)
    if not downloaded_file:
        print('\tNot downloaded, waiting...')
        time.sleep(0.5)
print('\tDone')
print('Writing file to disk')
fp = open('google-logo.png', 'wb')
fp.write(base64.b64decode(downloaded_file))
fp.close()
print('\tDone')
driver.close() # close web browser, or it'll persist after python exits.
display.popen.kill() # close virtual display, or it'll persist after python exits.
Explanation
We first load a URL on the domain we're targeting a file download from. This allows us to perform an AJAX request on that domain without running into cross-origin restrictions.
Next, we inject some JavaScript into the DOM which fires off an AJAX request. Once the AJAX request returns a response, we load the response into a FileReader object. From there we can extract the base64-encoded content of the file by calling readAsDataURL(). We then attach the base64-encoded content to window, a globally accessible object.
Finally, because the AJAX request is asynchronous, we enter a Python while loop waiting for the content to be appended to the window. Once it's appended, we decode the base64 content retrieved from the window and save it to a file.
This solution should work across all modern browsers supported by Selenium, and works whether text or binary, and across all mime types.
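The decoding step on the Python side can be isolated and tried without a browser. `readAsDataURL()` produces a string of the form `data:<mime>;base64,<payload>`, and the script above keeps only the part after the comma. The sketch below does the same split and also returns the MIME type. `decode_data_url` is a name chosen for this illustration.

```python
import base64


def decode_data_url(data_url):
    """Split a 'data:<mime>;base64,<payload>' string into (mime_type, raw_bytes)."""
    header, _, payload = data_url.partition(",")
    if not header.startswith("data:") or not header.endswith(";base64"):
        raise ValueError("expected a base64 data URL")
    mime_type = header[len("data:"):-len(";base64")]
    return mime_type, base64.b64decode(payload)


# Build a small data URL locally instead of fetching one through the browser.
sample = "data:text/plain;base64," + base64.b64encode(b"hello").decode("ascii")
print(decode_data_url(sample))  # ('text/plain', b'hello')
```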
Alternate Approach
While I haven't tested this, Selenium does afford you the ability to wait until an element is present in the DOM. Rather than looping until a globally accessible variable is populated, you could create an element with a particular ID in the DOM and use the binding of that element as the trigger to retrieve the downloaded file.
In chrome what I do is downloading the files by clicking on the links, then I open chrome://downloads page and then retrieve the downloaded files list from shadow DOM like this:
docs = document
.querySelector('downloads-manager')
.shadowRoot.querySelector('#downloads-list')
.getElementsByTagName('downloads-item')
This solution is restricted to Chrome. The data also contains information such as the file path and download date. (Note that this snippet is JavaScript, not Python.)
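From Python, the JavaScript above would be run through `execute_script`. The sketch below only counts the listed entries. It assumes the `chrome://downloads` page still has the structure shown in the snippet, which can change between Chrome versions. `count_chrome_downloads` is a made-up helper name.

```python
DOWNLOADS_JS = """
return document
    .querySelector('downloads-manager')
    .shadowRoot.querySelector('#downloads-list')
    .getElementsByTagName('downloads-item').length;
"""


def count_chrome_downloads(driver):
    """Open chrome://downloads and return how many download entries are listed."""
    driver.get("chrome://downloads")
    return driver.execute_script(DOWNLOADS_JS)
```

Here `driver` is an already running Chrome WebDriver instance. Comparing the count before and after a click is one way to check that a download was registered.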
Here is the full working code. You can use Selenium to fill in the username, password and other fields. To get the names of the fields on the webpage, use Inspect Element. Elements (username, password or the button to click) can be located by class or by name.
from selenium import webdriver
# Using Chrome to access web
options = webdriver.ChromeOptions()
# Set the download path; Chrome reads it from the "prefs" option, not from a command-line argument
options.add_experimental_option("prefs", {"download.default_directory": "C:/Test"})
driver = webdriver.Chrome(options=options)
# Open the website
try:
    driver.get('xxxx') # Your Website Address
    password_box = driver.find_element_by_name('password')
    password_box.send_keys('xxxx') #Password
    download_button = driver.find_element_by_class_name('link_w_pass')
    download_button.click()
    driver.quit()
except Exception:
    driver.quit()
    print("Faulty URL")
