I am trying to download some data from the FanGraphs Leaderboards using Selenium. I was using Firefox to do so, but Chrome is a bit faster, so I am trying to switch over to it. With Firefox, downloading the files worked fine, but I have been having trouble with Chrome.
Setting Up Chrome
chrome_options = webdriver.ChromeOptions()
chrome_options.headless = False
os.makedirs("dist", exist_ok=True)
preferences = {
"profile.default_content_settings.popups": 0,
"download.default_directory": "dist/",
"directory_upgrade": True
}
chrome_options.add_experimental_option(
"prefs", preferences
)
self.browser = webdriver.Chrome(
chrome_options=chrome_options
)
Exporting Data
while True:
    try:
        WebDriverWait(self.browser, 20).until(
            expected_conditions.element_to_be_clickable(
                (By.ID, "LeaderBoard1_cmdCSV")
            )
        ).click()
        break
    except exceptions.ElementClickInterceptedException:
        self.__close_ad()
Whenever I run the tests for my module, the CSV file ends up in C:/Users/UserDir/Downloads rather than in the dist/ folder in my current working directory. I double-checked that the dist/ folder exists, and it does.
Specs
Python v3.9
selenium v3.141.0
Chromedriver v89.0.4389.23
Google Chrome v88.0.4324.190
I had the same problem and fixed it as follows:
option.add_experimental_option("prefs", {'download.default_directory': f"{download_path}",
'download.prompt_for_download': False,
'download.directory_upgrade': True})
Your directory_upgrade key is probably missing the download. prefix.
Copy and paste this one:
preferences = {
"profile.default_content_settings.popups": 0,
"download.default_directory": "dist/",
"download.directory_upgrade": True
}
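One more thing that may be worth checking: Chrome generally seems to expect an absolute path for download.default_directory, so a relative value such as "dist/" can be ignored and the file ends up in the default Downloads folder. Here is a minimal sketch of that idea. The helper names build_download_prefs and make_browser are made up for illustration and are not part of Selenium.

```python
import os


def build_download_prefs(download_dir):
    """Return Chrome prefs that send downloads to download_dir.

    Hypothetical helper, not a Selenium API.
    """
    # Resolve a relative path such as "dist/" against the current working directory.
    absolute_dir = os.path.abspath(download_dir)
    os.makedirs(absolute_dir, exist_ok=True)
    return {
        "profile.default_content_settings.popups": 0,
        "download.default_directory": absolute_dir,
        "download.prompt_for_download": False,
        "download.directory_upgrade": True,
    }


def make_browser(download_dir):
    """Start Chrome with the prefs above (needs Selenium and chromedriver)."""
    # Imported lazily so build_download_prefs can be used without Selenium installed.
    from selenium import webdriver

    chrome_options = webdriver.ChromeOptions()
    chrome_options.add_experimental_option("prefs", build_download_prefs(download_dir))
    return webdriver.Chrome(options=chrome_options)
```

With this, something like self.browser = make_browser("dist") would replace the setup code in the question.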
Related
I am trying to download a PDF from the following URL: https://sec.report/Document/0001670254-20-001152/
There is a download button embedded in the HTML. I am using the following code to click the button and send the download to my desktop, as defined in my path. The program runs without any errors, but the PDF does not show up on the desktop. I have tried changing the location to different places, e.g. Downloads. I have also toggled the preference in Google Chrome to download PDF files instead of automatically opening them in Chrome. Any ideas?
from selenium import webdriver
download_dir = "C:\\Users\\andrewlittle\\Desktop"
options = webdriver.ChromeOptions()
profile = {"plugins.plugins_list": [{"enabled": False, "name": "Chrome PDF Viewer"}],
"download.default_directory": download_dir , "download.extensions_to_open": "applications/pdf"}
options.add_experimental_option("prefs", profile)
chromedriver_path = os.getcwd() + '/chromedriver'
driver = webdriver.Chrome(ChromeDriverManager().install())
driver.get('https://sec.report/Document/0001670254-20-001152/document_1.pdf')
driver.close()
Thanks in advance!
See the answer below:
import time
from webdriver_manager.chrome import ChromeDriverManager
from selenium.webdriver.chrome.service import Service
from selenium.webdriver.chrome.options import Options
from selenium import webdriver
download_dir = "/Users/test/Documents/"
options = Options()
options.add_experimental_option('prefs', {
"download.default_directory": download_dir,
"download.prompt_for_download": False,
"download.directory_upgrade": True,
"plugins.always_open_pdf_externally": True
}
)
service = Service(ChromeDriverManager().install())
driver = webdriver.Chrome(service=service, options=options)
driver.get('https://sec.report/Document/0001670254-20-001152/document_1.pdf')
time.sleep(3)
driver.quit()
I put the time.sleep in as a safety margin in case the file takes a little longer to download. It may not be strictly necessary.
I also used the newer Service and Options objects for Selenium.
The key to the code is the use of these preferences:
"download.default_directory": download_dir,
"download.prompt_for_download": False,
"download.directory_upgrade": True,
"plugins.always_open_pdf_externally": True
These let Chrome download the PDF, without a prompt, to the directory of your choice.
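If a fixed time.sleep feels fragile, one alternative is to poll the download directory until a new finished file shows up. This is only a sketch: wait_for_download is a hypothetical helper, and it assumes Chrome names unfinished downloads with a ".crdownload" suffix.

```python
import os
import time


def wait_for_download(download_dir, already_present, timeout=30.0, poll_interval=0.5):
    """Return the path of a newly finished download in download_dir.

    Hypothetical helper. Assumes Chrome marks unfinished downloads
    with a ".crdownload" suffix.
    """
    deadline = time.monotonic() + timeout
    while True:
        # Only look at files that were not there before the download started.
        new_names = set(os.listdir(download_dir)) - set(already_present)
        in_progress = {name for name in new_names if name.endswith(".crdownload")}
        finished = sorted(new_names - in_progress)
        if finished and not in_progress:
            return os.path.join(download_dir, finished[0])
        if time.monotonic() >= deadline:
            raise TimeoutError(f"no finished download in {download_dir!r} after {timeout}s")
        time.sleep(poll_interval)
```

To use it, take the snapshot with already_present = os.listdir(download_dir) before driver.get(...), then call wait_for_download(download_dir, already_present) in place of time.sleep(3), before driver.quit().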
I wrote the following procedure, based on Selenium and Chrome, to download a PDF file to a defined folder, after performing some actions on a web app:
chrome_options = webdriver.ChromeOptions()
prefs = {'download.default_directory' : path_to_destination,
"plugins.always_open_pdf_externally": True
# Additional options I've tried but didn't work
#,"download.prompt_for_download": False,
# 'profile.default_content_setting_values.automatic_downloads': 1,
# "helperApps.neverAsk.saveToDisk": mime_types,
# "plugin.disable_full_page_plugin_for_types": mime_types
}
chrome_options.add_experimental_option('prefs', prefs)
driver = webdriver.Chrome(
executable_path=executable_path,
chrome_options=chrome_options
)
However, as soon as I click on the link that normally opens the PDF for viewing, the following unclickable page is displayed:
As soon as I manually click on it, everything works fine and the file is correctly downloaded to the indicated folder ("path_to_destination").
I tried with:
driver.find_element_by_xpath("//*[contains(@id, 'open-button')]").click()
# Or
driver.find_element_by_xpath("//*[contains(@id, 'main-content')]").click()
Since the xpath is:
//*[@id="main-content"]/a
But it does not work.
How can I either avoid opening this second page or click on the "Apri" (= Open) button?
P.S. Using Firefox and the following options, everything works fine:
# Setup
profile = webdriver.FirefoxProfile()
mime_types = "application/pdf,application/vnd.adobe.xfdf,"\
"application/vnd.fdf,application/vnd.adobe.xdp+xml"
profile.set_preference("browser.download.folderList", 2)
profile.set_preference("browser.download.manager.showWhenStarting", False)
profile.set_preference("browser.download.dir", full_destination)
# For PDFs
profile.set_preference("browser.helperApps.neverAsk.saveToDisk", mime_types)
profile.set_preference("plugin.disable_full_page_plugin_for_types", mime_types)
profile.set_preference("pdfjs.disabled", True)
Try adding the following to prefs:
"download.prompt_for_download": False
There's a lot on this topic. However, I have found nothing workable so far that involves using what is said in the title above and configurations listed below.
Here is what I am attempting to do: go to this webpage and click on the csv document icon for download (via xpath or css selectors). Either icon is fine - they download the same content.
The source code below outlines what I have done so far. This script runs with no issues, but no document is downloaded. How can I resolve this?
Note the following parameters for OS, Python, ChromeDriver, and Chrome configurations:
macOS Mojave v.10.14.6, Python v.3.7.3, ChromeDriver v.77.0.3865.40, Chrome v.77.0.3865.40
from selenium import webdriver
options = webdriver.ChromeOptions()
prefs = {"download.default_directory": "SOME_PATH"}
options.add_experimental_option("prefs", prefs)
options.binary_location = 'PATH_TO_CHROME'
options.add_argument('headless')
# set the window size
options.add_argument('window-size=1200x600')
# initialize the driver
driver = webdriver.Chrome('PATH_TO_CHROME_DRIVER',
options=options)
page_url = 'http://webapps.rrc.texas.gov/eds/eds_searchUic.xhtml'
button = '//*[@id="SearchUicForm:searchTable_paginator_top"]/a[7]'
driver.get(page_url)
# wait up to 5 seconds for the elements to become available
driver.implicitly_wait(5)
driver.find_element_by_xpath(button).click()
You can comment out the line options.add_argument('headless') to see what is happening in the browser. The script clicks the CSV icon and a download prompt pops up in the browser, so we need to handle that prompt in order to download the file. We can add Chrome options to prevent it:
options = Options()
options.add_experimental_option("prefs", {
"download.default_directory": r"C:\Users\xxx\downloads\Test",
"download.prompt_for_download": False,
"download.directory_upgrade": True,
"safebrowsing.enabled": True
})
driver = webdriver.Chrome(options=options)
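If the download still does not happen once 'headless' is switched back on, it may be because older headless Chrome builds reportedly block downloads by default. A workaround that is often suggested is to send the DevTools command Page.setDownloadBehavior through driver.execute_cdp_cmd. The sketch below assumes a Chrome driver that provides execute_cdp_cmd. allow_headless_downloads is a made-up helper name.

```python
import os


def allow_headless_downloads(driver, download_dir):
    """Ask Chrome, via the DevTools protocol, to allow downloads into download_dir.

    Hypothetical helper. Assumes driver is a Chrome WebDriver that
    exposes execute_cdp_cmd.
    """
    params = {
        "behavior": "allow",
        # An absolute path is used so it does not depend on the working directory.
        "downloadPath": os.path.abspath(download_dir),
    }
    driver.execute_cdp_cmd("Page.setDownloadBehavior", params)
    return params
```

Call allow_headless_downloads(driver, "SOME_PATH") right after creating the driver and before clicking the CSV icon.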
This code is supposed to download the sample PDF file, but it only displays it in the browser.
from selenium import webdriver
options = webdriver.ChromeOptions()
options.add_experimental_option("prefs", {
"download.default_directory": r"/Users/ugur/Downloads/",
"download.prompt_for_download": True,
"download.directory_upgrade": False,
"safebrowsing.enabled": True
})
driver = webdriver.Chrome(executable_path="/Users/ugur/Downloads/chromedriver",chrome_options=options)
driver.get('http://www.africau.edu/images/default/sample.pdf')
This is a demonstration. The real website is different and requires authentication, so after running the initial part of the code I manually enter the username and password and then run a for loop.
On your computer, open Chrome.
Navigate to chrome://settings
Go to Advanced settings.
Under “Privacy”, click Content settings.
Under “PDF Documents”, check the box next to “Download PDF files instead of automatically opening.”
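A fresh Selenium session may not pick up settings changed by hand in your normal Chrome profile, so it could be worth setting the equivalent preferences in code as well. The sketch below assumes the pref name "plugins.always_open_pdf_externally" reported as working elsewhere for this problem. with_pdf_download is a hypothetical helper.

```python
def with_pdf_download(prefs):
    """Return a copy of prefs that downloads PDFs instead of displaying them.

    Hypothetical helper. The original dict is left untouched.
    """
    updated = dict(prefs)
    # Hand PDFs to the download manager instead of Chrome's built-in viewer.
    updated["plugins.always_open_pdf_externally"] = True
    # Avoid the "Save as" dialog, which Selenium cannot click through.
    updated["download.prompt_for_download"] = False
    return updated
```

The result can be passed to options.add_experimental_option("prefs", ...) in place of the dict from the question.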
The website has a download button, and in Python I can do
button.click()
to get the file downloaded to the Chrome download folder with a filename specified by the website.
Is there a way to change the target folder and filename, on Windows?
Try with:
download_dir = "/yourDownloadPath/"
chrome_options = webdriver.ChromeOptions()
preferences = {"download.default_directory": download_dir,
               "download.directory_upgrade": True,
               "safebrowsing.enabled": True}
chrome_options.add_experimental_option("prefs", preferences)
driver = webdriver.Chrome(chrome_options=chrome_options, executable_path=r'/pathTo/chromedriver')
driver.get("urlfiletodownload")
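That covers the target folder. As far as I know, Chrome's preferences do not offer a way to choose the filename, so a common workaround is to rename the file once the download has finished. A minimal sketch follows. rename_download is a made-up helper, and it assumes you already know the path of the finished file.

```python
import os


def rename_download(downloaded_path, new_name):
    """Rename a finished download in place and return its new path.

    Hypothetical helper. An existing file called new_name in the
    same folder is overwritten.
    """
    target = os.path.join(os.path.dirname(downloaded_path), new_name)
    os.replace(downloaded_path, target)
    return target
```

For example, rename_download(os.path.join(download_dir, "export.csv"), "report.csv"), where both filenames are placeholders.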
You can create a profile for chrome and define the download location for the tests. Here is an example:
from selenium import webdriver
from selenium.webdriver.chrome.options import Options
options = webdriver.ChromeOptions()
options.add_argument("download.default_directory=C:/Downloads")
driver = webdriver.Chrome(chrome_options=options)
This works! Store the directory path in a variable and pass the variable to "download.default_directory":
import os
import sys
from selenium import webdriver

exepath = sys.argv[0]
# get the directory that contains the .py file
Dir_path = os.path.dirname(os.path.abspath(exepath))
# get the path of the "PDF_Folder" directory
Download_dir = Dir_path + "\\PDF_Folder\\"
chrome_options = webdriver.ChromeOptions()
preferences = {"download.default_directory": Download_dir,  # pass the variable
               "download.prompt_for_download": False,
               "download.directory_upgrade": True,
               "safebrowsing.enabled": True}
chrome_options.add_experimental_option("prefs", preferences)
driver = webdriver.Chrome(chrome_options=chrome_options, executable_path=r'/pathTo/chromedriver')
driver.get("urlfiletodownload")
After wasting some time, I found out that the recommended solution did NOT work for me:
options.add_argument("download.default_directory=C:/MyDownloadPath")
The code snippet below worked for me. The chromedriver.exe was in the same folder as my Python script, and I wanted to download to that same folder as well. You won't need the executable_path parameter if chromedriver is in your PATH and can be found by Selenium.
import os
from selenium import webdriver
localdir = os.path.dirname(os.path.realpath(__file__))
chromeOptions = webdriver.ChromeOptions()
prefs = { "download.default_directory" : localdir }
chromeOptions.add_experimental_option("prefs", prefs)
exe_path = os.path.join(localdir, 'chromedriver.exe')
with webdriver.Chrome(executable_path=exe_path, options=chromeOptions) as driver:
    # do your chrome download stuff here:
    driver.get(link)
My System:
Windows 10
Python 3.7.6
Chrome 80.0.3987.132
ChromeDriver 80.0.3987.106
Pip Module: Selenium 3.141.0
Date: March 2020