I've tried to adapt several existing solutions (1, 2) to the remote Firefox webdriver running in a selenium/standalone-firefox Docker container:
options = Options()
options.set_preference('browser.download.dir', '/src/app/output')
options.set_preference('browser.download.folderList', 2)
options.set_preference('browser.download.manager.showWhenStarting', False)
options.set_preference('browser.helperApps.alwaysAsk.force', False)
options.set_preference('browser.helperApps.neverAsk.saveToDisk', 'application/pdf')
options.set_preference('pdfjs.disabled', True)
options.set_preference('pdfjs.enabledCache.state', False)
options.set_preference('plugin.disable_full_page_plugin_for_types', False)
cls.driver = webdriver.Remote(
command_executor='http://selenium:4444/wd/hub',
desired_capabilities={'browserName': 'firefox', 'acceptInsecureCerts': True},
options=options
)
Navigating and clicking the relevant download button works fine, but the file never appears in the download directory. I've verified everything I can think of:
The user in the Selenium container can create files in /src/app/output and those files are visible in the host OS.
I can download the file successfully using my desktop browser.
The response content type is application/pdf.
What am I missing?
It turned out that other changes I had made while researching this caused the server to return a text/plain document rather than a PDF file. For reference, this is the simplest set of options I could get to work:
options.set_preference('browser.download.dir', DOWNLOAD_DIRECTORY)
options.set_preference('browser.download.folderList', 2)
options.set_preference('browser.helperApps.neverAsk.saveToDisk', 'application/pdf')
options.set_preference('pdfjs.disabled', True)
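For reuse, the four preferences above can be collected in one place. The following is a minimal sketch rather than the original poster's code: `build_download_prefs` and `apply_prefs` are hypothetical helper names, and the only assumption about the target object is that it exposes `set_preference(name, value)`, as the `Options` object in the snippet above does.

```python
def build_download_prefs(download_dir, mime_types=("application/pdf",)):
    """Return a minimal set of Firefox preferences for prompt-free downloads."""
    return {
        # 2 = use the custom directory given in browser.download.dir
        "browser.download.folderList": 2,
        "browser.download.dir": download_dir,
        # Comma-separated MIME types that are saved without a prompt
        "browser.helperApps.neverAsk.saveToDisk": ",".join(mime_types),
        # Disable the built-in PDF viewer so PDFs are saved, not rendered
        "pdfjs.disabled": True,
    }


def apply_prefs(options, prefs):
    """Copy every preference onto an object that has set_preference()."""
    for name, value in prefs.items():
        options.set_preference(name, value)
    return options
```

With Selenium available, this would be used as `apply_prefs(Options(), build_download_prefs(DOWNLOAD_DIRECTORY))` before the driver is created.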
Related
I am trying to turn off the Firefox download dialog. I used this piece of Python code, which uses the Selenium library. It should make the file download directly to the given path without any further prompt.
from selenium import webdriver
def disable_download_dialog(path):
fp = webdriver.FirefoxProfile()
fp.set_preference("browser.download.folderList", 2)
fp.set_preference("browser.download.manager.showWhenStarting", False)
fp.set_preference("browser.download.dir", path)
fp.set_preference("browser.helperApps.neverAsk.saveToDisk", "application/pdf")
fp.update_preferences()
return fp.path
Then I call this function in my RF test like this:
${ff_profile_path}=    disable download dialog    ${EXECDIR}\\path\\to\\my\\folder
and then open the browser like this:
Open Browser    ${url}    ${browser}    ff_profile_dir=${ff_profile_path}
From the test run I can see that the download window is still displayed. The path to the folder where I want to save the downloaded file appears in the test logs like this:
D:\\path\\to\\the\\folder\\named\\Downloads
The Firefox profile really is updated and saved in the Temp folder, but it looks like it is not loaded and therefore not used for my test. The path to the Firefox profile looks like this:
C:\Users\surname~1.name\AppData\Local\Temp\tmp83d29mnz
Of course, a new profile is created every time, which is not an issue. It would be great if I could also set the path for the Firefox profile I create with the Python function.
So the questions here are:
Why is the download dialog still shown when I have disabled it?
Can the Firefox profile be saved in a folder that I define?
OK, so I found out what the missing piece was.
I added these two lines of code to the Python function:
fp.set_preference("browser.helperApps.alwaysAsk.force", False)
fp.set_preference("pdfjs.disabled", True)
So the final version of the function looks like this:
def disable_download_dialog(path):
from selenium import webdriver
fp = webdriver.FirefoxProfile()
fp.set_preference("browser.download.folderList", 2)
fp.set_preference("browser.download.manager.showWhenStarting", False)
fp.set_preference("browser.download.dir", path)
fp.set_preference("browser.helperApps.alwaysAsk.force", False)
fp.set_preference("browser.helperApps.neverAsk.saveToDisk",'application/pdf')
fp.set_preference("pdfjs.disabled", True)
fp.update_preferences()
return fp.path
I am programming a Python Webscraper which needs to be able to click on a download button and save a PDF to a location that is defined through an XML-File.
The problematic part of my code is the following:
profile = webdriver.FirefoxProfile()
download_Path = items.get(key = 'dir') # Get download path from XML.
if not os.path.exists(download_Path):
os.makedirs(download_Path)
profile.set_preference("browser.helperApps.alwaysAsk.force", False)
profile.set_preference("browser.download.panel.shown", False)
profile.set_preference("browser.download.manager.useWindow", False)
profile.set_preference("webdriver_enable_native_events", False)
profile.set_preference("browser.helperApps.neverAsk.openFile", "application/pdf;")
profile.set_preference("browser.helperApps.neverAsk.saveToDisk", "application/pdf;")
profile.set_preference("browser.download.folderList", 2)
profile.set_preference("browser.download.dir", download_Path)
profile.update_preferences()
driver = webdriver.Firefox(executable_path = DriverPath, options = options, firefox_profile = profile)
Almost everything works fine. The download directory is changed as intended, so profile.set_preference works, but the other preferences don't seem to take effect. I have been searching for a while now and, as you can see, I have tried different options so that the browser doesn't ask whether to open the file or where to save it, and just saves it to the given directory.
I solved it myself. The answer is that you have to configure the PDF reader that is integrated in Firefox ("PDF.js") separately with the following code:
profile.set_preference("pdfjs.disabled", True)
That's it; the rest works as intended.
I'm not really a Python user, but I'm using some code that I got online to download a file. One of the snippets is:
urlpage = 'https://www150.statcan.gc.ca/n1/tbl/csv/' + '10100127' + '-eng.zip'
profile = webdriver.FirefoxProfile()
profile.set_preference("browser.download.folderList", 2)
profile.set_preference("browser.download.manager.showWhenStarting", False)
profile.set_preference("browser.download.dir", 'D:\downloads')
profile.set_preference("browser.helperApps.neverAsk.saveToDisk", "application/x-gzip")
driver = webdriver.Firefox()
driver.get(urlpage)
From what I can see, this should just download the file to the downloads folder on my D: drive, yet when I run the code, the page opens and then asks me whether I would like to view or download the file. Is there anything wrong with the code, or am I doing something wrong?
Not sure if it's important information, but I'm using PyCharm as my IDE
Here is the script that you should use. It will save the file in the system default downloads folder.
FF_options = webdriver.FirefoxProfile()
FF_options.set_preference("browser.helperApps.neverAsk.saveToDisk","application/zip")
driver= webdriver.Firefox(firefox_profile=FF_options)
If you want to save the downloaded file in a specific location, add the prefs below before creating the driver (this needs import os at the top of the script).
# Change the path here; this line saves into the current working directory,
# i.e. the directory the script is run from.
FF_options.set_preference("browser.download.dir", os.getcwd())
FF_options.set_preference("browser.download.folderList",2)
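The MIME type is the part that differs between the question and this answer: the question lists application/x-gzip, the answer application/zip, and the preference has to match the Content-Type the server actually sends. Below is a small sketch for checking a header value against the preference string. `normalize_mime` and `is_auto_saved` are hypothetical helpers, and the matching rule (case-insensitive, parameters such as charset ignored) is a simplifying assumption, not Firefox's exact logic.

```python
def normalize_mime(value):
    """Lower-case a MIME type and drop parameters such as '; charset=utf-8'."""
    return value.split(";", 1)[0].strip().lower()


def is_auto_saved(content_type, never_ask_pref):
    """Return True if content_type is listed in the comma-separated pref value."""
    allowed = {
        normalize_mime(item)
        for item in never_ask_pref.split(",")
        if item.strip()
    }
    return normalize_mime(content_type) in allowed


# The question's preference does not cover a ZIP response; the answer's does.
print(is_auto_saved("application/zip", "application/x-gzip"))  # False
print(is_auto_saved("application/zip", "application/zip"))     # True
```

The Content-Type to feed in can be read from the response headers in the browser's network tools.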
My scripts run on Python 3.6, Selenium 2.48 and Firefox 41 (I can't upgrade, I'm on a company machine).
I want to download some XML files from a website using Python and Selenium Webdriver. I use a Firefox profile to avoid the dialog frame and save the file in a specific location.
profile = webdriver.firefox.firefox_profile.FirefoxProfile()
profile.set_preference("browser.download.folderList", 2)
profile.set_preference("browser.download.manager.showWhenStarting", False)
profile.set_preference("browser.download.panel.shown", False)
profile.set_preference("browser.download.dir", dloadPath)
profile.set_preference("browser.helperApps.neverAsk.openFile","application/xml,text/xml")
profile.set_preference("browser.helperApps.neverAsk.saveToDisk", "application/xml,text/xml")
browser = webdriver.Firefox(firefox_profile=profile)
The program finds all downloadable links (tested, works):
links = []
elements = browser.find_elements_by_xpath("//a[contains(@href,'reception/')]")
for elem in elements:
href = elem.get_attribute("href")
links.append(href)
return links
To download the file I use get() from Selenium
browser.get(fileUrl)
The files I'm looking for have a very specific URL, which means that I can't use Requests or urllib (2 or 3): I need to log in to the website and navigate through it, and I can't do that with those modules.
The url is like :
https://www.example.com/cft/cft/reception/filename.xml?user=xxxxxxxx&password=xxxxxxxx
Here is the html link :
filename.xml
With my script I can access the website and navigate through it, but when I get the file URL the dialog pops up, for no reason that I could find.
The script works very well on other websites, so I think the problem is the URL.
Thanks for your help
I wish to have Firefox using selenium for Python to download the Master data (Download, XLSX) Excel file from this Frankfurt stock exchange webpage.
The problem: I can't get Firefox to download the file without asking where to save it first.
Let me first point out that the URL I'm trying to get the Excel file from, is really a Blob URL:
http://www.xetra.com/blob/1193366/b2f210876702b8e08e40b8ecb769a02e/data/All-tradable-ETFs-ETCs-and-ETNs.xlsx
Perhaps the Blob is causing my problem? Or, perhaps the problem is in my MIME handling?
from selenium import webdriver
profile_dir = "path/to/ff_profile"
dl_dir = "path/to/dl/folder"
ff_profile = webdriver.FirefoxProfile(profile_dir)
ff_profile.set_preference("browser.download.folderList", 2)
ff_profile.set_preference("browser.download.manager.showWhenStarting", False)
ff_profile.set_preference("browser.download.dir", dl_dir)
ff_profile.set_preference('browser.helperApps.neverAsk.saveToDisk', "text/plain, application/vnd.ms-excel, text/csv, text/comma-separated-values, application/octet-stream")
driver = webdriver.Firefox(ff_profile)
url = "http://www.xetra.com/xetra-en/instruments/etf-exchange-traded-funds/list-of-tradable-etfs"
driver.get(url)
dl_link = driver.find_element_by_partial_link_text("Master data")
dl_link.click()
The actual mime-type to be used in this case is:
application/vnd.openxmlformats-officedocument.spreadsheetml.sheet
How do I know that? Here is what I've done:
opened Firefox manually and navigated to the target site
when downloading the file, checked the checkbox to save this kind of file automatically
went to Help -> Troubleshooting Information and navigated to the "Profile Folder"
in the profile folder, found and opened mimetypes.rdf
inside mimetypes.rdf, found the record/resource corresponding to the Excel file I had just downloaded
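The last two steps can also be scripted. The sketch below assumes that mimetypes.rdf identifies each registered type with a `urn:mimetype:<type>/<subtype>` string; that layout is an assumption about the file format and is worth checking against your own file. `list_mime_types` is a hypothetical helper, and the sample text is made up for illustration.

```python
import re

# Matches 'urn:mimetype:type/subtype' but not 'urn:mimetype:handler:...'
_MIME_URN = re.compile(r"urn:mimetype:([\w.+-]+/[\w.+-]+)")


def list_mime_types(rdf_text):
    """Return the sorted, de-duplicated MIME types found in the RDF text."""
    return sorted(set(_MIME_URN.findall(rdf_text)))


# Made-up sample in the assumed layout, not copied from a real profile.
sample = """
<RDF:Description RDF:about="urn:mimetype:application/vnd.openxmlformats-officedocument.spreadsheetml.sheet"/>
<RDF:Description RDF:about="urn:mimetype:handler:application/vnd.openxmlformats-officedocument.spreadsheetml.sheet"/>
"""

print(list_mime_types(sample))
```

For a real profile, the text would come from reading the mimetypes.rdf file in the profile folder, for example with `pathlib.Path(...).read_text()`.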