Python: Downloading a file from a link that is implemented with PHP

I am trying to download a file from a web page.
The file link is served by PHP: ~/download.php?id=~
The file can be downloaded by clicking the link, or by right-clicking it and choosing "Save this file" in the web browser.
At first I used Selenium with PhantomJS. I could find the link's "a" tag with find_element, and I tried clicking and right-clicking it with Selenium's ActionChains, but the file was not downloaded. From what I found on the web, PhantomJS does not seem to support file downloads.
As a second approach I am considering Firefox or Chrome, which seem to support downloading files. Please advise whether this is the best approach. I am running the program on a Raspberry Pi B+.
Thank you very much.

The easiest way to download a file:
url = "http://domain.com/~/download.php?id=~"
path_to_file = "/local/folder/where/you/want/to/save/file/file_name"
Python 2.x
import urllib
urllib.urlretrieve(url, path_to_file)
Python 3.x
import urllib.request
urllib.request.urlretrieve(url, path_to_file)
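A download.php?id=... link usually does not carry the real file name in the URL, so it can help to read it from the Content-Disposition response header. The following is a minimal Python 3 sketch of that idea, assuming the server needs no login or cookies; the helper names filename_from_headers and download, and the fallback name, are hypothetical and not part of the original answer.

```python
import os
import shutil
import urllib.request
from email.message import Message


def filename_from_headers(headers, fallback):
    """Return the file name from a Content-Disposition header, else fallback."""
    value = headers.get("Content-Disposition")
    if value:
        msg = Message()
        msg["Content-Disposition"] = value
        name = msg.get_filename()
        if name:
            # Keep only the last path component so the file stays in dest_dir.
            return os.path.basename(name)
    return fallback


def download(url, dest_dir, fallback_name="download.bin"):
    """Stream the response body into dest_dir and return the saved path."""
    with urllib.request.urlopen(url) as response:
        name = filename_from_headers(response.headers, fallback_name)
        path = os.path.join(dest_dir, name)
        with open(path, "wb") as out:
            # copyfileobj reads in chunks, so large files are not held in memory.
            shutil.copyfileobj(response, out)
    return path
```

Calling download(url, "/local/folder") would then save the file under the name the server suggests, or under the fallback name when the header is absent.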
If you need to download the file with Selenium:
Firefox
from selenium import webdriver
from selenium.webdriver.firefox.firefox_profile import FirefoxProfile
# file_MIME_type must match the MIME type the server sends for the file;
# "application/pdf" is only an example value
file_MIME_type = "application/pdf"
profile = FirefoxProfile()
profile.set_preference("browser.download.folderList", 2)
profile.set_preference("browser.download.manager.showWhenStarting", False)
profile.set_preference("browser.download.dir", '/download/folder/by/default')
profile.set_preference("browser.helperApps.neverAsk.saveToDisk", file_MIME_type)
driver = webdriver.Firefox(firefox_profile=profile)
Chrome
from selenium import webdriver
download_dir = "/download/folder/by/default"
chrome_options = webdriver.ChromeOptions()
preferences = {"download.default_directory": download_dir,
               "download.directory_upgrade": True,
               "safebrowsing.enabled": True}
chrome_options.add_experimental_option("prefs", preferences)
driver = webdriver.Chrome(chrome_options=chrome_options)
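Neither browser configuration above waits for the download to finish, so a script that quits the driver right after the click may leave a partial file behind. A minimal sketch of one way to wait is to poll the download folder until no partial files remain. It assumes the folder starts empty and that partial downloads end in .crdownload (Chrome) or .part (Firefox); the function name wait_for_download is hypothetical.

```python
import os
import time


def wait_for_download(directory, timeout=30.0, poll=0.5):
    """Return the sorted names of finished files once no partial files remain."""
    partial_suffixes = (".crdownload", ".part")
    deadline = time.monotonic() + timeout
    while True:
        names = sorted(os.listdir(directory))
        finished = [n for n in names if not n.endswith(partial_suffixes)]
        # Done when at least one file exists and none of them is partial.
        if finished and len(finished) == len(names):
            return finished
        if time.monotonic() >= deadline:
            raise TimeoutError("download did not finish in %s" % directory)
        time.sleep(poll)
```

After clicking the link with the driver, wait_for_download("/download/folder/by/default") would block until the file is complete or the timeout expires.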

Related

chromeOptions.add_experimental_option no such attribute

I want to download a PDF directly instead of displaying it in Chrome's PDF viewer plugin.
The Python code I found is:
chromeOptions = webdriver.ChromeOptions()
prefs = {"plugins.plugins_disabled" : ["Chrome PDF Viewer"]}
chromeOptions.add_experimental_option("prefs",prefs)
driver=webdriver.Chrome('/usr/lib/chromium-browser/chromedriver', chrome_options=chromeOptions)
However, chromeOptions does not have an add_experimental_option method.
Is there a way to make this work, please?
Here is the proper way to initialize chrome options:
from selenium.webdriver.chrome.options import Options
chrome_options = Options()
I believe that is your issue. I tested this code and it worked for me:
from selenium import webdriver
from selenium.webdriver.chrome.options import Options
chrome_options = Options()
prefs = {"plugins.plugins_disabled" : ["Chrome PDF Viewer"]}
chrome_options.add_experimental_option("prefs",prefs)
driver=webdriver.Chrome(chrome_options=chrome_options)
For more information, see the Selenium documentation for the Chrome WebDriver API.
For whatever reason, the add_experimental_option method does not appear for me, possibly because I am using a Linux install. My goal is to download a series of PDFs automatically. A workaround is to first open the PDF in the PDF viewer by finding the web element and calling click() on it, which loads the PDF into the viewer. Then read the URL from the address bar and pass it to the Linux download command "wget" to obtain the PDF file. That is:
import os
driver.find_element_by_class_name('browzine-direct-to-pdf-link').click()
pdfAddress = driver.current_url
os.system("wget %s -P /home/keir/Downloads/pdfs" % pdfAddress)
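Passing the URL through os.system hands it to a shell, so characters such as & or ; in a PDF address can split or alter the command. A small sketch of the same wget call without a shell, using subprocess with an argument list; the helper names build_wget_command and fetch_with_wget are hypothetical, and wget is assumed to be installed.

```python
import subprocess


def build_wget_command(url, dest_dir):
    """Return the wget argument list; -P sets the directory to save into."""
    return ["wget", "-P", dest_dir, url]


def fetch_with_wget(url, dest_dir):
    """Run wget without a shell; raises CalledProcessError on a non-zero exit."""
    subprocess.run(build_wget_command(url, dest_dir), check=True)
```

With the variables from the workaround above, the call would be fetch_with_wget(pdfAddress, "/home/keir/Downloads/pdfs").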

Unable to download a file through requests after getting url from selenium webdriver

While working with Selenium WebDriver, I want to set the download location to a particular folder and run the browser headless, but I cannot do both at once: when I go headless, the download location reverts to the default.
Here is the relevant piece of my code:
options = webdriver.ChromeOptions()
options.add_experimental_option("prefs", {
    # os.path.join avoids the invalid "\m" escape in "\mydir"
    "download.default_directory": os.path.join(os.getcwd(), "mydir"),
    "download.prompt_for_download": False,
    "download.directory_upgrade": True
})
options.add_argument('--headless')
driver = webdriver.Chrome(chrome_options=options)
Unfortunately, chromedriver does not currently support headless downloads.

Force Selenium Chrome Driver to use QUIC instead of TCP

I am working on downloading HAR from Chrome for YouTube through Selenium Python Script.
Code Snippet:
chrome_options = webdriver.ChromeOptions()
chrome_options.add_argument("--proxy-server={0}".format(url))
chrome_options.add_argument("--enable-quic")
self.driver = webdriver.Chrome(chromedriver,chrome_options = chrome_options)
self.proxy.new_har(args['url'], options={'captureHeaders': True})
self.driver.get(args['url'])
result = json.dumps(self.proxy.har, ensure_ascii=False)
I want QUIC to be used whenever I download the HAR, but when I look at the packets in Wireshark, the Selenium driver is using TCP only. Is there a way to force ChromeDriver to use QUIC? Or is there an alternative to BMP (BrowserMob Proxy)?
A similar thing was asked for Firefox in the question "How to capture all requests made by page in webdriver? Is there any alternative to Browsermob?", and there was a solution with Selenium alone, without BMP. Is the same possible for Chrome?
A workaround for this problem could be to start Chrome normally (with your default profile or a new one) and enable QUIC manually, then start chromedriver with that profile loaded.
from selenium import webdriver
from selenium.webdriver.chrome.options import Options
options = webdriver.ChromeOptions()
options.add_argument("user-data-dir=/home/user/.config/google-chrome")
driver = webdriver.Chrome(executable_path="/home/user/Downloads/chromedriver", chrome_options=options)

how can i remove notifications and alerts from browser? selenium python 2.7.7

I am trying to submit information on a web page, but Selenium throws this error:
UnexpectedAlertPresentException: Alert Text: This page is asking you to confirm that you want to leave - data you have entered may not be saved.
It's not a leave notification; here is a picture of the notification: [image not available]
If I click "never show this notification again", my choice doesn't get saved. Is there a way to save it or to disable all notifications?
Edit: I'm using Firefox.
You can disable the browser notifications, using chrome options. Sample code below:
chrome_options = webdriver.ChromeOptions()
prefs = {"profile.default_content_setting_values.notifications" : 2}
chrome_options.add_experimental_option("prefs",prefs)
driver = webdriver.Chrome(chrome_options=chrome_options)
With the latest version of Firefox, the above preferences didn't work.
Below is a solution that disables notifications using a Firefox profile object:
_browser_profile = webdriver.FirefoxProfile()
_browser_profile.set_preference("dom.webnotifications.enabled", False)
webdriver.Firefox(firefox_profile=_browser_profile)
Disable notifications when using Remote Object:
webdriver.Remote(desired_capabilities=_desired_caps, command_executor=_url, options=_custom_options, browser_profile=_browser_profile)
selenium==3.11.0
Usually with browser settings like this, any changes you make are thrown away the next time Selenium starts up a new browser instance.
Are you using a dedicated Firefox profile to run your Selenium tests? If so, set this setting to what you want in that profile and then close the browser; that should save it for the next use. You will need to tell Selenium to use this profile, though. That is done through the capabilities (SetCapabilities) when you start the driver session.
This will do it:
from selenium import webdriver
from selenium.webdriver.firefox.options import Options
options = Options()
options.set_preference("dom.webnotifications.enabled", False)
browser = webdriver.Firefox(firefox_options=options)
For Google Chrome and v3 of Selenium you may receive "DeprecationWarning: use options instead of chrome_options", so you will want to do the following:
from selenium import webdriver
from webdriver_manager.chrome import ChromeDriverManager
options = webdriver.ChromeOptions()
options.add_argument('--disable-notifications')
driver = webdriver.Chrome(ChromeDriverManager().install(), options=options)
Note: I am using webdriver-manager, but this also works with specifying the executable_path.
This answer improves on TH Todorov's code snippet, based on what works as of Chrome version 80.0.3987.163.
In the line that assigns lk, provide the path to the chromedriver executable, which you can download from the ChromeDriver site.
import os
from selenium import webdriver
lk = os.path.join(os.getcwd(), "chromedriver")
chrome_options = webdriver.ChromeOptions()
prefs = {"profile.default_content_setting_values.notifications": 2}
chrome_options.add_experimental_option("prefs", prefs)
driver = webdriver.Chrome(lk, options=chrome_options)

Download file from sharepoint using selenium webdriver python

I am trying to download a file from a SharePoint URL. I wrote code that sets neverAsk.saveToDisk, but the save dialog still appears. The same code works when I click a download link on other sites, but not with the SharePoint application. Here is the code I used:
from selenium import webdriver
from selenium.webdriver.firefox.firefox_profile import FirefoxProfile
# To prevent download dialog
profile = webdriver.FirefoxProfile()
profile.set_preference('browser.download.folderList', 2) # custom location
profile.set_preference('browser.download.manager.showWhenStarting', False)
profile.set_preference("browser.download.defaultFolder", 'tt_at')
profile.set_preference("browser.download.lastDir", 'tt_at')
profile.set_preference('browser.download.dir', 'tt_at')
profile.set_preference("browser.download.useDownloadDir", True)
profile.set_preference('browser.helperApps.neverAsk.saveToDisk', "application/octet-stream,application/msexcel")
browser = webdriver.Firefox(profile)
browser.get("https://docs.ad.sys.com/sites/cloud/Project/Form/FolderCTID=0x01200069047C40C93C3846B74E0776AAD1610A&InitialTabId=Ribbon%2EDocument&VisibilityContext=WSSTabPersistence")
browser.find_element_by_xpath('/html/body/form/div[8]/div/div[3]/div[3]/div[2]/div/div/table/tbody/tr/td/table/tbody/tr/td/div/table[1]/tbody/tr/td/table/tbody/tr[12]/td[4]/div[1]/a').click()
However, the code above still shows the dialog to select a location.
I think I got the solution, try the following:
from selenium import webdriver
from selenium.webdriver.firefox.firefox_profile import FirefoxProfile
profile = webdriver.FirefoxProfile()
#Give Complete Path of download dir
profile.set_preference("browser.download.lastDir",r'd:\temp')
profile.set_preference("browser.download.useDownloadDir",True)
profile.set_preference("browser.download.manager.showWhenStarting",False)
# The value is a comma-separated list of bare MIME types (no "Content-Type=" prefix)
profile.set_preference('browser.helperApps.neverAsk.saveToDisk', "application/vnd.ms-excel,application/vnd.openxmlformats-officedocument.spreadsheetml.sheet,application/octet-stream")
profile.set_preference('browser.helperApps.neverAsk.openFile', "application/vnd.ms-excel,application/vnd.openxmlformats-officedocument.spreadsheetml.sheet,application/octet-stream")
browser = webdriver.Firefox(profile)
browser.get("https://docs.ad.sys.com/sites/cloud/Project/Form/FolderCTID=0x01200069047C40C93C3846B74E0776AAD1610A&InitialTabId=Ribbon%2EDocument&VisibilityContext=WSSTabPersistence")
browser.find_element_by_xpath('/html/body/form/div[8]/div/div[3]/div[3]/div[2]/div/div/table/tbody/tr/td/table/tbody/tr/td/div/table[1]/tbody/tr/td/table/tbody/tr[12]/td[4]/div[1]/a').click()
If this does not work for you either, install the Tamper Data add-on in Firefox, observe the content type of the file you are trying to download, and add that exact text to the "browser.helperApps.neverAsk.*" preferences. That should solve your problem.
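Where a browser add-on is not convenient, the content type can sometimes be read from a script instead. The sketch below sends a HEAD request with urllib and builds the comma-separated value for the neverAsk preferences. It assumes the URL is reachable without the browser's login session, which is often not the case for SharePoint, and that the server answers HEAD requests; the helper names are hypothetical.

```python
import urllib.request


def mime_type_of(content_type_header):
    """Strip parameters such as '; charset=utf-8' and lower-case the type."""
    return content_type_header.split(";", 1)[0].strip().lower()


def never_ask_value(mime_types):
    """Join MIME types with commas, dropping empties and duplicates in order."""
    seen = []
    for mime in mime_types:
        if mime and mime not in seen:
            seen.append(mime)
    return ",".join(seen)


def fetch_content_type(url):
    """Request headers only (HEAD) and return the bare MIME type."""
    request = urllib.request.Request(url, method="HEAD")
    with urllib.request.urlopen(request) as response:
        return mime_type_of(response.headers.get("Content-Type", ""))
```

The result of never_ask_value([fetch_content_type(url), "application/octet-stream"]) can then be passed as the value of browser.helperApps.neverAsk.saveToDisk.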
