python requests download through download button on page

So I am creating an application where you can download files through a link. The web page contains a download button that needs to be pressed in order to start the download. This is the link you can refer to: link. My code:
import requests
link = input("enter link: ")
r = requests.get(link, allow_redirects=True)
How can I make requests, or any other library, click the download button and save the file?

Using Selenium:
INSTALLATION
Skip this if you already have Selenium installed.
To install Selenium, type the following in your terminal: pip3 install selenium
Now you need a webdriver for Selenium. If you want to use Chrome, first type "chrome://version/" in your browser's address bar and find the version you are using. Then go to this link and download the webdriver that matches your browser version. If you are using a different browser, such as Firefox, search for "selenium [your browser] webdriver".
Installation docs
CODE
Now for the code (the following code is for Chrome; you would only need to change the driver = webdriver.Chrome(...) line if you are using a different webdriver):
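from selenium import webdriver  # importing webdriver
from selenium.webdriver.chrome.service import Service
from selenium.webdriver.common.by import By
PATH = "C:/path/to/chromedriver"  # webdriver location
driver = webdriver.Chrome(service=Service(PATH))  # Selenium 4 style; older versions accepted webdriver.Chrome(PATH)
link = input("Enter link: ")
driver.get(link)  # going to URL
# clicking on the button (the find_element_by_* helpers are gone in current Selenium 4 releases)
driver.find_element(By.XPATH, "/html/body/div/main/div[3]/div/div/div/div/div[2]/div[2]/span/button").click()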
I used the full XPath to locate the button, but you can use other locator strategies, as described in the documentation.
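If the download button turns out to be an ordinary link (an a element with an href) rather than something driven by JavaScript, a browser may not be needed at all. The following is a minimal sketch under that assumption, using only the standard library. The class name LinkCollector, the helper find_download_links, and the sample HTML are made up for illustration.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkCollector(HTMLParser):
    """Collects (href, text) pairs for every <a> element that has an href."""

    def __init__(self):
        super().__init__()
        self.links = []
        self._href = None
        self._text = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        # Only collect text while we are inside a link
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.links.append((self._href, "".join(self._text).strip()))
            self._href = None


def find_download_links(html, base_url, keyword="download"):
    """Return absolute URLs of links whose visible text mentions the keyword."""
    parser = LinkCollector()
    parser.feed(html)
    return [urljoin(base_url, href) for href, text in parser.links
            if keyword in text.lower()]


# Hypothetical page content; in practice this would be requests.get(link).text
sample = '<p><a href="/about">About</a> <a href="/files/report.zip"><span>Download</span></a></p>'
print(find_download_links(sample, "https://example.com/page"))
```

A URL found this way can be fetched with requests.get(url) and its content written to a file opened in "wb" mode. If the list comes back empty, the button is probably wired up in JavaScript, and Selenium is likely the more practical route.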

Related

Download file from linked HTML ref, use in Selenium python script

I am trying to create an automation process for downloading updated versions of VS Code Marketplace extensions, and have a selenium python script that takes in a list of extension hosting pages and names, navigates to the extension page, clicks on version history tab, and clicks the top (most-recent) download link. I change the driver's chrome options to edit chrome's default download directory to a created folder under that extension's name. (ex. download process from marketplace)
This all works well, but it is extremely time consuming: a new window has to be opened on each iteration for a different extension, because the driver settings have to be reset to change Chrome's download location. Furthermore, the Selenium guidance recommends against clicking download links and suggests capturing the URL and passing it to an HTTP request library instead.
To solve this, I am trying to use urllib to download from an HTTP link to a specified path. This would let me avoid resetting the driver settings on every iteration, so I could run the driver in a single window and just open new tabs to save time overall. urllib documentation
However, when I inspect the download button on an extension, the only link I can find is the href link which has a format like:
https://marketplace.visualstudio.com/_apis/public/gallery/publishers/grimmer/vsextensions/vscode-back-forward-button/0.1.6/vspackage (raw html)
In examples in the documentation the links have a format like:
https://www.facebook.com/favicon.ico
with the filename on the end.
I have tried multiple functions from urllib to download from that href link, but it doesn't seem to recognize it, so I'm not sure whether there is any way to get a link that looks like the format from the documentation, or whether there is some other solution.
Also, urllib seems to require the file name (i.e. extensionversionnumber.vsix) at the end of the path to download to a specified location, but I can't seem to pull the file name from the HTML either.
import os
from struct import pack
import time
import pandas as pd
import urllib.request
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.wait import WebDriverWait
inputLocation=input("Enter csv file path: ")
fileLocation=os.path.abspath(inputLocation)
inputPath=input("Enter path to where packages will be stored: ")
workingPath=os.path.abspath(inputPath)
df=pd.read_csv(fileLocation)
hostingPages=df['Hosting Page'].tolist()
packageNames=df['Package Name'].tolist()
chrome_options = webdriver.ChromeOptions()
def downloadExtension(url, folderName):
    os.chdir(workingPath)
    if not os.path.exists(folderName):
        os.makedirs(folderName)
    filepath=os.path.join(workingPath, folderName)
    chrome_options.add_experimental_option("prefs", {
        "download.default_directory": filepath,
        "download.prompt_for_download": False,
        "download.directory_upgrade": True
    })
    driver=webdriver.Chrome(options=chrome_options)
    wait=WebDriverWait(driver, 20)
    driver.get(url)
    wait.until(lambda d: d.find_element(By.ID, "versionHistory"))
    driver.find_element(By.ID, "versionHistory").click()
    wait.until(lambda d: d.find_element(By.LINK_TEXT, "Download"))
    #### attempt to use urllib to download by html request rather than click ####
    link=driver.find_element(By.LINK_TEXT, "Download").get_attribute('href')
    urllib.request.urlretrieve(link, filepath)
    #### above line does not work ####
    driver.quit()
for i in range(len(hostingPages)):
    downloadExtension(hostingPages[i], packageNames[i])
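One possible way around both problems, sketched under assumptions: urlretrieve is being handed a folder where it needs a full file path, and the file name can often be read from the response's Content-Disposition header. Whether the Marketplace endpoint actually sends that header is an assumption here, so the helper takes a fallback name. The function names below are hypothetical.

```python
import os
import shutil
import urllib.request
from email.message import Message


def filename_from_content_disposition(header_value, fallback):
    """Extract the filename parameter from a Content-Disposition value."""
    if not header_value:
        return fallback
    msg = Message()
    msg["Content-Disposition"] = header_value
    name = msg.get_filename()
    # basename() guards against a server-supplied name that contains a path
    return os.path.basename(name) if name else fallback


def download_to_folder(url, folder, fallback_name):
    """Stream url into folder and return the path of the saved file."""
    os.makedirs(folder, exist_ok=True)
    with urllib.request.urlopen(url) as response:
        name = filename_from_content_disposition(
            response.headers.get("Content-Disposition"), fallback_name)
        target = os.path.join(folder, name)
        with open(target, "wb") as out:
            shutil.copyfileobj(response, out)
    return target
```

In the script above, a call such as download_to_folder(link, filepath, folderName + ".vsix") could stand in for the urlretrieve line. Because the download no longer depends on Chrome's download directory, a single driver instance could in principle be reused across all extensions.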

How do I get Microsoft Edge to click on SAVE AS with Python

In our company we use Selenium for web automation on Windows. Windows runs with a signed-in user and a locked screen. Recently the company changed its security settings so that downloads no longer start automatically. I need to click the Save As button with Python (Windows is locked, so pyautogui is not an option). What about pywinauto or another library? Thanks.

If you can still use Selenium, try this out:
NOTE - Tested using Windows 11, Edge 97, Python 3.9, and Selenium 4.1
import os
import time
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.edge.service import Service
# 'executable_path' is deprecated. Use 'service'
driver = webdriver.Edge(
    service=Service("C:\\Users\\kabarto\\Documents\\webdrivers\\msedgedriver.exe"))
driver.get("https://developer.microsoft.com/en-us/microsoft-edge/tools/webdriver/")
# Find the link and use .click to save to Downloads
driver.find_element(
    By.XPATH,
    "//a[@href=\"https://msedgedriver.azureedge.net/97.0.1072.62/edgedriver_win64.zip\"]"
).click()
# Close browser when download is complete
while not os.path.exists("C:\\Users\\kabarto\\Downloads\\edgedriver_win64.zip"):
    time.sleep(1)
driver.quit()
You can also get a list of all the links, and then go through them to find the one you want:
links = driver.find_elements(By.TAG_NAME, "a")
for l in links:
    href = l.get_attribute("href")  # None for anchors without an href
    if href and "win64" in href:
        print(href)
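The polling loop in the first snippet waits forever if the download never finishes. A small helper with a timeout is one way to bound that. This is only a sketch: the name wait_for_file, the default timeout, and the polling interval are arbitrary choices made for illustration.

```python
import os
import time


def wait_for_file(path, timeout=60.0, poll_interval=0.5):
    """Return True once path exists, or False if timeout seconds pass first."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        if os.path.exists(path):
            return True
        time.sleep(poll_interval)
    # One last check in case the file appeared during the final sleep
    return os.path.exists(path)
```

The script could call wait_for_file on the expected download path and call driver.quit() either way, reporting an error when the result is False. Note that the file existing does not by itself guarantee the browser has finished writing it.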

I figured it out (workaround):
Before downloading, open a new tab with Edge Downloads.
Launch the download (this triggers the pop-up).
Switch to the Downloads tab and use Selenium to click the option you need (Cancel, Save, Save As, ...).

How do I use a feature on a website using python

I am trying to figure out how to activate/click on a feature using python. Like it goes to a page and click on a certain button. How can I do this? Are there any modules that may help?

Try using the selenium package in Python.
Once you pip install selenium and download chromedriver, you should be able to use something like this -
from selenium import webdriver
from selenium.webdriver.chrome.service import Service
from selenium.webdriver.common.by import By
url = "your_url"
chrome_options = webdriver.ChromeOptions()
driver = webdriver.Chrome(service=Service("/path/to/chromedriver"), options=chrome_options)
driver.delete_all_cookies()
driver.get(url)
After your page opens, first find the element using the browser's inspector, and then, based on its name/id/class/etc., you can click on it using -
driver.find_element(By.NAME, '<element_name>').click()

Selenium - unable to submit comment

I am trying to write a script that would allow me to submit comments to a news website programmatically.
I am using Selenium and here is my script (with the exact link I am trying to work with):
from selenium import webdriver
from selenium.webdriver.common.keys import Keys
driver = webdriver.Chrome()
url = "https://www.delfi.lt/en/lifestyle/earth-day-events-for-the-spring-equinox.d?id=87005127"
driver.get(url)
# Clicking 'I agree' on a cookies banner:
cookies_ok = '//*[@id="c-right"]/a'
driver.find_element_by_xpath(cookies_ok).click()
# XPath list
anon = '//*[@id="comments-listing"]/div[2]/div/div[2]/div/ul/li[1]/span'
name = '//*[@id="inputDiv"]/div/form/input'
comment = '//*[@id="inputDiv"]/div/form/div[3]/div/textarea'
button = '//*[@id="inputDiv"]/div/form/div[4]/div[2]/button[1]'
# Click 'Anonymous' -> fill name and comment fields -> press PUBLISH
driver.find_element_by_xpath(anon).click()
driver.find_element_by_xpath(name).send_keys('name')
driver.find_element_by_xpath(comment).send_keys('comment')
driver.find_element_by_xpath(button).click()
Everything works, but when the last command is executed, I get this message on the website:
"Cookies are blocked or not supported by your browser". However, when I follow the same steps myself in browser, there are no issues with cookies.
Any ideas on how to prevent this error?
Thanks

Try using this:
driver = webdriver.Chrome(executable_path=webdriver_manager.chrome.ChromeDriverManager().install())
This downloads a chromedriver that matches your installed Chrome (it does not install the browser itself), after which your test should run.
You may need to install webdriver-manager with pip install webdriver-manager and add import webdriver_manager.chrome to your script.

You can optimize the code as follows; I assume you have the latest binaries:
from selenium import webdriver
from selenium.webdriver.common.action_chains import ActionChains
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.support.ui import WebDriverWait
driver = webdriver.Chrome("C:\\Users\\***\\Desktop\\Selenium+Python\\chromedriver.exe")
driver.maximize_window()
wait = WebDriverWait(driver, 30)
driver.get("https://www.delfi.lt/en/lifestyle/earth-day-events-for-the-spring-equinox.d?id=87005127")
wait.until(EC.element_to_be_clickable((By.XPATH, "//*[@id='c-right']/a"))).click()
ActionChains(driver).move_to_element(wait.until(EC.element_to_be_clickable((By.CSS_SELECTOR, "li.as-link:first-child")))).click().perform()
wait.until(EC.element_to_be_clickable((By.CSS_SELECTOR, "input.input-name"))).send_keys("denisafonin")
wait.until(EC.element_to_be_clickable((By.CSS_SELECTOR, "textarea.input-message"))).send_keys("Your comment")
wait.until(EC.element_to_be_clickable((By.CSS_SELECTOR, "button.input-login"))).click()

Navigating to a web page and downloading a report using Python

Could you please let me know how to improve the following script to actually click on the export button.
The following script goes to the report's page but does not click on the export button:
from selenium import webdriver
options = webdriver.ChromeOptions()
options.add_argument("<Path to Chrome profile>") #Path to your chrome profile
url = '<URL of the report>'
driver = webdriver.Chrome(executable_path="C:/tools/selenium/chromedriver.exe", chrome_options=options)
driver.get(url)
exportButton = driver.find_element_by_xpath('//*[@id="js_2o"]')
clickexport = exportButton.click()
How would you make the script actually click on the export button?
I would appreciate your help.
Thank you!

Try with XPath, for example:
driver.find_element_by_xpath('//button[@id="export_button"]').click()

Selenium isn't designed for this. Do you actually care about using Selenium and the browser, or do you just want the file? If the latter, use requests. You can use the browser network inspector, right click->"copy as curl" to get all the headers and cookies you need.
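To make that last suggestion concrete, here is a rough sketch of replaying a request with headers and cookies copied from the browser's network inspector. It uses the standard library's urllib.request instead of requests so that it runs without extra installs; with requests, the same values would be passed as requests.get(url, headers=..., cookies=...). The URL, header values, and cookie names are placeholders, not values from any real site, and build_request and save_response are hypothetical helper names.

```python
import shutil
import urllib.request


def build_request(url, headers, cookies):
    """Build a GET request carrying the given headers and cookies."""
    request = urllib.request.Request(url, headers=dict(headers))
    if cookies:
        cookie_header = "; ".join(f"{k}={v}" for k, v in cookies.items())
        request.add_header("Cookie", cookie_header)
    return request


def save_response(request, target_path):
    """Send the request and stream the response body to target_path."""
    with urllib.request.urlopen(request) as response, open(target_path, "wb") as out:
        shutil.copyfileobj(response, out)


# Placeholder values; copy the real ones from "copy as curl"
req = build_request(
    "https://example.com/report/export",
    headers={"User-Agent": "Mozilla/5.0", "Referer": "https://example.com/report"},
    cookies={"sessionid": "abc123", "csrftoken": "xyz789"},
)
print(req.get_header("Cookie"))
```

Calling save_response(req, "report.csv") would then perform the download. Session cookies copied this way expire, so this approach suits one-off or short-lived scripts better than long-running automation.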
