How can I automate downloads using IDM? - python

I want to automate downloading using Selenium in Python, which in turn passes the link to IDM. However, I can't get the download to start in IDM.

This is not good practice in Selenium automation
Whilst it is possible to start a download by clicking a link with a browser under Selenium’s control, the API does not expose download progress, making it less than ideal for testing downloaded files. This is because downloading files is not considered an important aspect of emulating user interaction with the web platform. Instead, find the link using Selenium (and any required cookies) and pass it to an HTTP request library like libcurl.
Please refer to the SeleniumHQ site.
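A minimal sketch of that approach, assuming a page with a plain download link (the URL and link text below are placeholders):
import requests
from selenium import webdriver
from selenium.webdriver.common.by import By

driver = webdriver.Chrome()
driver.get("https://example.com/downloads")  # hypothetical page

# Read the direct href from the link instead of clicking it
link = driver.find_element(By.LINK_TEXT, "Download").get_attribute("href")

# Carry the session cookies over so the server accepts the request
cookies = {c["name"]: c["value"] for c in driver.get_cookies()}
driver.quit()

response = requests.get(link, cookies=cookies)
with open("download.bin", "wb") as f:
    f.write(response.content)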

This is the syntax to run IDM from Python. It downloads to the default local path; add the '/p' parameter to change the save folder if needed.
from subprocess import run

# Use a raw string so the backslashes are not treated as escape sequences
idm_path = r"C:\Program Files (x86)\Internet Download Manager\idman.exe"
url = "example url"
filename = "song.mp3"

# /d = URL to download, /f = local file name, /n = silent mode (no questions)
run([idm_path, '/d', url, '/f', filename, '/n'])
source: Start IDM download from command line.
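If you do want to redirect the save folder, the same call with '/p' might look like this (the folder path here is just an assumption):
# /p sets the local path the file is saved to (example folder)
run([idm_path, '/d', url, '/p', r'C:\Downloads', '/f', filename, '/n'])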

Related

How to download a file that takes 5 seconds to finish with python?

I am trying to write some code that downloads a file. For this website specifically, once you go to the link it takes about 5 seconds before the download is actually prompted, for example: https://sourceforge.net/projects/esp32-s2-mini/files/latest/download
I have tried using the obvious methods, such as wget.download and urllib.request.urlretrieve
urllib.request.urlretrieve('https://sourceforge.net/projects/esp32-s2-mini/files/latest/download', 'zzz')
wget.download('https://sourceforge.net/projects/esp32-s2-mini/files/latest/download', 'zzzdasdas')
However, that does not work: it downloads something, but not the file I want.
Any suggestions would be great.
Using Chrome's downloads page (Ctrl+J should open it, or just click "Show all" when downloading a file), we can see all of our recent downloads. The link you provided is just the page that begins the download, not the location of the actual file itself. Right-clicking the blue file name lets us copy the address of the actual file being downloaded.
The actual link of the file, in this case, is https://cfhcable.dl.sourceforge.net/project/esp32-s2-mini/ToolFlasher/NodeMCU-PyFlasher-3.0-x64.exe
We can then make a GET request to download the file. Testing this out with bash wget downloads the file properly.
wget https://versaweb.dl.sourceforge.net/project/esp32-s2-mini/ToolFlasher/NodeMCU-PyFlasher-3.0-x64.exe
You can, of course, use python requests to accomplish this as well.
import requests

response = requests.get(r"https://cfhcable.dl.sourceforge.net/project/esp32-s2-mini/ToolFlasher/NodeMCU-PyFlasher-3.0-x64.exe")
with open("NodeMCU-PyFlasher-3.0-x64.exe", "wb") as f:
    f.write(response.content)  # write the raw bytes of the response body
Note that we are using wb (write bytes) mode instead of the default w (write).
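For a large binary like this, it may also be worth streaming the response instead of holding the whole file in memory; a sketch of the same download, streamed:
import requests

url = "https://cfhcable.dl.sourceforge.net/project/esp32-s2-mini/ToolFlasher/NodeMCU-PyFlasher-3.0-x64.exe"
with requests.get(url, stream=True) as response:
    response.raise_for_status()
    with open("NodeMCU-PyFlasher-3.0-x64.exe", "wb") as f:
        # iterate over the body in 8 KB chunks rather than loading it all at once
        for chunk in response.iter_content(chunk_size=8192):
            f.write(chunk)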

Can I manipulate an image in the browser with github pages?

Is it possible to upload and manipulate a photo in the browser with GitHub-pages? The photo doesn't need to be stored else than just for that session.
PS. I'm new to this area and I am using python to manipulate the photo.
GitHub Pages allows users to create static HTML sites. This means you have no control over the server which hosts the HTML files - it is essentially a file server.
Even if you did have full control over the server (e.g. if you hosted your own website), it would not be possible to allow the client to run Python code in the browser since the browser only interprets JavaScript.
Therefore the easiest solution is to rewrite your code in JavaScript.
Failing this, you could offer a download link to your Python script, and have users trust you enough to run it on their computer.

Downloading a file from a html? url with python 3

I've been searching for hours on how to download a file; the documentation shows how to do it, but cygwin is horrible and an annoyance to use, and I'm trying to implement this in Python 3 for a program. I've tried urllib, requests, wget (in Python), httplib and some others, but they only fetch the redirected page (the same page you get if you paste the properly formatted URL into the address bar).
Though when I inspect the page and trigger the download link that has the same address I tried, it works properly and gives me a download pop-up. Here is an example page; the link is triggered by clicking "Download data".
I don't understand why no Python package is able to send the proper GET request, and why I would need to implement this program on Linux only to be able to use wget.
Anyone has a clue on how to properly call the url?
You need to add &submit=Download+Data to the end of your URL to download the data. You can see this in the Network tab of Google Chrome's developer tools. Hope I helped!
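In Python that just means including the form field in the query string; a sketch with requests, where the base URL and the other parameters are placeholders for the asker's actual link:
import requests

params = {
    "dataset": "example",       # hypothetical existing query parameters
    "format": "csv",
    "submit": "Download Data",  # the missing field; encoded as Download+Data
}
response = requests.get("https://example.org/data", params=params)
with open("data.csv", "wb") as f:
    f.write(response.content)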
I think
from subprocess import call

def download(url):
    # shell out to curl; without -o the response body goes to stdout
    cmd = ['curl', url]
    call(cmd)
to run this:
download('www.download.com/blah/bah/blah')
if you want to use this from the interpreter:
save as module.py
python -i /path/to/module.py
>>>download('www.download.com/blah/bah/blah')
P.S. If this works I'll probably use it in my shell program.
EDIT: my comment:
I tried this and got "malformed url" error
from subprocess import call

def download(filename, url):
    # filename = file to save to
    # url = where to download from
    cmd = ['curl', '-o', filename, url]
    call(cmd)
This is what I do for all system commands from Python, so it's something to do with curl specifically.
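One guess at working around that error (the cause here is an assumption): strip stray whitespace, make sure the URL carries a scheme before handing it to curl, and use subprocess.run with check=True so a failure raises instead of passing silently:
from subprocess import run

def download(filename, url):
    # Both fixes are assumptions about what curl is objecting to:
    # stray whitespace around the URL, or a missing http:// scheme
    url = url.strip()
    if not url.startswith(("http://", "https://")):
        url = "http://" + url
    run(["curl", "-o", filename, url], check=True)  # raises CalledProcessError on failure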

How to download a file pushed to a browser using python?

I want to download a zip file using python.
With this type of url,
http://server.com/file.zip
this is quite simple using urllib2.urlopen and writing the result to a local file.
But in my case I have this type of url:
http://server.com/customer/somedata/download?id=121&m=zip,
the download is launched after a form validation.
It may be worth mentioning that in my case I want to deploy this on Heroku, so I can't use Spynner, which is built with C++. The download is launched after a scraping step that uses Scrapy.
From a browser the download works well; I get a good zip file with its name. Using Python I just get HTML and header data...
Is there any way to get a file from this type of url in python ?
This site is serving JavaScript, which then invokes the download.
You have no choice but to: a) evaluate the JavaScript in a simulated browser environment, or b) manually parse what the JS does and re-implement it in Python, e.g. string-extract the URL and download key, possibly invoke an AJAX request, and finally download the file.
I generally recommend Mechanize for webpage related automation, but it cannot deal with JavaScript either, so I guess you can stick with Scrapy if you want to go for plan b).
When you do the download in the browser, open up the network tab of the developer console and record what HTTP method (probably POST), the POST parameters, the cookie, and everything else that is part of the validation; then use a library to replicate that.
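A sketch of what replaying such a recorded request could look like with requests; every parameter, cookie and form field below is a placeholder to be read off the network tab:
import requests

session = requests.Session()
session.cookies.set("sessionid", "value-copied-from-the-browser")  # hypothetical cookie

response = session.post(
    "http://server.com/customer/somedata/download",
    params={"id": "121", "m": "zip"},
    data={"accept_terms": "yes"},  # hypothetical form field from the validation step
)
with open("file.zip", "wb") as f:
    f.write(response.content)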

python mechanize blank download or how to do it in casperjs

I am downloading information for a research project from a site that uses AJAX to load URLs and does not allow serial downloading. I dump the URLs from CasperJS into a file, read it, and use browser.retrieve(url, dump_filename) to download the information with Mechanize. I mostly get blank file downloads, but they are periodically filled with content. Is there a way to modify the headers so that I always get data? A CasperJS download alternative is also welcome; I have tried CasperJS's download(), but it saves a blank file as well. I think it has something to do with the headers. File downloads always work in a browser.
I prefer Selenium over Mechanize when it comes to more "sophisticated" websites that use AJAX, JS, etc.
You said downloading works when you use your browser. Selenium does the same thing: it drives Firefox on your desktop to fulfill its tasks.
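If you go the Selenium route, Firefox can be told to save files without showing the download dialog; a sketch, where the folder and MIME type are assumptions about the site:
from selenium import webdriver
from selenium.webdriver.firefox.options import Options

options = Options()
options.set_preference("browser.download.folderList", 2)         # use a custom folder
options.set_preference("browser.download.dir", "/tmp/downloads")
options.set_preference("browser.helperApps.neverAsk.saveToDisk",
                       "application/octet-stream")               # skip the save dialog

driver = webdriver.Firefox(options=options)
driver.get("https://example.com/report")  # hypothetical AJAX-driven page
# ... navigate and trigger the download as you would in the browser ...
driver.quit()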
