How to download a file using the Python requests module

I need to open the page automatically and download the file returned by the server.
I have some simple code to open the page and download the content. I am also pulling the headers so I know the name of the returned file. Below is the code:
downloadPageRequest = self.reqSession.get(self.url_file, stream=True)
headers = downloadPageRequest.headers
if 'content-disposition' in headers:
    file_name = re.findall("filename=(.+)", headers['content-disposition'])
That's what I have; it returns a list with the filename, but now I am stuck and have no idea how to open and go through the returned Excel file.
This has to be done using requests, which is why I cannot use any other method (e.g. Selenium).
I will be thankful for your support.
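In case a sketch helps with the next step: the idea would be to write the streamed response to disk under the name taken from Content-Disposition and then read it back with a spreadsheet library such as openpyxl. This is only a rough, untested outline; the URL and session below are placeholders for self.reqSession and self.url_file, and it assumes the server returns an .xlsx file.

import re
import requests
from openpyxl import load_workbook  # assumes the returned file is an .xlsx workbook

reqSession = requests.Session()             # stands in for self.reqSession
url_file = "https://example.com/export"     # placeholder for self.url_file

response = reqSession.get(url_file, stream=True)
response.raise_for_status()

# Fall back to a default name if the header is missing or unparseable
file_name = "download.xlsx"
disposition = response.headers.get("content-disposition", "")
match = re.findall('filename="?([^";]+)"?', disposition)
if match:
    file_name = match[0]

# Write the streamed body to disk in chunks
with open(file_name, "wb") as f:
    for chunk in response.iter_content(chunk_size=8192):
        f.write(chunk)

# Walk through the rows of the first worksheet
workbook = load_workbook(file_name)
sheet = workbook.active
for row in sheet.iter_rows(values_only=True):
    print(row)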

Related

How to download a file with .torrent extension from link with Python

I tried using wget:
import wget

url = "https://yts.lt/torrent/download/A4A68F25347C709B55ED2DF946507C413D636DCA"
wget.download(url, 'c:/path/')
The result was that I got a file with the name A4A68F25347C709B55ED2DF946507C413D636DCA and without any extension.
Whereas when I put the link in the address bar and press Enter, a torrent file gets downloaded.
EDIT:
The answer must be generic, not case dependent.
There must be a way to download .torrent files with their original names.
You can get the filename from the Content-Disposition header, i.e.:
import re, requests, traceback

try:
    url = "https://yts.lt/torrent/download/A4A68F25347C709B55ED2DF946507C413D636DCA"
    r = requests.get(url)
    d = r.headers['content-disposition']
    fname = re.findall('filename="(.+)"', d)
    if fname:
        with open(fname[0], 'wb') as f:
            f.write(r.content)
except:
    print(traceback.format_exc())
The code above is for Python 3. I don't have Python 2 installed, and I normally don't post code without testing it.
Have a look at https://stackoverflow.com/a/11783325/797495, the method is the same.
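If you prefer to stay with the standard library, a rough Python 3 urllib.request variant of the same idea might look like this (untested sketch, same header parsing as above):

import re
import urllib.request

url = "https://yts.lt/torrent/download/A4A68F25347C709B55ED2DF946507C413D636DCA"
with urllib.request.urlopen(url) as resp:
    disposition = resp.headers.get("Content-Disposition", "")
    fname = re.findall('filename="(.+)"', disposition)
    name = fname[0] if fname else "download.torrent"  # fallback name if the header is absent
    with open(name, "wb") as f:
        f.write(resp.read())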
I found a way to download the torrent files with their original names, just as if they had actually been downloaded by putting the link in the browser's address bar.
The solution consists of opening the user's browser from Python:
import webbrowser
url = "https://yts.lt/torrent/download/A4A68F25347C709B55ED2DF946507C413D636DCA"
webbrowser.open(url, new=0, autoraise=True)
Read more:
Call to operating system to open url?
However, the downside is:
I don't get the option to choose the folder where I want to save the file (unless I change it in the browser, but still, in case I want to save torrents that match some criteria in another path, it won't be possible).
And of course, your browser goes insane opening all those links XD

Download files using Python

I'm trying to download a few files using RoboBrowser, urllib, or any other Python library, but I couldn't find a way to make it work.
Basically, I have a form which retrieves a .CSV file when it is submitted, but I couldn't find any way to start this download.
I have submitted the form using RoboBrowser and a urllib POST, but I couldn't reach the file:
Form = browser.get_form(action=re.compile(r'downloadForm'))
Form["d_screen_file"].value = "1"
browser.submit_form(Form, submit=Form['download'])
or
action = browser.find('form', id='fx_form').get('action')
requests.post(action)
Is there another way to submit this form / make this request to trigger the download?
I figured out how to make it work:
Using requests, I do a POST with stream=True:
f = session.post(FormRequest, data=search_data, stream=True)
After that, I create a CSV file to receive the data and use a for loop to consume it with iter_content and write it to the file:
with open("file.csv", 'wb') as s:
for chunk in f.iter_content(chunk_size=1024):
s.write(chunk)
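Putting the two pieces together, a self-contained sketch could look like this; the form action URL and field names are hypothetical placeholders for whatever the downloadForm actually expects:

import requests

session = requests.Session()
form_action = "https://example.com/downloadForm"   # hypothetical action URL
search_data = {"d_screen_file": "1"}               # hypothetical form fields

# stream=True keeps the whole CSV from being loaded into memory at once
f = session.post(form_action, data=search_data, stream=True)
f.raise_for_status()

with open("file.csv", "wb") as s:
    for chunk in f.iter_content(chunk_size=1024):
        if chunk:  # skip keep-alive chunks
            s.write(chunk)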

How to download file using Python and authentication

I am trying to download a .tar file through a link, not directly from the file. If you browse to the page, a popup appears with "Select path to download".
The URL looks like this: http://document.internal.somecompany.com/Download?DocNo=2/1449-CUF10101/1
I am new to Python.
This is my code so far for this part of the project:
manager = urllib2.HTTPPasswordMgrWithDefaultRealm()
manager.add_password(None, url, secrets["user"], secrets["password"])
#Create an authentication handler using the password manager
auth = urllib2.HTTPBasicAuthHandler(manager)
#Create an opener that will replace the default urlopen method on further calls
opener = urllib2.build_opener(auth)
urllib2.install_opener(opener)
#Here you should access the full url you wanted to open
response = urllib2.urlopen(url)
print response
Printing the response returns this: <addinfourl at 139931044873856 whose fp = <socket._fileobject object at 0x7f443c35e8d0>>
I do not know how to go further, or whether the response is anywhere near correct. I need to open the .tar and access a RAML file in it, and I do not know if I need to download the file and open it, or just open it directly and print out the RAML file.
Any suggestions?
You need to open a file in 'wb' mode and write the content of the file you are trying to download, as in the following:
response = urllib2.urlopen(url)
with open(os.path.basename(url), 'wb') as wf:
    wf.write(response.read())
You can also specify a full path instead of using just os.path.basename(url).
Have a look at tarfile for more info on how to deal with .tar files.
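To then pull the RAML file out of the archive, a tarfile sketch along these lines could work; the archive path and the .raml extension check are assumptions, since the exact member name isn't known:

import tarfile

# Open the archive that was saved above and look for a .raml member
with tarfile.open("archive.tar") as tar:  # replace with the path you wrote the download to
    for member in tar.getmembers():
        if member.name.endswith(".raml"):
            extracted = tar.extractfile(member)
            if extracted is not None:
                print(extracted.read().decode("utf-8"))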

Upload a file using the Python requests module

I am trying to upload a file using the Python requests module, and I am not sure whether we can use both data and files in the POST call.
fileobj = open(filename, 'rb')
upload_data = {
    'data': payload,
    'file': fileobj
}
resp = s.post(upload_url, data=upload_data, headers=upload_headers)
This is not working, so can anyone help me with this?
I think you should be using the data and files keyword parameters in the post request to send the data and file respectively.
with open(filename, 'rb') as fileobj:
    files = {'file': fileobj}
    resp = s.post(upload_url, data=payload, files=files, headers=upload_headers)
I've also used a context manager, just because it closes the file for me and takes care of exceptions that happen either while opening the file or during the requests post.
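As a quick sanity check that both parts arrive in one request, you could post to an echo service such as httpbin, which reflects the form fields and file back in its JSON response (the field names and file here are only placeholders):

import requests

payload = {"description": "example upload"}   # placeholder form fields
with open("report.csv", "rb") as fileobj:     # placeholder file
    files = {"file": fileobj}
    resp = requests.post("https://httpbin.org/post", data=payload, files=files)

print(resp.json()["form"])    # echoed form fields
print(resp.json()["files"])   # echoed file contents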

Dynamically Export URL Document To Server Using Python

I'm writing a script in Python and I'm trying to wrap my head around a problem. I have a URL that, when opened, downloads a document. I'm trying to write a Python script that opens the HTTPS URL that downloads this document and automatically sends that document to a server I have opened using Python's pysftp module.
I can't wrap my head around how to do this... Do you think I'd be able to just do:
server.put(urllib.open('https://......./document'))
EDIT:
This is the code I've tried before; the above doesn't work...
download_file = urllib2.urlopen('https://somewebsite.com/file.csv')
file_contents = download_file.read().replace('"', '')
columns = [x.strip() for x in file_contents.split(',')]

# Write Downloaded File Contents To New CSV File
with open('file.csv', 'wb') as f:
    writer = csv.writer(f)
    writer.writerow(columns)

# Upload New File To Server
srv.put('./file.csv', './SERVERFOLDER/file.csv')
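As a side note on the first idea: passing urllib's response object straight to server.put won't work, because put expects a local file path. One way to skip the intermediate file is to hand a file-like object to putfo instead. A rough sketch, assuming srv is the already-open pysftp connection used above:

import io
import requests

# Fetch the document into memory and push the bytes straight to the SFTP server
resp = requests.get("https://somewebsite.com/file.csv")
resp.raise_for_status()

buffer = io.BytesIO(resp.content)
srv.putfo(buffer, "./SERVERFOLDER/file.csv")  # putfo accepts a file-like object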
ALSO:
How would I go about getting a file that is one day old from the server (examining the age of each file), using paramiko?
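For the follow-up question, one hedged approach with paramiko is to list the remote directory and compare each entry's st_mtime against a cutoff of roughly one day ago; the host, credentials, and folder below are placeholders:

import time
import paramiko

transport = paramiko.Transport(("sftp.example.com", 22))   # placeholder host
transport.connect(username="user", password="password")    # placeholder credentials
sftp = paramiko.SFTPClient.from_transport(transport)

one_day_ago = time.time() - 24 * 60 * 60
for entry in sftp.listdir_attr("./SERVERFOLDER"):           # placeholder folder
    if entry.st_mtime >= one_day_ago:
        print(entry.filename, "was modified within the last day")

sftp.close()
transport.close()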
