I am trying to download a .tar file through a link that does not point directly to the file. If you browse to the page, a popup appears with "Select path to download".
The URL looks like this: http://document.internal.somecompany.com/Download?DocNo=2/1449-CUF10101/1
I am new to Python.
This is my code so far for this part of the project:
import urllib2
manager = urllib2.HTTPPasswordMgrWithDefaultRealm()
manager.add_password(None, url, secrets["user"], secrets["password"])
# Create an authentication handler using the password manager
auth = urllib2.HTTPBasicAuthHandler(manager)
# Create an opener that will replace the default urlopen method on further calls
opener = urllib2.build_opener(auth)
urllib2.install_opener(opener)
# Here you should access the full url you wanted to open
response = urllib2.urlopen(url)
print response
Printing the response returns this: <addinfourl at 139931044873856 whose fp = <socket._fileobject object at 0x7f443c35e8d0>>
I do not know how to go further, or whether the response is even close to correct. I need to open the .tar and access a RAML file inside it, and I do not know whether I need to download the file and then open it, or whether I can open it directly and print out the RAML file.
Any suggestions?
You need to open a file in 'wb' mode and write the content of the download into it, as follows:
import os
response = urllib2.urlopen(url)
with open(os.path.basename(url), 'wb') as wf:
    wf.write(response.read())
You can also specify a path instead of using just os.path.basename(url). For a URL like the one in the question the basename is only "1", so an explicit file name is probably clearer.
Also have a look at the tarfile module for more info on how to deal with .tar files.
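To make the tarfile hint concrete, here is a minimal Python 3 sketch that reads matching files straight out of the downloaded bytes without unpacking anything to disk. The function name read_members_with_suffix and the ".raml" default are my own choices for illustration, not something from the question, and I am assuming the archive is small enough to hold in memory.

```python
import io
import tarfile


def read_members_with_suffix(tar_bytes, suffix=".raml"):
    """Return {member name: content as bytes} for regular files ending in suffix.

    Hypothetical helper: tar_bytes is the raw archive, e.g. response.read().
    """
    found = {}
    # mode "r:*" lets tarfile detect the compression (none, gzip, bz2, xz) itself
    with tarfile.open(fileobj=io.BytesIO(tar_bytes), mode="r:*") as archive:
        for member in archive.getmembers():
            if member.isfile() and member.name.endswith(suffix):
                found[member.name] = archive.extractfile(member).read()
    return found
```

With the response from the question, something like read_members_with_suffix(response.read()) should give you the RAML content to print, assuming the server really returns the tar bytes and not an HTML page.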
Related
I tried using wget:
url = "https://yts.lt/torrent/download/A4A68F25347C709B55ED2DF946507C413D636DCA"
wget.download(url, 'c:/path/')
The result was a file named A4A68F25347C709B55ED2DF946507C413D636DCA, without any extension.
Whereas when I paste the link into the browser's address bar and press Enter, a .torrent file gets downloaded.
EDIT:
The answer must be generic, not specific to this one case.
It must be a way to download .torrent files with their original names.
You can get the filename from the Content-Disposition header, e.g.:
import re, requests, traceback
try:
    url = "https://yts.lt/torrent/download/A4A68F25347C709B55ED2DF946507C413D636DCA"
    r = requests.get(url)
    d = r.headers['content-disposition']
    # the regex expects a quoted value, e.g. filename="name.torrent"
    fname = re.findall('filename="(.+)"', d)
    if fname:
        with open(fname[0], 'wb') as f:
            f.write(r.content)
except Exception:
    print(traceback.format_exc())
The code above is for Python 3. I don't have Python 2 installed, and I normally don't post code without testing it.
Have a look at https://stackoverflow.com/a/11783325/797495; the method is the same.
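Since the question asks for something generic: servers do not always quote the filename, so a regex that requires quotes can miss it. Below is a small Python 3 sketch that lets the standard library's header parser do the work. The helper name filename_from_content_disposition and the "download.bin" fallback are made up for this example, and I have only thought it through for simple headers like the ones shown in the comments.

```python
import os
from email.message import Message


def filename_from_content_disposition(header_value, default="download.bin"):
    """Extract a file name from a Content-Disposition header value.

    Handles both filename="x.torrent" and filename=x.torrent.
    Falls back to `default` (an arbitrary placeholder) when no name is present.
    """
    msg = Message()
    msg["Content-Disposition"] = header_value
    name = msg.get_filename()
    if not name:
        return default
    # Keep only the last path component so the server cannot choose the directory
    name = os.path.basename(name.replace("\\", "/"))
    return name or default
```

You would call it as filename_from_content_disposition(r.headers.get('content-disposition', '')) and then open the returned name in 'wb' mode as in the answer above.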
I found a way to get the torrent files downloaded with their original names, just as if they had been downloaded by putting the link in the browser's address bar.
The solution consists of opening the user's browser from Python:
import webbrowser
url = "https://yts.lt/torrent/download/A4A68F25347C709B55ED2DF946507C413D636DCA"
webbrowser.open(url, new=0, autoraise=True)
Read more:
Call to operating system to open url?
However, the downsides are:
I don't get the option to choose the folder where the file is saved (unless I change it in the browser, and even then, if I want to save torrents that match some criteria to a different path, it won't be possible).
And of course, your browser goes insane opening all those links XD
I need to open the page automatically and download the file returned by the server
I have some simple code that opens the page and downloads the content. I also read the headers so that I know the name of the returned file. Below is the code:
downloadPageRequest = self.reqSession.get(self.url_file, stream=True)
headers = downloadPageRequest.headers
if 'content-disposition' in headers:
    file_name = re.findall("filename=(.+)", headers['content-disposition'])
That's what I have so far. It returns a list with the filename, but now I am stuck and have no idea how to open and go through the returned Excel file.
This has to be done using requests, so I cannot use any other method (e.g. Selenium).
I would be thankful for your support.
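Not a full answer, but one possible next step, given that the request already uses stream=True: write the body to disk chunk by chunk and open the saved file afterwards. The sketch below is Python 3 and takes any iterable of byte chunks. save_chunks is a name invented for this example; with requests the chunks would come from downloadPageRequest.iter_content(chunk_size=8192).

```python
def save_chunks(chunks, dest_path):
    """Write an iterable of byte chunks to dest_path and return the bytes written.

    Hypothetical helper: `chunks` can be response.iter_content(...) from requests
    or any other iterable that yields bytes.
    """
    written = 0
    with open(dest_path, "wb") as out:
        for chunk in chunks:
            if chunk:  # skip empty chunks
                out.write(chunk)
                written += len(chunk)
    return written
```

Because the regex in the question does not strip quotes, file_name[0].strip('"') is probably what you want as dest_path. Going through the spreadsheet itself then needs an Excel-reading library (openpyxl is one option), which is outside what requests does.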
I want to implement forced download in the Python Pyramid framework. When a request comes in like
example.com/media/files/test.mp3
it opens in the browser and starts playing. I want to stop that and force it to download.
I did it this way and it works for me: I force the download and send the file name as a request parameter.
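import os
from pyramid.httpexceptions import HTTPNotFound
from pyramid.response import FileResponse
from pyramid.view import view_config

@view_config(route_name='download')
def download_view(request):
    # no trailing comma here: it would turn MEDIA_PATH into a tuple
    MEDIA_PATH = os.path.join(PROJECT_ROOT, 'media')
    if request.params.get('filename', ''):
        filename = request.params['filename']
        # basename() keeps the client from escaping the media directory
        base_file_name = os.path.basename(filename)
        file_path = os.path.join(MEDIA_PATH, base_file_name)
        response = FileResponse(file_path, request=request, cache_max_age=86400)
        headers = response.headers
        headers['Content-Type'] = 'application/download'
        headers['Accept-Ranges'] = 'bytes'
        headers['Content-Disposition'] = 'attachment;filename=' + base_file_name
        return response
    # without a filename parameter there is nothing to send
    return HTTPNotFound()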
Add the route for this view in __init__.py:
config.add_route('download', '/download')
Send the file name as a request parameter; it works for me.
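One caveat with building the Content-Disposition value by plain concatenation is that names containing spaces, quotes or non-ASCII characters can confuse browsers. Here is a hedged Python 3 sketch of a more defensive header builder, following the filename / filename* convention from RFC 6266. The function name attachment_header is invented for this example and it is not part of Pyramid.

```python
from urllib.parse import quote


def attachment_header(filename):
    """Build a Content-Disposition value that forces a download.

    Hypothetical helper: emits a quoted ASCII fallback name and, only when the
    name is not plain ASCII, an extra percent-encoded UTF-8 filename* parameter.
    """
    # ASCII-only fallback; characters that would break the quoted string are replaced
    ascii_name = filename.encode("ascii", "replace").decode("ascii")
    ascii_name = ascii_name.replace('"', "_").replace("\\", "_")
    value = 'attachment; filename="{}"'.format(ascii_name)
    if ascii_name != filename:
        value += "; filename*=UTF-8''{}".format(quote(filename, safe=""))
    return value
```

In the view this would be used as headers['Content-Disposition'] = attachment_header(base_file_name).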
Just add download="test.mp3" to the download link.
So it would be like:
<a href="/media/files/test.mp3" download="test.mp3">Download Now</a>
I am trying to upload a file using the Python requests module, and I am not sure whether we can use both data and files in the post call.
fileobj = open(filename, 'rb')
upload_data = {
    'data': payload,
    'file': fileobj
}
resp = s.post(upload_url, data=upload_data, headers=upload_headers)
This is not working. Can anyone help me with this?
I think you should use the data and files keyword parameters of the post request to send the data and the file respectively.
with open(filename, 'rb') as fileobj:
    files = {'file': fileobj}
    resp = s.post(upload_url, data=payload, files=files, headers=upload_headers)
I've also used a context manager because it closes the file for me, even if an exception is raised while the requests post is running. If upload_headers sets Content-Type itself, try removing it, since requests has to generate the multipart boundary for that header.
I'm using this code to download .torrent files:
torrent = urllib2.urlopen(torrent_url, timeout=30)
output = open('mytorrent.torrent', 'wb')
output.write(torrent.read())
The resulting mytorrent.torrent file doesn't open in any BitTorrent client and throws an "unable to parse meta file" error. The problem apparently is that although the torrent URL (e.g. http://torcache.com/torrent-file-1.torrent) ends with a '.torrent' suffix, the content is compressed with gzip and needs to be uncompressed before it is saved as a torrent file. I've confirmed this by unzipping the file in a terminal: gunzip mytorrent.torrent > test.torrent and opening the result in the BitTorrent client, where it opens fine.
How do I modify my Python code to check the encoding, work out how the file is compressed, and use the right tool to uncompress it before saving it as a .torrent file?
gzip'ed data must be unzipped. You can detect this by looking at the Content-Encoding header.
import gzip, urllib2, StringIO
req = urllib2.Request(url)
opener = urllib2.build_opener()
response = opener.open(req)
data = response.read()
# .get() returns None instead of raising KeyError when the header is missing
if response.info().get('Content-Encoding') == 'gzip':
    # GzipFile needs a file-like object, so wrap the downloaded bytes
    gzipper = gzip.GzipFile(fileobj=StringIO.StringIO(data))
    plain = gzipper.read()
    data = plain
output.write(data)
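If the server does not send a reliable Content-Encoding header, another option is to look at the data itself: every gzip stream starts with the two bytes 0x1f 0x8b. A minimal Python 3 sketch of that check follows. The name gunzip_if_needed is my own, and I am assuming the whole download fits in memory, which is normally fine for .torrent files.

```python
import gzip

# Every gzip stream begins with these two magic bytes
GZIP_MAGIC = b"\x1f\x8b"


def gunzip_if_needed(data):
    """Return decompressed bytes if `data` looks gzip-compressed, else `data` unchanged."""
    if data[:2] == GZIP_MAGIC:
        return gzip.decompress(data)
    return data
```

A bencoded .torrent file starts with the letter d, so an uncompressed torrent should not be mistaken for gzip by this check.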