Python file upload from url using requests library

Python file upload from url using requests library - python

I want to upload a file to an url. The file I want to upload is not on my computer, but I have the url of the file. I want to upload it using requests library. So, I want to do something like this:
url = 'http://httpbin.org/post'
files = {'file': open('report.xls', 'rb')}
r = requests.post(url, files=files)
But, only difference is, the file report.xls comes from some url which is not in my computer.

The only way to do this is to download the body of the URL so you can upload it.
The problem is that a form that takes a file is expecting the body of the file in the HTTP POST. Someone could write a form that takes a URL instead, and does the fetching on its own… but that would be a different form and request than the one that takes a file (or, maybe, the same form, with an optional file and an optional URL).
You don't have to download it and save it to a file, of course. You can just download it into memory:
urlsrc = 'http://example.com/source'
rsrc = requests.get(urlsrc)
urldst = 'http://example.com/dest'
rdst = requests.post(urldst, files={'file': rsrc.content})
Of course in some cases, you might always want to forward along the filename, or some other headers, like the Content-Type. Or, for huge files, you might want to stream from one server to the other without downloading and then uploading the whole file at once. You'll have to do any such things manually, but almost everything is easy with requests, and explained well in the docs.*
* Well, that last example isn't quite easy… you have to get the raw socket-wrappers off the requests and read and write, and make sure you don't deadlock, and so on…

There is an example in the documentation that may suit you. A file-like object can be used as a stream input for a POST request. Combine this with a stream response for your GET (passing stream=True), or one of the other options documented here.
This allows you to do a POST from another GET without buffering the entire payload locally. In the worst case, you may have to write a file-like class as "glue code", allowing you to pass your glue object to the POST that in turn reads from the GET response.
(This is similar to a documented technique using the Node.js request module.)

import requests
img_url = "http://...."
res_src = requests.get(img_url)
payload={}
files=[
('files',('image_name.jpg', res_src.content,'image/jpeg'))
]
headers = {"token":"******-*****-****-***-******"}
response = requests.request("POST", url, headers=headers, data=payload, files=files)
print(response.text)
above code is working for me.

Related

Python requests module not passing params in session

I am using am attempting to do a bulk download of a series of PDFs from a site that requires login authentication. I am able to successfully log in, however, when I attempt a GET request for '/transcripts/transcript.pdf?user_id=3007' but, the request returns the content for '/transcripts/transcript.pdf'.
Does anyone have any idea why the URL param is not sending? Or why it would be rerouted?
I have tried passing the parameter 'user_id' as data, params, and hardcoded in the URL.
I have removed the actual domain from the strings below just for privacy
with requests.Session() as s:
login = s.get('<domain>/login/canvas')
# print the html returned or something more intelligent to see if it's a successful login page.
print(login.text)
login_html = lxml.html.fromstring(login.text)
hidden_inputs = login_html.xpath(r'//form//input[#type="hidden"]')
form = {x.attrib["name"]: x.attrib["value"] for x in hidden_inputs}
print("form: ",form)
form['pseudonym_session[unique_id]']= username
form['pseudonym_session[password]']= password
response = s.post('<domain>/login/canvas',data=form)
print(response.url, response.status_code) # gets <domain>?login_success=1 200
# An authorised request.
data = { 'user_id':'3007'}
r = s.get('<domain>/transcripts/transcript.pdf?user_id=3007', data=data)
print(r.url) # gets <domain>/transcripts/transcript.pdf
print(r.status_code) # gets 200
with open('test.pdf', 'wb') as f:
f.write(r.content)
GET response returns /transcripts/transcript.pdf and not /transcripts/transcript.pdf?user_id=3007

From the looks of it, you are trying to use canvas. I'm pretty sure in canvas, you can bulk download all test attachments.
If that's not the case, There are a few things to try:
after logging in, try typing the url with user_id into a browser. Does that take you directly to the PDF file or links to one?
if so, look at the url, it may simply not display the parameters; some websites do this, don't worry about it
If not, GET may not be enough; perhaps the site uses javascript, etc.

after looking through the '.history' of the request I found a series of 302 redirects.
The first was to '/login?force_login=0&target_uri=%2Ftranscripts%2Ftranscript.pdf'
In a desperate attempt, I tried: s.get('/login?force_login=0&target_uri=%2Ftranscripts%2Ftranscript.pdf%3Fuser_id%3D3007') and this still rerouted me a few times but ultimately got me the file I wanted!
If anyone has a more elegant solution to this or any resources that I can read I would greatly appreciate it!

How to upload a binary/video file using Python http.client PUT method?

I am communicating with an API using HTTP.client in Python 3.6.2.
In order to upload a file it requires a three stage process.
I have managed to talk successfully using POST methods and the server returns data as I expect.
However, the stage that requires the actual file to be uploaded is a PUT method - and I cannot figure out how to syntax the code to include a pointer to the actual file on my storage - the file is an mp4 video file.
Here is a snippet of the code with my noob annotations :)
#define connection as HTTPS and define URL
uploadstep2 = http.client.HTTPSConnection("grabyo-prod.s3-accelerate.amazonaws.com")
#define headers
headers = {
'accept': "application/json",
'content-type': "application/x-www-form-urlencoded"
}
#define the structure of the request and send it.
#Here it is a PUT request to the unique URL as defined above with the correct file and headers.
uploadstep2.request("PUT", myUniqueUploadUrl, body="C:\Test.mp4", headers=headers)
#get the response from the server
uploadstep2response = uploadstep2.getresponse()
#read the data from the response and put to a usable variable
step2responsedata = uploadstep2response.read()
The response I am getting back at this stage is an
"Error 400 Bad Request - Could not obtain the file information."
I am certain this relates to the body="C:\Test.mp4" section of the code.
Can you please advise how I can correctly reference a file within the PUT method?
Thanks in advance

uploadstep2.request("PUT", myUniqueUploadUrl, body="C:\Test.mp4", headers=headers)
will put the actual string "C:\Test.mp4" in the body of your request, not the content of the file named "C:\Test.mp4" as you expect.
You need to open the file, read it's content then pass it as body. Or to stream it, but AFAIK http.client does not support that, and since your file seems to be a video, it is potentially huge and will use plenty of RAM for no good reason.
My suggestion would be to use requests, which is a way better lib to do this kind of things:
import requests
with open(r'C:\Test.mp4'), 'rb') as finput:
response = requests.put('https://grabyo-prod.s3-accelerate.amazonaws.com/youruploadpath', data=finput)
print(response.json())

I do not know if it is useful for you, but you can try to send a POST request with requests module :
import requests
url = ""
data = {'title':'metadata','timeDuration':120}
mp3_f = open('/path/your_file.mp3', 'rb')
files = {'messageFile': mp3_f}
req = requests.post(url, files=files, json=data)
print (req.status_code)
print (req.content)
Hope it helps .

Python POST request does not take form data with no files

Before downvoting/marking as duplicate, please note:
I have already tried out this, this, this, this,this, this - basically almost all the methods I could find pointed out by the Requests documentation but do not seem to find any solution.
Problem:
I want to make a POST request with a set of headers and form data.
There are no files to be uploaded. As per the request body in Postman, we set the parameters by selecting 'form-data' under the 'Body' section for the request.
Here is the code I have:
headers = {'authorization': token_string,
'content-type':'multipart/form-data; boundary=----WebKitFormBoundaryxxxxxXXXXX12345'} # I get 'unsupported application/x-www-form-url-encoded' error if I remove this line
body = {
'foo1':'bar1',
'foo2':'bar2',
#... and other form data, NO FILE UPLOADED
}
#I have also tried the below approach
payload = dict()
payload['foo1']='bar1'
payload['foo2']='bar2'
page = ''
page = requests.post(url, proxies=proxies, headers=headers,
json=body, files=json.dump(body)) # also tried data=body,data=payload,files={} when giving data values
Error
{"errorCode":404,"message":"Required String parameter 'foo1' is not
present"}
EDIT:
Adding a trace of the network console. I am defining it in the same way in the payload as mentioned on the request payload.

There isn't any gui at all? You could get the network data from chrome, although:
Try this:
headers = {'authorization': token_string}
Probably there is more authorization? Or smthng else?
You shouldn't add Content-Type as requests will handle it for you.
Important, you could see the content type as WebKitFormBoundary, so for the payload you must take, the data from the "name" variable.
Example:
(I know you won't upload any file, it just an example) -
So in this case, for my payload would look like this: payload = {'photo':'myphoto'} (yea there would be an open file etc etc, but I try to keep it simple)
So your payload would be this-> (So always use name from the WebKit)
payload = {'foo1':'foo1data',
'foo2':'foo2data'}
session.post(url,data = payload, proxies etc...)
Important! As I can see you use the method from requests library. Firstly you always should create a session like this
session = requests.session() -> it will handle cookies, headers, etc, and won't open a new session, or plain requests with every requests.get/post.

Download doesn't start in web browser by flask.Response

In Flask (micro web framework), we have a view as:
#app.route('/download/<id>/<resolution>/<extension>/')
def download_by_id(id, resolution=None, extension=None):
stream = youtube.stream_url(id, resolution, extension)
binary = requests.get(stream['url'], stream=True)
return flask.Response(
binary,
headers={'Content-Disposition': 'attachment; '
'filename=' + stream['filename']})
In template we have a link as Download 240p Video and when it's clicked, it should start downloading that video.
Issue is:
It is working fine in some browsers where no Download Manager like IDM etc. is installed. But IDM fails to download it. IDM just hangs at http://example.com/download/adkdsk457jds/240p/mp4/
Same is the case with Firefox's own download manager. Firefox just downloads a plain .html page and not the actual video.
But, videos gets downloaded successfully in Chrome when no IDM or other Download Manager is installed.
Please help and advice why it's not working. Do i need to change something in code?

You haven't included any response information, including the content type; you need to copy over a little more information about the original response to communicate what type of response you are returning. Otherwise defaults are used (dictated either by the HTTP standard or by Flask).
Specifically, at the very least you want to copy across the content type, length, and the transfer encoding:
headers={
'Content-Disposition': 'attachment; filename=' + stream['filename']
}
for header in ('content-type', 'content-length', 'transfer-encoding'):
if header in binary.headers:
headers[header] = binary.headers[header]
return flask.Response(binary.raw, headers=headers)
I'm using the response.raw underlying raw file object; this should work too but has the added advantage that any compression applied by YouTube is retained.
Some download managers may try to use a HTTP range request to grab a download in parallel, even when the server is not advertising that it supports such requests. You should probably respond with a 406 Not Acceptable response (requesting byte ranges when not supported is a Accept-* violation). You'll need to log what headers the download manager sends to be sure if this is the case.

Add 'Content-Type': 'application/octet-stream' to headers

Receive attachment with urllib - Python

I am testing my webpage software by sending requests from python to it. I am able to send requests, receive responses and parse the json. However, one option on the webpage is to download files. I send the download request and can confirm that the response headers contain what I expect (application/octet-stream and the appropriate filename) but the Content-Length is 0. If the length is 0, I assume the file was not actually sent. I am able to download files from other means so I know my software works but I am having trouble with getting it to work with python.
I build up the request then do:
f = urllib.request.urlopen(request)
f.body = f.read()
I expect data to be in f.body but it is empty (I see "b''")
Is there a different way to access the file contents from an attachment in python?

Is there a different way to access the file contents from an attachment in python?
This is in python-requests instead urllib, since I'm more familiar with that.
import requests
url = "http://example.com/foobar.jpg"
#make request
r = requests.get(url)
attachment_data = r.content
#save to file
with open(r"C:/pictures/foobar.jpg", 'wb') as f:
f.write(attachment_data)

Turns out I needed to throw some data into the file in order to have something in the body. I should've noticed this much sooner.

We Keep Coding

Python is a programming language that lets you work quickly and integrate systems more effectively.

Python file upload from url using requests library - python

Related

Python requests module not passing params in session

How to upload a binary/video file using Python http.client PUT method?

Python POST request does not take form data with no files

Download doesn't start in web browser by flask.Response

Receive attachment with urllib - Python

Categories

Resources