Related
I'm performing a simple task of uploading a file using Python requests library. I searched Stack Overflow and no one seemed to have the same problem, namely, that the file is not received by the server:
import requests
url='http://nesssi.cacr.caltech.edu/cgi-bin/getmulticonedb_release2.cgi/post'
files={'files': open('file.txt','rb')}
values={'upload_file' : 'file.txt' , 'DB':'photcat' , 'OUT':'csv' , 'SHORT':'short'}
r=requests.post(url,files=files,data=values)
I'm filling the value of 'upload_file' keyword with my filename, because if I leave it blank, it says
Error - You must select a file to upload!
And now I get
File file.txt of size bytes is uploaded successfully!
Query service results: There were 0 lines.
Which comes up only if the file is empty. So I'm stuck as to how to send my file successfully. I know that the file works because if I go to this website and manually fill in the form it returns a nice list of matched objects, which is what I'm after. I'd really appreciate all hints.
Some other threads related (but not answering my problem):
Send file using POST from a Python script
http://docs.python-requests.org/en/latest/user/quickstart/#response-content
Uploading files using requests and send extra data
http://docs.python-requests.org/en/latest/user/advanced/#body-content-workflow
If upload_file is meant to be the file, use:
files = {'upload_file': open('file.txt','rb')}
values = {'DB': 'photcat', 'OUT': 'csv', 'SHORT': 'short'}
r = requests.post(url, files=files, data=values)
and requests will send a multi-part form POST body with the upload_file field set to the contents of the file.txt file.
The filename will be included in the mime header for the specific field:
>>> import requests
>>> open('file.txt', 'wb') # create an empty demo file
<_io.BufferedWriter name='file.txt'>
>>> files = {'upload_file': open('file.txt', 'rb')}
>>> print(requests.Request('POST', 'http://example.com', files=files).prepare().body.decode('ascii'))
--c226ce13d09842658ffbd31e0563c6bd
Content-Disposition: form-data; name="upload_file"; filename="file.txt"
--c226ce13d09842658ffbd31e0563c6bd--
Note the filename="file.txt" parameter.
You can use a tuple for the files mapping value, with between 2 and 4 elements, if you need more control. The first element is the filename, followed by the contents, and an optional content-type header value and an optional mapping of additional headers:
files = {'upload_file': ('foobar.txt', open('file.txt','rb'), 'text/x-spam')}
This sets an alternative filename and content type, leaving out the optional headers.
If you are meaning the whole POST body to be taken from a file (with no other fields specified), then don't use the files parameter, just post the file directly as data. You then may want to set a Content-Type header too, as none will be set otherwise. See Python requests - POST data from a file.
(2018) the new python requests library has simplified this process, we can use the 'files' variable to signal that we want to upload a multipart-encoded file
url = 'http://httpbin.org/post'
files = {'file': open('report.xls', 'rb')}
r = requests.post(url, files=files)
r.text
Client Upload
If you want to upload a single file with Python requests library, then requests lib supports streaming uploads, which allow you to send large files or streams without reading into memory.
with open('massive-body', 'rb') as f:
requests.post('http://some.url/streamed', data=f)
Server Side
Then store the file on the server.py side such that save the stream into file without loading into the memory. Following is an example with using Flask file uploads.
#app.route("/upload", methods=['POST'])
def upload_file():
from werkzeug.datastructures import FileStorage
FileStorage(request.stream).save(os.path.join(app.config['UPLOAD_FOLDER'], filename))
return 'OK', 200
Or use werkzeug Form Data Parsing as mentioned in a fix for the issue of "large file uploads eating up memory" in order to avoid using memory inefficiently on large files upload (s.t. 22 GiB file in ~60 seconds. Memory usage is constant at about 13 MiB.).
#app.route("/upload", methods=['POST'])
def upload_file():
def custom_stream_factory(total_content_length, filename, content_type, content_length=None):
import tempfile
tmpfile = tempfile.NamedTemporaryFile('wb+', prefix='flaskapp', suffix='.nc')
app.logger.info("start receiving file ... filename => " + str(tmpfile.name))
return tmpfile
import werkzeug, flask
stream, form, files = werkzeug.formparser.parse_form_data(flask.request.environ, stream_factory=custom_stream_factory)
for fil in files.values():
app.logger.info(" ".join(["saved form name", fil.name, "submitted as", fil.filename, "to temporary file", fil.stream.name]))
# Do whatever with stored file at `fil.stream.name`
return 'OK', 200
You can send any file via post api while calling the API just need to mention files={'any_key': fobj}
import requests
import json
url = "https://request-url.com"
headers = {"Content-Type": "application/json; charset=utf-8"}
with open(filepath, 'rb') as fobj:
response = requests.post(url, headers=headers, files={'file': fobj})
print("Status Code", response.status_code)
print("JSON Response ", response.json())
#martijn-pieters answer is correct, however I wanted to add a bit of context to data= and also to the other side, in the Flask server, in the case where you are trying to upload files and a JSON.
From the request side, this works as Martijn describes:
files = {'upload_file': open('file.txt','rb')}
values = {'DB': 'photcat', 'OUT': 'csv', 'SHORT': 'short'}
r = requests.post(url, files=files, data=values)
However, on the Flask side (the receiving webserver on the other side of this POST), I had to use form
#app.route("/sftp-upload", methods=["POST"])
def upload_file():
if request.method == "POST":
# the mimetype here isnt application/json
# see here: https://stackoverflow.com/questions/20001229/how-to-get-posted-json-in-flask
body = request.form
print(body) # <- immutable dict
body = request.get_json() will return nothing. body = request.get_data() will return a blob containing lots of things like the filename etc.
Here's the bad part: on the client side, changing data={} to json={} results in this server not being able to read the KV pairs! As in, this will result in a {} body above:
r = requests.post(url, files=files, json=values). # No!
This is bad because the server does not have control over how the user formats the request; and json= is going to be the habbit of requests users.
Upload:
with open('file.txt', 'rb') as f:
files = {'upload_file': f.read()}
values = {'DB': 'photcat', 'OUT': 'csv', 'SHORT': 'short'}
r = requests.post(url, files=files, data=values)
Download (Django):
with open('file.txt', 'wb') as f:
f.write(request.FILES['upload_file'].file.read())
Regarding the answers given so far, there was always something missing that prevented it to work on my side. So let me show you what worked for me:
import json
import os
import requests
API_ENDPOINT = "http://localhost:80"
access_token = "sdfJHKsdfjJKHKJsdfJKHJKysdfJKHsdfJKHs" # TODO: get fresh Token here
def upload_engagement_file(filepath):
url = API_ENDPOINT + "/api/files" # add any URL parameters if needed
hdr = {"Authorization": "Bearer %s" % access_token}
with open(filepath, "rb") as fobj:
file_obj = fobj.read()
file_basename = os.path.basename(filepath)
file_to_upload = {"file": (str(file_basename), file_obj)}
finfo = {"fullPath": filepath}
upload_response = requests.post(url, headers=hdr, files=file_to_upload, data=finfo)
fobj.close()
# print("Status Code ", upload_response.status_code)
# print("JSON Response ", upload_response.json())
return upload_response
Note that requests.post(...) needs
a url parameter, containing the full URL of the API endpoint you're calling, using the API_ENDPOINT, assuming we have an http://localhost:8000/api/files endpoint to POST a file
a headers parameter, containing at least the authorization (bearer token)
a files parameter taking the name of the file plus the entire file content
a data parameter taking just the path and file name
Installation required (console):
pip install requests
What you get back from the function call is a response object containing a status code and also the full error message in JSON format. The commented print statements at the end of upload_engagement_file are showing you how you can access them.
Note: Some useful additional information about the requests library can be found here
Some may need to upload via a put request and this is slightly different that posting data. It is important to understand how the server expects the data in order to form a valid request. A frequent source of confusion is sending multipart-form data when it isn't accepted. This example uses basic auth and updates an image via a put request.
url = 'foobar.com/api/image-1'
basic = requests.auth.HTTPBasicAuth('someuser', 'password123')
# Setting the appropriate header is important and will vary based
# on what you upload
headers = {'Content-Type': 'image/png'}
with open('image-1.png', 'rb') as img_1:
r = requests.put(url, auth=basic, data=img_1, headers=headers)
While the requests library makes working with http requests a lot easier, some of its magic and convenience obscures just how to craft more nuanced requests.
In Ubuntu you can apply this way,
to save file at some location (temporary) and then open and send it to API
path = default_storage.save('static/tmp/' + f1.name, ContentFile(f1.read()))
path12 = os.path.join(os.getcwd(), "static/tmp/" + f1.name)
data={} #can be anything u want to pass along with File
file1 = open(path12, 'rb')
header = {"Content-Disposition": "attachment; filename=" + f1.name, "Authorization": "JWT " + token}
res= requests.post(url,data,header)
I'm working on a kind of personal cloud project which allows to upload and download files (with a google like search feature). The backend is written in python (using aiohttp) while I'm using a react.js website as a client. When a file is uploaded, the backend stores it on the filesystem and renames it with its sha256 hash. The original name and a few other metadata are stored next to it (description, etc). When the user downloads the file, I'm serving it using multipart and I want the user to get it with the original name and not the hash, indeed my-cool-image.png is more user friendly than a14e0414-b84c-4d7b-b0d4-49619b9edd8a. But I'm not able to do it (whatever I try, the download file is called with the hash).
Here is my code:
async def download(self, request):
if not request.username:
raise exceptions.Unauthorized("A valid token is needed")
data = await request.post()
hash = data["hash"]
file_path = storage.get_file(hash)
dotfile_path = storage.get_file("." + hash)
if not os.path.exists(file_path) or not os.path.exists(dotfile_path):
raise exceptions.NotFound("file <{}> does not exist".format(hash))
with open(dotfile_path) as dotfile:
dotfile_content = json.load(dotfile)
name = dotfile_content["name"]
headers = {
"Content-Type": "application/octet-stream; charset=binary",
"Content-Disposition": "attachment; filename*=UTF-8''{}".format(
urllib.parse.quote(name, safe="")
),
}
return web.Response(body=self._file_sender(file_path), headers=headers)
Here is what it looks like (according to the browser):
Seems right, yet it's not working.
One thing I want to make clear though: sometimes I get a warning (on the client side) saying Resource interpreted as Document but transferred with MIME type application/octet-stream. I don't know the MIME type of the files since they are provided by the users, but I tried to use image/png (I did my test with a png image stored on the server). The file didn't get downloaded (it was displayed in the browser, which is not something that I want) and the file name was still its hash so it didn't help with my issue.
Here is the entire source codes of the backend:
https://git.io/nexmind-node
And of the frontend:
https://git.io/nexmind-client
EDIT:
I received a first answer by Julien Castiaux so I tried to implement it, even though it looks better, it doesn't solve my issue (I still have the exact same behaviour):
async def download(self, request):
if not request.username:
raise exceptions.Unauthorized("A valid token is needed")
data = await request.post()
hash = data["hash"]
file_path = storage.get_file(hash)
dotfile_path = storage.get_file("." + hash)
if not os.path.exists(file_path) or not os.path.exists(dotfile_path):
raise exceptions.NotFound("file <{}> does not exist".format(hash))
with open(dotfile_path) as dotfile:
dotfile_content = json.load(dotfile)
name = dotfile_content["name"]
response = web.StreamResponse()
response.headers['Content-Type'] = 'application/octet-stream'
response.headers['Content-Disposition'] = "attachment; filename*=UTF-8''{}".format(
urllib.parse.quote(name, safe="") # replace with the filename
)
response.enable_chunked_encoding()
await response.prepare(request)
with open(file_path, 'rb') as fd: # replace with the path
for chunk in iter(lambda: fd.read(1024), b""):
await response.write(chunk)
await response.write_eof()
return response
From aiohttp3 documentation
StreamResponse is intended for streaming data, while Response contains HTTP BODY as an attribute and sends own content as single piece with the correct Content-Length HTTP header.
You rather use aiohttp.web.StreamResponse as you are sending (potentially very large) files. Using StreamResponse, you have full control of the outgoing http response stream : headers manipulation (including the filename) and chunked encoding.
from aiohttp import web
import urllib.parse
async def download(req):
resp = web.StreamResponse()
resp.headers['Content-Type'] = 'application/octet-stream'
resp.headers['Content-Disposition'] = "attachment; filename*=UTF-8''{}".format(
urllib.parse.quote(filename, safe="") # replace with the filename
)
resp.enable_chunked_encoding()
await resp.prepare(req)
with open(path_to_the_file, 'rb') as fd: # replace with the path
for chunk in iter(lambda: fd.read(1024), b""):
await resp.write(chunk)
await resp.write_eof()
return resp
app = web.Application()
app.add_routes([web.get('/', download)])
web.run_app(app)
Hope it helps !
I'm performing a simple task of uploading a file using Python requests library. I searched Stack Overflow and no one seemed to have the same problem, namely, that the file is not received by the server:
import requests
url='http://nesssi.cacr.caltech.edu/cgi-bin/getmulticonedb_release2.cgi/post'
files={'files': open('file.txt','rb')}
values={'upload_file' : 'file.txt' , 'DB':'photcat' , 'OUT':'csv' , 'SHORT':'short'}
r=requests.post(url,files=files,data=values)
I'm filling the value of 'upload_file' keyword with my filename, because if I leave it blank, it says
Error - You must select a file to upload!
And now I get
File file.txt of size bytes is uploaded successfully!
Query service results: There were 0 lines.
Which comes up only if the file is empty. So I'm stuck as to how to send my file successfully. I know that the file works because if I go to this website and manually fill in the form it returns a nice list of matched objects, which is what I'm after. I'd really appreciate all hints.
Some other threads related (but not answering my problem):
Send file using POST from a Python script
http://docs.python-requests.org/en/latest/user/quickstart/#response-content
Uploading files using requests and send extra data
http://docs.python-requests.org/en/latest/user/advanced/#body-content-workflow
If upload_file is meant to be the file, use:
files = {'upload_file': open('file.txt','rb')}
values = {'DB': 'photcat', 'OUT': 'csv', 'SHORT': 'short'}
r = requests.post(url, files=files, data=values)
and requests will send a multi-part form POST body with the upload_file field set to the contents of the file.txt file.
The filename will be included in the mime header for the specific field:
>>> import requests
>>> open('file.txt', 'wb') # create an empty demo file
<_io.BufferedWriter name='file.txt'>
>>> files = {'upload_file': open('file.txt', 'rb')}
>>> print(requests.Request('POST', 'http://example.com', files=files).prepare().body.decode('ascii'))
--c226ce13d09842658ffbd31e0563c6bd
Content-Disposition: form-data; name="upload_file"; filename="file.txt"
--c226ce13d09842658ffbd31e0563c6bd--
Note the filename="file.txt" parameter.
You can use a tuple for the files mapping value, with between 2 and 4 elements, if you need more control. The first element is the filename, followed by the contents, and an optional content-type header value and an optional mapping of additional headers:
files = {'upload_file': ('foobar.txt', open('file.txt','rb'), 'text/x-spam')}
This sets an alternative filename and content type, leaving out the optional headers.
If you are meaning the whole POST body to be taken from a file (with no other fields specified), then don't use the files parameter, just post the file directly as data. You then may want to set a Content-Type header too, as none will be set otherwise. See Python requests - POST data from a file.
(2018) the new python requests library has simplified this process, we can use the 'files' variable to signal that we want to upload a multipart-encoded file
url = 'http://httpbin.org/post'
files = {'file': open('report.xls', 'rb')}
r = requests.post(url, files=files)
r.text
Client Upload
If you want to upload a single file with Python requests library, then requests lib supports streaming uploads, which allow you to send large files or streams without reading into memory.
with open('massive-body', 'rb') as f:
requests.post('http://some.url/streamed', data=f)
Server Side
Then store the file on the server.py side such that save the stream into file without loading into the memory. Following is an example with using Flask file uploads.
#app.route("/upload", methods=['POST'])
def upload_file():
from werkzeug.datastructures import FileStorage
FileStorage(request.stream).save(os.path.join(app.config['UPLOAD_FOLDER'], filename))
return 'OK', 200
Or use werkzeug Form Data Parsing as mentioned in a fix for the issue of "large file uploads eating up memory" in order to avoid using memory inefficiently on large files upload (s.t. 22 GiB file in ~60 seconds. Memory usage is constant at about 13 MiB.).
#app.route("/upload", methods=['POST'])
def upload_file():
def custom_stream_factory(total_content_length, filename, content_type, content_length=None):
import tempfile
tmpfile = tempfile.NamedTemporaryFile('wb+', prefix='flaskapp', suffix='.nc')
app.logger.info("start receiving file ... filename => " + str(tmpfile.name))
return tmpfile
import werkzeug, flask
stream, form, files = werkzeug.formparser.parse_form_data(flask.request.environ, stream_factory=custom_stream_factory)
for fil in files.values():
app.logger.info(" ".join(["saved form name", fil.name, "submitted as", fil.filename, "to temporary file", fil.stream.name]))
# Do whatever with stored file at `fil.stream.name`
return 'OK', 200
You can send any file via post api while calling the API just need to mention files={'any_key': fobj}
import requests
import json
url = "https://request-url.com"
headers = {"Content-Type": "application/json; charset=utf-8"}
with open(filepath, 'rb') as fobj:
response = requests.post(url, headers=headers, files={'file': fobj})
print("Status Code", response.status_code)
print("JSON Response ", response.json())
#martijn-pieters answer is correct, however I wanted to add a bit of context to data= and also to the other side, in the Flask server, in the case where you are trying to upload files and a JSON.
From the request side, this works as Martijn describes:
files = {'upload_file': open('file.txt','rb')}
values = {'DB': 'photcat', 'OUT': 'csv', 'SHORT': 'short'}
r = requests.post(url, files=files, data=values)
However, on the Flask side (the receiving webserver on the other side of this POST), I had to use form
#app.route("/sftp-upload", methods=["POST"])
def upload_file():
if request.method == "POST":
# the mimetype here isnt application/json
# see here: https://stackoverflow.com/questions/20001229/how-to-get-posted-json-in-flask
body = request.form
print(body) # <- immutable dict
body = request.get_json() will return nothing. body = request.get_data() will return a blob containing lots of things like the filename etc.
Here's the bad part: on the client side, changing data={} to json={} results in this server not being able to read the KV pairs! As in, this will result in a {} body above:
r = requests.post(url, files=files, json=values). # No!
This is bad because the server does not have control over how the user formats the request; and json= is going to be the habbit of requests users.
Upload:
with open('file.txt', 'rb') as f:
files = {'upload_file': f.read()}
values = {'DB': 'photcat', 'OUT': 'csv', 'SHORT': 'short'}
r = requests.post(url, files=files, data=values)
Download (Django):
with open('file.txt', 'wb') as f:
f.write(request.FILES['upload_file'].file.read())
Regarding the answers given so far, there was always something missing that prevented it to work on my side. So let me show you what worked for me:
import json
import os
import requests
API_ENDPOINT = "http://localhost:80"
access_token = "sdfJHKsdfjJKHKJsdfJKHJKysdfJKHsdfJKHs" # TODO: get fresh Token here
def upload_engagement_file(filepath):
url = API_ENDPOINT + "/api/files" # add any URL parameters if needed
hdr = {"Authorization": "Bearer %s" % access_token}
with open(filepath, "rb") as fobj:
file_obj = fobj.read()
file_basename = os.path.basename(filepath)
file_to_upload = {"file": (str(file_basename), file_obj)}
finfo = {"fullPath": filepath}
upload_response = requests.post(url, headers=hdr, files=file_to_upload, data=finfo)
fobj.close()
# print("Status Code ", upload_response.status_code)
# print("JSON Response ", upload_response.json())
return upload_response
Note that requests.post(...) needs
a url parameter, containing the full URL of the API endpoint you're calling, using the API_ENDPOINT, assuming we have an http://localhost:8000/api/files endpoint to POST a file
a headers parameter, containing at least the authorization (bearer token)
a files parameter taking the name of the file plus the entire file content
a data parameter taking just the path and file name
Installation required (console):
pip install requests
What you get back from the function call is a response object containing a status code and also the full error message in JSON format. The commented print statements at the end of upload_engagement_file are showing you how you can access them.
Note: Some useful additional information about the requests library can be found here
Some may need to upload via a put request and this is slightly different that posting data. It is important to understand how the server expects the data in order to form a valid request. A frequent source of confusion is sending multipart-form data when it isn't accepted. This example uses basic auth and updates an image via a put request.
url = 'foobar.com/api/image-1'
basic = requests.auth.HTTPBasicAuth('someuser', 'password123')
# Setting the appropriate header is important and will vary based
# on what you upload
headers = {'Content-Type': 'image/png'}
with open('image-1.png', 'rb') as img_1:
r = requests.put(url, auth=basic, data=img_1, headers=headers)
While the requests library makes working with http requests a lot easier, some of its magic and convenience obscures just how to craft more nuanced requests.
In Ubuntu you can apply this way,
to save file at some location (temporary) and then open and send it to API
path = default_storage.save('static/tmp/' + f1.name, ContentFile(f1.read()))
path12 = os.path.join(os.getcwd(), "static/tmp/" + f1.name)
data={} #can be anything u want to pass along with File
file1 = open(path12, 'rb')
header = {"Content-Disposition": "attachment; filename=" + f1.name, "Authorization": "JWT " + token}
res= requests.post(url,data,header)
Is it possible to download a large file in chunks using httplib2. I am downloading files from a Google API, and in order to use the credentials from the google OAuth2WebServerFlow, I am bound to use httplib2.
At the moment I am doing:
flow = OAuth2WebServerFlow(
client_id=XXXX,
client_secret=XXXX,
scope=XYZ,
redirect_uri=XYZ
)
credentials = flow.step2_exchange(oauth_code)
http = httplib2.Http()
http = credentials.authorize(http)
resp, content = self.http.request(url, "GET")
with open(file_name, 'wb') as fw:
fw.write(content)
But the content variable can get more than 500MB.
Any way of reading the response in chunks?
You could consider streaming_httplib2, a fork of httplib2 with exactly that change in behaviour.
in order to use the credentials from the google OAuth2WebServerFlow, I am bound to use httplib2.
If you need features that aren't available in httplib2, it's worth looking at how much work it would be to get your credential handling working with another HTTP library. It may be a good longer-term investment. (e.g. How to download large file in python with requests.py?.)
About reading response in chunks (works with httplib, must work with httplib2)
import httplib
conn = httplib.HTTPConnection("google.com")
conn.request("GET", "/")
r1 = conn.getresponse()
try:
print r1.fp.next()
print r1.fp.next()
except:
print "Exception handled!"
Note: next() may raise StopIteration exception, you need to handle it.
You can avoid calling next() like this
F=open("file.html","w")
for n in r1.fp:
F.write(n)
F.flush()
You can apply oauth2client.client.Credentials to a urllib2 request.
First, obtain the credentials object. In your case, you're using:
credentials = flow.step2_exchange(oauth_code)
Now, use that object to get the auth headers and add them to the urllib2 request:
req = urllib2.Request(url)
auth_headers = {}
credentials.apply(auth_headers)
for k,v in auth_headers.iteritems():
req.add_header(k,v)
resp = urllib2.urlopen(req)
Now resp is a file-like object that you can use to read the contents of the URL
The API I'm working on has a method to send images by POSTing to /api/pictures/ with the picture file in the request.
I want to automate some sample images with Python's requests library, but I'm not exactly sure how to do it. I have a list of URLs that point to images.
rv = requests.get('http://api.randomuser.me')
resp = rv.json()
picture_href = resp['results'][0]['user']['picture']['thumbnail']
rv = requests.get(picture_href)
resp = rv.content
rv = requests.post(prefix + '/api/pictures/', data = resp)
rv.content returns bytecode. I get a 400 Bad Request from the server but no error message. I believe I'm either 'getting' the picture wrong when I do rv.content or sending it wrong with data = resp. Am I on the right track? How do I send files?
--Edit--
I changed the last line to
rv = requests.post('myapp.com' + '/api/pictures/', files = {'file': resp})
Server-side code (Flask):
file = request.files['file']
if file and allowed_file(file.filename):
...
else:
abort(400, message = 'Picture must exist and be either png, jpg, or jpeg')
Server aborts with status code 400 and the message above. I also tried reading resp with BytesIO, didn't help.
The problem is your data is not a file, its a stream of bytes. So it does not have a "filename" and I suspect that is why your server code is failing.
Try sending a valid filename along with the correct mime type with your request:
files = {'file': ('user.gif', resp, 'image/gif', {'Expires': '0'})}
rv = requests.post('myapp.com' + '/api/pictures/', files = files)
You can use imghdr to figure out what kind of image you are dealing with (to get the correct mime type):
import imghdr
image_type = imghdr.what(None, resp)
# You should improve this logic, by possibly creating a
# dictionary lookup
mime_type = 'image/{}'.format(image_type)