I'm attempting to use this API endpoint to upload a file:
https://h.app.wdesk.com/s/cerebral-docs/?python#uploadfileusingpost
With this python function:
import requests

def upload_file(token, filepath, table_id):
    url = "https://h.app.wdesk.com/s/wdata/prep/api/v1/file"
    headers = {
        'Accept': 'application/json',
        'Authorization': f'Bearer {token}'
    }
    files = {
        "tableId": (None, table_id),
        "file": open(filepath, "rb")
    }
    resp = requests.post(url, headers=headers, files=files)
    # Show the headers requests actually sent, including the computed ones
    print(resp.request.headers)
    return resp.json()
The Content-Type and Content-Length headers are computed and added by the requests library internally as per their documentation. When assigning to the files kwarg in the post function, the library knows it's supposed to be a multipart/form-data request.
The printout of the request headers is as follows, showing the Content-Type and Content-Length that the library added. I've omitted the auth token.
{'User-Agent': 'python-requests/2.24.0', 'Accept-Encoding': 'gzip, deflate', 'Accept': '*/*', 'Connection': 'keep-alive',
'Authorization': 'Bearer <omitted>', 'Content-Length': '8201', 'Content-Type': 'multipart/form-data; boundary=bb582b9071574462d44c4b43ec4d7bf3'}
The json response from the API is:
{'body': ['contentType must not be null'], 'code': 400}
The odd thing is that the same request, when made through Postman, gives a different response - which is what I expected from Python as well.
{ "code": 409, "body": "duplicate file name" }
These are the Postman request headers:
POST /s/wdata/prep/api/v1/file HTTP/1.1
Authorization: Bearer <omitted>
Accept: */*
Cache-Control: no-cache
Postman-Token: 34ed08d4-4467-4168-a4e4-c83b16ce9afb
Host: h.app.wdesk.com
Content-Type: multipart/form-data; boundary=--------------------------179907322036790253179546
Content-Length: 8279
Postman also computes the Content-Type and Content-Length headers itself when the request is sent; they are not user specified.
I am quite confused as to why I'm getting two different behaviors from the API service for what looks like the same request.
There must be something I'm missing, but I can't figure out what it is.
I figured out what was wrong with my request by comparing it with the NodeJS and Postman versions.
The contentType referred to in the API's error message was the content type of the file parameter (the Content-Type of that part of the multipart body), not the HTTP request header Content-Type.
The upload started to work flawlessly when I updated my file parameter like so:
# Needs: from pathlib import Path
files = {
    "tableId": (None, table_id),
    # (filename, file object, content type, extra part headers)
    "file": (Path(filepath).name, open(filepath, "rb"), "text/csv", None)
}
I learned that Python's requests library does not automatically add the file's MIME type to that file's part of the request body. We need to be explicit about it.
Hope this helps someone else too.
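A minimal sketch of the same fix, generalized so that the MIME type is guessed from the file name instead of being hard-coded as "text/csv". The helper name file_part and the "application/octet-stream" fallback are my own assumptions, not something the API or requests defines; mimetypes is part of Python's standard library.

```python
import mimetypes
from pathlib import Path


def file_part(filepath, content_type=None):
    """Build the (filename, fileobj, content_type) tuple accepted in `files`.

    Hypothetical helper: the content type is guessed from the file
    extension unless the caller passes one explicitly.
    """
    path = Path(filepath)
    if content_type is None:
        guessed, _encoding = mimetypes.guess_type(path.name)
        # Fallback chosen for this sketch when the extension is unknown.
        content_type = guessed or "application/octet-stream"
    # The caller is responsible for closing the returned file object.
    return (path.name, open(path, "rb"), content_type)
```

It would be used as files = {"tableId": (None, table_id), "file": file_part(filepath)}. Passing content_type explicitly is still the safer choice when the API insists on one specific value.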
I'm attempting to use BIMTrack's REST API to post an image. To do this, the API requires me to send a JSON file along with, but prior to, the image, which inherently requires multipart/form-data.
Failing to post the JSON file first is met with error code 415 and the error message: The content-type of the first file of the request must be application\json.
I've successfully made this POST request using Postman and the Fiddler web debugging proxy, but I am unable to repeat that success with Python requests.
Python Code (This doesn't work) :
import requests

image = r"C:\Users\aflemming\Desktop\Images\DBMICon.png"
jsonFile = r"C:\Users\aflemming\source\repos\IfcOpenShell\IfcOpenShell\BIM\myjson.json"
headers = {'Authorization' : 'Bearer <MyToken>'}
# Each value is (filename, file object, content type); filename is None here
files = {
    'Json': (None, open(jsonFile, 'rb'), 'application/json'),
    'Image': (None, open(image, 'rb'), 'image/png')
}
url = "https://api.bimtrackapp.co/v3/hubs/07La7cOZ/projects/20767/issues/3161484/viewpoints"
r = requests.post(url, files=files, headers=headers)
Fiddler Raw Request (This works) :
User-Agent:Fiddler Everywhere
Authorization:Bearer eb5e3983a7546dad76067418ff93175ef42b816dd57f78f54101f0b63862542e
Host:api.bimtrackapp.co
Content-Length:11322
Content-Type:multipart/form-data;boundary=-------------------------acebdf13572468
---------------------------acebdf13572468
Content-Disposition: form-data; name="description"
the_text_is_here
---------------------------acebdf13572468
Content-Disposition: form-data; name="jsonfile"; filename="myjson.json"
Content-Type: application/json
<#INCLUDE *C:\Users\aflemming\source\repos\IfcOpenShell\IfcOpenShell\BIM\myjson.json*#>
---------------------------acebdf13572468
Content-Disposition: form-data; name="image"; filename="DBMICon.png"
Content-Type: image/png
<#INCLUDE *C:\Users\aflemming\Desktop\Images\DBMICon.png*#>
---------------------------acebdf13572468--
Postman Request (This also works):
BIMTrack's REST API: https://api.bimtrackapp.co/swagger/ui/index
I'm happy to provide more information where required.
I found a solution by altering the request method (using requests.request instead of requests.post) and setting the verify=False parameter.
It seems the request was encountering an SSLCertVerificationError, and bypassing certificate verification resolved this.
Final Code:
import requests

image = r"C:\Users\aflemming\Desktop\Images\DBMICon.png"
jsonFile = r"C:\Users\aflemming\source\repos\IfcOpenShell\IfcOpenShell\BIM\myjson.json"
url = "https://api.bimtrackapp.co/v3/hubs/07La7cOZ/projects/20767/issues/3161484/viewpoints"
# Each entry is (field name, (filename, file object, content type))
files = [
    ('Json', ('Json2', open(jsonFile,'rb'), 'application/json')),
    ('Image', ('Image2', open(image,'rb'), 'image/png'))
]
headers = {
    'Authorization': 'Bearer <MyToken>'
}
# verify=False disables TLS certificate verification
response = requests.request("POST", url, headers=headers, files = files, verify=False)
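Note that the final code differs from the first attempt in more than verify=False: each part now also carries a filename, as in the Fiddler capture. One way to see that difference is to build both requests without sending them and read the generated multipart bodies. This is only a sketch: the in-memory bytes and the example.invalid URL are placeholders of mine, not BIMTrack data, and Request.prepare() performs no network I/O.

```python
import requests


def multipart_body(files):
    """Return the multipart body requests would generate for `files`, unsent."""
    req = requests.Request("POST", "https://example.invalid/upload", files=files)
    return req.prepare().body


# Placeholder contents standing in for myjson.json and DBMICon.png.
json_bytes = b'{"placeholder": true}'
png_bytes = b"\x89PNG placeholder"

# First attempt: filename is None, so the parts carry no filename attribute.
without_names = multipart_body({
    "Json": (None, json_bytes, "application/json"),
    "Image": (None, png_bytes, "image/png"),
})

# Final code: a filename is given for each part.
with_names = multipart_body([
    ("Json", ("Json2", json_bytes, "application/json")),
    ("Image", ("Image2", png_bytes, "image/png")),
])

print(without_names.decode("latin-1"))
print(with_names.decode("latin-1"))
```

Comparing the two printouts with the raw Fiddler request shows which Content-Disposition and Content-Type lines each variant produces. Whether the server cares about the filename is not something this sketch can confirm.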
I am using an API that receives a PDF file and does some analysis, but I always receive response 500.
I initially tested with Postman and the request goes through, returning response 200 with the corresponding JSON information. SSL certificate verification has to be turned off.
However, when I try to make the request via Python, I always get response 500.
My Python code:
import requests
url = "https://{{BASE_URL}}/api/v1/documents"
fin = open('/home/train/aab2wieuqcnvn3g6syadumik4bsg5.0062.pdf', 'rb')
files = {'file': fin}
r = requests.post(url, files=files, verify=False)
print (r)
#r.text is empty
Python code generated by Postman:
import requests
url = "https://{{BASE_URL}}/api/v1/documents"
payload = "------WebKitFormBoundary7MA4YWxkTrZu0gW\r\nContent-Disposition: form-data; name=\"file\"; filename=\"aab2wieuqcnvn3g6syadumik4bsg5.0062.pdf\"\r\nContent-Type: application/pdf\r\n\r\n\r\n------WebKitFormBoundary7MA4YWxkTrZu0gW--"
headers = {
    'content-type': "multipart/form-data; boundary=----WebKitFormBoundary7MA4YWxkTrZu0gW",
    'Content-Type': "application/x-www-form-urlencoded",
    'cache-control': "no-cache",
    'Postman-Token': "65f888e2-c1e6-4108-ad76-f698aaf2b542"
}
response = requests.request("POST", url, data=payload, headers=headers)
print(response.text)
I have masked the API link as {{BASE_URL}} for confidentiality.
Response by Postman:
{
    "id": "5e69058e2690d5b0e519cf4006dfdbfeeb5261b935094a2173b2e79a58e80ab5",
    "name": "aab2wieuqcnvn3g6syadumik4bsg5.0062.pdf",
    "fileIds": {
        "original": "5e69058e2690d5b0e519cf4006dfdbfeeb5261b935094a2173b2e79a58e80ab5.pdf"
    },
    "creationDate": "2019-06-20T09:41:59.5930472+00:00"
}
Response by Python:
<Response [500]>
UPDATE:
I tried a GET request and it works fine: I receive the JSON response from it. I guess the problem is in posting the PDF file. Are there any other options for posting a file to an API?
Postman Response RAW:
POST /api/v1/documents
Content-Type: multipart/form-data; boundary=--------------------------375732980407830821611925
cache-control: no-cache
Postman-Token: 3e63d5a1-12cf-4f6b-8f16-3d41534549b9
User-Agent: PostmanRuntime/7.6.0
Accept: */*
Host: {{BASE_URL}}
cookie: c2b8faabe4d7f930c0f28c73aa7cafa9=736a1712f7a3dab03dd48a80403dd4ea
accept-encoding: gzip, deflate
content-length: 3123756
file=[object Object]
HTTP/1.1 200
status: 200
Date: Thu, 20 Jun 2019 10:59:55 GMT
Content-Type: application/json; charset=utf-8
Transfer-Encoding: chunked
Location: /api/v1/files/95463e88527ecdc94393fde685ab1d05fa0ee0b924942f445b14b75e983c927e
api-supported-versions: 1.0
X-Frame-Options: SAMEORIGIN
X-XSS-Protection: 1; mode=block
X-Content-Type-Options: nosniff
Strict-Transport-Security: max-age=31536000; includeSubDomains; preload
Referrer-Policy: strict-origin
{"id":"95463e88527ecdc94393fde685ab1d05fa0ee0b924942f445b14b75e983c927e","name":"aab2wieuqcnvn3g6syadumik4bsg5.0062.pdf","fileIds":{"original":"95463e88527ecdc94393fde685ab1d05fa0ee0b924942f445b14b75e983c927e.pdf"},"creationDate":"2019-06-20T10:59:55.7038573+00:00"}
CORRECT REQUEST
So, eventually - the correct code is the following:
import requests
files = {
    'file': open('/home/train/aab2wieuqcnvn3g6syadumik4bsg5.0062.pdf', 'rb'),
}
response = requests.post('{{BASE_URL}}/api/v1/documents', files=files, verify=False)
print(response.text)
A 500 error indicates an internal server error, not an error with your script.
If you're receiving a 500 error (as opposed to a 400 error, which indicates a bad request), then theoretically your script is fine and it's the server-side code that needs to be adjusted.
In practice, it could still be due to a bad request though.
If you're the one running the API, then you can check the error logs and debug the code line-by-line to figure out why the server is throwing an error.
In this case though, it sounds like it's a third-party API, correct? If so, I recommend looking through their documentation to find a working example or contacting them if you think it's an issue on their end (which is unlikely but possible).
I was able to use Postman to make a POST request with an image and a string parameter. I am not able to do the same when I copy the Python code generated by Postman and run it.
import requests
url = "yyyyyyyyyy"
querystring = {"param1":"xxxxx"}
payload = "------WebKitFormBoundary7MA4YWxkTrZu0gW\r\nContent-Disposition: form-data; name=\"queryFile\"; filename=\"file.jpg\"\r\nContent-Type: image/jpeg\r\n\r\n\r\n------WebKitFormBoundary7MA4YWxkTrZu0gW--"
headers = {
    'content-type': "multipart/form-data; boundary=----WebKitFormBoundary7MA4YWxkTrZu0gW",
    'Authorization': "Bearer yyyyyyyyyyy",
    'cache-control': "no-cache",
    'Postman-Token': "fffffffffff"
}
response = requests.request("POST", url, data=payload, headers=headers, params=querystring)
print(response.text)
{"message":"EMPTY_FILE_NOT_ALLOWED","status":400}
You need to pass a reference to the files you want to upload via the files parameter. See python requests file upload.
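To see why the server answers EMPTY_FILE_NOT_ALLOWED, it helps to look at what the generated payload string really contains: the part headers for queryFile are there, but no file bytes follow them. The sketch below splits the payload from the question on its boundary using only the standard library. The function name part_bodies is mine, and the parsing is deliberately simplistic (it is not a general multipart parser).

```python
BOUNDARY = "----WebKitFormBoundary7MA4YWxkTrZu0gW"

# The payload string from the question, copied as-is.
payload = (
    "------WebKitFormBoundary7MA4YWxkTrZu0gW\r\n"
    "Content-Disposition: form-data; name=\"queryFile\"; filename=\"file.jpg\"\r\n"
    "Content-Type: image/jpeg\r\n\r\n\r\n"
    "------WebKitFormBoundary7MA4YWxkTrZu0gW--"
)


def part_bodies(payload, boundary):
    """Return the body of every part in a multipart/form-data payload string."""
    delimiter = "--" + boundary
    bodies = []
    for chunk in payload.split(delimiter):
        _headers, separator, body = chunk.partition("\r\n\r\n")
        if not separator:
            continue  # preamble or the closing "--" marker, not a real part
        if body.endswith("\r\n"):
            body = body[:-2]  # drop the CRLF that precedes the next delimiter
        bodies.append(body)
    return bodies


print(part_bodies(payload, BOUNDARY))
```

The single part has an empty body, which is consistent with the error message. Passing an opened file through the files parameter lets requests fill in the bytes and the boundary itself.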
Because requests adds unwanted headers, I decided to prepare the request manually and send it with Session.send().
Sadly, the following code produces the wrong request:
import requests
ARCHIVE_URL = "http://10.0.0.10/post/tmp/archive.zip"
headers = {
    'Content-Type': 'application/x-www-form-urlencoded',
    'Cache-Control': 'no-cache',
    'Connection': 'Keep-Alive',
    'Host': '10.0.0.10'
}
DataToSend = 'data'
req = requests.Request('POST', ARCHIVE_URL, data=DataToSend, headers=headers)
prepped = req.prepare()
s = requests.Session()
response = s.send(prepped)
If I look at the request using Fiddler, I get this:
GET http://10.0.0.10/tmp/archive.zip HTTP/1.1
Accept-Encoding: identity
Connection: Keep-Alive
Host: 10.0.0.10
Cache-Control: no-cache
Content-Type: application/x-www-form-urlencoded
What am I missing?
A request prepared with req.prepare() is not connected to the session, unlike one prepared with s.prepare_request(req), where s is the session. You must therefore specify all request headers yourself, since no defaults come from the session object.
Use s.prepare_request(req) instead of req.prepare(), or specify the complete headers dictionary.
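The difference between the two preparation paths can be checked offline, because neither call sends anything. The sketch below reuses the URL and body from the question with a reduced header set (my choice, to keep the output short) and lists the header names each path ends up with.

```python
import requests

headers = {"Cache-Control": "no-cache"}
req = requests.Request(
    "POST",
    "http://10.0.0.10/post/tmp/archive.zip",
    data="data",
    headers=headers,
)

# Standalone preparation: only the headers given above, plus the
# Content-Length that requests computes from the body.
bare = req.prepare()

# Session-aware preparation: the session's default headers
# (User-Agent, Accept, Accept-Encoding, Connection) are merged in.
session = requests.Session()
merged = session.prepare_request(req)

print(sorted(bare.headers))
print(sorted(merged.headers))
```

So req.prepare() is the path to take when the session defaults are unwanted, and s.prepare_request(req) when they are wanted; in both cases the prepared request is then sent with s.send().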
When I use a session from Python's requests module, it seems that the session sends authorization only with the first request. I can't understand why this happens.
import requests
session = requests.Session()
session.auth = (u'user', 'test')
session.verify = False
response = session.get(url='https://my_url/rest/api/1.0/users')
If I look at the request headers of this response, I see:
{'Authorization': 'Basic auth_data', 'Connection': 'keep-alive', 'Accept-Encoding': 'gzip, deflate', 'Accept': '*/*', 'User-Agent': 'python-requests/2.12.3'}
But if I send the next request, whether to the same URL or a different one:
response = session.get(url='https://my_url/rest/api/1.0/users')
I can see that there is no auth header in the request anymore:
print(response.request.headers)
{'Connection': 'keep-alive', 'Accept-Encoding': 'gzip, deflate', 'Accept': '*/*', 'User-Agent': 'python-requests/2.12.3'}
And I'm getting 401 response because of it.
Why is it so? Shouldn't session send auth with every request made using it?
How can I send auth data with every request using session?
What I see when I run that exact code in your comment is that the Authorization header is missing in the first print, yet it is present in the second. This seems to be the opposite of the problem that you report.
This is explained by the fact that the first request is redirected by a 301 response, and the auth header is not propagated in the follow-up request to the redirected location. You can see that the auth header was sent in the initial request by looking in response.history[0].request.headers.
The second request is not redirected because the session has kept the connection to the host open (due to the Connection: keep-alive header), so the auth headers appear when you print response.request.headers.
I doubt that you are actually using https://test.com, but probably a similar thing is happening with the server that you are using.
For testing I recommend using the very handy public test HTTP server https://httpbin.org/headers. This will return the headers received by the server in the response body. You can test redirected requests with one of the redirect URLs.
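The rule requests applies when deciding whether to keep the Authorization header across a redirect can be probed without any network traffic. This sketch assumes a requests version that exposes Session.should_strip_auth (to my knowledge, 2.20 and later; the 2.12.3 release from the question predates it), and the example.com URLs are placeholders.

```python
import requests

session = requests.Session()

# Hypothetical URLs, used only to illustrate the rule.
original = "https://api.example.com/rest/api/1.0/users"
same_host = "https://api.example.com/rest/api/1.0/users/"
other_host = "https://www.example.com/rest/api/1.0/users"

# True means the Authorization header would be dropped on that redirect.
strip_same_host = session.should_strip_auth(original, same_host)
strip_other_host = session.should_strip_auth(original, other_host)

print(strip_same_host, strip_other_host)
```

A redirect to a different host is the case where the header is dropped, which matches the behavior described above; response.history shows whether such a redirect took place for a real request.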
I didn't find any reliable answer on how to pass auth info when using a requests session in Python, so here is what I found:
import requests

with requests.sessions.Session() as session:
    session.auth = ("username", "password")
    # Make any requests here without providing auth info again
    session.get("http://www.example.com/users")
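To confirm that session.auth is applied to every request prepared through the session, and not only the first one, the requests can be prepared without being sent. This is a small offline sketch; the /groups path is a made-up second endpoint.

```python
import base64

import requests

with requests.Session() as session:
    session.auth = ("username", "password")

    # The Basic auth value that should appear on every prepared request.
    expected = "Basic " + base64.b64encode(b"username:password").decode("ascii")

    auth_headers = []
    for path in ("users", "groups"):  # "groups" is a hypothetical endpoint
        req = requests.Request("GET", f"http://www.example.com/{path}")
        prepared = session.prepare_request(req)  # no network I/O happens here
        auth_headers.append(prepared.headers.get("Authorization"))

print(auth_headers)
```

Both entries carry the same Authorization value, so a header that goes missing on a real request is more likely to have been dropped later, for example while following a redirect.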