Python - Outputting to .JSON with results from Microsoft's Computer Vision API

I'm trying to write the response from Microsoft's Computer Vision API to a .json file, which works with every other API I've used so far. With the code below, taken directly from Microsoft's documentation, I get this error:
Error: the JSON object must be str, not 'bytes'
If I remove parsed = json.loads(data) and call print(json.dumps(data, sort_keys=True, indent=2)) instead, the output shows the image information I want, but it is still reported as an error: the data is prefixed with b, denoting that it is in bytes, and the message ends with "is not JSON serializable".
I just want to get the response into a .json file, as I can with other APIs, and I'm at a loss for how to convert it in a way that works.
import http.client, urllib.request, urllib.parse, urllib.error, base64, json
API_KEY = '{API_KEY}'
uri_base = 'westus.api.cognitive.microsoft.com'
headers = {
    'Content-Type': 'application/json',
    'Ocp-Apim-Subscription-Key': API_KEY,
}
params = urllib.parse.urlencode(
    {
        'visualFeatures': 'Categories, Description, Color',
        'language': 'en',
    }
)
body = "{'url': 'http://i.imgur.com/WgPtc53.jpg'}"
try:
    conn = http.client.HTTPSConnection(uri_base)
    conn.request('POST', '/vision/v1.0/analyze?%s' % params, body, headers)
    response = conn.getresponse()
    data = response.read()
    # 'data' contains the JSON data. The following formats the JSON data for display.
    parsed = json.loads(data)
    print ("Response:")
    print (json.dumps(parsed, sort_keys=True, indent=2))
    conn.close()
except Exception as e:
    print('Error:')
    print(e)
Shortly after posting the question, I realized what I had missed: the bytes just need to be converted to a string.
I found the question "Convert bytes to a Python string" and was able to modify my code to:
parsed = json.loads(data.decode('utf-8'))
That resolved my issue. The code now runs without errors and I can export to a .json file as I needed.
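The fix above can be combined with the export step. The following is a minimal sketch, assuming the response body is UTF-8 encoded JSON; the function name save_response, the sample bytes, and the output file name are hypothetical and not part of the original code.

```python
import json


def save_response(data: bytes, path: str) -> dict:
    """Decode a raw HTTP response body and write it to a .json file."""
    # response.read() returns bytes; decode them to str before parsing.
    parsed = json.loads(data.decode('utf-8'))
    # Write the parsed object back out as formatted JSON.
    with open(path, 'w', encoding='utf-8') as fh:
        json.dump(parsed, fh, sort_keys=True, indent=2)
    return parsed


if __name__ == '__main__':
    # Hypothetical sample standing in for the bytes returned by response.read().
    sample = b'{"requestId": "hypothetical", "categories": []}'
    print(save_response(sample, 'response.json'))
```

In the question's code, the call would take the place of the json.loads line, for example save_response(data, 'response.json').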

Related

Decoding JSON that contains Base64

I'm sending a request for a set of images to one of my APIs. The API returns these images as JSON. Each document contains data about the resource together with a single property that holds the image as Base64.
An example of the JSON being returned:
{
    "id": 548613,
    "filename": "00548613.png",
    "pictureTaken": "2020-03-30T11:38:21.003",
    "isVisible": true,
    "lotcode": 23,
    "company": "05",
    "concern": "46",
    "base64": "..."
}
[Image: the correct content of the Base64]
[Image: the incorrectly parsed Base64]
This is done with the Python 3 requests library. When I receive a successful response from the API, I attempt to decode the body as JSON using:
url = self.__url__(f"/rest/all/V1/products/{sku}/images")
headers = self.__headers__()
r = requests.get(url=url, headers=headers)
if r.status_code == 200:
    return r.json()
elif r.status_code == 404:
    return None
else:
    raise IOError(
        f"Error retrieving product '{sku}', got {r.status_code}: '{r.text}'")
Calling .json() results in the Base64 content being corrupted: some parts are missing and some are replaced with other characters. After seeing this post, I tried manually decoding the content using r.content.decode() with the utf-8 and ascii options to see whether that was the problem. Sadly, this didn't work.
I know the response from the server is correct: it works with Postman, and calling print(r.content) results in a JSON document containing the valid Base64.
How would I go about deserializing the response from the API to get the valid Base64?

import base64
import re
...
b64text = re.search(b"\"base64\": \"(?P<base>.*)\"", r.content, flags=re.MULTILINE).group("base")
decode = base64.b64decode(b64text).decode('utf-8')
Since you say that calling print(r.content) shows the valid Base64, it's just a matter of extracting that value from the raw bytes and decoding it.
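A self-contained variant of the same idea is sketched below, assuming the raw body is available as bytes and contains a single "base64" property. The function name extract_image_bytes and the sample payload are hypothetical. Unlike the snippet above, the result is kept as bytes, since an image such as a PNG is binary data and generally cannot be decoded as UTF-8 text.

```python
import base64
import re

# Match the value of the "base64" property in the raw JSON bytes.
# [^"]* stops at the closing quote; Base64 text never contains a quote.
_B64_FIELD = re.compile(rb'"base64"\s*:\s*"(?P<base>[^"]*)"')


def extract_image_bytes(raw: bytes) -> bytes:
    """Pull the Base64 value out of the raw response body and decode it."""
    match = _B64_FIELD.search(raw)
    if match is None:
        raise ValueError('no "base64" property found in response body')
    return base64.b64decode(match.group('base'))


if __name__ == '__main__':
    # Hypothetical sample standing in for r.content.
    payload = b'{"id": 548613, "base64": "' + base64.b64encode(b'\x89PNG') + b'"}'
    print(extract_image_bytes(payload))
```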

Encode file as base64 and send to API

I'm currently having issues sending a file to an API. I've manually tested my script's Base64 output by printing it to the screen and pasting it directly into the API's sandbox, which works correctly, but when I package it up in JSON ready to send, it no longer works.
What I need to send to the API is this:
{
    "content": "mybase64encodedfilestuff"
}
and my Python code is:
with open(filename, "rb") as image_file:
    encoded_string = base64.b64encode(image_file.read())
encoded_string = encoded_string.decode("utf-8")
payload = {}
payload['content'] = encoded_string
json_payload = json.dumps(payload)
I then send this to the API as:
r = requests.post(url='https://api.example.com/uploads', data=payload,
                  headers={'Content-Type': 'application/json',
                           'Authorization': 'Basic '+api_string}, timeout=5)
I feel like I've missed something simple but can't figure it out: I just get an error 400, "please provide valid content first". If I make the payload a copy and paste of the print output, it works.

I had converted my payload to JSON (json_payload) but then passed the original payload to requests.post, not the JSON-encoded one.
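A minimal sketch of the corrected flow follows, assuming the API expects a JSON body with a single "content" property as shown in the question. The helper names build_json_body and upload are hypothetical; requests is a third-party package and is imported only inside upload so that the body-building step can be tried on its own.

```python
import base64
import json


def build_json_body(file_bytes: bytes) -> str:
    """Base64-encode the file and wrap it in the JSON document the API expects."""
    encoded = base64.b64encode(file_bytes).decode('utf-8')
    return json.dumps({'content': encoded})


def upload(url: str, file_bytes: bytes, api_string: str):
    """Send the JSON-encoded string (not the dict) as the request body."""
    import requests  # third-party; imported lazily so the helper above runs without it
    return requests.post(
        url,
        data=build_json_body(file_bytes),  # the string from json.dumps, not the dict
        headers={'Content-Type': 'application/json',
                 'Authorization': 'Basic ' + api_string},
        timeout=5,
    )


if __name__ == '__main__':
    print(build_json_body(b'hello'))
```

Passing a dict to data= makes requests form-encode it, which is why the server did not see valid JSON in the original code.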

ValueError: Data must not be a string. in Python [duplicate]

I am trying to do the following with requests:
data = {'hello': 'goodbye'}
json_data = json.dumps(data)
headers = {
    'Access-Key': self.api_key,
    'Access-Signature': signature,
    'Access-Nonce': nonce,
    'Content-Type': 'application/json',
    'Accept': 'text/plain'
}
r = requests.post(url, headers=headers, data=json_data,
                  files={'file': open('/Users/david/Desktop/a.png', 'rb')})
However, I get the following error:
ValueError: Data must not be a string.
Note that if I remove the files parameter, it works as needed. Why won't requests allow me to send a JSON-encoded string for data if files is included?
Note also that if I change data to be just the normal Python dictionary (and not a JSON-encoded string), the above works. So it seems that if files is not JSON-encoded, then data cannot be JSON-encoded either. However, I need to have my data encoded to match a hash signature that's being created by the API.

When you set the request body to a JSON string, you can no longer attach a file, since file uploading requires the MIME type multipart/form-data.
You have two options:
1. Encapsulate your JSON string as part of the form data (something like json => json.dumps(data)).
2. Encode your file in Base64 and transmit it in the JSON request body. This looks like a lot of work, though.
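The first option can be sketched as follows, assuming the server reads the JSON string from a form field named json; that field name, and the helper names form_fields and upload, are hypothetical. requests is third-party and is imported only inside upload.

```python
import json


def form_fields(data: dict) -> dict:
    """Wrap the JSON-encoded data in a single form field."""
    # The value is the exact string that would otherwise have been the body,
    # so it can still be used as input for a signature.
    return {'json': json.dumps(data)}


def upload(url: str, data: dict, path: str, headers: dict):
    """Post the JSON string as a form field next to the file."""
    import requests  # third-party; imported lazily so form_fields runs without it
    with open(path, 'rb') as fh:
        # data is a dict here, so requests can combine it with files into one
        # multipart/form-data body. The headers passed in should not set
        # Content-Type, so that requests can generate the multipart header itself.
        return requests.post(url, headers=headers,
                             data=form_fields(data), files={'file': fh})


if __name__ == '__main__':
    print(form_fields({'hello': 'goodbye'}))
```

Whether the receiving API accepts this layout, and how its signature is computed over it, depends on the API and is not covered by this sketch.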

1. Remove the line
json_data = json.dumps(data)
and pass data=data in the request instead.
2. Remove 'Content-Type': 'application/json' from the headers.
This worked for me.

An alternative solution to this problem is to post the data as a file.
You can post strings as files. Read more here:
http://docs.python-requests.org/en/latest/user/quickstart/#post-a-multipart-encoded-file
Posting multiple files is explained here:
http://docs.python-requests.org/en/latest/user/advanced/#post-multiple-multipart-encoded-files

Removing the following header helped in my case:
'Content-Type': 'application/json'
The data should then be passed as a dictionary.

If your files are small, you could simply convert the binary (image or anything else) to a Base64 string and send that as JSON to the API. That is much simpler and more straightforward than the other suggested solutions. The currently accepted answer claims that this is a lot of work, but it's really simple.
Client:
import json
from base64 import b64encode
import requests
with open('/Users/houmie/Downloads/log.zip', 'rb') as f:
    raw = f.read()  # renamed from 'bytes' to avoid shadowing the built-in
tb = b64encode(raw)
tb_str = tb.decode('utf-8')
body = {'logfile': tb_str}
r = requests.post('https://prod/feedback', data=json.dumps(body), headers=headers)
API:
import base64
import json
def create(event, context):
    data = json.loads(event["body"])
    if "logfile" in data:
        tb_back = data["logfile"].encode('utf-8')
        zip_data = base64.b64decode(tb_back)

Project Oxford Speaker Recognition- Invalid Audio Format

I have been trying to use the Project Oxford Speaker Recognition API
(https://dev.projectoxford.ai/docs/services/563309b6778daf02acc0a508/operations/5645c3271984551c84ec6797).
I can successfully record sound from my microphone and convert it to the required WAV format (PCM, 16-bit, 16K, mono).
The problem is that when I post this file as a binary stream to the API, it returns an "Invalid Audio Format" error message.
The same file is accepted by the demo on the website (https://www.projectoxford.ai/demo/SPID).
I am using Python 2.7 with this code:
import httplib
import urllib
import base64
import json
import codecs
headers = {
    # Request headers
    'Content-Type': 'application/octet-stream',
    'Ocp-Apim-Subscription-Key': '{KEY}',
}
params = urllib.urlencode({
})
def enroll(audioId):
    conn = httplib.HTTPSConnection('api.projectoxford.ai')
    file = open('test.wav','rb')
    body = file.read()
    conn.request("POST", "/spid/v1.0/verificationProfiles/" + audioId +"/enroll?%s" % params, str(body), headers)
    response = conn.getresponse()
    data = response.read()
    print data
    conn.close()
    return data
And this is the response that I am getting:
{
    "error": {
        "code": "BadRequest",
        "message": "Invalid Audio Format"
    }
}
Can anyone point out what I am missing? I have checked all the properties of the audio file against the requirements of the API, but with no luck.
All answers and comments are appreciated.

I sent this file to Project Oxford with my test program, which is written in Ruby, and it works properly. I think the issue might be in the other parameters you are sending. Try changing your 'Content-Type' header to 'audio/wav; samplerate=1600'; this is the header that I used. I also send a 'Content-Length' header with the size of the file. I'm not sure whether 'Content-Length' is required, but it is good practice to include it.
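The headers described above could be built along these lines. This is a sketch in Python 3 rather than the Python 2.7 of the question, and it uses only the standard library. The function name wav_headers is hypothetical, the sample rate is read from the WAV file itself rather than hard-coded, and whether the service accepts this exact header form is an assumption based on the answer above.

```python
import io
import wave


def wav_headers(wav_bytes: bytes, api_key: str) -> dict:
    """Build request headers for posting a WAV file as the raw request body."""
    # Read the sample rate from the WAV header instead of hard-coding it.
    with wave.open(io.BytesIO(wav_bytes), 'rb') as wav:
        rate = wav.getframerate()
    return {
        'Content-Type': 'audio/wav; samplerate=%d' % rate,
        # Size of the body in bytes, as suggested in the answer above.
        'Content-Length': str(len(wav_bytes)),
        'Ocp-Apim-Subscription-Key': api_key,
    }
```

The file would be read with open('test.wav', 'rb').read() and the same bytes passed unchanged both to wav_headers and as the request body.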
