Can't reach nested JSON in python 3.8

Can't reach nested JSON in python 3.8 - python

Working with a response from a Websockets subscription.
The response reads like this:
{'jsonrpc': '2.0', 'method': 'subscription', 'params': {'channel': 'book.BTC-PERPETUAL.none.1.100ms', 'data': {'timestamp': 1588975154127, 'instrument_name': 'BTC-PERPETUAL', 'change_id': 19078703948, 'bids': [[10019.5, 8530.0]], 'asks': [[10020.0, 506290.0]]}}}
And I'm trying to reach the first and only values inside "bids" and "asks" arrays via json.loads()
Code looks like this:
async def __async__get_ticks(self):
async with self.ws as echo:
await echo.send(json.dumps(self.request))
while True:
response = await echo.receive()
responseJson = json.loads(response)
print(responseJson["params"]["data"])
And error says:
print(responseJson["params"]["data"])
KeyError: 'params'
However I try, it doesn't want to catch any of the JSON after "jsonprc", for which it successfully returns 2.0. Anything beyond that always comes up with an error.
I tried using .get(), and it helps to go one level deeper, but still not more.
Any ideas on how to format this properly and reach the bids and asks ?
Thank you in advance.

I would suggest using the dict.get() method, but make sure that you set it to return an empty dictionary when querying dictionaries that are expected to have nested dicts.
By default (if you don't specify a second argument to dict.get()), it will return None. This explains why you were only able to go one level deep.
Here's an example:
empty_dict = {}
two_level_dict = {
"one": {
"level": "deeper!"
}
}
# This will return None and the second get call will not fail, because
# the first get returned an empty dict for the .get("level") call to succeed.
first_get = empty_dict.get("one", {}).get("level")
# This will return 'deeper!'
second_get = two_level_dict.get("one", {}).get("level")
print(first_get)
print(second_get)

Related

How to access specific parts of JSON for API call - Python

I want to access 'employmentName' and 'jobTile' values.
But I keep on getting this error - KeyError: 'jobTitle' and KeyError: 'employmentName'
Here is my code below, please note I am not sharing the api URL on this question but it is present and working in the code.
api_three_url = "https://xxxx"
json_data_three = requests.get(api_three_url, headers=my_headers).json()
#print(json_data_three) #this prints the whole json data (working!)
print("So you have previously worked in", json_data_three['employmentName'],"as a", json_data_three['jobTitle'])
This is what the JSON looks like in python dict:
{
'hasAddedValue': False,
'employments': [{
'id': 527,
'employmentName': 'Sallys hair',
'jobTitle': 'Stylists',
'jobStartDate': '2019-03',
'jobEndDate': '2020-04',
'jobType': 'PAID',
'status': True,
'isJobCurrent': False
}]
}
Please guide me to where I am going wrong.
Many thanks :)

The employmentName and jobType fields are not at the top level in your JSON, hence Python is giving you KeyError. They are actually in one of the objects inside employments.
So if you want to obtain those fields from the first object inside employments, it would be:
json_data_three['employments'][0]['employmentName']

How to speed up returning a 20MB Json file from a Python-Flask application?

I am trying to call an API which in turn triggers a store procedure from our sqlserver database. This is how I coded it.
class Api_Name(Resource):
def __init__(self):
pass
#classmethod
def get(self):
try:
engine = database_engine
connection = engine.connect()
sql = "DECLARE #return_value int EXEC #return_value = [dbname].[dbo].[proc_name])
return call_proc(sql, apiname, starttime, connection)
except Exception as e:
return {'message': 'Proc execution failed with error => {error}'.format(error=e)}, 400
pass
call_proc is the method where I return the JSON from database.
def call_proc(sql: str, connection):
try:
json_data = []
rv = connection.execute(sql)
for result in rv:
json_data.append(dict(zip(result.keys(), result)))
return Response(json.dumps(json_data), status=200)
except Exception as e:
return {'message': '{error}'.format(error=e)}, 400
finally:
connection.close()
The problem with the output is the way JSON is returned and the size of it.
At first the API used to take 1minute 30seconds: when the return statement was like this:
case1: return Response(json.dumps(json_data), status=200, mimetype='application/json')
After looking online, I found that the above statement is trying to prettify JSON. So I removed mimetype from the response & made it as
case2: return Response(json.dumps(json_data), status=200)
The API runs for 30seconds, although the JSON output is not aligned properly but its still JSON.
I see the output size of the JSON returned from the API is close 20MB. I observed this on postman response:
Status: 200 OK Time: 29s Size: 19MB
The difference in Json output:
case1:
[ {
"col1":"val1",
"col2":"val2"
},
{
"col1":"val1",
"col2":"val2"
}
]
case2:
[{"col1":"val1","col2":"val2"},{"col1":"val1","col2":"val2"}]
Will the difference in output from the two aforementioned cases are different ? If so, how can I fix the problem ?
If there is no difference, is there any way I speed up this further and reduce the run time further more, like compressing the JSON which I am returning ?

You can use gzip compression to make your plain text weight from Megabytes to even Kilobytes. Or even use flask-compress library for that.
Also I'd suggest to use ujson to make dump() call faster.
import gzip
from flask import make_response
import ujson as json
#app.route('/data.json')
def compress():
compression_level = 5 # of 9 max
data = [
{"col1": "val1", "col2": "val2"},
{"col1": "val1", "col2": "val2"}
]
content = gzip.compress(json.dumps(data).encode('utf8'), compression_level)
response = make_response(content)
response.headers['Content-length'] = len(content)
response.headers['Content-Encoding'] = 'gzip'
return response
Documentation:
https://docs.python.org/3/library/gzip.html
https://github.com/colour-science/flask-compress
https://pypi.org/project/ujson/

First of all, profile: if 90% the time is being spent transferring across the network then optimising processing speed is less useful than optimising transfer speed (for example, by compressing the response as wowkin recommended (though the web server may be configured to do this automatically, if you are using one)
Assuming that constructing the JSON is slow, if you control the database code you could use its JSON capabilities to serialise the data, and avoid doing it at the Python layer. For example,
SELECT col1, col2
FROM tbl
WHERE col3 > 42
FOR JSON AUTO
would give you
[
{
"col1": "foo",
"col2": 1
},
{
"col1": "bar",
"col2": 2
},
...
]
Nested structures can be created too, described in the docs.
If the requester only needs the data, return it as a download using flask's send_file feature and avoid the cost of constructing an HTML response:
from io import BytesIO
from flask import send_file
def call_proc(sql: str, connection):
try:
rv = connection.execute(sql)
json_data = rv.fetchone()[0]
# BytesIO expects encoded data; if you can get the server to encode
# the data instead it may be faster.
encoded_json = json_data.encode('utf-8')
buf = BytesIO(encoded_json)
return send_file(buf, mimetype='application/json', as_attachment=True, conditional=True)
except Exception as e:
return {'message': '{error}'.format(error=e)}, 400
finally:
connection.close()

You need to implement pagination on your API. 19MB is absurdly large and will lead to some very annoyed users.
gzip and clevererness with the JSON responses will sadly not be enough, you'll need to put in a bit more legwork.
Luckily, there's many pagination questions and answers, and Flasks modular approach to things will mean that someone probably wrote up a module that's applicable to your problem. I'd start off by re-implementing the method with an ORM. I heard that sqlalchemy is quite good.

To answer your question:
1 - Both JSON are semantically identical.
You can make use of http://www.jsondiff.com to compare two JSON.
2 - I would recommend you to make chunks of your data and send it across network.
This might help:
https://masnun.com/2016/09/18/python-using-the-requests-module-to-download-large-files-efficiently.html

TL;DR; Try restructuring your JSON payload (i.e. change schema)
I see that you are constructing the JSON response in one of your APIs. Currently, your JSON payload looks something like:
[
{
"col0": "val00",
"col1": "val01"
},
{
"col0": "val10",
"col1": "val11"
}
...
]
I suggest you restructure it in such a way that each (first level) key in your JSON represents the entire column. So, for the above case, it will become something like:
{
"col0": ["val00", "val10", "val20", ...],
"col1": ["val01", "val11", "val21", ...]
}
Here are the results from some offline test I performed.
Experiment variables:
NUMBER_OF_COLUMNS = 10
NUMBER_OF_ROWS = 100000
LENGTH_OF_STR_DATA = 5
#!/usr/bin/env python3
import json
NUMBER_OF_COLUMNS = 10
NUMBER_OF_ROWS = 100000
LENGTH_OF_STR_DATA = 5
def get_column_name(id_):
return 'col%d' % id_
def random_data():
import string
import random
return ''.join(random.choices(string.ascii_letters, k=LENGTH_OF_STR_DATA))
def get_row():
return {
get_column_name(i): random_data()
for i in range(NUMBER_OF_COLUMNS)
}
# data1 has same schema as your JSON
data1 = [
get_row() for _ in range(NUMBER_OF_ROWS)
]
with open("/var/tmp/1.json", "w") as f:
json.dump(data1, f)
def get_column():
return [random_data() for _ in range(NUMBER_OF_ROWS)]
# data2 has the new proposed schema, to help you reduce the size
data2 = {
get_column_name(i): get_column()
for i in range(NUMBER_OF_COLUMNS)
}
with open("/var/tmp/2.json", "w") as f:
json.dump(data2, f)
Comparing sizes of the two JSONs:
$ du -h /var/tmp/1.json
17M
$ du -h /var/tmp/2.json
8.6M
In this case, it almost got reduced by half.
I would suggest you do the following:
First and foremost, profile your code to see the real culprit. If it is really the payload size, proceed further.
Try to change your JSON's schema (as suggested above)
Compress your payload before sending (either from your Flask WSGI app layer or your webserver level - if you are running your Flask app behind some production grade webserver like Apache or Nginx)

For large data that you can't paginate using something like ndjson (or any type of delimited record format) can really reduce the server resources needed since you'd be preventing holding the JSON object in memory. You would need to get access to the response stream to write each object/line to the response though.
The response
[ {
"col1":"val1",
"col2":"val2"
},
{
"col1":"val1",
"col2":"val2"
}
]
Would end up looking like
{"col1":"val1","col2":"val2"}
{"col1":"val1","col2":"val2"}
This also has advantages on the client since you can parse and process each line on it's own as well.
If you aren't dealing with nested data structures responding with a CSV is going to be even smaller.

I want to note that there is a standard way to write a sequence of separate records in JSON, and it's described in RFC 7464. For each record:
Write the record separator byte (0x1E).
Write the JSON record, which is a regular JSON document that can also contain inner line breaks, in UTF-8.
Write the line feed byte (0x0A).
(Note that the JSON text sequence format, as it's called, uses a more liberal syntax for parsing text sequences of this kind; see the RFC for details.)
In your example, the JSON text sequence would look as follows, where \x1E and \x0A are the record separator and line feed bytes, respectively:
\x1E{"col1":"val1","col2":"val2"}\x0A\x1E{"col1":"val1","col2":"val2"}\x0A
Since the JSON text sequence format allows inner line breaks, you can write each JSON record as you naturally would, as in the following example:
\x1E{
"col1":"val1",
"col2":"val2"}
\x0A\x1E{
"col1":"val1",
"col2":"val2"
}\x0A
Notice that the media type for JSON text sequences is not application/json, but application/json-seq; see the RFC.

python/chatterbot: get_response different behaviour on print and saving in dict

Currently I'm having a problem setting up a simple REST service with flask and a chatterbot. You can see the full code here.
The goal is, that the service returns a json with a response from the chatbot to a given request.
The problem is, that when I want to save the response from the chatbot in a dict:
dialog = {
"id": 1,
"usersay": request,
# chatterbot function to get a response from the bot
"botsay": chatbot.get_response(request)
}
It will be saved as a chatterbot special Statement Object like which will then look like this:
"botsay": <Statement text:bot response>
When I try to jsonify a dict with this object I get the following error:
TypeError: Can't convert 'Statement' object to str implicitly
I searched online to find a solution but havent found anything helpful. In addition, I'm not experienced with python.
What is absolutely unexplainable for me is, when I use
>>> request = "Hi"
>>> print(chatbot.get_response(request))
I will get the correct output
> Hello
I just want to save the plain response in the dict so can I return it as a json to the client.
Could anyone explain the problem?
Thanks in advance!

Problem solved by simple accessing the "text" attribute of the Statement Object with the . notation (see heere).
>>> response = chatterbot.get_response("Hi")
>>> dialog = { ..., "botsay" = response.text, ... }
>>> print dialog
{ ..., "botsay": "Hello", ...}

"TypeError" simple get method in python tornado to access record from Mongodb

Hi I have recently started programming in Python (I am newbie to python programming). I have a small collection of data in my MongoDB.I have written a simple get method to find all the data from my collection. But I have an error returning the fetched value.
Here is my code:
import bson
from bson import json_util
from bson.json_util import dumps
class TypeList(APIHandler):
#gen.coroutine
def get(self):
doc = yield db.vtype.find_one()
print(doc)
a = self.write(json_util.dumps(doc))
return a
def options(self):
pass
It gives me the fetched data.
But when I replace these lines
a = self.write....
return a
with return bson.json_util.dumps({ 'success': True, 'mycollectionKey': doc })
it gives me a type error.
TypeError: Expected None, got {'success': True, 'mycollectionKey': {'type': 1, 'item': 'cookie'}}
Can anyone explain me why I get this error and is there anyway to solve the problem.
Thanks in advance.

RequestHandler.get() is not supposed to return anything. This error is simply warning you that you returned a value that is being ignored. Tornado handlers produce output by calling self.write(), not by returning a value.

Python beatbox salesforce external Id upsert

I'm trying to upsert a record via the salesforce Beatbox python client the upsert operation seems to work fine but I can't quite work out how to specify an externalid as a foreign key:
Attempting to upsert with:
consolidatedToInsert = []
for id,ce in ConsolidatedEbills.items():
consolidatedToInsert.append(
{
'type':'consolidated_ebill__c',
'Account__r':{'type':'Account','ETL_Natural_Key__c':ce['CLASS_REFERENCE']},
'ETL_Natural_Key__c':ce['ISSUE_UNIQUE_ID']
}
)
print consolidatedToInsert[0]
pc.login('USERNAME', 'TOTALLYREALPASSWORD')
ret = pc.upsert('ETL_Natural_Key__c',consolidatedToInsert[0])
print ret
gives the error:
'The external foreign key reference does not reference a valid entity: Account__r'
[{'isCreated': False, 'errors': [{'fields': [], 'message': 'The external foreign key reference does not reference a valid entity: Account__r', 'statusCode': 'INVALID_FIEL
D'}], 'id': '', 'success': False, 'created': False}]
The soap examples and the specificity of the error text seem to indicate that it's possible but I can find little in the documentation about inserting with external ids.
On a closer look I'm not sure if this is possible at all, a totally mangled key to Account__r seems to pass silently as if it's not even being targeted for XML translation, I'd love to be wrong though.
A quick change to pythonclient.py 422:0:
for k,v in field_dict.items():
if v is None:
fieldsToNull.append(k)
field_dict[k] = []
if k.endswith('__r') and isinstance(v,dict):
pass
elif hasattr(v,'__iter__'):
if len(v) == 0:
fieldsToNull.append(k)
else:
field_dict[k] = ";".join(v)
and another to __beatbox.py 375:0
for fn in sObjects.keys():
if (fn != 'type'):
if (isinstance(sObjects[fn],dict)):
self.writeSObjects(s, sObjects[fn], fn)
else:
s.writeStringElement(_sobjectNs, fn, sObjects[fn])
and it works like some dark magic.

Currently Beatbox doesn't support serializing nested dictionaries like this, which is needed for the externalId resolution you're trying to do. (If you look at the generated request, you can see that the nested dictionary is just serialized as a string)).

We Keep Coding

Python is a programming language that lets you work quickly and integrate systems more effectively.

Can't reach nested JSON in python 3.8 - python

Related

How to access specific parts of JSON for API call - Python

How to speed up returning a 20MB Json file from a Python-Flask application?

python/chatterbot: get_response different behaviour on print and saving in dict

"TypeError" simple get method in python tornado to access record from Mongodb

Python beatbox salesforce external Id upsert

Categories

Resources