How to properly parse JSON with simplejson? - python

I can have the following JSON string:
{ "response" : [ [ { "name" : "LA_",
"uid" : 123456
} ],
[ { "cid" : "1",
"name" : "Something"
} ],
[ { "cid" : 1,
"name" : "Something-else"
} ]
] }
or one of the following:
{"error":"some-error"}
{ "response" : [ [ { "name" : "LA_",
"uid" : 123456
} ],
[ { "cid" : "1",
"name" : ""
} ],
[ { "cid" : 1,
"name" : "Something-else"
} ]
] }
{ "response" : [ [ { "name" : "LA_",
"uid" : 123456
} ] ] }
So, I am not sure if all childs and elements are there. Will it be enough to do the following verifications to get Something value:
if jsonstr.get('response'):
jsonstr = jsonstr.get('response')[1][0]
if jsonstr:
name = jsonstr.get('name')
if jsonstr: # I don't need empty value
# save in the database
Can the same be simplified?

You're not guaranteed that the ordering of your inner objects will be the same every time you parse it, so indexing is not a safe bet to reference the index of the object with the name attribute set to Something.
Instead of nesting all those if statements, you can get away with using a list comprehension. Observe that if you iterate the response key, you get a list of lists, each with a dictionary inside of it:
>>> data = {"response":[[{"uid":123456,"name":"LA_"}],[{"cid":"1","name":"Something"}],[{"cid":1,"name":"Something-else"}]]}
>>> [lst for lst in data.get('response')]
[[{'name': 'LA_', 'uid': 123456}], [{'name': 'Something', 'cid': '1'}], [{'name': 'Something-else', 'cid': 1}]]
If you index the first item in each list (lst[0]), you end up with a list of objects:
>>> [lst[0] for lst in data.get('response')]
[{'name': 'LA_', 'uid': 123456}, {'name': 'Something', 'cid': '1'}, {'name': 'Something-else', 'cid': 1}]
If you then add an if condition into your list comprehension to match the name attribute on the objects, you get a list with a single item containing your desired object:
>>> [lst[0] for lst in data.get('response') if lst[0].get('name') == 'Something']
[{'name': 'Something', 'cid': '1'}]
And then by indexing the first item that final list, you get the desired object:
>>> [lst[0] for lst in data.get('response') if lst[0].get('name') == 'Something'][0]
{'name': 'Something', 'cid': '1'}
So then you can just turn that into a function and move on with your life:
def get_obj_by_name(data, name):
objects = [lst[0] for lst in data.get('response', []) if lst[0].get('name') == name]
if objects:
return objects[0]
return None
print get_obj_by_name(data, 'Something')
# => {'name': 'Something', 'cid': '1'}
print get_obj_by_name(data, 'Something')['name']
# => 'Something'
And it should be resilient and return None if the response key isn't found:
print get_obj_by_name({"error":"some-error"}, 'Something')
# => None

Related

Comparing dictionary of list of dictionary/nested dictionary

There are two dict main and input, I want to validate the "input" such that all the keys in the list of dictionary and nested dictionary (if present/all keys are optional) matches that of the main if not the wrong/different key should be returned as the output.
main = "app":[{
"name": str,
"info": [
{
"role": str,
"scope": {"groups": list}
}
]
},{
"name": str,
"info": [
{"role": str}
]
}]
input_data = "app":[{
'name': 'nms',
'info': [
{
'role': 'user',
'scope': {'groups': ['xyz']
}
}]
},{
'name': 'abc',
'info': [
{'rol': 'user'}
]
}]
when compared input with main the wrong/different key should be given as output, in this case
['rol']
The schema module does exactly this.
You can catch SchemaUnexpectedTypeError to see which data doesn't match your pattern.
Also, make sure you don't use the word input as a variable name, as it's the name of a built-in function.
keys = []
def print_dict(d):
if type(d) == dict:
for val in d.keys():
df = d[val]
try:
if type(df) == list:
for i in range(0,len(df)):
if type(df[i]) == dict:
print_dict(df[i])
except AttributeError:
pass
keys.append(val)
else:
try:
x = d[0]
if type(x) == dict:
print_dict(d[0])
except:
pass
return keys
keys_input = print_dict(input)
keys = []
keys_main = print_dict(main)
print(keys_input)
print(keys_main)
for i in keys_input[:]:
if i in keys_main:
keys_input.remove(i)
print(keys_input)
This has worked for me. you can check above code snippet and if any changes provide more information so any chances if required.
Dictionary and lists compare theire content nested by default.
input_data == main should result in the right output if you format your dicts correctly. Try adding curly brackets "{"/"}" arround your dicts. It should probably look like something like this:
main = {"app": [{
"name": str,
"info": [
{
"role": str,
"scope": {"groups": list}
}
]
},{
"name": str,
"info": [
{"role": str}
]
}]}
input_data = {"app":[{
'name': 'nms',
'info': [
{
'role': 'user',
'scope': {'groups': ['xyz']
}
}]
},{
'name': 'abc',
'info': [
{'rol': 'user'}
]
}]}
input_data2 = {"app": [{
'name': 'nms',
'info': [
{
'role': 'user',
'scope': {'groups': ['xyz']
}
}]
}, {
'name': 'abc',
'info': [
{'rol': 'user'}
]
}]}
Comparision results should look like this:
input_data2 == input_data # True
main == input_data # False

need to turn JSON values into keys

I have some json that I would like to transform from this:
[
{
"name":"field1",
"intValue":"1"
},
{
"name":"field2",
"intValue":"2"
},
...
{
"name":"fieldN",
"intValue":"N"
}
]
into this:
{ "field1" : "1",
"field2" : "2",
...
"fieldN" : "N",
}
For each pair, I need to change the value of the name field to a key, and the values of the intValue field to a value. This doesn't seem like flattening or denormalizing. Are there any tools that might do this out-of-the-box, or will this have to be brute-forced? What's the most pythonic way to accomplish this?
parameters = [ # assuming this is loaded already
{
"name":"field1",
"intValue":"1"
},
{
"name":"field2",
"intValue":"2"
},
{
"name":"fieldN",
"intValue":"N"
}
]
field_int_map = dict()
for p in parameters:
field_int_map[p['name']] = p['intValue']
yields {'field1': '1', 'field2': '2', 'fieldN': 'N'}
or as a dict comprehension
field_int_map = {p['name']:p['intValue'] for p in parameters}
This works to combine the name attribute with the intValue as key:value pairs, but the result is a dictionary instead of the original input type which was a list.
Use dictionary comprehension:
json_dct = {"parameters":
[
{
"name":"field1",
"intValue":"1"
},
{
"name":"field2",
"intValue":"2"
},
{
"name":"fieldN",
"intValue":"N"
}
]}
dct = {d["name"]: d["intValue"] for d in json_dct["parameters"]}
print(dct)
# {'field1': '1', 'field2': '2', 'fieldN': 'N'}

Merge dictionaries with same key from two lists of dicts in python

I have two dictionaries, as below. Both dictionaries have a list of dictionaries as the value associated with their properties key; each dictionary within these lists has an id key. I wish to merge my two dictionaries into one such that the properties list in the resulting dictionary only has one dictionary for each id.
{
"name":"harry",
"properties":[
{
"id":"N3",
"status":"OPEN",
"type":"energetic"
},
{
"id":"N5",
"status":"OPEN",
"type":"hot"
}
]
}
and the other list:
{
"name":"harry",
"properties":[
{
"id":"N3",
"type":"energetic",
"language": "english"
},
{
"id":"N6",
"status":"OPEN",
"type":"cool"
}
]
}
The output I am trying to achieve is:
"name":"harry",
"properties":[
{
"id":"N3",
"status":"OPEN",
"type":"energetic",
"language": "english"
},
{
"id":"N5",
"status":"OPEN",
"type":"hot"
},
{
"id":"N6",
"status":"OPEN",
"type":"cool"
}
]
}
As id: N3 is common in both the lists, those 2 dicts should be merged with all the fields. So far I have tried using itertools and
ds = [d1, d2]
d = {}
for k in d1.keys():
d[k] = tuple(d[k] for d in ds)
Could someone please help in figuring this out?
Here is one of the approach:
a = {
"name":"harry",
"properties":[
{
"id":"N3",
"status":"OPEN",
"type":"energetic"
},
{
"id":"N5",
"status":"OPEN",
"type":"hot"
}
]
}
b = {
"name":"harry",
"properties":[
{
"id":"N3",
"type":"energetic",
"language": "english"
},
{
"id":"N6",
"status":"OPEN",
"type":"cool"
}
]
}
# Create dic maintaining the index of each id in resp dict
a_ids = {item['id']: index for index,item in enumerate(a['properties'])} #{'N3': 0, 'N5': 1}
b_ids = {item['id']: index for index,item in enumerate(b['properties'])} #{'N3': 0, 'N6': 1}
# Loop through one of the dict created
for id in a_ids.keys():
# If same ID exists in another dict, update it with the key value
if id in b_ids:
b['properties'][b_ids[id]].update(a['properties'][a_ids[id]])
# If it does not exist, then just append the new dict
else:
b['properties'].append(a['properties'][a_ids[id]])
print (b)
Output:
{'name': 'harry', 'properties': [{'id': 'N3', 'type': 'energetic', 'language': 'english', 'status': 'OPEN'}, {'id': 'N6', 'status': 'OPEN', 'type': 'cool'}, {'id': 'N5', 'status': 'OPEN', 'type': 'hot'}]}
It might help to treat the two objects as elements each in their own lists. Maybe you have other objects with different name values, such as might come out of a JSON-formatted REST request.
Then you could do a left outer join on both name and id keys:
#!/usr/bin/env python
a = [
{
"name": "harry",
"properties": [
{
"id":"N3",
"status":"OPEN",
"type":"energetic"
},
{
"id":"N5",
"status":"OPEN",
"type":"hot"
}
]
}
]
b = [
{
"name": "harry",
"properties": [
{
"id":"N3",
"type":"energetic",
"language": "english"
},
{
"id":"N6",
"status":"OPEN",
"type":"cool"
}
]
}
]
a_names = set()
a_prop_ids_by_name = {}
a_by_name = {}
for ao in a:
an = ao['name']
a_names.add(an)
if an not in a_prop_ids_by_name:
a_prop_ids_by_name[an] = set()
for ap in ao['properties']:
api = ap['id']
a_prop_ids_by_name[an].add(api)
a_by_name[an] = ao
res = []
for bo in b:
bn = bo['name']
if bn not in a_names:
res.append(bo)
else:
ao = a_by_name[bn]
bp = bo['properties']
for bpo in bp:
if bpo['id'] not in a_prop_ids_by_name[bn]:
ao['properties'].append(bpo)
res.append(ao)
print(res)
The idea above is to process list a for names and ids. The names and ids-by-name are instances of a Python set. So members are always unique.
Once you have these sets, you can do the left outer join on the contents of list b.
Either there's an object in b that doesn't exist in a (i.e. shares a common name), in which case you add that object to the result as-is. But if there is an object in b that does exist in a (which shares a common name), then you iterate over that object's id values and look for ids not already in the a ids-by-name set. You add missing properties to a, and then add that processed object to the result.
Output:
[{'name': 'harry', 'properties': [{'id': 'N3', 'status': 'OPEN', 'type': 'energetic'}, {'id': 'N5', 'status': 'OPEN', 'type': 'hot'}, {'id': 'N6', 'status': 'OPEN', 'type': 'cool'}]}]
This doesn't do any error checking on input. This relies on name values being unique per object. So if you have duplicate keys in objects in both lists, you may get garbage (incorrect or unexpected output).

How to access MongoDB array in which is are stored key-value pairs by key name

I am working with pymongo and after writing aggregate query
db.collection.aggregate([{'$project': {'Id': '$ResultData.Id','data' : '$Results.Data'}}])
I received the object:
{'data': [{'key': 'valid', 'value': 'true'},
{'key': 'number', 'value': '543543'},
{'key': 'name', 'value': 'Saturdays cx'},
{'key': 'message', 'value': 'it is valid.'},
{'key': 'city', 'value': 'London'},
{'key': 'street', 'value': 'Bigeye'},
{'key': 'pc', 'value': '3566'}],
Is there a way that I can access the values by the key name? Like that '$Results.Data.city' and I will receive London. I would like to do that on the level of MongoDB aggregate query so it means I want to write a query in the way:
db.collection.aggregate([{'$project':
{'Id': '$ResultData.Id',
'data' : '$Results.Data',
'city' : $Results.Data.city',
'name' : $Results.Data.name',
'street' : $Results.Data.street',
'pc' : $Results.Data.pc',
}}])
And receive all the values of provided keys.
Using the $elemMatch projection operator in the following query from mongo shell:
db.collection.find(
{ _id: <some_value> },
{ _id: 0, data: { $elemMatch: { key: "city" } } }
)
The output:
{ "data" : [ { "key" : "city", "value" : "London" } ] }
Using PyMongo (gets the same output):
collection.find_one(
{ '_id': <some_value> },
{ '_id': 0, 'data': { '$elemMatch': { 'key': 'city' } } }
)
Using PyMongo aggregate method (gets the same result):
pipeline = [
{
'$project': {
'_id': 0,
'data': {
'$filter': {
'input': '$data', 'as': 'dat',
'cond': { '$eq': [ '$$dat.key', INPUT_KEY ] }
}
}
}
}
]
INPUT_KEY = 'city'
pprint.pprint(list(collection.aggregate(pipeline)))
Naming the received object "result", if result['data'] always is a list of dictionaries with 2 keys (key and value), you can convert the whole list to a dictionary using keys as keys and values as values. Given that this statement is somewhat confusing, here's the code:
data = {pair['key']: pair['value'] for pair in result['data']}
From here, data['city'] will give you 'London', data['street'] will be 'Bigeye' and so on. Obviously, this assumes that there are no conflicts amoung key values in result['data']. Note that this dictionary will (just as the original result['data']) only contain strings so don't expect data['number'] to be an integer.
Another approach would be to dynamically create an object holding each key-value pair as an attribute, allowing you to use the following syntax: data.city, data.street, ... But this would required more complicated code and is a less common and stable approach.

How to add new key into dictionary like this [{ {]. This looks more like a dictionary inside a list

I would like to add new key into the dictionary list. Example:
"label" : [] (with empty list)
[
{
"Next" : {
"seed" : [
{
"Argument" : [
{
"id" : 4,
"label" : "org"
},
{
"id" : "I"
},
{
"word" : "He",
"seed" : 2,
"id" : 3,
"label" : "object"
},
{
"word" : "Gets",
"seed" : 9,
"id" : 2,
"label" : "verb"
}
]
}
],
"Next" : "he,get",
"time" : ""
}
}
]
I tried to use loop into "seed" and then to "argument" then use .update("label":[]) in the loop but it won't work. Can anyone please give me an example of using for loop to loop from beginning then to add these new "label"?
My prefered goal: ( to have extra "label" within the dictionary according to my input)
Example:
[
{
"Next" : {
"seed" : [
{
"Argument" : [
{
"id" : 4,
"label" : "org"
},
{
"id" : "I"
},
{
"word" : "He",
"seed" : 2,
"id" : 3,
"label" : "object"
},
{
"word" : "Gets",
"seed" : 9,
"id" : 2,
"label" : "verb"
},
{
"id" : 5,
"label" : "EXTRA"
},
{
"id" : 6,
"label" : "EXTRA"
},
{
"id" : 7,
"label" : "EXTRA"
}
]
}
],
"Next" : "he,get",
"time" : ""
}
}
]
I am new to dictionary so really need help with this
If I understand your problem correctly, you want to add 'label' to dict in Argument where there is no label. You could do it like so -
for i in x[0]['Next']['seed'][0]['Argument']:
if not 'label' in i.keys():
i['label'] = []
Where x is your dict. But what's x[0]['Next']['seed'][0]['Argument']:?
Let's simplify your dict -
x = [{'Next': {'Next': 'he,get',
'seed': [{'Argument': [{these}, {are}, {your}, {dicts}]}],
'time': ''}}]
How did we reach here?
Let's see-
x = [{'Next'(parent dict): {'Next'(child of previous 'Next'):{},
'seed(child of previous 'Next')':[{these}, {are}, {your}, {dicts}](a list of dicts)}]
I hope that makes sense. And to add more dictionaries in Argument
# create a function that returns a dict
import random # I don't know how you want ids, so this
def create_dicts():
return {"id": random.randint(1, 10), "label": ""}
for i in range(3): # 3 is how many dicts you want to push in Argument
x[0]['Next']['seed'][0]['Argument'].append(create_dicts())
Now your dict will become -
[{'Next': {'Next': 'he,get',
'seed': [{'Argument': [{'id': 4, 'label': 'org'},
{'id': 'I'},
{'id': 3, 'label': 'object', 'seed': 2, 'word': 'He'},
{'id': 2, 'label': 'verb', 'seed': 9, 'word': 'Gets'},
{'id': 1, 'label': ''},
{'id': 4, 'label': ''},
{'id': 4, 'label': ''}]}],
'time': ''}}]
First things first: access the list of dict that need to be updated.
according to your given structure that's l[0]["Next"]["seed"][0]["Argument"]
Then iterate that list and check if label already exists, if it does not then add it as an empty list.
This can be done by explicit checking:
if "label" not in i:
i["label"] = []
or by re-assigning:
i["label"] = i.get("label", [])
Full Code:
import pprint
l = [ {
"Next" : {
"seed" : [ {
"Argument" : [ {
"id" : 4,
"label" : "org"
}, {
"id" : "I"
}, {
"word" : "He",
"seed" : 2,
"id" : 3,
"label" : "object"
}, {
"word" : "Gets",
"seed" : 9,
"id" : 2,
"label" : "verb"
} ]
} ],
"Next" : "he,get",
"time" : ""
} }]
# access the list of dict that needs to be updated
l2 = l[0]["Next"]["seed"][0]["Argument"]
for i in l2:
i["label"] = i.get("label", []) # use the existing label or add an empty list
pprint.pprint(l)
Output:
[{'Next': {'Next': 'he,get',
'seed': [{'Argument': [{'id': 4, 'label': 'org'},
{'id': 'I', 'label': []},
{'id': 3,
'label': 'object',
'seed': 2,
'word': 'He'},
{'id': 2,
'label': 'verb',
'seed': 9,
'word': 'Gets'}]}],
'time': ''}}]
You have a list with one nested dictionary. Get the list of the inner dicts, and iterate. Assuming your initial data structure is named data
dict_list = data[0]['Next']['seed'][0]['Argument']
for item in dict_list:
item['label'] = input()

Categories