Generate single dictionary from a list of dictionaries - python

I have a method that takes a list of field names. In the method, I am making an API call out to get a record which will contain a list of dictionaries of fields.
API call example:
"fields": [
{
"datetime_value": "1987-02-03T00:00:00",
"name": "birth_date"
},
{
"text_value": "Dennis",
"name": "first_name"
},
{
"text_value": "Monsewicz",
"name": "last_name"
},
{
"text_value": "Male",
"name": "sex"
},
{
"text_value": "White",
"name": "socks"
}
]
My method signature looks like contact(contact_id, contact_fields), where contact_fields looks like ['last_name', 'first_name'].
The final fields dictionary I am trying to create would look like (not worried about order):
{
"last_name": "Monsewicz",
"first_name": "Dennis"
}
So, basically, generate a single dictionary where the key is the name attribute from each dictionary in the list (and the value is that dictionary's value), but only if the name is in the list of field names passed into the method.
I've tried this:
"fields": {x: y for x, y in contact['fields'] if x in contact_fields}

Something like this?
>>> fields
[{'datetime_value': '1987-02-03T00:00:00', 'name': 'birth_date'},
{'name': 'first_name', 'text_value': 'Dennis'},
{'name': 'last_name', 'text_value': 'Monsewicz'},
{'name': 'sex', 'text_value': 'Male'},
{'name': 'socks', 'text_value': 'White'}]
>>> output = {}
>>> for field in fields:
...     key = field.pop('name')
...     _unused_key, value = field.popitem()
...     output[key] = value
...
>>> output
{'birth_date': '1987-02-03T00:00:00',
'first_name': 'Dennis',
'last_name': 'Monsewicz',
'sex': 'Male',
'socks': 'White'}

How about this one-liner?
output = dict((x['name'], x['text_value']) for x in fields)
It basically loops through fields, pulls out name/text_value pairs, then constructs a dict from them.
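Note that the fields list in the question mixes text_value and datetime_value, and the question also wants to keep only the names listed in contact_fields. A minimal sketch that handles both (value_for is a hypothetical helper; it assumes each field dict carries exactly one value key besides name):
def value_for(field):
    # return the single value stored next to "name" (text_value, datetime_value, ...)
    return next(v for k, v in field.items() if k != 'name')

contact_fields = ['last_name', 'first_name']
fields = {f['name']: value_for(f) for f in contact['fields']
          if f['name'] in contact_fields}
# {'last_name': 'Monsewicz', 'first_name': 'Dennis'}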


Comparing dictionary of list of dictionary/nested dictionary

There are two dicts, main and input. I want to validate input so that all the keys in its list of dictionaries and in the nested dictionaries (all keys are optional, if present) match those of main; if not, the wrong/different key should be returned as the output.
main = "app":[{
"name": str,
"info": [
{
"role": str,
"scope": {"groups": list}
}
]
},{
"name": str,
"info": [
{"role": str}
]
}]
input_data = "app":[{
'name': 'nms',
'info': [
{
'role': 'user',
'scope': {'groups': ['xyz']
}
}]
},{
'name': 'abc',
'info': [
{'rol': 'user'}
]
}]
When input is compared with main, the wrong/different key should be given as output, in this case:
['rol']
The schema module does exactly this.
You can catch SchemaUnexpectedTypeError to see which data doesn't match your pattern.
Also, make sure you don't use the word input as a variable name, as it's the name of a built-in function.
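A minimal sketch of that idea with the schema package (assuming the data is wrapped in outer braces as shown further below; Optional marks keys that may be absent, and the base SchemaError is caught here for simplicity):
from schema import Schema, Optional, SchemaError

app_schema = Schema({
    "app": [{
        "name": str,
        "info": [{
            "role": str,
            Optional("scope"): {"groups": list},
        }],
    }],
})

try:
    app_schema.validate(input_data)
except SchemaError as exc:
    # the error message typically names the offending key, e.g. 'rol'
    print(exc)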
keys = []

def print_dict(d):
    # recursively collect every key found in dicts and lists of dicts
    if type(d) == dict:
        for val in d.keys():
            df = d[val]
            try:
                if type(df) == list:
                    for i in range(0, len(df)):
                        if type(df[i]) == dict:
                            print_dict(df[i])
            except AttributeError:
                pass
            keys.append(val)
    else:
        try:
            x = d[0]
            if type(x) == dict:
                print_dict(d[0])
        except:
            pass
    return keys

keys_input = print_dict(input_data)
keys = []  # reset the accumulator before walking the second dict
keys_main = print_dict(main)
print(keys_input)
print(keys_main)
# keep only the keys of input_data that do not appear in main
for i in keys_input[:]:
    if i in keys_main:
        keys_input.remove(i)
print(keys_input)
This has worked for me. You can check the code snippet above; if any changes are required, please provide more information.
Dictionaries and lists compare their content recursively by default.
input_data == main should give the right result if you format your dicts correctly. Try adding curly brackets "{"/"}" around your dicts. It should look something like this:
main = {"app": [{
"name": str,
"info": [
{
"role": str,
"scope": {"groups": list}
}
]
},{
"name": str,
"info": [
{"role": str}
]
}]}
input_data = {"app":[{
'name': 'nms',
'info': [
{
'role': 'user',
'scope': {'groups': ['xyz']
}
}]
},{
'name': 'abc',
'info': [
{'rol': 'user'}
]
}]}
input_data2 = {"app": [{
'name': 'nms',
'info': [
{
'role': 'user',
'scope': {'groups': ['xyz']
}
}]
}, {
'name': 'abc',
'info': [
{'rol': 'user'}
]
}]}
Comparison results should look like this:
input_data2 == input_data # True
main == input_data # False

need to turn JSON values into keys

I have some json that I would like to transform from this:
[
{
"name":"field1",
"intValue":"1"
},
{
"name":"field2",
"intValue":"2"
},
...
{
"name":"fieldN",
"intValue":"N"
}
]
into this:
{ "field1" : "1",
"field2" : "2",
...
"fieldN" : "N",
}
For each pair, I need to change the value of the name field to a key, and the values of the intValue field to a value. This doesn't seem like flattening or denormalizing. Are there any tools that might do this out-of-the-box, or will this have to be brute-forced? What's the most pythonic way to accomplish this?
parameters = [ # assuming this is loaded already
{
"name":"field1",
"intValue":"1"
},
{
"name":"field2",
"intValue":"2"
},
{
"name":"fieldN",
"intValue":"N"
}
]
field_int_map = dict()
for p in parameters:
    field_int_map[p['name']] = p['intValue']
yields {'field1': '1', 'field2': '2', 'fieldN': 'N'}
or as a dict comprehension
field_int_map = {p['name']:p['intValue'] for p in parameters}
This combines the name attribute with the intValue as key: value pairs, but the result is a dictionary rather than the original input type, which was a list.
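If the original list shape is needed again afterwards, a minimal sketch of the reverse transformation (assuming the same "name"/"intValue" layout):
parameters_again = [{"name": name, "intValue": value}
                    for name, value in field_int_map.items()]
# [{'name': 'field1', 'intValue': '1'}, {'name': 'field2', 'intValue': '2'}, ...]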
Use dictionary comprehension:
json_dct = {"parameters":
[
{
"name":"field1",
"intValue":"1"
},
{
"name":"field2",
"intValue":"2"
},
{
"name":"fieldN",
"intValue":"N"
}
]}
dct = {d["name"]: d["intValue"] for d in json_dct["parameters"]}
print(dct)
# {'field1': '1', 'field2': '2', 'fieldN': 'N'}
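If the data starts out as a JSON string rather than an already-loaded list, a minimal sketch (json_text is a placeholder name for your input):
import json

json_text = '[{"name": "field1", "intValue": "1"}, {"name": "field2", "intValue": "2"}]'
parameters = json.loads(json_text)
result = {p["name"]: p["intValue"] for p in parameters}
print(json.dumps(result))  # {"field1": "1", "field2": "2"}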

Merge dictionaries with same key from two lists of dicts in python

I have two dictionaries, as below. Both dictionaries have a list of dictionaries as the value associated with their properties key; each dictionary within these lists has an id key. I wish to merge my two dictionaries into one such that the properties list in the resulting dictionary only has one dictionary for each id.
{
"name":"harry",
"properties":[
{
"id":"N3",
"status":"OPEN",
"type":"energetic"
},
{
"id":"N5",
"status":"OPEN",
"type":"hot"
}
]
}
and the other dictionary:
{
"name":"harry",
"properties":[
{
"id":"N3",
"type":"energetic",
"language": "english"
},
{
"id":"N6",
"status":"OPEN",
"type":"cool"
}
]
}
The output I am trying to achieve is:
"name":"harry",
"properties":[
{
"id":"N3",
"status":"OPEN",
"type":"energetic",
"language": "english"
},
{
"id":"N5",
"status":"OPEN",
"type":"hot"
},
{
"id":"N6",
"status":"OPEN",
"type":"cool"
}
]
}
Since id N3 is common to both lists, those two dicts should be merged with all their fields. So far I have tried using itertools and:
ds = [d1, d2]
d = {}
for k in d1.keys():
    d[k] = tuple(d[k] for d in ds)
Could someone please help in figuring this out?
Here is one approach:
a = {
"name":"harry",
"properties":[
{
"id":"N3",
"status":"OPEN",
"type":"energetic"
},
{
"id":"N5",
"status":"OPEN",
"type":"hot"
}
]
}
b = {
"name":"harry",
"properties":[
{
"id":"N3",
"type":"energetic",
"language": "english"
},
{
"id":"N6",
"status":"OPEN",
"type":"cool"
}
]
}
# Create dicts mapping each id to its index in the respective 'properties' list
a_ids = {item['id']: index for index, item in enumerate(a['properties'])}  # {'N3': 0, 'N5': 1}
b_ids = {item['id']: index for index, item in enumerate(b['properties'])}  # {'N3': 0, 'N6': 1}

# Loop through one of the dicts created above
for id in a_ids.keys():
    # If the same id exists in the other dict, merge the two entries
    if id in b_ids:
        b['properties'][b_ids[id]].update(a['properties'][a_ids[id]])
    # If it does not exist, just append the new dict
    else:
        b['properties'].append(a['properties'][a_ids[id]])

print(b)
Output:
{'name': 'harry', 'properties': [{'id': 'N3', 'type': 'energetic', 'language': 'english', 'status': 'OPEN'}, {'id': 'N6', 'status': 'OPEN', 'type': 'cool'}, {'id': 'N5', 'status': 'OPEN', 'type': 'hot'}]}
It might help to treat the two objects as elements each in their own lists. Maybe you have other objects with different name values, such as might come out of a JSON-formatted REST request.
Then you could do a left outer join on both name and id keys:
#!/usr/bin/env python
a = [
{
"name": "harry",
"properties": [
{
"id":"N3",
"status":"OPEN",
"type":"energetic"
},
{
"id":"N5",
"status":"OPEN",
"type":"hot"
}
]
}
]
b = [
{
"name": "harry",
"properties": [
{
"id":"N3",
"type":"energetic",
"language": "english"
},
{
"id":"N6",
"status":"OPEN",
"type":"cool"
}
]
}
]
a_names = set()
a_prop_ids_by_name = {}
a_by_name = {}
for ao in a:
    an = ao['name']
    a_names.add(an)
    if an not in a_prop_ids_by_name:
        a_prop_ids_by_name[an] = set()
    for ap in ao['properties']:
        api = ap['id']
        a_prop_ids_by_name[an].add(api)
    a_by_name[an] = ao

res = []
for bo in b:
    bn = bo['name']
    if bn not in a_names:
        res.append(bo)
    else:
        ao = a_by_name[bn]
        bp = bo['properties']
        for bpo in bp:
            if bpo['id'] not in a_prop_ids_by_name[bn]:
                ao['properties'].append(bpo)
        res.append(ao)
print(res)
The idea above is to process list a for names and ids. The names and ids-by-name are instances of a Python set. So members are always unique.
Once you have these sets, you can do the left outer join on the contents of list b.
Either there's an object in b that doesn't exist in a (i.e. doesn't share a common name), in which case you add that object to the result as-is. Or there is an object in b that does exist in a (shares a common name), in which case you iterate over that object's id values and look for ids not already in the a ids-by-name set; you add the missing properties to a, and then add that processed object to the result.
Output:
[{'name': 'harry', 'properties': [{'id': 'N3', 'status': 'OPEN', 'type': 'energetic'}, {'id': 'N5', 'status': 'OPEN', 'type': 'hot'}, {'id': 'N6', 'status': 'OPEN', 'type': 'cool'}]}]
This doesn't do any error checking on input. This relies on name values being unique per object. So if you have duplicate keys in objects in both lists, you may get garbage (incorrect or unexpected output).

Remove duplicate dictionary from a list on the basis of key value by priority

Suppose I have the following type of list of dictionaries:
iterlist = [
{"Name": "Mike", "Type": "Admin"},
{"Name": "Mike", "Type": "Writer"},
{"Name": "Mike", "Type": "Reader"},
{"Name": "Zeke", "Type": "Writer"},
{"Name": "Zeke", "Type": "Reader"}
]
I want to remove duplicates of "Name" on the basis of "Type" by the following priority (Admin > Writer > Reader), so the end result should be:
iterlist = [
{"Name": "Mike", "Type": "Admin"},
{"Name": "Zeke", "Type": "Writer"}
]
I found a similar question but it removes duplicates for one explicit type of key-value: Link
Can someone please guide me on how to move forward with this?
This is a modified form of the solution suggested by @azro; their solution and the other solution do not take the priority you mentioned into account. You can get around that with the following code, which uses a priority dict.
iterlist = [
{"Name": "Mike", "Type": "Writer"},
{"Name": "Mike", "Type": "Reader"},
{"Name": "Mike", "Type": "Admin"},
{"Name": "Zeke", "Type": "Reader"},
{"Name": "Zeke", "Type": "Writer"}
]
from itertools import groupby

# this dict is used to rank the priorities
priorities = {i: idx for idx, i in enumerate(['Admin', 'Writer', 'Reader'])}
sort_key = lambda x: (x['Name'], priorities[x['Type']])
groupby_key = lambda x: x['Name']
result = [next(i[1]) for i in groupby(sorted(iterlist, key=sort_key), key=groupby_key)]
print(result)
Output
[{'Name': 'Mike', 'Type': 'Admin'}, {'Name': 'Zeke', 'Type': 'Writer'}]
You can also use pandas in the following way:
transform the list of dictionaries to a data frame:
import pandas as pd
df = pd.DataFrame(iterlist)
create a mapping dict:
m = {'Admin': 3, 'Writer': 2, 'Reader': 1}
create a priority column using replace:
df['pri'] = df['Type'].replace(m)
sort_values by pri and groupby by Name and get the first element only:
df = df.sort_values('pri', ascending=False).groupby('Name').first().reset_index()
drop the pri column and return to dictionary using to_dict:
df.drop('pri', axis='columns').to_dict(orient='records')
This will give you the following:
[{'Name': 'Mike', 'Type': 'Admin'}, {'Name': 'Zeke', 'Type': 'Writer'}]
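Putting those steps together (a sketch assembled from the fragments above, using iterlist as defined in the question):
import pandas as pd

iterlist = [
    {"Name": "Mike", "Type": "Admin"},
    {"Name": "Mike", "Type": "Writer"},
    {"Name": "Mike", "Type": "Reader"},
    {"Name": "Zeke", "Type": "Writer"},
    {"Name": "Zeke", "Type": "Reader"},
]

m = {'Admin': 3, 'Writer': 2, 'Reader': 1}

df = pd.DataFrame(iterlist)
df['pri'] = df['Type'].replace(m)
df = df.sort_values('pri', ascending=False).groupby('Name').first().reset_index()
print(df.drop('pri', axis='columns').to_dict(orient='records'))
# [{'Name': 'Mike', 'Type': 'Admin'}, {'Name': 'Zeke', 'Type': 'Writer'}]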
Here is a solution you can try out:
unique = {}
for v in iterlist:
    # keep only the first entry seen for each Name
    if not unique.get(v['Name']):
        unique[v['Name']] = v
print(unique.values())
dict_values([{'Name': 'Mike', 'Type': 'Admin'}, {'Name': 'Zeke', 'Type': 'Writer'}])

How to access a MongoDB array in which key-value pairs are stored, by key name

I am working with pymongo, and after writing the aggregate query
db.collection.aggregate([{'$project': {'Id': '$ResultData.Id','data' : '$Results.Data'}}])
I received the object:
{'data': [{'key': 'valid', 'value': 'true'},
{'key': 'number', 'value': '543543'},
{'key': 'name', 'value': 'Saturdays cx'},
{'key': 'message', 'value': 'it is valid.'},
{'key': 'city', 'value': 'London'},
{'key': 'street', 'value': 'Bigeye'},
{'key': 'pc', 'value': '3566'}],
Is there a way that I can access the values by key name, like '$Results.Data.city', so that I receive London? I would like to do that at the level of the MongoDB aggregate query, which means I want to write a query like this:
db.collection.aggregate([{'$project':
{'Id': '$ResultData.Id',
'data' : '$Results.Data',
'city' : '$Results.Data.city',
'name' : '$Results.Data.name',
'street' : '$Results.Data.street',
'pc' : '$Results.Data.pc',
}}])
And receive all the values of provided keys.
Use the $elemMatch projection operator in the following query from the mongo shell:
db.collection.find(
{ _id: <some_value> },
{ _id: 0, data: { $elemMatch: { key: "city" } } }
)
The output:
{ "data" : [ { "key" : "city", "value" : "London" } ] }
Using PyMongo (gets the same output):
collection.find_one(
{ '_id': <some_value> },
{ '_id': 0, 'data': { '$elemMatch': { 'key': 'city' } } }
)
Using PyMongo aggregate method (gets the same result):
import pprint

INPUT_KEY = 'city'

pipeline = [
    {
        '$project': {
            '_id': 0,
            'data': {
                '$filter': {
                    'input': '$data', 'as': 'dat',
                    'cond': { '$eq': [ '$$dat.key', INPUT_KEY ] }
                }
            }
        }
    }
]

pprint.pprint(list(collection.aggregate(pipeline)))
Naming the received object result: if result['data'] is always a list of dictionaries with two keys (key and value), you can convert the whole list into a dictionary, using the keys as keys and the values as values. Since that statement is somewhat confusing, here's the code:
data = {pair['key']: pair['value'] for pair in result['data']}
From here, data['city'] will give you 'London', data['street'] will be 'Bigeye', and so on. Obviously, this assumes that there are no conflicting keys in result['data']. Note that this dictionary will (just like the original result['data']) only contain strings, so don't expect data['number'] to be an integer.
Another approach would be to dynamically create an object holding each key-value pair as an attribute, allowing you to use the following syntax: data.city, data.street, ... But this would require more complicated code and is a less common and less robust approach.
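For completeness, a minimal sketch of that attribute-style access using only the standard library (assuming data is the dict built above):
from types import SimpleNamespace

obj = SimpleNamespace(**data)
print(obj.city)    # 'London'
print(obj.street)  # 'Bigeye'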
