Pythonic way to map between two dicts with one having nested keys - python

I have dicts of two types representing the same data. They are consumed by two different channels, hence their keys are different.
For example:
Type A
{
    "key1": "value1",
    "key2": "value2",
    "nestedKey1": {
        "key3": "value3",
        "key4": "value4"
    }
}
Type B
{
    "equiKey1": "value1",
    "equiKey2": "value2",
    "equinestedKey1.key3": "value3",
    "equinestedKey1.key4": "value4"
}
I want to map data from Type B to Type A.
Currently I am creating it as below:
{
    "key1": typeBObj.get("equiKey1"),
    .....
}
Is there a better and faster way to do that in Python?

First, you need a dictionary mapping keys in B to keys (or rather lists of keys) in A. (If the keys follow the pattern from your question, or a similar pattern, this dict might also be generated.)
B_to_A = {
    "equiKey1": ["key1"],
    "equiKey2": ["key2"],
    "equinestedKey1.key3": ["nestedKey1", "key3"],
    "equinestedKey1.key4": ["nestedKey1", "key4"]
}
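If the B keys really do follow the pattern from the question (an "equi" prefix plus a dot-separated path into A), this mapping could be generated instead of written by hand. A minimal sketch, assuming exactly that pattern and that the Type B dict is bound to the name B:
def build_B_to_A(b_keys, prefix="equi"):
    # Hypothetical helper: strip the assumed "equi" prefix, lowercase the first
    # character of the remainder, and split the dotted path into a list of A keys
    mapping = {}
    for k in b_keys:
        path = k[len(prefix):]
        mapping[k] = (path[0].lower() + path[1:]).split(".")
    return mapping

B_to_A = build_B_to_A(B.keys())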
Then you can define a function for translating those keys.
def map_B_to_A(d):
    res = {}
    for key, val in d.items():
        r = res
        *head, last = B_to_A[key]
        # Walk (and create, if needed) the nested dicts for all but the last key
        for k in head:
            r = r.setdefault(k, {})
        r[last] = val
    return res
print(map_B_to_A(B) == A) # True
Or a bit shorter, but probably less clear, using reduce:
from functools import reduce

def map_B_to_A(d):
    res = {}
    for key, val in d.items():
        *head, last = B_to_A[key]
        reduce(lambda cur, k: cur.setdefault(k, {}), head, res)[last] = val
    return res
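For reference, a self-contained check with the sample data from the question might look like this (binding the Type A and Type B dicts to the names A and B):
A = {"key1": "value1", "key2": "value2",
     "nestedKey1": {"key3": "value3", "key4": "value4"}}
B = {"equiKey1": "value1", "equiKey2": "value2",
     "equinestedKey1.key3": "value3", "equinestedKey1.key4": "value4"}

print(map_B_to_A(B) == A)  # True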

Related

Access dictionary values using list values as subsequent keys

keys = ['prop1', 'prop2', 'prop3']
dict = { 'prop1': { 'prop2': { 'prop3': True } } }
How do I get the value True out of the dict using the list?
Not having any success with
val = reduce((lambda a, b: dict[b]), keys)
update:
keys and dict can be arbitrarily long, but will always have matching properties/keys.
Using a loop:
>>> a = ['prop1', 'prop2', 'prop3']
>>> d = {'prop1': {'prop2': {'prop3': True}}}
>>> result = d
>>> for k in a:
... result = result[k]
...
>>> result
True
Using a functional style:
>>> from functools import reduce
>>> reduce(dict.get, a, d)
True
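A related option, if you would rather get a KeyError for a missing key instead of dict.get silently returning None partway through the chain, is operator.getitem:
>>> import operator
>>> reduce(operator.getitem, a, d)
True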
EDIT:
As the OP rephrased the question, here is an update:
Actually, you don't need the keys at all to get True.
You can use a recursive function to do it nicely without knowing the keys.
d = { 'prop1': { 'prop2': { 'prop3': True } } }

def d_c(dc):
    # Descend into the first (and only) value until it is no longer a dict
    if isinstance(list(dc.values())[0], dict):
        return d_c(list(dc.values())[0])
    return list(dc.values())[0]
Result:
True

Create unique dictionaries for two keys in a list of dictionaries

I am getting a list of dictionaries in the following format from an API, e.g.:
xlist = [
    {"id": 1, "day": 2, "name": "abc", ...},
    {"id": 1, "day": 3, "name": "abc", ...},
    {"id": 1, "day": 2, "name": "xyz", ...},
    {"id": 1, "day": 3, "name": "xyz", ...}, ...
]
So, to store the data and optimize the DB queries, I have to convert them into the format below.
What is an efficient (or otherwise better) way to generate the following structure?
unique_xlist = [
    {"id": 1, "day": 2, "name": ["abc", "xyz"], ...},
    {"id": 1, "day": 3, "name": ["abc", "xyz"], ...},
]
What I am doing:
names = list(set([v['name'] for v in xlist]))  # -- GET UNIQUE NAMES
templist = [{k: (names if k == 'name' else v)
             for k, v in obj.items()} for obj in xlist]  # -- Append unique names
unique_xlist = {v['day']: v for v in templist}.values()  # -- find unique dicts
I don't think this is very efficient; I am using 3 loops just to find unique dicts by day.
You could use itertools.groupby:
from itertools import groupby
xlist.sort(key=lambda x: (x["id"], x["day"], x["name"])) # or use sorted()
unique_xlist = []
for k, g in groupby(xlist, lambda x: (x["id"], x["day"])):
    unique_xlist.append({"id": k[0], "day": k[1], "name": [i["name"] for i in g]})
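Note that groupby only merges consecutive items, which is why the list is sorted on the same keys first. With the sample data above (only the shown keys filled in), this should produce something like:
[{'id': 1, 'day': 2, 'name': ['abc', 'xyz']},
 {'id': 1, 'day': 3, 'name': ['abc', 'xyz']}]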
Simply use the values that make an item unique as keys to a dictionary:
grouped = {}
for x in xlist:
    key = (x['id'], x['day'])
    try:
        grouped[key]['name'].append(x['name'])
    except KeyError:
        grouped[key] = x
        grouped[key]['name'] = [x['name']]
You can listify this again afterwards if necessary.
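For example, if you want a list of dicts again rather than the grouped mapping:
unique_xlist = list(grouped.values())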

Using dictionary comprehension, need more than 1 value to unpack

Consider you have JSON structured like below:
{
    "valueA": "2",
    "valueB": [
        {
            "key1": "value1"
        },
        {
            "key2": "value2"
        },
        {
            "key3": "value3"
        }
    ]
}
and when doing something like:
dict_new = {key:value for (key,value) in dict['valueB'] if key == 'key2'}
I get:
ValueError: need more than 1 value to unpack
Why and how to fix it?
dict['valueB'] is a list of dictionaries. You need another layer of nesting for your code to work, and since you are looking for one key, you need to produce a list here (keys must be unique in a dictionary):
values = [value for d in dict['valueB'] for key, value in d.items() if key == 'key2']
If you tried to make a dictionary of key2: value pairs, you would only have the last pair left, as the previous values would be replaced by virtue of having been associated with the same key.
Better still, just grab that one key, no need to loop over all items if you just wanted that one key:
values = [d['key2'] for d in dict['valueB'] if 'key2' in d]
This filters on the list of dictionaries in the dict['valueB'] list; if 'key2' is a key in that nested dictionary, we extract it.
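If you expect at most one matching dictionary, a generator with next() avoids building the whole list (the second argument is a fallback for when nothing matches), keeping the same names as in the question:
value = next((d['key2'] for d in dict['valueB'] if 'key2' in d), None)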

How to remove dictionary's keys and values based on another dictionary?

I wish to remove keys and values in one JSON dictionary based on another JSON dictionary's keys and values. In a sense, I am looking to perform a "subtraction". Let's say I have JSON dictionaries a and b:
a = {
    "my_app": {
        "environment_variables": {
            "SOME_ENV_VAR": [
                "/tmp",
                "tmp2"
            ]
        },
        "variables": {
            "my_var": "1",
            "my_other_var": "2"
        }
    }
}
b = {
    "my_app": {
        "environment_variables": {
            "SOME_ENV_VAR": [
                "/tmp"
            ]
        },
        "variables": {
            "my_var": "1"
        }
    }
}
Imagine you could do a-b=c where c looks like this:
c = {
    "my_app": {
        "environment_variables": {
            "SOME_ENV_VAR": [
                "tmp2"
            ]
        },
        "variables": {
            "my_other_var": "2"
        }
    }
}
How can this be done?
You can loop through your dictionary using for key in dictionary: and you can delete keys using del dictionary[key], I think that's all you need. See the documentation for dictionaries: https://docs.python.org/2/tutorial/datastructures.html#dictionaries
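A minimal sketch of that idea for the flat "variables" level from the question; note the list() copy, since you cannot delete entries from a dict while iterating over it directly:
a_vars = a['my_app']['variables']
b_vars = b['my_app']['variables']
for key in list(a_vars):
    # Drop entries that appear in b with the same value
    if key in b_vars and b_vars[key] == a_vars[key]:
        del a_vars[key]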
The way you can do it is to:
Create a copy of a -> c;
Iterate over every key, value pair inside b;
For matching top-level keys, check whether the inner keys and values are the same, and delete those from c;
Remove keys whose values are now empty.
You should modify the code if your case is somehow different (not dict-of-dicts, etc.).
print(A)
print(B)
C = A.copy()
# INFO: Suppose your max depth is as follows: "A = dict(key: dict(), ...)"
for k0, v0 in B.items():
    # Look for similar outer keys (check if 'vars' or 'env_vars' is in A)
    if k0 in C:
        # Look for similar inner (keys, values)
        for k1, v1 in v0.items():
            # If we have e.g. 'my_var' in B and in C and the values are the same
            if k1 in C[k0] and v1 == C[k0][k1]:
                del C[k0][k1]
        # Remove empty 'vars', 'env_vars'
        if not C[k0]:
            del C[k0]
print(C)
{'environment_variables': {'SOME_ENV_VAR': ['/tmp']},
'variables': {'my_var': '2', 'someones_var': '1'}}
{'environment_variables': {'SOME_ENV_VAR': ['/tmp']},
'variables': {'my_var': '2'}}
{'variables': {'someones_var': '1'}}
The following does what you need:
def subtract(a, b):
    result = {}
    for key, value in a.items():
        if key not in b or b[key] != value:
            if not isinstance(value, dict):
                if isinstance(value, list):
                    # Keep only list items that are not present in b's list
                    result[key] = [item for item in value if item not in b.get(key, [])]
                else:
                    result[key] = value
                continue
            # Recurse into nested dicts and keep only the non-empty differences
            inner_dict = subtract(value, b.get(key, {}))
            if len(inner_dict) > 0:
                result[key] = inner_dict
    return result
It checks if both the key and value are present. It could del items, but I think it is much better to return a new dict with the desired data instead of modifying the original.
c = subtract(a, b)
UPDATE
I have just updated it for the latest version of the data provided in the question. Now it subtracts list values as well.
UPDATE 2
Working example: ipython notebook
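For reference, with the sample a and b above (and the bare "tmp2" spelling taken from a), the result matches the expected c:
print(subtract(a, b) == c)  # True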

Filtering dictionaries and creating sub-dictionaries based on keys/values in Python?

Ok, I'm stuck, need some help from here on...
If I've got a main dictionary like this:
data = [ {"key1": "value1", "key2": "value2", "key1": "value3"},
{"key1": "value4", "key2": "value5", "key1": "value6"},
{"key1": "value1", "key2": "value8", "key1": "value9"} ]
Now, I already need to go through that dictionary to format some of the data, i.e.:
for datadict in data:
    for key, value in datadict.items():
        ...filter the data...
Now, how would I, in that same loop (if possible... if not, please suggest alternatives), check for the values of certain keys, and if those values match my presets, add that whole list to another dictionary? That way I would effectively create smaller dictionaries as I go along, out of this main dictionary, based on certain keys and values.
So, let's say I want to create a sub-dictionary with all the lists in which key1 has value of "value1", which for the above list would give me something like this:
subdata = [ {"key1": "value1", "key2": "value2", "key1": "value3"},
{"key1": "value1", "key2": "value8", "key1": "value9"} ]
Here is a not so pretty way of doing it. The result is a generator, but if you really want a list you can surround it with a call to list(). Mostly it doesn't matter.
The predicate is a function which decides, for each key/value pair, whether a dictionary in the list makes the cut. The default one accepts everything. If no key/value pair in the dictionary matches, the dictionary is rejected.
def filter_data(data, predicate=lambda k, v: True):
    for d in data:
        for k, v in d.items():
            if predicate(k, v):
                yield d
                break  # avoid yielding the same dict twice if several pairs match
test_data = [{"key1":"value1", "key2":"value2"}, {"key1":"blabla"}, {"key1":"value1", "eh":"uh"}]
list(filter_data(test_data, lambda k, v: k == "key1" and v == "value1"))
# [{'key2': 'value2', 'key1': 'value1'}, {'key1': 'value1', 'eh': 'uh'}]
Net of the issues already pointed out in other comments and answers (multiple identical keys can't be in a dict, etc etc), here's how I'd do it:
def select_sublist(list_of_dicts, **kwargs):
    return [d for d in list_of_dicts
            if all(d.get(k) == kwargs[k] for k in kwargs)]
subdata = select_sublist(data, key1='value1')
The answer is too simple, so I guess we are missing some information. Anyway:
result = []
for datadict in data:
    for key, value in datadict.items():
        thefiltering()
    if datadict.get('matchkey') == 'matchvalue':
        result.append(datadict)
Also, your "main dictionary" is not a dictionary but a list. Just wanted to clear that up.
It's an old question, but for some reason there is no one-liner syntax answer:
{ k: v for k, v in <SOURCE_DICTIONARY>.items() if <CONDITION> }
For example:
src_dict = { 1: 'a', 2: 'b', 3: 'c', 4: 'd' }
predicate = lambda k, v: k % 2 == 0
filtered_dict = { k: v for k, v in src_dict.items() if predicate(k, v) }
print("Source dictionary:", src_dict)
print("Filtered dictionary:", filtered_dict)
Will produce the following output:
Source dictionary: {1: 'a', 2: 'b', 3: 'c', 4: 'd'}
Filtered dictionary: {2: 'b', 4: 'd'}
Inspired by the answer of Skurmedal, I split this into a recursive scheme to work with a database of nested dictionaries. In this case, a "record" is the subdictionary at the trunk. The predicate defines which records we are after -- those that match some (key,value) pair where these pairs may be deeply nested.
def filter_dict(the_dict, predicate=lambda k, v: True):
    for k, v in the_dict.items():
        if isinstance(v, dict) and _filter_dict_sub(predicate, v):
            yield k, v

def _filter_dict_sub(predicate, the_dict):
    for k, v in the_dict.items():
        # Recurse into nested dicts first, then test the pair itself
        if isinstance(v, dict) and _filter_dict_sub(predicate, v):
            return True
        if predicate(k, v):
            return True
    return False
Since this is a generator, you may need to wrap with dict(filter_dict(the_dict)) to obtain a filtered dictionary.
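A minimal usage sketch, assuming a small nested "database" like the hypothetical one below:
db = {
    "rec1": {"meta": {"key1": "value1"}},
    "rec2": {"meta": {"key1": "other"}},
}
matches = dict(filter_dict(db, lambda k, v: k == "key1" and v == "value1"))
# {'rec1': {'meta': {'key1': 'value1'}}}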
