I want to get all parent keys for all items in a nested Python dictionary with unlimited levels. As an analogy, if you think of a nested dictionary as a directory containing sub-directories, the behaviour I want is similar to what glob.glob(dir, recursive=True) does.
For example, suppose we have the following dictionary:
sample_dict = {
    "key_1": {
        "sub_key_1": 1,
        "sub_key_2": 2,
    },
    "key_2": {
        "sub_key_1": 3,
        "sub_key_2": {
            "sub_sub_key_1": 4,
        },
    },
}
I want to get the full "path" of every value in the dictionary:
["key_1", "sub_key_1", 1]
["key_1", "sub_key_2", 2]
["key_2", "sub_key_1", 3]
["key_2", "sub_key_2", "sub_sub_key_1", 4]
Just wondering if there is a clean way to do that?
Using generators can often simplify the code for this type of task and make it much more readable, while avoiding passing explicit state arguments to the function. You get a generator instead of a list, but this is a good thing because you can evaluate lazily if you want to. For example:
def getpaths(d):
    if not isinstance(d, dict):
        yield [d]
    else:
        yield from ([k] + w for k, v in d.items() for w in getpaths(v))
result = list(getpaths(sample_dict))
Result will be:
[['key_1', 'sub_key_1', 1],
['key_1', 'sub_key_2', 2],
['key_2', 'sub_key_1', 3],
['key_2', 'sub_key_2', 'sub_sub_key_1', 4]]
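Since the question mentions unlimited levels, it may be worth noting that a recursive generator is bounded by Python's recursion limit. Below is a minimal sketch of the same idea with an explicit stack instead of recursion. The name iter_paths and the small nested dictionary are made up for illustration and are not part of the answer above.

```python
def iter_paths(d):
    """Yield [key, ..., value] for every leaf, without recursion."""
    # Each stack entry is (path_so_far, node).
    stack = [([], d)]
    while stack:
        path, node = stack.pop()
        if isinstance(node, dict):
            # Push children in reverse so they are popped in insertion order.
            for key in reversed(list(node)):
                stack.append((path + [key], node[key]))
        else:
            yield path + [node]


nested = {"a": {"b": 1, "c": {"d": 2}}, "e": 3}
paths = list(iter_paths(nested))
print(paths)  # [['a', 'b', 1], ['a', 'c', 'd', 2], ['e', 3]]
```

Like getpaths, this sketch yields nothing for an empty inner dictionary, and it treats anything that is not a dict (including lists) as a leaf value.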
You can solve it recursively:
sample_dict = {
    "key_1": {
        "sub_key_1": 1,
        "sub_key_2": 2,
    },
    "key_2": {
        "sub_key_1": 3,
        "sub_key_2": {
            "sub_sub_key_1": 4,
        },
    }
}
def full_paths(d, paths=None, parent_keys=None):
    # Use None instead of [] as the default: a mutable default list is shared
    # between calls, so results would accumulate across separate invocations.
    if paths is None:
        paths = []
    if parent_keys is None:
        parent_keys = []
    for key, value in d.items():
        if isinstance(value, dict):
            full_paths(value, paths=paths, parent_keys=parent_keys + [key])
        else:
            paths.append(parent_keys + [key, value])
    return paths
print(full_paths(sample_dict))
You can use this solution.
sample_dict = {
    "key_1": {
        "sub_key_1": 1,
        "sub_key_2": 2,
    },
    "key_2": {
        "sub_key_1": 3,
        "sub_key_2": {
            "sub_sub_key_1": 4,
        },
    },
}
def key_find(sample_dict, li=[]):
    for key, val in sample_dict.items():
        if isinstance(val, dict):
            key_find(val, li=li + [key])
        else:
            print(li + [key] + [val])
key_find(sample_dict)
I've spent most of my day trying to solve this. I'm trying to find every object in my JSON that has a "price" key and, for each one, populate a dictionary (or similar) with the price and the name of the item. This is the simplified JSON; please note that the "price" keys can also be nested further. I'm trying to search the whole JSON for the key:
{
    "status": "success",
    "data": {
        "top_products": {
            "products": [
                {
                    "price": 3,
                    "name": "Apple"
                },
                {
                    "price": 2,
                    "name": "Banana"
                }
            ]
        },
        "products": {
            "fruits": {
                "list": [
                    {
                        "price": 4,
                        "name": "Pear"
                    },
                    {
                        "name": "Kiwi"
                    },
                    {
                        "price": 4,
                        "name": "Pineapple"
                    },
                    {
                        "name": "Cherry"
                    }
                ]
            },
            "veggies": {
                "list": [
                    {
                        "price": 3,
                        "name": "cucumber"
                    },
                    {
                        "name": "tomato"
                    },
                    {
                        "price": 2,
                        "name": "onion"
                    },
                    {
                        "name": "green pepper"
                    }
                ]
            }
        }
    }
}
Here is what I've managed to get working so far (I didn't come up with this; I found it in another response):
def findkeys(node, kv):
    if isinstance(node, list):
        for i in node:
            yield from findkeys(i, kv)
    elif isinstance(node, dict):
        if kv in node:
            yield node[kv]
        for j in node.values():
            yield from findkeys(j, kv)
print(list(findkeys(jsonResponse, 'price')))
The first part works: it returns the value of every "price" key. I'm trying to figure out a way to also get the "name" that goes with each price, preferably as a dictionary. What's the best approach to do this?
Thanks,
Rob
Use the following code, if there are only unique items:
def create_db(data, find, other):
    db = {}
    def recurse(data):
        if isinstance(data, list):
            for elem in data:
                recurse(elem)
        elif isinstance(data, dict):
            if find in data:
                db[data[other]] = data[find]
            for k, v in data.items():
                recurse(v)
    recurse(data)
    return db
>>> create_db(data, 'price', 'name')
{'Apple': 3, 'Banana': 2, 'Pear': 4, 'Pineapple': 4, 'cucumber': 3, 'onion': 2}
Else:
def create_db(data, find, other):
    db = {}
    ctr = {}
    def recurse(data):
        if isinstance(data, list):
            for elem in data:
                recurse(elem)
        elif isinstance(data, dict):
            if find in data:
                if data[other] in ctr:
                    ctr[data[other]] = str(int(ctr[data[other]] or '1') + 1)
                else:
                    ctr[data[other]] = ''
                key = data[other] + ctr[data[other]]
                db[key] = data[find]
            for k, v in data.items():
                recurse(v)
    recurse(data)
    return db
For example, if data had two Apples:
data = {'status': 'success',
        'data': {'top_products': {'products': [{'price': 3, 'name': 'Apple'},
                                               {'price': 4, 'name': 'Apple'},
                                               {'price': 2, 'name': 'Banana'}]}}}
# Second approach will add a serial number to each duplicate item
>>> create_db(data, 'price', 'name')
{'Apple': 3, 'Apple2': 4, 'Banana': 2}
For easier access in case of duplicates, you can create a nested dict:
def create_db(data, find, other):
    db = {}
    def recurse(data):
        if isinstance(data, list):
            for elem in data:
                recurse(elem)
        elif isinstance(data, dict):
            if find in data:
                if data[other] in db:
                    if isinstance(db[data[other]], dict):
                        # Store the found value (not the name) under the next free index
                        db[data[other]][len(db[data[other]])] = data[find]
                    else:
                        db[data[other]] = {0: db.pop(data[other]), 1: data[find]}
                else:
                    db[data[other]] = data[find]
            for k, v in data.items():
                recurse(v)
    recurse(data)
    return db
# For the data in above approach:
>>> create_db(data, 'price', 'name')
{'Apple': {0: 3, 1: 4}, 'Banana': 2}
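If the duplicate handling is better left to the caller, one option is to yield (name, price) pairs and decide afterwards how to store them. The following is only a sketch along those lines: iter_pairs, its default parameter and the small data dictionary are hypothetical names for illustration, and it assumes an item with a price may lack a name (create_db above would raise KeyError in that case).

```python
from collections import defaultdict


def iter_pairs(node, find, other, default=None):
    """Yield (node[other], node[find]) for every dict that contains `find`."""
    if isinstance(node, list):
        for elem in node:
            yield from iter_pairs(elem, find, other, default)
    elif isinstance(node, dict):
        if find in node:
            # .get() avoids a KeyError when the companion key is missing.
            yield node.get(other, default), node[find]
        for value in node.values():
            yield from iter_pairs(value, find, other, default)


data = {"data": {"products": [
    {"price": 3, "name": "Apple"},
    {"price": 4, "name": "Apple"},
    {"name": "Kiwi"},
    {"price": 2},
]}}

pairs = list(iter_pairs(data, "price", "name", default="<unnamed>"))
print(pairs)  # [('Apple', 3), ('Apple', 4), ('<unnamed>', 2)]

# One possible way to keep duplicates: a list of prices per name.
by_name = defaultdict(list)
for name, price in pairs:
    by_name[name].append(price)
print(dict(by_name))  # {'Apple': [3, 4], '<unnamed>': [2]}
```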
I am trying to write a program in which I have a list of dictionaries like this:
[
    {
        'unique':1,
        'duplicate':2,
    },
    {
        'unique':1,
        'duplicate':2,
    },
    {
        'unique':1,
        'duplicate':2,
    },
    {
        'unique':1,
        'duplicate':2,
    }
]
Can this be regrouped so that each distinct value of 'unique' appears only once,
with all of its corresponding 'duplicate' values collected into a list?
Example:
[
    {
        'unique':1,
        'duplicate':2,
    },
    {
        'unique':1,
        'duplicate':8,
    },
    {
        'unique':2,
        'duplicate':2,
    },
    {
        'unique':1,
        'duplicate':4,
    }
]
The above list should be converted into the following:
---- Expected Outcome ---
[
    {
        'unique':1,
        'duplicates':[2,8,4]
    },
    {
        'unique':2,
        'duplicates':[2]
    }
]
PS: I am doing this in Python
Thanks for the code in advance
You can also use itertools.groupby:
from itertools import groupby
from operator import itemgetter
l = [
    {
        'unique':1,
        'duplicate':2,
    },
    {
        'unique':1,
        'duplicate':8,
    },
    {
        'unique':2,
        'duplicate':2,
    },
    {
        'unique':1,
        'duplicate':4,
    }
]
key = itemgetter('unique')
result = [{'unique': k, 'duplicate': list(map(itemgetter('duplicate'), g))}
          for k, g in groupby(sorted(l, key=key), key=key)]
print(result)
output:
[{'unique': 1, 'duplicate': [2, 8, 4]}, {'unique': 2, 'duplicate': [2]}]
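The sorted() call in the snippet above matters because itertools.groupby only merges consecutive items that share a key. A small sketch of the difference, using a shortened made-up list named rows:

```python
from itertools import groupby
from operator import itemgetter

rows = [
    {"unique": 1, "duplicate": 2},
    {"unique": 2, "duplicate": 2},
    {"unique": 1, "duplicate": 4},
]
key = itemgetter("unique")

# Without sorting, the two rows with unique == 1 land in separate groups.
unsorted_keys = [k for k, _ in groupby(rows, key=key)]
print(unsorted_keys)  # [1, 2, 1]

# After sorting by the same key, each value appears exactly once.
sorted_keys = [k for k, _ in groupby(sorted(rows, key=key), key=key)]
print(sorted_keys)  # [1, 2]
```

Because sorted() is stable, the 'duplicate' values within each group keep their original relative order.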
I think this list comprehension can solve your problem:
result = [{'unique': id, 'duplicates': [d['duplicate'] for d in l if d['unique'] == id]} for id in set(map(lambda d: d['unique'], l))]
This might help you:
l = [
    {
        'unique':1,
        'duplicate':2,
    },
    {
        'unique':1,
        'duplicate':8,
    },
    {
        'unique':2,
        'duplicate':2,
    },
    {
        'unique':1,
        'duplicate':4,
    }
]
a = set()
for i in l:
    a.add(i['unique'])
d = {i: [] for i in a}
for i in l:
    d[i['unique']].append(i['duplicate'])
output = [{'unique': i, 'duplicate': j} for i, j in d.items()]
The output will be:
[{'unique': 1, 'duplicate': [2, 8, 4]}, {'unique': 2, 'duplicate': [2]}]
defaultdict(list) may help you here:
from collections import defaultdict
# data = [ {'unique': 1, 'duplicate': 2}, ... ] # your data
dups = defaultdict(list) # {unique: [duplicate]}
for dd in data:
    dups[dd['unique']].append(dd['duplicate'])
answer = [dict(unique=k, duplicates=v) for k, v in dups.items()]
If you don't know the name of unique key, then replace 'unique' with something like
unique_key = list(data[0].keys())[0]
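Putting the two ideas together, the grouping can be wrapped in a small function that takes the field names as parameters. This is only a sketch: group_values and its parameter names are invented for illustration, and naming the output field by appending "s" is an assumption based on the expected outcome in the question.

```python
from collections import defaultdict


def group_values(rows, key_field, value_field):
    """Collect every row[value_field] into a list per distinct row[key_field]."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[key_field]].append(row[value_field])
    # Dicts keep insertion order (Python 3.7+), so groups appear in the
    # order their key was first seen.
    return [{key_field: k, value_field + "s": v} for k, v in groups.items()]


rows = [
    {"unique": 1, "duplicate": 2},
    {"unique": 1, "duplicate": 8},
    {"unique": 2, "duplicate": 2},
    {"unique": 1, "duplicate": 4},
]
grouped = group_values(rows, "unique", "duplicate")
print(grouped)
# [{'unique': 1, 'duplicates': [2, 8, 4]}, {'unique': 2, 'duplicates': [2]}]
```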
unique = []
duplicate = {}
for items in data:
    if items['unique'] not in unique:
        unique.append(items['unique'])
        duplicate[items['unique']] = [items['duplicate']]
    else:
        duplicate[items['unique']].append(items['duplicate'])
new_data = []
for key in unique:
    new_data.append({'unique': key, 'duplicate': duplicate[key]})
Explanation: In the first for loop, I collect the unique keys in 'unique'. If a key doesn't exist in 'unique' yet, I append it to 'unique' and add a key to 'duplicate' whose value is a single-element list. If the same key is found again, I simply append the value to the list in 'duplicate' for that key. In the second loop, I build 'new_data' by adding each unique key together with its list of duplicate values.
I'll just go straight to example:
Here we have a dictionary with a test name, and another dict which contains the level categorization.
EDIT
Input:
test_values = [
    {
        "name": "test1",
        "level_map": {
            "system": 1,
            "system_apps": 2,
            "app_test": 3
        }
    },
    {
        "name": "test2",
        "level_map": {
            "system": 1,
            "system_apps": 2,
            "app_test": 3
        }
    },
    {
        "name": "test3",
        "level_map": {
            "system": 1,
            "memory": 2,
            "memory_test": 3
        }
    }
]
Output:
What I want is this:
dict_obj:
{
    "system": {
        "system_apps": {
            "app_test": {
                test1 object,
                test2 object
            }
        },
        "memory": {
            "memory_test": {
                test3 object
            }
        }
    }
}
I just can't wrap my head around the logic and I'm struggling to even come up with an approach. If someone could guide me, that would be great.
Let's start with level_map. You can sort keys on values to get the ordered levels:
>>> level_map = { "system": 1, "system_apps": 2, "app_test": 3}
>>> L = sorted(level_map.keys(), key=lambda k: level_map[k])
>>> L
['system', 'system_apps', 'app_test']
Use these elements to build a tree:
>>> root = {}
>>> temp = root
>>> for k in L[:-1]:
...     temp = temp.setdefault(k, {})  # create new inner dict if necessary
...
>>> temp.setdefault(L[-1], []).append("test") # and add a name
>>> root
{'system': {'system_apps': {'app_test': ['test']}}}
I split the list before the last element, because the last element will be associated to a list, not a dict (leaves of the tree are lists in your example).
Now it's easy to repeat this with the list of dicts:
ds = [{"name": "test1",
       "level_map": {"system": 1, "system_apps": 2, "app_test": 3}
      }, {"name": "test2",
       "level_map": {"system": 1, "system_apps": 2, "app_test": 3}
      }, {"name": "test3",
       "level_map": {"system": 1, "memory": 2, "memory_test": 3}
      }]
root = {}
for d in ds:
    name = d["name"]
    level_map = d["level_map"]
    L = sorted(level_map.keys(), key=lambda k: level_map[k])
    temp = root
    for k in L[:-1]:
        temp = temp.setdefault(k, {})
    temp.setdefault(L[-1], []).append(name)
# root is: {'system': {'system_apps': {'app_test': ['test1', 'test2']}, 'memory': {'memory_test': ['test3']}}}
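For reuse, the loop above can be packaged as a function. The sketch below does that under two assumptions: every level_map is non-empty, and a name that is a leaf in one test is never an inner level in another (otherwise setdefault would return a list where a dict is expected, or the reverse). The names build_tree and tests are made up for illustration.

```python
def build_tree(tests):
    """Nest test names under their level names, ordered by level number."""
    root = {}
    for test in tests:
        level_map = test["level_map"]
        # Level names sorted by their numeric rank, e.g. system < system_apps.
        levels = sorted(level_map, key=level_map.get)
        node = root
        for level in levels[:-1]:
            node = node.setdefault(level, {})  # create inner dict if needed
        # The deepest level holds a list of test names.
        node.setdefault(levels[-1], []).append(test["name"])
    return root


tests = [
    {"name": "test1", "level_map": {"system": 1, "system_apps": 2, "app_test": 3}},
    {"name": "test2", "level_map": {"system": 1, "system_apps": 2, "app_test": 3}},
    {"name": "test3", "level_map": {"system": 1, "memory": 2, "memory_test": 3}},
]
tree = build_tree(tests)
print(tree)
```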
I have a dictionary that looks like this:
d = {
    'dev': {
        <dev1>: {
            'mod': {
                <mod1>: {'port': [1, 2, 3]},
            },
        },
        <dev2>: {
            'mod': {
                <mod3>: {'port': []},
            },
        },
    },
}
I want to be able to write a function such that if I provide a search object such as 'mod1', it gives me the parent key 'dev1'.
I have searched all over and tried a bunch of things, but couldn't get this to work. Any help will be appreciated!
I have tried the stuff mentioned at the link below:
Python--Finding Parent Keys for a specific value in a nested dictionary
Find a key in a python dictionary and return its parents
This should work:
def find_parent_keys(d, target_key, parent_key=None):
    for k, v in d.items():
        if k == target_key:
            yield parent_key
        if isinstance(v, dict):
            yield from find_parent_keys(v, target_key, k)
Usage:
d = {
    'dev': {
        'dev1': {
            'mod': {
                'mod1': {'port': [1, 2, 3]},
            },
        },
        'dev2': {
            'mod': {
                'mod3': {'port': []},
            },
        },
    },
}
print(list(find_parent_keys(d, 'mod')))
print(list(find_parent_keys(d, 'dev')))
Output (on Python 3.7+, where dicts preserve insertion order):
['dev1', 'dev2']
[None]
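Note that find_parent_keys returns the immediate parent, so searching for 'mod1' gives 'mod' rather than 'dev1'. If the device name is what is needed, one possible variation is to yield the whole chain of ancestor keys and pick the level of interest. This is a sketch, not part of the answer above; find_key_paths and devices are hypothetical names.

```python
def find_key_paths(d, target_key, path=()):
    """Yield the list of ancestor keys for every occurrence of target_key."""
    for key, value in d.items():
        if key == target_key:
            yield list(path)
        if isinstance(value, dict):
            yield from find_key_paths(value, target_key, path + (key,))


devices = {
    "dev": {
        "dev1": {"mod": {"mod1": {"port": [1, 2, 3]}}},
        "dev2": {"mod": {"mod3": {"port": []}}},
    },
}

paths = list(find_key_paths(devices, "mod1"))
print(paths)         # [['dev', 'dev1', 'mod']]
print(paths[0][-2])  # dev1  (two levels above 'mod1' in this layout)
```

Indexing with [-2] relies on the fixed dev/<device>/mod/<module> layout shown in the question.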
I have the following object in python:
{
    name: John,
    age: {
        years:18
    },
    computer_skills: {
        years:4
    },
    mile_runner: {
        years:2
    }
}
I have an array with 100 people with the same structure.
What is the best way to go through all 100 people and make it such that there is no more "years"? In other words, each object in the 100 would look something like:
{
    name: John,
    age:18,
    computer_skills:4,
    mile_runner:2
}
I know I can do something like this in pseudocode:
for(item in list):
    if('years' in (specific key)):
        specifickey = item[(specific key)][(years)]
But is there a smarter/more efficient way?
Your pseudo-code is already pretty good I think:
for person in persons:
    for k, v in person.items():
        if isinstance(v, dict) and 'years' in v:
            person[k] = v['years']
This overwrites every property which is a dictionary that has a years property with that property’s value.
Unlike other solutions (such as dict comprehensions), this modifies each object in place, so no additional memory is needed to hold a copy of the data.
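To make the in-place versus copy trade-off concrete, here is a small sketch that shows both behaviours side by side. The function names strip_years_in_place and strip_years_copy and the sample list are invented for illustration.

```python
def strip_years_in_place(persons):
    """Mutate each person dict, replacing {'years': n} values with n."""
    for person in persons:
        for k, v in person.items():
            if isinstance(v, dict) and "years" in v:
                # Only values of existing keys change, so the dict's size
                # stays the same and iterating over items() remains valid.
                person[k] = v["years"]


def strip_years_copy(persons):
    """Return new dicts and leave the input untouched."""
    return [
        {k: v["years"] if isinstance(v, dict) and "years" in v else v
         for k, v in person.items()}
        for person in persons
    ]


original = [{"name": "John", "age": {"years": 18}, "mile_runner": {"years": 2}}]
alias = original  # a second reference to the same list

flat = strip_years_copy(original)
print(alias[0]["age"])  # {'years': 18}  (input unchanged by the copy)

strip_years_in_place(original)
print(alias[0]["age"])  # 18  (every reference sees the change)
```

The in-place version is the cheaper one, but any other code holding a reference to the same dictionaries will see the flattened values too.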
def flatten(d):
    ret = {}
    for key, value in d.items():
        # Only unwrap dicts whose single key is "years"
        if isinstance(value, dict) and len(value) == 1 and "years" in value:
            ret[key] = value["years"]
        else:
            ret[key] = value
    return ret
d = {
    "name": "John",
    "age": {
        "years":18
    },
    "computer_skills": {
        "years":4
    },
    "mile_runner": {
        "years":2
    }
}
print(flatten(d))
Result:
{'name': 'John', 'age': 18, 'computer_skills': 4, 'mile_runner': 2}
Dictionary comprehension:
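import json
with open("input.json") as f:
    cont = json.load(f)
# The isinstance check keeps string values such as the name from being searched for the substring "years"
print({el: cont[el]["years"] if isinstance(cont[el], dict) and "years" in cont[el] else cont[el] for el in cont})
prints
{'name': 'John', 'age': 18, 'computer_skills': 4, 'mile_runner': 2}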
where input.json contains
{
    "name": "John",
    "age": {
        "years":18
    },
    "computer_skills": {
        "years":4
    },
    "mile_runner": {
        "years":2
    }
}
This is linear in the number of elements, which is the best you can hope for.
As people said in the comments, it isn't exactly clear what your "object" is, but assuming that you actually have a list of dicts like this (named people here, to avoid shadowing the built-in list):
people = [{
    'name': 'John',
    'age': {
        'years': 18
    },
    'computer_skills': {
        'years':4
    },
    'mile_runner': {
        'years':2
    }
}]
Then you can do something like this:
for item in people:
    for key in item:
        try:
            item[key] = item[key]['years']
        except (TypeError, KeyError):
            # Not a dict with a 'years' entry: leave the value as it is
            pass
Result:
people = [{'name': 'John', 'age': 18, 'computer_skills': 4, 'mile_runner': 2}]