I have a very complicated problem that I am sort of hoping to rubberduck here:
I have a dictionary:
{
    "1": {
        "1.1": {
            "1.1.1": {}
        },
        "1.2": {
            "1.2.1": {}
        }
    },
    "2": {
        "2.1": {
            "2.1.1": {}
        },
        "2.2": {
            "2.2.2": {}
        }
    }
}
whose structure won't always be the same (i.e., there could be further nesting or more keys in any sub-dictionary). I need to be able to generate a specifically ordered list of lists (the contained sub-lists need not be ordered) based on some input. The structure of the lists is based on the dictionary. Accounting for all keys in the dictionary, the list of lists would look like:
[['1', '2'], ['1.2', '1.1'], ['1.1.1'], ['1.2.1'], ['2.2', '2.1'], ['2.1.1'], ['2.2.2']]
That is, the first sub-list contains the two keys at the highest level of the dictionary. The second sub-list contains the two keys under the first "highest level" key. The third and fourth sub-lists contain the keys available under the "2nd level" of the dictionary, and so on.
I need a function that, based on input (that is, any key in the nested dictionary), will return the correct list of lists. For example:
function('2.2.2')
>>> [['2'], None, None, None, ['2.2'], None, ['2.2.2']] # or [['2'], [], [], [], ['2.2'], [], ['2.2.2']]
function('1.1')
>>> [['1'], ['1.1'], None, None, None, None, None] # or [['1'], ['1.1'], [], [], [], [], []]
function('1.2.1')
>>> [['1'], ['1.2'], None, ['1.2.1'], None, None, None] # or [['1'], ['1.2'], [], ['1.2.1'], [], [], []]
It is almost like I need to be able to "know" the structure of the dictionary as I recurse. I keep thinking maybe if I can find the input key in the dictionary and then trace it up, I will be able to generate the list of lists but
how can I recurse "upwards" in the dictionary and
how in the world do I store the information in the lists as I "go along"?
Your master list is just a depth-first list of all the keys in your dict structure. Getting this is fairly easy:
def dive_into(d):
    if d and isinstance(d, dict):
        yield list(d.keys())
        for v in d.values():
            yield from dive_into(v)

d = {
    "1": {
        "1.1": {
            "1.1.1": {}
        },
        "1.2": {
            "1.2.1": {}
        }
    },
    "2": {
        "2.1": {
            "2.1.1": {}
        },
        "2.2": {
            "2.2.2": {}
        }
    }
}
master_list = list(dive_into(d))
# [['1', '2'], ['1.1', '1.2'], ['1.1.1'], ['1.2.1'], ['2.1', '2.2'], ['2.1.1'], ['2.2.2']]
Next, your function needs to find all the parent keys of the given key, and only return the keys that are in the path to the given key. Since your keys always have the format <parent>.<child>.<grandchild>, you only need to iterate over this list, and return any elements e for which key.startswith(e) is True:
def function(key):
    lst = [[e for e in keys if key.startswith(e)] for keys in master_list]
    return [item or None for item in lst]
Testing this with your examples:
>>> function('2.2.2')
Out: [['2'], None, None, None, ['2.2'], None, ['2.2.2']]
>>> function('1.1')
Out: [['1'], ['1.1'], None, None, None, None, None]
>>> function('1.2.1')
Out: [['1'], ['1.2'], None, ['1.2.1'], None, None, None]
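The startswith test depends on the <parent>.<child> naming of the keys, and it would also match a sibling such as '1.1' when the input is '1.10'. As a rough sketch of an alternative that answers the "recurse upwards" part directly, you can search downwards while carrying the path along, then keep only the keys on that path. The names find_path and path_lists are made up for this illustration, and the sketch assumes every key is unique across the whole structure:

```python
d = {
    "1": {"1.1": {"1.1.1": {}}, "1.2": {"1.2.1": {}}},
    "2": {"2.1": {"2.1.1": {}}, "2.2": {"2.2.2": {}}},
}


def dive_into(d):
    # Same depth-first generator as in the answer above.
    if d and isinstance(d, dict):
        yield list(d.keys())
        for v in d.values():
            yield from dive_into(v)


def find_path(d, target, path=()):
    """Return the tuple of keys leading to target (inclusive), or None."""
    for key, child in d.items():
        new_path = path + (key,)
        if key == target:
            return new_path
        if isinstance(child, dict):
            found = find_path(child, target, new_path)
            if found is not None:
                return found
    return None


def path_lists(d, target):
    # Assumption: keys are unique, so membership in the path is unambiguous.
    on_path = set(find_path(d, target) or ())
    return [[k for k in keys if k in on_path] or None for keys in dive_into(d)]


print(path_lists(d, "2.2.2"))
# [['2'], None, None, None, ['2.2'], None, ['2.2.2']]
```

Carrying the path down as an argument avoids having to walk back up: when the target is found, the path that led to it is already known.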
I have a list made of responses to an API request. The result is the JSON with a lot of nested dictionaries and lists like this:
{'Siri': {'ServiceDelivery': {'ResponseTimestamp': '2020-12-13T08:16:31Z',
'ProducerRef': 'IVTR_HET',
'ResponseMessageIdentifier': 'IVTR_HET:ResponseMessage::df80db10-3cd3-4d8d-9cee-27ec6c716630:LOC:',
'StopMonitoringDelivery': [{'ResponseTimestamp': '2020-12-13T08:16:31Z',
'Version': '2.0',
'Status': 'true',
'MonitoredStopVisit': [{'RecordedAtTime': '2020-12-13T08:16:27.623Z',
'ItemIdentifier': 'RATP:Item::STIF.StopPoint.Q.40944.A.7:LOC',
'MonitoringRef': {'value': 'STIF:StopPoint:Q:40944:'},
'MonitoredVehicleJourney': {'lineRef': {'value': 'STIF:Line::C01742:'},
'framedVehicleJourneyRef': {'dataFrameRef': {'value': 'any'},
'datedVehicleJourneyRef': 'RATP:VehicleJourney::RA.A.0916.1:LOC'},
'journeyPatternName': {'value': 'TEDI'},
'vehicleMode': [],
'publishedLineName': [{'value': 'St-Germain-en-Laye. Poissy. Cergy. / Boissy-St-Léger. Marne-la-Vallée.'}],
'directionName': [{'value': 'St-Germain-en-Laye. Poissy. Cergy.'}],
'serviceFeatureRef': [],
'vehicleFeatureRef': [],
'originName': [],
'originShortName': [],
'destinationDisplayAtOrigin': [],
'via': [],
'destinationRef': {'value': 'STIF:StopPoint:Q:411359:'},
'destinationName': [{'value': 'Poissy'}],
'destinationShortName': [],
'originDisplayAtDestination': [],
'vehicleJourneyName': [],
'journeyNote': [{'value': 'TEDI'}],
'facilityConditionElement': [],
'situationRef': [],
'monitored': True,
'monitoringError': [],
'progressStatus': [],
'trainBlockPart': [],
'additionalVehicleJourneyRef': [],
'monitoredCall': {'stopPointRef': {'value': 'STIF:StopPoint:Q:40944:'},
'stopPointName': [{'value': 'Torcy'}],
'vehicleAtStop': True,
'originDisplay': [],
'destinationDisplay': [],
'callNote': [],
'facilityConditionElement': [],
'situationRef': [],
'arrivalOperatorRefs': [],
'expectedDepartureTime': '2020-12-13T08:16:00Z',
'departurePlatformName': {'value': '2'},
'departureOperatorRefs': []}}},
I need an efficient way to extract the value pairs corresponding to, for example, the expectedDepartureTime key that's lying somewhere hidden in that maze of nested dictionaries. It has to be efficient because the formatting from the source site isn't very stable so different stations (it's transit data) will have different keys for the same value, e.g. ExpectedDepartureTime or expecteddepartureTime and I might need to try a lot of different things.
P.S.: I tried the solution by @hexerei software from "Find all occurrences of a key in nested dictionaries and lists", but it just gave me a
<generator object gen_dict_extract at 0x00000200632C8948>
as the output. How do I get the values out of this?
This function should return a list containing key-value pairs regardless of the depth:
def accumulate_keys(dct): # returns all the key-value pairs
    key_list = []
    def accumulate_keys_recursive(dct): # will accumulate key-value pairs in key_list
        for key in dct.keys():
            if isinstance(dct[key], dict):
                accumulate_keys_recursive(dct[key])
            else:
                key_list.append((key, dct[key]))
    accumulate_keys_recursive(dct)
    return key_list
print(accumulate_keys(dct))
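Two points from the question may be worth sketching separately. A generator object only produces values when it is consumed, for example with list(...) or a for loop. The response also nests dictionaries inside lists, which a dict-only recursion does not descend into. The function below is a minimal sketch under those assumptions: find_values is a made-up name, the case-insensitive comparison is one possible way to cope with the unstable key spelling, and the sample is a made-up, cut-down structure shaped like the response, not real data:

```python
def find_values(obj, wanted):
    """Yield every value whose key equals `wanted`, ignoring case."""
    if isinstance(obj, dict):
        for key, value in obj.items():
            if isinstance(key, str) and key.lower() == wanted.lower():
                yield value
            # Keep searching below this value as well.
            yield from find_values(value, wanted)
    elif isinstance(obj, list):
        for item in obj:
            yield from find_values(item, wanted)


# Hypothetical sample with two spellings of the same key.
sample = {'Siri': {'ServiceDelivery': {'StopMonitoringDelivery': [
    {'MonitoredStopVisit': [
        {'monitoredCall': {'expectedDepartureTime': '2020-12-13T08:16:00Z'}},
        {'monitoredCall': {'ExpectedDepartureTime': '2020-12-13T08:20:00Z'}},
    ]}
]}}}

# list(...) consumes the generator and collects what it yields.
times = list(find_values(sample, 'expecteddeparturetime'))
print(times)  # ['2020-12-13T08:16:00Z', '2020-12-13T08:20:00Z']
```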
I am having issues eliminating the second item of the lists nested inside a list of dictionaries. I think it may be because there are a couple of empty lists, so the indexing does not work. How can I delete the second item of each nested list pair but also skip the empty lists?
In the end, the nested list should be flattened as it does not have a second pair anymore.
The list looks something like this:
list_dict = [{"name": "Ken", "bla": [["abc", "ABC"],["def", "DEF"]]},
{"name": "Bob", "bla": []}, #skip the empty list
{"name": "Cher", "bla":[["abc", "ABC"]]}]
Desired output:
wanted = [{"name": "Ken", "bla": ["abc", "def"]},
{"name": "Bob", "bla": []},
{"name": "Cher", "bla":["abc"]}]
My code:
for d in list_dict:
    for l in list(d["bla"]):
        if l is None:
            continue #use continue to ignore the empty lists
        d["bla"].remove(l[1]) #remove second item of every nested list pair (gives error).
You can use [:1] to get only the first item from each list (this works with zero-length lists too):
list_dict = [{"name": "Ken", "bla": [["abc", "ABC"],["def", "DEF"]]},
{"name": "Bob", "bla": []}, #skip the empty list
{"name": "Cher", "bla":[["abc", "ABC"]]}]
for i in list_dict:
    i['bla'] = [ll for l in [l[:1] for l in i['bla']] for ll in l]
print(list_dict)
Prints:
[{'name': 'Ken', 'bla': ['abc', 'def']},
{'name': 'Bob', 'bla': []},
{'name': 'Cher', 'bla': ['abc']}]
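To illustrate why slicing is the safer choice here, the short sketch below compares pair[:1] with pair[0]. The pairs list, including its empty inner list, is a made-up input for this illustration only:

```python
# Hypothetical input: one of the inner lists is empty.
pairs = [["abc", "ABC"], [], ["def", "DEF"]]

# pair[:1] is [] for an empty pair, so empty pairs contribute nothing.
firsts = [item for pair in pairs for item in pair[:1]]
print(firsts)  # ['abc', 'def']

# pair[0] has no such fallback and raises IndexError on the empty pair.
try:
    [pair[0] for pair in pairs]
except IndexError as exc:
    print("indexing failed:", exc)
```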
You can flatten lists using itertools.chain.from_iterable.
Example:
In [24]: l = [["abc", "ABC"],["def", "DEF"]]
In [25]: list(itertools.chain.from_iterable(l))
Out[25]: ['abc', 'ABC', 'def', 'DEF']
After you flatten your list you can slice it to get every second element:
In [26]: flattened = list(itertools.chain.from_iterable(l))
In [27]: flattened[::2]
Out[27]: ['abc', 'def']
You can use chain.from_iterable, as in this example:
from itertools import chain

list_dict = [{'name': 'Ken', 'bla': [['abc', 'ABC'], ['def', 'DEF']]},
             {'name': 'Bob', 'bla': []},
             {'name': 'Cher', 'bla': [['abc', 'ABC']]}]
out = []
for k in list_dict:
    tmp = {}
    for key, value in k.items():
        # Only lists are flattened; strings are iterable too and must be kept as-is
        if not isinstance(value, list):
            tmp[key] = value
        else:
            # Keep the first item of each pair, then flatten the result
            tmp[key] = list(chain.from_iterable(pair[:1] for pair in value))
    out.append(tmp)
print(out)
[{'name': 'Ken', 'bla': ['abc', 'def']},
 {'name': 'Bob', 'bla': []},
 {'name': 'Cher', 'bla': ['abc']}]
This should work, even if a given d["bla"] is empty:
for d in list_dict:
    d["bla"][:] = (v[0] for v in d["bla"])
For improved efficiency, you can extract d["bla"] once per entry:
for d in list_dict:
    l = d["bla"]
    l[:] = (v[0] for v in l)
This changes list_dict to:
[{'name': 'Ken', 'bla': ['abc', 'def']},
{'name': 'Bob', 'bla': []},
{'name': 'Cher', 'bla': ['abc']}]
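Note that this solution does not rebind d["bla"] to a new list: slice assignment replaces the contents of the existing list in place, so any other reference to that list sees the change. It assumes every inner list has at least one item, since v[0] raises IndexError on an empty one.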
Given:
list_dict = [{"name": "Ken", "bla": [["abc", "ABC"],["def", "DEF"]]},
{"name": "Bob", "bla": []}, #skip the empty list
{"name": "Cher", "bla":[["abc", "ABC"]]}]
desired_dict = [{"name": "Ken", "bla": ["abc", "def"]},
{"name": "Bob", "bla": []},
{"name": "Cher", "bla":["abc"]}]
You can simply recreate d['bla'] to be what you want. Empty lists need no special handling: there is nothing to iterate over, so the result is again an empty list:
for d in list_dict:
    d['bla'] = [sl[0] for sl in d['bla']]
>>> list_dict==desired_dict
True
I've got a list and a dict. Now I want the items in the list to be key values in the dict and they have to be new lists. In a pythonic way :-)
I got it working. But I guess not in a real pythonic way.
my_people = ['theo', 'Jan', 'Jason']
my_classes = {'class_1': {}, 'class_2': {}, 'class_3': {}}
my_classes['class_1'] = dict.fromkeys(my_people, 1)
for p in my_classes['class_1']:
    my_classes['class_1'][p] = []
So is there a way to make the items lists without the last for p in my_classes loop?
Output: {'class_1': {'theo': [], 'Jan': [], 'Jason': []}, 'class_2': {}, 'class_3': {}}
If your input is as you show you can just use a dict comprehension:
my_people = ['theo', 'Jan', 'Jason']
my_classes = {'class_1': {}, 'class_2': {}, 'class_3': {}}
my_classes['class_1'] = {name: [] for name in my_people}
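It may be tempting to shorten the original attempt to dict.fromkeys(my_people, []), but that gives every key the same list object. The sketch below shows the difference; the appended value 'maths' is only sample data for the illustration:

```python
my_people = ['theo', 'Jan', 'Jason']

# fromkeys evaluates [] once, so all keys point at one shared list.
shared = dict.fromkeys(my_people, [])
shared['theo'].append('maths')
print(shared)    # {'theo': ['maths'], 'Jan': ['maths'], 'Jason': ['maths']}

# The comprehension evaluates [] once per name, giving independent lists.
separate = {name: [] for name in my_people}
separate['theo'].append('maths')
print(separate)  # {'theo': ['maths'], 'Jan': [], 'Jason': []}
```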
First of all, note that my_classes as written is a dictionary whose values are empty dictionaries: {} is an empty dict, not a set (a set is a collection without duplicate items).
To create the dictionary you want, you can use a dict comprehension like this:
my_people = ['theo', 'Jan', 'Jason']
dictionary = { elem: [] for elem in my_people }
# -> {'theo': [], 'Jan': [], 'Jason': []}
Consider a dict of the following form:
dic = {
    "First": {
        3: "Three"
    },
    "Second": {
        1: "One"
    },
    "Third": {
        2: "Two"
    }
}
I would like to sort it by the nested dic key (3, 1, 2)
I tried using the lambda function in the following manner but it returns a "KeyError: 0"
dic = sorted(dic.items(), key=lambda x: x[1][0])
The expected output would be:
{
    "Second": {
        1: "One"
    },
    "Third": {
        2: "Two"
    },
    "First": {
        3: "Three"
    }
}
In essence what I want to know is how to designate a nested key independently from the main dictionary key.
In the lambda function, x is a key-value pair, x[1] is the value, which is itself a dictionary. x[1].keys() is its keys, but it needs to be turned into a list if you want to get its one and only item by its index. Thus:
sorted(dic.items(), key = lambda x: list(x[1].keys())[0])
which evaluates to:
[('Second', {1: 'One'}), ('Third', {2: 'Two'}), ('First', {3: 'Three'})]
dic = {'First': {3: 'Three'}, 'Second': {1: 'One'}, 'Third': {2: 'Two'}}
sorted_list = sorted(dic.items(), key=lambda x:list(x[1].keys())[0])
sorted_dict = dict(sorted_list)
print(sorted_dict)
You need to get the keys of the nested dictionary first, convert them into a list, and sort on its first element. This gives you a sorted list of key-value pairs, which you then convert back to a dictionary using dict(). I hope that helps. This snippet works in Python 3; the result keeps the sorted order because dicts preserve insertion order from Python 3.7 onwards.
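The KeyError: 0 in the question comes from x[1][0] being a dictionary lookup for the key 0, not a positional index. As a small sketch of the same idea without building an intermediate list, next(iter(inner)) returns the first key of a dictionary. The helper name first_inner_key is made up for this example, and it assumes each inner dictionary has at least one key:

```python
dic = {"First": {3: "Three"}, "Second": {1: "One"}, "Third": {2: "Two"}}


def first_inner_key(item):
    # item is a (key, inner_dict) pair from dic.items().
    _, inner = item
    # Iterating a dict yields its keys; take the first one.
    return next(iter(inner))


sorted_dict = dict(sorted(dic.items(), key=first_inner_key))
print(sorted_dict)
# {'Second': {1: 'One'}, 'Third': {2: 'Two'}, 'First': {3: 'Three'}}
```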
Given this nested dictionary, how could I print all the "phone" values using a for loop?
people = {
    'Alice': {
        'phone': '2341',
        'addr': '87 Eastlake Court'
    },
    'Beth': {
        'phone': '9102',
        'addr': '563 Hartford Drive'
    },
    'Randy': {
        'phone': '4563',
        'addr': '93 SW 43rd'
    }
}
for d in people.values():
    print(d['phone'])
Loop over the values and use the get() method if you want to handle missing keys (it returns None instead of raising KeyError), or plain indexing to access the nested values. You can also do the whole thing in a list comprehension:
>>> [val.get('phone') for val in people.values()]
['4563', '9102', '2341']
Using a list comprehension
>>> [people[i]['phone'] for i in people]
['9102', '2341', '4563']
Or if you'd like to use a for loop.
l = []
for person in people:
    l.append(people[person]['phone'])
>>> l
['9102', '2341', '4563']
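As a minimal sketch of the missing-key case that get() is meant for, the loop below skips people without a 'phone' entry. The 'Dana' record is a hypothetical addition for the illustration, and the output order shown assumes Python 3.7 or later, where dicts keep insertion order:

```python
people = {
    'Alice': {'phone': '2341', 'addr': '87 Eastlake Court'},
    'Beth': {'phone': '9102', 'addr': '563 Hartford Drive'},
    'Dana': {'addr': '1 Example Road'},  # hypothetical entry without a phone
}

phones = []
for name, details in people.items():
    phone = details.get('phone')  # None when the key is missing
    if phone is not None:
        phones.append(phone)
    else:
        print(f"{name} has no phone on record")

print(phones)  # ['2341', '9102']
```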