Having trouble turning these loops into a dictionary comprehension - it might be impossible.
The general idea is that I have a dictionary of excludes that looks like this:
excludes = {
"thing1": ["name", "address"],
"thing2": ["username"]
}
Then I have a larger dictionary that I want to "clean" using the exclusions
original_dict = {
"thing1": {"name": "John", "address": "123 Anywhere Drive", "occupation": "teacher" },
"thing2": {"username": "bearsfan123", "joined_date": "01/01/2015"},
"thing3": {"pet_name": "Spot"}
}
If I run the following:
for k, v in original_dict.iteritems():
    if k in excludes.keys():
        for key in excludes[k]:
            del v[key]
I'm left with:
original_dict = {
"thing1": {"occupation": "teacher" },
"thing2": {"joined_date": "01/01/2015"},
"thing3": {"pet_name": "Spot"}
}
This is perfect, but I'm not sure if I can better represent this as a dictionary comprehension - simply adding the keys I want rather than deleting the ones I don't.
I've gotten as far as the second for, but I'm not sure how to represent it in a comprehension:
new_dict = {k: v for (k, v) in original_dict.iteritems()}
{k:{sub_k:val for sub_k, val in v.iteritems()
if sub_k not in excludes.get(k, {})}
for k,v in original_dict.iteritems()}
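Note the need for excludes.get(k, {}): keys such as "thing3" have no entry in excludes, so indexing with excludes[k] would raise a KeyError.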
After pasting in your data and running it in IPython:
In [154]: %cpaste
Pasting code; enter '--' alone on the line to stop or use Ctrl-D.
:{k:{sub_k:val for sub_k, val in v.iteritems()
: if sub_k not in excludes.get(k, {})}
: for k,v in original_dict.iteritems()}
:--
Out[154]:
{'thing1': {'occupation': 'teacher'},
'thing2': {'joined_date': '01/01/2015'},
'thing3': {'pet_name': 'Spot'}}
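For readers on Python 3, here is a minimal sketch of the same comprehension, assuming the only change needed is .items() in place of .iteritems(). It also shows what goes wrong if excludes is indexed directly instead of going through .get.

```python
excludes = {
    "thing1": ["name", "address"],
    "thing2": ["username"],
}
original_dict = {
    "thing1": {"name": "John", "address": "123 Anywhere Drive", "occupation": "teacher"},
    "thing2": {"username": "bearsfan123", "joined_date": "01/01/2015"},
    "thing3": {"pet_name": "Spot"},
}

# Python 3: .items() replaces .iteritems()
new_dict = {
    k: {sub_k: val for sub_k, val in v.items()
        if sub_k not in excludes.get(k, {})}
    for k, v in original_dict.items()
}
print(new_dict)

# Indexing directly fails for keys that have no exclusions
try:
    excludes["thing3"]
except KeyError as err:
    print("KeyError:", err)
```

Unlike the del-based loop, this builds a new dictionary and leaves original_dict untouched.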
I'd personally argue that the for-loop approach is more readable and less surprising to readers across a range of experience levels.
A slight variation of the for loop approach that doesn't require evil side-effects with del and uses an inner dict comprehension:
new_dict = {}
for k, v in original_dict.iteritems():
    k_excludes = excludes.get(k, {})
    new_dict[k] = {sub_k: sub_v for sub_k, sub_v in v.iteritems()
                   if sub_k not in k_excludes}
I have a list comprehension that builds dictionaries. My goal is to create a dictionary with one key if the "formula" key does not exist, or a dictionary with two keys if it does.
Currently I have something like this, and it works:
cols = [{"header":k, **(({"formula":v["formula"]}) if v.get("formula") else {})} for k,v in inp["cols"].items()]
Is there any shorter / more elegant way to gain the same effect?
Edit (expected output): for clarification, what I need to achieve is
inp = {"cols":{"header1":{"formula":"test"}}}
cols = [{"header":k, **(({"formula":v["formula"]}) if v.get("formula") else {})} for k,v in inp["cols"].items()]
->
[{'header': 'header1', 'formula': 'test'}]
inp = {"cols":{"header1":{"notformula":"test"}}}
cols = [{"header":k, **(({"formula":v["formula"]}) if v.get("formula") else {})} for k,v in inp["cols"].items()]
->
[{'header': 'header1'}]
You can improve it slightly using the dict union operator introduced in Python 3.9. Here it is with extra line breaks for readability and PEP 8 compliance.
cols = [
{"header": k} | ({"formula": v["formula"]} if "formula" in v else {})
for k, v in inp["cols"].items()
]
I've also replaced your v.get("formula") with an in check. v.get("formula") is falsy if v is {"formula": []}, for instance, which you may, but probably don't, want.
I would consider extracting the logic on the second line into a function.
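As a rough sketch of that extraction: make_col below is a hypothetical helper name, not something from the question, and the sample input combines the two cases from the question's edit.

```python
def make_col(header, spec):
    """Build one column dict; "formula" is copied only when present in spec."""
    col = {"header": header}
    if "formula" in spec:
        col["formula"] = spec["formula"]
    return col


inp = {"cols": {"header1": {"formula": "test"}, "header2": {"notformula": "test"}}}
cols = [make_col(k, v) for k, v in inp["cols"].items()]
print(cols)
# [{'header': 'header1', 'formula': 'test'}, {'header': 'header2'}]
```

This version does not need the union operator, so it should also run on Python versions older than 3.9.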
I am looking for a nice and efficient way to get particular values out of a dictionary.
{
"online_apply":{
"value":"true"
},
"interview_format":{
"value":"In-store"
},
"interview_time":{
"value":"4PM-5PM",
}
}
I am trying to transform the dictionary above to:
{
"online_apply": "true",
"interview_format": "In-store",
"interview_time": "4PM-5PM"
}
Thank you for your help
You can use a dict comprehension:
{k: v['value'] for k, v in d.items()}
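Applied to the data from the question, it might look like this. The sketch assumes every inner dictionary has a "value" key; a missing one would raise a KeyError.

```python
d = {
    "online_apply": {"value": "true"},
    "interview_format": {"value": "In-store"},
    "interview_time": {"value": "4PM-5PM"},
}

# Replace each inner dict with the string stored under its "value" key
flat = {k: v["value"] for k, v in d.items()}
print(flat)
# {'online_apply': 'true', 'interview_format': 'In-store', 'interview_time': '4PM-5PM'}
```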
I have a dictionary of dictionaries that looks like this:
d = {names: {IDs: {"constant": value, "some_list": []}}
Where each name potentially has multiple IDs, and each ID has a constant value and a list of variable length. My goal is to print the names and IDs whose list exceeds a given length. I know how to do this with nested for loops:
for n in d:
    for i in d[n]:
        num = len(d[n][i]["some_list"])
        if num > 5:
            print("Warning %s %s has %i items" % (n, i, num))
I do not have a reason why the above is not acceptable, it works and is readable.
I'm curious though if there is a way to specify n and i on a single for loop. The following fail for different reasons:
for one in d.values().keys(): # fails as list has no attribute keys
for one.keys() in d.values(): # fails as functions can't be assigned to calls
The following will generate a list of tuples that could then be iterated over, but still contains two for loops inside the comprehension and would require an additional loop through the new list to print:
new_list = [(n, i) for n in d for i in d[n] if len(d[n][i]["some_list"]) > 5]
Is it impossible to do this without using two for loops, or is there a trick I'm missing?
With a 2+ level dictionary, you are always going to have nested loops somewhere, whether they're hidden inside a function or expressed explicitly.
What you may want to do is change the structure of your dictionary to make it use a multi-part key (in a tuple):
tupleDict = { (name,idx):content for name,idd in d.items() for idx,content in idd.items() }
print(tupleDict)
# {('name1', 'ID11'): {'constant': 1, 'some_list': [1]}, ('name1', 'ID12'): {'constant': 2, 'some_list': [2]}, ...}
Then, you can apply filters without nested loops using that alternate structure:
min5Lists = { k:v for k,v in tupleDict.items() if len(v['some_list'])>5 }
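The other option mentioned above, hiding the nested loops inside a function, could be sketched as a generator. The name iter_entries and the sample data are made up for illustration; the calling code then needs only one for statement.

```python
def iter_entries(d):
    """Yield (name, id, content) triples; the nested loops live here."""
    for name, ids in d.items():
        for id_, content in ids.items():
            yield name, id_, content


# Illustrative data only; list lengths chosen so that two entries exceed 5
d = {
    "name1": {
        "id1": {"constant": 1, "some_list": list(range(7))},
        "id2": {"constant": 2, "some_list": [1]},
    },
    "name2": {
        "idA": {"constant": 3, "some_list": list(range(6))},
    },
}

warnings = []
for n, i, content in iter_entries(d):
    num = len(content["some_list"])
    if num > 5:
        warnings.append("Warning %s %s has %i items" % (n, i, num))

print("\n".join(warnings))
# Warning name1 id1 has 7 items
# Warning name2 idA has 6 items
```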
A lot of problems here. First, your syntax is incorrect.
d = {names: {IDs: {"constant": value, "some_list": []}}
should be more like
d = {"names": {"IDs": {"constant": "some_value", "some_list": []}}}
If each name can have multiple IDs, and there can be multiple names, show a more complete example:
d = {
"name1": {
"id1": {"constant": "value1", "some_list": []},
"id2": {"constant": "value2", "some_list": []},
},
"name2": {
"idA": {"constant": "valueA", "some_list": []},
"idB": {"constant": "valueB", "some_list": []},
},
}
Since you have an arbitrary set of IDs inside an arbitrary set of names, I don't think it can be done with one loop.
If you could do it in one loop, it would be very difficult to understand and maintain.
I have a nested list of dictionaries, as follows:
list_of_dict = [
{
"key": "key1",
"data": [
{
"u_key": "u_key_1",
"value": "value_1"
},
{
"u_key": "u_key_2",
"value": "value_2"
}
]
},
{
"key": "key2",
"data": [
{
"u_key": "u_key_1",
"value": "value_3"
},
{
"u_key": "u_key_2",
"value": "value_4"
}
]
}
]
As you can see, list_of_dict is a list of dicts, and inside each of them, data is also a list of dicts. Assume that all the objects inside list_of_dict and data have a similar structure and that all the keys are always present.
In the next step I convert list_of_dict to list_of_tuples, where the first element of each tuple is key, followed by a value stored under the value key inside data:
list_of_tuples = [
('key1', 'value_1'),
('key1', 'value_2'),
('key2', 'value_3'),
('key2','value_4')
]
The final step is a comparison with a list (comparison_list) of string values. The values in this list can come from the value key inside data. I need to check whether any value in comparison_list appears in list_of_tuples and fetch the key (the first item of the tuple) for that value.
comparison_list = ['value_1', 'value_2']
My expected output is:
out = ['key1', 'key1']
My solution is as follows:
>>> list_of_tuples = [(c.get('key'), x.get('value'))
...                   for c in list_of_dict for x in c.get('data')]
>>> for t in list_of_tuples:
...     if t[1] in comparison_list:
...         print("Found: {}".format(t[0]))
In summary, I have a list of values (comparison_list) that I need to find inside the data arrays.
The dataset that I am operating on is quite large (>100M). I am looking to speed up my solution and also make it more compact and readable.
Can I somehow skip the step where I create list_of_tuples and do the comparison directly?
There are a few simple optimizations you can try:
make comparison_list a set so the lookup is O(1) instead of O(n)
make list_of_tuples a generator, so you don't have to materialize all the entries at once
you can also integrate the condition into the generator itself
Example:
comparison_set = set(['value_1', 'value_2'])
tuples_generator = ((c['key'], x['value'])
for c in list_of_dict for x in c['data']
if x['value'] in comparison_set)
print(*tuples_generator)
# ('key1', 'value_1') ('key1', 'value_2')
Of course, you can also keep the comparison separate from the generator:
tuples_generator = ((c['key'], x['value'])
for c in list_of_dict for x in c['data'])
for k, v in tuples_generator:
    if v in comparison_set:
        print(k, v)
Or you could instead create a dict mapping values from comparison_set to keys from list_of_dict. This will make finding the key for a particular value faster, but note that you can then only keep one key for each value.
values_dict = {x['value']: c['key']
for c in list_of_dict for x in c['data']
if x['value'] in comparison_set}
print(values_dict)
# {'value_2': 'key1', 'value_1': 'key1'}
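If losing keys is a problem, one possible variation is to map each value to a list of keys with collections.defaultdict. This is only a sketch: the sample data below is altered from the question so that value_1 appears under both key1 and key2, and keys_by_value is a made-up name.

```python
from collections import defaultdict

# Altered sample: "value_1" occurs under both "key1" and "key2"
list_of_dict = [
    {"key": "key1", "data": [{"u_key": "u_key_1", "value": "value_1"},
                             {"u_key": "u_key_2", "value": "value_2"}]},
    {"key": "key2", "data": [{"u_key": "u_key_1", "value": "value_1"},
                             {"u_key": "u_key_2", "value": "value_4"}]},
]
comparison_set = {"value_1", "value_2"}

# Collect every key under which a wanted value occurs
keys_by_value = defaultdict(list)
for c in list_of_dict:
    for x in c["data"]:
        if x["value"] in comparison_set:
            keys_by_value[x["value"]].append(c["key"])

print(dict(keys_by_value))
# {'value_1': ['key1', 'key2'], 'value_2': ['key1']}
```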
In the last step you can use filter instead of an explicit loop, something like this:
comparison_list = ['value_1', 'value_2']
print(list(filter(lambda x:x[1] in comparison_list,list_of_tuples)))
output:
[('key1', 'value_1'), ('key1', 'value_2')]
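Note that the filter call above still tests membership against a list, which is a linear scan for every tuple. A sketch that uses a set and collects only the keys, without building list_of_tuples first, might look like this (comparison_set is an assumed name; the data is the question's):

```python
list_of_dict = [
    {"key": "key1", "data": [{"u_key": "u_key_1", "value": "value_1"},
                             {"u_key": "u_key_2", "value": "value_2"}]},
    {"key": "key2", "data": [{"u_key": "u_key_1", "value": "value_3"},
                             {"u_key": "u_key_2", "value": "value_4"}]},
]
comparison_set = {"value_1", "value_2"}  # set membership tests are O(1) on average

# Walk the nested structure once and keep the outer key for each match
out = [c["key"]
       for c in list_of_dict
       for x in c["data"]
       if x["value"] in comparison_set]
print(out)
# ['key1', 'key1']
```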
I have the following list:
["stephane", "philippe", "hélène", ["hugo", "jean-michel", "fernand"], "gustave"]
And I would like to order it like this:
["gustave", "hélène", ["fernand", "hugo", "jean-michel"], "philippe", "stephane"]
NB: If there is a nested list following a user, this list must stay to the right of this user.
In addition, all nested lists work the same way; the ordering is recursive.
Your data sounds like it would be better represented as a dictionary. Lists where consecutive elements have a special relationship sound odd.
If you instead represented your data like this:
{
"stephane": {},
"philippe": {},
"hélène": {
"hugo": {},
"jean-michel": {},
"fernand": {},
},
"gustave": {},
}
Then you can simply sort the keys of the dictionaries to get the order you want.
I've used Ned's proposal and came up with this:
d = {
"stephane": {},
"philippe": {},
"helene": {
"hugo": {},
"jean-michel": {},
"fernand": {},
},
"gustave": {},
}
def sort_dict_as_list(d):
    # Flatten one level: each key, followed by its (already sorted) children if any
    sorted_list = []
    for k, v in sorted(d.items()):
        if k:
            sorted_list.append(k)
        if v:
            sorted_list.append(v)
    return sorted_list
def sort_recursive(d):
    if d:
        # Sort the children first, then this level
        for k, v in d.items():
            d[k] = sort_recursive(v)
        return sort_dict_as_list(d)
    else:
        return d
if __name__ == "__main__":
    print(sort_recursive(d))
Output
python sortit.py
['gustave', 'helene', ['fernand', 'hugo', 'jean-michel'], 'philippe', 'stephane']
I haven't tested it thoroughly, but it's a starting point. I was trying to solve it with a list as a data structure, but I ended up nesting recursive functions and it was way too ugly... Ned's proposal was really good.
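For comparison, here is one way the list form from the question could be sorted directly. It is a sketch under the assumption that a nested list always comes immediately after the name it belongs to; sort_nested is a made-up name, and plain sorted() compares by code point, which happens to give the expected order for this data but is not locale-aware.

```python
def sort_nested(items):
    """Sort names alphabetically; a list that follows a name stays attached to it."""
    groups = []  # each entry is [name, sorted_sublist_or_None]
    for item in items:
        if isinstance(item, list):
            # Assumption: a nested list always follows the name it belongs to
            groups[-1][1] = sort_nested(item)
        else:
            groups.append([item, None])

    result = []
    for name, sub in sorted(groups, key=lambda g: g[0]):
        result.append(name)
        if sub is not None:
            result.append(sub)
    return result


data = ["stephane", "philippe", "hélène", ["hugo", "jean-michel", "fernand"], "gustave"]
print(sort_nested(data))
# ['gustave', 'hélène', ['fernand', 'hugo', 'jean-michel'], 'philippe', 'stephane']
```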