Python filter defaultdict - python

I have a defaultdict of lists, but I want to basically do this:
myDefaultDict = filter(lambda k: len(k)>1, myDefaultDict)
Except it only seems to work with lists. What can I do?

Are you trying to get only values with len > 1?
Dictionary comprehensions are a good way to handle this:
reduced_d = {k: v for k, v in myDefaultDict.items() if len(v) > 1}
As martineau pointed out, this does not give you the defaultdict behavior of the source myDefaultDict. To keep that behavior, pass the dict comprehension to the defaultdict constructor, as martineau shows:
from collections import defaultdict
myDefaultDict = defaultdict(list, {'ab': [1,2,3], 'c': [4], 'def': [5,6]}) # original
reduced_d = defaultdict(list, {k: v for k, v in myDefaultDict.items() if len(v) > 1})
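If you filter like this in several places, the pattern above can be wrapped in a small helper. This is only a sketch: the name filter_defaultdict and its keep(key, value) callback are made up for illustration and are not part of the collections module.

```python
from collections import defaultdict

def filter_defaultdict(source, keep):
    """Return a new defaultdict with the same default_factory as `source`,
    keeping only the items for which keep(key, value) is true."""
    return defaultdict(source.default_factory,
                       {k: v for k, v in source.items() if keep(k, v)})

my_default_dict = defaultdict(list, {'ab': [1, 2, 3], 'c': [4], 'def': [5, 6]})
reduced = filter_defaultdict(my_default_dict, lambda k, v: len(v) > 1)

print(dict(reduced))      # {'ab': [1, 2, 3], 'def': [5, 6]}
reduced['new'].append(7)  # missing keys still get a fresh list
print(reduced['new'])     # [7]
```

Reusing source.default_factory means the helper also works for defaultdict(int), defaultdict(set), and so on.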

I'm not sure whether you want to delete keys or values longer than 1.
Assuming it's the length of the key, here's how to do it with filter:
from collections import defaultdict
# create test data
my_defaultdict = defaultdict(list, {'ab': [1,2,3], 'c': [4], 'def': [5,6]})
my_defaultdict = defaultdict(my_defaultdict.default_factory,
                             filter(lambda i: len(i[0])>1, my_defaultdict.items()))
print(my_defaultdict)
Output (Python 2; Python 3 shows <class 'list'> instead of <type 'list'>):
defaultdict(<type 'list'>, {'ab': [1, 2, 3], 'def': [5, 6]})
If it's the length of the associated value, just change the len(i[0]) to len(i[1]).

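You can use filter and then use map to remove each dict entry based on the filtered condition (shown in Python 2, where filter and map return lists; in Python 3 both are lazy iterators, so this session behaves differently):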
>>> example={'l1':[1], 'l2':[2,3], 'l3':[4,5,6], 'l4':[7]}
>>> filter(lambda k: len(example[k])<2, example)
['l4', 'l1']
>>> map(example.__delitem__, filter(lambda k: len(example[k])<2, example))
[None, None]
>>> example
{'l2': [2, 3], 'l3': [4, 5, 6]}
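In Python 3 the session above does not carry over directly: filter walks the dict lazily, so deleting entries while it is still being consumed raises the "dictionary changed size during iteration" RuntimeError. A minimal Python 3 sketch of the same idea, assuming you first materialize the keys to delete (the name short_keys is illustrative):

```python
example = {'l1': [1], 'l2': [2, 3], 'l3': [4, 5, 6], 'l4': [7]}

# Collect the keys to delete into a list first, so the dict is no longer
# being iterated when entries are removed.
short_keys = list(filter(lambda k: len(example[k]) < 2, example))

for key in short_keys:
    del example[key]

print(example)  # {'l2': [2, 3], 'l3': [4, 5, 6]}
```

A plain for loop is used for the deletion because map would only run its side effect once something consumes the iterator.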

Related

How to update multiple dictionary values based on a condition?

I have a dictionary which looks like:
dict = {'A':[1,2], 'B':[0], 'c':[4]}
need it to look like:
dict = {'A':[1,2], 'B':[0,0], 'c':[4,0]}
What I am doing right now:
dict = {x: y+[0] for (x,y) in dict.items() if len(y) < 2}
which generates:
dict = {'B':[0,0], 'c':[4,0]}
Any idea how I could keep the entries that do not meet the condition instead of dropping them?
You're almost there. Try:
my_dict = {x: y + [0] if len(y) < 2 else y
           for (x,y) in dict.items()}
(as mentioned by jp_data_analysis, avoid naming variables after builtins like dict)
This is one way.
Note: do not name variables after classes, e.g. use d instead of dict.
d = {'A':[1,2], 'B':[0], 'c':[4]}
d = {k: v if len(v)==2 else v+[0] for k, v in d.items()}
# {'A': [1, 2], 'B': [0, 0], 'c': [4, 0]}
You can use dictionary comprehension:
d = {'A':[1,2], 'B':[0], 'c':[4]}
new_d = {a:b+[0] if len(b) == 1 else b for a, b in d.items()}
Also, it is best practice not to assign variables to names shadowing common builtins, such as dict, as you are then overriding the function in the current namespace.
Your code is almost correct. Your problem is that you're filtering out every list whose length is 2 or more. What you need to do instead is simply place those lists in the new dictionary unchanged. This can be done using the ternary operator, which has the form value1 if condition else value2.
Also, if you want a more general way to pad every list in your dictionary to be of equal length, you can use map and max.
Here is your code with the above modifications:
>>> d = {'A':[1, 2], 'B': [0], 'c': [4]}
>>>
>>> max_len = max(map(len, d.values()))
>>> {k: v + [0] * (max_len - len(v)) if len(v) < max_len else v for k, v in d.items()}
{'A': [1, 2], 'B': [0, 0], 'c': [4, 0]}
>>>
A generalized way:
d = {'A':[1,2], 'B':[0], 'c':[4]}
m = max(len(v) for v in d.values())
for k, v in d.items():
    if len(v) < m:
        d[k].extend([0 for i in range(m-len(v))])
You were very close, just use update():
d = {'A':[1,2], 'B':[0], 'c':[4]}
d.update({x: y+[0] for (x,y) in d.items() if len(y) < 2})
d
# {'A': [1, 2], 'B': [0, 0], 'c': [4, 0]}
Like others have said, don't reassign built-in names like dict; it's a one-way street to debugging hell.
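The padding answers above can be folded into one reusable function. The sketch below is one possible shape, not code from any of the answers: the name pad_lists and the fill parameter are assumptions, and it returns a new dict rather than modifying the input.

```python
def pad_lists(d, fill=0):
    """Return a new dict whose list values are right-padded with `fill`
    so that every list is as long as the longest one."""
    if not d:
        return {}
    target = max(len(v) for v in d.values())
    # v + [...] always builds a new list, so the input lists are untouched.
    return {k: v + [fill] * (target - len(v)) for k, v in d.items()}

d = {'A': [1, 2], 'B': [0], 'c': [4]}
padded = pad_lists(d)
print(padded)  # {'A': [1, 2], 'B': [0, 0], 'c': [4, 0]}
```

No conditional is needed inside the comprehension, because multiplying a list by zero gives an empty list.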

Python: "Dictionary changed size during iteration" [duplicate]

I have a dictionary of lists in which some of the values are empty:
d = {'a': [1], 'b': [1, 2], 'c': [], 'd':[]}
At the end of creating these lists, I want to remove these empty lists before returning my dictionary. I tried doing it like this:
for i in d:
    if not d[i]:
        d.pop(i)
but I got a RuntimeError. I am aware that you cannot add/remove elements in a dictionary while iterating through it...what would be a way around this then?
See Modifying a Python dict while iterating over it for citations that this can cause problems, and why.
In Python 3.x and 2.x you can use list to force a copy of the keys to be made:
for i in list(d):
In Python 2.x calling keys made a copy of the keys that you could iterate over while modifying the dict:
for i in d.keys():
But note that in Python 3.x this second method doesn't help with your error, because keys returns a view object instead of copying the keys into a list.
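To make the difference concrete, here is a short Python 3 sketch that runs both loops on the data from the question. The raised flag is only there to record that the first loop fails.

```python
d = {'a': [1], 'b': [1, 2], 'c': [], 'd': []}

# Iterating over the keys view while deleting fails in Python 3.
raised = False
try:
    for key in d.keys():
        if not d[key]:
            del d[key]
except RuntimeError as exc:
    raised = True
    print("keys() view:", exc)

# list(d) takes a snapshot of the keys, so deleting inside the loop is safe.
d = {'a': [1], 'b': [1, 2], 'c': [], 'd': []}
for key in list(d):
    if not d[key]:
        del d[key]

print(d)  # {'a': [1], 'b': [1, 2]}
```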
You only need to use copy:
This way you iterate over a snapshot of the original dictionary's keys and can change the dict d itself on the fly.
It works on every Python version, so it's clearer.
In [1]: d = {'a': [1], 'b': [1, 2], 'c': [], 'd':[]}
In [2]: for i in d.copy():
   ...:     if not d[i]:
   ...:         d.pop(i)
   ...:
In [3]: d
Out[3]: {'a': [1], 'b': [1, 2]}
(BTW - Generally, to iterate over a copy of your data structure, instead of using .copy for dictionaries or slicing [:] for lists, you can use the copy module: copy.copy makes a shallow copy, equivalent to dict.copy or list slicing [:], and copy.deepcopy makes a deep copy.)
Just use dictionary comprehension to copy the relevant items into a new dict:
>>> d
{'a': [1], 'c': [], 'b': [1, 2], 'd': []}
>>> d = {k: v for k, v in d.items() if v}
>>> d
{'a': [1], 'b': [1, 2]}
For this in Python 2:
>>> d
{'a': [1], 'c': [], 'b': [1, 2], 'd': []}
>>> d = {k: v for k, v in d.iteritems() if v}
>>> d
{'a': [1], 'b': [1, 2]}
This worked for me:
d = {1: 'a', 2: '', 3: 'b', 4: '', 5: '', 6: 'c'}
for key, value in list(d.items()):
    if value == '':
        del d[key]
print(d)
# {1: 'a', 3: 'b', 6: 'c'}
Casting the dictionary items to list creates a list of its items, so you can iterate over it and avoid the RuntimeError.
I would try to avoid inserting empty lists in the first place, but, would generally use:
d = {k: v for k,v in d.iteritems() if v} # re-bind to non-empty
If prior to 2.7:
d = dict( (k, v) for k,v in d.iteritems() if v )
or just:
empty_keys = [k for k, v in d.iteritems() if not v]
for k in empty_keys:
    del d[k]
To avoid the "dictionary changed size during iteration" error, for example when you try to delete some keys, just wrap .items() in list(). Here is a simple example:
my_dict = {
    'k1':1,
    'k2':2,
    'k3':3,
    'k4':4
}
print(my_dict)
for key, val in list(my_dict.items()):
    if val == 2 or val == 4:
        my_dict.pop(key)
print(my_dict)
Output:
{'k1': 1, 'k2': 2, 'k3': 3, 'k4': 4}
{'k1': 1, 'k3': 3}
This is just an example. Change it based on your case/requirements.
For Python 3:
{k:v for k,v in d.items() if v}
You cannot change a dictionary's size while a for loop is iterating over it. Convert it to a list and iterate over that list instead. It works for me.
for key in list(d):
    if not d[key]:
        d.pop(key)
Python 3 raises a RuntimeError if you delete entries from a dictionary while iterating over it (as in the for loop above). There are various alternatives; one simple way is to change the line
for i in x.keys():
to
for i in list(x):
The reason for the runtime error is that you cannot iterate through a data structure while its structure is changing during iteration.
One way to achieve what you are looking for is to use a list to append the keys you want to remove and then use the pop function on dictionary to remove the identified key while iterating through the list.
d = {'a': [1], 'b': [1, 2], 'c': [], 'd':[]}
pop_list = []
for i in d:
    if not d[i]:
        pop_list.append(i)
for x in pop_list:
    d.pop(x)
print (d)
For situations like this, I like to make a deep copy and loop through that copy while modifying the original dict.
If the lookup field is within a list, you can enumerate in the for loop of the list and then specify the position as the index to access the field in the original dict.
Nested null values
Let's say we have a dictionary with nested keys, some of which are null values:
dicti = {
    "k0_l0":{
        "k0_l1": {
            "k0_l2": {
                "k0_0":None,
                "k1_1":1,
                "k2_2":2.2
            }
        },
        "k1_l1":None,
        "k2_l1":"not none",
        "k3_l1":[]
    },
    "k1_l0":"l0"
}
Then we can remove the null values using this function:
def pop_nested_nulls(dicti):
    for k in list(dicti):
        if isinstance(dicti[k], dict):
            dicti[k] = pop_nested_nulls(dicti[k])
        elif not dicti[k]:
            dicti.pop(k)
    return dicti
Output for pop_nested_nulls(dicti)
{'k0_l0': {'k0_l1': {'k0_l2': {'k1_1': 1,
                               'k2_2': 2.2}},
           'k2_l1': 'not none'},
 'k1_l0': 'l0'}
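Note that the function above tests truthiness, so it also drops 0, empty strings and empty lists, and it modifies the dictionary in place. If only None should go, a non-mutating variant could look like the sketch below. The name drop_none and the sample data are made up for illustration.

```python
def drop_none(value):
    """Return a copy of a nested dict with every None-valued key removed.
    Non-dict values (including 0, '' and []) are returned unchanged."""
    if isinstance(value, dict):
        return {k: drop_none(v) for k, v in value.items() if v is not None}
    return value

nested = {
    "a": {"b": None, "c": 0, "d": {"e": None, "f": []}},
    "g": None,
    "h": "kept",
}
cleaned = drop_none(nested)
print(cleaned)  # {'a': {'c': 0, 'd': {'f': []}}, 'h': 'kept'}
```

Because it builds new dicts instead of popping keys, there is no iteration-while-mutating problem to work around.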
The Python "RuntimeError: dictionary changed size during iteration" occurs when we change the size of a dictionary when iterating over it.
To solve the error, use the copy() method to create a shallow copy of the dictionary that you can iterate over, e.g., my_dict.copy().
my_dict = {'a': 1, 'b': 2, 'c': 3}
for key in my_dict.copy():
    print(key)
    if key == 'b':
        del my_dict[key]
print(my_dict) # {'a': 1, 'c': 3}
You can also convert the keys of the dictionary to a list and iterate over the list of keys.
my_dict = {'a': 1, 'b': 2, 'c': 3}
for key in list(my_dict.keys()):
    print(key)
    if key == 'b':
        del my_dict[key]
print(my_dict) # {'a': 1, 'c': 3}
If the values in the dictionary were unique too, then I used this solution:
keyToBeDeleted = None
for k, v in mydict.items():
    if(v == match):
        keyToBeDeleted = k
        break
mydict.pop(keyToBeDeleted, None)

What is the pythonic way to reverse a defaultdict(list)?

What is the pythonic way to reverse a defaultdict(list)?
I could iterate through the defaultdict and create a new defaultdict.
Is there any other way? Is this pythonic:
>>> from collections import defaultdict
>>> x = defaultdict(list)
>>> y = [[1,2,3,4],[3,4,5,6]]
>>> z= ['a','b']
>>> for i,j in zip(y,z):
...     x[j] = i
...
>>> x
defaultdict(<type 'list'>, {'a': [1, 2, 3, 4], 'b': [3, 4, 5, 6]})
>>> x2 = defaultdict(list)
>>> for k,v in x.items():
...     for i in v:
...         x2[i].append(k)
...
>>> x2
defaultdict(<type 'list'>, {1: ['a'], 2: ['a'], 3: ['a','b'], 4: ['a','b'], 5: ['b'], 6: ['b']})
I believe the best way is to simply loop as you did:
target = defaultdict(list)
for key, values in original.items():
    for value in values:
        target[value].append(key)
Alternatively you could avoid the inner for:
for key, values in original.items():
    target.update(zip(values, [key] * len(values)))
Or using itertools.repeat:
import itertools as it
for key, values in original.items():
    target.update(zip(values, it.repeat(key)))
However, these last two solutions only work for the simple case where values in different lists are distinct, and they map each value to a single key rather than to a list of keys.
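The difference shows up as soon as two lists share a value. The sketch below runs the nested loop and the update/zip variant on the data from the question; the names looped and updated are illustrative.

```python
from collections import defaultdict

original = {'a': [1, 2, 3, 4], 'b': [3, 4, 5, 6]}

# Nested loop: every key that contains a value is recorded.
looped = defaultdict(list)
for key, values in original.items():
    for value in values:
        looped[value].append(key)

# update() + zip: for shared values, a later key overwrites an earlier one.
updated = defaultdict(list)
for key, values in original.items():
    updated.update(zip(values, [key] * len(values)))

print(dict(looped))   # {1: ['a'], 2: ['a'], 3: ['a', 'b'], 4: ['a', 'b'], 5: ['b'], 6: ['b']}
print(dict(updated))  # {1: 'a', 2: 'a', 3: 'b', 4: 'b', 5: 'b', 6: 'b'}
```

For values 3 and 4 the update variant silently loses 'a', which is why the plain loop is the safer default.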
Remember that pythonic doesn't have any definite meaning. I'd consider a solution pythonic if it:
Is readable
Correctly uses the language features
Correctly uses the built-ins/standard library
Is efficient
The points are in order of importance. Efficiency is last because it is most often implied by points 2 and 3.
Is this more pythonic or just more cryptic?
map(lambda (i, k): x2[i].append(k), [(i, k) for k, v in x.items() for i in v])
The following variant is required for Python 3 (where map is lazy, so it must be consumed, for example with list()), and is less clear:
list(map(lambda i_k: x2[i_k[0]].append(i_k[1]), [(i, k) for k, v in x.items() for i in v]))
Writing this, I've concluded this is probably about the least pythonic way of doing it. But possibly educational; it was for me.
Edit: Don't do this.

get the smallest items from a listOfLists group by keys

I've got a list like
listOfLists = [['key2', 1], ['key1', 2], ['key2', 2], ['key1', 1]]
The first item of an inner list is the key. The second item of an inner list is the value.
I want to get the output [['key1', 1], ['key2', 1]]: for each key, the inner list with the smallest value among the lists that share that key, grouped by key (like GROUP BY with MIN in SQL).
I've written some code like this:
listOfLists = [['key2', 1], ['key1', 2], ['key2', 2], ['key1', 1]]
listOfLists.sort() #this will sort by key, and then ascending by value
output = []
for index, l in enumerate(listOfLists):
    if index == 0:
        output.append(l)
    elif l[0] == listOfLists[index - 1][0]:
        #has the same key, and the value is larger, discard
        continue
    else:
        output.append(l)
this seems not smart enough
is there any simpler way to do this work?
How about using a dictionary (no need to sort the data)?
>>> listOfLists = [['key2', 1], ['key1', 2], ['key2', 2], ['key1', 1]]
>>> d = {}
>>> for k,v in listOfLists:
...     d.setdefault(k, []).append(v)
...
>>> d = {k:min(v) for k,v in d.items()}
>>> d
{'key2': 1, 'key1': 1}
You can convert to a list if you want
O(N log N) solution
You can just use the dict constructor for this. It is O(N log N) because of the sorting step
>>> dict(sorted(listOfLists, reverse=True))
{'key2': 1, 'key1': 1}
To see why this works, look at the result of sorted
>>> sorted(listOfLists, reverse=True)
[['key2', 2], ['key2', 1], ['key1', 2], ['key1', 1]]
The dict constructor will replace each key as it traverses the list and sorted has pushed the minimum for each key to the end of the sublist for that key
O(N) solution
>>> d = {}
>>> for k, v in listOfLists:
...     d[k] = min(d.get(k, v), v)
...
>>> d
{'key2': 1, 'key1': 1}
The itertools module has a very useful groupby function that is probably exactly what you need:
from itertools import groupby
listOfLists.sort()
for key, subgroup in groupby(listOfLists, lambda item: item[0]):
    print(key, min(subgroup))
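To get the list of lists the question asks for rather than printed lines, the groupby approach can be written as a comprehension. This is a sketch for Python 3; the names list_of_lists, ordered and smallest are illustrative.

```python
from itertools import groupby
from operator import itemgetter

list_of_lists = [['key2', 1], ['key1', 2], ['key2', 2], ['key1', 1]]

# groupby only merges adjacent items, so sort by key first.
ordered = sorted(list_of_lists, key=itemgetter(0))

# For each run of equal keys, keep the inner list with the smallest value.
smallest = [min(group, key=itemgetter(1))
            for _, group in groupby(ordered, key=itemgetter(0))]

print(smallest)  # [['key1', 1], ['key2', 1]]
```

Using sorted instead of list.sort leaves the original list in its input order.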
