python list of dicts remove items with empty lists - python

Given a list l like the one below:
l = [{'x': 2}, {'y': [], 'z': 'hello'}, {'a': []}]
I need to remove the entries whose value is an empty list.
Expected output:
[{'x': 2}, {'z': 'hello'}]
I was trying to achieve this with a list comprehension and need help.

The following will work for your data:
>>> [d for d in ({k: v for k, v in d_.items() if v} for d_ in l) if d]
[{'x': 2}, {'z': 'hello'}]
The inner dict comprehension drops the key-value pairs whose value is falsy (which includes empty lists); the outer comprehension then drops any dicts that were left empty.
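One caveat worth illustrating: a truthiness test such as `if v` removes every falsy value (0, '', None), not only empty lists. The sketch below compares it with an explicit comparison against `[]`. The list `sample` and the two result names are made up for the illustration and are not part of the question.

```python
# Hypothetical sample data: the last dict mixes an empty list with other falsy values.
sample = [{'x': 2}, {'y': [], 'z': 'hello'}, {'a': [], 'b': 0, 'c': ''}]

# Truthiness test: removes every falsy value, then drops dicts left empty.
by_truthiness = [d for d in ({k: v for k, v in d_.items() if v} for d_ in sample) if d]

# Explicit test: removes only empty lists, then drops dicts left empty.
only_empty_lists = [d for d in ({k: v for k, v in d_.items() if v != []} for d_ in sample) if d]

print(by_truthiness)     # [{'x': 2}, {'z': 'hello'}]
print(only_empty_lists)  # [{'x': 2}, {'z': 'hello'}, {'b': 0, 'c': ''}]
```

For the data in the question both variants give the same result; they only differ once other falsy values appear.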

You can try this.
list(filter(None,({k:v for k,v in d.items() if v!=[]} for d in l)))
#[{'x': 2}, {'z': 'hello'}]

An alternative is to: a) Use a simple loop to remove empty entries and b) filter the final list:
l = [{'x': 2}, {'y': [], 'z': 'hello'}, {'a': []}]
for i,d in enumerate(l):
    l[i]={k:v for k,v in d.items() if v!=[]}
l=list(filter(None, l))
>>> l
[{'x': 2}, {'z': 'hello'}]
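The advantage here (over a single comprehension) is that each dict is replaced by index in the existing list. Note, however, that the final filter call still builds a new list and rebinds l to it.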
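If the goal is to keep the same list object (for example because other names refer to it), slice assignment is one option. This is a minimal sketch, not part of the answer above; the name `alias` exists only to show that the object identity is preserved.

```python
l = [{'x': 2}, {'y': [], 'z': 'hello'}, {'a': []}]
alias = l  # second reference to the same list object (illustration only)

# Slice assignment replaces the contents of the existing list object
# instead of rebinding the name l to a new list.
l[:] = [d for d in ({k: v for k, v in d_.items() if v != []} for d_ in l) if d]

print(l)           # [{'x': 2}, {'z': 'hello'}]
print(alias is l)  # True
```

A temporary list is still built on the right-hand side, so this keeps the identity of the list rather than saving memory.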

Somewhat similar to existing responses, but uses a one-iteration list comprehension, with if to filter out the empty items.
def remove_empty(input_list):
    return [dict((k, v) for k, v in d.items() if v != [])
            for d in input_list
            if not (len(d) == 1 and list(d.values()) == [[]])]
remove_empty(l)
output:
[{'x': 2}, {'z': 'hello'}]

You can try this one, which removes only empty lists and keeps other falsy values such as '' and 0:
l = [{'x': 2}, {'y': [], 'z': 'hello'}, {'a': [],'b':'','c':0}]
print([ { k:v for k,v in ll.items() if v != [] } for ll in l ])

Related

Merge dicts from list with getting set of values for equal keys

Is there a way to rewrite the code below with a dict comprehension? (If I have the name right; I mean {k: v for k, v in ...}.)
list_of_dicts = [{'a': 1}, {'b': 2}, {'b': 20, 'c': 3}, {'a': 10, 'b': 2}]
merged = {}  # result dict: key -> set of all values seen for that key
for k, v in [p for d in list_of_dicts for p in d.items()]:
    merged[k] = merged.setdefault(k, set()) | {v}
Sure, why not :). But it is nested a bit 😉
import itertools
list_of_dicts = [{'a': 1}, {'b': 2}, {'b': 20, 'c': 3}, {'a': 10, 'b': 2}]
o = {k: {d[k] for d in list_of_dicts if k in d} for k in itertools.chain.from_iterable(list_of_dicts)}
print(o)
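The nested comprehension rescans the whole list for every key it encounters. A single-pass alternative, sketched here with `collections.defaultdict`, is not a comprehension but does the same merge; the name `merged` is just for illustration.

```python
from collections import defaultdict

list_of_dicts = [{'a': 1}, {'b': 2}, {'b': 20, 'c': 3}, {'a': 10, 'b': 2}]

# Each missing key starts out as an empty set, so values can be added directly.
merged = defaultdict(set)
for d in list_of_dicts:
    for k, v in d.items():
        merged[k].add(v)

print(dict(merged) == {'a': {1, 10}, 'b': {2, 20}, 'c': {3}})  # True
```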

Merge Dictionaries in a list with the same key

I have a list of dicts, like this:
x = [{'a': 3}, {'b': 1}, {None: 0}, {'a': 1}, {'b': 1}, {None: 0}]
and I would like to have something like this:
x = [{'a': 4}, {'b': 2}, {None: 0}]
What is the most memory-friendly way to reach that?
You can also use collections.Counter, which will naturally update the counts:
from collections import Counter
l = [{'a': 3}, {'b': 1}, {None: 0}, {'a': 1}, {'b': 1}, {None: 0}]
counts = Counter()
for d in l:
    counts.update(d)
print([{k: v} for k, v in counts.items()])
From the docs for collections.Counter.update:
Elements are counted from an iterable or added-in from another mapping (or counter). Like dict.update() but adds counts instead of replacing them. Also, the iterable is expected to be a sequence of elements, not a sequence of (key, value) pairs.
You can also use a collections.defaultdict to do the counting:
from collections import defaultdict
l = [{'a': 3}, {'b': 1}, {None: 0}, {'a': 1}, {'b': 1}, {None: 0}]
counts = defaultdict(int)
for d in l:
    for k, v in d.items():
        counts[k] += v
print([{k: v} for k, v in counts.items()])
Or you could keep the totals in a plain dict and supply the initial 0 yourself:
l = [{'a': 3}, {'b': 1}, {None: 0}, {'a': 1}, {'b': 1}, {None: 0}]
counts = {}
for d in l:
    for k, v in d.items():
        counts[k] = counts.get(k, 0) + v
print([{k: v} for k, v in counts.items()])
From the docs for dict.get:
Return the value for key if key is in the dictionary, else default. If default is not given, it defaults to None, so that this method never raises a KeyError.
Output:
[{'a': 4}, {'b': 2}, {None: 0}]
Let's say:
l = [{'a': 3}, {'b': 1}, {None: 0}, {'a': 1}, {'b': 1}, {None: 0}]
Now we will extract and add up:
res = []
for k in l:
    for i in k:
        s = {i:sum(j[i] for j in l if i in j)}
        if s not in res:
            res.append(s)
gives:
[{'a': 4}, {'b': 2}, {None: 0}]
Or we could use (adapted from here):
result = {}
for d in l:
    for k in d.keys():
        result[k] = result.get(k, 0) + d[k]
res = [{i:result[i]} for i in result]
Here is a one-liner using pandas (it returns a single dict of totals rather than a list of dicts):
import pandas as pd
L = [{'a': 3}, {'b': 1}, {None: 0}, {'a': 1}, {'b': 1}, {None: 0}]
output = pd.DataFrame(L).sum(axis = 0).to_dict()

find particular value from list of dict in python

I have a list of dictionaries like this:
s = [{'a':1,'b':2},{'a':3},{'a':2},{'a':1}]
I want to remove the duplicate value pair and get a list of dictionaries like:
s = [{'a':1},{'a':3},{'a':2}]
Use a list comprehension that keeps only the key 'a':
s = [{k: v for k, v in x.items() if k =='a'} for x in s]
print (s)
[{'a': 1}, {'a': 3}, {'a': 2}]
You could use a list comprehension adding new dictionary entries only if 'a' is contained:
[{'a':d['a']} for d in s if 'a' in d]
# [{'a': 1}, {'a': 3}, {'a': 2}]
You can try this.
s = [{'a':1,'b':2},{'a':3},{'a':2}]
s=[{'a':d['a']} for d in s]
# [{'a': 1}, {'a': 3}, {'a': 2}]
If you want a list of singleton dictionaries containing only the key a, you can do this:
>>> [{'a': d.get('a')} for d in s]
[{'a': 1}, {'a': 3}, {'a': 2}]
But this just seems more suitable for a list of tuples:
>>> [('a', d.get('a')) for d in s]
[('a', 1), ('a', 3), ('a', 2)]
From the docs for dict.get:
Return the value for key if key is in the dictionary, else default. If default is not given, it defaults to None, so that this method never raises a KeyError.
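The answers above keep only the key 'a' but do not remove the repeated {'a': 1} from the four-element list in the question. A minimal sketch that does both, assuming the values of 'a' are hashable; the names `seen` and `result` are for illustration only.

```python
s = [{'a': 1, 'b': 2}, {'a': 3}, {'a': 2}, {'a': 1}]

seen = set()   # values of 'a' that have already been emitted
result = []
for d in s:
    # Skip dicts without 'a' and values of 'a' we have seen before.
    if 'a' in d and d['a'] not in seen:
        seen.add(d['a'])
        result.append({'a': d['a']})

print(result)  # [{'a': 1}, {'a': 3}, {'a': 2}]
```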

Sorting a list of dicts based on another list of dicts in Python

I have 2 lists
A = [{'g': 'goal'}, {'b': 'ball'}, {'a': 'apple'}, {'f': 'float'}, {'e': 'egg'}]
B = [{'a': None}, {'e': None}, {'b': None}, {'g': None}, {'f': None}]
I want to sort A according to B. The reason I'm asking this is, I can't simply copy B's contents into A and over-writing A's object values with None. I want to retain A's values but sort it according to B's order.
How do I achieve this? Would prefer a solution in Python
spots = {next(iter(d)): i for i, d in enumerate(B)}
sorted_A = [None] * len(A)
for d in A:
    sorted_A[spots[next(iter(d))]] = d
Average-case linear time. Place each dict directly into the spot it needs to go, without slow index calls or even calling sorted.
You could store the indices of keys in a dictionary and use those in the sorting function. This would work in O(n log(n)) time:
>>> keys = {next(iter(v)): i for i, v in enumerate(B)}
>>> keys
{'a': 0, 'e': 1, 'b': 2, 'g': 3, 'f': 4}
>>> A.sort(key=lambda x: keys[next(iter(x))])
>>> A
[{'a': 'apple'}, {'e': 'egg'}, {'b': 'ball'}, {'g': 'goal'}, {'f': 'float'}]
You can avoid sorting by iterating over the existing, ordered keys in B:
1. Merge list A into a single lookup dict.
2. Build a new list from the order in B, using the lookup dict to find the value matching each key.
Code:
import itertools
merged_A = {k: v for d in A for k, v in d.items()}
sorted_A = [{k: merged_A[k]} for k in itertools.chain.from_iterable(B)]
# [{'a': 'apple'}, {'e': 'egg'}, {'b': 'ball'}, {'g': 'goal'}, {'f': 'float'}]
If required, you can preserve the original dict objects from A instead of building new ones:
keys_to_dicts = {k: d for d in A for k in d}
sorted_A = [keys_to_dicts[k] for k in itertools.chain.from_iterable(B)]
How about this? Create a lookup dict on A and then use B's keys to create a new list in the right order.
In [103]: lookup_list = {k : d for d in A for k in d}
In [104]: sorted_list = [lookup_list[k] for d in B for k in d]; sorted_list
Out[104]: [{'a': 'apple'}, {'e': 'egg'}, {'b': 'ball'}, {'g': 'goal'}, {'f': 'float'}]
Performance
Setup:
import random
import copy
x = list(range(10000))
random.shuffle(x)
A = [{str(i) : 'test'} for i in x]
B = copy.deepcopy(A)
random.shuffle(B)
# user2357112's solution
%%timeit
spots = {next(iter(d)): i for i, d in enumerate(B)}
sorted_A = [None] * len(A)
for d in A:
    sorted_A[spots[next(iter(d))]] = d
# Proposed in this post
%%timeit
lookup_list = {k : d for d in A for k in d}
sorted_list = [lookup_list[k] for d in B for k in d]; sorted_list
Results:
100 loops, best of 3: 9.27 ms per loop
100 loops, best of 3: 4.92 ms per loop
That is roughly 47% less time than the original O(n) solution (9.27 ms down to 4.92 ms), at the cost of about twice the space.

how to uniqify a list of dict in python

I have a list:
d = [{'x':1, 'y':2}, {'x':3, 'y':4}, {'x':1, 'y':2}]
{'x':1, 'y':2} appears more than once, and I want to remove the duplicate from the list. My result should be:
d = [{'x':1, 'y':2}, {'x':3, 'y':4} ]
Note:
list(set(d)) does not work here; it throws an error.
If your values are hashable, this will work (the order of the result is not guaranteed):
>>> [dict(y) for y in set(tuple(x.items()) for x in d)]
[{'y': 4, 'x': 3}, {'y': 2, 'x': 1}]
EDIT:
I tried it with no duplicates and it seemed to work fine
>>> d = [{'x':1, 'y':2}, {'x':3, 'y':4}]
>>> [dict(y) for y in set(tuple(x.items()) for x in d)]
[{'y': 4, 'x': 3}, {'y': 2, 'x': 1}]
and
>>> d = [{'x':1,'y':2}]
>>> [dict(y) for y in set(tuple(x.items()) for x in d)]
[{'y': 2, 'x': 1}]
Dicts aren't hashable, so you can't put them in a set. A relatively efficient approach is to turn each dict's (key, value) pairs into a tuple and hash those tuples (feel free to eliminate the intermediate variables):
def unique_dicts(dicts):
    # tuples are hashable as long as every key and value is hashable
    tuples = (tuple(d.items()) for d in dicts)
    unique = set(tuples)
    return [dict(pairs) for pairs in unique]
If the values aren't always hashable, this is not possible using sets and you'll probably have to use the O(n^2) approach with an in check per element.
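A set-based result loses the original order, and a tuple of items treats the same pairs inserted in a different key order as a different dict. A minimal sketch that sidesteps both, assuming every value is hashable, tracks a `frozenset` of the items while walking the list once. The function name `unique_dicts_ordered` is made up for this illustration.

```python
def unique_dicts_ordered(dicts):
    """Return the distinct dicts in first-seen order (values must be hashable)."""
    seen = set()
    result = []
    for d in dicts:
        key = frozenset(d.items())  # hashable and independent of key order
        if key not in seen:
            seen.add(key)
            result.append(d)
    return result

d = [{'x': 1, 'y': 2}, {'x': 3, 'y': 4}, {'y': 2, 'x': 1}]
print(unique_dicts_ordered(d))  # [{'x': 1, 'y': 2}, {'x': 3, 'y': 4}]
```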
Avoid this whole problem and use namedtuples instead:
from collections import namedtuple
Point = namedtuple('Point','x y'.split())
better_d = [Point(1,2), Point(3,4), Point(1,2)]
print(set(better_d))
A simple loop:
tmp=[]
for i in d:
    if i not in tmp:
        tmp.append(i)
tmp
[{'x': 1, 'y': 2}, {'x': 3, 'y': 4}]
Converting the dict to a tuple won't work if any of its values is unhashable, such as a list.
For example, with
data = [
{'a': 1, 'b': 2},
{'a': 1, 'b': 2},
{'a': 2, 'b': 3}
]
using [dict(y) for y in set(tuple(x.items()) for x in data)] returns the unique data.
However, the same operation fails on data like this:
data = [
{'a': 1, 'b': 2, 'c': [1,2]},
{'a': 1, 'b': 2, 'c': [1,2]},
{'a': 2, 'b': 3, 'c': [3]}
]
If performance is not a concern, json dumps/loads can be a nice choice:
import json
data = set([json.dumps(d) for d in data])
data = [json.loads(d) for d in data]
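Two details of the JSON approach are worth a sketch: `json.dumps` emits keys in insertion order unless `sort_keys=True` is passed, and going through a set loses the original list order. The variant below assumes the dicts are JSON-serializable with string keys; the names `seen` and `unique` are for illustration only.

```python
import json

data = [
    {'a': 1, 'b': 2, 'c': [1, 2]},
    {'b': 2, 'a': 1, 'c': [1, 2]},  # same content, different key order
    {'a': 2, 'b': 3, 'c': [3]},
]

seen = set()
unique = []
for d in data:
    key = json.dumps(d, sort_keys=True)  # canonical string for this dict
    if key not in seen:
        seen.add(key)
        unique.append(d)

print(unique)
# [{'a': 1, 'b': 2, 'c': [1, 2]}, {'a': 2, 'b': 3, 'c': [3]}]
```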
Another bit of dark magic (please don't beat me):
list(map(dict, set(map(lambda x: tuple(x.items()), d))))
