How to uniquify a list of dicts in Python

I have a list:
d = [{'x':1, 'y':2}, {'x':3, 'y':4}, {'x':1, 'y':2}]
{'x':1, 'y':2} appears more than once, and I want to remove the duplicate from the list. My result should be:
d = [{'x':1, 'y':2}, {'x':3, 'y':4}]
Note:
list(set(d)) does not work here; it raises TypeError: unhashable type: 'dict'.

If your values are hashable, this will work:
>>> [dict(y) for y in set(tuple(x.items()) for x in d)]
[{'y': 4, 'x': 3}, {'y': 2, 'x': 1}]
EDIT:
I tried it with no duplicates and it seemed to work fine:
>>> d = [{'x':1, 'y':2}, {'x':3, 'y':4}]
>>> [dict(y) for y in set(tuple(x.items()) for x in d)]
[{'y': 4, 'x': 3}, {'y': 2, 'x': 1}]
and
>>> d = [{'x':1,'y':2}]
>>> [dict(y) for y in set(tuple(x.items()) for x in d)]
[{'y': 2, 'x': 1}]

Dicts aren't hashable, so you can't put them in a set. A relatively efficient approach is to turn each dict's (key, value) pairs into a frozenset and hash those frozensets (feel free to eliminate the intermediate variables):
def unique_dicts(dicts):
    # A frozenset of (key, value) pairs is hashable and ignores key order.
    pair_sets = (frozenset(d.items()) for d in dicts)  # d.iteritems() in Python 2
    unique = set(pair_sets)
    return [dict(pairs) for pairs in unique]
If the values aren't always hashable, this is not possible with sets at all, and you'll probably have to fall back on the O(n^2) approach of an in check per element.
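The two cases described above can be combined into one helper. The following is only a sketch: the name unique_dicts_ordered is made up for illustration, and it assumes that keeping the first occurrence of each dict, in the original order, is what is wanted. It uses a set for dicts whose values are hashable and falls back to the linear in check for the rest.

```python
def unique_dicts_ordered(dicts):
    """Return the distinct dicts, keeping first occurrences in input order."""
    seen_hashable = set()   # frozensets of (key, value) pairs seen so far
    seen_unhashable = []    # dicts that contain at least one unhashable value
    result = []
    for d in dicts:
        try:
            key = frozenset(d.items())
        except TypeError:
            # Some value is unhashable (e.g. a list): use a linear scan instead.
            if d not in seen_unhashable:
                seen_unhashable.append(d)
                result.append(d)
            continue
        if key not in seen_hashable:
            seen_hashable.add(key)
            result.append(d)
    return result


d = [{'x': 1, 'y': 2}, {'x': 3, 'y': 4}, {'y': 2, 'x': 1}]
unique = unique_dicts_ordered(d)
```

The hashable path stays O(n) on average; only dicts with unhashable values pay for the quadratic scan.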

Avoid this whole problem and use namedtuples instead:
from collections import namedtuple
Point = namedtuple('Point', ['x', 'y'])
better_d = [Point(1, 2), Point(3, 4), Point(1, 2)]
print(set(better_d))  # namedtuples are hashable, so the duplicate is dropped
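If the data already exists as a list of dicts, it can be converted first. This sketch assumes every dict has exactly the keys 'x' and 'y' and that Python 3.7 or later is used, where dict.fromkeys keeps insertion order.

```python
from collections import namedtuple

Point = namedtuple('Point', ['x', 'y'])

d = [{'x': 1, 'y': 2}, {'x': 3, 'y': 4}, {'x': 1, 'y': 2}]

# Build one Point per dict; ** unpacks the dict into keyword arguments.
points = [Point(**item) for item in d]

# dict.fromkeys keeps the first occurrence of each key, in insertion order.
unique_points = list(dict.fromkeys(points))

# _asdict() converts each Point back to a mapping if dicts are still needed.
unique_dicts = [dict(p._asdict()) for p in unique_points]
```

Unlike set(better_d), this keeps the original order of the first occurrences.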

A simple loop:
tmp = []
for i in d:
    if i not in tmp:
        tmp.append(i)
print(tmp)
# [{'x': 1, 'y': 2}, {'x': 3, 'y': 4}]

Converting each dict to a tuple will not work if any value is unhashable, such as a list.
For example, given
data = [
    {'a': 1, 'b': 2},
    {'a': 1, 'b': 2},
    {'a': 2, 'b': 3}
]
[dict(y) for y in set(tuple(x.items()) for x in data)] returns the unique dicts.
However, the same expression fails with TypeError: unhashable type: 'list' on data like this:
data = [
    {'a': 1, 'b': 2, 'c': [1,2]},
    {'a': 1, 'b': 2, 'c': [1,2]},
    {'a': 2, 'b': 3, 'c': [3]}
]
If performance is not a concern, a round trip through json.dumps and json.loads is a reasonable choice. Pass sort_keys=True so that dicts with the same contents but a different key order serialize to the same string:
import json
data = set(json.dumps(d, sort_keys=True) for d in data)
data = [json.loads(s) for s in data]
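A variation on the JSON idea, sketched here under the assumption that every dict is JSON-serializable: use the JSON string only as a lookup key and keep the original dict objects. That preserves the input order and avoids the type changes of a round trip (for example, tuples coming back as lists). The function name unique_by_json is made up for this example.

```python
import json


def unique_by_json(dicts):
    """Deduplicate JSON-serializable dicts, keeping first occurrences in order."""
    seen = set()
    result = []
    for d in dicts:
        # sort_keys=True makes the string independent of key order.
        key = json.dumps(d, sort_keys=True)
        if key not in seen:
            seen.add(key)
            result.append(d)
    return result


data = [
    {'a': 1, 'b': 2, 'c': [1, 2]},
    {'b': 2, 'a': 1, 'c': [1, 2]},
    {'a': 2, 'b': 3, 'c': [3]},
]
unique_data = unique_by_json(data)
```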

Another bit of dark magic (please don't beat me):
list(map(dict, set(map(lambda x: tuple(x.items()), d))))  # list() is needed in Python 3, where map returns an iterator

Related

Group all keys with the same value in a dictionary of sets

I am trying to transform a dictionary whose values are sets, some of them duplicated, into a dictionary that keeps each unique set once and joins the keys that shared it.
dic = {'a': {1, 2, 3}, 'b': {1, 2}, 'c': {1, 3, 2}, 'd': {1, 2, 3}}
should become
{'a-c-d': {1, 2, 3}, 'b': {1, 2}}
My attempt is below, but I think there has to be a better way.
def transform_dictionary(dic: dict) -> dict:
    dic = {k: frozenset(v) for k, v in dic.items()}
    key_list = list(dic.keys())
    value_list = list(dic.values())
    dict_transformed = {}
    for v_unique in set(value_list):
        sub_key_list = []
        for i, v in enumerate(value_list):
            if v == v_unique:
                sub_key_list.append(str(key_list[i]))
        dict_transformed['-'.join(sub_key_list)] = set(v_unique)
    return dict_transformed
print(transform_dictionary(dic))
You can "invert" the input dictionary into a dictionary that maps each frozenset to the list of keys that share it.
import collections
dic = {'a': {1, 2, 3}, 'b': {1, 2}, 'c': {1, 3, 2}, 'd': {1, 2, 3}}
keys_per_set = collections.defaultdict(list)
for key, value in dic.items():
    keys_per_set[frozenset(value)].append(key)
Then invert that mapping back into the desired form:
{'-'.join(keys): value for (value, keys) in keys_per_set.items()}
Output:
{'a-c-d': frozenset({1, 2, 3}), 'b': frozenset({1, 2})}
This leaves the values as frozensets, but you can "thaw" them by writing set(value) in the final dict comprehension.
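Put together as a reusable function, the inversion might look like the sketch below. The function name group_keys_by_set and the sep parameter are made up for illustration; keys are passed through str() so that non-string keys can be joined as well.

```python
from collections import defaultdict


def group_keys_by_set(dic, sep='-'):
    """Join the keys that map to equal sets; values are returned as plain sets."""
    keys_per_set = defaultdict(list)
    for key, value in dic.items():
        # frozenset is hashable, so equal sets land in the same bucket.
        keys_per_set[frozenset(value)].append(str(key))
    # "Thaw" each frozenset back into a regular set.
    return {sep.join(keys): set(members) for members, keys in keys_per_set.items()}


dic = {'a': {1, 2, 3}, 'b': {1, 2}, 'c': {1, 3, 2}, 'd': {1, 2, 3}}
grouped = group_keys_by_set(dic)
```

This makes a single pass over the input, whereas the nested loops in the question compare every unique set against every value.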
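Another option is itertools.groupby. Sets only have a partial order (subset comparison), so sort and group by the sorted list of each set's elements rather than by the set itself:
from itertools import groupby
dic_input = {'a': {1, 2, 3}, 'b': {1, 2}, 'c': {1, 3, 2}, 'd': {1, 2, 3}}
def by_members(k):
    return sorted(dic_input[k])
dic_output = {'-'.join(keys): set(members)
              for members, keys in groupby(sorted(dic_input, key=by_members), key=by_members)}
Output:
{'b': {1, 2}, 'a-c-d': {1, 2, 3}}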

find particular value from list of dict in python

I have a list of dictionaries like this:
s = [{'a':1,'b':2},{'a':3},{'a':2},{'a':1}]
I want to keep only the 'a' key, remove duplicate key/value pairs, and get a list of dictionaries like:
s = [{'a':1},{'a':3},{'a':2}]
Use a list comprehension that keeps only the key 'a'. This drops the other keys, but the repeated {'a': 1} is still present:
s = [{k: v for k, v in x.items() if k == 'a'} for x in s]
print(s)
[{'a': 1}, {'a': 3}, {'a': 2}, {'a': 1}]
You could use a list comprehension that adds a new dictionary only if 'a' is present in the original one:
[{'a':d['a']} for d in s if 'a' in d]
# [{'a': 1}, {'a': 3}, {'a': 2}, {'a': 1}]
You can try this.
s = [{'a':1,'b':2},{'a':3},{'a':2}]
s=[{'a':d['a']} for d in s]
# [{'a': 1}, {'a': 3}, {'a': 2}]
If you want a list of single-key dictionaries that contain only the key 'a', you can do this:
>>> [{'a': d.get('a')} for d in s]
[{'a': 1}, {'a': 3}, {'a': 2}]
But this just seems more suitable for a list of tuples:
>>> [('a', d.get('a')) for d in s]
[('a', 1), ('a', 3), ('a', 2)]
From the docs for dict.get:
Return the value for key if key is in the dictionary, else default. If default is not given, it defaults to None, so that this method never raises a KeyError.
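To also drop the repeated {'a': 1}, the values already seen can be tracked in a set. This is a minimal sketch that assumes the values under 'a' are hashable and that dicts without an 'a' key should be skipped.

```python
s = [{'a': 1, 'b': 2}, {'a': 3}, {'a': 2}, {'a': 1}]

seen = set()
result = []
for d in s:
    if 'a' not in d:
        continue  # assumption: dicts without the key are ignored
    value = d['a']
    if value not in seen:
        seen.add(value)
        # Keep only the 'a' pair, and only the first time this value appears.
        result.append({'a': value})
```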

Python How to add all values in a 2d Dictionary and return a single dictionary with summed values?

Part of the program I am developing has a 2D dict of length n.
Dictionary Example:
test_dict = {
0: {'A': 2, 'B': 1, 'C': 5},
1: {'A': 3, 'B': 1, 'C': 2},
2: {'A': 1, 'B': 1, 'C': 1},
3: {'A': 4, 'B': 2, 'C': 5}
}
All of the dictionaries have the same keys but different values. I need to sum the values for each key across all of the inner dictionaries.
I have tried to merge the dictionaries using the following:
new_dict = {}
for k, v in test_dict.items():
    new_dict.setdefault(k, []).append(v)
I also tried using:
new_dict = {**test_dict[0], **test_dict[1], **test_dict[2], **test_dict[3]}
Unfortunately, neither attempt produced the desired outcome.
Desired outcome: outcome = {'A': 10, 'B': 5, 'C': 13}
How can I add all the values into a single dictionary?
Solution using pandas
Convert the dict to a pandas.DataFrame with one row per inner dict, sum each column, and convert the result back to a dict.
import pandas as pd
df = pd.DataFrame.from_dict(test_dict, orient='index')
print(df.sum().to_dict())
Output:
{'A': 10, 'B': 5, 'C': 13}
Alternate solution
Use collections.Counter, whose update method adds the values for matching keys instead of replacing them:
from collections import Counter
d = Counter()
for v in test_dict.values():
    d.update(v)
print(d)  # Counter({'C': 13, 'A': 10, 'B': 5})
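The same sum can be written with plain dicts, which may be preferable when a regular dict is wanted as the result and no extra imports are desired. A minimal sketch, assuming all values are numbers:

```python
test_dict = {
    0: {'A': 2, 'B': 1, 'C': 5},
    1: {'A': 3, 'B': 1, 'C': 2},
    2: {'A': 1, 'B': 1, 'C': 1},
    3: {'A': 4, 'B': 2, 'C': 5},
}

totals = {}
for inner in test_dict.values():
    for key, value in inner.items():
        # get() supplies 0 the first time a key is seen.
        totals[key] = totals.get(key, 0) + value

print(totals)  # {'A': 10, 'B': 5, 'C': 13}
```

This version also works when the inner dicts do not all share the same keys.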

Store variables in dictionary for large data

I can print the values I need in Python:
for h in jl1["results"]["attributes-list"]["volume-attributes"]:
    state = str(h["volume-state-attributes"]["state"])
    if aggr in h["volume-id-attributes"]["containing-aggregate-name"]:
        if state == "online":
            # Print the four fields on one line, separated by spaces.
            print(h["volume-id-attributes"]["owning-vserver-name"],
                  h["volume-id-attributes"]["name"],
                  h["volume-id-attributes"]["containing-aggregate-name"],
                  h["volume-space-attributes"]["size-used"])
This loop prints around 100 lines, for example. Now I want to print only the top 5 entries ranked by "size-used".
I am trying to collect these values in dictionaries and keep the five with the largest "size-used", but I am not sure how to build the dictionaries.
Something like this:
{'vserver': (u'rcdn9-c01-sm-prod',), 'usize': u'389120', 'vname': (u'nprd_root_m01',), 'aggr': (u'aggr1_n01',)}
Any other option, such as namedtuples, is also appreciated.
Thanks
To get a list of dictionaries sorted by a certain key, use sorted. Say I have a list of dictionaries with a and b keys and want to sort them by the value of the b element:
my_dict_list = [{'a': 3, 'b': 1}, {'a': 1, 'b': 4}, {'a': 4, 'b': 4},
{'a': 2, 'b': 7}, {'a': 2, 'b': 4.3}, {'a': 2, 'b': 9}, ]
my_sorted_dict_list = sorted(my_dict_list, key=lambda element: element['b'], reverse=True)
# Reverse is set to True because by default it sorts from smallest to biggest; we want to reverse that
# Limit to five results
biggest_five_dicts = my_sorted_dict_list[:5]
print(biggest_five_dicts) # [{'a': 2, 'b': 9}, {'a': 2, 'b': 7}, {'a': 2, 'b': 4.3}, {'a': 1, 'b': 4}, {'a': 4, 'b': 4}]
heapq.nlargest is the obvious way to go here:
import heapq
interesting_dicts = ... filter to keep only the dicts you care about (e.g. online dicts) ...
# int() is used because the sample in the question shows size-used as a string.
for large in heapq.nlargest(5, interesting_dicts,
                            key=lambda d: int(d["volume-space-attributes"]["size-used"])):
    print(...)
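A runnable sketch of the heapq.nlargest approach follows. The records are invented sample data shaped like the structure in the question, and make_volume is a hypothetical helper used only to build them. It assumes "size-used" arrives as a string of digits, as in the sample output, so it is converted with int() before comparing.

```python
import heapq


def make_volume(name, state, size_used, aggregate="aggr1_n01", vserver="vs1"):
    """Build one hypothetical record shaped like the volume attributes in the question."""
    return {
        "volume-id-attributes": {
            "name": name,
            "owning-vserver-name": vserver,
            "containing-aggregate-name": aggregate,
        },
        "volume-state-attributes": {"state": state},
        "volume-space-attributes": {"size-used": size_used},
    }


volumes = [
    make_volume("vol_a", "online", "389120"),
    make_volume("vol_b", "offline", "999999999"),
    make_volume("vol_c", "online", "4096"),
    make_volume("vol_d", "online", "52428800"),
    make_volume("vol_e", "online", "1048576"),
    make_volume("vol_f", "online", "73400320"),
    make_volume("vol_g", "online", "20480"),
]
aggr = "aggr1"

# Keep only online volumes on the aggregate of interest.
online = [
    v for v in volumes
    if v["volume-state-attributes"]["state"] == "online"
    and aggr in v["volume-id-attributes"]["containing-aggregate-name"]
]

# Compare sizes numerically; comparing the raw strings would sort lexicographically.
top_five = heapq.nlargest(
    5, online, key=lambda v: int(v["volume-space-attributes"]["size-used"])
)

# Flatten each record into the small dict the question asks for.
rows = [
    {
        "vserver": v["volume-id-attributes"]["owning-vserver-name"],
        "vname": v["volume-id-attributes"]["name"],
        "aggr": v["volume-id-attributes"]["containing-aggregate-name"],
        "usize": int(v["volume-space-attributes"]["size-used"]),
    }
    for v in top_five
]
```

heapq.nlargest avoids sorting the whole list, which matters more as the number of volumes grows relative to the five that are kept.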

How to combine two list containing dictionary with similar keys?

Suppose there are two Python lists with the same structure:
var1 = [{'a':1,'b':2},{'c':2,'d':5,'h':4},{'c':2,'d':5,'e':4}]
var2 = [{'a':3,'b':2},{'c':1,'d':5,'h':4},{'c':5,'d':5,'e':4}]
I need to combine the two lists by adding the values of matching keys, so that I get:
result = [{'a':4,'b':4},{'c':3,'d':10,'h':8},{'c':7,'d':10,'e':8}]
How can I do that?
zip-based one-liner comprehension:
result = [{k: d1[k]+d2[k] for k in d1} for d1, d2 in zip(var1, var2)]
This assumes that two dicts at the same index always have identical key sets.
Use a list comprehension to do it in a single expression:
result = [{key: d1.get(key, 0) + d2.get(key, 0)
           for key in set(d1.keys()) | set(d2.keys())}  # union of both key sets
          for d1, d2 in zip(var1, var2)]
print(result)
[{'a': 4, 'b': 4}, {'h': 8, 'c': 3, 'd': 10}, {'c': 7, 'e': 8, 'd': 10}]
This handles the case where the two dictionaries do not have the same keys.
var1 = [{'a':1,'b':2},{'c':2,'d':5,'h':4},{'c':2,'d':5,'e':4}]
var2 = [{'a':3,'b':2},{'c':1,'d':5,'h':4},{'c':5,'d':5,'e':4}]
res = []
for i in range(len(var1)):
    dic = {}
    dic1, dic2 = var1[i], var2[i]
    for key, val in dic1.items():  # dic1.iteritems() in Python 2
        dic[key] = val + dic2[key]
    res.append(dic)
print(res)
[{'a': 4, 'b': 4}, {'c': 3, 'd': 10, 'h': 8}, {'c': 7, 'd': 10, 'e': 8}]
var1 = [{'a': 1, 'b': 2}, {'c': 2, 'd': 5, 'h': 4}, {'c': 2, 'd': 5, 'e': 4}]
var2 = [{'a': 3, 'b': 2}, {'c': 1, 'd': 5, 'h': 4}, {'c': 5, 'd': 5, 'e': 4}]
ret = []
for i, ele in enumerate(var1):
    d = {}
    for k, v in ele.items():
        value = v
        value += var2[i][k]
        d[k] = value
    ret.append(d)
print(ret)
For the sake of completeness, another zip-based one-liner that works even if the dicts at the same index do not have the same keys:
result = [{k: d1.get(k, 0) + d2.get(k, 0) for k in set(d1) | set(d2)} for d1, d2 in zip(var1, var2)]
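The same idea extends to any number of lists. The sketch below is one possible generalization; add_dict_lists is a made-up name. It assumes the lists are aligned by index and that all values are numbers. Note that zip stops at the shortest list, so extra trailing dicts in a longer list are ignored.

```python
def add_dict_lists(*lists):
    """Sum the dicts found at the same index across all given lists."""
    result = []
    for dicts in zip(*lists):
        totals = {}
        for d in dicts:
            for key, value in d.items():
                # A key missing from one dict simply contributes nothing.
                totals[key] = totals.get(key, 0) + value
        result.append(totals)
    return result


var1 = [{'a': 1, 'b': 2}, {'c': 2, 'd': 5, 'h': 4}, {'c': 2, 'd': 5, 'e': 4}]
var2 = [{'a': 3, 'b': 2}, {'c': 1, 'd': 5, 'h': 4}, {'c': 5, 'd': 5, 'e': 4}]
result = add_dict_lists(var1, var2)
```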
Would something like this help?
var1 = [{'a':1,'b':2},{'c':2,'d':5,'h':4},{'c':2,'d':5,'e':4}]
var2 = [{'a':3,'b':2},{'c':1,'d':5,'h':4},{'c':5,'d':5,'e':4}]
combined_var = zip(var1, var2)
list_new_ds = []
for i, j in combined_var:
    new_d = {}
    for key in i:
        if key in j:  # only add keys present in both dicts
            new_d[key] = i[key] + j[key]
    list_new_ds.append(new_d)
print(list_new_ds)
# [{'a': 4, 'b': 4}, {'c': 3, 'd': 10, 'h': 8}, {'c': 7, 'd': 10, 'e': 8}]
To explain, zip pairs up the dicts at the same index as tuples. I unpack each tuple, iterate over the keys, and add the values for matching keys into a new dictionary. I then append that dictionary to the list and start with a fresh, empty dictionary for the next pair.
On Python versions before 3.7 the key order in the output may differ, because dicts did not guarantee insertion order.
I am a novice, so I would appreciate any critiques of my answer!
