combining a list of dictionaries with another dictionary - python

I have a list with a set amount of dictionaries inside which I have to compare to one other dictionary.
They have the following form (there is no specific form or pattern for keys and values, these are randomly chosen examples):
list1 = [
{'X1': 'Q587', 'X2': 'Q67G7', ...},
{'AB1': 'P5K7', 'CB2': 'P678', ...},
{'B1': 'P6H78', 'C2': 'BAA5', ...}]
dict1 = {
'X1': set([B00001,B00020,B00010]),
'AB1': set([B00001,B00007,B00003]),
'C2': set([B00001,B00002,B00003]), ...
}
What I want now is a new dictionary whose keys are the values of the dictionaries in list1 and whose values are the values of dict1, but only for keys that appear both in a dictionary from list1 and in dict1.
I have done this in the following way:
nDicts = len(list1)
resultDict = {}
for key in range(0, nDicts):
    for x in list1[key].keys():
        if x in dict1.keys():
            resultDict.update({list1[key][x]: dict1[x]})
print resultDict
The desired output should be of the form:
resultDict = {
'Q587': set([B00001,B00020,B00010]),
'P5K7': set([B00001,B00007,B00003]),
'BAA5': set([B00001,B00002,B00003]), ...
}
This works, but because there is so much data it takes forever.
Is there a better way to do this?
EDIT: I have changed the input values a little, the only ones that matter are the keys which intersect between the dictionaries within list1 and those within dict1.

In Python 2.x, the keys method builds a list containing a copy of all the keys. You're doing this not only for each dict in list1 (probably not a big deal, though it's hard to know for sure without knowing your data), but also for dict1 over and over again.
On top of that, an in test on a list is slow, because it has to check each value in the list until it finds a match, whereas an in test on a dictionary is nearly instant, because it just has to look up the hash value.
Both keys() calls are actually completely unnecessary: iterating a dict gives you its keys (in an unspecified order, but the same is true for calling keys()), and an in test on a dict searches the same keys you'd get with keys(). So just removing them does the same thing, only simpler, faster, and with less memory used. So:
for key in range(0,nDicts):
    for x in list1[key]:
        if x in dict1:
            resultDict={list1[key][x]:dict1[x]}
            print resultDict
There are also ways you can simplify this that probably won't help performance that much, but are still worth doing.
You can iterate directly over list1 instead of building a huge list of all the indices and iterating that.
for list1_dict in list1:
    for x in list1_dict:
        if x in dict1:
            resultDict = {list1_dict[x]: dict1[x]}
            print resultDict
And you can get the keys and values in a single step:
for list1_dict in list1:
    for k, v in list1_dict.iteritems():
        if k in dict1:
            resultDict = {v: dict1[k]}
            print resultDict
Also, if you expect most of the values to be found, it will take about twice as long to first check for the value and then look it up as it would to just try to look it up and handle failure. (This is not true if most of the values will not be found, however.) So:
for list1_dict in list1:
    for k, v in list1_dict.iteritems():
        try:
            resultDict = {v: dict1[k]}
            print resultDict
        except KeyError:
            pass
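For readers on Python 3, where iteritems() and the print statement no longer exist, the last loop might look like the sketch below. It assumes a single combined result dictionary is wanted (as in the question's desired output); the function name combine and the trimmed sample data are illustrative only.

```python
def combine(list_of_dicts, lookup):
    """Map each value found in list_of_dicts to lookup[key] for shared keys."""
    result = {}
    for d in list_of_dicts:
        for k, v in d.items():  # items() replaces iteritems() in Python 3
            try:
                result[v] = lookup[k]
            except KeyError:
                pass  # k is not a key of lookup, so skip it
    return result


# Sample data modelled on the question, with the elided entries left out.
list1 = [
    {'X1': 'Q587', 'X2': 'Q67G7'},
    {'AB1': 'P5K7', 'CB2': 'P678'},
    {'B1': 'P6H78', 'C2': 'BAA5'},
]
dict1 = {
    'X1': {'B00001', 'B00020', 'B00010'},
    'AB1': {'B00001', 'B00007', 'B00003'},
    'C2': {'B00001', 'B00002', 'B00003'},
}

print(combine(list1, dict1))
```

If most keys are expected to be missing from the lookup dictionary, swapping the try/except for an `if k in lookup` check is likely the better trade-off, as noted above.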

You can simplify and optimize your operation with set intersections; as of Python 2.7 dictionaries can represent keys as sets using the dict.viewkeys() method, or dict.keys() in Python 3:
resultDict = {}
for d in list1:
    for sharedkey in d.viewkeys() & dict1:
        resultDict[d[sharedkey]] = dict1[sharedkey]
This can be turned into a dict comprehension even:
resultDict = {d[sharedkey]: dict1[sharedkey]
for d in list1 for sharedkey in d.viewkeys() & dict1}
I am assuming here you wanted one resulting dictionary, not a new dictionary per shared key.
Demo on your sample input:
>>> list1 = [
... {'X1': 'AAA1', 'X2': 'BAA5'},
... {'AB1': 'AAA1', 'CB2': 'BAA5'},
... {'B1': 'AAA1', 'C2': 'BAA5'},
... ]
>>> dict1 = {
... 'X1': set(['B00001', 'B00002', 'B00003']),
... 'AB1': set(['B00001', 'B00002', 'B00003']),
... }
>>> {d[sharedkey]: dict1[sharedkey]
... for d in list1 for sharedkey in d.viewkeys() & dict1}
{'AAA1': set(['B00001', 'B00002', 'B00003'])}
Note that both X1 and AB1 are shared with dictionaries in list1, but in both cases, the resulting key is AAA1. Only one of these wins (the last match), but since both values in dict1 are exactly the same anyway that doesn't make any odds in this case.
If you wanted separate dictionaries per dictionary in list1, simply move the for d in list1: loop out:
for d in list1:
    resultDict = {d[sharedkey]: dict1[sharedkey] for sharedkey in d.viewkeys() & dict1}
    if resultDict: # can be empty
        print resultDict
If you really wanted one dictionary per shared key, move another loop out:
for d in list1:
    for sharedkey in d.viewkeys() & dict1:
        resultDict = {d[sharedkey]: dict1[sharedkey]}
        print resultDict
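As a rough Python 3 counterpart of the set-intersection approach, dict.keys() already returns a set-like view, so viewkeys() is not needed. The sketch below reuses the demo data from above; the variable names result and shared are arbitrary.

```python
list1 = [
    {'X1': 'AAA1', 'X2': 'BAA5'},
    {'AB1': 'AAA1', 'CB2': 'BAA5'},
    {'B1': 'AAA1', 'C2': 'BAA5'},
]
dict1 = {
    'X1': {'B00001', 'B00002', 'B00003'},
    'AB1': {'B00001', 'B00002', 'B00003'},
}

# d.keys() & dict1.keys() yields only the keys present in both dictionaries.
result = {
    d[shared]: dict1[shared]
    for d in list1
    for shared in d.keys() & dict1.keys()
}
print(result)
```

As in the Python 2 demo, X1 and AB1 both map to 'AAA1', so the result holds a single entry.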

#!/usr/bin/env python
list1 = [
    {'X1': 'AAA1', 'X2': 'BAA5'},
    {'AB1': 'AAA1', 'CB2': 'BAA5'},
    {'B1': 'AAA1', 'C2': 'BAA5'}
]
dict1 = {
    'X1': set(['B00001','B00002','B00003']),
    'AB1': set(['B00001','B00002','B00003'])
}
# lazily walk the (key, value) pairs of every dictionary in list1
g = (k.iteritems() for k in list1)
# keep only pairs whose key is in dict1, mapping list1's value to dict1's value
ite = ((b, dict1[a]) for i in g for a, b in i if a in dict1)
d = dict(ite)
print d

Related

Iteratively adding new lists as values to a dictionary

I have created a dictionary (dict1) which is not empty and contains keys with corresponding lists as their values. I want to create a new dictionary (dict2) in which new lists modified by some criterion should be stored as values with the corresponding keys from the original dictionary. However, when trying to add the newly created list (list1) during every loop iteratively to the dictionary (dict2) the stored values are empty lists.
dict1 = {"key1" : [-0.04819, 0.07311, -0.09809, 0.14818, 0.19835],
"key2" : [0.039984, 0.0492105, 0.059342, -0.0703545, -0.082233],
"key3" : [0.779843, 0.791255, 0.802576, 0.813777, 0.823134]}
dict2 = {}
list1 = []
for key in dict1:
    for index, element in enumerate(dict1[key]):
        if (index + 1 < len(dict1[key]) and index - 1 >= 0):
            if element - dict1[key][index+1] > 0:
                list1.append(element)
    dict2['{}'.format(key)] = list1
    list1.clear()
print(dict2)
The output I want:
dict2 = {"key1" : [0.07311, 0.14818, 0.19835],
"key2" : [0.039984, 0.0492105, 0.059342],
"key3" : [0.779843, 0.791255, 0.802576, 0.813777, 0.823134]}
The problem is that list1 always refers to the same list, which you empty by calling clear. Therefore all values in the dict refer to the same empty list object in memory.
>>> # ... running your example ...
>>> [id(v) for v in dict2.values()]
[2111145975936, 2111145975936, 2111145975936]
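The aliasing effect can be reproduced in isolation. The following is a minimal sketch with made-up names (shared, store, copies), not the asker's data: storing a list in a dict stores a reference, so clearing the list afterwards empties every entry, while storing a copy keeps the contents.

```python
# One list object reused for every key: all entries end up empty.
shared = []
store = {}
for key in ("a", "b"):
    shared.append(key)
    store[key] = shared   # stores a reference to the same list, not a copy
    shared.clear()        # empties the single list every key points to

print(store)                      # both values are the same, now empty, list
print(store["a"] is store["b"])   # True: one object under two keys

# Storing a copy instead preserves what was collected for each key.
shared = []
copies = {}
for key in ("a", "b"):
    shared.append(key)
    copies[key] = list(shared)    # independent copy
    shared.clear()

print(copies)
```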
It looks like you want to filter out negative elements from the values in dict1. A simple dict-comprehension will do the job.
>>> dict2 = {k: [x for x in v if x > 0] for k, v in dict1.items()}
>>> dict2
{'key1': [0.07311, 0.14818, 0.19835],
'key2': [0.039984, 0.0492105, 0.059342],
'key3': [0.779843, 0.791255, 0.802576, 0.813777, 0.823134]}
@timgeb gives a great solution that simplifies your code to a dictionary comprehension, but it doesn't show how to fix your existing code. As he says, you are reusing the same list on each iteration of the for loop. So to fix your code, you just need to create a new list on each iteration instead:
for key in dict1:
    my_list = []
    # the rest of the code is the same, except you don't need to call clear()
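Spelled out in full, the fixed loop might look like this sketch. It assumes the criterion is "keep positive elements", which is the reading that reproduces the desired output shown in the question; the original comparison against the next element is replaced on that assumption.

```python
dict1 = {"key1": [-0.04819, 0.07311, -0.09809, 0.14818, 0.19835],
         "key2": [0.039984, 0.0492105, 0.059342, -0.0703545, -0.082233],
         "key3": [0.779843, 0.791255, 0.802576, 0.813777, 0.823134]}

dict2 = {}
for key in dict1:
    my_list = []                 # a fresh list for every key
    for element in dict1[key]:
        if element > 0:          # assumed criterion: keep positive values
            my_list.append(element)
    dict2[key] = my_list         # each key now owns its own list

print(dict2)
```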

How to get specific number of elements from dict in python from middle

I have a dictionary
dict1 = {"one":"1", "two":"2", "three":"3", "four":"4", "five":"5"}
Now I want another dictionary, say dict2, which contains the second and third elements of dict1.
How can I do this? Please help. Thanks.
Here's a one-liner using .items() to get an iterable of (key, value) pairs, and the dict constructor to build a new dictionary out of a slice of those pairs:
>>> dict(list(dict1.items())[1:3])
{'two': '2', 'three': '3'}
The slice indices are 1 (inclusive) and 3 (exclusive) since indices count from 0. Note that this uses whatever order the items are in the original dictionary, which will be the order they were inserted in (unless using an old version of Python, in which case the order is generally non-deterministic). If you want a different order, you can use sorted instead of list, perhaps with an appropriate key function.
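Two variations on that one-liner, as a sketch: itertools.islice takes the same slice without first building a full list of items, and sorted picks the slice from a different order (here plain alphabetical order of the keys, chosen only for illustration).

```python
from itertools import islice

dict1 = {"one": "1", "two": "2", "three": "3", "four": "4", "five": "5"}

# Same result as dict(list(dict1.items())[1:3]), but islice stops early
# instead of materialising every item.
by_insertion = dict(islice(dict1.items(), 1, 3))

# Sorting the (key, value) pairs first: alphabetical key order is
# five, four, one, three, two, so positions 1 and 2 are four and one.
by_key = dict(sorted(dict1.items())[1:3])

print(by_insertion)
print(by_key)
```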
dict2 = {}
pos = 0
for x in dict1:
    if pos == 1 or pos == 2:
        dict2[x] = dict1[x]
    pos += 1
Try this -
dict1 = {"one":"1", "two":"2", "three":"3", "four":"4", "five":"5"}
dict2 = {}
list1 = ['two','three'] # The elements you want to copy
for key in dict1:
    if key in list1:
        dict2[key] = dict1[key]
print(dict2)
Result:
{'two': '2', 'three': '3'}
Or alternatively -
for i in list1:
    dict2[i] = dict1[i]
print(dict2)
in a single line
dict2, positions = {}, (2, 3)
[dict2.update({key: dict1[key]}) for i, key in enumerate(dict1.keys()) if i in positions]
print(dict2)

How to find and append dictionary values which are present in a list

I have a list which has unique sorted values
arr = ['Adam', 'Ben', 'Chris', 'Dean', 'Flower']
I have a dictionary which has values as such
dict = {
'abc': {'Dean': 1, 'Adam':0, 'Chris':1},
'def': {'Flower':0, 'Ben':1, 'Dean':0}
}
Every inner dictionary should end up with an entry for each item in arr; if an item is not present in an inner dictionary, it should be assigned the value -1.
Result
dict = {
'abc': {'Adam':0, 'Ben':-1, 'Chris':1, 'Dean': 1, 'Flower':-1},
'def': {'Adam':-1, 'Ben':1, 'Chris':-1, 'Dean': 0, 'Flower':0}
}
How can I achieve this using list and dict comprehensions in Python?
dd = {
    key: {k: value.get(k, -1) for k in arr}
    for key, value in dd.items()
}
{k: value.get(k, -1) for k in arr} will make sure that your keys are in the same order as you defined in the arr list.
A side note on the order of keys in dictionary.
Dictionaries preserve insertion order. Note that updating a key does
not affect the order. Keys added after deletion are inserted at the
end.
Changed in version 3.7: Dictionary order is guaranteed to be insertion
order. This behavior was an implementation detail of CPython from 3.6.
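The comprehension can be tried on the question's sample data as follows. This is only a sketch: the names data and filled are stand-ins chosen here to avoid shadowing the built-in dict.

```python
arr = ['Adam', 'Ben', 'Chris', 'Dean', 'Flower']
data = {
    'abc': {'Dean': 1, 'Adam': 0, 'Chris': 1},
    'def': {'Flower': 0, 'Ben': 1, 'Dean': 0},
}

# For each inner dict, look up every name in arr; missing names get -1.
filled = {
    key: {k: value.get(k, -1) for k in arr}
    for key, value in data.items()
}

print(filled)
print(list(filled['abc']))  # keys come out in the order given by arr
```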
Please do not name a variable dict; rename it to dct or something similar, since dict is the name of a built-in type and assigning to it shadows that built-in.
As for your question: just iterate through your dct and add the missing keys using setdefault:
arr = ['Adam', 'Ben', 'Chris', 'Dean', 'Flower']
dct = {
'abc': {'Dean': 1, 'Adam':0, 'Chris':1},
'def': {'Flower':0, 'Ben':1, 'Dean':0}
}
def add_dict_keys(dct, arr):
    for key in arr:
        dct.setdefault(key, -1)
    return dct

for k, v in dct.items():
    add_dict_keys(v, arr)
print(dct) # has updated values

Iterate through list of dictionaries and compare with other dictionaries

I have the following dictionaries
dic1 = { 'T1': "HI , china" , 'T2': "HI , finland" ,'T3': "HI , germany"}
dic2 = { 'T1': ['INC1','INC2','INC3'] , 'T2': ['INC2','INC4','INC5'],'T3': ['INC2','INC5']}
dic3 = { 'INC1': {'location':'china'} , 'INC2': {'location':'germany'},'INC3': {'location':'germany'},'INC4': {'location':'finland'}}
I need to remove values from dic2 based on dic1 and dic3.
Example:
I have to iterate through dic2 first and check T1 and its INC values.
If the value for T1 in dic1 matches the location of the corresponding INC value in dic3, then keep the value in dic2; otherwise pop it.
A detailed explanation is given in the picture. I am expecting the following output.
dic2 = { 'T1': ['INC1'] , 'T2': ['INC4'],'T3': ['INC2']}
example code :
for k, v in dic2.items():
    for k1, v1 in dic1.items():
        if k is k1:
            print k
            for k2 in v:
                for k3 in dic3.items():
I am new to Python. I tried the code above but got stuck. Could you please help me out?
Can be done in a one-liner:
>>> {k: [i for i in v if dic3.get(i, {}).get('location', '#') in dic1[k]] for k, v in dic2.items()}
{'T1': ['INC1'], 'T2': ['INC4'], 'T3': ['INC2']}
The list comprehension [i for i in v if dic3.get(i, {}).get('location', '#') in dic1[k]] keeps only those values from dic2's lists for which dic3[i]['location'] occurs inside dic1[k], where i is a key in dic3 and k is a key of both dic2 and dic1.
I used dic3.get(i, {}).get('location', '#') rather than dic3[i]['location'] because the latter raises a KeyError for INC5, which is not in dic3. The first .get returns an empty dict when i is missing, so the second .get can still be called; the second .get returns the value of the location key if it exists, and that string is then checked for membership in the corresponding value of dic1.
When i is not in dic3, the expression reduces to {}.get('location', '#'), which gives # (any string or character that cannot occur in dic1's values would do). That is sure to fail the membership test against dic1[k], so the item is dropped.
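The chained .get behaviour can be checked on its own. In this sketch the helper name location_or_placeholder is hypothetical, and the data is cut down to one entry from the question.

```python
dic1 = {'T2': "HI , finland"}
dic3 = {'INC4': {'location': 'finland'}}


def location_or_placeholder(inc, placeholder='#'):
    # First .get falls back to an empty dict when inc is unknown,
    # so the second .get can always be called safely.
    return dic3.get(inc, {}).get('location', placeholder)


present = location_or_placeholder('INC4')  # key exists in dic3
missing = location_or_placeholder('INC5')  # key absent from dic3

print(present, present in dic1['T2'])
print(missing, missing in dic1['T2'])
```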
Toned down for-loop version:
>>> ans = {}
>>> for k, v in dic2.items():
...     temp = []
...     for i in v:
...         if i in dic3 and dic3[i]['location'] in dic1[k]:
...             temp.append(i)
...     ans[k] = temp
...
>>> ans
{'T1': ['INC1'], 'T2': ['INC4'], 'T3': ['INC2']}
From my understanding, you are supposed to modify the original dic2 dictionary rather than creating a new one for your answer, this is what I got:
delete = []
for key, value in dic2.items():
    loc = dic1[key][5:]
    for inc in value:
        if inc not in dic3:
            delete.append(inc)
        else:
            if loc != dic3.get(inc)['location']:
                delete.append(inc)
    for item in delete:
        value.remove(item)
    delete = []
print(dic2)
>>> {'T1': ['INC1'], 'T2': ['INC4'], 'T3': ['INC2']}
The first for loop iterates through dic2 and sets the location required to the variable loc. The next one iterates through the lists (values) and adds them to a delete list.
At the end of each loop, it iterates through the delete list, removing each element from the value list, and then setting delete to an empty list.
I am also relatively new to python so I am sure there could be efficiency issues.
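If modifying dic2 in place is the goal, one alternative to collecting items in a delete list is slice assignment, which replaces the contents of each existing list object. This is a sketch of that swapped-in technique, not the approach described above; it uses the substring test from the other answers rather than slicing off the first five characters.

```python
dic1 = {'T1': "HI , china", 'T2': "HI , finland", 'T3': "HI , germany"}
dic2 = {'T1': ['INC1', 'INC2', 'INC3'], 'T2': ['INC2', 'INC4', 'INC5'],
        'T3': ['INC2', 'INC5']}
dic3 = {'INC1': {'location': 'china'}, 'INC2': {'location': 'germany'},
        'INC3': {'location': 'germany'}, 'INC4': {'location': 'finland'}}

t1_before = dic2['T1']  # kept only to show the list object is reused

for key, incidents in dic2.items():
    # incidents[:] = ... overwrites the list's contents without rebinding it.
    incidents[:] = [
        inc for inc in incidents
        if inc in dic3 and dic3[inc]['location'] in dic1[key]
    ]

print(dic2)
```

Only the lists are mutated, not the dictionary's set of keys, so iterating over dic2.items() while doing this is safe.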
This is essentially the same as the other answers but as a one-liner, it may be more readable (maybe not, since it's subjective).
{k: [el] for k, v in dic2.items() for el in v if (el in dic3.keys() and dic3[el]['location'] in dic1[k])}

Python Collections Counter for a List of Dictionaries

I have a dynamically growing list of dictionaries, and I would like to add the values of like keys together. Here's an example:
{"something" : [{"one":"200"}, {"three":"400"}, {"one":"100"}, {"two":"800"} ... ]}
I'd like to be able to add together the dictionaries inside the list. So, in this case for the key "something", the result would be:
{"one": 300, "three": 400, "two": 800}
or something to that effect. I'm familiar with the Python's collection counter, but since the "something" list contains dicts, it will not work (unless I'm missing something). The dict is also being dynamically created, so I can't build the list without the dicts. EG:
Counter({'b':3, 'c':4, 'd':5, 'b':2})
Would normally work, but as soon as I try to add an element, the previous value will be overwritten. I've noticed other questions such as these:
Is there any pythonic way to combine two dicts (adding values for keys that appear in both)?
Python count of items in a dictionary of lists
But again, the objects within the list are dicts.
I think this does what you want, but I'm not sure because I don't know what "The dict is also being dynamically created, so I can't build the list without the dicts" means. Still:
from collections import Counter

input = {
    "something" : [{"one":"200"}, {"three":"400"}, {"one":"100"}, {"two":"800"}],
    "foo" : [{"a" : 100, "b" : 200}, {"a" : 300, "b": 400}],
}
def counterize(x):
    # convert the string values to ints so they can be summed
    return Counter({k : int(v) for k, v in x.iteritems()})
counts = {
    k : sum((counterize(x) for x in v), Counter())
    for k, v in input.iteritems()
}
Result:
{
'foo': Counter({'b': 600, 'a': 400}),
'something': Counter({'two': 800, 'three': 400, 'one': 300})
}
I expect using sum with Counter is inefficient (in the same way that using sum with strings is so inefficient that Guido banned it), but I might be wrong. Anyway, if you have performance problems, you could write a function that creates a Counter and repeatedly calls += or update on it:
def makeints(x):
    return {k : int(v) for k, v in x.iteritems()}
def total(seq):
    result = Counter()
    for s in seq:
        result.update(s)
    return result
counts = {k : total(makeints(x) for x in v) for k, v in input.iteritems()}
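The update-based version translates to Python 3 roughly as below, with iteritems() replaced by items(). The name data is used here instead of input to avoid shadowing the built-in; otherwise this is a sketch of the same idea, relying on the fact that Counter.update adds to existing counts rather than overwriting them.

```python
from collections import Counter

data = {
    "something": [{"one": "200"}, {"three": "400"}, {"one": "100"}, {"two": "800"}],
    "foo": [{"a": 100, "b": 200}, {"a": 300, "b": 400}],
}


def total(dicts):
    """Sum the values of all dicts key by key, converting strings to ints."""
    result = Counter()
    for d in dicts:
        # Counter.update adds counts to existing keys.
        result.update({k: int(v) for k, v in d.items()})
    return result


counts = {k: total(v) for k, v in data.items()}
print(counts)
```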
One way would be to do as follows:
from collections import defaultdict
d = {"something" :
[{"one":"200"}, {"three":"400"}, {"one":"100"}, {"two":"800"}]}
dd = defaultdict(list)
# first get and group values from the original data structure
# and change strings to ints
for inner_dict in d['something']:
    for k,v in inner_dict.items():
        dd[k].append(int(v))
# second, create the output dictionary by summing the grouped elements
# from the first step.
out_dict = {k:sum(v) for k,v in dd.items()}
print(out_dict)
# {'two': 800, 'one': 300, 'three': 400}
Here I don't use Counter, but defaultdict. It's a two-step approach.
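If the intermediate lists are not needed, the two steps can be folded into one by accumulating directly into a defaultdict(int). This is a sketch of a variation, not the approach shown above; the names totals and out are arbitrary.

```python
from collections import defaultdict

d = {"something":
     [{"one": "200"}, {"three": "400"}, {"one": "100"}, {"two": "800"}]}

totals = defaultdict(int)  # missing keys start at 0
for inner_dict in d["something"]:
    for k, v in inner_dict.items():
        totals[k] += int(v)  # convert the string and add it to the running sum

out = dict(totals)
print(out)
```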
