Getting Lists of Summed Nested Dictionary Values - python

I am trying to write part of a program where the user inputs a target word (targetWord = input()) and the program then selects the nested dictionary whose key is the same as the input word.
For example:
mainDict = {
    'x': {'one': 1, 'blue': 1, 'green': 1},
    'y': {'red': 1, 'blue': 2, 'two': 1},
    'z': {'one': 1, 'green': 1, 'red': 1}
}
where all the values in the nested dictionary are assigned integers.
The user may input 'x', to which the program will assign:
targetDict = mainDict['x']
The program should then allow the user to input words again, but this time every single word from input is appended to a lookup list, for example user inputs 'y', then 'z':
lookup = ['y', 'z']
Then, for each dictionary named in lookup, the program should collect the values of every key that also appears in targetDict into a list, and append that list to a new nested list. So the output of this section should be:
targetOutput = [[2], [1, 1]]
because in nested dict 'y', only 'blue' was a common key, to which its value 2 was put in a list, then appended onto targetOutput. The same case with dict 'z', where the keys 'one' and 'green' were present in both 'x' and 'z', putting their values, 1 and 1 into a nested list.
Here is a representation of the dysfunctional code I have so far:
targetOutput = []
targetDict = mainDict[targetWord]
for tkey in targetDict:
    tt = []
    for word in lookup:
        for wkey in primaryDict[word]:
            if tkey == wkey:
                tt.append(targetDict[tkey])
    tl.append(sum(tt))
print((pl))
The sum function at the end is there because my actual final output should be the sum of the values in each nested list, akin to:
tl = [[2], [2]]
I am also trying to make the reverse happen, where in another list for every key in the lookup, it returns a nested list containing the sum of every value the targetWord dictionary also has a key for, like:
ll = [[2], [2]]
My question is, how do I fix my code so that it outputs the 2 above lists? I'm quite new with dictionaries.

The .keys() method on a dictionary gives you a dictionary view, which can act like a set. This means you can take the intersection between the key views of two dictionaries! You want the intersection between the initial targetDict and the dictionaries named in lookup:
targetOutput = []
for word in lookup:
    other_dict = mainDict[word]
    common_keys = targetDict.keys() & other_dict
    targetOutput.append([other_dict[common] for common in common_keys])
The targetDict.keys() & other_dict expression produces the intersection here:
>>> mainDict = {
...     'x': {'one': 1, 'blue': 1, 'green': 1},
...     'y': {'red': 1, 'blue': 2, 'two': 1},
...     'z': {'one': 1, 'green': 1, 'red': 1}
... }
>>> targetDict = mainDict['x']
>>> targetDict.keys() & mainDict['y']
{'blue'}
>>> targetDict.keys() & mainDict['z']
{'green', 'one'}
The [other_dict[common] for common in common_keys] list comprehension takes those keys and looks up the values for them from the other dictionary.
If you want to sum the values, just pass the same sequence of values to the sum() function:
targetOutput = []
for word in lookup:
    other_dict = mainDict[word]
    common_keys = targetDict.keys() & other_dict
    summed_values = sum(other_dict[common] for common in common_keys)
    targetOutput.append(summed_values)
There is no point in wrapping the summed values in another list there as there is only ever going to be a single sum. The above gives you a targetOutput list with [2, 2], not [[2], [2]].
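Putting the pieces together, here is a minimal, self-contained sketch of the approach above. The function name summed_common_values is an assumption for illustration, and the second list (the target's own values for the shared keys) is only one possible reading of the "reverse" case in the question, not something the answer itself covers.

```python
def summed_common_values(main_dict, target_word, lookup):
    """Return two lists of sums over the keys shared with the target dictionary.

    Hypothetical helper: the name and the second return value are assumptions.
    """
    target_dict = main_dict[target_word]
    other_sums = []   # sums of the looked-up dictionaries' values
    target_sums = []  # sums of the target dictionary's own values
    for word in lookup:
        other_dict = main_dict[word]
        # Key views behave like sets, so & gives the shared keys.
        common_keys = target_dict.keys() & other_dict.keys()
        other_sums.append(sum(other_dict[key] for key in common_keys))
        target_sums.append(sum(target_dict[key] for key in common_keys))
    return other_sums, target_sums


mainDict = {
    'x': {'one': 1, 'blue': 1, 'green': 1},
    'y': {'red': 1, 'blue': 2, 'two': 1},
    'z': {'one': 1, 'green': 1, 'red': 1},
}
tl, ll = summed_common_values(mainDict, 'x', ['y', 'z'])
print(tl)  # [2, 2]
print(ll)  # [1, 2]
```

Returning both lists from one loop avoids computing the key intersection twice per looked-up word.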

Related

Get value of dictionaries into separate lists

I am trying to group the dictionaries into separate lists by the value of their first key ('a').
The names of the keys are always the same and the number of elements is the same.
[{'a': 1, 'b':41, 'c':324}, {'a': 1, 'b':12, 'c':65}, {'a': 2, 'b':36, 'c':12}]
expected output:
[{'b':41, 'c':324}, {'b':12, 'c':65}]
[{'b':36, 'c':12}]
Make a new dictionary that uses the values of the a keys as its keys.
newdict = {}
for d in data:  # data is the list of dictionaries from the question
    newdict.setdefault(d['a'], []).append({'b': d['b'], 'c': d['c']})
result = list(newdict.values())
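As a runnable sketch of the same idea, the version below defines the input (named data here, an assumption) and drops the 'a' key generically instead of listing 'b' and 'c' by hand. It relies on dictionaries preserving insertion order, which holds on Python 3.7 and later.

```python
data = [
    {'a': 1, 'b': 41, 'c': 324},
    {'a': 1, 'b': 12, 'c': 65},
    {'a': 2, 'b': 36, 'c': 12},
]

groups = {}
for d in data:
    # Copy every key except the grouping key 'a'.
    rest = {key: value for key, value in d.items() if key != 'a'}
    groups.setdefault(d['a'], []).append(rest)

result = list(groups.values())
print(result)
# [[{'b': 41, 'c': 324}, {'b': 12, 'c': 65}], [{'b': 36, 'c': 12}]]
```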

Over counting pairs in python loop

I have a list of dictionaries where each dict is of the form:
{'A': a,'B': b}
I want to iterate through the list and for every (a,b) pair, find the pair(s), (b,a), if it exists.
For example if for a given entry of the list A = 13 and B = 14, then the original pair would be (13,14). I would want to search the entire list of dicts to find the pair (14,13). If (14,13) occurred multiple times I would like to record that too.
I would like to count the number of times for all original (a,b) pairs in the list, when the complement (b,a) appears, and if so how many times. To do this I have two for loops and a counter when a complement pair is found.
pairs_found = 0
for i, val in enumerate( list_of_dicts ):
    for j, vol in enumerate( list_of_dicts ):
        if val['A'] == vol['B']:
            if vol['A'] == val['B']:
                pairs_found += 1
This generates a pairs_found greater than the length of list_of_dicts. I realize this is because the same pairs will be over-counted. I am not sure how I can overcome this degeneracy?
Edit for Clarity
list_of_dicts = []
list_of_dicts.append({'A': 14, 'B': 23})
list_of_dicts.append({'A': 235, 'B': 98})
list_of_dicts.append({'A': 686, 'B': 999})
list_of_dicts.append({'A': 128, 'B': 123})
....
Let's say that the list has around 100000 entries. Somewhere in that list, there will be one or more entries of the form {'A': 23, 'B': 14}. If this is true, then I would like a counter to increase its value by one. I would like to do this for every entry in the list.
Here is what I suggest:
Use tuple to represent your pairs and use them as dict/set keys.
Build a set of unique inverted pairs you'll look for.
Use a dict to store the number of times each pair appears inverted.
Then the code should look like this:
# Create a set of unique inverted pairs
inverted_pairs_set = {(d['B'], d['A']) for d in list_of_dicts}
# Create a counter for original pairs
pairs_counter_dict = {(ip[1], ip[0]): 0 for ip in inverted_pairs_set}
# Create list of pairs
pairs_list = [(d['A'], d['B']) for d in list_of_dicts]
# Count, for each original pair, how many times its inverse occurs
for p in pairs_list:
    if p in inverted_pairs_set:
        pairs_counter_dict[(p[1], p[0])] += 1
You can create a counter dictionary that contains the values of the 'A' and 'B' keys in all your dictionaries:
complements_cnt = {(dct['A'], dct['B']): 0 for dct in list_of_dicts}
Then all you need is to iterate over your dictionaries again and increment the value for the "complements":
for dct in list_of_dicts:
    try:
        complements_cnt[(dct['B'], dct['A'])] += 1
    except KeyError:  # in case there is no complement there is nothing to increase
        pass
For example with such a list_of_dicts:
list_of_dicts = [{'A': 1, 'B': 2}, {'A': 2, 'B': 1}, {'A': 1, 'B': 2}]
This gives:
{(1, 2): 1, (2, 1): 2}
Which basically says that the {'A': 1, 'B': 2} has one complement (the second) and {'A': 2, 'B': 1} has two (the first and the last).
The solution is O(n) which should be quite fast even for 100000 dictionaries.
Note: This is quite similar to @debzsud's answer. I hadn't seen it before I posted this answer, though. :(
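For comparison, the same O(n) counting can be sketched with collections.Counter from the standard library. This is an alternative formulation rather than the code from either answer above; the names pair_counts and complements are assumptions.

```python
from collections import Counter

list_of_dicts = [{'A': 1, 'B': 2}, {'A': 2, 'B': 1}, {'A': 1, 'B': 2}]

# How often each (A, B) pair occurs in the list.
pair_counts = Counter((d['A'], d['B']) for d in list_of_dicts)

# For each distinct pair, how often the reversed pair occurs.
# Indexing a Counter with a missing key returns 0 without inserting it.
complements = {(a, b): pair_counts[(b, a)] for (a, b) in pair_counts}

print(complements)  # {(1, 2): 1, (2, 1): 2}
```

Both passes touch each entry once, so the cost stays linear in the length of the list.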
I am still not 100% sure what it is you want to do but here is my guess:
pairs_found = 0
for i, dict1 in enumerate(list_of_dicts):
    for j, dict2 in enumerate(list_of_dicts[i+1:]):
        if dict1['A'] == dict2['B'] and dict1['B'] == dict2['A']:
            pairs_found += 1
Note the slicing on the second for loop. This avoids checking pairs that have already been checked before (comparing D1 with D2 is enough; no need to compare D2 to D1)
This roughly halves the number of comparisons, but it is still O(n**2), so there is probably room for improvement.
You could first create a list with the values of each dictionary as tuples:
example_dict = [{"A": 1, "B": 2}, {"A": 4, "B": 3}, {"A": 5, "B": 1}, {"A": 2, "B": 1}]
dict_values = [tuple(x.values()) for x in example_dict]
Then create a second list with the number of occurrences of each element inverted:
occurrences = [dict_values.count(x[::-1]) for x in dict_values]
Finally, create a dict with dict_values as keys and occurrences as values:
dict(zip(dict_values, occurrences))
Output:
{(1, 2): 1, (2, 1): 1, (4, 3): 0, (5, 1): 0}
For each key, you have the number of inverted keys. You can also create the dictionary on the fly:
occurrences = {x: dict_values.count(x[::-1]) for x in dict_values}

python error 'dict' object has no attribute: 'add'

I wrote this code to perform as a simple search engine in a list of strings like the example below:
mii(['hello world','hello','hello cat','hellolot of cats']) == {'hello': {0, 1, 2}, 'cat': {2}, 'of': {3}, 'world': {0}, 'cats': {3}, 'hellolot': {3}}
but I constantly get the error
'dict' object has no attribute 'add'
how can I fix it?
def mii(strlist):
    word={}
    index={}
    for str in strlist:
        for str2 in str.split():
            if str2 in word==False:
                word.add(str2)
                i={}
                for (n,m) in list(enumerate(strlist)):
                    k=m.split()
                    if str2 in k:
                        i.add(n)
                index.add(i)
    return { x:y for (x,y) in zip(word,index)}
In Python, when you initialize an object as word = {} you're creating a dict object and not a set object (which I assume is what you wanted). In order to create a set, use:
word = set()
You might have been confused by Python's Set Comprehension, e.g.:
myset = {e for e in [1, 2, 3, 1]}
which results in a set containing elements 1, 2 and 3. Similarly Dict Comprehension:
mydict = {k: v for k, v in [(1, 2)]}
results in a dictionary with key-value pair 1: 2.
x = [1, 2, 3] # is a literal that creates a list (mutable array).
x = [] # creates an empty list.
x = (1, 2, 3) # is a literal that creates a tuple (constant list).
x = () # creates an empty tuple.
x = {1, 2, 3} # is a literal that creates a set.
x = {} # confusingly creates an empty dictionary (hash array), NOT a set, because dictionaries were there first in python.
Use
x = set() # to create an empty set.
Also note that
x = {"first": 1, "unordered": 2, "hash": 3} # is a literal that creates a dictionary, just to mix things up.
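A quick way to check the distinction between these literals is to inspect the types directly. The variable names below are chosen only for illustration.

```python
empty_braces = {}      # an empty dict, not a set
empty_set = set()      # the only way to spell an empty set
non_empty = {1, 2, 3}  # braces with elements do create a set

print(type(empty_braces).__name__)  # dict
print(type(empty_set).__name__)     # set
print(type(non_empty).__name__)     # set

empty_set.add(1)  # fine: sets have .add()
try:
    empty_braces.add(1)  # dicts do not have .add()
except AttributeError as error:
    print(error)  # 'dict' object has no attribute 'add'
```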
I see lots of issues in your function -
In Python {} is an empty dictionary, not a set , to create a set, you should use the builtin function set() .
The if condition - if str2 in word==False: , would never amount to True because of operator chaining, it would be converted to - if str2 in word and word==False , example showing this behavior -
>>> 'a' in 'abcd'==False
False
>>> 'a' in 'abcd'==True
False
In line - for (n,m) in list(enumerate(strlist)) - You do not need to convert the return value of enumerate() to a list; you can iterate over it directly (it is an iterator).
Sets do not have any sense of order, when you do - zip(word,index) - there is no guarantee that the elements are zipped in the correct order you want (since they do not have any sense of order at all).
Do not use str as a variable name.
Given this, you are better off directly creating the dictionary from the start , rather than sets.
Code -
def mii(strlist):
    word={}
    for i, s in enumerate(strlist):
        for s2 in s.split():
            word.setdefault(s2,set()).add(i)
    return word
Demo -
>>> def mii(strlist):
...     word={}
...     for i, s in enumerate(strlist):
...         for s2 in s.split():
...             word.setdefault(s2,set()).add(i)
...     return word
...
>>> mii(['hello world','hello','hello cat','hellolot of cats'])
{'cats': {3}, 'world': {0}, 'cat': {2}, 'hello': {0, 1, 2}, 'hellolot': {3}, 'of': {3}}
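The operator-chaining point made earlier in this answer can be checked with a small sketch. The set word below is a stand-in chosen for illustration, not the asker's data.

```python
word = {'hello', 'world'}

# Chained comparison: evaluated as ('cat' in word) and (word == False),
# so it can never be True, whether or not 'cat' is present.
chained = 'cat' in word == False

# Parentheses give the intended meaning, but 'not in' is the idiomatic form.
grouped = ('cat' in word) == False
idiomatic = 'cat' not in word

print(chained, grouped, idiomatic)  # False True True
```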
def mii(strlist):
    word_list = {}
    for index, s in enumerate(strlist):
        for word in s.split():
            if word not in word_list:
                word_list[word] = [index]
            else:
                word_list[word].append(index)
    return word_list
print(mii(['hello world','hello','hello cat','hellolot of cats']))
Output:
{'of': [3], 'cat': [2], 'cats': [3], 'hellolot': [3], 'world': [0], 'hello': [0, 1, 2]}
I think this is what you wanted.
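If sets of positions are wanted, as in the question's expected output, a variant using collections.defaultdict is one option. This is a sketch, not code from any of the answers; the function name build_index is an assumption.

```python
from collections import defaultdict


def build_index(strlist):
    """Map each word to the set of positions of the strings containing it."""
    index = defaultdict(set)  # missing words start with an empty set
    for position, text in enumerate(strlist):
        for token in text.split():
            index[token].add(position)
    return dict(index)  # plain dict, so later lookups cannot add keys


result = build_index(['hello world', 'hello', 'hello cat', 'hellolot of cats'])
print(result['hello'])  # {0, 1, 2}
```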

Removing dictionaries from a list on the basis of duplicate value of key

I am new to Python. Suppose i have the following list of dictionaries:
mydictList= [{'a':1,'b':2,'c':3},{'a':2,'b':2,'c':4},{'a':2,'b':3,'c':4}]
From the above list, i want to remove dictionaries with same value of key b. So the resultant list should be:
mydictList = [{'a':1,'b':2,'c':3},{'a':2,'b':3,'c':4}]
You can create a new dictionary based on the value of b, iterating the mydictList backwards (since you want to retain the first value of b), and get only the values in the dictionary, like this
>>> {item['b'] : item for item in reversed(mydictList)}.values()
[{'a': 1, 'c': 3, 'b': 2}, {'a': 2, 'c': 4, 'b': 3}]
If you are using Python 3.x, you might want to use list function over the dictionary values, like this
>>> list({item['b'] : item for item in reversed(mydictList)}.values())
Note: This solution may not maintain the order of the dictionaries.
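If the original order matters, one order-preserving alternative is to iterate forwards and keep only the first dictionary seen for each b value. This sketch assumes Python 3.7 or later, where dictionaries preserve insertion order; the name first_by_b is chosen for illustration.

```python
mydictList = [{'a': 1, 'b': 2, 'c': 3}, {'a': 2, 'b': 2, 'c': 4}, {'a': 2, 'b': 3, 'c': 4}]

first_by_b = {}
for item in mydictList:
    # setdefault stores the item only if this b value has not been seen yet.
    first_by_b.setdefault(item['b'], item)

result = list(first_by_b.values())
print(result)  # [{'a': 1, 'b': 2, 'c': 3}, {'a': 2, 'b': 3, 'c': 4}]
```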
First, sort the list by b-values (Python's sorting algorithm is stable, so dictionaries with identical b values will retain their relative order).
from operator import itemgetter
tmp1 = sorted(mydictList, key=itemgetter('b'))
Next, use itertools.groupby to create subiterators that iterate over dictionaries with the same b value.
import itertools
tmp2 = itertools.groupby(tmp1, key=itemgetter('b'))
Finally, create a new list that contains only the first element of each subiterator:
# Each x is a tuple (some-b-value, iterator-over-dicts-with-b-equal-some-b-value)
newdictList = [ next(x[1]) for x in tmp2 ]
Putting it all together:
from itertools import groupby
from operator import itemgetter
by_b = itemgetter('b')
newdictList = [ next(x[1]) for x in groupby(sorted(mydictList, key=by_b), key=by_b) ]
A very straightforward approach can go something like this:
mydictList= [{'a':1,'b':2,'c':3},{'a':2,'b':2,'c':4},{'a':2,'b':3,'c':4}]
b_set = set()
new_list = []
for d in mydictList:
    if d['b'] not in b_set:
        new_list.append(d)
        b_set.add(d['b'])
Result:
>>> new_list
[{'a': 1, 'c': 3, 'b': 2}, {'a': 2, 'c': 4, 'b': 3}]

How to address a dictionary in a list of ordered dicts by unique key value?

(Using Python 2.7) The list, for example:
L = [
    {'ID': 1, 'val': ['eggs']},
    {'ID': 2, 'val': ['bacon']},
    {'ID': 6, 'val': ['sausage']},
    {'ID': 9, 'val': ['spam']}
]
This does what I want:
def getdict(list, dict_ID):
    for rec in list:
        if rec['ID'] == dict_ID:
            return rec
print(getdict(L, 6))
but is there a way to address that dictionary directly, without iterating over the list until you find it?
The use case: reading a file of records (ordered dicts). Different key values from records with a re-occurring ID must be merged with the record with the first occurrence of that ID.
ID numbers may occur in other key values, so if rec['ID'] in list would produce false positives.
While reading records (and adding them to the list of ordered dicts), I maintain a set of unique ID's and only call getdict if a newly read ID is already in there. But then still, it's a lot of iterations and I wonder if there isn't a better way.
The use case: reading a file of records (ordered dicts). Different key
values from records with a re-occurring ID must be merged with the
record with the first occurrence of that ID.
You need to use a defaultdict for this:
>>> from collections import defaultdict
>>> d = defaultdict(list)
>>> d['a'].append(1)
>>> d['a'].append(2)
>>> d['b'].append(3)
>>> d['c'].append(4)
>>> d['b'].append(5)
>>> print(d['a'])
[1, 2]
>>> print(d)
defaultdict(<type 'list'>, {'a': [1, 2], 'c': [4], 'b': [3, 5]})
If you want to store other objects, for example a dictionary, just pass that as the callable:
>>> d = defaultdict(dict)
>>> d['a']['values'] = []
>>> d['b']['values'] = []
>>> d['a']['values'].append('a')
>>> d['a']['values'].append('b')
>>> print(d)
defaultdict(<type 'dict'>, {'a': {'values': ['a', 'b']}, 'b': {'values': []}})
Maybe I'm missing something, but couldn't you use a single dictionary?
L = {
    1 : 'eggs',
    2 : 'bacon',
    6 : 'sausage',
    9 : 'spam'
}
Then you can do L.get(ID). This will either return the value (eggs, etc) or None if the ID isn't in the dict.
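For the use case in the question (merging records that share an ID), a dictionary keyed by ID can be built while the records are read. The sketch below is an assumption about what "merged" means: it simply extends the first record's 'val' list. The sample data with a repeated ID is hypothetical.

```python
records = [
    {'ID': 1, 'val': ['eggs']},
    {'ID': 2, 'val': ['bacon']},
    {'ID': 6, 'val': ['sausage']},
    {'ID': 2, 'val': ['spam']},  # hypothetical re-occurring ID
]

by_id = {}
for rec in records:
    # setdefault returns the first record stored for this ID.
    first = by_id.setdefault(rec['ID'], rec)
    if first is not rec:
        # Hypothetical merge rule: append the later values to the first record.
        first['val'].extend(rec['val'])

print(by_id[2])      # {'ID': 2, 'val': ['bacon', 'spam']}
print(by_id.get(9))  # None
```

Each lookup is a single hash access, so there is no need to scan the list or to maintain a separate set of seen IDs.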
You seem to be doing an inverse dictionary lookup, that is, a lookup by value instead of by key. The question "Inverse dictionary lookup - Python" has some pointers on how to do this efficiently.
