iterating over list-valued dictionary - python

Suppose I have a dictionary that has tuples as keys and lists of tuples as values, for example:
d={(0,1):[(1,1)],
(0,2):[(1,1),(1,2)],
(0,3):[(1,1),(1,2),(1,3)]}
I would like to remove all entries such that their value is contained in the value of another key, for example:
from d I would like to remove the entry with key (0,1) because the (1,1) is contained in [(1,1),(1,2)]
and remove the entry with key (0,2) because [(1,1),(1,2)] is contained in [(1,1),(1,2),(1,3)].
Order of tuples in the lists matters.
I can solve this using a bunch of for loops, like this:
to_del = []
for key, val in d.items():
    for k, v in d.items():
        for i in range(0, len(val)):
            if val[i] in v and len(v) - len(val) == 1:
                del_me = True
            else:
                del_me = False
                break
        if del_me:
            to_del.append(key)
for key in set(to_del):
    del d[key]
edit: (further explanation)
Keys are not important here but will be important later.
In other words:
let a,b,c,d denote unique tuples
let k1,k2,..., denote keys
let's have the entries:
k1:[a],
k2:[d],
k3:[a,b],
k4:[b,a],
k5:[a,b,c],
I want to end up with:
k2,k4,k5
Removed entries will be:
k1 because a is in k3
k3 because a,b is in k5
It hurts my eyes when I'm looking at this, sorry.
What would be a pythonic way to do this?

Let's say that you have a dictionary that looks as follows:
d={(0,1):[(1,1)],(0,2):[(1,1),(1,2)],(0,3):[(1,1),(1,2),(1,3)],(0,4):[(1,1),(1,2),(1,4)]}
Now you want to compare the values of all the keys with one another. I suggest collecting all the values of this dictionary in a list.
mylist = []
for key in d.keys():
    mylist.append(d[key])
print(mylist)
mylist will then hold all the values of the dictionary:
[[(1, 1)], [(1, 1), (1, 2)], [(1, 1), (1, 2), (1, 3)], [(1, 1), (1, 2), (1, 4)]]
Now you want to compare all these values with one another and remove those that are subsets of any other value. In this example, [(1,1)] is a subset of [(1,1),(1,2)], so [(1,1)] will be removed. Similarly, [(1,1),(1,2)] is a subset of [(1,1),(1,2),(1,3)], so it will also be removed. We can accomplish this as follows:
out = []
for k, elm in enumerate(mylist):
    for _, elm2 in enumerate(mylist[:k] + mylist[k + 1:]):
        if frozenset(elm).issubset(elm2):
            break
    else:
        out.append(elm)
print(out)
The out list now holds only the values of mylist that are not subsets of any other value:
[[(1, 1), (1, 2), (1, 3)], [(1, 1), (1, 2), (1, 4)]]
In the above code, we loop over the list while tracking the current index. For each element we loop again over the same list with the element at the current index removed. We convert the outer element to a frozenset and use its issubset method to check whether it is a subset of the inner element; if no such superset is found, the else clause of the inner for loop appends the element to out. Note that a frozenset ignores order and duplicates, so this check does not distinguish [a, b] from [b, a].
Now we compare the values of d with the out list and delete every key whose value is not in out. This is accomplished as follows:
for key in list(d):
    if d[key] not in out:
        del d[key]
print(d)
The output dictionary d will be:
{(0, 3): [(1, 1), (1, 2), (1, 3)], (0, 4): [(1, 1), (1, 2), (1, 4)]}
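The frozenset check above ignores order, whereas the question states that order matters (k4: [b, a] should survive). Below is a minimal sketch of an order-aware variant. It assumes that "contained" means "appears as a contiguous run inside a strictly longer list", which is one reading of the question rather than something it confirms. The helper names is_contiguous_sublist and prune_contained are hypothetical.

```python
def is_contiguous_sublist(short, long):
    """Return True if `short` appears as a contiguous run inside `long`."""
    n = len(short)
    return any(long[i:i + n] == short for i in range(len(long) - n + 1))


def prune_contained(d):
    """Return a new dict without entries whose list is contained,
    in order, in a strictly longer list stored under another key."""
    return {
        key: val
        for key, val in d.items()
        if not any(
            len(other) > len(val) and is_contiguous_sublist(val, other)
            for other in d.values()
        )
    }


# Example from the question's edit, with strings standing in for tuples.
entries = {
    "k1": ["a"],
    "k2": ["d"],
    "k3": ["a", "b"],
    "k4": ["b", "a"],
    "k5": ["a", "b", "c"],
}
pruned = prune_contained(entries)
print(sorted(pruned))  # ['k2', 'k4', 'k5']
```

Requiring the other list to be strictly longer means that two keys with identical lists do not eliminate each other; whether that is the desired behavior is also an assumption.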

Related

Check if something in a dictionary is the same as the max value in that dictionary?

How can I check whether something in a dictionary equals the max value in that dictionary? In other words, I want to get all the entries that share the max value instead of only the one at the lowest position.
I have this code which returns the max variable name and value:
d = {'g_dirt4': g_dirt4, 'g_destiny2': g_destiny2, 'g_southpark': g_southpark, 'g_codww2': g_codww2, 'g_bfront2': g_bfront2, 'g_reddead2': g_reddead2, 'g_fifa18': g_fifa18, 'g_motogp17': g_motogp17, 'g_elderscrolls': g_elderscrolls, 'g_crashbandicoot': g_crashbandicoot}
print("g_dirt4", g_dirt4, "g_destiny2", g_destiny2, "g_southpark", g_southpark, "g_codww2", g_codww2, "g_bfront2", g_bfront2, "g_reddead2", g_reddead2, "g_fifa18", g_fifa18, "g_motogp17", g_motogp17, "g_elderscrolls", g_elderscrolls, "g_crashbandicoot", g_crashbandicoot)
print (max(d.items(), key=lambda x: x[1]))
Now it prints the variable with the highest value plus the value itself, but what if there are two or three variables with the same max value? I would like to print all of the max values.
Edit:
The user has to fill in a form, which adds values to the variables in the dictionary. When the user is done, there will be one, two or more variables with the highest value. For example, the code gives me this:
2017-06-08 15:05:43 g_dirt4 9 g_destiny2 8 g_southpark 5 g_codww2 8 g_bfront2 8 g_reddead2 7 g_fifa18 8 g_motogp17 9 g_elderscrolls 5 g_crashbandicoot 6
2017-06-08 15:05:43 ('g_dirt4', 9)
Now it tells me that g_dirt4 has the highest value of 9, but if you look at motogp17, it also had 9 but it doesn't get printed because it's at a higher position in the dictionary. So how do I print them both? And what if it has 3 variables with the same max value?
Given a dictionary
d = {'too': 2, 'one': 1, 'two': 2, 'won': 1, 'to': 2}
the following command:
result = [(n,v) for n,v in d.items() if v == max(d.values())]
yields: [('too', 2), ('two', 2), ('to', 2)]
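The comprehension above calls max(d.values()) once per item. As a small sketch of the same idea, the maximum can be computed once up front; the names max_value and result are only illustrative.

```python
d = {'too': 2, 'one': 1, 'two': 2, 'won': 1, 'to': 2}

# Compute the maximum once instead of once per item.
max_value = max(d.values())

# Keep every (name, value) pair whose value equals that maximum.
result = [(name, value) for name, value in d.items() if value == max_value]
print(result)  # [('too', 2), ('two', 2), ('to', 2)] on Python 3.7+
```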
Let me introduce you to a more complicated but more powerful answer. If you sort your dictionary items, you can use itertools.groupby for some powerful results:
import itertools
foo = {"one": 1, "two": 2, "three": 3, "tres": 3, "dos": 2, "troi": 3}
sorted_kvp = sorted(foo.items(), key=lambda kvp: -kvp[1])
grouped = itertools.groupby(sorted_kvp, key=lambda kvp: kvp[1])
The sorted line takes the key/value pairs of dictionary items and sorts them based on the value. I put a - in front so that the values will end up being sorted descending. The results of that line are:
>>> print(sorted_kvp)
[('tres', 3), ('troi', 3), ('three', 3), ('two', 2), ('dos', 2), ('one', 1)]
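Note, as the comments above said, that the order of keys with equal values (in this case 'tres', 'troi', and 'three', and then 'two' and 'dos') depends on the dictionary's iteration order. Before Python 3.7 that order was arbitrary; from 3.7 on, dictionaries preserve insertion order and sorted is stable, so ties appear in insertion order.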
The itertools.groupby line makes groups out of the runs of data. The lambda tells it to look at kvp[1] for each key-value pair, i.e. the value.
At the moment, you're only interested in the max, so you can then do this:
max_value, key_grouper = next(grouped)  # avoid shadowing the built-in max
print("{}: {}".format(max_value, list(key_grouper)))
And get the following results:
3: [('tres', 3), ('troi', 3), ('three', 3)]
But if you wanted all the information sorted, instead, with this method, that's just as easy:
for value, grouper in grouped:
    print("{}: {}".format(value, list(grouper)))
produces:
3: [('tres', 3), ('troi', 3), ('three', 3)]
2: [('two', 2), ('dos', 2)]
1: [('one', 1)]
One last note: you can use next or you can use the for loop, but using both will give you different results. grouped is an iterator, and calling next on it moves it to its next result (and the for loop consumes the entire iterator, so a subsequent next(grouped) would cause a StopIteration exception).
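The interaction between next and the for loop described above can be sketched as follows, reusing the foo dictionary from this answer. The variable names top_value, top_keys, remaining_values and exhausted are illustrative only.

```python
import itertools

foo = {"one": 1, "two": 2, "three": 3, "tres": 3, "dos": 2, "troi": 3}
sorted_kvp = sorted(foo.items(), key=lambda kvp: -kvp[1])
grouped = itertools.groupby(sorted_kvp, key=lambda kvp: kvp[1])

# next() consumes the first (largest-value) group.
top_value, top_group = next(grouped)
top_keys = sorted(key for key, _ in top_group)

# Iterating afterwards sees only the remaining groups.
remaining_values = [value for value, _ in grouped]

# The iterator is now exhausted; a default avoids StopIteration.
exhausted = next(grouped, None) is None

print(top_value, top_keys)  # 3 ['three', 'tres', 'troi']
print(remaining_values)     # [2, 1]
print(exhausted)            # True
```

Passing a default to next is one way to avoid the StopIteration exception when the iterator may already be empty.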
You could do something like this:
max_value = (max(d.items(), key=lambda x: x[1]))[1]
max_list = [max_value]
for key, value in d.items():
    if value == max_value:
        max_list.append((key, value))
print(max_list)
This will get the maximum value, then loop through all the keys and values in your dictionary and add every pair matching that max value to a list. Note that max_list starts with the maximum value itself, followed by the matching (key, value) pairs. Printing the list then shows all of them.

Collect same keys (by only the first part of the key) in a python dict?

I have a python dictionary with similar keys and I want to collect all keys (and values) with the same first part (name or title in this case) into a dict or list in order to find the most common values afterwards. As a side-note: I don't know how many copies of a key (with the same first part) exist. Here are 3, but there could be only 2 or more than 3.
{'name=a=AA': (2, 2), 'name=a_copy=AA': (3, 3), 'name=a_copy2=AA': (3, 2),
'title=b=AA': (1, 2), 'title=b_copy=AA': (3, 3), 'title=b_copy2=AA': (1, 2)}
Is this possible? I thought about using key.split("=")[0]
Just loop over the key-values and collect them into a dictionary with lists:
results = {}
for key, value in input_dict.items():
    prefix = key.partition('=')[0]
    results.setdefault(prefix, []).append((key, value))
This splits off the first part using str.partition(), which is faster for the single-split case. You could use key.split('=', 1)[0] as well, however.
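The question mentions wanting the most common values afterwards. One possible way to continue from the grouping step, sketched here with collections.Counter, is shown below. The names values_by_prefix and most_common are assumptions for illustration, and input_dict is filled with the sample data from the question.

```python
from collections import Counter

input_dict = {
    'name=a=AA': (2, 2), 'name=a_copy=AA': (3, 3), 'name=a_copy2=AA': (3, 2),
    'title=b=AA': (1, 2), 'title=b_copy=AA': (3, 3), 'title=b_copy2=AA': (1, 2),
}

# Group the values by the part of the key before the first '='.
values_by_prefix = {}
for key, value in input_dict.items():
    prefix = key.partition('=')[0]
    values_by_prefix.setdefault(prefix, []).append(value)

# Counter.most_common(1) returns a list holding the top (value, count) pair.
most_common = {
    prefix: Counter(values).most_common(1)[0]
    for prefix, values in values_by_prefix.items()
}
print(most_common['title'])  # ((1, 2), 2)
```

For the 'name' prefix every value occurs once, so which one counts as "most common" is a tie that the caller would need to resolve.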
Using defaultdict:
>>> from collections import defaultdict
>>> d = defaultdict(list)
>>> for key in D: # this is the original dictionary
...     d[key.split("=")[0]].append(key)
...
>>> d
defaultdict(<class 'list'>, {'title': ['title=b_copy2=AA', 'title=b_copy=AA', 'title=b=AA'], 'name': ['name=a=AA', 'name=a_copy=AA', 'name=a_copy2=AA']})
Another way is to use itertools.groupby and group the keys by the first item of the split on =:
>>> d
{'name=a=AA': (2, 2), 'name=a_copy2=AA': (3, 2), 'title=b=AA': (1, 2), 'name=a_copy=AA': (3, 3), 'title=b_copy=AA': (3, 3), 'title=b_copy2=AA': (1, 2)}
>>> from itertools import groupby
>>> dd = {}
>>> for k, v in groupby(d, key=lambda s: s.split('=')[0]):
...     if k in dd:
...         dd[k].extend(list(v))
...     else:
...         dd[k] = list(v)
...
>>> dd
{'name': ['name=a=AA', 'name=a_copy2=AA', 'name=a_copy=AA'], 'title': ['title=b=AA', 'title=b_copy=AA', 'title=b_copy2=AA']}
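itertools.groupby only groups consecutive items, which is why the loop above has to merge groups when the same prefix shows up more than once. A sketch of an alternative, assuming it is acceptable to sort the keys by prefix first, is given below; the helper name prefix is hypothetical.

```python
from itertools import groupby

d = {'name=a=AA': (2, 2), 'name=a_copy2=AA': (3, 2), 'title=b=AA': (1, 2),
     'name=a_copy=AA': (3, 3), 'title=b_copy=AA': (3, 3), 'title=b_copy2=AA': (1, 2)}


def prefix(key):
    """Return the part of the key before the first '='."""
    return key.split('=', 1)[0]


# Sorting by prefix makes equal prefixes adjacent, so each prefix
# forms exactly one group and no merging step is needed.
dd = {k: list(group) for k, group in groupby(sorted(d, key=prefix), key=prefix)}
print(dd['title'])  # ['title=b=AA', 'title=b_copy=AA', 'title=b_copy2=AA']
```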

Replacing an element in an OrderedDict?

Should I remove item at the index and add item at the index?
Where should I look at to find the source for OrderedDict class?
From the Python documentation:
If a new entry overwrites an existing entry, the original insertion position is left unchanged. Deleting an entry and reinserting it will move it to the end.
The OrderedDict keeps the position of a key if it is already present; if it isn't, it just treats it as a new key and adds it at the end.
This is in the documentation. If you need to replace and maintain order, you'll have to do it manually:
from collections import OrderedDict
od = OrderedDict({i: i for i in range(5)})
# od = OrderedDict([(0, 0), (1, 1), (2, 2), (3, 3), (4, 4)])
# Replace the key and value for key == 0:
d = OrderedDict(('replace','key') if key == 0 else (key, value) for key, value in od.items())
# d = OrderedDict([('replace', 'key'), (1, 1), (2, 2), (3, 3), (4, 4)])
# Single value replaces are done easily:
d[1] = 20 # and so on..
Additionally, at the top of the documentation page you'll see a link to the source file that contains, among others, the OrderedDict class. At the time of this answer it was collections.py, where OrderedDict was the first class defined; in current Python 3 releases the module lives in Lib/collections/__init__.py.
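The behavior quoted from the documentation can be checked with a short sketch. The variable names after_overwrite and after_reinsert are illustrative only.

```python
from collections import OrderedDict

od = OrderedDict((i, i) for i in range(3))

# Overwriting an existing key keeps its original position.
od[0] = 'changed'
after_overwrite = list(od.items())

# Deleting the key and reinserting it moves it to the end.
del od[0]
od[0] = 'reinserted'
after_reinsert = list(od.items())

print(after_overwrite)  # [(0, 'changed'), (1, 1), (2, 2)]
print(after_reinsert)   # [(1, 1), (2, 2), (0, 'reinserted')]
```

So replacing only the value needs no special handling; rebuilding the OrderedDict is needed only when the key itself changes.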

Remove duplicate values from a defaultdict python

I have a dictionary.
a = {6323: [169635, 169635, 169635], 6326: [169634,169634,169634,169634,169634,169634,169638,169638,169638,169638], 6425: [169636,169636,169636,169639,169639,169640]}
How do I remove the duplicate values for each key in dictionary a? And make the values become [value, occurrences]?
The output should be
b = {6323: [(169635, 3)], 6326: [(169634, 6), (169638, 4)], 6425: [(169636, 3), (169639, 2), (169640, 1)]}
EDIT:
Sorry, I pasted the dict.items() output so they weren't dictionaries. I corrected it now.
Also edited the question to be more clear.
I would suggest iterating over the items and, for each value list, building a defaultdict that increments the occurrence count. Then convert that into your list of tuples (with the items method) and store it in the output dictionary.
from collections import defaultdict

b = {}
for k, v in a.items():
    d = defaultdict(int)
    for i in v:
        d[i] += 1
    b[k] = list(d.items())  # list of (value, occurrences) tuples
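The same counting can also be sketched with collections.Counter, which is a swapped-in alternative to the defaultdict approach above rather than the answer's own method. The input a is the dictionary from the question.

```python
from collections import Counter

a = {6323: [169635, 169635, 169635],
     6326: [169634] * 6 + [169638] * 4,
     6425: [169636, 169636, 169636, 169639, 169639, 169640]}

# Counter tallies each value; most_common() returns (value, count) pairs
# ordered from the highest count to the lowest.
b = {key: Counter(values).most_common() for key, values in a.items()}
print(b[6425])  # [(169636, 3), (169639, 2), (169640, 1)]
```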

Adding the values of a dictionary to a list

Hi, I am having a bit of difficulty adding tuples that are the values of a dictionary. I have extracted the tuples and need to add them to an iterable, say an empty list, i.e.:
path = [1,2,3,4]
pos = {1:(3,7), 2(3,0),3(2,0),4(5,8)}
h = []
for key in path:
    if key in pos:
        print pos[key]
        h.append(pos[Key])  # Gives an error
How can I append the values in pos[key] to h? Thanks.
You can use list comprehension:
h = [pos[key] for key in path if key in pos]
Demo:
print h
>>> [(3, 7), (3, 0), (2, 0), (5, 8)]
Notes:
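A dictionary literal is written as key: value pairs separated by commas, so pos should read {1: (3, 7), 2: (3, 0), 3: (2, 0), 4: (5, 8)}; the version in the question is missing the colons.
Also, Python is case-sensitive, so key and Key are different names, and pos[Key] refers to a name that was never defined.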
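Putting both notes together, a corrected version of the loop from the question might look like the sketch below. It assumes Python 3, so print is called as a function, unlike the Python 2 print statements used in the question.

```python
path = [1, 2, 3, 4]
# Every entry needs a colon between key and value.
pos = {1: (3, 7), 2: (3, 0), 3: (2, 0), 4: (5, 8)}

h = []
for key in path:
    if key in pos:
        h.append(pos[key])  # lowercase `key`, matching the loop variable

print(h)  # [(3, 7), (3, 0), (2, 0), (5, 8)]
```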
