Append elements in the value field of a dictionary using comprehensions - python

I have a list of elements, let's say:
y = [1, 3, 1, 5, 1]
And I would like to create a dictionary where:
Keys: are the elements in y
Values: is a list of the elements that appear before the Key in y
I attempted the following comprehension.
a={elem:y[i] for i, elem in enumerate(y[1:])}
However, since the value field in the dictionary is not a list, it only keeps the previous element in the last occurrence of the key.
In other words, for this example I get the following:
{3: 1, 1: 5, 5: 1}
Is there a way to do so using comprehensions ?
Note: I forgot to add the desired result.
{3: [1], 1: [3,5], 5: [1]}

Your keys are duplicated, so you cannot create a dictionary with them (you'll lose the first elements).
So comprehensions are difficult to use (and inefficient, as stated by other comprehension answers here) because of the accumulation effect that you need.
I suggest using collections.defaultdict(list) instead and a good old loop:
import collections
y = [1, 3, 1, 5, 1]
d = collections.defaultdict(list)
for i, x in enumerate(y[1:]):
    d[x].append(y[i])  # i is the index of the previous element in y
print(d)
result:
defaultdict(<class 'list'>, {1: [3, 5], 3: [1], 5: [1]})
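The same loop can also be written without index arithmetic by pairing each element with its successor. This is only a minimal sketch of the defaultdict approach above; the names prev, cur and preceding are illustrative and not from the original answer.

```python
from collections import defaultdict

y = [1, 3, 1, 5, 1]

# Map each element to the list of elements that immediately precede it.
preceding = defaultdict(list)
for prev, cur in zip(y, y[1:]):  # (1, 3), (3, 1), (1, 5), (5, 1)
    preceding[cur].append(prev)

result = dict(preceding)  # plain dict, only for nicer display
print(result)
```

Converting to a plain dict at the end is optional; it just avoids the defaultdict wrapper in the printed output.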

Use enumerate and set operations.
{value: set(y[:i]) - {value} for i, value in enumerate(y)}
Out: {1: {3, 5}, 3: {1}, 5: {1, 3}}
It's a bit ugly and inefficient because, in your example, it computes a new value each time it encounters 1; the result is still correct because the last computation for a key overwrites the earlier ones.

Just for the fun of it. Here's a comprehension.
a = {y[i]: [y[x-1] for x in range(1, len(y)) if y[x]==y[i]] for i in range(1, len(y))}
>> {3: [1], 1: [3,5], 5: [1]}
Just note that it's too long and inefficient to be allowed in any practical program.
Using defaultdict, as Jean-François Fabre suggested in his answer, should be the proper way.

Related

Alternative way to use setdefault() using dictionary comprehension?

I have a nested dictionary that was created from a nested list where the first item in the nested list would be the outer key and outer value would be a dictionary which is the next two items. The following code is working great using the two setdefault() functions because it just adds to the nested dictionary when it sees a duplicate key of the outer. I was just wondering how you could do this same logic using a dictionary comprehension?
dict1 = {}
list1 = [[1, 2, 6],
         [1, 3, 7],
         [2, 5, 8],
         [2, 8, 9]]
for i in list1:
    dict1.setdefault(i[0], {}).setdefault(i[1], i[2])
OUTPUT:
{1: {2: 6, 3: 7}, 2: {5: 8, 8: 9}}
Use the loop because it's very readable and efficient. Not all code has to be a one-liner.
Having said that, it's possible. It abuses syntax and is extremely unreadable, inefficient, and generally just plain bad code (don't do it!):
out = {k: next(gg for gg in [{}] if all(gg.setdefault(a, b) for a,b in v)) for k, v in next(g for g in [{}] if not any(g.setdefault(key, []).append(v) for key, *v in list1)).items()}
Output:
{1: {2: 6, 3: 7}, 2: {5: 8, 8: 9}}
I actually tried to achieve that result and failed.
The comprehension overwrites the new entries.
After giving this idea a look, I found a similar post which states that it is not possible: https://stackoverflow.com/questions/11276473/append-to-a-dict-of-lists-with-a-dict-comprehension
I believe Amber's answer best summarizes the conclusion of my failed attempt with dict comprehensions:
No - dict comprehensions are designed to generate non-overlapping keys with each iteration; they don't support aggregation. For this particular use case, a loop is the proper way to accomplish the task efficiently (in linear time)
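One way to stay within that constraint is to group the rows first, so that the comprehension produces each outer key exactly once. The following is a sketch using itertools.groupby, not something proposed in the answers above; it assumes every row has exactly three items, and the names rows and nested are illustrative.

```python
from itertools import groupby
from operator import itemgetter

list1 = [[1, 2, 6],
         [1, 3, 7],
         [2, 5, 8],
         [2, 8, 9]]

# groupby only merges adjacent rows, so sort by the outer key first.
rows = sorted(list1, key=itemgetter(0))

nested = {
    outer: {inner: value for _, inner, value in group}
    for outer, group in groupby(rows, key=itemgetter(0))
}
print(nested)
```

Two differences from the setdefault loop are worth keeping in mind: the sort makes this O(n log n) rather than linear, and if an inner key repeats within a group the comprehension keeps the last value, whereas setdefault keeps the first.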

Pythonic way of getting hierarchy of elements in numeric list

I have a numeric list a and I want to output a list with the hierarchical position of every element in a (0 for the highest value, 1 for the second-highest, etc).
I want to know if this is the most Pythonic and efficient way to do this. Perhaps there is a better way?
a = [3,5,6,25,-3,100]
b = sorted(a)
b = b[::-1]
[b.index(i) for i in a]
@ThierryLathuille's answer works only if there are no duplicates in the input list since the answer relies on a dict with the list values as keys. If there can be duplicates in the list, you should sort the items in the input list with their indices generated by enumerate, and map those indices to their sorted positions instead:
from operator import itemgetter
mapping = dict(zip(map(itemgetter(0), sorted(enumerate(a), key=itemgetter(1), reverse=True)), range(len(a))))
mapping becomes:
{5: 0, 3: 1, 2: 2, 1: 3, 0: 4, 4: 5}
so that you can then iterate an index over the length of the list to obtain the sorted positions in order:
[mapping[i] for i in range(len(a))]
which returns:
[4, 3, 2, 1, 5, 0]
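You could also use NumPy: with a converted to an array, numpy.argsort(numpy.argsort(-a)) returns the ranks (-a because argsort sorts in ascending order; a single argsort returns the indices that would sort the array, not each element's rank). It could have better performance for large arrays (though there's no official analysis that I know of).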
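To illustrate the duplicate issue this answer describes, here is a small comparison of value-keyed ranks against index-keyed ranks. The input list is a made-up example containing a duplicate, and the variable names are illustrative only.

```python
from operator import itemgetter

a = [3, 5, 5, 1]  # hypothetical input with a duplicate value

# Value-keyed ranks: the later duplicate overwrites the earlier one,
# so both 5s end up sharing a single rank.
by_value = {val: idx for idx, val in enumerate(sorted(a, reverse=True))}
ranks_by_value = [by_value[val] for val in a]

# Index-keyed ranks: each position in `a` gets its own rank.
order = sorted(enumerate(a), key=itemgetter(1), reverse=True)
by_index = {orig_idx: rank for rank, (orig_idx, _) in enumerate(order)}
ranks_by_index = [by_index[i] for i in range(len(a))]

print(ranks_by_value)
print(ranks_by_index)
```

Whether shared ranks for equal values are a problem depends on the use case; the index-keyed version simply guarantees that the output is a permutation of range(len(a)).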
One problem with your solution is the repeated use of index, which makes your final comprehension O(n**2), as index has to scan the sorted list each time.
It would be more efficient to build a dict with the rank of each value in the sorted list:
a = [3,5,6,25,-3,100]
ranks = {val:idx for idx, val in enumerate(sorted(a, reverse=True))}
# {100: 0, 25: 1, 6: 2, 5: 3, 3: 4, -3: 5}
out = [ranks[val] for val in a]
print(out)
# [4, 3, 2, 1, 5, 0]
This keeps the final step at O(n).
First, zip the list a with range(len(a)) to create a list of tuples (each element with its position) and sort this list in reverse order. Zip the result with range(len(a)) to mark the position of each element after the sort, then unsort it (by sorting on the original position of each element), and finally grab the sorted position of each element:
>>> a = [3,5,6,25,-3,100]
>>> [i for _,i in sorted(zip(sorted(zip(a, range(len(a))), reverse=True), range(len(a))), key=lambda t:t[0][1])]
[4, 3, 2, 1, 5, 0]

Converting a list of "pairs" into a dictionary of dictionaries?

This question was previously asked here with an egregious typo: Counting "unique pairs" of numbers into a python dictionary?
This is an algorithmic problem, and I don't know of the most efficient solution. My idea would be to somehow cache values in a list and enumerate pairs...but that would be so slow. I'm guessing there's something useful from itertools.
Let's say I have a list of integers in which there are never repeats:
list1 = [2, 3]
In this case, there is a unique pair 2-3 and 3-2, so the dictionary should be:
{2:{3: 1}, 3:{2: 1}}
That is, there is 1 pair of 2-3 and 1 pair of 3-2.
For larger lists, the pairing is the same, e.g.
list2 = [2, 3, 4]
has the dictionary
{2:{3:1, 4:1}, 3:{2:1, 4:1}, 4:{3:1, 2:1}}
(1) Once the size of the lists becomes far larger, how would one algorithmically find the "unique pairs" in this format using python data structures?
(2) I mentioned that the lists cannot have repeat integers, e.g.
[2, 2, 3]
is impossible, as there are two 2s.
However, one may have a list of lists:
list3 = [[2, 3], [2, 3, 4]]
whereby the dictionary must be
{2:{3:2, 4:1}, 3:{2:2, 4:1}, 4:{2:1, 3:1}}
as there are two pairs of 2-3 and 3-2. How would one "update" the dictionary given multiple lists within a list?
EDIT: My ultimate use case is, I want to iterate through hundreds of lists of integers, and create a single dictionary with the "counts" of pairs. Does this make sense? There might be another data structure which is more useful.
For the nested list example, you can do the following, making use of itertools.permutations and dict.setdefault:
from itertools import permutations
list3 = [[2, 3], [2, 3, 4]]
d = {}
for l in list3:
    for a, b in permutations(l, 2):
        d[a][b] = d.setdefault(a, {}).setdefault(b, 0) + 1
# {2: {3: 2, 4: 1}, 3: {2: 2, 4: 1}, 4: {2: 1, 3: 1}}
For flat lists l, use only the inner loop and omit the outer one.
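Since the stated end goal is a running count of pairs over many lists, the same loop can also be sketched with a defaultdict of Counter objects, which removes the chained setdefault calls. This is an alternative to the code above rather than part of the original answer; pair_counts and as_dicts are illustrative names.

```python
from collections import Counter, defaultdict
from itertools import permutations

list3 = [[2, 3], [2, 3, 4]]

# Outer key -> Counter of how often each partner appears with it.
pair_counts = defaultdict(Counter)
for group in list3:
    for a, b in permutations(group, 2):  # ordered pairs, so 2-3 and 3-2 both count
        pair_counts[a][b] += 1

# Convert to plain nested dicts for display or comparison.
as_dicts = {k: dict(v) for k, v in pair_counts.items()}
print(as_dicts)
```

A Counter returns 0 for missing keys, so pair_counts[2][99] is safe to read, although the outer defaultdict will create an empty Counter for any outer key that is looked up.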
For this example I'll just use a list with straight numbers and no nested list:
values = [3, 2, 4]
result = dict.fromkeys(values)
for key in result:
    value = {}
    for num in values:
        if num != key:
            value[num] = 1
    result[key] = value  # store the nested dict under its key
This creates a dict with each number as a key. Then, for each key, the value becomes a nested dict whose contents are num: 1 for every number in the original values list other than the key itself.
Use defaultdict with permutations:
from collections import defaultdict
from itertools import permutations
d = defaultdict(dict)
for i in [x for x in permutations([4,2,3])]:
    d[i[0]] = {k: 1 for k in i[1:]}
output is
In [22]: d
Out[22]: defaultdict(dict, {2: {3: 1, 4: 1}, 4: {2: 1, 3: 1}, 3: {2: 1, 4: 1}})
For a nested list of lists, see https://stackoverflow.com/a/52206554/8060120

How can I deal with duplicate values when creating a dictionary in Python?

I am a beginner in python, and I would like to create a simple program that assigns each element in list1 to its respective element in list2 using the zip function.
list1=[1,2,3,4,1]
list2=[2,3,4,5,6]
dictionary=dict(zip(list1,list2))
print(dictionary)
However, because I have a duplicate value in list1, the dictionary displays the following results:
{1: 6, 2: 3, 3: 4, 4: 5}
Because 1 is a duplicate value in list1, only 1:6 is displayed and not 1:2 as well. How can I change my code such that the duplicate value is taken into account and is displayed in its respective order?
{1: 2, 2: 3, 3: 4, 4: 5, 1: 6}
Thank you
What you ask is not possible with a Python dict, since that goes against the definition of a dict--keys must be unique. The comments explain why that is the case.
However, there are multiple other ways that may be useful to you that can achieve almost the same effect. One simple, yet not terribly useful, way is:
almost_dictionary = list(zip(list1, list2))
This gives the result
[(1, 2), (2, 3), (3, 4), (4, 5), (1, 6)]
which sometimes can be used like a dictionary. However, this probably is not what you want. Better is a dict or defaultdict that, for each key, holds a list of all the values connected with that key. The defaultdict is easier to use, though harder to set up:
from collections import defaultdict
dictionary = defaultdict(list)
for k, v in zip(list1, list2):
    dictionary[k].append(v)
print(dictionary)
This gives the result
defaultdict(<class 'list'>, {1: [2, 6], 2: [3], 3: [4], 4: [5]})
and you see that each key has each value--the values are just in lists. The value of dictionary[1] is [2, 6], so you have both values to work with.
Which method you choose depends on your purpose for the dictionary.
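For the plain dict variant mentioned above, a minimal sketch could use dict.setdefault to create each list on first use. The name grouped is illustrative.

```python
list1 = [1, 2, 3, 4, 1]
list2 = [2, 3, 4, 5, 6]

# Collect every value paired with a key into a list under that key.
grouped = {}
for k, v in zip(list1, list2):
    grouped.setdefault(k, []).append(v)

print(grouped)
```

Unlike a defaultdict, looking up a key that was never added here raises KeyError instead of silently creating an empty list, which may or may not be what you want.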

Flatten a dictionary of dictionaries (2 levels deep) of lists

I'm trying to wrap my brain around this but it's not flexible enough.
In my Python script I have a dictionary of dictionaries of lists. (Actually it gets a little deeper but that level is not involved in this question.) I want to flatten all this into one long list, throwing away all the dictionary keys.
Thus I want to transform
{1: {'a': [1, 2, 3], 'b': [0]},
2: {'c': [4, 5, 1], 'd': [3, 8]}}
to
[1, 2, 3, 0, 4, 5, 1, 3, 8]
I could probably set up a map-reduce to iterate over items of the outer dictionary to build a sublist from each subdictionary and then concatenate all the sublists together.
But that seems inefficient for large data sets, because of the intermediate data structures (sublists) that will get thrown away. Is there a way to do it in one pass?
Barring that, I would be happy to accept a two-level implementation that works... my map-reduce is rusty!
Update:
For those who are interested, below is the code I ended up using.
Note that although I asked above for a list as output, what I really needed was a sorted list; i.e. the output of the flattening could be any iterable that can be sorted.
def genSessions(d):
    """Given the ipDict, return an iterator that provides all the sessions,
    one by one, converted to tuples."""
    for uaDict in d.itervalues():
        for sessions in uaDict.itervalues():
            for session in sessions:
                yield tuple(session)
...
# Flatten dict of dicts of lists of sessions into a list of sessions.
# Sort that list by start time
sessionsByStartTime = sorted(genSessions(ipDict), key=operator.itemgetter(0))
# Then make another copy sorted by end time.
sessionsByEndTime = sorted(sessionsByStartTime, key=operator.itemgetter(1))
Thanks again to all who helped.
[Update: replaced nthGetter() with operator.itemgetter(), thanks to @intuited.]
I hope you realize that any order you see in a dict is accidental -- it's there only because, when shown on screen, some order has to be picked, but there's absolutely no guarantee.
Net of ordering issues among the various sublists getting catenated,
[x for d in thedict.itervalues()
   for alist in d.itervalues()
   for x in alist]
does what you want without any inefficiency or intermediate lists.
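The comprehension above targets Python 2, where itervalues() exists. On Python 3 the same idea might look like the sketch below, using values() and, for a lazy variant, itertools.chain.from_iterable. The helper name iter_leaf_values is hypothetical.

```python
from itertools import chain

thedict = {1: {'a': [1, 2, 3], 'b': [0]},
           2: {'c': [4, 5, 1], 'd': [3, 8]}}

# Eager version: one pass, no intermediate sublists are built.
flat_list = [x
             for inner in thedict.values()
             for alist in inner.values()
             for x in alist]

def iter_leaf_values(d):
    """Lazily yield every item of every list two levels down in `d`."""
    return chain.from_iterable(
        alist for inner in d.values() for alist in inner.values()
    )

print(flat_list)
print(sorted(iter_leaf_values(thedict)))
```

The lazy variant fits the use case in the question's update, where the result is passed straight to sorted() and a list is never needed on its own.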
edit: re-read the original question and reworked answer to assume that all non-dictionaries are lists to be flattened.
In cases where you're not sure how far down the dictionaries go, you would want to use a recursive function. @Arrieta has already posted a function that recursively builds a list of non-dictionary values.
This one is a generator that yields successive non-dictionary values in the dictionary tree:
def flatten(d):
    """Recursively flatten dictionary values in `d`.
    >>> hat = {'cat': ['images/cat-in-the-hat.png'],
    ...        'fish': {'colours': {'red': [0xFF0000], 'blue': [0x0000FF]},
    ...                 'numbers': {'one': [1], 'two': [2]}},
    ...        'food': {'eggs': {'green': [0x00FF00]},
    ...                 'ham': ['lean', 'medium', 'fat']}}
    >>> set_of_values = set(flatten(hat))
    >>> sorted(set_of_values)
    [1, 2, 255, 65280, 16711680, 'fat', 'images/cat-in-the-hat.png', 'lean', 'medium']
    """
    try:
        for v in d.itervalues():
            for nested_v in flatten(v):
                yield nested_v
    except AttributeError:
        for list_v in d:
            yield list_v
The doctest passes the resulting iterator to the set function. This is likely to be what you want, since, as Mr. Martelli points out, there's no intrinsic order to the values of a dictionary, and therefore no reason to keep track of the order in which they were found.
You may want to keep track of the number of occurrences of each value; this information will be lost if you pass the iterator to set. If you want to track that, just pass the result of flatten(hat) to some other function instead of set. Under Python 2.7, that other function could be collections.Counter. For compatibility with less-evolved pythons, you can write your own function or (with some loss of efficiency) combine sorted with itertools.groupby.
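As a rough Python 3 sketch of the counting idea, the generator can be fed directly to collections.Counter. Note that this version swaps the try/except AttributeError check above for an explicit isinstance test and uses yield from; the sample data is a shortened, made-up stand-in for the hat dictionary, and it assumes every non-dict value is a list.

```python
from collections import Counter

def flatten(d):
    """Yield every leaf item found under the nested dict `d`."""
    for v in d.values():
        if isinstance(v, dict):
            yield from flatten(v)  # descend into nested dicts
        else:
            yield from v           # assumed: any non-dict value is a list

# Hypothetical sample data with one repeated value.
hat = {'cat': ['a.png'],
       'fish': {'numbers': {'one': [1], 'two': [2, 1]}}}

counts = Counter(flatten(hat))
print(counts)
```

Counter keeps the number of occurrences that set would discard, so counts[1] here reports that the value 1 was seen twice.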
A recursive function may work:
def flat(d, out=[]):
    for val in d.values():
        if isinstance(val, dict):
            flat(val, out)  # recurse into the nested dict, not d itself
        else:
            out += val
If you try it with :
>>> d = {1: {'a': [1, 2, 3], 'b': [0]}, 2: {'c': [4, 5, 6], 'd': [3, 8]}}
>>> out = []
>>> flat(d, out)
>>> print out
[1, 2, 3, 0, 4, 5, 6, 3, 8]
Notice that dictionaries have no order, so the list is in arbitrary order.
You can also return out (at the end of the loop) and call the function without a list argument.
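def flat(d, out=None):
    if out is None:
        out = []  # fresh list per top-level call; a mutable default would be shared between calls
    for val in d.values():
        if isinstance(val, dict):
            flat(val, out)  # recurse into the nested dict, not d itself
        else:
            out += val
    return out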
call as:
my_list = flat(d)
