Find intersection of dictionary values which are lists


I have a list of dictionaries that all share the same keys, while the values differ:
[{1:[1,2,3,4,5], 2:[6,7,8], 3:[1,3,5,7,9]},
{1:[2,3,4], 2:[6,7], 3:[1,3,5]},
...]
I would like to get the intersection of the values as a dictionary under the same keys, like this:
{1:[2,3,4], 2:[6,7], 3:[1,3,5]}

Give this a try?
dicts = [{1:[1,2,3,4,5], 2:[6,7,8], 3:[1,3,5,7,9]},{1:[2,3,4], 2:[6,7], 3:[1,3,5]}]
result = { k: set(dicts[0][k]).intersection(*(d[k] for d in dicts[1:])) for k in dicts[0].keys() }
print(result)
# Output:
# {1: {2, 3, 4}, 2: {6, 7}, 3: {1, 3, 5}}
If you want lists instead of sets as the output value type, just throw a list(...) around the set intersection.
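As a rough sketch of that last suggestion, the same comprehension can be wrapped in sorted(...) so that each value comes back as an ordered list. The names dicts and result follow the snippet above; sorting is an assumption made here to get a deterministic order, and a plain list(...) would do if order does not matter.

```python
dicts = [
    {1: [1, 2, 3, 4, 5], 2: [6, 7, 8], 3: [1, 3, 5, 7, 9]},
    {1: [2, 3, 4], 2: [6, 7], 3: [1, 3, 5]},
]

# Intersect the first dict's list with the matching list of every other dict,
# then turn each resulting set into a sorted list.
result = {
    k: sorted(set(dicts[0][k]).intersection(*(d[k] for d in dicts[1:])))
    for k in dicts[0]
}
print(result)  # {1: [2, 3, 4], 2: [6, 7], 3: [1, 3, 5]}
```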

For a list of dictionaries, reduce the whole list like this:
>>> from functools import reduce
>>> d = [{1:[1,2,3,4,5], 2:[6,7,8], 3:[1,3,5,7,9]},{1:[2,3,4], 2:[6,7], 3:[1,3,5]}]
>>> reduce(lambda x, y: {k: sorted(list(set(x[k])&set(y[k]))) for k in x.keys()}, d)
{1: [2, 3, 4], 2: [6, 7], 3: [1, 3, 5]}

I would probably do something along these lines:
# Take the first dict and convert the values to `set`.
output = {k: set(v) for k, v in dictionaries[0].items()}
# For the rest of the dicts, update the set at a given key by intersecting it with each of the other lists that have the same key.
for d in dictionaries[1:]:
    for k, v in output.items():
        output[k] = v.intersection(d[k])
There are different variations on this same theme, but I find this one to be about as simple to read as it gets (and since code is read more often than it is written, I consider that a win :-)
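One such variation, sketched under the assumption that the order of the first dictionary's lists should be preserved in the result: the function name intersect_values and the sample data are hypothetical, chosen only for illustration.

```python
def intersect_values(dictionaries):
    """Intersect list values key by key, keeping the order of the first dict's lists."""
    if not dictionaries:
        return {}
    first, rest = dictionaries[0], dictionaries[1:]
    output = {}
    for k, values in first.items():
        common = set(values)
        for d in rest:
            common &= set(d[k])  # raises KeyError if a dict lacks key k
        # Keep only the surviving items, in their original order.
        output[k] = [v for v in values if v in common]
    return output


dictionaries = [
    {1: [5, 4, 3, 2, 1], 2: [6, 7, 8]},
    {1: [2, 3, 4], 2: [7, 6]},
]
print(intersect_values(dictionaries))  # {1: [4, 3, 2], 2: [6, 7]}
```

Unlike the set-based result, this keeps list values and does not sort them, which may or may not be what you want.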

Use dict.viewkeys and dict.viewitems (Python 2 only; in Python 3, dict.keys() and dict.items() return the same kind of set-like view):
In [103]: dict.viewkeys?
Docstring: D.viewkeys() -> a set-like object providing a view on D's keys
dict.viewitems?
Docstring: D.viewitems() -> a set-like object providing a view on D's items
a = [{1: [1, 2, 3, 4, 5], 2: [6, 7, 8], 3: [1, 3, 5, 7, 9]},
{1: [2, 3, 4], 2: [6, 7], 3: [1, 3, 5]}]
In [100]: keys = a[0].viewkeys() & a[1].viewkeys()  # & is set intersection; `and` would just return the second view
In [101]: {k: sorted(set(a[0][k]) & set(a[1][k])) for k in keys}
Out[101]: {1: [2, 3, 4], 2: [6, 7], 3: [1, 3, 5]}
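On Python 3 the same set-like behaviour is available through dict.keys(). The following is a minimal sketch, not the answerer's code: the extra key 4 in the second dictionary is an assumption added here to show that only shared keys survive.

```python
a = [
    {1: [1, 2, 3, 4, 5], 2: [6, 7, 8], 3: [1, 3, 5, 7, 9]},
    {1: [2, 3, 4], 2: [6, 7], 3: [1, 3, 5], 4: [0]},
]

# dict.keys() returns a set-like view in Python 3, so & gives the shared keys.
common_keys = a[0].keys() & a[1].keys()

# Intersect the two lists stored under each shared key.
result = {k: sorted(set(a[0][k]) & set(a[1][k])) for k in sorted(common_keys)}
print(result)  # {1: [2, 3, 4], 2: [6, 7], 3: [1, 3, 5]}
```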

Related

Dict from two lists including multiple values for keys

Is there a possibility to create a dict from two lists with same key occurring multiple times without iterating over the whole dataset?
Minimal example:
keys = [1, 2, 3, 2, 3, 4, 5, 1]
values = [1, 2, 3, 4, 5, 6, 7, 8]
# hoped for result:
dictionary = dict(???)
dictionary = {1 : [1,8], 2:[2,4], 3:[3,5], 4:[6], 5:[7]}
When using zip, each key-value pair overwrites the previous one for the same key:
dictionary = dict(zip(keys,values))
dictionary = {1: 8, 2: 4, 3: 5, 4: 6, 5: 7}
I would be happy with a Multidict as well.

This is one approach that doesn't require 2 for loops
from collections import defaultdict
h = defaultdict(list)
for k, v in zip(keys, values):
    h[k].append(v)
print(h)
# defaultdict(<class 'list'>, {1: [1, 8], 2: [2, 4], 3: [3, 5], 4: [6], 5: [7]})
print(dict(h))
# {1: [1, 8], 2: [2, 4], 3: [3, 5], 4: [6], 5: [7]}

This is the only one-liner I could do.
dictionary = {k: [values[i] for i in [j for j, x in enumerate(keys) if x == k]] for k in set(keys)}
It is far from readable. Remember that clear code is always better than pseudo-clever code ;)

Here is an example that I think is easy to follow logically. Unfortunately it does not use zip as you would prefer, nor does it avoid iterating, because a task like this has to involve iterating in some form.
# Your data
keys = [1, 2, 3, 2, 3, 4, 5, 1]
values = [1, 2, 3, 4, 5, 6, 7, 8]
# Make result dict
result = {}
for x in range(1, max(keys)+1):
    result[x] = []
# Populate result dict
for index, num in enumerate(keys):
    result[num].append(values[index])
# Print result
print(result)
If you know the range of values in the keys array, you could make this faster by providing the results dictionary as a literal with integer keys and empty list values.
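A minimal sketch of that pre-initialization, assuming the keys are known in advance to be the integers 1 through 5 (that range is an assumption taken from the sample data, and no speed-up is measured here):

```python
keys = [1, 2, 3, 2, 3, 4, 5, 1]
values = [1, 2, 3, 4, 5, 6, 7, 8]

# Assumption: the keys are known in advance to be the integers 1..5.
result = {k: [] for k in range(1, 6)}

# Caution: dict.fromkeys(range(1, 6), []) would NOT work here, because every
# key would share the same single list object.
for key, value in zip(keys, values):
    result[key].append(value)

print(result)  # {1: [1, 8], 2: [2, 4], 3: [3, 5], 4: [6], 5: [7]}
```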

Fastest way to generate dictionary from two lists

I have two lists. For example:
keys = [1, 2, 3, 2, 4, 2, 1]
and
values = [1, 2, 3, 4, 5, 6, 7]
I want to create a dictionary of lists out of them as shown below:
dict = {1: [1, 7], 2: [2, 4, 6], 3: [3], 4: [5]}
What is the fastest way to do it and what generates the efficiency gains, both by using any module and also by not importing any additional module?

Using collections.defaultdict
from collections import defaultdict
d_dict = defaultdict(list)
for k,v in zip(keys, values):
    d_dict[k].append(v)
dict(d_dict)
#{1: [1, 7], 2: [2, 4, 6], 3: [3], 4: [5]}

You can use the following:
d = {}
for k, v in zip(keys, values):
    d.setdefault(k, []).append(v)
print(d)
which outputs:
{1: [1, 7], 2: [2, 4, 6], 3: [3], 4: [5]}

You could use defaultdict for your case. If a key does not exist, the defaultdict calls the factory (list() in this case) and, instead of raising KeyError, stores and returns a new empty list, to which the value is then appended:
from collections import defaultdict
ld = defaultdict(list)
for k,v in zip(keys, values):
    ld[k].append(v)
print(ld)
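Since the question asks which variant is fastest, one way to find out on your own machine is to time the candidates with timeit. This is only a sketch: the function names, the enlarged input, and the run count are assumptions made here, and the timings will vary between machines and Python versions.

```python
import timeit
from collections import defaultdict

# Enlarged copy of the sample input so the loop cost dominates the timing.
keys = [1, 2, 3, 2, 4, 2, 1] * 1000
values = list(range(len(keys)))


def with_defaultdict():
    d = defaultdict(list)
    for k, v in zip(keys, values):
        d[k].append(v)
    return dict(d)


def with_setdefault():
    d = {}
    for k, v in zip(keys, values):
        d.setdefault(k, []).append(v)
    return d


# Both strategies must agree before their timings are worth comparing.
assert with_defaultdict() == with_setdefault()

for func in (with_defaultdict, with_setdefault):
    seconds = timeit.timeit(func, number=200)
    print(f"{func.__name__}: {seconds:.4f} s for 200 runs")
```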

Most pythonic way to initialize a dict

Suppose I have a dict like this
d = {
    1: [1,4,7],
    2: [2,5,8],
    0: [3,6,9]
}
It can be constructed by
d = {}
for i in range(1,10):
    key = i % 3
    if key not in d: d[key] = []
    d[key].append(i)
I used the line if key not in d: d[key] = [] to check whether the key already exists in the dict and, if not, to initialize it with an empty list.
Is there a more pythonic way to achieve this?

This is probably best handled with a defaultdict, which will automatically create any key-value mapping that is accessed if it doesn't already exist. You pass a callable to the defaultdict constructor that will be used to initialize the value. For example:
>>> from collections import defaultdict
>>> d = defaultdict(list)
>>> d
defaultdict(list, {})
>>> d[3]
[]
>>> d
defaultdict(list, {3: []})
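One consequence worth seeing in isolation is that only indexing triggers the factory. The short sketch below uses made-up keys "a" and "b" to show which operations create an entry and which do not.

```python
from collections import defaultdict

d = defaultdict(list)

d["a"].append(1)   # missing key: list() is called, then 1 is appended
print("b" in d)    # False: a membership test does not create the key
print(d.get("b"))  # None: .get() does not call the factory either
print(d["b"])      # []: plain indexing does create the entry
print(dict(d))     # {'a': [1], 'b': []}
```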

Using a comprehension:
>>> {n%3: list(range(n, n+7, 3)) for n in range(1,4)}
{0: [3, 6, 9], 1: [1, 4, 7], 2: [2, 5, 8]}
Using dict.setdefault():
>>> d = {}
>>> for i in range(1, 10):
...     d.setdefault(i%3, []).append(i)
...
>>> d
{0: [3, 6, 9], 1: [1, 4, 7], 2: [2, 5, 8]}
Using defaultdict:
>>> from collections import defaultdict
>>> d = defaultdict(list)
>>> for i in range(1, 10):
...     d[i%3].append(i)
...
>>> d
defaultdict(<class 'list'>, {0: [3, 6, 9], 1: [1, 4, 7], 2: [2, 5, 8]})

from collections import defaultdict
d = defaultdict(list)
for i in range(1,10):
    key = i % 3
    d[key].append(i)
print(d)
out:
defaultdict(<class 'list'>, {0: [3, 6, 9], 1: [1, 4, 7], 2: [2, 5, 8]})
When each key is encountered for the first time, it is not already in
the mapping; so an entry is automatically created using the
default_factory function which returns an empty list. The
list.append() operation then attaches the value to the new list. When
keys are encountered again, the look-up proceeds normally (returning
the list for that key) and the list.append() operation adds another
value to the list. This technique is simpler and faster than an
equivalent technique using dict.setdefault():
>>> d = {}
>>> for k, v in s:
...     d.setdefault(k, []).append(v)

You can use list slice notation, [start:stop:step]:
d={}
for i in range(3):
    d[i] = list(range(1,10))[(i+2)%3::3]
{0: [3, 6, 9],
1: [1, 4, 7],
2: [2, 5, 8]}

Since you haven't shown any input or variable parts, you might just initialize it with the literal you already have:
d = {1: [1,4,7],
2: [2,5,8],
0: [3,6,9]}
If you have variable input you may use collections.defaultdict with list as factory. Given that this operation is very common several external libraries have functions for this:
iteration_utilities.groupedby
toolz.groupby
For example:
>>> from iteration_utilities import groupedby
>>> groupedby(range(1, 10), lambda x: x % 3)
{0: [3, 6, 9], 1: [1, 4, 7], 2: [2, 5, 8]}
or:
>>> from toolz import groupby
>>> groupby(lambda x: x % 3, range(1, 10))
{0: [3, 6, 9], 1: [1, 4, 7], 2: [2, 5, 8]}
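If you would rather not add a dependency, a small helper built on collections.defaultdict gives similar behaviour. The name group_by and its (key, iterable) argument order are assumptions made here to mirror the toolz call above; this is a sketch, not either library's implementation.

```python
from collections import defaultdict


def group_by(key, iterable):
    """Hypothetical helper: group items of `iterable` by `key(item)`."""
    groups = defaultdict(list)
    for item in iterable:
        groups[key(item)].append(item)
    return dict(groups)


print(group_by(lambda x: x % 3, range(1, 10)))
# {1: [1, 4, 7], 2: [2, 5, 8], 0: [3, 6, 9]}
```

The keys appear in first-seen order here, so the printed order differs from the library examples even though the mapping is the same.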

Extract list of duplicate values and locations from array

Given an array a of length N, which is a list of integers, I want to extract the duplicate values, where I have a separate list for each value containing the locations of the duplicates. In pseudo-math:
If |M| > 1:
val -> M = { i | a[i] == val }
Example (N=11):
a = [0, 3, 1, 6, 8, 1, 3, 3, 2, 10, 10]
should give the following lists:
3 -> [1, 6, 7]
1 -> [2, 5]
10 -> [9, 10]
I added the python tag since I'm currently programming in that language (numpy and scipy are available), but I'm more interested in a general algorithm of how to do it. Code examples are fine, though.
One idea, which I did not yet flesh out: Construct a list of tuples, pairing each entry of a with its index: (i, a[i]). Sort the list with the second entry as key, then check consecutive entries for which the second entry is the same.
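The sort-based idea above could be sketched roughly as follows. It uses itertools.groupby to find runs of equal values after sorting; the variable names are chosen here for illustration only.

```python
from itertools import groupby
from operator import itemgetter

a = [0, 3, 1, 6, 8, 1, 3, 3, 2, 10, 10]

# Pair each value with its index, then sort by value (ties keep index order,
# since sorted() is stable and enumerate yields increasing indices).
pairs = sorted(enumerate(a), key=itemgetter(1))

duplicates = {}
for value, group in groupby(pairs, key=itemgetter(1)):
    positions = [index for index, _ in group]
    if len(positions) > 1:  # keep only values that occur more than once
        duplicates[value] = positions

print(duplicates)  # {1: [2, 5], 3: [1, 6, 7], 10: [9, 10]}
```

Sorting makes this O(N log N), whereas a dictionary-based pass over the array is O(N) on average.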

Here's an implementation using a python dictionary (actually a defaultdict, for convenience)
a = [0, 3, 1, 6, 8, 1, 3, 3, 2, 10, 10]
from collections import defaultdict
d = defaultdict(list)
for k, item in enumerate(a):
    d[item].append(k)
finalD = {key : value for key, value in d.items() if len(value) > 1} # Keep only the values that occurred more than once.
print(finalD)
# {1: [2, 5], 10: [9, 10], 3: [1, 6, 7]}

The idea is to create a dictionary mapping each value to the list of positions where it appears.
This can be done in a simple way with setdefault. This can also be done using defaultdict.
>>> a = [0, 3, 1, 6, 8, 1, 3, 3, 2, 10, 10]
>>> dup={}
>>> for i,x in enumerate(a):
...     dup.setdefault(x,[]).append(i)
...
>>> dup
{0: [0], 1: [2, 5], 2: [8], 3: [1, 6, 7], 6: [3], 8: [4], 10: [9, 10]}
Then the actual duplicates can be extracted with a dict comprehension that filters out elements appearing only once (dup.iteritems() is Python 2; use dup.items() on Python 3).
>>> {i:x for i,x in dup.iteritems() if len(x)>1}
{1: [2, 5], 10: [9, 10], 3: [1, 6, 7]}

Populate a dictionary whose keys are the values of the integers, and whose values are the lists of positions of those keys. Then go through that dictionary and remove all key/value pairs with only one position. You will be left with the ones that are duplicated.

Create python dictionary from value of another dictionary?

I am trying to efficiently construct a python dictionary from the keys' values of another dictionary.
For example...
dict1 = {'foo': [1, 3, 7], 'bar': [2, 4, 8]} ## note: all values in {key: value} will be unique
## Algorithm here...
dict2 = {1: [3, 7], 3: [1, 7], 7: [1, 3], 2: [4, 8], 4: [2, 8], 8: [2, 4]}
I can get this result through brute force methods but these dictionaries are for graphs with over 100000 nodes so I need this to be efficient.
Any help would be greatly appreciated.

Here is how I would do this:
dict2 = {k: x[:i] + x[i+1:] for x in dict1.values() for i, k in enumerate(x)}
If you are on Python 2.x you may want to use dict1.itervalues().
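For readers who find the nested comprehension dense, here is the same logic unrolled into explicit loops. It is a sketch for readability only; the loop variable names group and node are chosen here and are not from the original answer.

```python
dict1 = {'foo': [1, 3, 7], 'bar': [2, 4, 8]}

dict2 = {}
for group in dict1.values():
    for i, node in enumerate(group):
        # Everything in the group except the element at position i.
        dict2[node] = group[:i] + group[i + 1:]

print(dict2)
# {1: [3, 7], 3: [1, 7], 7: [1, 3], 2: [4, 8], 4: [2, 8], 8: [2, 4]}
```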

