Python map() dictionary values

I'm trying to use map() on the dict_values object returned by the values() method of a dictionary. However, I can't seem to map() over a dict_values:
map(print, h.values())
Out[31]: <builtins.map at 0x1ce1290>
I'm sure there's an easy way to do this. What I'm actually trying to do is create a set() of all the Counter keys in a dictionary of Counters, doing something like this:
# counters is a dict with Counters as values
whole_set = set()
map(lambda x: whole_set.update(set(x)), counters.values())
Is there a better way to do this in Python?

In Python 3, map returns an iterator, not a list. You still have to iterate over it, either by calling list on it explicitly or by putting it in a for loop. But you shouldn't use map this way anyway. map is really for collecting return values into an iterable or sequence. Since neither print nor set.update returns anything but None, using map in this case isn't idiomatic.
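A quick demonstration of map's laziness (with a throwaway dict):

```python
h = {'a': 1, 'b': 2}

m = map(str, h.values())
print(m)        # just a map object; nothing has been computed yet
print(list(m))  # consuming the iterator produces the results: ['1', '2']
```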
Your goal is to put all the keys in all the counters in counters into a single set. One way to do that is to use a nested generator expression:
s = set(key for counter in counters.values() for key in counter)
There's also the lovely set comprehension syntax, which is available in Python 2.7 and higher (thanks Lattyware!) and looks just like a dict comprehension, but builds a set:
s = {key for counter in counters.values() for key in counter}
These are both roughly equivalent to the following:
s = set()
for counter in counters.values():
    for key in counter:
        s.add(key)
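For instance, with a small hypothetical counters dict:

```python
from collections import Counter

# Hypothetical data: two Counters sharing the key 'b'.
counters = {'doc1': Counter('abb'), 'doc2': Counter('bcc')}

s = {key for counter in counters.values() for key in counter}
print(s == {'a', 'b', 'c'})  # True
```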

You want the set-union of all the values of counters? I.e.,
counters[1].union(counters[2]).union(...).union(counters[n])
That's just functools.reduce:
import functools
s = functools.reduce(set.union, counters.values())
If counters.values() aren't already sets (e.g., if they're lists), then you should turn them into sets first. You can do it with a dict comprehension, which is a little clunky:
>>> counters = {1: [1, 2, 3], 2: [4], 3: [5, 6]}
>>> counters = {k: set(v) for k, v in counters.items()}
>>> counters
{1: {1, 2, 3}, 2: {4}, 3: {5, 6}}
or of course you can do it inline, since you don't care about counters.keys():
>>> counters = {1:[1,2,3], 2:[4], 3:[5,6]}
>>> functools.reduce(set.union, [set(v) for v in counters.values()])
{1, 2, 3, 4, 5, 6}
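Alternatively, since set.union accepts any iterables as arguments, you can sidestep both reduce and the per-value conversion by unpacking the values into an empty set's union() - a small sketch:

```python
counters = {1: [1, 2, 3], 2: [4], 3: [5, 6]}

# set().union(*iterables) converts each argument as it goes.
s = set().union(*counters.values())
print(s == {1, 2, 3, 4, 5, 6})  # True
```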

Related

Return the first item of a dictionary as a dictionary in Python?

Let's assume we have a dictionary like this:
>>> d={'a': 4, 'b': 2, 'c': 1.5}
If I want to select the first item of d, I can simply run the following:
>>> first_item = list(d.items())[0]
>>> first_item
('a', 4)
However, I am trying to have first_item return a dict instead of a tuple i.e., {'a': 4}. Thanks for any tips.
Use itertools.islice to avoid creating the entire list; materializing it all is unnecessarily wasteful. Here's a helper function:
from itertools import islice
def pluck(mapping, pos):
    return dict(islice(mapping.items(), pos, pos + 1))
Note, the above will return an empty dictionary if pos is out of bounds, but you can check that inside pluck and handle that case however you want (IMO it should probably raise an error).
>>> pluck(d, 0)
{'a': 4}
>>> pluck(d, 1)
{'b': 2}
>>> pluck(d, 2)
{'c': 1.5}
>>> pluck(d, 3)
{}
>>> pluck(d, 4)
{}
Note, accessing an element by position in a dict requires traversing the dict. If you need to do this more often, for arbitrary positions, consider using a sequence type like list which can do it in constant time. Although dict objects maintain insertion order, the API doesn't expose any way to manipulate the dict as a sequence, so you are stuck with using iteration.
A dictionary is a collection of key-value pairs. It is not really an ordered collection, though since Python 3.7 it does preserve the order in which keys were added.
Anyway, if you really want some "first" element, you can get it in this manner:
some_item = next(iter(d.items()))
You should not convert the dict into a list, because that costs O(n) time and memory and walks the whole dict when you only need one element.
Still, I'd recommend not thinking of a dictionary as having a "first" element. It has keys and values, and you may iterate over them in an order you do not control (if you did not create the dict yourself).
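If you want a one-item dict rather than a tuple, the next(iter(...)) result can simply be wrapped in dict() - a sketch using the question's data:

```python
d = {'a': 4, 'b': 2, 'c': 1.5}

# dict() accepts an iterable of key-value pairs.
first_item = dict([next(iter(d.items()))])
print(first_item)  # {'a': 4}
```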

How to make nested for loop more Pythonic

I have to create a list of blocked users per key. Each user has multiple attributes and if any of these attributes are in keys, the user is blocked.
I wrote the following nested for-loop and it works, but I want to write it in a more Pythonic way: fewer lines and more readable. How can I do that?
for key in keys:
    key.blocked_users = []

for user in get_users():
    for attribute in user.attributes:
        for key in keys:
            if attribute.name == key.name:
                key.blocked_users.append(user)
In your specific case, where the inner for loops rely on the outer loop variables, I'd leave the code just as is. You don't make code more pythonic or readable by forcefully reducing the number of lines.
If those nested loops were intuitively written, they are probably easy to read.
If you have nested for loops with "independent" loop variables, you can use itertools.product however. Here's a demo:
>>> from itertools import product
>>> a = [1, 2]
>>> b = [3, 4]
>>> c = [5]
>>> for x in product(a, b, c): x
...
(1, 3, 5)
(1, 4, 5)
(2, 3, 5)
(2, 4, 5)
You could use a conditional comprehension in your first for-loop:
for key in keys:
    keyname = key.name
    key.blocked_users = [user for user in get_users()
                         if any(attribute.name == keyname
                                for attribute in user.attributes)]
Aside from making it shorter, you could try to reduce the operations to functions that are optimized in Python. It may not be shorter, but it could be faster - and what's more Pythonic than speed? :)
For example, you iterate over the keys for each attribute of each user. That just screams to be optimized away. You could instead collect the key names in a dictionary (for the lookup) and a set (for the intersection with attribute names) once:
for key in keys:
    key.blocked_users = []

keyname_map = {key.name: key.blocked_users for key in keys}  # map the key name to its blocked_users list
keynames = set(keyname_map)
The set(keyname_map) is a very efficient operation so it doesn't matter much that you keep two collections around.
And then use set.intersection to get the keynames that match an attribute name:
for user in get_users():
    for key in keynames.intersection({attribute.name for attribute in user.attributes}):
        keyname_map[key].append(user)
set.intersection is pretty fast too.
However, this approach requires that your attribute.names and key.names are hashable.
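Here is the whole approach as a self-contained sketch, using SimpleNamespace stand-ins for the question's key/user/attribute objects (all names and data here are hypothetical):

```python
from types import SimpleNamespace

# Hypothetical stand-ins for the question's objects.
keys = [SimpleNamespace(name='k1', blocked_users=[]),
        SimpleNamespace(name='k2', blocked_users=[])]
users = [SimpleNamespace(id=1, attributes=[SimpleNamespace(name='k1')]),
         SimpleNamespace(id=2, attributes=[SimpleNamespace(name='k1'),
                                           SimpleNamespace(name='k2')])]

# Build the lookup structures once.
keyname_map = {key.name: key.blocked_users for key in keys}
keynames = set(keyname_map)

# Appending via keyname_map mutates the corresponding key's list in place.
for user in users:
    for name in keynames.intersection({a.name for a in user.attributes}):
        keyname_map[name].append(user)

print([u.id for u in keys[0].blocked_users])  # [1, 2]
print([u.id for u in keys[1].blocked_users])  # [2]
```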
Try using nested for loops in a list comprehension, if that's considered more Pythonic, something like:
[key.blocked_users.append(user) for key in keys
                                for user in get_users()
                                for attribute in user.attributes
                                if attribute.name == key.name]
(Each for clause must come after the clauses that define the names it uses, so the user loop has to precede the attribute loop.)

How to keep a unique bag of dicts?

I use a set when I need to keep a reference list of values which I want to keep unique (and later on, check if something is in that set). This does not work with dict because it is not hashable.
There are quite a few techniques to "uniquify" a list of dict but all of them assume that I have a final list, which I want to reduce to unique elements.
How to do that in a dynamic way? For a set I would just .add() an element and would know that it will be added only if it is unique. Is such a mechanism (EDIT: ideally, but not necessarily, built-in) available for a bag of dicts? (I use the word "bag" because I do not want to limit possible answers to any data container.)
You can use frozendict, a third-party package (pip install frozendict) that provides an immutable implementation of a regular dict.
This approach should allow you to use frozen dicts inside a set.
>>> from frozendict import frozendict
>>> x = [frozendict({'a':2, 'b':3}),frozendict({'b':3, 'a':2})]
>>> set(x)
{<frozendict {'b': 3, 'a': 2}>}
>>> frozendict({'b': 3, 'a': 2}) in set(x)
True
>>> frozendict({'b': 4, 'a': 2}) in set(x)
False
>>> frozendict({'a': 2, 'b': 3}) in set(x)
True
The set classes are implemented using dictionaries. Accordingly, the requirements for set elements are the same as those for dictionary keys; namely, that the element defines both __eq__() and __hash__(). As a result, sets cannot contain mutable elements such as lists or dictionaries. However, they can contain immutable collections such as tuples or instances of ImmutableSet.
So if you only want to use built-ins, you could convert your dictionaries to a tuple of tuples upon entering them into a set, and convert them back to dictionaries when you want to use them. Sort the items first so that two equal dicts always produce the same tuple, regardless of insertion order:
dict_set = set()
dict_set.add(tuple(sorted(a_dict.items())))
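A round-trip sketch (assuming the dict keys are sortable): sorted item-tuples go into the set, and dict() converts them back on the way out:

```python
dict_set = set()

# The first two dicts are equal, just built in different key order.
for d in ({'a': 2, 'b': 3}, {'b': 3, 'a': 2}, {'a': 1}):
    dict_set.add(tuple(sorted(d.items())))

print(len(dict_set))  # 2
restored = [dict(t) for t in dict_set]
print({'a': 2, 'b': 3} in restored)  # True
```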
For a set I would just .add() and element and would know that it will be added only if it is unique.
Instead of add(), use update() or |= with the dictionary's items. That will meet your goal of adding dynamically and incrementally while "knowing that it will be added only if it is unique.":
>>> d = dict(raymond='red')
>>> e = dict(raymond='blue')
>>> f = dict(raymond='red')
>>> s = set()
>>> s |= d.items()
>>> s |= e.items()
>>> s |= f.items()
>>> s
{('raymond', 'blue'), ('raymond', 'red')}
(The element order in a set is arbitrary; the duplicate from f was absorbed.)

Appending object to a dictionary by key is replicating over the whole dictionary [duplicate]

This question already has answers here: How do I initialize a dictionary of empty lists in Python? (7 answers). Closed 2 years ago.
I came across this behavior that surprised me in Python 2.6 and 3.2:
>>> xs = dict.fromkeys(range(2), [])
>>> xs
{0: [], 1: []}
>>> xs[0].append(1)
>>> xs
{0: [1], 1: [1]}
However, dict comprehensions in 3.2 show a more polite demeanor:
>>> xs = {i:[] for i in range(2)}
>>> xs
{0: [], 1: []}
>>> xs[0].append(1)
>>> xs
{0: [1], 1: []}
>>>
Why does fromkeys behave like that?
Your Python 2.6 example is equivalent to the following, which may help to clarify:
>>> a = []
>>> xs = dict.fromkeys(range(2), a)
Each entry in the resulting dictionary will have a reference to the same object. The effects of mutating that object will be visible through every dict entry, as you've seen, because it's one object.
>>> xs[0] is a and xs[1] is a
True
Use a dict comprehension, or if you're stuck on Python 2.6 or older and you don't have dictionary comprehensions, you can get the dict comprehension behavior by using dict() with a generator expression:
xs = dict((i, []) for i in range(2))
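Note that fromkeys itself is perfectly safe when the shared value is immutable - a quick sketch:

```python
xs = dict.fromkeys(range(2), 0)  # ints are immutable, so sharing is harmless

xs[0] += 1   # rebinds xs[0] to a new int; xs[1] is untouched
print(xs)    # {0: 1, 1: 0}
```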
In the first version, you use the same empty list object as the value for both keys, so if you change one, you change the other, too.
Look at this:
>>> empty = []
>>> d = dict.fromkeys(range(2), empty)
>>> d
{0: [], 1: []}
>>> empty.append(1) # same as d[0].append(1) because d[0] references empty!
>>> d
{0: [1], 1: [1]}
In the second version, a new empty list object is created in every iteration of the dict comprehension, so both are independent from each other.
As to "why" fromkeys() works like that - well, it would be surprising if it didn't work like that. fromkeys(iterable, value) constructs a new dict with keys from iterable that all have the value value. If that value is a mutable object, and you change that object, what else could you reasonably expect to happen?
To answer the actual question being asked: fromkeys behaves like that because there is no other reasonable choice. It is not reasonable (or even possible) to have fromkeys decide whether or not your argument is mutable and make new copies every time. In some cases it doesn't make sense, and in others it's just impossible.
The second argument you pass in is therefore just a reference, and is copied as such. An assignment of [] in Python means "a single reference to a new list", not "make a new list every time I access this variable". The alternative would be to pass in a function that generates new instances, which is the functionality that dict comprehensions supply for you.
Here are some options for creating multiple actual copies of a mutable container:
As you mention in the question, dict comprehensions evaluate an arbitrary expression for each element:
d = {k: [] for k in range(2)}
The important thing here is that the [] expression is evaluated anew on each iteration, so each key gets its own freshly created list as its value.
Use the form of the dict constructor suggested by #Andrew Clark:
d = dict((k, []) for k in range(2))
This creates a generator expression that, when consumed by dict(), likewise builds a new list for each key-value pair.
Use a collections.defaultdict instead of a regular dict:
d = collections.defaultdict(list)
This option is a little different from the others. Instead of creating the new list references up front, defaultdict will call list every time you access a key that's not already there. You can therefore add the keys as lazily as you want, which can be very convenient sometimes:
for k in range(2):
    d[k].append(42)
Since you've set up the factory for new elements, this will actually behave exactly as you expected fromkeys to behave in the original question.
Use dict.setdefault when you access potentially new keys. This does something similar to what defaultdict does, but it has the advantage of being more controlled, in the sense that only the accesses you intend to create new keys actually create them:
d = {}
for k in range(2):
    d.setdefault(k, []).append(42)
The disadvantage is that a new empty list object gets created every time you call setdefault, even if it never gets used because the key already exists. This is not a huge problem, but it could add up if you call it frequently and/or your container is not as simple as list.

How to count co-ocurrences with collections.Counter() in python?

I learned about the collections.Counter() class recently and, as it's a neat (and fast??) way to count stuff, I started using it.
But I recently found a bug in my program: when I try to update the count with a tuple, Counter treats the tuple as a sequence and updates the count for each item in it, instead of counting how many times I inserted that particular tuple.
For example, if you run:
import collections
counter = collections.Counter()
counter.update(('user1', 'loggedin'))
counter.update(('user2', 'compiled'))
counter.update(('user1', 'compiled'))
print counter
You'll get:
Counter({'compiled': 2, 'user1': 2, 'loggedin': 1, 'user2': 1})
as a result. Is there a way to count tuples with the Counter()? I could concatenate the strings but this is... ugly. Could I use named tuples? Implement my own very simple dictionary counter? Don't know what's best.
Sure: you simply have to add one level of indirection, namely pass .update a container with the tuple as an element.
>>> import collections
>>> counter = collections.Counter()
>>> counter.update((('user1', 'loggedin'),))
>>> counter.update((('user2', 'compiled'),))
>>> counter.update((('user1', 'compiled'),))
>>> counter.update((('user1', 'compiled'),))
>>> counter
Counter({('user1', 'compiled'): 2, ('user1', 'loggedin'): 1, ('user2', 'compiled'): 1})
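Two equivalent alternatives (a sketch with the question's event tuples): index assignment avoids the extra nesting, and Counter can consume a whole list of tuples in its constructor:

```python
from collections import Counter

events = [('user1', 'loggedin'), ('user2', 'compiled'),
          ('user1', 'compiled'), ('user1', 'compiled')]

c1 = Counter()
for event in events:
    c1[event] += 1      # counts the tuple itself, not its items

c2 = Counter(events)    # same result in one call
print(c1 == c2)                   # True
print(c2[('user1', 'compiled')])  # 2
```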
