I have to create a list of blocked users per key. Each user has multiple attributes and if any of these attributes are in keys, the user is blocked.
I wrote the following nested for-loop and it works for me, but I want to write it in a more pythonic way with fewer lines and more readable fashion. How can I do that?
for key in keys:
key.blocked_users = []
for user in get_users():
for attribute in user.attributes:
for key in keys:
if attribute.name == key.name:
key.blocked_users.append(user)
In your specific case, where the inner for loops rely on the outer loop variables, I'd leave the code just as is. You don't make code more pythonic or readable by forcefully reducing the number of lines.
If those nested loops were intuitively written, they are probably easy to read.
If you have nested for loops with "independent" loop variables, you can use itertools.product however. Here's a demo:
>>> from itertools import product
>>> a = [1, 2]
>>> b = [3, 4]
>>> c = [5]
>>> for x in product(a, b, c): x
...
(1, 3, 5)
(1, 4, 5)
(2, 3, 5)
(2, 4, 5)
You could use a conditional comprehension in your first for-loop:
for key in keys:
keyname = key.name
key.blocked_users = [user for user in get_users() if any(attribute.name == keyname for attribute in user)]
Aside from making it shorter, you could try to reduce the operations to functions that are optimized in Python. It may not be shorter but it could be faster then - and what's more pythonic than speed?. :)
For example you iterate over the keys for each attribute of each user. That just sreams to be optimized "away". For example you could collect the key-names in a dictionary (for the lookup) and a set (for the intersection with attribute names) once:
for key in keys:
key.blocked_users = []
keyname_map = {key.name: key.blocked_users for key in keys} # map the key name to blocked_user list
keynames = set(keyname_map)
The set(keyname_map) is a very efficient operation so it doesn't matter much that you keep two collections around.
And then use set.intersection to get the keynames that match an attribute name:
for user in get_users():
for key in keynames.intersection({attribute.name for attribute in user.attributes}):
keyname_map[key].append(user)
set.intersection is pretty fast too.
However, this approach requires that your attribute.names and key.names are hashable.
Try using listed for loop in list comprehension, if that's considered more Pythonic, something like:
[key.blocked_users.append(user) for key in keys
for attribute in user.attributes
for user in get_users()
if attribute.name == key.name]
Related
a = [1, 2, 3, 1, 2, 3]
b = [3, 2, 1, 3, 2, 1]
a & b should be considered equal, because they have exactly the same elements, only in different order.
The thing is, my actual lists will consist of objects (my class instances), not integers.
O(n): The Counter() method is best (if your objects are hashable):
def compare(s, t):
return Counter(s) == Counter(t)
O(n log n): The sorted() method is next best (if your objects are orderable):
def compare(s, t):
return sorted(s) == sorted(t)
O(n * n): If the objects are neither hashable, nor orderable, you can use equality:
def compare(s, t):
t = list(t) # make a mutable copy
try:
for elem in s:
t.remove(elem)
except ValueError:
return False
return not t
You can sort both:
sorted(a) == sorted(b)
A counting sort could also be more efficient (but it requires the object to be hashable).
>>> from collections import Counter
>>> a = [1, 2, 3, 1, 2, 3]
>>> b = [3, 2, 1, 3, 2, 1]
>>> print (Counter(a) == Counter(b))
True
If you know the items are always hashable, you can use a Counter() which is O(n)
If you know the items are always sortable, you can use sorted() which is O(n log n)
In the general case you can't rely on being able to sort, or has the elements, so you need a fallback like this, which is unfortunately O(n^2)
len(a)==len(b) and all(a.count(i)==b.count(i) for i in a)
If you have to do this in tests:
https://docs.python.org/3.5/library/unittest.html#unittest.TestCase.assertCountEqual
assertCountEqual(first, second, msg=None)
Test that sequence first contains the same elements as second, regardless of their order. When they don’t, an error message listing the differences between the sequences will be generated.
Duplicate elements are not ignored when comparing first and second. It verifies whether each element has the same count in both sequences. Equivalent to: assertEqual(Counter(list(first)), Counter(list(second))) but works with sequences of unhashable objects as well.
New in version 3.2.
or in 2.7:
https://docs.python.org/2.7/library/unittest.html#unittest.TestCase.assertItemsEqual
Outside of tests I would recommend the Counter method.
The best way to do this is by sorting the lists and comparing them. (Using Counter won't work with objects that aren't hashable.) This is straightforward for integers:
sorted(a) == sorted(b)
It gets a little trickier with arbitrary objects. If you care about object identity, i.e., whether the same objects are in both lists, you can use the id() function as the sort key.
sorted(a, key=id) == sorted(b, key==id)
(In Python 2.x you don't actually need the key= parameter, because you can compare any object to any object. The ordering is arbitrary but stable, so it works fine for this purpose; it doesn't matter what order the objects are in, only that the ordering is the same for both lists. In Python 3, though, comparing objects of different types is disallowed in many circumstances -- for example, you can't compare strings to integers -- so if you will have objects of various types, best to explicitly use the object's ID.)
If you want to compare the objects in the list by value, on the other hand, first you need to define what "value" means for the objects. Then you will need some way to provide that as a key (and for Python 3, as a consistent type). One potential way that would work for a lot of arbitrary objects is to sort by their repr(). Of course, this could waste a lot of extra time and memory building repr() strings for large lists and so on.
sorted(a, key=repr) == sorted(b, key==repr)
If the objects are all your own types, you can define __lt__() on them so that the object knows how to compare itself to others. Then you can just sort them and not worry about the key= parameter. Of course you could also define __hash__() and use Counter, which will be faster.
If the comparison is to be performed in a testing context, use assertCountEqual(a, b) (py>=3.2) and assertItemsEqual(a, b) (2.7<=py<3.2).
Works on sequences of unhashable objects too.
If the list contains items that are not hashable (such as a list of objects) you might be able to use the Counter Class and the id() function such as:
from collections import Counter
...
if Counter(map(id,a)) == Counter(map(id,b)):
print("Lists a and b contain the same objects")
Let a,b lists
def ass_equal(a,b):
try:
map(lambda x: a.pop(a.index(x)), b) # try to remove all the elements of b from a, on fail, throw exception
if len(a) == 0: # if a is empty, means that b has removed them all
return True
except:
return False # b failed to remove some items from a
No need to make them hashable or sort them.
I hope the below piece of code might work in your case :-
if ((len(a) == len(b)) and
(all(i in a for i in b))):
print 'True'
else:
print 'False'
This will ensure that all the elements in both the lists a & b are same, regardless of whether they are in same order or not.
For better understanding, refer to my answer in this question
You can write your own function to compare the lists.
Let's get two lists.
list_1=['John', 'Doe']
list_2=['Doe','Joe']
Firstly, we define an empty dictionary, count the list items and write in the dictionary.
def count_list(list_items):
empty_dict={}
for list_item in list_items:
list_item=list_item.strip()
if list_item not in empty_dict:
empty_dict[list_item]=1
else:
empty_dict[list_item]+=1
return empty_dict
After that, we'll compare both lists by using the following function.
def compare_list(list_1, list_2):
if count_list(list_1)==count_list(list_2):
return True
return False
compare_list(list_1,list_2)
from collections import defaultdict
def _list_eq(a: list, b: list) -> bool:
if len(a) != len(b):
return False
b_set = set(b)
a_map = defaultdict(lambda: 0)
b_map = defaultdict(lambda: 0)
for item1, item2 in zip(a, b):
if item1 not in b_set:
return False
a_map[item1] += 1
b_map[item2] += 1
return a_map == b_map
Sorting can be quite slow if the data is highly unordered (timsort is extra good when the items have some degree of ordering). Sorting both also requires fully iterating through both lists.
Rather than mutating a list, just allocate a set and do a left-->right membership check, keeping a count of how many of each item exist along the way:
If the two lists are not the same length you can short circuit and return False immediately.
If you hit any item in list a that isn't in list b you can return False
If you get through all items then you can compare the values of a_map and b_map to find out if they match.
This allows you to short-circuit in many cases long before you've iterated both lists.
plug in this:
def lists_equal(l1: list, l2: list) -> bool:
"""
import collections
compare = lambda x, y: collections.Counter(x) == collections.Counter(y)
ref:
- https://stackoverflow.com/questions/9623114/check-if-two-unordered-lists-are-equal
- https://stackoverflow.com/questions/7828867/how-to-efficiently-compare-two-unordered-lists-not-sets
"""
compare = lambda x, y: collections.Counter(x) == collections.Counter(y)
set_comp = set(l1) == set(l2) # removes duplicates, so returns true when not sometimes :(
multiset_comp = compare(l1, l2) # approximates multiset
return set_comp and multiset_comp #set_comp is gere in case the compare function doesn't work
Let's assume we have a dictionary like this:
>>> d={'a': 4, 'b': 2, 'c': 1.5}
If I want to select the first item of d, I can simply run the following:
>>> first_item = list(d.items())[0]
('a', 4)
However, I am trying to have first_item return a dict instead of a tuple i.e., {'a': 4}. Thanks for any tips.
Use itertools.islice to avoid creating the entire list, that is unnecessarily wasteful. Here's a helper function:
from itertools import islice
def pluck(mapping, pos):
return dict(islice(mapping.items(), pos, pos + 1))
Note, the above will return an empty dictionary if pos is out of bounds, but you can check that inside pluck and handle that case however you want (IMO it should probably raise an error).
>>> pluck(d, 0)
{'a': 4}
>>> pluck(d, 1)
{'b': 2}
>>> pluck(d, 2)
{'c': 1.5}
>>> pluck(d, 3)
{}
>>> pluck(d, 4)
{}
Note, accessing an element by position in a dict requires traversing the dict. If you need to do this more often, for arbitrary positions, consider using a sequence type like list which can do it in constant time. Although dict objects maintain insertion order, the API doesn't expose any way to manipulate the dict as a sequence, so you are stuck with using iteration.
Dictionary is a collections of key-value pairs. It is not really ordered collection, though since python 3.7 it keeps the order in which keys were added.
Anyway if you really want some "first" element you can get it in this manner:
some_item = next(iter(d.items()))
You should not convert it into a list because it will eat much (O(n)) memory and walk through whole dict as well.
Anyway I'd recommend not to think that dictionary has "first" element. It has keys and values. You can iterate over them in some unknown order (if you do not control how it is created)
a = [1, 2, 3, 1, 2, 3]
b = [3, 2, 1, 3, 2, 1]
a & b should be considered equal, because they have exactly the same elements, only in different order.
The thing is, my actual lists will consist of objects (my class instances), not integers.
O(n): The Counter() method is best (if your objects are hashable):
def compare(s, t):
return Counter(s) == Counter(t)
O(n log n): The sorted() method is next best (if your objects are orderable):
def compare(s, t):
return sorted(s) == sorted(t)
O(n * n): If the objects are neither hashable, nor orderable, you can use equality:
def compare(s, t):
t = list(t) # make a mutable copy
try:
for elem in s:
t.remove(elem)
except ValueError:
return False
return not t
You can sort both:
sorted(a) == sorted(b)
A counting sort could also be more efficient (but it requires the object to be hashable).
>>> from collections import Counter
>>> a = [1, 2, 3, 1, 2, 3]
>>> b = [3, 2, 1, 3, 2, 1]
>>> print (Counter(a) == Counter(b))
True
If you know the items are always hashable, you can use a Counter() which is O(n)
If you know the items are always sortable, you can use sorted() which is O(n log n)
In the general case you can't rely on being able to sort, or has the elements, so you need a fallback like this, which is unfortunately O(n^2)
len(a)==len(b) and all(a.count(i)==b.count(i) for i in a)
If you have to do this in tests:
https://docs.python.org/3.5/library/unittest.html#unittest.TestCase.assertCountEqual
assertCountEqual(first, second, msg=None)
Test that sequence first contains the same elements as second, regardless of their order. When they don’t, an error message listing the differences between the sequences will be generated.
Duplicate elements are not ignored when comparing first and second. It verifies whether each element has the same count in both sequences. Equivalent to: assertEqual(Counter(list(first)), Counter(list(second))) but works with sequences of unhashable objects as well.
New in version 3.2.
or in 2.7:
https://docs.python.org/2.7/library/unittest.html#unittest.TestCase.assertItemsEqual
Outside of tests I would recommend the Counter method.
The best way to do this is by sorting the lists and comparing them. (Using Counter won't work with objects that aren't hashable.) This is straightforward for integers:
sorted(a) == sorted(b)
It gets a little trickier with arbitrary objects. If you care about object identity, i.e., whether the same objects are in both lists, you can use the id() function as the sort key.
sorted(a, key=id) == sorted(b, key==id)
(In Python 2.x you don't actually need the key= parameter, because you can compare any object to any object. The ordering is arbitrary but stable, so it works fine for this purpose; it doesn't matter what order the objects are in, only that the ordering is the same for both lists. In Python 3, though, comparing objects of different types is disallowed in many circumstances -- for example, you can't compare strings to integers -- so if you will have objects of various types, best to explicitly use the object's ID.)
If you want to compare the objects in the list by value, on the other hand, first you need to define what "value" means for the objects. Then you will need some way to provide that as a key (and for Python 3, as a consistent type). One potential way that would work for a lot of arbitrary objects is to sort by their repr(). Of course, this could waste a lot of extra time and memory building repr() strings for large lists and so on.
sorted(a, key=repr) == sorted(b, key==repr)
If the objects are all your own types, you can define __lt__() on them so that the object knows how to compare itself to others. Then you can just sort them and not worry about the key= parameter. Of course you could also define __hash__() and use Counter, which will be faster.
If the comparison is to be performed in a testing context, use assertCountEqual(a, b) (py>=3.2) and assertItemsEqual(a, b) (2.7<=py<3.2).
Works on sequences of unhashable objects too.
If the list contains items that are not hashable (such as a list of objects) you might be able to use the Counter Class and the id() function such as:
from collections import Counter
...
if Counter(map(id,a)) == Counter(map(id,b)):
print("Lists a and b contain the same objects")
Let a,b lists
def ass_equal(a,b):
try:
map(lambda x: a.pop(a.index(x)), b) # try to remove all the elements of b from a, on fail, throw exception
if len(a) == 0: # if a is empty, means that b has removed them all
return True
except:
return False # b failed to remove some items from a
No need to make them hashable or sort them.
I hope the below piece of code might work in your case :-
if ((len(a) == len(b)) and
(all(i in a for i in b))):
print 'True'
else:
print 'False'
This will ensure that all the elements in both the lists a & b are same, regardless of whether they are in same order or not.
For better understanding, refer to my answer in this question
You can write your own function to compare the lists.
Let's get two lists.
list_1=['John', 'Doe']
list_2=['Doe','Joe']
Firstly, we define an empty dictionary, count the list items and write in the dictionary.
def count_list(list_items):
empty_dict={}
for list_item in list_items:
list_item=list_item.strip()
if list_item not in empty_dict:
empty_dict[list_item]=1
else:
empty_dict[list_item]+=1
return empty_dict
After that, we'll compare both lists by using the following function.
def compare_list(list_1, list_2):
if count_list(list_1)==count_list(list_2):
return True
return False
compare_list(list_1,list_2)
from collections import defaultdict
def _list_eq(a: list, b: list) -> bool:
if len(a) != len(b):
return False
b_set = set(b)
a_map = defaultdict(lambda: 0)
b_map = defaultdict(lambda: 0)
for item1, item2 in zip(a, b):
if item1 not in b_set:
return False
a_map[item1] += 1
b_map[item2] += 1
return a_map == b_map
Sorting can be quite slow if the data is highly unordered (timsort is extra good when the items have some degree of ordering). Sorting both also requires fully iterating through both lists.
Rather than mutating a list, just allocate a set and do a left-->right membership check, keeping a count of how many of each item exist along the way:
If the two lists are not the same length you can short circuit and return False immediately.
If you hit any item in list a that isn't in list b you can return False
If you get through all items then you can compare the values of a_map and b_map to find out if they match.
This allows you to short-circuit in many cases long before you've iterated both lists.
plug in this:
def lists_equal(l1: list, l2: list) -> bool:
"""
import collections
compare = lambda x, y: collections.Counter(x) == collections.Counter(y)
ref:
- https://stackoverflow.com/questions/9623114/check-if-two-unordered-lists-are-equal
- https://stackoverflow.com/questions/7828867/how-to-efficiently-compare-two-unordered-lists-not-sets
"""
compare = lambda x, y: collections.Counter(x) == collections.Counter(y)
set_comp = set(l1) == set(l2) # removes duplicates, so returns true when not sometimes :(
multiset_comp = compare(l1, l2) # approximates multiset
return set_comp and multiset_comp #set_comp is gere in case the compare function doesn't work
I use a set when I need to keep a reference list of values which I want to keep unique (and later on, check if something is in that set). This does not work with dict because it is not hashable.
There are quite a few techniques to "uniquify" a list of dict but all of them assume that I have a final list, which I want to reduce to unique elements.
How to do that in a dynamic way? For a set I would just .add() and element and would know that it will be added only if it is unique. Is such a (EDIT: ideally, but not necessarily) built-in mechanism available for a bag of dict (I use the word "bag" because I do not want to limit possible answers to any data container)
You can use a frozen dict which is an immutable implementation of a regular dict.
This approach should allow you to use frozen dicts inside a set.
>>> from frozendict import frozendict
>>> x = [frozendict({'a':2, 'b':3}),frozendict({'b':3, 'a':2})]
>>> set(x)
{<frozendict {'b': 3, 'a': 2}>}
>>> frozendict({'b': 3, 'a': 2}) in set(x)
True
>>> frozendict({'b': 4, 'a': 2}) in set(x)
False
>>> frozendict({'a': 2, 'b': 3}) in set(x)
True
The set classes are implemented using dictionaries. Accordingly, the
requirements for set elements are the same as those for dictionary
keys; namely, that the element defines both eq() and hash().
As a result, sets cannot contain mutable elements such as lists or
dictionaries. However, they can contain immutable collections such as
tuples or instances of ImmutableSet.
So if you only want to use built-ins, you could convert your dictionaries to a tuple of tuples upon entering them to a set, and converting them back to dictionaries when you want to use them.
dict_set = set()
dict_set.add(tuple(a_dict.items()))
For a set I would just .add() and element and would know that it will be added only if it is unique.
Instead of add(), use update() or |= with the dictionary's items. That will meet your goal of adding dynamically and incrementally while "knowing that it will be added only if it is unique.":
>>> d = dict(raymond='red')
>>> e = dict(raymond='blue')
>>> f = dict(raymond='red')
>>> s = set()
>>> s |= d.items()
>>> s |= e.items()
>>> s |= f.items()
>>> s
I'm trying to use map() on the dict_values object returned by the values() function on a dictionary. However, I can't seem to be able to map() over a dict_values:
map(print, h.values())
Out[31]: <builtins.map at 0x1ce1290>
I'm sure there's an easy way to do this. What I'm actually trying to do is create a set() of all the Counter keys in a dictionary of Counters, doing something like this:
# counters is a dict with Counters as values
whole_set = set()
map(lambda x: whole_set.update(set(x)), counters.values())
Is there a better way to do this in Python?
In Python 3, map returns an iterator, not a list. You still have to iterate over it, either by calling list on it explicitly, or by putting it in a for loop. But you shouldn't use map this way anyway. map is really for collecting return values into an iterable or sequence. Since neither print nor set.update returns a value, using map in this case isn't idiomatic.
Your goal is to put all the keys in all the counters in counters into a single set. One way to do that is to use a nested generator expression:
s = set(key for counter in counters.values() for key in counter)
There's also the lovely dict comprehension syntax, which is available in Python 2.7 and higher (thanks Lattyware!) and can generate sets as well as dictionaries:
s = {key for counter in counters.values() for key in counter}
These are both roughly equivalent to the following:
s = set()
for counter in counters.values():
for key in counter:
s.add(key)
You want the set-union of all the values of counters? I.e.,
counters[1].union(counters[2]).union(...).union(counters[n])
? That's just functools.reduce:
import functools
s = functools.reduce(set.union, counters.values())
If counters.values() aren't already sets (e.g., if they're lists), then you should turn them into sets first. You can do it using a dict comprehension using iteritems, which is a little clunky:
>>> counters = {1:[1,2,3], 2:[4], 3:[5,6]}
>>> counters = {k:set(v) for (k,v) in counters.iteritems()}
>>> print counters
{1: set([1, 2, 3]), 2: set([4]), 3: set([5, 6])}
or of course you can do it inline, since you don't care about counters.keys():
>>> counters = {1:[1,2,3], 2:[4], 3:[5,6]}
>>> functools.reduce(set.union, [set(v) for v in counters.values()])
set([1, 2, 3, 4, 5, 6])