Need help in understanding list comprehension expression - python

I was looking for a memory-efficient way to get the unique numbers from a list and came across the expression below:
used = set()
mylist = [1,2,3,4,3,2,3,4,5,3,2,1]
[x for x in mylist if x not in used and (used.add(x) or True)]
This works to get the unique numbers, but I did not understand exactly how. Below is my understanding:
x for x in mylist # Iterating through a list
if x not in used # If statement saying if X not in empty set as defined above
and (used.add(x) or True) # No idea what it is saying

In (used.add(x) or True), used.add(x) executes and returns None, and then None or True will return True. Basically it is a work around to achieve set.add() in a list comprehension when the item in mylist is being iterated the first time.
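The two facts this answer relies on can be checked in isolation. The snippet below is only a minimal sketch; the names add_result and or_result are made up for illustration.

```python
used = set()

# set.add() mutates the set in place and returns None.
add_result = used.add(1)
print(add_result)   # None
print(used)         # {1}

# `None or True` falls through to the right-hand operand.
or_result = add_result or True
print(or_result)    # True
```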

This comprehension is equivalent to this loop:
used = set()
mylist = [1,2,3,4,3,2,3,4,5,3,2,1]
result = []
for x in mylist:
    if x not in used:
        used.add(x)
        result.append(x)
In the comprehension, the x not in used part does the actual filtering, and the second part is meant to add each x to the used set as the comprehension progresses (the or True is a hack to prevent it from affecting the filter condition).
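To check the claimed equivalence, both versions can be wrapped in functions and compared on the list from the question. The function names here are hypothetical and chosen only for this sketch.

```python
def unique_with_loop(items):
    """Explicit-loop version: keep the first occurrence of each item."""
    used = set()
    result = []
    for x in items:
        if x not in used:
            used.add(x)
            result.append(x)
    return result


def unique_with_comprehension(items):
    """Comprehension version using the `or True` trick from the question."""
    used = set()
    return [x for x in items if x not in used and (used.add(x) or True)]


mylist = [1, 2, 3, 4, 3, 2, 3, 4, 5, 3, 2, 1]
loop_result = unique_with_loop(mylist)
comp_result = unique_with_comprehension(mylist)
print(loop_result)                  # [1, 2, 3, 4, 5]
print(loop_result == comp_result)   # True
```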

x not in used
This expression returns True if it's the first time you're seeing this value. So it becomes:
True and (used.add(x) or True)
Since the left hand side of and is True, it continues evaluating the right hand side. This executes used.add(x), to add x to the used set. Since .add doesn't return anything, the or True ensures that this expression results in a truthy value. So the entire if condition results in:
True and (None or True)
Which is True, so it's if True, so this x is being kept in the list comprehension.
Conversely, if this is not the first time you're seeing the value, the expression boils down as:
x not in used and (used.add(x) or True)
→ False and (used.add(x) or True)
→ False
So, the add is not executed and the entire expression results in False, so this x is excluded from the list comprehension.
TBH, this is a rather obscure way of doing it, as demonstrated by the existence of this very question.
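The short-circuit behaviour described above can be made visible by logging when the right-hand side of the and actually runs. This is a rough sketch: tracked_add is a hypothetical stand-in for used.add, introduced only to record its calls.

```python
used = set()
calls = []  # every value for which the right-hand side actually ran


def tracked_add(x):
    """Hypothetical wrapper around used.add(x) that logs each call."""
    calls.append(x)
    return used.add(x)  # still returns None, like set.add


kept = [x for x in [1, 2, 1, 2, 3] if x not in used and (tracked_add(x) or True)]
print(kept)    # [1, 2, 3]
print(calls)   # [1, 2, 3]
```

The repeated 1 and 2 never reach tracked_add, because x not in used is already False for them.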

The if part is what you need to understand ...you've almost got it.
The first part you had right - if (x not in used). That will be True (and so add x to the comprehension list) if x isn't in used.
At the start of the comp, used is empty, so nothing is in it, so any x will say true.
But after that, you want to add x to the used set, since you've now seen it and don't want (x not in used) to be true for it again.
So the second part, (used.add(x) or True), adds x to the used set. It needs the 'or True' because add(x) always returns None, which is falsy; without it the whole condition would be false and no x would ever be kept.

The idea is just that used.add(x) needs to be executed "somehow". It adds the current number to used and always returns None. Therefore, (used.add(x) or True) is always True, which does not affect the filtering in the list comprehension. That said, simply calling list(OrderedSet(mylist)) with from orderedset import OrderedSet does the same thing. Calling functions without return values in comprehensions should be avoided.
Edit: see https://pypi.org/project/orderedset/

This comprehension is a very silly and time-consuming way of writing:
mylist = [1,2,3,4,3,2,3,4,5,3,2,1]
list(set(mylist))
which will give you the same elements, although a set does not guarantee that they come back in their original order.
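If first-seen order matters, one order-preserving alternative is dict.fromkeys. This is a suggestion added here, not something taken from the answer above, and it assumes Python 3.7 or later, where dicts keep insertion order. The sample list words is made up for the sketch.

```python
words = ["pear", "apple", "pear", "fig", "apple"]

# dict keys are unique and, from Python 3.7 on, keep insertion order.
ordered_unique = list(dict.fromkeys(words))
print(ordered_unique)   # ['pear', 'apple', 'fig']

# set() yields the same elements, but its iteration order is not guaranteed.
same_elements = set(words) == set(ordered_unique)
print(same_elements)    # True
```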

Related

How can I directly obtain the item from this list comprehension?

Here's part of the code I'm working on in Python:
def ee(mid_circ = False):
    initial_layout = [[1,0] if mid_circ == True else [1,2,3,0]]
    return initial_layout
However, the output is either [[1,0]] or [[1,2,3,0]]. Is there a way I can directly obtain [1,0] or [1,2,3,0] from the code?
That's not a list comprehension. You are simply constructing a list consisting of a single item (which also happens to be a list).
You should simply remove the outer brackets, i.e.
initial_layout = [1,0] if mid_circ else [1,2,3,0]
Note that I've also dropped the redundant == True check.
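Put back into the function from the question, the suggested fix might look like the sketch below.

```python
def ee(mid_circ=False):
    # The conditional expression picks one list; no extra outer brackets.
    initial_layout = [1, 0] if mid_circ else [1, 2, 3, 0]
    return initial_layout


print(ee())       # [1, 2, 3, 0]
print(ee(True))   # [1, 0]
```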

Python "all" function with conditional generator expression returning True. Why?

Can anyone help me understand why the following Python script returns True?
x = ''
y = all(i == ' ' for i in x)
print(y)
I imagine it's something to do with x being a zero-length entity, but cannot fully comprehend.
all() always returns True unless there is an element in the sequence that is False.
Your loop produces 0 items, so True is returned.
This is documented:
Return True if all elements of the iterable are true (or if the iterable is empty).
Emphasis mine.
Similarly, any() will always return False, unless an element in the sequence is True, so for empty sequences, any() returns the default:
>>> any(True for _ in '')
False
As the documentation states, what all does is:
Return True if all elements of the iterable are true (or if the iterable is empty).
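A small sketch of the empty and non-empty cases side by side may help; the variable names are chosen here only for illustration.

```python
x = ''

# The generator yields nothing, so all() finds no false element.
empty_all = all(i == ' ' for i in x)
# any() finds no true element either.
empty_any = any(i == ' ' for i in x)
print(empty_all, empty_any)       # True False

# With at least one character, the comparison is actually evaluated.
only_spaces = all(i == ' ' for i in '  ')
mixed = all(i == ' ' for i in ' a')
print(only_spaces, mixed)         # True False
```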

See if all items in a list = certain string

How would I check whether every item in a list is a given string, e.g. 'hello'?
x = ['hello', 'hello', 'hello']
# evaluates to True
x = ['hello', '1']
# evaluates to False
Use the all() function to test if a condition holds True for all elements:
all(el == 'hello' for el in x)
The all() function takes an iterable (something that produces results one by one) and will only return True itself if all those elements are true. The moment it finds anything that is false, it'll return False and not look further.
Here the iterable is a generator expression, one that executes an equality test for each element in the input sequence. The fact that all() stops iterating early if a false value is encountered makes this test very efficient if the test in the contained generator expression is False for any element early on.
Note that if x is empty, then all() returns True as well as it won't find any elements that are false in an empty sequence. You could test for the sequence being non-empty first:
if x and all(el == 'hello' for el in x):
to work around that.
This ought to work:
# First check to make sure 'x' isn't empty, then use the 'all' built-in
if x and all(y=='hello' for y in x):
Nice thing about the all built-in is that it stops on the first item it finds that doesn't meet the condition. This means it is quite efficient with large lists.
Also, if all of the items in the list are strings, then you can use the lower method of a string to match things like 'HellO', 'hELLO', etc.
if x and all(y.lower()=='hello' for y in x):
Yet another way to do what you want (all is the most idiomatic way to do that, as all other answers note), useful in case if you need to check more than once:
item = 'hello'
s = set(x)  # build the set once, then reuse it for repeated checks
cond = (len(s) == 1) and (item in s)
It helps to avoid O(n) traversal every time you want to check the condition.
Using filter and len is easy.
x = ['hello', 'hello', 'hello']
s = 'hello'
print(len(list(filter(lambda i: i == s, x))) == len(x))
You can do that using set:
set(x) == {'hello'}
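The all() and set approaches above can be compared on the question's examples plus the empty list. This is only a sketch; all_hello and set_hello are hypothetical helper names. Note that the set version needs hashable items.

```python
def all_hello(x):
    """all() with an explicit non-empty guard."""
    return bool(x) and all(el == 'hello' for el in x)


def set_hello(x):
    """Set comparison; an empty list gives set(), which is not {'hello'}."""
    return set(x) == {'hello'}


for sample in (['hello', 'hello', 'hello'], ['hello', '1'], []):
    print(sample, all_hello(sample), set_hello(sample))
# ['hello', 'hello', 'hello'] True True
# ['hello', '1'] False False
# [] False False
```

The set comparison treats the empty list as False without any extra guard.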

all builtin function of empty list

Can anybody explain why the Python builtin function all returns True in the case of all([])?
In [33]: all([])
Out[33]: True
In [34]: all([0])
Out[34]: False
In [35]: __builtins__.all([])
Out[35]: True
I'm not convinced that any of the other answers have really addressed the question of why this should be the case.
The definition for Python's all() comes from boolean logic. If, for example, we say that "all swans are white", then a single black swan disproves the statement. However, if we say that "all unicorns are pink", logicians would take that as a true statement simply because there are no non-pink unicorns. In other words, a universal claim about an empty collection is vacuously true.
Practically, it gives us a useful invariant. If all(A) and all(B) are both true, then all(A + B) is also true. If all({}) were false, we would have a less useful situation, because combining two expressions, one of which is false, would suddenly give an unexpected true result.
So Python takes all([]) == True from boolean logic, and for consistency with other languages with a similar construct.
Taking that back into Python, in many cases the vacuous truth makes algorithms simpler. For example, if we have a tree and want to validate all of the nodes we might say a node is valid if it meets some conditions and all of its children are valid. With the alternative definition of all() this becomes more complex as we have to say it is valid if it meets the conditions and either has no children or all its children are valid.
class Node:
    def isValid(self):
        # isValid must be called on each child; a bare child.isValid is always truthy
        return some_condition(self) and all(child.isValid() for child in self.children)
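A runnable version of the tree idea might look like the sketch below. The constructor, the is_valid name, and the "value is non-negative" condition are all assumptions added for illustration; the original leaves some_condition unspecified.

```python
class Node:
    def __init__(self, value, children=None):
        self.value = value
        self.children = children or []

    def is_valid(self):
        # Hypothetical condition: a node is valid when its value is non-negative.
        # A leaf has no children, so all(...) over them is True.
        return self.value >= 0 and all(child.is_valid() for child in self.children)


good_tree = Node(1, [Node(2), Node(3, [Node(4)])])
bad_tree = Node(1, [Node(2), Node(3, [Node(-4)])])
print(good_tree.is_valid())   # True
print(bad_tree.is_valid())    # False
```

Because all() of an empty sequence is True, leaves need no special case.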
From the docs:
Return True if all elements of the iterable are true (or if the iterable is empty).
So, roughly, it's simply defined this way.
You can get around that by using
items = []  # renamed from list to avoid shadowing the builtin
if items and all(items):
    pass
As the docs say, all is equivalent to:
def all(iterable):
    for element in iterable:
        if not element:
            return False
    return True
For an empty iterable the loop body is never executed, so True is immediately returned.
Another explanation for this is that all and any are generalisations of the binary operators and and or to an arbitrary number of operands. Thus, all and any can be defined as:
from functools import reduce  # reduce is not a builtin on Python 3
def all(xs):
    return reduce(lambda x,y: x and y, xs, True)
def any(xs):
    return reduce(lambda x,y: x or y, xs, False)
The True and False parameters show that all([]) == True and any([]) == False.
Any expression with all can be rewritten by any and vice versa:
not all(iterable)
# is the same as:
any(not x for x in iterable)
and symmetrically
not any(iterable)
# is the same as:
all(not x for x in iterable)
These rules require that all([]) == True.
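The two rewrite rules can be spot-checked over a few sample inputs, including the empty list. The samples list is made up for this sketch, and a handful of cases is of course not a proof.

```python
samples = [[], [True], [False], [True, False], [0, 1, 2], ['', 'a']]

for iterable in samples:
    # not all(...) agrees with any(not x ...)
    assert (not all(iterable)) == any(not x for x in iterable)
    # not any(...) agrees with all(not x ...)
    assert (not any(iterable)) == all(not x for x in iterable)

checked = len(samples)
print("both identities hold for", checked, "samples, including []")
```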
The function all is very useful for readable asserts:
assert all(required_condition(x) for x in some_results_being_verified)
(It is not so bad if a task has no results, but something is very broken if any result is incorrect.)

How to do "if-for" statement in python?

With python, I would like to run a test over an entire list, and, if all the statements are true for each item in the list, take a certain action.
Pseudo-code: If "test involving x" is true for every x in "list", then do "this".
It seems like there should be a simple way to do this.
What syntax should I use in python?
Use all(). It takes an iterable as an argument and returns True if all entries evaluate to True. Example:
if all((3, True, "abc")):
    print("Yes!")
You will probably need some kind of generator expression, like
if all(x > 3 for x in lst):
    do_stuff()
>>> x = [True, False, True, False]
>>> all(x)
False
all() returns True if all the elements in the list are True
Similarly, any() will return True if any element is true.
Example (test all elements are greater than 0)
if all(x > 0 for x in list_of_xs):
    do_something()
The above originally used a list comprehension (if all([x > 0 for x in list_of_xs]):). As pointed out by delnan (thanks), a generator expression is faster because all() stops consuming it at the first False, whereas the list comprehension applies the comparison to every element of the list before all() runs.
However, be careful with generator expression like:
all(x > 0 for x in list_of_xs)
If you are using pylab (launch ipython as 'ipython -pylab'), the all function is replaced with numpy.all which doesn't process generator expressions properly.
all([x>0 for x in [3,-1,5]]) ## False
numpy.all([x>0 for x in [3,-1,5]]) ## False
all(x>0 for x in [3,-1,5]) ## False
numpy.all(x>0 for x in [3,-1,5]) ## True
I believe you want the all() function:
$ python
>>> help(all)
Help on built-in function all in module __builtin__:
all(...)
all(iterable) -> bool
Return True if bool(x) is True for all values x in the iterable.
from functools import reduce  # reduce is not a builtin on Python 3
if reduce(lambda x, y: x and involve(y), yourlist, True):
    certain_action()
involve is the test you want to apply to each element in the list, yourlist is your original list, and certain_action is the action you want to perform if all the tests are true.
all() alone doesn't work well if you need an extra map() phase.
See below:
all(x==0 for x in range(1000))
and:
all([x==0 for x in range(1000)])
The second example performs all 1000 comparisons, even though the second comparison already makes the whole result false.
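The difference between the generator expression and the list comprehension can be measured by counting how often the comparison runs. In this sketch, is_zero and the calls list are hypothetical helpers added only to do the counting.

```python
calls = []  # one entry per comparison actually performed


def is_zero(x):
    """The x == 0 test, wrapped so each call can be recorded."""
    calls.append(x)
    return x == 0


# Generator expression: all() stops pulling values at the first False.
gen_result = all(is_zero(x) for x in range(1000))
gen_calls = len(calls)

calls.clear()

# List comprehension: the whole list is built before all() sees it.
list_result = all([is_zero(x) for x in range(1000)])
list_calls = len(calls)

print(gen_result, gen_calls)     # False 2
print(list_result, list_calls)   # False 1000
```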
