#L is a very large list
A = [x/sum(L) for x in L]
When the interpreter evaluates this, how many times will sum(L) be calculated? Just once, or once for each element?
A list comprehension executes the expression for each iteration.
sum(L) is executed for each x in L. Calculate it once outside the list comprehension:
s = sum(L)
A = [x/s for x in L]
Python has no way of knowing that the outcome of sum(L) is stable, and cannot optimize the call away for you.
sum() could be rebound to a different function that returns random values. The elements in L could implement __add__ methods that produce side effects; the built-in sum() would be calling these. L itself could implement a custom __iter__ method that alters the list in-place as you iterate, affecting both the list comprehension and the sum() call. Any of those hooks could rebind sum or give x elements a __div__ method that alters sum, etc.
In other words, Python is too dynamic to accurately predict expression outcomes.
I would opt for Martijn's approach, but thought I'd point out that you can (ab)use a lambda with a default argument and a map if you wanted to retain a "one-liner", eg:
L = range(1, 10)
A = map(lambda el, total=sum(L, 0.0): el / total, L)
Related
I have a large number of arithmetic expressions that I store in a list. For example
exp_list = [exp1, exp2, ...,exp10000]
I also have indices of the few expressions I need to evaluate.
inds = [ind1,ind2,...,ind10]
exp_selected = [exp_list[i] for i in inds ]
Is there a way to avoid having to evaluate all the expressions in exp_list?
Suppose you decide to store you expressions as lambdas (to avoid them being immediately evaluated) then you could selectively evaluate them with a simple list comprehension:
exp_list = [lambda: 1+2, lambda: 3+4, lambda: 5+6, lambda: 7+8]
inds = [1, 3]
print [exp() for i, exp in enumerate(exp_list) if i in inds]
Produces:
[7, 15]
If those expressions share some pattern and can be created 'mid-air' it would be better to use generator instead of just creating the list. Especially if you don't need to remember the results, but just check if any (or all) of them are true/false.
i am refreshing my python (2.7) and i am discovering iterators and generators.
As i understood, they are an efficient way of navigating over values without consuming too much memory.
So the following code do some kind of logical indexing on a list:
removing the values of a list L that triggers a False conditional statement represented here by the function f.
I am not satisfied with my code because I feel this code is not optimal for three reasons:
I read somewhere that it is better to use a for loop than a while loop.
However, in the usual for i in range(10), i can't modify the value of 'i' because it seems that the iteration doesn't care.
Logical indexing is pretty strong in matrix-oriented languages, and there should be a way to do the same in python (by hand granted, but maybe better than my code).
Third reason is just that i want to use generator/iterator on this example to help me understand.
Third reason is just that i want to use generator/iterator on this example to help me understand.
TL;DR : Is this code a good pythonic way to do logical indexing ?
#f string -> bool
def f(s):
return 'c' in s
L=['','a','ab','abc','abcd','abcde','abde'] #example
length=len(L)
i=0
while i < length:
if not f(L[i]): #f is a conditional statement (input string output bool)
del L[i]
length-=1 #cut and push leftwise
else:
i+=1
print 'Updated list is :', L
print length
This code has a few problems, but the main one is that you must never modify a list you're iterating over. Rather, you create a new list from the elements that match your condition. This can be done simply in a for loop:
newlist = []
for item in L:
if f(item):
newlist.append(item)
which can be shortened to a simple list comprehension:
newlist = [item for item in L if f(item)]
It looks like filter() is what you're after:
newlist = filter(lambda x: not f(x), L)
filter() filters (...) an iterable and only keeps the items for which a predicate returns True. In your case f(..) is not quite the predicate but not f(...).
Simpler:
def f(s):
return 'c' not in s
newlist = filter(f, L)
See: https://docs.python.org/2/library/functions.html#filter
Never modify a list with del, pop or other methods that mutate the length of the list while iterating over it. Read this for more information.
The "pythonic" way to filter a list is to use reassignment and either a list comprehension or the built-in filter function:
List comprehension:
>>> [item for item in L if f(item)]
['abc', 'abcd', 'abcde']
i want to use generator/iterator on this example to help me understand
The for item in L part is implicitly making use of the iterator protocol. Python lists are iterable, and iter(somelist) returns an iterator .
>>> from collections import Iterable, Iterator
>>> isinstance([], Iterable)
True
>>> isinstance([], Iterator)
False
>>> isinstance(iter([]), Iterator)
True
__iter__ is not only being called when using a traditional for-loop, but also when you use a list comprehension:
>>> class mylist(list):
... def __iter__(self):
... print('iter has been called')
... return super(mylist, self).__iter__()
...
>>> m = mylist([1,2,3])
>>> [x for x in m]
iter has been called
[1, 2, 3]
Filtering:
>>> filter(f, L)
['abc', 'abcd', 'abcde']
In Python3, use list(filter(f, L)) to get a list.
Of course, to filter a list, Python needs to iterate over it, too:
>>> filter(None, mylist())
iter has been called
[]
"The python way" to do it would be to use a generator expression:
# list comprehension
L = [l for l in L if f(l)]
# alternative generator comprehension
L = (l for l in L if f(l))
It depends on your context if a list or a generator is "better" (see e.g. this so question). Because your source data is coming from a list, there is no real benefit of using a generator here.
For simply deleting elements, especially if the original list is no longer needed, just iterate backwards:
Python 2.x:
for i in xrange(len(L) - 1, -1, -1):
if not f(L[i]):
del L[i]
Python 3.x:
for i in range(len(L) - 1, -1, -1):
if not f(L[i]):
del L[i]
By iterating from the end, the "next" index does not change after deletion and a for loop is possible. Note that you should use the xrange generator in Python 2, or the range generator in Python 3, to save memory*.
In cases where you must iterate forward, use your given solution above.
*Note that Python 2's xrange will break if there are >= 2 ** 32 - 1 elements. Python 3's range, as well as the less efficient Python 2's range do not have this limitation.
The following python tutorial says that:
List comprehension is a complete substitute for the lambda function as well as the functions map(), filter() and reduce().
http://python-course.eu/python3_list_comprehension.php
However, it does not mention an example how a list comprehension can substitute a reduce() and I can't think of an example how it should be possible.
Can please someone explain how to achieve a reduce-like functionality with list comprehension or confirm that it isn't possible?
Ideally, list comprehension is to create a new list. Quoting official documentation,
List comprehensions provide a concise way to create lists. Common applications are to make new lists where each element is the result of some operations applied to each member of another sequence or iterable, or to create a subsequence of those elements that satisfy a certain condition.
whereas reduce is used to reduce an iterable to a single value. Quoting functools.reduce,
Apply function of two arguments cumulatively to the items of sequence, from left to right, so as to reduce the sequence to a single value.
So, list comprehension cannot be used as a drop-in replacement for reduce.
I was surprised at first to find that Guido van Rossum, creator of Python, was against reduce. His reasoning was that beyond summing, multiplying, and-ing, and or-ing, using reduce yields an unreadable solution that is better suited by a function which iterates through and updates an accumulator. His article on the matter is here. So no, there isn't a list comprehension alternative to reduce, instead the "pythonic" way is to implement an accumulating function the old fashioned way:
Instead of:
out = reduce((lambda x,y: x*y),[1,2,3])
Use:
def prod(myList):
out = 1
for el in myList:
out *= el
return out
Of course nothing stops you from continuing to use reduce (python 2) or functools.reduce (python 3)
List comprehensions are supposed to return lists. If your reduce is supposed to return a list, then yes, you can replace it with a list comprehension.
But this is no obstacle to providing "reduce-like functionality". Python lists can contain any object. If you'll accept your result contained in a single-item list, then there is a [...][0] list comprehension form that can replace any reduce() whatsoever.
This should be obvious, but that form is
[x for x in [reduce(function, sequence, initial)]][0]
for some binary function and and some iterable sequence and some initial value. Or, if you want the initial from the first of the iterable,
[x for x in [reduce(function, sequence)]][0]
Arguably, the above is cheating, and also pointless, since you could just use reduce without the comprehension. So let's try it without reduce.
[stack.append(function(stack.pop(), e)) or stack[0]
for stack in ([initial],)
for e in sequence][-1]
This produces a list of all the intermediate values, and we want the last one. [-1] is just as easy as [0]. We need an accumulator to reduce, but can't use assignment statements in a comprehension, hence the stack (which is just a list), but we could have used many other data structures here. The .append() always returns None, so we use or stack[0] to put the value so far in the resulting list.
It's a little more difficult without initial,
[stack.append(function(stack.pop(), e)) or stack[0]
for it in [iter(sequence)]
for stack in [[next(it)]]
for e in it][-1]
Really, you might as well use a for statement at this point.
But this takes up memory for the list of intermediate values. For a very long sequence, that might be a problem. But we can avoid that too by using generator expressions.
Doing this is tricky, so let's start with an easier example and work up to it.
stack = [initial]
[stack.append(function(stack.pop(), e)) for e in sequence]
stack.pop() # returns the answer
It computes the answer, but also creates a useless list of Nones. We can avoid that by converting it to a generator expression inside a list comprehension.
stack = [initial]
[_ for _s in (stack.append(function(stack.pop(), e)) or ()
for e in sequence)
for _ in _s]
stack.pop()
The list comprehension exhausts the generator that updates the stack, but returns an empty list itself. This is possible because the inner loop always has zero iterations, because _s is always an empty tuple.
We can move the stack.pop() inside if the last _s has one element. It doesn't matter what that element is though. So we chain on a [None] as the final _s.
from itertools import chain
stack = [initial]
[stack.pop()
for _s in chain((stack.append(function(stack.pop(), e)) or ()
for e in sequence),
[[None]])
for _ in _s][0]
Again, we have a single-item list comprehension. We can also implement chain as a generator expression. And you've already seen how to move the stack variable inside using a single-item list.
[stack.pop()
for stack in [[initial]]
for _s in (
x
for xs in [
(stack.append(function(stack.pop(), e)) or ()
for e in sequence),
[[None]],
]
for x in xs)
for _ in _s][0]
And we can also get the initial from the sequence for the two-argument reduce.
[stack.pop()
for it in [iter(sequence)]
for stack in [[next(it)]]
for _s in (
x
for xs in [
(stack.append(function(stack.pop(), e)) or ()
for e in it),
[[None]],
]
for x in xs)
for _ in _s][0]
This is insane. But it works. So yes, it's possible to get "reduce-like functionality" with comprehensions. That doesn't mean you should. Seven fors is too hard!
You could accomplish something like a reduce with a comprehension by using a couple of helper functions that I've named last and cofold:
>>> last(r(a+b) for a, b, r in cofold(range(10)))
45
This is functionally equivalent to
>>> reduce(lambda a, b: a+b, range(10))
45
Note that unlike reduce() the comprehension didn't use a lambda.
The trick is to use a generator with a callback to "return" the result of the operator. cofold is the corecursive dual of the reduce (or fold) function.
_sentinel = object()
def cofold(it, initial=_sentinel):
if initial is _sentinel:
it = iter(it)
accumulator = next(it)
else:
accumulator = initial
def callback(result):
nonlocal accumulator
accumulator = result
return result
for element in it:
yield accumulator, element, callback
Here's cofold in a list comprehension.
>>> [r(a+b) for a, b, r in cofold(range(10))]
[1, 3, 6, 10, 15, 21, 28, 36, 45]
The elements represent each step in the dual reduction. The last one is our answer. The last function is trivial.
def last(it):
for e in it:
pass
return e
Unlike reduce, cofold is a lazy generator, so it can safely act on infinite iterables when used in a generator expression.
>>> from itertools import islice, count
>>> lazy_results = (r(a+b) for a, b, r in cofold(count()))
>>> [*islice(lazy_results, 0, 9)]
[1, 3, 6, 10, 15, 21, 28, 36, 45]
>>> next(lazy_results)
55
>>> next(lazy_results)
66
I am wondering if there is there is a simple Pythonic way (maybe using generators) to run a function over each item in a list and result in a list of returns?
Example:
def square_it(x):
return x*x
x_set = [0,1,2,3,4]
squared_set = square_it(x for x in x_set)
I notice that when I do a line by line debug on this, the object that gets passed into the function is a generator.
Because of this, I get an error:
TypeError: unsupported operand type(s) for *: 'generator' and 'generator'
I understand that this generator expression created a generator to be passed into the function, but I am wondering if there is a cool way to accomplish running the function multiple times only by specifying an iterable as the argument? (without modifying the function to expect an iterable).
It seems to me that this ability would be really useful to cut down on lines of code because you would not need to create a loop to fun the function and a variable to save the output in a list.
Thanks!
You want a list comprehension:
squared_set = [square_it(x) for x in x_set]
There's a builtin function, map(), for this common problem.
>>> map(square_it, x_set)
[0,1,4,9,16] # On Python 3, a generator is returned.
Alternatively, one can use a generator expression, which is memory-efficient but lazy (meaning the values will not be computed now, only when needed):
>>> (square_it(x) for x in x_set)
<generator object <genexpr> at ...>
Similarly, one can also use a list comprehension, which computes all the values upon creation, returning a list.
Additionally, here's a comparison of generator expressions and list comprehensions.
You want to call the square_it function inside the generator, not on the generator.
squared_set = (square_it(x) for x in x_set)
As the other answers have suggested, I think it is best (most "pythonic") to call your function explicitly on each element, using a list or generator comprehension.
To actually answer the question though, you can wrap your function that operates over scalers with a function that sniffs the input, and has different behavior depending on what it sees. For example:
>>> import types
>>> def scaler_over_generator(f):
... def wrapper(x):
... if isinstance(x, types.GeneratorType):
... return [f(i) for i in x]
... return f(x)
... return wrapper
>>> def square_it(x):
... return x * x
>>> square_it_maybe_over = scaler_over_generator(square_it)
>>> square_it_maybe_over(10)
100
>>> square_it_maybe_over(x for x in range(5))
[0, 1, 4, 9, 16]
I wouldn't use this idiom in my code, but it is possible to do.
You could also code it up with a decorator, like so:
>>> #scaler_over_generator
... def square_it(x):
... return x * x
>>> square_it(x for x in range(5))
[0, 1, 4, 9, 16]
If you didn't want/need a handle to the original function.
Note that there is a difference between list comprehension returning a list
squared_set = [square_it(x) for x in x_set]
and returning a generator that you can iterate over it:
squared_set = (square_it(x) for x in x_set)
I'm familiar with the for loop in a block-code context. eg:
for c in "word":
print c
I just came across some examples that use for differently. Rather than beginning with the for statement, they tag it at the end of an expression (and don't involve an indented code-block). eg:
sum(x*x for x in range(10))
Can anyone point me to some documentation that outlines this use of for? I've been able to find examples, but not explanations. All the for documentation I've been able to find describes the previous use (block-code example). I'm not even sure what to call this use, so I apologize if my question's title is unclear.
What you are pointing to is Generator in Python. Take a look at: -
http://wiki.python.org/moin/Generators
http://www.python.org/dev/peps/pep-0255/
http://docs.python.org/whatsnew/2.5.html#pep-342-new-generator-features
See the documentation: - Generator Expression which contains exactly the same example you have posted
From the documentation: -
Generators are a simple and powerful tool for creating iterators. They
are written like regular functions but use the yield statement
whenever they want to return data. Each time next() is called, the
generator resumes where it left-off (it remembers all the data values
and which statement was last executed)
Generators are similar to List Comprehension that you use with square brackets instead of brackets, but they are more memory efficient. They don't return the complete list of result at the same time, but they return generator object. Whenever you invoke next() on the generator object, the generator uses yield to return the next value.
List Comprehension for the above code would look like: -
[x * x for x in range(10)]
You can also add conditions to filter out results at the end of the for.
[x * x for x in range(10) if x % 2 != 0]
This will return a list of numbers multiplied by 2 in the range 1 to 5, if the number is not divisible by 2.
An example of Generators depicting the use of yield can be: -
def city_generator():
yield("Konstanz")
yield("Zurich")
yield("Schaffhausen")
yield("Stuttgart")
>>> x = city_generator()
>>> x.next()
Konstanz
>>> x.next()
Zurich
>>> x.next()
Schaffhausen
>>> x.next()
Stuttgart
>>> x.next()
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
StopIteration
So, you see that, every call to next() executes the next yield() in generator. and at the end it throws StopIteration.
Those are generator expressions and they are related to list comprehensions
List comprehensions allow for the easy creation of lists. For example, if you wanted to create a list of perfect squares you could do this:
>>> squares = []
>>> for x in range(10):
... squares.append(x**2)
...
>>> squares
[0, 1, 4, 9, 16, 25, 36, 49, 64, 81]
But instead you could use a list comprehension:
squares = [x**2 for x in range(10)]
Generator expressions are like list comprehensions, except they return a generator object instead of a list. You can iterate over this generator object in a similar manner to list comprehensions, but you don't have to store the whole list in memory at once, as you would if you created the list in a list comprehension.
Documentation for generator expressions is here https://www.python.org/dev/peps/pep-0289/
Following is the code using generator expression .
list(x**2 for x in range(0,10))
Your specific example is called a generator expression. List comprehensions, dictionary comprehensions, and set comprehensions are similar in meaning (different result types, and generator expressions are lazy) and have the same syntax, modulo being inside other kinds of brackets, and in the case of a dict comprehension having expr1: expr2 instead of a single expression (x*x in your example).