To execute a generator - python

Suppose I create a generator of the following form:
e=[(lambda x:2*x)(x) for x in range(10)]
The way to execute and accumulate the results would be :
list([(lambda x:2*x)(x) for x in range(10)])
However, if I am actually performing a clean-up operation (maybe file deletion) as follows:
[(lambda x:db.delete(x.path()))(x) for x in self.candidates if x is not None]
What is the convention for executing this? A list really looks odd in this scenario, as there is no result I am interested in.

Just use a plain-old for loop.
for x in self.candidates:
    if x is not None:
        db.delete(x.path())
List comprehensions and lambdas are needless sophistication here; they just make your code less readable.
If, in a more appropriate use-case, you actually need to consume a generator you can do this by nomming it into a zero-length deque:
>>> from __future__ import print_function
>>> import collections
>>> g = (print(x) for x in 'potato')
>>> _ = collections.deque(g, maxlen=0)
p
o
t
a
t
o
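For comparison, here is a minimal Python 3 sketch of both approaches side by side. The `deleted` list and the `delete` function are hypothetical stand-ins for the `db.delete` call in the question; they exist only so that the side effect can be inspected.

```python
import collections

deleted = []  # hypothetical stand-in for the database


def delete(path):
    """Pretend to delete a path by recording it (assumption, not a real API)."""
    deleted.append(path)


candidates = ["a.txt", None, "b.txt"]

# Preferred: a plain for loop makes the side effect obvious.
for path in candidates:
    if path is not None:
        delete(path)

# Alternative: drain a generator into a zero-length deque.
# Nothing is stored; the generator is run only for its side effects.
gen = (delete(path) for path in candidates if path is not None)
collections.deque(gen, maxlen=0)

print(deleted)
```

Both variants perform the same calls; the loop simply says so more directly.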

Related

Pythonic way to cycle through purely side-effect-based comprehension

What is the most pythonic way to execute a full generator comprehension where you don't care about the return values and instead the operations are purely side-effect-based?
An example would be splitting a list based on a predicate value as discussed here. It's natural to think of writing a generator comprehension
split_me = [0, 1, 2, None, 3, '']
a, b = [], []
gen_comp = (a.append(v) if v else b.append(v) for v in split_me)
In this case the best solution I can come up with is to use any
any(gen_comp)
However, it's not immediately obvious what's happening to someone who hasn't seen this pattern. Is there a better way to cycle through that full comprehension without holding all the return values in memory?
You do so by not using a generator expression.
Just write a proper loop:
for v in split_me:
    if v:
        a.append(v)
    else:
        b.append(v)
or perhaps:
for v in split_me:
    target = a if v else b
    target.append(v)
Using a generator expression here is pointless if you are going to execute the generator immediately anyway. Why produce an object plus a sequence of None return values when all you wanted was to append values to two other lists?
Using an explicit loop is both more comprehensible for future maintainers of the code (including you) and more efficient.
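If the split is needed in more than one place, the loop can be wrapped in a small helper. This is only a sketch; the name `split_by_truthiness` is made up for illustration and is not part of any library.

```python
def split_by_truthiness(items):
    """Return (truthy, falsy) lists, preserving the original order."""
    truthy, falsy = [], []
    for item in items:
        # Pick the target list first, then append once.
        (truthy if item else falsy).append(item)
    return truthy, falsy


a, b = split_by_truthiness([0, 1, 2, None, 3, ''])
print(a)
print(b)
```

The helper returns its results instead of mutating outer variables, which keeps the side effects in one place.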
itertools has this consume recipe
import collections
from itertools import islice
def consume(iterator, n=None):
    "Advance the iterator n-steps ahead. If n is None, consume entirely."
    # Use functions that consume iterators at C speed.
    if n is None:
        # feed the entire iterator into a zero-length deque
        collections.deque(iterator, maxlen=0)
    else:
        # advance to the empty slice starting at position n
        next(islice(iterator, n, n), None)
in your case n is None, so:
collections.deque(iterator, maxlen=0)
Which is interesting, but also a lot of machinery for a simple task.
Most people would just use a for loop.
As others have said, don't use comprehensions just for side-effects.
Here's a nice way to do what you're actually trying to do using the partition() recipe from itertools:
try:  # Python 3
    from itertools import filterfalse
except ImportError:  # Python 2
    from itertools import ifilterfalse as filterfalse
    from itertools import ifilter as filter
from itertools import tee
def partition(pred, iterable):
    'Use a predicate to partition entries into false entries and true entries'
    # From itertools recipes:
    # https://docs.python.org/3/library/itertools.html#itertools-recipes
    # partition(is_odd, range(10)) --> 0 2 4 6 8 and 1 3 5 7 9
    t1, t2 = tee(iterable)
    return filterfalse(pred, t1), filter(pred, t2)
split_me = [0, 1, 2, None, 3, '']
# partition() returns the false entries first, then the true entries
falseish, trueish = partition(lambda x: x, split_me)
# You can iterate directly over trueish and falseish,
# or you can put them into lists
trueish_list = list(trueish)
falseish_list = list(falseish)
print(trueish_list)
print(falseish_list)
Output:
[1, 2, 3]
[0, None, '']
There's nothing non-pythonic about writing things on many lines and making use of if-statements:
for v in split_me:
    if v:
        a.append(v)
    else:
        b.append(v)
If you want a one-liner you could do so by putting the loop on one line anyway:
for v in split_me: a.append(v) if v else b.append(v)
If you want it in an expression (though it still beats me why you would, unless you have a value you want to get out of it), you could use a list comprehension to force the looping:
[x for x in (a.append(v) if v else b.append(v) for v in split_me) if False]
Which solution do you think best shows what you're doing? I'd say the first solution. To be pythonic you should probably consider the zen of python, especially:
Readability counts.
If the implementation is hard to explain, it's a bad idea.
Just to throw in another reason why using any() to consume a generator is a horrible idea: any() and all() are guaranteed to do short-circuit evaluation. If the generator ever yields a truthy value, any() will early-out on you and leave your generator incompletely consumed (all() does the same at the first falsy value).
This adds an extra conditional test / stop condition that you A) probably don't want, and B) may be far away from where the generator is created.
Many standard library functions return None, so you could get away with any() for a while until suddenly it's not doing what you expect, and you might stare at that code for a long time before it occurs to you if you've gotten into the habit of using any() in this way.
If you must do something like this, then the consume() recipe from the itertools documentation is really the only reasonable way to do it, I think.
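The short-circuit problem is easy to reproduce. In this sketch (Python 3), `record` is a hypothetical helper that, unlike list.append, returns the value it was given, so some of the generator's results are truthy.

```python
def record(log, value):
    """Hypothetical side-effecting function that returns a (possibly truthy) value."""
    log.append(value)
    return value


values = [0, 0, 5, 0, 7]

# any() stops at the first truthy result, so the rest never runs.
any_log = []
any(record(any_log, v) for v in values)

# A plain loop always runs the generator to the end.
loop_log = []
for _ in (record(loop_log, v) for v in values):
    pass

print(any_log)
print(loop_log)
```

Only the first three calls happen under any(); the loop performs all five.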
any is short, but is not a general solution. Something which works for any generator is the straightforward
for _ in gen_comp: pass
which is also shorter and more efficient than a generally working any method,
any(None for _ in gen_comp)
so the for loop is really the clearest and best. Its only downside is that it cannot be used in expressions.

loop for inside lambda

I need to simplify my code as much as possible: it needs to be one line of code.
I need to put a for loop inside a lambda expression, something like that:
x = lambda x: (for i in x : print i)
Just in case, if someone is looking for a similar problem...
Most solutions given here are one line and are quite readable and simple. I just wanted to add one more that does not need a lambda (I am assuming that you are trying to use lambda just for the sake of making it one line of code).
Instead, you can use a simple list comprehension.
[print(i) for i in x]
BTW, the return value will be a list of Nones.
Since a for loop is a statement (as is print, in Python 2.x), you cannot include it in a lambda expression. Instead, you can use the write method on sys.stdout along with the join method (this assumes the items are strings).
import sys
x = lambda x: sys.stdout.write("\n".join(x) + "\n")
To add on to chepner's answer, for Python 3.0 you can alternatively do:
x = lambda x: list(map(print, x))
Of course this only applies if you are able to use Python 3. It looks a bit cleaner in my opinion, although it also has a weird return value, which you're probably discarding anyway.
I'll just leave this here for reference.
anon and chepner's answers are on the right track. Python 3.x has a print function, and this is what you will need if you want to embed print within a function (and, a fortiori, lambdas).
However, you can get the print function very easily in Python 2.x by importing it from the standard library's __future__ module. Check it out:
>>> from __future__ import print_function
>>>
>>> iterable = ["a","b","c"]
>>> map(print, iterable)
a
b
c
[None, None, None]
>>>
I guess that looks kind of weird, so feel free to assign the return to _ if you would like to suppress [None, None, None]'s output (you are interested in the side-effects only, I assume):
>>> _ = map(print, iterable)
a
b
c
>>>
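If, like me, you just want to print a sequence within a lambda without getting the return value (a list of None):
from __future__ import print_function  # only needed on Python 2; must be the first statement in the file
x = range(3)
# list() forces the lazy map object on Python 3; a non-empty list is truthy, so "and None" discards it
pra = lambda seq=x: list(map(print, seq)) and None  # pra for 'print all'
pra()
pra('abc')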
A lambda is nothing but an anonymous function, which means there is no need to define a function with def name():
lambda <inputs>: <expression>
[print(x) for x in a] is the for loop written on one line:
a = [1,2,3,4]
l = lambda : [print(x) for x in a]
l()
output
1
2
3
4
We can also use lambda functions inside a for loop.
See the code below:
list1 = [1,2,3,4,5]
list2 = []
for i in list1:
    f = lambda i: i / 2
    list2.append(f(i))
print(list2)
First of all, it is bad practice to write a lambda function like x = some_lambda_function. Lambda functions are fundamentally meant to be used inline; they are not meant to be stored. Writing x = some_lambda_function is equivalent to
def some_lambda_function():
    pass
Moving to the actual answer. You can map the lambda function to an iterable so something like the following snippet will serve the purpose.
a = map(lambda x : print(x),[1,2,3,4])
list(a)
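The list(a) call matters on Python 3 because map() is lazy there: nothing is printed until the map object is iterated. A small sketch makes this visible; `show` and `printed` are hypothetical names used only to observe when the calls happen.

```python
printed = []  # records which items have actually been processed


def show(item):
    """Print an item and remember that we did (illustrative helper)."""
    printed.append(item)
    print(item)


lazy = map(show, [1, 2, 3, 4])
before = list(printed)  # snapshot: map() has not called show() yet

for _ in lazy:  # iterating is what triggers the calls
    pass
after = list(printed)

print(before, after)
```

On Python 2, by contrast, map() builds the whole list eagerly, which is why the older answers here print without any extra wrapping.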
If you want to use the print function for debugging inside a reduce call, the logical or operator lets you keep the None returned by print out of the accumulator.
def test_lam():
    '''printing in lambda within reduce'''
    from functools import reduce
    lam = lambda x, y: print(x, y) or x + y
    print(reduce(lam, [1, 2, 3]))
if __name__ == '__main__':
    test_lam()
Will print out the following:
1 2
3 3
6
You can make it one-liner.
Sample
myList = [1, 2, 3]
print_list = lambda list: [print(f'Item {x}') for x in list]
print_list(myList)
otherList = [11, 12, 13]
print_list(otherList)
Output
Item 1
Item 2
Item 3
Item 11
Item 12
Item 13

How to use python generator expressions to create a oneliner to run a function multiple times and get a list output

I am wondering if there is a simple Pythonic way (maybe using generators) to run a function over each item in a list and get back a list of the return values?
Example:
def square_it(x):
    return x*x
x_set = [0,1,2,3,4]
squared_set = square_it(x for x in x_set)
I notice that when I do a line by line debug on this, the object that gets passed into the function is a generator.
Because of this, I get an error:
TypeError: unsupported operand type(s) for *: 'generator' and 'generator'
I understand that this generator expression created a generator to be passed into the function, but I am wondering if there is a cool way to accomplish running the function multiple times only by specifying an iterable as the argument? (without modifying the function to expect an iterable).
It seems to me that this ability would be really useful for cutting down on lines of code, because you would not need to create a loop to run the function and a variable to save the output in a list.
Thanks!
You want a list comprehension:
squared_set = [square_it(x) for x in x_set]
There's a builtin function, map(), for this common problem.
>>> map(square_it, x_set)
[0, 1, 4, 9, 16]  # On Python 3, map() returns a lazy iterator instead; wrap it in list() to get this list.
Alternatively, one can use a generator expression, which is memory-efficient but lazy (meaning the values will not be computed now, only when needed):
>>> (square_it(x) for x in x_set)
<generator object <genexpr> at ...>
Similarly, one can also use a list comprehension, which computes all the values upon creation, returning a list.
Additionally, here's a comparison of generator expressions and list comprehensions.
You want to call the square_it function inside the generator, not on the generator.
squared_set = (square_it(x) for x in x_set)
As the other answers have suggested, I think it is best (most "pythonic") to call your function explicitly on each element, using a list comprehension or generator expression.
To actually answer the question though, you can wrap your function that operates over scalars with a function that sniffs the input and behaves differently depending on what it sees. For example:
>>> import types
>>> def scaler_over_generator(f):
...     def wrapper(x):
...         if isinstance(x, types.GeneratorType):
...             return [f(i) for i in x]
...         return f(x)
...     return wrapper
>>> def square_it(x):
...     return x * x
>>> square_it_maybe_over = scaler_over_generator(square_it)
>>> square_it_maybe_over(10)
100
>>> square_it_maybe_over(x for x in range(5))
[0, 1, 4, 9, 16]
I wouldn't use this idiom in my code, but it is possible to do.
You could also apply it as a decorator, if you don't want or need a handle to the original function:
>>> @scaler_over_generator
... def square_it(x):
...     return x * x
>>> square_it(x for x in range(5))
[0, 1, 4, 9, 16]
Note that there is a difference between a list comprehension, which returns a list,
squared_set = [square_it(x) for x in x_set]
and a generator expression, which returns a generator that you can iterate over:
squared_set = (square_it(x) for x in x_set)
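The practical difference shows up as soon as the result is used more than once. A short Python 3 sketch, reusing square_it and x_set from the question (the other variable names are illustrative):

```python
def square_it(x):
    return x * x


x_set = [0, 1, 2, 3, 4]

# List comprehension: computed immediately, can be reused freely.
as_list = [square_it(x) for x in x_set]

# Generator expression: computed on demand, and only once.
as_gen = (square_it(x) for x in x_set)
first_pass = list(as_gen)
second_pass = list(as_gen)  # the generator is already exhausted here

# map() is the built-in equivalent; list() forces it on Python 3.
as_map = list(map(square_it, x_set))

print(as_list, first_pass, second_pass, as_map)
```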

Square braces not required in list comprehensions when used in a function

I submitted a pull request with this code:
my_sum = sum([x for x in range(10)])
One of the reviewers suggested this instead:
my_sum = sum(x for x in range(10))
(the difference is just that the square braces are missing).
I was surprised that the second form seems to be identical. But when I tried to use it in other contexts where the first one works, it fails:
y = x for x in range(10)
^ SyntaxError !!!
Are the two forms identical? Is there any important reason for why the square braces aren't necessary in the function? Or is this just something that I have to know?
This is a generator expression. To get it to work in the standalone case, use parentheses:
y = (x for x in range(10))
and y becomes a generator. You can iterate over generators, so it works where an iterable is expected, such as the sum function.
Usage examples and pitfalls:
>>> y = (x for x in range(10))
>>> y
<generator object <genexpr> at 0x0000000001E15A20>
>>> sum(y)
45
Be careful when keeping generators around: you can only go through them once. So after the above, if you try to use sum again, this will happen:
>>> sum(y)
0
So if you pass a generator where actually a list or a set or something similar is expected, you have to be careful. If the function or class stores the argument and tries to iterate over it multiple times, you will run into problems. For example consider this:
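# Python 2 (print statement, built-in reduce)
def foo(numbers):
    s = sum(numbers)
    p = reduce(lambda x,y: x*y, numbers, 1)
    print "The sum is:", s, "and the product:", p
It silently gives the wrong product if you hand it a generator, because sum() has already exhausted it: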
>>> foo(x for x in range(1, 10))
The sum is: 45 and the product: 1
You can easily get a list from the values a generator produces:
>>> y = (x for x in range(10))
>>> list(y)
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
You can use this to fix the previous example:
>>> foo(list(x for x in range(1, 10)))
The sum is: 45 and the product: 362880
However keep in mind that if you build a list from a generator, you will need to store every value. This might use a lot more memory in situations where you have lots of items.
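Another option is to make the function itself robust by materializing its argument once at the top. This is a Python 3 sketch, not the original foo: the name `sum_and_product` is made up, and it returns the two values instead of printing them.

```python
from functools import reduce
from operator import mul


def sum_and_product(numbers):
    """Return (sum, product) for any iterable, including one-shot generators."""
    numbers = list(numbers)  # materialize once so it can be iterated twice
    return sum(numbers), reduce(mul, numbers, 1)


result = sum_and_product(x for x in range(1, 10))
print(result)
```

The cost is the same as calling list() at the call site: every value is held in memory at once.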
Why use a generator in your situation?
The much lower memory consumption is the reason why sum(generator expression) is better than sum(list): the generator version only has to store a single value at a time, while the list variant has to store N values. Therefore you should prefer a generator wherever its single-pass behaviour cannot cause the problems shown above.
They are not identical.
The first form,
[x for x in l]
is a list comprehension. The other is a generator expression and written thus:
(x for x in l)
It returns a generator, not a list.
If the generator expression is the only argument in a function call, its parentheses can be skipped.
See PEP 289
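A brief Python 3 sketch of that rule; the variable names are illustrative, and compile() is used only to show the error without stopping the script.

```python
# Sole argument: the call's own parentheses are enough.
total = sum(x for x in range(10))

# With a second argument, the generator expression needs its own parentheses.
ordered = sorted((x for x in range(5)), reverse=True)

# Standalone use always needs parentheses.
gen = (x for x in range(10))

# Leaving them out next to another argument is rejected at compile time.
try:
    compile("sorted(x for x in range(5), reverse=True)", "<example>", "eval")
    rejected = False
except SyntaxError:
    rejected = True

print(total, ordered, rejected)
```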
The first one is a list comprehension, whereas the second one is a generator expression:
>>> (x for x in range(10))
<generator object at 0x01C38580>
>>> a = (x for x in range(10))
>>> sum(a)
45
>>>
Use parentheses for generators:
>>> y = (x for x in range(10))
>>> y
<generator object at 0x01C3D2D8>
>>>
Read this PEP: 289
For instance, the following summation code will build a full list of squares in memory, iterate over those values, and, when the reference is no longer needed, delete the list:
sum([x*x for x in range(10)])
Memory is conserved by using a generator expression instead:
sum(x*x for x in range(10))
As the data volumes grow larger, generator expressions tend to perform better because they do not exhaust cache memory and they allow Python to re-use objects between iterations.
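A rough way to see the difference is sys.getsizeof. This is only a sketch: getsizeof reports the size of the container object itself (not the integers it refers to), and the exact numbers vary by Python version and platform, so only the comparison is meaningful.

```python
import sys

n = 100_000

squares_list = [x * x for x in range(n)]  # holds n references at once
squares_gen = (x * x for x in range(n))   # holds only its current state

list_size = sys.getsizeof(squares_list)
gen_size = sys.getsizeof(squares_gen)

# The generator object stays small no matter how large n is.
print(gen_size < list_size)

# Both forms produce the same total.
same_total = sum(squares_list) == sum(squares_gen)
print(same_total)
```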
Using parentheses produces a generator:
>>> y = (x for x in range(10))
>>> y
<generator object <genexpr> at 0x00AC3AA8>

Python: Nested Loop

Consider this:
>>> a = [("one","two"), ("bad","good")]
>>> for i in a:
...     for x in i:
...         print x
...
one
two
bad
good
How can I write this code, but using a syntax like:
for i in a:
    print [x for x in i]
Obviously, this does not work; it prints:
['one', 'two']
['bad', 'good']
I want the same output. Can it be done?
List comprehensions and generators are only designed to be used as expressions, while printing is a statement. While you can effect what you're trying to do by doing
from __future__ import print_function
for x in a:
    [print(each) for each in x]
doing so is amazingly unpythonic, and results in the generation of a list that you don't actually need. The best thing you could do would simply be to write the nested for loops in your original example.
Given your example you could do something like this:
a = [("one","two"), ("bad","good")]
for x in sum(map(list, a), []):
    print x
This can, however, become quite slow once the list gets big.
The better way to do it would be like Tim Pietzcker suggested:
from itertools import chain
for x in chain(*a):
    print x
Using the star notation, *a, allows you to have n tuples in your list.
>>> a = [("one","two"), ("bad","good")]
>>> print "\n".join(j for i in a for j in i)
one
two
bad
good
>>> for i in a:
...     print "\n".join(i)
...
one
two
bad
good
import itertools
for item in itertools.chain(("one","two"), ("bad","good")):
    print item
will produce the desired output with just one for loop.
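When the tuples are already stored in a list, as in the question, itertools.chain.from_iterable flattens them without unpacking the list into arguments. A Python 3 sketch (the name `flat` is illustrative):

```python
from itertools import chain

a = [("one", "two"), ("bad", "good")]

# from_iterable takes the outer list itself and yields the inner items lazily.
for item in chain.from_iterable(a):
    print(item)

# Materialize only if a list is really needed.
flat = list(chain.from_iterable(a))
```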
The print function really is superior, but here is a much more pythonic suggestion inspired by Benjamin Pollack's answer:
from __future__ import print_function
for x in a:
    print(*x, sep="\n")
Simply use * to unpack the tuple x as arguments to the function, and use newline separators.
You'll need to define your own print function (or use from __future__ import print_function):
def pp(x): print x
for i in a:
    _ = [pp(x) for x in i]
Note the _ is used to indicate that the returned list is to be ignored.
This code is straightforward and simpler than other solutions here:
for i in a:
    print '\n'.join([x for x in i])
Not the best, but:
def some_function(args):
    for o in args:
        print o
for i in a:
    some_function([x for x in i])
