Converting "yield from" statement to Python 2.7 code - python

I have the code below in Python 3.2 and I want to run it in Python 2.7. I did convert it (I have put the code of missing_elements in both versions), but I am not sure whether that is the most efficient way to do it. Basically, what happens if there are two yield from calls, as below, for the upper half and the lower half in the missing_elements function? Are the entries from the two halves (upper and lower) appended to each other in one list, so that the parent recursive call with the yield from can use both halves together?
def missing_elements(L, start, end):  # Python 3.2
    if end - start <= 1:
        if L[end] - L[start] > 1:
            yield from range(L[start] + 1, L[end])
        return
    index = start + (end - start) // 2
    # is the lower half consecutive?
    consecutive_low = L[index] == L[start] + (index - start)
    if not consecutive_low:
        yield from missing_elements(L, start, index)
    # is the upper part consecutive?
    consecutive_high = L[index] == L[end] - (end - index)
    if not consecutive_high:
        yield from missing_elements(L, index, end)

def main():
    L = [10, 11, 13, 14, 15, 16, 17, 18, 20]
    print(list(missing_elements(L, 0, len(L)-1)))
    L = range(10, 21)
    print(list(missing_elements(L, 0, len(L)-1)))
def missing_elements(L, start, end):  # Python 2.7
    return_list = []
    if end - start <= 1:
        if L[end] - L[start] > 1:
            return range(L[start] + 1, L[end])
    index = start + (end - start) // 2
    # is the lower half consecutive?
    consecutive_low = L[index] == L[start] + (index - start)
    if not consecutive_low:
        return_list.append(missing_elements(L, start, index))
    # is the upper part consecutive?
    consecutive_high = L[index] == L[end] - (end - index)
    if not consecutive_high:
        return_list.append(missing_elements(L, index, end))
    return return_list

If you don't use the results of your yields,* you can always turn this:
yield from foo
… into this:
for bar in foo:
    yield bar
There might be a performance cost,** but there is never a semantic difference.
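Applied to the question's missing_elements, that mechanical rewrite might look like this sketch (only the yield from lines change):

```python
def missing_elements(L, start, end):
    if end - start <= 1:
        if L[end] - L[start] > 1:
            for v in range(L[start] + 1, L[end]):
                yield v
        return
    index = start + (end - start) // 2
    if L[index] != L[start] + (index - start):   # lower half not consecutive
        for v in missing_elements(L, start, index):
            yield v
    if L[index] != L[end] - (end - index):       # upper half not consecutive
        for v in missing_elements(L, index, end):
            yield v

print(list(missing_elements([10, 11, 13, 14, 15, 16, 17, 18, 20], 0, 8)))
# [12, 19]
```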
Are the entries from the two halves (upper and lower) appended to each other in one list, so that the parent recursive call with the yield from can use both halves together?
No! The whole point of iterators and generators is that you don't build actual lists and append them together.
But the effect is similar: you just yield from one, then yield from another.
If you think of the upper half and the lower half as "lazy lists", then yes, you can think of this as a "lazy append" that creates a larger "lazy list". And if you call list on the result of the parent function, you of course will get an actual list that's equivalent to appending together the two lists you would have gotten if you'd done yield list(…) instead of yield from ….
But I think it's easier to think of it the other way around: what it does is exactly the same thing the for loops do.
If you saved the two iterators into variables, and looped over itertools.chain(upper, lower), that would be the same as looping over the first and then looping over the second, right? No difference here. In fact, you could implement chain as just:
def chain(*args):
    for arg in args:
        yield from arg
* Not the values the generator yields to its caller, but the values of the yield expressions themselves within the generator (which come from the caller using the send method), as described in PEP 342. You're not using these in your examples. And I'm willing to bet you're not in your real code. But coroutine-style code often uses the value of a yield from expression—see PEP 3156 for examples. Such code usually depends on other features of Python 3.3 generators—in particular, the new StopIteration.value from the same PEP 380 that introduced yield from—so it will have to be rewritten. If so, the PEP also shows you the complete, horridly messy equivalent, and you can of course pare down the parts you don't care about. And if you don't use the value of the expression, it pares down to the two lines above.
** Not a huge one, and there's nothing you can do about it short of using Python 3.3 or completely restructuring your code. It's exactly the same case as translating list comprehensions to Python 1.5 loops, or any other case when there's a new optimization in version X.Y and you need to use an older version.

Replace them with for-loops:
yield from range(L[start] + 1, L[end])
==>
for i in range(L[start] + 1, L[end]):
    yield i
The same for the recursive missing_elements calls:
yield from missing_elements(L, index, end)
==>
for el in missing_elements(L, index, end):
    yield el

I just came across this issue and my usage was a bit more difficult since I needed the return value of yield from:
result = yield from other_gen()
This cannot be represented as a simple for loop but can be reproduced with this:
_iter = iter(other_gen())
try:
    while True:  # broken by StopIteration
        yield next(_iter)
except StopIteration as e:
    if e.args:
        result = e.args[0]
    else:
        result = None
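A self-contained sanity check of this pattern (inner and outer are hypothetical names; in Python 2 the inner generator would raise StopIteration("done") instead of using return):

```python
def inner():
    yield 1
    yield 2
    return "done"   # Python 2 equivalent: raise StopIteration("done")

def outer():
    result = None
    _iter = iter(inner())
    try:
        while True:   # broken by StopIteration
            yield next(_iter)
    except StopIteration as e:
        result = e.args[0] if e.args else None
    yield ("result", result)

print(list(outer()))
# [1, 2, ('result', 'done')]
```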
Hopefully this will help people who come across the same problem. :)

What about using the definition from PEP 380 to construct a Python 2 syntax version?
The statement:
RESULT = yield from EXPR
is semantically equivalent to:
_i = iter(EXPR)
try:
    _y = next(_i)
except StopIteration as _e:
    _r = _e.value
else:
    while 1:
        try:
            _s = yield _y
        except GeneratorExit as _e:
            try:
                _m = _i.close
            except AttributeError:
                pass
            else:
                _m()
            raise _e
        except BaseException as _e:
            _x = sys.exc_info()
            try:
                _m = _i.throw
            except AttributeError:
                raise _e
            else:
                try:
                    _y = _m(*_x)
                except StopIteration as _e:
                    _r = _e.value
                    break
        else:
            try:
                if _s is None:
                    _y = next(_i)
                else:
                    _y = _i.send(_s)
            except StopIteration as _e:
                _r = _e.value
                break
RESULT = _r
In a generator, the statement:
return value
is semantically equivalent to
raise StopIteration(value)
except that, as currently, the exception cannot be caught by except clauses within the returning generator.
The StopIteration exception behaves as though defined thusly:
class StopIteration(Exception):
    def __init__(self, *args):
        if len(args) > 0:
            self.value = args[0]
        else:
            self.value = None
        Exception.__init__(self, *args)

I think I found a way to emulate Python 3.x yield from construct in Python 2.x. It's not efficient and it is a little hacky, but here it is:
import types

def inline_generators(fn):
    def inline(value):
        if isinstance(value, InlineGenerator):
            for x in value.wrapped:
                for y in inline(x):
                    yield y
        else:
            yield value
    def wrapped(*args, **kwargs):
        result = fn(*args, **kwargs)
        if isinstance(result, types.GeneratorType):
            result = inline(_from(result))
        return result
    return wrapped

class InlineGenerator(object):
    def __init__(self, wrapped):
        self.wrapped = wrapped

def _from(value):
    assert isinstance(value, types.GeneratorType)
    return InlineGenerator(value)
Usage:
@inline_generators
def outer(x):
    def inner_inner(x):
        for x in range(1, x + 1):
            yield x
    def inner(x):
        for x in range(1, x + 1):
            yield _from(inner_inner(x))
    for x in range(1, x + 1):
        yield _from(inner(x))

for x in outer(3):
    print x,
Produces output:
1 1 1 2 1 1 2 1 2 3
Maybe someone finds this helpful.
Known issues: Lacks support for send() and various corner cases described in PEP 380. These could be added and I will edit my entry once I get it working.

I've found using resource contexts (using the python-resources module) to be an elegant mechanism for implementing subgenerators in Python 2.7. Conveniently I'd already been using the resource contexts anyway.
If in Python 3.3 you would have:
@resources.register_func
def get_a_thing(type_of_thing):
    if type_of_thing == "A":
        yield from complicated_logic_for_handling_a()
    else:
        yield from complicated_logic_for_handling_b()

def complicated_logic_for_handling_a():
    a = expensive_setup_for_a()
    yield a
    expensive_tear_down_for_a()

def complicated_logic_for_handling_b():
    b = expensive_setup_for_b()
    yield b
    expensive_tear_down_for_b()
In Python 2.7 you would have:
@resources.register_func
def get_a_thing(type_of_thing):
    if type_of_thing == "A":
        with resources.complicated_logic_for_handling_a_ctx() as a:
            yield a
    else:
        with resources.complicated_logic_for_handling_b_ctx() as b:
            yield b

@resources.register_func
def complicated_logic_for_handling_a():
    a = expensive_setup_for_a()
    yield a
    expensive_tear_down_for_a()

@resources.register_func
def complicated_logic_for_handling_b():
    b = expensive_setup_for_b()
    yield b
    expensive_tear_down_for_b()
Note how the complicated-logic operations only require the registration as a resource.

Another solution: by using my yield-from-as-an-iterator library, you can turn any yield from foo into
for value, handle_send, handle_throw in yield_from(foo):
    try:
        handle_send((yield value))
    except:
        if not handle_throw(*sys.exc_info()):
            raise
To make sure this answer stands alone even if the PyPI package is ever lost, here is an entire copy of that library's yieldfrom.py from the 1.0.0 release:
# SPDX-License-Identifier: 0BSD
# Copyright 2022 Alexander Kozhevnikov <mentalisttraceur@gmail.com>

"""A robust implementation of ``yield from`` behavior.

Allows transpilers, backpilers, and code that needs
to be portable to minimal or old Pythons to replace

    yield from ...

with

    for value, handle_send, handle_throw in yield_from(...):
        try:
            handle_send(yield value)
        except:
            if not handle_throw(*sys.exc_info()):
                raise
"""

__version__ = '1.0.0'
__all__ = ('yield_from',)

class yield_from(object):
    """Implementation of the logic that ``yield from`` adds around ``yield``."""

    __slots__ = ('_iterator', '_next', '_default_next')

    def __init__(self, iterable):
        """Initializes the yield_from instance.

        Arguments:
            iterable: The iterable to yield from and forward to.
        """
        # Mutates:
        #     self._next: Prepares to use built-in function next in __next__
        #         for the first iteration on the iterator.
        #     self._default_next: Saves initial self._next tuple for reuse.
        self._iterator = iter(iterable)
        self._next = self._default_next = next, (self._iterator,)

    def __repr__(self):
        """Represent the yield_from instance as a string."""
        return type(self).__name__ + '(' + repr(self._iterator) + ')'

    def __iter__(self):
        """Return the yield_from instance, which is itself an iterator."""
        return self

    def __next__(self):
        """Execute the next iteration of ``yield from`` on the iterator.

        Returns:
            Any: The next value from the iterator.

        Raises:
            StopIteration: If the iterator is exhausted.
            Any: If the iterator raises an error.
        """
        # Mutates:
        #     self._next: Resets to default, in case handle_send or
        #         handle_throw changed it for this iteration.
        next_, arguments = self._next
        self._next = self._default_next
        value = next_(*arguments)
        return value, self.handle_send, self.handle_throw

    next = __next__  # Python 2 used ``next`` instead of ``__next__``

    def handle_send(self, value):
        """Handle a send method call for a yield.

        Arguments:
            value: The value sent through the yield.

        Raises:
            AttributeError: If the iterator has no send method.
        """
        # Mutates:
        #     self._next: If value is not None, prepares to use the
        #         iterator's send attribute instead of the built-in
        #         function next in the next iteration of __next__.
        if value is not None:
            self._next = self._iterator.send, (value,)

    def handle_throw(self, type, exception, traceback):
        """Handle a throw method call for a yield.

        Arguments:
            type: The type of the exception thrown through the yield.
                If this is GeneratorExit, the iterator will be closed
                by calling its close attribute if it has one.
            exception: The exception thrown through the yield.
            traceback: The traceback of the exception thrown through the yield.

        Returns:
            bool: Whether the exception will be forwarded to the iterator.
                If this is false, you should bubble up the exception.
                If this is true, the exception will be thrown into the
                iterator at the start of the next iteration, and will
                either be handled or bubble up at that time.

        Raises:
            TypeError: If type is not a class.
            GeneratorExit: Re-raised after successfully closing the iterator.
            Any: If raised by the close function on the iterator.
        """
        # Mutates:
        #     self._next: If type was not GeneratorExit and the iterator
        #         has a throw attribute, prepares to use that attribute
        #         instead of the built-in function next in the next
        #         iteration of __next__.
        iterator = self._iterator
        if issubclass(type, GeneratorExit):
            try:
                close = iterator.close
            except AttributeError:
                return False
            close()
            return False
        try:
            throw = iterator.throw
        except AttributeError:
            return False
        self._next = throw, (type, exception, traceback)
        return True
What I really like about this way is that:
The implementation is much easier to fully think through and verify for correctness than the alternatives*.
The usage is still simple, and doesn't require decorators or any other code changes anywhere other than just replacing the yield from ... line.
It still has robust forwarding of .send and .throw and handling of errors, StopIteration, and GeneratorExit.
This yield_from implementation will work on any Python 3 and on Python 2 all the way back to Python 2.5**.
* The formal specification ends up entangling all the logic into one big loop, even with some duplication thrown in. All the fully-featured backport implementations I've seen further add complication on top of that. But we can do better by embracing manually implementing the iterator protocol:
We get StopIteration handling for free from Python itself around our __next__ method.
The logic can be split into separate pieces which are entirely decoupled except for the state-saving between them, which frees you from having to de-tangle the logic by yourself - fundamentally yield from is just three simple ideas:
call a method on the iterator to get the next element,
how to handle .send (which may change the method called in step 1), and
how to handle .throw (which may change the method called in step 1).
By asking for modest boilerplate at each yield from replacement, we can avoid needing any hidden magic with special wrapper types, decorators, and so on.
** Python 2.5 is when PEP-342 made yield an expression and added GeneratorExit. Though if you are ever unfortunate enough to need to backport or "backpile" (transpile to an older version of the language) this yield_from would still do all the hard parts of building yield from on top of yield for you.
Also, this idea leaves a lot of freedom for how the usage boilerplate looks. For example,
handle_throw could be trivially refactored into a context manager, enabling usage like this:
for value, handle_send, handle_throw in yield_from(foo):
    with handle_throw:
        handle_send(yield value)
and
you could make value, handle_send, handle_throw something like a named tuple if you find this usage nicer:
for step in yield_from(foo):
    with step.handle_throw:
        step.handle_send(yield step.value)

How to get the value from "return" when using "yield" with it in Python? [duplicate]

Since Python 3.3, if a generator function returns a value, that becomes the value for the StopIteration exception that is raised. This can be collected a number of ways:
The value of a yield from expression, which implies the enclosing function is also a generator.
Wrapping a call to next() or .send() in a try/except block.
However, if I simply want to iterate over the generator in a for loop - the easiest way - there doesn't appear to be a way to collect the value of the StopIteration exception, and thus the return value. I'm using a simple example where the generator yields values and returns some kind of summary at the end (running totals, averages, timing statistics, etc.).
for i in produce_values():
    do_something(i)
values_summary = ....??
One way is to handle the loop myself:
values_iter = produce_values()
try:
    while True:
        i = next(values_iter)
        do_something(i)
except StopIteration as e:
    values_summary = e.value
But this throws away the simplicity of the for loop. I can't use yield from since that requires the calling code to be, itself, a generator. Is there a simpler way than the roll-one's-own loop shown above?
You can think of the value attribute of StopIteration (and arguably StopIteration itself) as implementation details, not designed to be used in "normal" code.
Have a look at PEP 380, which specifies the yield from feature of Python 3.3: it discusses the alternatives to using StopIteration to carry the return value that were considered.
Since you are not supposed to get the return value in an ordinary for loop, there is no syntax for it. The same way as you are not supposed to catch the StopIteration explicitly.
A nice solution for your situation would be a small utility class (might be useful enough for the standard library):
class Generator:
    def __init__(self, gen):
        self.gen = gen
    def __iter__(self):
        self.value = yield from self.gen
This wraps any generator and catches its return value to be inspected later:
>>> def test():
...     yield 1
...     return 2
...
>>> gen = Generator(test())
>>> for i in gen:
...     print(i)
...
1
>>> print(gen.value)
2
You could make a helper wrapper, that would catch the StopIteration and extract the value for you:
from functools import wraps

class ValueKeepingGenerator(object):
    def __init__(self, g):
        self.g = g
        self.value = None
    def __iter__(self):
        self.value = yield from self.g

def keep_value(f):
    @wraps(f)
    def g(*args, **kwargs):
        return ValueKeepingGenerator(f(*args, **kwargs))
    return g

@keep_value
def f():
    yield 1
    yield 2
    return "Hi"

v = f()
for x in v:
    print(x)
print(v.value)
A light-weight way to handle the return value (one that doesn't involve instantiating an auxiliary class) is to use dependency injection.
Namely, one can pass in the function to handle / act on the return value using the following wrapper / helper generator function:
def handle_return(generator, func):
    returned = yield from generator
    func(returned)
For example, the following--
def generate():
    yield 1
    yield 2
    return 3

def show_return(value):
    print('returned: {}'.format(value))

for x in handle_return(generate(), show_return):
    print(x)
results in--
1
2
returned: 3
The most obvious method I can think of for this would be a user defined type that would remember the summary for you..
>>> import random
>>> class ValueProducer:
...     def produce_values(self, n):
...         self._total = 0
...         for i in range(n):
...             r = random.randrange(n*100)
...             self._total += r
...             yield r
...         self.value_summary = self._total/n
...         return self.value_summary
...
>>> v = ValueProducer()
>>> for i in v.produce_values(3):
...     print(i)
...
25
55
179
>>> print(v.value_summary)
86.33333333333333
>>>
Another lightweight way that is sometimes appropriate is to yield the running summary at every generator step, in a tuple alongside your primary value. The loop stays simple, with an extra binding that is still available afterwards:
for i, summary in produce_values():
    do_something(i)
show_summary(summary)
This is especially useful if someone could use more than just the last summary value, e.g. updating a progress view.
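A minimal self-contained sketch of this pattern (the data and the running-average summary are made up for illustration):

```python
def produce_values(data):
    total = 0
    for n, x in enumerate(data, 1):
        total += x
        yield x, total / n   # (value, running average so far)

summary = None
for i, summary in produce_values([2, 4, 9]):
    pass                     # do_something(i) would go here
print(summary)
# 5.0
```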

Skipping iterations inside Python iterator __next__ in a manner similar to continue

I would like to make a Python iterator class using __next__ that skips certain elements.
For example, here I have implemented a generator using yield and continue and I would like the iterator class below it to do exactly the same thing.
def generator_style(my_iter):
    for i in my_iter:
        if i < 999990:
            continue
        yield i

print(", ".join(str(s) for s in generator_style(range(1000000))))
# prints 999990, 999991, 999992, 999993, 999994, 999995, 999996, 999997, 999998, 999999

class iterator_style:
    def __init__(self, my_iter):
        self.my_iter = iter(my_iter)
    def __iter__(self):
        return self
    def __next__(self):
        i = next(self.my_iter)
        if i < 999990:
            # what do I do here??? I can't use continue
            return next(self)
        return i

print(", ".join(str(s) for s in iterator_style(range(1000000))))
# I want it to print 999990, 999991, 999992, 999993, 999994, 999995, 999996, 999997, 999998, 999999
Unfortunately, this version crashes due to
RecursionError: maximum recursion depth exceeded while calling a Python object
It works for smaller numbers though.
I cannot replace the return next(self) line with continue because:
continue
^
SyntaxError: 'continue' not properly in loop
You're correct that recursion has a lot of limitations for algorithms like this. Typically you would want to avoid deep recursion to avoid RecursionErrors. In this case you need to recursively call __next__ 999990 times to get your value.
To avoid this you can use a while-loop instead. Just keep calling next until you reach a value that meets your criteria.
def __next__(self):
    while (i := next(self.my_iter)) < 999990:
        pass
    return i
Another way you could write it, if you find the assignment expression confusing:
def __next__(self):
    while True:
        i = next(self.my_iter)
        if i < 999990:
            continue
        return i
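For reference, here is a complete self-contained version of the loop-based iterator (the class name and the threshold parameter are illustrative, not from the question):

```python
class SkipIterator:
    def __init__(self, my_iter, threshold):
        self.my_iter = iter(my_iter)
        self.threshold = threshold

    def __iter__(self):
        return self

    def __next__(self):
        # loop instead of recursing; StopIteration from the underlying
        # iterator propagates naturally once it is exhausted
        while True:
            i = next(self.my_iter)
            if i >= self.threshold:
                return i

print(", ".join(str(s) for s in SkipIterator(range(1000000), 999990)))
# 999990, 999991, 999992, 999993, 999994, 999995, 999996, 999997, 999998, 999999
```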

Randomly sample between multiple generators?

I'm trying to iterate over multiple generators randomly, and skip those that are exhausted by removing them from the list of available generators. However, the CombinedGenerator doesn't call itself like it should to switch generator. Instead it throws a StopIteration when the smaller iterator is exhausted. What am I missing?
The following works:
gen1 = (i for i in range(0, 5, 1))
gen2 = (i for i in range(100, 200, 1))

list_of_gen = [gen1, gen2]
print(list_of_gen)
list_of_gen.remove(gen1)
print(list_of_gen)
list_of_gen.remove(gen2)
print(list_of_gen)
where each generator is removed by their reference.
But here it doesn't:
import random

gen1 = (i for i in range(0, 5, 1))
gen2 = (i for i in range(100, 200, 1))
total = 105

class CombinedGenerator:
    def __init__(self, generators):
        self.generators = generators
    def __call__(self):
        generator = random.choice(self.generators)
        try:
            yield next(generator)
        except StopIteration:
            self.generators.remove(generator)
            if len(self.generators) != 0:
                self.__call__()
            else:
                raise StopIteration

c = CombinedGenerator([gen1, gen2])
for i in range(total):
    print(f"iter {i}")
    print(f"yielded {next(c())}")
As @Tomerikoo mentioned, you are basically creating your own generator, and it is cleaner and more Pythonic to implement __next__.
The above code can be fixed with the lines below.
def __call__(self):
    generator = random.choice(self.generators)
    try:
        yield next(generator)
    except StopIteration:
        self.generators.remove(generator)
        if len(self.generators) != 0:
            # yield your self.__call__() result as well
            yield next(self.__call__())
        else:
            raise StopIteration
First of all, in order to fix your current code, you just need to match the pattern you created by changing the line:
self.__call__()
to:
yield next(self.__call__())
Then, I would make a few small changes to your original code:
Instead of implementing __call__ and calling the object, it seems more reasonable to implement __next__ and simply call next on the object.
Instead of choosing the generator, I would choose the index. This mainly serves for avoiding the use of remove which is not so efficient when you can directly access the deleted object.
Personally, I prefer to avoid recursion where possible, so I would use a loop to check whether any generators remain:
class CombinedGenerator:
    def __init__(self, generators):
        self.generators = generators
    def __next__(self):
        while self.generators:
            i = random.choice(range(len(self.generators)))
            try:
                return next(self.generators[i])
            except StopIteration:
                del self.generators[i]
        raise StopIteration

c = CombinedGenerator([gen1, gen2])
for i in range(total):
    print(f"iter {i}")
    print(f"yielded {next(c)}")
A nice bonus can be to add this to your class:
def __iter__(self):
    return self
Which then allows you to directly iterate on the object itself and you don't need the total variable:
for i, num in enumerate(c):
    print(f"iter {i}")
    print(f"yielded {num}")

How to intercept the first value of a generator and transparently yield from the rest

Update: I've started a thread on python-ideas to propose additional syntax or a stdlib function for this purpose (i.e. specifying the first value sent by yield from). So far 0 replies... :/
How do I intercept the first yielded value of a subgenerator but delegate the rest of the iteration to the latter using yield from?
For example, suppose we have an arbitrary bidirectional generator subgen, and we want to wrap this in another generator gen. The purpose of gen is to intercept the first yielded value of subgen and delegate the rest of the generation—including sent values, thrown exceptions, .close(), etc.—to the sub-generator.
The first thing that might come to mind could be this:
def gen():
    g = subgen()
    first = next(g)
    # do something with first...
    yield "intercepted"
    # delegate the rest
    yield from g
But this is wrong, because when the caller .sends something back to the generator after getting the first value, it will end up as the value of the yield "intercepted" expression, which is ignored, and instead g will receive None as the first .send value, as part of the semantics of yield from.
So we might think to do this:
def gen():
    g = subgen()
    first = next(g)
    # do something with first...
    received = yield "intercepted"
    g.send(received)
    # delegate the rest
    yield from g
But what we've done here is just moving the problem back by one step: as soon as we call g.send(received), the generator resumes its execution and doesn't stop until it reaches the next yield statement, whose value becomes the return value of the .send call. So we'd also have to intercept that and re-send it. And then send that, and that again, and so on... So this won't do.
Basically, what I'm asking for is a yield from with a way to customize what the first value sent to the generator is:
def gen():
    g = subgen()
    first = next(g)
    # do something with first...
    received = yield "intercepted"
    # delegate the rest
    yield from g start with received  # pseudocode; not valid Python
...but without having to re-implement all of the semantics of yield from myself. That is, the laborious and poorly maintainable solution would be:
def adaptor(generator, init_send_value=None):
    send = init_send_value
    try:
        while True:
            send = yield generator.send(send)
    except StopIteration as e:
        return e.value
which is basically a bad re-implementation of yield from (it's missing handling of throw, close, etc.). Ideally I would like something more elegant and less redundant.
If you're trying to implement this generator wrapper as a generator function using yield from, then your question basically boils down to whether it is possible to specify the first value sent to the "yielded from" generator. Which it is not.
If you look at the formal specification of the yield from expression in PEP 380, you can see why. The specification contains a (surprisingly complex) piece of sample code that behaves the same as a yield from expression. The first few lines are:
_i = iter(EXPR)
try:
    _y = next(_i)
except StopIteration as _e:
    _r = _e.value
else:
    ...
You can see that the first thing that is done to the iterator is to call next() on it, which is basically equivalent to .send(None). There is no way to skip that step and your generator will always receive another None whenever yield from is used.
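A small self-contained demonstration of that (inner and gen are hypothetical names): whatever the caller sends at the moment yield from takes over is consumed by the wrapper's own yield, and the delegated generator's first resume still receives None:

```python
def inner():
    got = yield 'first'
    yield ('inner saw', got)

def gen():
    g = inner()
    first = next(g)        # consume 'first' ourselves
    yield 'intercepted'
    # yield from primes g with the equivalent of send(None),
    # so the 42 sent by the caller below never reaches inner()
    yield from g

w = gen()
print(next(w))     # intercepted
print(w.send(42))  # ('inner saw', None) - the 42 was lost
```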
The solution I've come up with is to implement the generator protocol using a class instead of a generator function:
class Intercept:
    def __init__(self, generator):
        self._generator = generator
        self._intercepted = False

    def __next__(self):
        return self.send(None)

    def send(self, value):
        yielded_value = self._generator.send(value)
        # Intercept the first value yielded by the wrapped generator and
        # replace it with a different value.
        if not self._intercepted:
            self._intercepted = True
            print(f'Intercepted value: {yielded_value}')
            yielded_value = 'intercepted'
        return yielded_value

    def throw(self, type, *args):
        return self._generator.throw(type, *args)

    def close(self):
        self._generator.close()
__next__(), send(), throw(), close() are described in the Python Reference Manual.
The class wraps the generator passed to it when created and mimics its behavior. The only thing it changes is that the first value yielded by the generator is replaced by a different value before it is returned to the caller.
We can test the behavior with an example generator f() which yields two values and a function main() which sends values into the generator until the generator terminates:
def f():
    y = yield 'first'
    print(f'f(): {y}')
    y = yield 'second'
    print(f'f(): {y}')

def main():
    value_to_send = 0
    gen = f()
    try:
        x = gen.send(None)
        while True:
            print(f'main(): {x}')
            # Send incrementing integers to the generator.
            value_to_send += 1
            x = gen.send(value_to_send)
    except StopIteration:
        print('main(): StopIteration')

main()
When run, this example produces the following output, showing which values arrive in the generator and which are returned by the generator:
main(): first
f(): 1
main(): second
f(): 2
main(): StopIteration
Wrapping the generator f() by changing the statement gen = f() to gen = Intercept(f()), produces the following output, showing that the first yielded value has been replaced:
Intercepted value: first
main(): intercepted
f(): 1
main(): second
f(): 2
As all other calls to any of the generator API are forwarded directly to the wrapped generator, it should behave equivalently to the wrapped generator itself.
If I understand the question, I think this works. I ran this script and it did what I expected, which was to print all but the first line of the input file. As long as the generator passed as the argument to the skip_first function can be iterated over, it should work.
def skip_first(thing):
    _first = True
    for _result in thing:
        if _first:
            _first = False
            continue
        yield _result

inp = open("/var/tmp/test.txt")
for line in skip_first(inp):
    print(line, end="")
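An equivalent standard-library approach (a suggestion beyond the answer above): itertools.islice can skip the first element of any iterable lazily, without writing a generator at all:

```python
from itertools import islice

def skip_first(thing):
    # skip the first element, yield the rest lazily
    return islice(thing, 1, None)

print(list(skip_first([10, 20, 30])))
# [20, 30]
```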

SICP "streams as signals" in Python

I have found some nice examples (here, here) of implementing SICP-like streams in Python. But I am still not sure how to handle an example like the integral found in SICP 3.5.3 "Streams as signals."
The Scheme code found there is
(define (integral integrand initial-value dt)
  (define int
    (cons-stream initial-value
                 (add-streams (scale-stream integrand dt)
                              int)))
  int)
What is tricky about this one is that the returned stream int is defined in terms of itself (i.e., the stream int is used in the definition of the stream int).
I believe Python could have something similarly expressive and succinct... but not sure how. So my question is, what is an analogous stream-y construct in Python? (What I mean by a stream is the subject of 3.5 in SICP, but briefly, a construct (like a Python generator) that returns successive elements of a sequence of indefinite length, and can be combined and processed with operations such as add-streams and scale-stream that respect streams' lazy character.)
There are two ways to read your question. The first is simply: How do you use Stream constructs, perhaps the ones from your second link, but with a recursive definition? That can be done, though it is a little clumsy in Python.
In Python you can represent looped data structures but not directly. You can't write:
l = [l]
but you can write:
l = [None]
l[0] = l
Similarly you can't write:
def integral(integrand, initial_value, dt):
    int_rec = cons_stream(initial_value,
                          add_streams(scale_stream(integrand, dt),
                                      int_rec))
    return int_rec
but you can write:
def integral(integrand, initial_value, dt):
    placeholder = Stream(initial_value, lambda: None)
    int_rec = cons_stream(initial_value,
                          add_streams(scale_stream(integrand, dt),
                                      placeholder))
    placeholder._compute_rest = lambda: int_rec
    return int_rec
Note that we need to clumsily pre-compute the first element of placeholder and then only fix up the recursion for the rest of the stream. But this does all work (alongside appropriate definitions of all the rest of the code - I'll stick it all at the bottom of this answer).
However, the second part of your question seems to be asking how to do this naturally in Python. You ask for an "analogous stream-y construct in Python". Clearly the answer to that is exactly the generator. The generator naturally provides the lazy evaluation of the stream concept. It differs by not being naturally expressed recursively but then Python does not support that as well as Scheme, as we will see.
In other words, the strict stream concept can be expressed in Python (as in the link and above) but the idiomatic way to do it is to use generators.
It is more or less possible to replicate the Scheme example by a kind of direct mechanical transformation of stream to generator (but avoiding the built-in int):
def integral_rec(integrand,initial_value,dt):
    def int_rec():
        for x in cons_stream(initial_value,
                             add_streams(scale_stream(integrand,dt),
                                         int_rec())):
            yield x
    for x in int_rec():
        yield x
def cons_stream(a,b):
    yield a
    for x in b:
        yield x

def add_streams(a,b):
    # stop cleanly when either stream runs out, rather than letting
    # StopIteration escape the generator (an error since PEP 479)
    while True:
        try:
            yield next(a) + next(b)
        except StopIteration:
            return

def scale_stream(a,b):
    for x in a:
        yield x * b
The only tricky thing here is to realise that you need to eagerly call the recursive use of int_rec as an argument to add_streams. Calling it doesn't start it yielding values - it just creates the generator ready to yield them lazily when needed.
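That laziness is easy to demonstrate in isolation: calling a generator function only builds the generator object and runs none of its body. A small sketch (the started flag is mine, just to make the effect visible):

```python
started = []

def gen():
    started.append(True)   # records that the body has begun executing
    yield 1

g = gen()        # creating the generator: the body has NOT run yet
assert not started
value = next(g)  # only now does the body execute, up to the yield
assert started and value == 1
```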
This works nicely for small integrands, though it's not very pythonic. The Scheme version works by optimising the tail recursion - the Python version will exceed the max stack depth if your integrand is too long. So this is not really appropriate in Python.
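To see the depth limitation concretely, here is a self-contained sketch (definitions repeated from above, with add_streams guarded so a finite stream ends cleanly on modern Python): a short integrand works, a long one overflows because every element pulled adds a few stack frames.

```python
def cons_stream(a, b):
    yield a
    for x in b:
        yield x

def add_streams(a, b):
    while True:
        try:
            yield next(a) + next(b)
        except StopIteration:  # either stream exhausted: end ours too
            return

def scale_stream(a, b):
    for x in a:
        yield x * b

def integral_rec(integrand, initial_value, dt):
    def int_rec():
        for x in cons_stream(initial_value,
                             add_streams(scale_stream(integrand, dt),
                                         int_rec())):
            yield x
    for x in int_rec():
        yield x

vals = list(integral_rec([1, 2, 3], 8, 0.5))   # small input: fine
print(vals)

try:
    # each element resumes a deeper chain of generator frames
    list(integral_rec(range(10000), 0, 1.0))
    overflowed = False
except RecursionError:
    overflowed = True
print(overflowed)
```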
A direct and natural pythonic version would look something like this, I think:
def integral(integrand,initial_value,dt):
    value = initial_value
    yield value
    for x in integrand:
        value += dt * x
        yield value
This works efficiently and correctly treats the integrand lazily as a "stream". However, it uses iteration rather than recursion to unpack the integrand iterable, which is more the Python way.
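For example, repeating the generator just defined and running it over a small list (the integrand and dt here are arbitrary choices of mine):

```python
def integral(integrand, initial_value, dt):
    value = initial_value
    yield value                 # the initial value is the first element
    for x in integrand:
        value += dt * x         # running sum of the scaled increments
        yield value

vals = list(integral([1, 2, 3, 4, 5], 8, 0.5))
print(vals)   # [8, 8.5, 9.5, 11.0, 13.0, 15.5]
```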
In moving to natural Python I have also removed the stream combination functions - for example, replaced add_streams with +=. But we could still use them if we wanted a sort of halfway house version:
def accum(initial_value,a):
    value = initial_value
    yield value
    for x in a:
        value += x
        yield value

def integral_hybrid(integrand,initial_value,dt):
    for x in accum(initial_value,scale_stream(integrand,dt)):
        yield x
This hybrid version uses the stream combinators from the Scheme version and avoids only the tail recursion. This is still pythonic, and Python's itertools module includes various other nice ways to work with iterables. They all "respect streams' lazy character", as you ask.
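In fact the standard library already provides the accumulation step: itertools.accumulate (available since Python 3.2) performs a lazy running sum, so the whole integral becomes a one-line pipeline. A sketch (integral_itertools is my own name, not from the original):

```python
from itertools import accumulate, chain

def integral_itertools(integrand, initial_value, dt):
    # chain puts the initial value in front; accumulate then yields
    # a lazy cumulative sum over the scaled increments
    scaled = (dt * x for x in integrand)
    return accumulate(chain([initial_value], scaled))

print(list(integral_itertools([1, 2, 3, 4, 5], 8, 0.5)))
# [8, 8.5, 9.5, 11.0, 13.0, 15.5]
```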
Finally here is all the code for the first recursive stream example, much of it taken from the Berkeley reference:
class Stream(object):
    """A lazily computed recursive list."""
    def __init__(self, first, compute_rest, empty=False):
        self.first = first
        self._compute_rest = compute_rest
        self.empty = empty
        self._rest = None
        self._computed = False

    @property
    def rest(self):
        """Return the rest of the stream, computing it if necessary."""
        assert not self.empty, 'Empty streams have no rest.'
        if not self._computed:
            self._rest = self._compute_rest()
            self._computed = True
        return self._rest

    def __repr__(self):
        if self.empty:
            return '<empty stream>'
        return 'Stream({0}, <compute_rest>)'.format(repr(self.first))
Stream.empty = Stream(None, None, True)

def cons_stream(a,b):
    return Stream(a,lambda : b)

def add_streams(a,b):
    if a.empty or b.empty:
        return Stream.empty
    def compute_rest():
        return add_streams(a.rest,b.rest)
    return Stream(a.first+b.first,compute_rest)

def scale_stream(a,scale):
    if a.empty:
        return Stream.empty
    def compute_rest():
        return scale_stream(a.rest,scale)
    return Stream(a.first*scale,compute_rest)

def make_integer_stream(first=1):
    def compute_rest():
        return make_integer_stream(first+1)
    return Stream(first, compute_rest)

def truncate_stream(s, k):
    if s.empty or k == 0:
        return Stream.empty
    def compute_rest():
        return truncate_stream(s.rest, k-1)
    return Stream(s.first, compute_rest)

def stream_to_list(s):
    r = []
    while not s.empty:
        r.append(s.first)
        s = s.rest
    return r
def integral(integrand,initial_value,dt):
    placeholder = Stream(initial_value,lambda : None)
    int_rec = cons_stream(initial_value,
                          add_streams(scale_stream(integrand,dt),
                                      placeholder))
    # placeholder already supplies the first element, so its rest
    # must be int_rec's rest, not int_rec itself
    placeholder._compute_rest = lambda : int_rec.rest
    return int_rec
a = truncate_stream(make_integer_stream(),5)
print(stream_to_list(integral(a,8,.5)))
