updating a python generator after it has been created - python

Is there any way to do something like this in python 2.7?
def scaleit(g, k):
    for item in g:
        yield item*k
promise = ??????
# defines a generator for reference but not use:
# other functions can make use of it,
# but it requires a call to promise.fulfill() to
# define what the generator is going to yield;
# promise raises an error if next() is called
# before the promise is fulfilled
f = scaleit(promise, 3)
promise.fulfill(range(10))
for item in f:
    print item

Yes; generators don't run until they're actually iterated, so you can just defer iterating the fulfilled promise's value until requested:
class Promise(object):
    def fulfill(self, result):
        self.result = result
    def __iter__(self):
        return iter(self.result)
def scaleit(g, k):
    for item in g:
        yield item*k
promise = Promise()
f = scaleit(promise, 3)
promise.fulfill(range(10))
print list(f)
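The question also asks for an error when the promise is iterated before it is fulfilled. A minimal sketch of that variant follows, written for Python 3 rather than 2.7; the `_UNSET` sentinel and the choice of `RuntimeError` are assumptions made for illustration, not part of the answer above.

```python
class Promise:
    """Placeholder iterable whose contents are supplied later via fulfill()."""

    _UNSET = object()  # sentinel: distinguishes "not fulfilled" from any real value

    def __init__(self):
        self._result = self._UNSET

    def fulfill(self, result):
        self._result = result

    def __iter__(self):
        # Runs only when a consumer starts iterating, not when the promise
        # is passed to a generator function.
        if self._result is self._UNSET:
            raise RuntimeError("promise not fulfilled")
        return iter(self._result)


def scaleit(g, k):
    for item in g:
        yield item * k


unfulfilled = scaleit(Promise(), 3)   # creating the generator is fine
try:
    next(unfulfilled)                 # iterating an unfulfilled promise fails
except RuntimeError as exc:
    error_message = str(exc)

promise = Promise()
f = scaleit(promise, 3)
promise.fulfill(range(4))
scaled = list(f)
print(error_message, scaled)
```

The error surfaces on the first `next()` call on the generator, because that is when `scaleit` begins its `for` loop and asks the promise for an iterator.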

Is this what you want?
def scaleit(g, k):
    for item in g:
        yield item * k
class Promise(object):
    def __init__(self):
        self.g = None
    def fulfill(self, g):
        self.g = iter(g)
    def __iter__(self):
        return self
    def next(self):
        return next(self.g)
promise = Promise()
f = scaleit(promise, 3)
promise.fulfill(range(10))
for item in f:
    print item

I think you want the send() method on generators:
def gen():
    reply = yield None
    if not reply: # reply will be None if send() wasn't called
        raise ValueError("promise not fulfilled")
    yield 5
g1 = gen()
next(g1) # advance to the first yield
print(g1.send(True)) # prints 5: send() resumes the generator and returns the next yielded value
g2 = gen()
next(g2)
# forget to send
print(next(g2)) # raises ValueError

Related

How to get Python iterators not to communicate with each other?

Here's a simple iterator through the characters of a string.
class MyString:
    def __init__(self,s):
        self.s = s
        self._ix = 0
    def __iter__(self):
        return self
    def __next__(self):
        try:
            item = self.s[self._ix]
        except IndexError:
            self._ix = 0
            raise StopIteration
        self._ix += 1
        return item
string = MyString('abcd')
iter1 = iter(string)
iter2 = iter(string)
print(next(iter1))
print(next(iter2))
I'm trying to get this iterator to behave as it should. There are two requirements: first, the __next__ method MUST raise StopIteration; second, multiple iterators running at the same time must not interact with each other.
I accomplished objective 1, but need help with objective 2. As of right now the output is:
'a'
'b'
When it should be:
'a'
'a'
Any advice would be appreciated.
Thank you!
MyString acts as its own iterator, much like a file object:
>>> f = open('deleteme', 'w')
>>> iter(f) is f
True
You use this pattern when you want all iterators to affect each other - in this case advancing through the lines of a file.
The other pattern is to use a separate class to iterate much like a list whose iterators are independent.
>>> l = [1, 2, 3]
>>> iter(l) is l
False
To do this, move the _ix indexer to a separate class that references MyString. Have MyString.__iter__ create an instance of the class. Now you have a separate indexer per iterator.
class MyString:
    def __init__(self,s):
        self.s = s
    def __iter__(self):
        return MyStringIter(self)
class MyStringIter:
    def __init__(self, my_string):
        self._ix = 0
        self.my_string = my_string
    def __iter__(self):
        return self
    def __next__(self):
        try:
            item = self.my_string.s[self._ix]
        except IndexError:
            raise StopIteration
        self._ix += 1
        return item
string = MyString('abcd')
iter1 = iter(string)
iter2 = iter(string)
print(next(iter1))
print(next(iter2))
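A shorter way to get independent iterators, sketched here as an alternative rather than as the answer's own method, is to write `__iter__` as a generator function. Each call to `iter()` then builds a new generator object with its own position, and the generator raises StopIteration by itself when the string is exhausted. The variable names `first` and `second` are only for illustration.

```python
class MyString:
    def __init__(self, s):
        self.s = s

    def __iter__(self):
        # A generator function: every call returns a fresh generator
        # object, so no position is shared between iterators.
        for ch in self.s:
            yield ch


string = MyString('abcd')
iter1 = iter(string)
iter2 = iter(string)
first = next(iter1)
second = next(iter2)
print(first, second)
```

The trade-off is that there is no hand-written `__next__` method; if the exercise requires one, the separate iterator class is the better fit.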
Your question title asks how to get iterators, plural, to not communicate with each other, but you don't have multiple iterators, you only have one. If you want to be able to get distinct iterators from MyString, you can add a copy method:
class MyString:
    def __init__(self,s):
        self.s = s
        self._ix = 0
    def __iter__(self):
        return self
    def __next__(self):
        try:
            item = self.s[self._ix]
        except IndexError:
            self._ix = 0
            raise StopIteration
        self._ix += 1
        return item
    def copy(self):
        return MyString(self.s)
string = MyString('abcd')
iter1 = string.copy()
iter2 = string.copy()
print(next(iter1))
print(next(iter2))

Can you yield from a lambda function?

I have a generator function in a class:
self.s = [['a',1],['b',2],['c',3]]
def generator(self):
    for r in self.s:
        yield r
In another function I initialize it as a variable:
var = self.generator()
And it's yielded as necessary:
>>> next(var) # returns ['a',1]
>>> next(var) # returns ['b',2]
>>> next(var) # returns ['c',3]
Can defining the generator be done in one line, however? I've considered the below:
var = lambda: [(yield r) for r in self.s]
>>> next(var) # SyntaxError: 'yield' inside list comprehension
Here's a minimal code I'm working with:
class Foo():
    def __init__(self):
        self.s = {}
    def generator(self):
        for r in self.s:
            yield r
    def fetch(self):
        if not self.s:
            self.fetch_gen = self.generator()
            self.s['a'] = 1
            self.s['b'] = 2
            self.s['c'] = 3
        try:
            var = self.s.get(next(self.fetch_gen))
        except StopIteration:
            return None
        return var
BAR = Foo()
while True:
    OUT = BAR.fetch()
    if OUT is None:
        break
    print(OUT)
Output is below:
1
2
3
I just wanted to see if I could get rid of Foo.generator and instead declare the generator in one line.
Your lambda returns a list comprehension, not a generator. A generator expression does what you want:
var = (r for r in self.s)
This creates a generator over the items of self.s, which you can then advance with next(var) as in your code.
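As a small standalone illustration of that suggestion, the sketch below uses a module-level list `s` in place of `self.s` (an assumption made so the snippet runs by itself). It also shows that a lambda is only needed if you want a factory that produces a fresh generator on each call; `make_gen` is a hypothetical name.

```python
s = [['a', 1], ['b', 2], ['c', 3]]

# One-line generator: a generator expression, no def or yield needed.
var = (r for r in s)
first = next(var)
rest = list(var)          # consumes whatever is left

# A lambda wrapping the expression gives a new generator per call.
make_gen = lambda: (r for r in s)
fresh = make_gen()
print(first, rest, next(fresh))
```

Once `var` is exhausted it stays exhausted, so calling the factory again is how you restart from the beginning.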

Current value of generator

In Python I can build a generator like so:
def gen():
    x = range(0, 100)
    for i in x:
        yield i
I can now define an instance of the generator using:
a = gen()
And pull new values from the generator using
a.next()
But is there a way—a.current()—to get the current value of the generator?
There isn't such a method, and you cannot add attributes to a generator. A workaround would be to create an iterator object that wraps your generator, and contains a 'current' attribute. Taking it an extra step is to use it as a decorator on the generator.
Here's a utility decorator class which does that:
class with_current(object):
    def __init__(self, generator):
        self.__gen = generator()
    def __iter__(self):
        return self
    def __next__(self):
        self.current = next(self.__gen)
        return self.current
    def __call__(self):
        return self
You can then use it like this:
@with_current
def gen():
    x=range(0,100)
    for i in x:
        yield i
a = gen()
print(next(a))
print(next(a))
print(a.current)
Outputs:
0
1
1
You set the value of a variable.
current_value = a.next()
then use current_value for all it's worth.
Python uses this often in for statements
a = xrange(10)
for x in a:
    print(x)
Here you are defining x as the current value of a.
Using another global variable to store the current value might be feasible.
# Variable that stores the current value of the generator
generator_current_val = None
def generator():
    global generator_current_val # to set the current value
    for i in range(10):
        generator_current_val = i
        yield i
a = generator()
print(next(a)) # 0
print(next(a)) # 1
print(next(a)) # 2
print(generator_current_val) # 2
print(next(a)) # 3
You can use a sliding window generator, something like
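from typing import Iterator, Optional, Tuple, TypeVar
T = TypeVar('T')
def _window(i: Iterator[T]) -> Iterator[Tuple[T, Optional[T]]]:
    prev = None
    for x in i:
        if prev is not None: # a plain truthiness test would skip falsy items such as 0
            yield prev, x
        prev = x
    yield prev, None
i = _window(iter([1,2,3,4]))
print(next(i)) # (1, 2)
print(next(i)) # (2, 3)
print(next(i)) # (3, 4)
print(next(i)) # (4, None)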

stay on same value in python iterator

I'm creating an iterator like so:
some_list = [1,2,5,12,30,75,180,440]
iter(some_list)
I have a need to access the current value of an iterator again. Is there a current() method that allows me to stay on the same position?
You certainly can make a class which will allow you to do this:
from collections import deque
class RepeatableIter(object):
    def __init__(self,iterable):
        self.iter = iter(iterable)
        self.deque = deque([])
    def __iter__(self):
        return self
    #define `next` and `__next__` for forward/backward compatibility
    def next(self):
        if self.deque:
            return self.deque.popleft()
        else:
            return next(self.iter)
    __next__ = next
    def requeue(self,what):
        self.deque.append(what)
x = RepeatableIter([1, 2, 3, 4, 5, 6])
count = 0
for i in x:
    print i
    if i == 4 and count == 0:
        count += 1
        x.requeue(i)
The question is really why would you want to?
You can use numpy.nditer to build your iterator, then you have many amazing options including the current value.
import numpy
rng = range(100)
itr = numpy.nditer([rng])
print itr.next() #0
print itr.next() #1
print itr.next() #2
print itr.value #2 : current iterator value
Adapting the third example from this answer:
class CurrentIterator():
    _sentinal = object()
    _current = _sentinal
    @property
    def current(self):
        if self._current is self._sentinal:
            raise ValueError('No current value')
        return self._current
    def __init__(self, iterable):
        self.it = iter(iterable)
    def __iter__(self):
        return self
    def __next__(self):
        try:
            self._current = current = next(self.it)
        except StopIteration:
            self._current = self._sentinal
            raise
        return current
    next = __next__ # for python2.7 compatibility
Some interesting points:
use of _sentinal so an error can be raised if no current value exists
use of property so current looks like a simple attribute
use of __next__ and next = __next__ for Python 2&3 compatibility

Adapt an iterator to behave like a file-like object in Python

I have a generator producing a list of strings. Is there a utility/adapter in Python that could make it look like a file?
For example,
>>> def str_fn():
...     for c in 'a', 'b', 'c':
...         yield c * 3
...
>>> for s in str_fn():
...     print s
...
aaa
bbb
ccc
>>> stream = some_magic_adaptor(str_fn())
>>> while True:
...     data = stream.read(4)
...     if not data:
...         break
...     print data
aaab
bbcc
c
Because data may be big and needs to be streamable (each fragment is a few kilobytes, the entire stream is tens of megabytes), I do not want to eagerly evaluate the whole generator before passing it to stream adaptor.
The "correct" way to do this is inherit from a standard Python io abstract base class. However it doesn't appear that Python allows you to provide a raw text class, and wrap this with a buffered reader of any kind.
The best class to inherit from is TextIOBase. Here's such an implementation, handling readline and read while being mindful of performance.
import io
class StringIteratorIO(io.TextIOBase):
    def __init__(self, iter):
        self._iter = iter
        self._left = ''
    def readable(self):
        return True
    def _read1(self, n=None):
        while not self._left:
            try:
                self._left = next(self._iter)
            except StopIteration:
                break
        ret = self._left[:n]
        self._left = self._left[len(ret):]
        return ret
    def read(self, n=None):
        l = []
        if n is None or n < 0:
            while True:
                m = self._read1()
                if not m:
                    break
                l.append(m)
        else:
            while n > 0:
                m = self._read1(n)
                if not m:
                    break
                n -= len(m)
                l.append(m)
        return ''.join(l)
    def readline(self):
        l = []
        while True:
            i = self._left.find('\n')
            if i == -1:
                l.append(self._left)
                try:
                    self._left = next(self._iter)
                except StopIteration:
                    self._left = ''
                    break
            else:
                l.append(self._left[:i+1])
                self._left = self._left[i+1:]
                break
        return ''.join(l)
Here's a solution that should read from your iterator in chunks.
class some_magic_adaptor:
    def __init__( self, it ):
        self.it = it
        self.next_chunk = ""
    def growChunk( self ):
        # next(...) works on both Python 2 and Python 3 iterators
        self.next_chunk = self.next_chunk + next(self.it)
    def read( self, n ):
        if self.next_chunk is None:
            return None
        try:
            while len(self.next_chunk)<n:
                self.growChunk()
            rv = self.next_chunk[:n]
            self.next_chunk = self.next_chunk[n:]
            return rv
        except StopIteration:
            rv = self.next_chunk
            self.next_chunk = None
            return rv
def str_fn():
    for c in 'a', 'b', 'c':
        yield c * 3
ff = some_magic_adaptor( str_fn() )
while True:
    data = ff.read(4)
    if not data:
        break
    print data
The problem with StringIO is that you have to load everything into the buffer up front. This can be a problem if the generator is infinite :)
from itertools import chain, islice
class some_magic_adaptor(object):
    def __init__(self, src):
        self.src = chain.from_iterable(src)
    def read(self, n):
        return "".join(islice(self.src, None, n))
Here's a modified version of John and Matt's answer that can read a list/generator of strings and output bytearrays
import itertools as it
from io import TextIOBase
class IterStringIO(TextIOBase):
    def __init__(self, iterable=None):
        iterable = iterable or []
        self.iter = it.chain.from_iterable(iterable)
    def not_newline(self, s):
        return s not in {'\n', '\r', '\r\n'}
    def write(self, iterable):
        to_chain = it.chain.from_iterable(iterable)
        self.iter = it.chain.from_iterable([self.iter, to_chain])
    def read(self, n=None):
        return bytearray(it.islice(self.iter, None, n))
    def readline(self, n=None):
        to_read = it.takewhile(self.not_newline, self.iter)
        return bytearray(it.islice(to_read, None, n))
usage:
ff = IterStringIO(c * 3 for c in ['a', 'b', 'c'])
while True:
    data = ff.read(4)
    if not data:
        break
    print data
aaab
bbcc
c
alternate usage:
ff = IterStringIO()
ff.write('ddd')
ff.write(c * 3 for c in ['a', 'b', 'c'])
while True:
    data = ff.read(4)
    if not data:
        break
    print data
ddda
aabb
bccc
There is one called werkzeug.contrib.iterio.IterIO but note that it stores the entire iterator in its memory (up to the point you have read it as a file) so it might not be suitable.
http://werkzeug.pocoo.org/docs/contrib/iterio/
Source: https://github.com/mitsuhiko/werkzeug/blob/master/werkzeug/contrib/iterio.py
An open bug on readline/iter: https://github.com/mitsuhiko/werkzeug/pull/500
Looking at Matt's answer, I can see that it's not always necessary to implement all the read methods. read1 may be sufficient, which is described as:
Read and return up to size bytes, with at most one call to the underlying raw stream’s read()...
Then it can be wrapped with io.TextIOWrapper which, for instance, has implementation of readline. As an example here's streaming of CSV-file from S3's (Amazon Simple Storage Service) boto.s3.key.Key which implements iterator for reading.
import io
import csv
from boto import s3
class StringIteratorIO(io.TextIOBase):
    def __init__(self, iter):
        self._iterator = iter
        self._buffer = ''
    def readable(self):
        return True
    def read1(self, n=None):
        while not self._buffer:
            try:
                self._buffer = next(self._iterator)
            except StopIteration:
                break
        result = self._buffer[:n]
        self._buffer = self._buffer[len(result):]
        return result
conn = s3.connect_to_region('some_aws_region')
bucket = conn.get_bucket('some_bucket')
key = bucket.get_key('some.csv')
fp = io.TextIOWrapper(StringIteratorIO(key))
reader = csv.DictReader(fp, delimiter = ';')
for row in reader:
    print(row)
Update
Here's an answer to a related question which looks a little better. It inherits from io.RawIOBase and overrides readinto. In Python 3 that is sufficient, so instead of wrapping IterStream in io.BufferedReader one can wrap it in io.TextIOWrapper. In Python 2 read1 is also needed, but it can be expressed simply through readinto.
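The linked answer is not reproduced here, so the following is only a sketch of the readinto approach under some assumptions: Python 3, an iterable that yields bytes chunks, and an IterStream class written from scratch for illustration. For the text-mode part it takes the conservative route of putting io.BufferedReader between the raw stream and io.TextIOWrapper.

```python
import io


class IterStream(io.RawIOBase):
    """Raw binary stream that pulls its data from an iterable of bytes chunks."""

    def __init__(self, iterable):
        super().__init__()
        self._iterator = iter(iterable)
        self._leftover = b''

    def readable(self):
        return True

    def readinto(self, buffer):
        # Skip empty chunks: returning 0 would be read as end of stream.
        while not self._leftover:
            try:
                self._leftover = next(self._iterator)
            except StopIteration:
                return 0
        n = min(len(buffer), len(self._leftover))
        buffer[:n] = self._leftover[:n]
        self._leftover = self._leftover[n:]
        return n


# Binary reads in fixed-size pieces, pulling chunks lazily.
chunks = (c * 3 for c in (b'a', b'b', b'c'))
binary = io.BufferedReader(IterStream(chunks))
pieces = []
while True:
    data = binary.read(4)
    if not data:
        break
    pieces.append(data)

# Text-mode access, including line iteration, on top of the same class.
raw = IterStream([b'x;y\n', b'1;', b'2\n'])
text = io.TextIOWrapper(io.BufferedReader(raw), encoding='utf-8')
lines = list(text)
print(pieces, lines)
```

Only readable and readinto are written by hand; read, readline and iteration come from the io base classes and wrappers.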
If you only need a read method, then this can be enough
def to_file_like_obj(iterable, base):
    chunk = base()
    offset = 0
    it = iter(iterable)
    def up_to_iter(size):
        nonlocal chunk, offset
        while size:
            if offset == len(chunk):
                try:
                    chunk = next(it)
                except StopIteration:
                    break
                else:
                    offset = 0
            to_yield = min(size, len(chunk) - offset)
            offset = offset + to_yield
            size -= to_yield
            yield chunk[offset - to_yield:offset]
    class FileLikeObj:
        def read(self, size=-1):
            return base().join(up_to_iter(float('inf') if size is None or size < 0 else size))
    return FileLikeObj()
which can be used for an iterable yielding str
my_file = to_file_like_obj(str_fn(), str)
or if you have an iterable yielding bytes rather than str, and you want a file-like object whose read method returns bytes
my_file = to_file_like_obj(bytes_fn(), bytes)
This pattern has a few nice properties I think:
Not much code, which can be used for both str and bytes
Returns exactly what has been asked for in terms of length, in both of the cases of the iterable yielding small chunks, and big chunks (other than at the end of the iterable)
Does not append str/bytes - so avoids copying
Leverages slicing - so also avoids copying because a slice of a str/bytes that should be the entire instance will return exactly that same instance
For the bytes case, it's enough of a file-like object to pass through to boto3's upload_fileobj for multipart upload to S3
This is exactly what StringIO is for:
>>> import StringIO
>>> some_var = StringIO.StringIO("Hello World!")
>>> some_var.read(4)
'Hell'
>>> some_var.read(4)
'o Wo'
>>> some_var.read(4)
'rld!'
>>>
Or, if you want to do what it sounds like:
class MyString(StringIO.StringIO):
    def __init__(self, *args):
        StringIO.StringIO.__init__(self, "".join(args))
then you can simply write:
xx = MyString(*list_of_strings)
