What is the Pythonic way to make a generator that also produces aggregate results? In pseudocode, something like this (but not for real, as my Python version does not support mixing yield and return in one function):
def produce():
    total = 0
    for item in find_all():
        total += 1
        yield item
    return total
As I see it, I could:
Not make produce() a generator, but pass it a callback function to call on every item.
With every yield, also yield the aggregate results up until now. I'd rather not calculate the intermediate results with every yield, only when finishing.
Send a dict as argument to produce() that will be populated with the aggregate results.
Use a global to store aggregate results.
None of these options seems very attractive...
NB. total is a simple example, my actual code requires complex aggregations. And I need intermediate results before produce() finishes, hence a generator.
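For reference, on Python 3.3 and newer (PEP 380) mixing yield and return is allowed: the returned value rides on the StopIteration that signals exhaustion. A minimal sketch, assuming such a version and the same find_all():
def produce():
    total = 0
    for item in find_all():
        total += 1
        yield item
    return total  # stored on StopIteration.value

gen = produce()
while True:
    try:
        item = next(gen)  # intermediate items arrive as usual
    except StopIteration as stop:
        total = stop.value  # the aggregate, available once finished
        break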
Maybe you shouldn't use a generator but a custom iterator class that exposes its state.
def findall():  # no idea what your "find_all" does, so I use this instead. :-)
    yield 1
    yield 2
    yield 3

class Produce(object):
    def __init__(self, iterable):
        self._it = iter(iterable)  # accept any iterable, not only iterators
        self.total = 0

    def __iter__(self):
        return self

    def __next__(self):
        value = next(self._it)  # raises StopIteration when exhausted
        self.total += 1  # count only items actually delivered
        return value

    next = __next__  # only necessary for Python 2 compatibility
Maybe it's better to see this with an example:
>>> it = Produce(findall())
>>> it.total
0
>>> next(it)
1
>>> next(it)
2
>>> it.total
2
You can use enumerate to count stuff, for example:
i = 0  # so "total" is defined even for an empty iterable
for i, v in enumerate(range(10), 1):
    print(v)
print("total", i)
(notice the start value of the enumerate)
For more complex stuff you can use the same principle: make produce a generator that yields both the value and the aggregate, ignore the aggregate during the iteration, and use it when finished, as in the sketch below.
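A minimal sketch of that idea, assuming a toy produce() whose aggregate is a running count:
def produce():
    total = 0
    for x in range(10):
        total += 1
        yield x, total  # second element is the running aggregate

pair = None
for pair in produce():
    print(pair[0])  # use the value, ignore the aggregate for now
if pair is not None:
    print("total", pair[1])  # the aggregate from the last pair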
Another alternative is passing a mutable object, for example:
def produce(mem):
    t = 0
    for x in range(10):
        t += 1
        yield x
    mem.append(t)

aggregate = []
for x in produce(aggregate):
    print(x)
print("total", aggregate[0])
In each case the result is the same for this example:
0
1
2
3
4
5
6
7
8
9
total 10
Am I missing something? Why not:
def produce():
    total = 0
    for item in find_all():
        total += 1
        yield item
    yield total
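The catch is that the caller has to know the final yielded value is the aggregate, not a regular item. A consumer sketch, assuming Python 3 extended unpacking:
*items, total = produce()  # everything but the last yield is a regular item
print(items)
print("total", total)
Note this consumes the whole generator before the total is available.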
Related
I want to use next to skip one or more items returned from a generator. Here is a simplified example designed to skip one item per loop (in actual use, I'd test n and depending on the result, may repeat the next() and the generator is from a package I don't control):
def gen():
    for i in range(10):
        yield i

for g in gen():
    n = next(gen())
    print(g, n)
I expected the result to be
0 1
2 3
etc.
Instead I got
0 0
1 0
etc.
What am I doing wrong?
You're making a new generator each time you call gen(). Each new generator starts from 0.
Instead, you can call it once and capture the return value.
def gen():
    for i in range(10):
        yield i

x = gen()
for g in x:
    n = next(x)
    print(g, n)
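If the input could have an odd length, the bare next(x) would raise StopIteration on the final pass; giving next a default guards against that:
x = gen()
for g in x:
    n = next(x, None)  # None once x is exhausted, instead of StopIteration
    print(g, n)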
I'm trying to create an iterable object; a single loop over it works, but nested loops don't. Here is my simplified code:
class test():
    def __init__(self):
        self.n = 0

    def __iter__(self):
        return self

    def __next__(self):
        if self.n < len(self) - 1:
            self.n += 1
            return self.n
        else:
            raise StopIteration

    def __len__(self):
        return 5
# this is an example iteration
test = test()
for i in test:
    for j in test:
        print(i, j)

# it prints
1 2
1 3
1 4

# What I expect is
1 1
1 2
1 3
1 4
2 1
2 2
2 3
...
4 3
4 4
How can I make this object (in this case test) iterable twice, so that the example loop prints all the combinations of i and j?
You want an instance of test to be iterable, but not its own iterator. What's the difference?
An iterable is something that, upon request, can supply an iterator. Lists are iterable, because iter([1,2,3]) returns a new listiterator object (not the list itself). To make test iterable, you just need to supply an __iter__ method (more on how to define it in a bit).
An iterator is something that, upon request, can produce a new element; it does this when its __next__ method is called. An iterator can be thought of as two pieces of information: a sequence of items to produce, and a cursor indicating how far along that sequence it currently is. When it reaches the end of its sequence, it raises a StopIteration exception to indicate that the iteration is at an end. To make an instance an iterator, you supply a __next__ method in its class. An iterator should also have an __iter__ method that just returns itself.
So how do you make test iterable without being an iterator? By having its __iter__ method return a new iterator each time it is called, and getting rid of its __next__ method. The simplest way to do that is to make __iter__ a generator function. Define your class something like:
class Test():
    def __init__(self):
        self._size = 5

    def __iter__(self):
        n = 0
        while n < self._size:
            yield n
            n += 1

    def __len__(self):
        return self._size
Now when you write
test = Test()
for i in test:  # implicit call to iter(test)
    for j in test:  # implicit call to iter(test)
        print(i, j)
i and j both draw values from separate iterators over the same iterable. Each call to test.__iter__ returns a different generator object that keeps track of its own n.
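You can check that independence directly; a quick sketch assuming the Test class above:
it1 = iter(test)
it2 = iter(test)
print(next(it1), next(it1))  # 0 1
print(next(it2))  # 0 -- a fresh cursor, unaffected by it1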
Take a look at itertools.product.
You should be able to accomplish what you're looking for:
from itertools import product
...
test = test()
for i, j in product(test, repeat=2):
    print(i, j)
I love this library!
I got a list of functions after calling the function count() below. I want to know how to execute these functions besides using the way in my code.
def count():
    fs = []
    for i in range(1, 4):
        def f():
            return i*i
        fs.append(f)
    return fs

print(count()[0](), count()[1](), count()[2]())
You can use map to apply a function to every item in an iterable. In Python 3.x, the result is a lazy iterator, not a list.
There are several different ways you can then extract results. Below is an example.
I have also corrected some errors in your logic. Your function f should take a parameter, presumably i, as this is used in your function. Similarly, it seems you want the function count to take a parameter n to determine your range of inputs.
def count(n):
    def f(i):
        return i*i
    return map(f, range(1, n))

## iterate using next
res = count(5)
print(next(res))  # 1
print(next(res))  # 4
print(next(res))  # 9

## iterate using for loop
for k in count(4):
    print(k)
# 1
# 4
# 9

## build list and exhaust all results
print(list(count(4)))
# [1, 4, 9]
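If you do want count() to keep returning a list of functions as in your code, the usual fix for the late-binding surprise (your version prints 9 9 9, since every f sees the final i) is to freeze i with a default argument; a sketch of that variant:
def count():
    fs = []
    for i in range(1, 4):
        def f(i=i):  # default value captures the current i
            return i*i
        fs.append(f)
    return fs

fs = count()
print(fs[0](), fs[1](), fs[2]())  # 1 4 9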
I'm trying to write a function that returns the next element of a generator; if the generator is exhausted, it should reset it and return the next result. The expected output of the code below would be:
1
2
3
1
2
However, that is obviously not what I get. What am I doing that is incorrect?
a = '123'

def convert_to_generator(iterable):
    return (x for x in iterable)

ag = convert_to_generator(a)

def get_next_item(gen, original):
    try:
        return next(gen)
    except StopIteration:
        gen = convert_to_generator(original)
        get_next_item(gen, original)

for n in range(5):
    print(get_next_item(ag, a))
1
2
3
None
None
Is itertools.cycle(iterable) a possible alternative?
You need to return the result of your recursive call:
return get_next_item(gen, original)
which still does not make this a working approach.
The generator ag used in your for-loop is not changed by the rebinding of the local variable gen in your function. It will stay exhausted...
As has been mentioned in the comments, check out itertools.cycle.
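For example, a short sketch with the question's data, using islice to bound the infinite cycle:
from itertools import cycle, islice

for item in islice(cycle('123'), 5):
    print(item)  # prints 1 2 3 1 2, one per line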
The easy way is to just use itertools.cycle. Otherwise you would need to remember the elements of the iterable, because if it is an iterator (such as a generator) it can't be reset; if it is not an iterator, you can reuse it many times.
The documentation includes an example implementation:
def cycle(iterable):
    # cycle('ABCD') --> A B C D A B C D A B C D ...
    saved = []
    for element in iterable:
        yield element
        saved.append(element)
    while saved:
        for element in saved:
            yield element
Or, for example, to also handle the reuse case:
def cycle(iterable):
    # cycle('ABCD') --> A B C D A B C D A B C D ...
    if iter(iterable) is iter(iterable):  # it is an iterator: iter() returns it itself
        saved = []
        for element in iterable:
            yield element
            saved.append(element)
    else:
        saved = iterable
    while saved:
        for element in saved:
            yield element
Example use:
test = cycle("123")
for i in range(5):
    print(next(test))
Now, about your code: the problem is simple, it doesn't remember its state.
def get_next_item(gen, original):
    try:
        return next(gen)
    except StopIteration:
        gen = convert_to_generator(original)  # <-- the problem is here
        get_next_item(gen, original)  # and you should return something here
In the marked line a new generator is built, but you would need to update your ag variable outside this function to get the desired behavior. There are ways to do it, like changing your function to return both the element and the generator, as in the sketch below; other ways are not recommended or more complicated, like building a class so it remembers its state.
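A minimal sketch of the return-both idea, reusing the question's convert_to_generator and a:
def get_next_item(gen, original):
    try:
        return next(gen), gen
    except StopIteration:
        gen = convert_to_generator(original)  # fresh generator after exhaustion
        return next(gen), gen

ag = convert_to_generator(a)
for n in range(5):
    item, ag = get_next_item(ag, a)  # rebinding ag makes the reset stick
    print(item)  # prints 1 2 3 1 2, one per line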
Written as a generator function, get_next_item returns an iterator that hands out the values it yields via its __next__ method. For that reason, a bare call statement like the one in your except clause doesn't do anything. What you want to do is this:
def get_next_item(gen, original):
    try:
        while True:
            yield next(gen)
    except StopIteration:
        gen = convert_to_generator(original)
        for i in get_next_item(gen, original):
            yield i
or shorter, and completely equivalent (as long as gen has an __iter__ method, which it probably has):
def get_next_item(gen, original):
    for i in gen:
        yield i
    for i in get_next_item(convert_to_generator(original), original):
        yield i
Or without recursion (recursion is a problem in Python, as it is 1. limited in depth and 2. slow):
def get_next_item(gen, original):
    for i in gen:
        yield i
    while True:
        for i in convert_to_generator(original):
            yield i
If convert_to_generator is just a call to iter, it is even shorter:
def get_next_item(gen, original):
    for i in gen:
        yield i
    while True:
        for i in original:
            yield i
or, with itertools:
import itertools

def get_next_item(gen, original):
    return itertools.chain(gen, itertools.cycle(original))
and get_next_item is equivalent to itertools.cycle if gen is guaranteed to be an iterator for original.
Side note: With Python 3.3 or higher you can replace for i in x: yield i with yield from x (where x is some expression).
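For example, the non-recursive version above collapses to:
def get_next_item(gen, original):
    yield from gen
    while True:
        yield from convert_to_generator(original)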
I have a long-running task that must be guided by external code, but the external code needs some information about this task. This is my homebrew example:
def longtask(self):
    yield self.get_step_length(1)
    for x in self.perform_step(1):
        ...
        yield x.id
    yield self.get_step_length(2)
    for x in self.perform_step(2):
        ...
        yield x.value
    ...

# call site
generator = self.longtask()
step1len = generator.next()
step1pb = ProgressBar('Step 1', step1len)
# pull only step 1 items
for index, id in itertools.izip(xrange(0, step1len), generator):
    step1pb.update(index)
    ...do something with id
step2len = generator.next()
step2pb = ProgressBar('Step 2', step2len)
# pull only step 2 items
for index, value in itertools.izip(xrange(0, step2len), generator):
    step2pb.update(index)
    ...do something other with value
Is it right to use such a complex generator protocol in Python, or do I need to refactor this code?
I'd refactor this to yield separate generators; you can use nested functions:
def longtask(self):
    def step_generator(step):
        for x in self.perform_step(step):
            ...
            yield x.id
    yield self.get_step_length(1), step_generator(1)
    yield self.get_step_length(2), step_generator(2)
generators = self.longtask()
for counter, (steplength, stepgen) in enumerate(generators, 1):
    ProgressBar('Step %d' % counter, steplength)
    for index, value in enumerate(stepgen):
        # ....
Now you can also use the enumerate() function to add numbers to the items; that is much more readable than zipping together an xrange() and the generator.
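To see the whole protocol run end to end, here is a standalone toy sketch, using made-up step data in place of the real perform_step and get_step_length:
def longtask():
    def step_generator(step, items):
        for x in items:
            yield x
    # each outer item is a (length, step-generator) pair
    yield 3, step_generator(1, ['a', 'b', 'c'])
    yield 2, step_generator(2, [10, 20])

for counter, (steplength, stepgen) in enumerate(longtask(), 1):
    print('Step %d, %d items' % (counter, steplength))
    for index, value in enumerate(stepgen):
        print(index, value)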