Multiple usage of generator from list in Python

Basically, I'm in the following situation: I generate a list, e.g.
l = [2*x for x in range(10)]
which I iterate through multiple times later on, e.g.
for i in l: print i # 0,2,4,6,8,10,12,14,16,18
for i in l: print i # 0,2,4,6,8,10,12,14,16,18
for i in l: print i # 0,2,4,6,8,10,12,14,16,18
The problem is that the list is far too large to fit into memory, so I use its generator form instead:
l = (2*x for x in range(10))
However, with this construction only the first iteration works:
for i in l: print i # 0,2,4,6,8,10,12,14,16,18
for i in l: print i #
for i in l: print i #
Where is the problem? How can I iterate through it multiple times?

Your generator is exhausted after the first loop; a generator can only be consumed once. Recreate it each time you need it:
l = (2*x for x in range(10))
for i in l: print i
l = (2*x for x in range(10))
for i in l: print i
(Note: you should use xrange in Python 2, because range creates a list in memory.)
You can also write a generator function as a shortcut:
def gen():
    for i in range(10):
        yield 2 * i
and then:
for i in gen(): print i
for i in gen(): print i

You can also iterate over a newly created generator expression directly:
for i in (2*x for x in range(10)): print i
for i in (2*x for x in range(10)): print i
...
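If recreating the generator by hand before every loop becomes tedious, one option is to wrap the generator expression in a small object whose __iter__ builds a fresh generator on each pass. The following is a minimal sketch in Python 3 syntax; the class name Reiterable and the variable names are made up for illustration and do not come from the answers above.

```python
class Reiterable:
    """Wraps a zero-argument callable that returns a new iterator on every call."""

    def __init__(self, factory):
        self._factory = factory

    def __iter__(self):
        # Every for loop calls __iter__, so every loop gets a fresh generator.
        return self._factory()


# The lambda delays creation of the generator expression until it is needed.
doubles = Reiterable(lambda: (2 * x for x in range(10)))

first_pass = list(doubles)
second_pass = list(doubles)  # not empty: a new generator was created
```

Only one generator is alive at a time, so memory use stays the same as with a single generator expression.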

Related

Python zip(): Check which iterable got exhausted

In Python 3, zip(*iterables), according to the documentation,
Returns an iterator of tuples, where the i-th tuple contains the i-th element from each of the argument sequences or iterables. The iterator stops when the shortest input iterable is exhausted.
As an example, I am running
for x in zip(a,b):
    f(x)
Is there a way to find out which of the iterables, a or b, caused the zip iterator to stop?
Assume that len() is not reliable and that iterating over both a and b to check their lengths is not feasible.
I found the following solution, which replaces zip with a for loop over only the first iterable and advances the second one inside the loop:
ib = iter(b)
for r in a:
    try:
        s = next(ib)
    except StopIteration:
        print('Only b exhausted.')
        break
    print((r,s))
else:
    try:
        s = next(ib)
        print('Only a exhausted.')
    except StopIteration:
        print('a and b exhausted.')
Here ib = iter(b) makes sure that it also works if b is a sequence or generator object. print((r,s)) would be replaced by f(x) from the question.
I think Jan has the best answer. Basically, you want to handle the last iteration from zip separately.
import itertools as it
a = (x for x in range(5))
b = (x for x in range(3))
iterables = (it.chain(g,[f"generator {i} was exhausted"]) for i,g in enumerate([a,b]))
for i, j in zip(*iterables):
    print(i, j)
# 0 0
# 1 1
# 2 2
# 3 generator 1 was exhausted
If you have only two iterables, you can use the code below. After the loop, exhausted[0] tells you which iterable ran out first ('first' or 'second'); if the list is still empty, both were exhausted at the same time.
That said, I do not agree that len() is unreliable. Where it is available, you should rely on len() to determine the answer (unless you tell us why you cannot).
def f(val):
    print(val)
def manual_iter(a,b, exhausted):
    iters = [iter(it) for it in [a,b]]
    iter_map = {}
    iter_map[iters[0]] = 'first'
    iter_map[iters[1]] = 'second'
    while 1:
        values = []
        for i, it in enumerate(iters):
            try:
                value = next(it)
            except StopIteration:
                if i == 0:
                    try:
                        next(iters[1])
                    except StopIteration:
                        return None
                exhausted.append(iter_map[it])
                return iter_map[it]
            values.append(value)
        yield tuple(values)
if __name__ == '__main__':
    exhausted = []
    a = [1,2,3]
    b = [10,20,30]
    for x in manual_iter(a,b, exhausted):
        f(x)
    print(exhausted)
    exhausted = []
    a = [1,2,3,4]
    b = [10,20,30]
    for x in manual_iter(a,b, exhausted):
        f(x)
    print(exhausted)
    exhausted = []
    a = [1,2,3]
    b = [10,20,30,40]
    for x in manual_iter(a,b, exhausted):
        f(x)
    print(exhausted)
Below is a function zzip() that does what you want. It uses zip_longest from the itertools module and returns a tuple containing what zip would return (as a list) plus a list of the 0-based positions of the iterables that were exhausted before the others. That list is empty if all iterables had the same length:
def zzip(*args):
    """ Returns a tuple with the result of zip(*args) as list and a list
    with ZERO-based indices of iterables passed to zzip which got
    exhausted before other ones. """
    from itertools import zip_longest
    nanNANaN = 'nanNANaN'
    Zipped = list(zip_longest(*args, fillvalue=nanNANaN))
    ZippedT = list(zip(*Zipped))
    Indx_exhausted = []
    indx_nanNANaN = None
    for i in range(len(args)):
        try: # gives ValueError if nanNANaN is not in the column
            indx_nanNANaN = ZippedT[i].index(nanNANaN)
            Indx_exhausted += [(indx_nanNANaN, i)]
        except ValueError:
            pass
    if Indx_exhausted: # list not empty, iterables were not same length
        Indx_exhausted.sort()
        min_indx_nanNANaN = Indx_exhausted[0][0]
        Indx_exhausted = [
            i for n, i in Indx_exhausted if n == min_indx_nanNANaN ]
        return (Zipped[:min_indx_nanNANaN], Indx_exhausted)
    else:
        return (Zipped, Indx_exhausted)
assert zzip(iter([1,2,3]),[4,5],iter([6])) ==([(1,4,6)],[2])
assert zzip(iter([1,2]),[3,4,5],iter([6,7]))==([(1,3,6),(2,4,7)],[0,2])
assert zzip([1,2],[3,4],[5,6]) ==([(1,3,5),(2,4,6)],[])
The code above runs without raising an assertion error on the used test cases.
Notice that the 'for loop' in the function loops over the items of the passed parameter list and not over the elements of the passed iterables.
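If reading every iterable to its end is a concern, a similar result can probably be had by stopping at the first row that contains a fill value. This is a sketch, not a drop-in replacement for zzip(): the name zip_report and the private sentinel _MISSING are assumptions made for illustration, and like any zip_longest-based approach it still consumes one extra item from the iterables that were not exhausted.

```python
from itertools import zip_longest

# A unique object as fill value, so it cannot collide with real data.
_MISSING = object()


def zip_report(*iterables):
    """Return (rows, exhausted): the rows zip() would produce, plus the
    zero-based positions of the iterables that ran out first."""
    rows = []
    for row in zip_longest(*iterables, fillvalue=_MISSING):
        exhausted = [i for i, value in enumerate(row) if value is _MISSING]
        if exhausted:
            # First incomplete row: stop here instead of reading further.
            return rows, exhausted
        rows.append(row)
    # Every row was complete, so all iterables ended together.
    return rows, []


rows, exhausted = zip_report(iter([1, 2, 3]), [4, 5], iter([6]))
```

Because the loop returns at the first incomplete row, the longer iterables are not read to the end.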

Simple list function in Python

I am trying to append values (x) to a list if the numbers are divisible by 2 and then print out that list. Seems pretty simple. The following runs but returns None:
x = []
def test():
    while x in range(0,100):
        if x % 2 == 0:
            x.append()
print(test())
Use for to iterate a range - not while.
You have ambiguous meaning to x - both as iteration variable and as a list.
You need to pass the value to append.
You need to return a value so it would be printed through the print statement - otherwise None is the default.
Fixed:
x = []
def test():
    for i in range(0,100):
        if i % 2 == 0:
            x.append(i)
    return x
print(test())
Other notes:
You can only use this once. The next call to test would return a list twice the size, since x is a global variable. I believe this is unintended; it can be solved by putting x = [] inside the function.
A list comprehension like [x for x in range(100) if x % 2 == 0] would be much better.
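The effect of the global list can be made visible by calling the function twice. A short sketch in Python 3 syntax; the names test_global and test_local are hypothetical and only serve to keep the two variants apart.

```python
x = []  # module-level list shared by every call


def test_global():
    for i in range(0, 100):
        if i % 2 == 0:
            x.append(i)
    return x


first_len = len(test_global())   # 50 even numbers
second_len = len(test_global())  # 100: the same list was extended again


def test_local():
    result = []  # fresh list on every call
    for i in range(0, 100):
        if i % 2 == 0:
            result.append(i)
    return result
```

With the local list, repeated calls return equal lists of 50 items each.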
Problems and Fixes
You have several problems with your code:
You named both your list and your iteration variable x.
You never append a value to your list.
You never return a list from test. Rather than appending to a global list, make a local list in test and return that list.
You're using a while loop when you should be using a for loop.
After the above changes, your code looks like this:
def test():
    even_numbers = []
    for number in range(0, 100):
        if number % 2 == 0:
            even_numbers.append(number)
    return even_numbers
print(test())
Improvements
Note that there are better ways to do this. In this case, a list comprehension is a better choice. List comprehensions replace the common pattern of building a list of values in a loop, as in your case:
def test():
    return [n for n in range(0, 100) if n % 2 == 0]
print(test())
Generally you should pass the variable to the function and return it from the function instead of relying on global variables:
def test(x):
    ...
    return x
However, while x in range(0, 100) won't work, because it checks whether x (an empty list) is contained in the range object. The range only contains numbers, so the condition is False and the while loop body never executes. You could use a for loop instead:
def test(x):
    for i in range(0, 100):
        if i % 2 == 0:
            x.append(i)
    return x
But you could also use the fact that range supports a step and remove the if by just using step=2:
def test(x):
    for i in range(0, 100, 2):
        x.append(i)
    return x
At this point you could even just extend (append an iterable to) the list:
def test(x):
    x.extend(range(0, 100, 2))
    return x
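To see why the original while loop never runs, the membership test can be evaluated on its own. A small sketch in Python 3; the variable names are illustrative only.

```python
x = []

# The while condition asks whether the list itself is an element of the range.
condition = x in range(0, 100)  # False: a range only contains integers

# An integer, by contrast, can be a member.
four_is_member = 4 in range(0, 100)

# The step-based range yields exactly the numbers the if test would keep.
evens_by_step = list(range(0, 100, 2))
evens_by_test = [i for i in range(0, 100) if i % 2 == 0]
```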
You could also use a step of 2 to avoid the if test:
x = []
def test():
    for i in range(0,100,2):
        x.append(i)
    return x
Or use a list comprehension:
x = []
def test():
    x = [i for i in range(0,100,2)]
    return x
Or use the range itself as a list:
x = []
def test():
    x = list(range(0,100,2))
    return x
Rename your inner x variable for simplicity. Use x.append(n) instead of x.append(). Finally, print the list variable, not the result of the function call.
You should rename either the list variable or the loop variable (the x in the while loop). Also, append takes the value to add as its argument: list.append(value).

Nesting a string inside a list n times ie list of a list of a list

def nest(x, n):
    a = []
    for i in range(n):
        a.append([x])
    return a
print nest("hello", 5)
This gives an output
[['hello'], ['hello'], ['hello'], ['hello'], ['hello']]
The desired output is
[[[[["hello"]]]]]
Every turn through the loop you are adding to the list. You want to be further nesting the list, not adding more stuff onto it. You could do it something like this:
def nest(x, n):
    for _ in range(n):
        x = [x]
    return x
Each turn through the loop, x has another list wrapped around it.
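The wrapping can be traced step by step. This sketch repeats the loop-based nest from this answer and records the result for increasing n; the variable name steps is just for illustration.

```python
def nest(x, n):
    for _ in range(n):
        x = [x]  # wrap the current value in one more list
    return x


# Nesting depth after 0, 1, 2 and 3 turns through the loop.
steps = [nest("hello", k) for k in range(4)]
```

With n = 0 the loop body never runs and the value comes back unchanged.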
Instead of appending, you should wrap x in a list and call the function recursively until n reaches 0:
def nest(x, n):
    if n <= 0:
        return x
    else:
        return [nest(x, n-1)]
Here is a pythonic recursion approach:
In [8]: def nest(x, n):
   ...:     return nest([x], n-1) if n else x
DEMO:
In [9]: nest(3, 4)
Out[9]: [[[[3]]]]
In [11]: nest("Stackoverflow", 7)
Out[11]: [[[[[[['Stackoverflow']]]]]]]

Understanding iterables and generators in Python

I recently came across the idea of generators in Python, so I made a basic example for myself:
def gen(lim):
    print 'This is a generator'
    for elem in xrange(lim):
        yield elem
    yield 'still generator...'
    print 'done'
x = gen
print x
x = x(10)
print x
print x.next()
print x.next()
I was wondering if there was any way to iterate through my variable x without having to write out print x.next() 11 times to print everything.
That's the whole point of using a generator in the first place:
for i in x:
    print i
Output:
This is a generator
0
1
2
3
4
5
6
7
8
9
still generator...
done
Yes. You can iterate through the generator just as you would through a list (or any other iterable):
x = gen(11)
for i in x:
    print i
Calling x.next() is not particular to generators: every iterator has it, so you could do the same with a list after calling iter() on it. But you don't do that with a list, you use a for loop; the same goes for generators.
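What a for loop does behind the scenes can be sketched by hand: it asks the object for an iterator and calls next on it until StopIteration is raised. The example below uses Python 3 syntax (next(it) instead of it.next(), range instead of xrange) and a shortened version of gen without the print calls; both are adaptations for illustration, not the code from the question.

```python
def gen(lim):
    for elem in range(lim):
        yield elem
    yield 'still generator...'


x = gen(3)
first = next(x)  # a generator is its own iterator
rest = list(x)   # consumes whatever is left

numbers = [10, 20, 30]
it = iter(numbers)  # a list needs iter() to obtain an iterator first
first_number = next(it)

# Manual equivalent of "for value in gen(2): collected.append(value)".
collected = []
manual = iter(gen(2))
while True:
    try:
        collected.append(next(manual))
    except StopIteration:
        break
```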
You can use a for loop to iterate over a generator.
def gen(lim):
    print 'This is a generator'
    for elem in xrange(lim):
        yield elem
    yield 'still generator...'
    print 'done'
for x in gen(10):
    print x

Python Generator Cutoff

I have a generator that will keep giving numbers that follow a specific formula. For sake of argument let's say this is the function:
# this is not the actual generator, just an example
def Generate():
    i = 0
    while 1:
        yield i
        i+=1
I then want to get a list of numbers from that generator that are below a certain threshold. I'm trying to figure out a pythonic way of doing this. I don't want to edit the function definition. I realize you could just use a while loop with your cutoff as the condition, but I'm wondering if there is a better way. I gave this a try, but soon realized why it wouldn't work.
l = [x for x in Generate() if x<10000] # will go on infinitely
So is there a correct way of doing this?
Thanks
An itertools solution to create another iterator:
from itertools import takewhile
l = takewhile(lambda x: x < 10000, Generate())
Wrap it in list() if you are sure you want a list:
l = list(takewhile(lambda x: x < 10000, Generate()))
Or if you want a list and like reinventing wheels:
l = []
for x in Generate():
    if x < 10000:
        l.append(x)
    else:
        break
Wrap your generator within another generator:
def no_more_than(limit):
    def limiter(gen):
        for item in gen:
            if item > limit:
                break
            yield item
    return limiter
def fib():
    a,b = 1,1
    while 1:
        yield a
        a,b = b,a+b
cutoff_at_100 = no_more_than(100)
print list(cutoff_at_100(fib()))
Prints:
[1, 1, 2, 3, 5, 8, 13, 21, 34, 55, 89]
itertools.takewhile stops at the first item that does not fulfill the predicate. If you need all matching values from a possibly unordered iterable, I'd recommend itertools.ifilter in Python 2.x, as in
from itertools import ifilter
f = ifilter(lambda x: x < 400, gen())
f.next()
This filtered a generator yielding random integers between 0 and 400 as hoped.
FWIW, itertools.ifilter was removed in Python 3.x, where the built-in filter() returns an iterator itself; you advance it with next(f) instead of f.next():
f = filter(lambda x: x < 400, gen())
next(f)
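The difference between the two can be seen on a small unordered input. This is a Python 3 sketch with made-up sample values. Note that filter() has to look at every item, so on an infinite generator such as the one in the question it would never finish; takewhile() is the one that can stop.

```python
from itertools import takewhile

values = [5, 120, 7, 900, 3]  # hypothetical unordered data

# takewhile gives up at the first value that fails the test (120).
stopped_early = list(takewhile(lambda v: v < 100, values))

# filter checks every value and keeps all that pass.
kept_all = list(filter(lambda v: v < 100, values))
```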
Zip it with a range of the desired length:
gen = range(100_000_000_000)
limit = 10
(z[1] for z in zip(range(limit), gen))
zip yields tuples, which is why z[1] is needed to pick out the generator's value.
This can be used in for loops:
for g in (z[1] for z in zip(range(limit), gen)):
    print(g)
Or you could use a lambda:
wrap = lambda gen, limit: (z[1] for z in zip(range(limit), gen))
for g in wrap(gen, 10):
    print(g)
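For taking a fixed number of items, the standard library also offers itertools.islice, which avoids the zip and indexing step. A minimal Python 3 sketch; generate() here is a stand-in for the question's example generator. Like the zip approach, this cuts off by count and not by value, so it only answers the original question when you know how many items fall below the threshold.

```python
from itertools import islice


def generate():
    # Stand-in for the infinite example generator from the question.
    i = 0
    while True:
        yield i
        i += 1


# Take the first 10 items and stop; the generator is never run to its "end".
first_ten = list(islice(generate(), 10))
```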
Just use a counter for an infinite generator:
gen=Generate() # your generator function example
l=[gen.next() for i in range(100)]
But since it is a generator, use a generator expression:
seq=(gen.next() for i in xrange(100)) # xrange in Python 2.x; use range in 3.x
Edit
OK, then just use a controller:
def controler(gen,limit):
    n=gen.next()
    while n<limit:
        yield n
        n=gen.next()
seq=[i for i in controler(Generate(),100)]
replace Generate() with xrange(10000)
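On Python 3 the controller needs a small change: generators have no .next() method, and since Python 3.7 a StopIteration escaping from inside a generator body is turned into a RuntimeError (PEP 479). A for loop handles both. The following is a sketch of such a port; the name controller, the helper generate() and the sample inputs are assumptions for illustration.

```python
def controller(gen, limit):
    """Yield items from gen until one reaches the limit or gen runs out."""
    for n in gen:
        if n >= limit:
            return  # ends the generator cleanly
        yield n


def generate():
    # Stand-in for the infinite example generator from the question.
    i = 0
    while True:
        yield i
        i += 1


seq = list(controller(generate(), 100))
short = list(controller(iter([1, 2]), 100))  # a finite input ends without error
```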
