I stumbled upon this code on pymotw.com, in the merging and splitting section.
from itertools import *
def make_iterables_to_chain():
    yield [1, 2, 3]
    yield ['a', 'b', 'c']
for i in chain.from_iterable(make_iterables_to_chain()):
    print(i, end=' ')
print()
I cannot understand how make_iterables_to_chain() works. It contains two yield statements; how does that work?
I know how generators work, but the ones I have seen had only a single yield statement.
Help, please!
The same way a single yield works.
You can have as many yields as you like in a generator. When __next__ is called on it, the generator executes until it reaches the next yield. You then get back the yielded expression, and the generator pauses until its __next__ method is invoked again.
Run a couple of next calls on the generator to see this:
>>> g = make_iterables_to_chain() # get generator
>>> next(g) # start generator, go to first yield, get result
[1, 2, 3]
>>> next(g) # resume generator, go to second yield, get result
['a', 'b', 'c']
>>> # a further next(g) raises StopIteration, since no more yields are found
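To make the pausing more visible, here is a small sketch that logs when each part of the generator body runs. The generator name two_steps and its print calls are made up for illustration; they are not from the pymotw example.

```python
def two_steps():
    # Hypothetical generator used only for illustration.
    print("running up to the first yield")
    yield [1, 2, 3]
    print("resumed, running up to the second yield")
    yield ['a', 'b', 'c']
    print("resumed, no more yields; the body ends")

g = two_steps()      # nothing is printed yet: the body has not started
first = next(g)      # prints the first message and returns [1, 2, 3]
second = next(g)     # prints the second message and returns ['a', 'b', 'c']

try:
    next(g)          # prints the third message, then raises StopIteration
    exhausted = False
except StopIteration:
    exhausted = True
```

A for loop does the same thing behind the scenes: it keeps calling next and stops quietly when StopIteration is raised.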
A generator effectively allows a function to return multiple times. Every time a yield statement is executed, the value is returned to the caller, and the caller can continue the function's execution.
Usually, they are used as iterables in for loops.
The following function increments every element in an iterable by an amount:
def inc_each(nums, inc):
    for i in nums:
        yield i + inc
Here is an example of the usage:
gen = inc_each([1, 2, 3, 4], 100)
print(list(gen)) # [101, 102, 103, 104]
list is used here to convert an arbitrary iterable (in this case a generator) to a list.
The function you describe executes two yield statements:
def make_iterables_to_chain():
    yield [1, 2, 3]
    yield ['a', 'b', 'c']
If you call it, it returns a generator that, if iterated through, yields the lists [1, 2, 3] and ['a', 'b', 'c'].
gen = make_iterables_to_chain()
print(list(gen)) # [[1, 2, 3], ['a', 'b', 'c']]
itertools.chain.from_iterable will take a (possibly infinite) iterable of iterables and "flatten" it by one level, returning a (possibly infinite) iterable as the result.
Here is a way it could be implemented:
def from_iterable(iterables):
    for iterable in iterables:
        for i in iterable:
            yield i
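As a quick check, the sketch can be run next to the library function on the generator from the question. This only compares the two on this one input; it is not a claim about how itertools is implemented internally.

```python
from itertools import chain

def from_iterable(iterables):
    # Same sketch as above: walk the outer iterable, then each inner one.
    for iterable in iterables:
        for i in iterable:
            yield i

def make_iterables_to_chain():
    yield [1, 2, 3]
    yield ['a', 'b', 'c']

sketch_result = list(from_iterable(make_iterables_to_chain()))
library_result = list(chain.from_iterable(make_iterables_to_chain()))
print(sketch_result)   # [1, 2, 3, 'a', 'b', 'c']
```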
Just adding on top of the previous answers that when using such code:
def my_yield_function():
    yield 1
    yield 'test'
    yield 2
    yield True
You can unpack all the values with code like this:
w, x, y, z = my_yield_function()
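Unpacking like this only works when the number of targets matches the number of yielded values. A short sketch of the matching case, the starred form, and the mismatch case (the variable names first, rest and mismatch_error are just for illustration):

```python
def my_yield_function():
    yield 1
    yield 'test'
    yield 2
    yield True

w, x, y, z = my_yield_function()     # four targets for four yields
first, *rest = my_yield_function()   # the starred target collects the remainder

try:
    a, b = my_yield_function()       # too few targets for four values
    mismatch_error = None
except ValueError as exc:
    mismatch_error = str(exc)
```

Note that unpacking consumes the whole generator, so it is not suitable for infinite generators.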
I have an iterator that consists of several lists of the same size. For my purpose I need to know the length of at least one of these lists. But as it is with iterators they can't be accessed the same way as ordinary arrays. So my idea was to get this length by saying:
for i in iter:
    list_len = len(i)
    break
And this works. However, when I use this iterator later on and want to loop over it again, it skips the first item and basically continues from where the previous loop (the one above) stopped.
Is there some way to fix this? Or what is the Pythonic way of doing it?
I was thinking/reading about doing it like:
from itertools import tee
iter_tmp, iter = tee(iter)
for i in iter_tmp:
    list_len = len(i)
    break
And yeah, that works too, since I can now use the original iter for later use, but it just hurt my eyes that I have to make a loop, import itertools and such just to get the length of a list in an iterator. But maybe that is just the way to go about it ?
UPDATE
Just trying to further explain what I'm doing.
Strictly speaking, the iterator is not a list or an array, but in my case, if I were to loop through it, I would get something like this (for an iterator with four "lists" in it):
>>> for i in iter_list:
...     print(i)
[1, 2, 5, 3]
[3, 2, 5, 8]
[6, 8, 3, 7]
[1, 4, 6, 1]
Now, all "lists" in the iterator have the same length, but since the lists themselves are calculated through many steps, I have no way of knowing the length before they enter the iterator. If I don't use an iterator I run out of memory, so it is a trade-off. But yeah, it is the length of just one of the lists that I need as a constant I can use throughout the rest of my code.
That is how iterators work. But you have a few options apart from tee.
You can extract the first element and reuse it when iterating the second time:
import itertools

first_elem = next(my_iter)
list_len = len(first_elem)
# Put the consumed element back in front of the remaining ones.
for l in itertools.chain([first_elem], my_iter):
    pass
Or if you are going to iterate over the iterator more times, you could perhaps listify it (if it's feasible to fit in memory).
my_list = list(my_iter)
first_len = len(my_list[0])
for l in my_list:
    pass
And last but not least, as Palivek said, keep or get the information about the length of the lists somewhere else.
In general iterators are not re-iteratable so you'll probably need to store something additional anyway.
class peek_iterator(object):
    def __init__(self, source):
        self._source = iter(source)
        self._first = None
        self._sent = False

    def __iter__(self):
        return self

    def next(self):
        if self._first is None:
            self._first = self._source.next()
        if self._sent:
            return self._source.next()
        self._sent = True
        return self._first

    def get_isotropic(self, getter):
        if self._first is None:
            self._first = self._source.next()
        return getter(self._first)

lists = [[1, 2, 3], [4, 5, 6]]
i = peek_iterator(lists)
print i.get_isotropic(len) # 3
for j in i: print j # [1, 2, 3]; [4, 5, 6]
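The class above uses Python 2 syntax (a next method and print statements). For Python 3, where the iterator protocol uses __next__ and next(it), the same idea could be sketched roughly as follows. The names PeekIterator, peek and _MISSING are illustrative and not part of any library, and peek here returns whatever item would come next rather than always the very first one.

```python
class PeekIterator:
    """Iterator wrapper that can look at the next item without losing it."""

    _MISSING = object()  # sentinel, so that an item equal to None works too

    def __init__(self, source):
        self._source = iter(source)
        self._buffer = self._MISSING

    def __iter__(self):
        return self

    def __next__(self):
        if self._buffer is not self._MISSING:
            # Hand out the buffered item exactly once.
            item, self._buffer = self._buffer, self._MISSING
            return item
        return next(self._source)

    def peek(self):
        # Pull one item from the source if nothing is buffered yet.
        if self._buffer is self._MISSING:
            self._buffer = next(self._source)
        return self._buffer


lists = PeekIterator(iter([[1, 2, 3], [4, 5, 6]]))
list_len = len(lists.peek())   # 3
remaining = list(lists)        # [[1, 2, 3], [4, 5, 6]]
```

Calling peek on an empty source raises StopIteration in this sketch; a real implementation might prefer a default value instead.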
You can do a little trick and wrap the original iterator in a generator. This way, you can obtain the first element and "re-yield" it with the generator without consuming the entire iterator. The head() function below returns the first element and a generator that iterates over the original sequence.
def head(seq):
    seq_iter = iter(seq)
    first = next(seq_iter)
    def gen():
        yield first
        yield from seq_iter
    return first, gen()

seq = range(100, 300, 50)
first, seq2 = head(seq)
print('first item: {}'.format(first))
for item in seq2:
    print(item)
Output:
first item: 100
100
150
200
250
This is conceptually equivalent to Moberg's answer, but uses a generator to "re-assemble" the original sequence instead of itertools.chain().
I am currently working through MIT's OpenCourseWare computer science course online, and I am having trouble understanding one of the recursive examples.
def f(L):
    result = []
    for e in L:
        if type(e) != list:
            result.append(e)
        else:
            return f(e)
    return result
When the following input is given:
print f([1, [[2, 'a'], ['a','b']], (3, 4)])
The output is:
[2, 'a']
I am having trouble trying to understand how this function actually works or what it is doing. Shouldn't the function eventually be adding every string or int into the result list? I just need help with trying to understand how this function "winds up" and "unwinds"
I feel like the output should be:
[1,2,'a','a','b',3,4]
Any help would be appreciated thanks!
The function f returns a (shallow) copy of the first flat list it encounters with depth first search.
Why? Well, first let us take a look at the base case: a list that contains no lists, like [1,'a',2,5]. In that case the if statement will always succeed, and therefore all elements of L will be added to result, and result is returned.
Now what about the recursive case? This means there is an element that is a list, for instance [1,['a',2],5]. For the first element the if succeeds, so 1 is added to the result list. But for the second element, ['a',2], the if fails. This means we perform a recursive call on f with ['a',2]. Since that list does not contain any sublists, we know it will return a copy of that list.
Note however that we immediately return the result of that recursive call. So from the moment we take the else branch, the result is of no importance anymore: we will return what that f(e) returns.
If we assume that we cannot construct infinitely deep nesting of sublists (actually we can, with a list that contains itself, but in that case Python will raise a maximum-recursion-depth error), we will eventually reach a flat list and return a copy of it.
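The infinitely nested case can be reproduced with a list that contains itself. The sketch below assumes Python 3, where the error is a RecursionError; the names cyclic and hit_limit are only for illustration.

```python
def f(L):
    result = []
    for e in L:
        if type(e) != list:
            result.append(e)
        else:
            return f(e)
    return result

cyclic = []
cyclic.append(cyclic)      # the list now contains itself

try:
    f(cyclic)              # every call finds a list and recurses again
    hit_limit = False
except RecursionError:     # raised once the interpreter's depth limit is hit
    hit_limit = True
```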
Example: If we take your sample input [1, [[2, 'a'], ['a','b']], (3, 4)]. We can trace the calls. So we first call f on that list, it will generate the following "trace":
# **trace** of an example function call
f([1, [[2, 'a'], ['a','b']], (3, 4)]):
    result = []
    # for 1 in L:
    # if type(1) == list: # fails
    # else
    result.append(1) # result is now [1]
    # for [[2,'a'],['a','b']] in L:
    # if type([[2,'a'],['a','b']]) == list: succeeds
    return f([[2,'a'],['a','b']])
        result = []
        # for [2,'a'] in L:
        # if type([2,'a']) == list: succeeds
        return f([2,'a'])
            result = []
            # for 2 in L:
            # if type(2) == list: fails
            # else:
            result.append(2) # result is now [2]
            # for 'a' in [2,'a']:
            # if type('a') == list: fails
            # else:
            result.append('a') # result is now [2,'a']
            return [2,'a']
        return [2,'a']
    return [2,'a']
Flattening:
If you wanted to flatten the list instead of returning the first flat list, you could rewrite the code as:
def f(L):
    result = []
    for e in L:
        if type(e) != list:
            result.append(e)
        else:
            result += f(e)
    return result
Note that this will only flatten lists (and not even subclasses of lists).
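If tuples and list subclasses should be flattened as well, one possible variant uses isinstance instead of comparing types. This is a sketch under that assumption, not part of the course material; the name flatten is made up here.

```python
def flatten(L):
    # isinstance also accepts subclasses of list and tuple.
    result = []
    for e in L:
        if isinstance(e, (list, tuple)):
            result += flatten(e)
        else:
            result.append(e)
    return result

flat = flatten([1, [[2, 'a'], ['a', 'b']], (3, 4)])
print(flat)   # [1, 2, 'a', 'a', 'b', 3, 4]
```

With this variant the sample input produces the output the question expected, because the tuple (3, 4) is expanded too.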
From your suggested answer I see that you understand the idea of the code: it digs deeper and deeper until it finds an element. But look at what happens on the way back up to the upper levels:
When it reaches the deepest point for the first time (the elements of the [2,'a'] list), it finishes the loop on this level and returns the results 2 and 'a'. And at each level above there is a return statement, which means the loop there is stopped, and thus no other elements are visited.
The open question is now: why does 1 not show up in the result? For the same reason: what is returned is the result of the lower level ([2, 'a']), not the result list of the upper level. If you changed result to a global variable, the outcome would be [1, 2, 'a'].
best regards
The function as posted returns as soon as it reaches, going depth first, the first list that does not itself contain a list. This prevents it from walking any of the remaining branches of the recursion. For example:
print( f([1, [[2, 'a', [3, 'b', [4, 'c']]], ['a','b']], (3, 4)]) )
# gives: [4, 'c']
print( f([1, ['X','Y'], [[2, 'a', [3, 'b', [4, 'c']]], ['a','b']], (3, 4)]) )
# gives: ['X','Y']
The key point causing this behaviour is the line
result = []
This resets the list with results on each call of the function to an empty list. This way only one item is returned up from the chain of the recursion calls.
By the way the function f below does what you have expected, doesn't it?
def f(L, result):
    for e in L:
        if type(e) != list:
            result.append(e)
        else:
            f(e, result)
result=[]; f([1, [[2, 'a', [3, 'b', [4, 'c']]], ['a','b']], (3, 4)], result); print( result )
# gives: [1, 2, 'a', 3, 'b', 4, 'c', 'a', 'b', (3, 4)]
result=[]; f( [1, ['X','Y'], [[2, 'a', [3, 'b', [4, 'c']]], ['a','b']], (3, 4)], result); print( result )
# gives: [1, 'X', 'Y', 2, 'a', 3, 'b', 4, 'c', 'a', 'b', (3, 4)]
NOTICE: (3,4) is a TUPLE not a list ...
The function f above collects items from a list if these items are not lists themselves. In the special case where an item in the list is a list, the function calls itself to collect the items from that list. This way every element is collected all the way down the hierarchy, no matter how deep one needs to dig. This is the beauty of recursion: a function calling itself does the 'magic' of visiting all of the branches and their leaves down a tree :)
The best answer to "What is the most “pythonic” way to iterate over a list in chunks?" uses the function izip_longest to chunk a list, but I cannot understand it.
from itertools import izip_longest

def grouper(iterable, n, fillvalue=None):
    args = [iter(iterable)] * n
    return izip_longest(*args, fillvalue=fillvalue)

for item in grouper(range(10), 4):
    print list(item)
When I run the code above, the chunked lists are created:
[0, 1, 2, 3]
[4, 5, 6, 7]
[8, 9, None, None]
I tried to run it step by step:
In [1]: args = [iter(range(10))] * 4
In [2]: args
Out[2]:
[<listiterator at 0x1ad7610>,
<listiterator at 0x1ad7610>,
<listiterator at 0x1ad7610>,
<listiterator at 0x1ad7610>]
The list contains the same iterator four times. I know that izip_longest is implemented to zip its arguments together into tuples. How does izip_longest turn this one iterator into the chunked lists? Thanks.
The grouper function just zips the original iterable with offset versions of itself. Using [iter(iterable)] * n creates a list with n references to the same iterator. These are not independent copies; they are all references to the same object, so advancing one advances them all. Here's a simple example:
>>> x = [1, 2, 3]
>>> a, b = [iter(x)] * 2
>>> next(a)
1
>>> next(b)
2
izip_longest, like zip, takes one element at a time from each iterator. So first it grabs the first element from the first iterable in args, which will be the first element of the original iterable. But when it grabs this element, it advances all the iterators, because they are all linked. So when izip_longest goes to get an element from the next iterator, it will get the second element from the original iterable. It goes on like this; every time it grabs an element from one iterator, it advances all of them, so that the item it grabs from the next iterator will be the next item from the original iterable.
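The effect of sharing one iterator can be seen by comparing it with independent iterators. The sketch below uses Python 3, where izip_longest is called zip_longest; the variable names chunks, independent and pairs are only for illustration.

```python
from itertools import zip_longest  # Python 3 name for izip_longest

def grouper(iterable, n, fillvalue=None):
    args = [iter(iterable)] * n          # n references to ONE iterator
    return zip_longest(*args, fillvalue=fillvalue)

chunks = [list(chunk) for chunk in grouper(range(10), 4)]
print(chunks)  # [[0, 1, 2, 3], [4, 5, 6, 7], [8, 9, None, None]]

# With independent iterators, each tuple just repeats the same element.
independent = [iter(range(3)) for _ in range(2)]
pairs = list(zip_longest(*independent))
print(pairs)   # [(0, 0), (1, 1), (2, 2)]
```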
I'm new to Python and programming. Can someone explain the following code in detail?
def myzip(*seqs):
    seqs = [list(S) for S in seqs]
    res = []
    while all(seqs):
        res.append(tuple(S.pop(0) for S in seqs))
    return res
>>> myzip([1, 2, 3], ['a', 'b', 'c'])
[(1, 'a'), (2, 'b'), (3, 'c')]
In particular, I don't understand whether S stands for an element of a list (e.g. 1, 2, ...) or for a whole list ([1, 2, 3]).
I think I need a detailed explanation of each line.
In the list comprehension, S is assigned each of the arguments passed to the function; seqs is a list of arguments passed in, and you passed in two lists. So S is first bound to [1, 2, 3] then ['a', 'b', 'c'].
>>> seqs = [[1, 2, 3], ['a', 'b', 'c']]
>>> seqs[0]
[1, 2, 3]
The first line just makes sure that all arguments are turned into lists, explicitly, so that you can later on call list.pop(0) on each. This allows you to pass in strings, tuples, dictionaries, or any other iterable as an argument to this function:
>>> myzip('123', 'abc')
[('1', 'a'), ('2', 'b'), ('3', 'c')]
The while all(seqs): loop then iterates until there is at least one argument that is empty. In other words, the loop terminates when the shortest sequence has been exhausted:
>>> myzip('1', 'abc')
[('1', 'a')]
In the loop, the first element of each of the input arguments is removed from the list and added to res as a tuple. For the 2 input lists, that means that first (1, 'a') is added to res, followed by (2, 'b') then (3, 'c').
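To watch the loop at work, here is a traced variant that records what is left in seqs after every pass. The function myzip_traced and the states list are additions for illustration only; the logic is otherwise the same as myzip.

```python
def myzip_traced(*seqs):
    # Same logic as myzip, plus a snapshot of seqs after each pass.
    seqs = [list(S) for S in seqs]
    res, states = [], []
    while all(seqs):                      # stops once any list is empty
        res.append(tuple(S.pop(0) for S in seqs))
        states.append([list(S) for S in seqs])
    return res, states

res, states = myzip_traced([1, 2, 3], 'ab')
print(res)     # [(1, 'a'), (2, 'b')]
print(states)  # [[[2, 3], ['b']], [[3], []]]
```

After the second pass the list built from 'ab' is empty, so all(seqs) is false and the leftover 3 is never used, which matches what the built-in zip does.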
seqs is the list of two separate lists: [1,2,3] and ['a', 'b', 'c']
Now while all(seqs): keeps looping for as long as every list in seqs (the two lists mentioned above) is non-empty.
Before the loop we create an empty list res, and inside the loop we append tuple objects to it.
Each tuple contains the current first element of each of the lists in seqs. pop(0) returns the first element and removes it from the list, thus changing the list in place (lists are mutable).
Thus what you are doing is creating a list of tuples obtained by pairing the corresponding elements of both lists.
When you say seqs = [list(S) for S in seqs], S refers to each element of seqs in turn. In this particular call to the function you are already passing lists, so no type conversion is needed; the statement still makes copies, so the pop(0) calls do not modify the caller's lists.
First you need to know what the zip function is, because this function does the same job as zip in Python.
def myzip(*seqs):
The first line says that this function accepts as many arguments as you want, and all of them will be gathered into one tuple named seqs. A call like myzip([1, 2, 3], ['a', 'b', 'c']) gives you seqs = ([1, 2, 3], ['a', 'b', 'c']).
seqs = [list(S) for S in seqs]
Then you want to make sure every item in seqs is a list. This line converts every item to a list, which is what list does (it even turns '123' into ['1', '2', '3']).
res = []
while all(seqs):
    res.append(tuple(S.pop(0) for S in seqs))
return res
These four lines pop the first element of each S in seqs and build a tuple from them for the final result. The final result is a list (res = []).
In the loop condition, all(seqs) checks whether every element of seqs still has items left. If one of them becomes empty, the loop ends.
Inside the loop, S.pop(0) removes the first element from S and returns it. This way every element of seqs is updated for the next pass of the loop.
tuple creates a tuple like (1, 'a') out of all the first elements. The next iteration produces (2, 'b'), because the previous first elements have already been popped.
Collecting all these tuples in the list res is the goal. res.append adds each tuple to the final result.
I have a generator function and want to get the first ten items from it; my first attempt was:
my_generator()[:10]
This doesn't work because generators aren't subscriptable, as the error tells me. Right now I have worked around that with:
list(my_generator())[:10]
This works since it converts the generator to a list; however, it's inefficient and defeats the point of having a generator. Is there some built-in, Pythonic equivalent of [:10] for generators?
import itertools
itertools.islice(my_generator(), 10)
itertools has a number of utilities for working with iterators. islice takes start, stop, and step arguments to slice an iterator just as you would slice a list.
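A short sketch of the start, stop and step forms follows. The squares generator is a made-up example; any generator would do.

```python
from itertools import count, islice

def squares():
    # Hypothetical infinite generator, used only for illustration.
    for n in count():
        yield n * n

first_ten = list(islice(squares(), 10))          # like [:10]
middle = list(islice(squares(), 2, 5))           # like [2:5]
every_third = list(islice(squares(), 0, 10, 3))  # like [0:10:3]

print(middle)       # [4, 9, 16]
print(every_third)  # [0, 9, 36, 81]
```

Unlike list slicing, islice does not accept negative values for start, stop or step, and it returns an iterator, so wrap it in list() if you need an actual list.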
to clarify the above comments:
from itertools import islice
def fib_gen():
    a, b = 1, 1
    while True:
        yield a
        a, b = b, a + b
assert [1, 1, 2, 3, 5] == list(islice(fib_gen(), 5))