According to the documentation, zip(*iterables) in Python 3:
Returns an iterator of tuples, where the i-th tuple contains the i-th element from each of the argument sequences or iterables. The iterator stops when the shortest input iterable is exhausted.
As an example, I am running
for x in zip(a,b):
    f(x)
Is there a way to find out which of the iterables, a or b, led to the stopping of the zip iterator?
Assume that len() is not reliable and iterating over both a and b to check their lengths is not feasible.
I found the following solution which replaces zip with a for loop over only the first iterable and iterates over the second one inside the loop.
ib = iter(b)
for r in a:
    try:
        s = next(ib)
    except StopIteration:
        print('Only b exhausted.')
        break
    print((r,s))
else:
    try:
        s = next(ib)
        print('Only a exhausted.')
    except StopIteration:
        print('a and b exhausted.')
Here ib = iter(b) makes sure that it also works if b is a sequence or generator object. print((r,s)) stands in for the f(x) call from the question, with x being the tuple (r,s).
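The for/else pattern above can also be wrapped in a generator so the loop body stays separate from the bookkeeping. This is only a sketch: zip_report and collect are hypothetical names, not part of the original answer, and the result is passed back through the generator's return value. Note that the final probe consumes one extra item from b when a runs out first.

```python
def zip_report(a, b):
    """Yield pairs like zip(a, b). The generator's return value names what
    ran out: 'a', 'b', or 'both' (hypothetical helper)."""
    ib = iter(b)
    for r in a:
        try:
            s = next(ib)
        except StopIteration:
            return 'b'  # b ran out while a still had an item
        yield (r, s)
    # a ran out; probe b once to see whether it is empty as well
    try:
        next(ib)
    except StopIteration:
        return 'both'
    return 'a'


def collect(a, b):
    """Drive zip_report by hand so the return value can be read."""
    pairs = []
    gen = zip_report(a, b)
    while True:
        try:
            pairs.append(next(gen))
        except StopIteration as stop:
            return pairs, stop.value


print(collect([1, 2, 3], [10, 20]))
```

Inside another generator, `result = yield from zip_report(a, b)` would give the same return value without the manual loop.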
I think Jan has the best answer. Basically, you want to handle the last iteration from zip separately.
import itertools as it

a = (x for x in range(5))
b = (x for x in range(3))
iterables = (it.chain(g, [f"generator {i} was exhausted"]) for i, g in enumerate([a, b]))
for i, j in zip(*iterables):
    print(i, j)
# 0 0
# 1 1
# 2 2
# 3 generator 1 was exhausted
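If the marker should not show up in the loop body, the same chain trick can be hidden inside a generator. The sketch below is an assumption-laden variant, not the answer's code: _End and zip_marked are hypothetical names, and the caller passes in a list that receives the indices of the exhausted iterables.

```python
from itertools import chain


class _End:
    """Hypothetical marker placed after the last item of one iterable."""

    def __init__(self, index):
        self.index = index


def zip_marked(exhausted, *iterables):
    """Yield tuples like zip(). When iteration stops, `exhausted` receives the
    indices of the iterables that ran out at the stopping position."""
    marked = [chain(it, [_End(i)]) for i, it in enumerate(iterables)]
    for row in zip(*marked):
        ended = [x.index for x in row if isinstance(x, _End)]
        if ended:
            exhausted.extend(ended)
            return  # stop before the marker row reaches the caller
        yield row


out = []
for pair in zip_marked(out, range(5), range(3)):
    print(pair)
print(out)
```

As with the string version, one extra item is pulled from the longer iterables when the stop is detected.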
If you have only two iterables, you can use the code below. exhausted[0] will hold the indicator for which iterator was exhausted first. If the list stays empty, both were exhausted.
However, I must say that I do not agree that len() is unreliable. In fact, you should depend on the len() call to determine the answer (unless you tell us the reason why you cannot).
def f(val):
    print(val)

def manual_iter(a,b, exhausted):
    iters = [iter(it) for it in [a,b]]
    iter_map = {}
    iter_map[iters[0]] = 'first'
    iter_map[iters[1]] = 'second'
    while 1:
        values = []
        for i, it in enumerate(iters):
            try:
                value = next(it)
            except StopIteration:
                if i == 0:
                    try:
                        next(iters[1])
                    except StopIteration:
                        return None
                exhausted.append(iter_map[it])
                return iter_map[it]
            values.append(value)
        yield tuple(values)

if __name__ == '__main__':
    exhausted = []
    a = [1,2,3]
    b = [10,20,30]
    for x in manual_iter(a,b, exhausted):
        f(x)
    print(exhausted)
    exhausted = []
    a = [1,2,3,4]
    b = [10,20,30]
    for x in manual_iter(a,b, exhausted):
        f(x)
    print(exhausted)
    exhausted = []
    a = [1,2,3]
    b = [10,20,30,40]
    for x in manual_iter(a,b, exhausted):
        f(x)
    print(exhausted)
Below is a function I wrote, zzip(), which does what you want to achieve. It uses zip_longest from the itertools module and returns a tuple containing what zip would return plus a list of indices. If that list is not empty, it gives the 0-based position(s) of the iterable(s) that became exhausted before the others:
def zzip(*args):
    """ Returns a tuple with the result of zip(*args) as list and a list
    with ZERO-based indices of iterables passed to zzip which got
    exhausted before other ones. """
    from itertools import zip_longest
    nanNANaN = 'nanNANaN'
    Zipped = list(zip_longest(*args, fillvalue=nanNANaN))
    ZippedT = list(zip(*Zipped))
    Indx_exhausted = []
    indx_nanNANaN = None
    for i in range(len(args)):
        try: # gives ValueError if nanNANaN is not in the column
            indx_nanNANaN = ZippedT[i].index(nanNANaN)
            Indx_exhausted += [(indx_nanNANaN, i)]
        except ValueError:
            pass
    if Indx_exhausted: # list not empty, iterables were not same length
        Indx_exhausted.sort()
        min_indx_nanNANaN = Indx_exhausted[0][0]
        Indx_exhausted = [
            i for n, i in Indx_exhausted if n == min_indx_nanNANaN ]
        return (Zipped[:min_indx_nanNANaN], Indx_exhausted)
    else:
        return (Zipped, Indx_exhausted)

assert zzip(iter([1,2,3]),[4,5],iter([6])) ==([(1,4,6)],[2])
assert zzip(iter([1,2]),[3,4,5],iter([6,7]))==([(1,3,6),(2,4,7)],[0,2])
assert zzip([1,2],[3,4],[5,6]) ==([(1,3,5),(2,4,6)],[])
The code above runs without raising an assertion error on the used test cases.
Notice that the 'for loop' in the function loops over the items of the passed parameter list and not over the elements of the passed iterables.
I'm working on a problem from stuy's coding problems and came across this one.
So given two generators that each output numbers in increasing order, merge the two generators into one generator that outputs the numbers in increasing order. If duplicates occur, output the number as many times as it occurs.
My attempt: Since I'm more familiar with working with lists, tuples, dictionaries, etc., I thought I'd just make a helper to create a list of the items in the generators. Then I'd merge the two lists and sort them:
def list_maker(gener):
    l1 = []
    for item in gener:
        l1.append(item)
    return l1

def merge_gens(first_gen, second_gen):
    first_list = list_maker(first_gen)
    second_list = list_maker(second_gen)
    first_list.extend(second_list)
    final_list = first_list
    final_list.sort()
    yield from final_list
Although this approach seems to work on finite generators, it does not work on infinite generators (which I forgot to account for). I obviously can't have a list of infinitely many items. Could I get help on how to do this without importing Python libraries?
You can try:
def merge(first, second):
    a = next(first)
    b = next(second)
    while True:
        # yield the smaller one
        yield a if a < b else b
        # get the next number from the
        # generator that yielded the smaller one
        if a < b:
            a = next(first)
        elif a == b:
            # when the numbers are equal
            # yield second number a second time
            yield a
            # get the next numbers from both the generators.
            a = next(first)
            b = next(second)
        else:
            b = next(second)
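Sorry for the lack of comments and explanation. I haven't tested edge cases. I hope this gives you the general gist of the approach and some pointers for working on your task further.
Assumption
- StopIteration exceptions will be handled by the caller. Note that since Python 3.7 (PEP 479), a StopIteration raised inside a generator body surfaces as a RuntimeError, so in practice this version only suits generators that never run out.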
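One caveat about that assumption is worth checking on Python 3.7 and later: under PEP 479, a StopIteration that escapes from a next() call inside a generator body is converted to a RuntimeError. The sketch below repeats the merge function from this answer and adds a hypothetical drain helper to show what a caller would actually see with finite inputs.

```python
def merge(first, second):
    a = next(first)
    b = next(second)
    while True:
        yield a if a < b else b
        if a < b:
            a = next(first)
        elif a == b:
            yield a
            a = next(first)
            b = next(second)
        else:
            b = next(second)


def drain(gen):
    """Collect items until the generator finishes or fails (hypothetical helper)."""
    items = []
    try:
        for item in gen:
            items.append(item)
    except RuntimeError as err:
        # Python 3.7+: StopIteration inside a generator becomes RuntimeError
        return items, str(err)
    return items, None


items, error = drain(merge(iter([1, 3]), iter([2, 4])))
print(items, error)
```

The pending value from the other generator (4 in this example) is never yielded, which is why the sentinel-based versions handle exhaustion explicitly.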
Handling the edge cases was a bit tricky; I had fun with this. I haven't tested it fully, and it's in a pretty verbose state right now. A couple of helper functions could add clarity:
def merge(first, second):
    first = iter(first)
    second = iter(second)
    exhausted = object()
    f = next(first, exhausted)
    if f is exhausted:
        yield from second
        return  # without this, the sentinel itself would be yielded below
    s = next(second, exhausted)
    if s is exhausted:
        yield f
        yield from first
        return
    while True:
        if f is exhausted:
            if s is not exhausted:
                yield s
            yield from second
            return
        elif s is exhausted:
            if f is not exhausted:
                yield f
            yield from first
            return
        elif f < s:
            yield f
            f = next(first, exhausted)
        elif f == s:
            yield f
            yield s
            f = next(first, exhausted)
            s = next(second, exhausted)
        else:
            yield s
            s = next(second, exhausted)
I think the following makes it more readable by removing some of the deeper nesting and re-using logic:
def merge(first, second):
    first = iter(first)
    second = iter(second)
    exhausted = object() # just a unique sentinel value

    def _cleanup(item, iterator):
        if item is not exhausted:
            yield item
        yield from iterator

    f = next(first, exhausted)
    if f is exhausted:
        yield from second
    s = next(second, exhausted)
    if s is exhausted:
        yield from _cleanup(f, first)
        return
    while True:
        if f is exhausted:
            yield from _cleanup(s, second)
            return
        elif s is exhausted:
            yield from _cleanup(f, first)
            return
        elif f < s:
            yield f
            f = next(first, exhausted)
        elif f == s:
            yield f
            yield s
            f = next(first, exhausted)
            s = next(second, exhausted)
        else:
            yield s
            s = next(second, exhausted)
The key idea is to keep asking each iterator for a value and to yield the smaller item (or both items if they are equal), then draw the next value only from the iterator that supplied the item just yielded (or from both if they were equal). Once one iterator is exhausted, you clean up by delegating to the other.
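To see the idea work on infinite generators, here is a condensed sketch. It is not the code from this answer: it treats ties with a single `<=` comparison instead of a separate branch, and it uses itertools.count and islice only to build and sample the test inputs.

```python
from itertools import count, islice


def merge(first, second):
    """Lazily merge two non-decreasing iterables (condensed sketch)."""
    first, second = iter(first), iter(second)
    done = object()  # unique sentinel marking an exhausted iterator
    f, s = next(first, done), next(second, done)
    while f is not done and s is not done:
        if f <= s:
            yield f
            f = next(first, done)
        else:
            yield s
            s = next(second, done)
    # one side ran out: emit the pending item, then the rest of the other side
    if f is not done:
        yield f
        yield from first
    if s is not done:
        yield s
        yield from second


evens = count(0, 2)        # 0, 2, 4, 6, ...
every_third = count(1, 3)  # 1, 4, 7, 10, ...
print(list(islice(merge(evens, every_third), 8)))
# [0, 1, 2, 4, 4, 6, 7, 8]
```

Because nothing is collected into a list, only as many items are pulled from each generator as the consumer asks for.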
Is it possible to use a generator or iterator in a while loop in Python? For example, something like:
i = iter(range(10))
while next(i):
    # your code
The point of this would be to build iteration into the while loop statement, making it similar to a for loop, with the difference being that you can now add additional logic to the while statement:
i = iter(range(10))
while next(i) and {some other logic}:
    # your code
It then becomes a nice for loop/while loop hybrid.
Does anyone know how to do this?
In Python >= 3.8, you can do the following, using assignment expressions:
i = iter(range(10))
while (x := next(i, None)) is not None and x < 5:
    print(x)
In Python < 3.8 you can use itertools.takewhile:
from itertools import takewhile
i = iter(range(10))
for x in takewhile({some logic}, i):
    # do stuff
"Some logic" here would be a 1-arg callable receiving whatever next(i) yields:
for x in takewhile(lambda e: 5 > e, i):
    print(x)
0
1
2
3
4
There are two problems with while next(i):
- Unlike a for loop, the while loop will not catch the StopIteration exception that is raised if there is no next value; you could use next(i, None) to return a "falsey" value in that case, but then the while loop will also stop whenever the iterator returns an actual falsey value.
- The value returned by next will be consumed and no longer available in the loop's body. (In Python 3.8+, that could be solved with an assignment expression, see other answer.)
Instead, you could use a for loop with itertools.takewhile, testing the current element from the iterable, or just any other condition. This will loop until either the iterable is exhausted, or the condition evaluates to false.
from itertools import takewhile
i = iter(range(10))
r = 0
for x in takewhile(lambda x: r < 10, i):
    print("using", x)
    r += x
print("result", r)
Output:
using 0
...
using 4
result 10
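If a plain while loop is still preferred, both problems can be sidestepped with a unique sentinel object as the default for next(), combined with an assignment expression. This is a sketch for Python 3.8+; take_while_loop is a hypothetical name and the `x < limit` test stands in for whatever extra logic is needed.

```python
def take_while_loop(iterable, limit):
    """Collect items with a while loop until exhaustion or until an item
    reaches `limit` (hypothetical helper)."""
    it = iter(iterable)
    done = object()  # unique sentinel; no real item can be this object
    seen = []
    while (x := next(it, done)) is not done and x < limit:
        seen.append(x)  # falsy items such as 0 do not end the loop
    return seen


print(take_while_loop([0, 0, 1, 2, 7, 3], 5))
# [0, 0, 1, 2]
```

The identity check against the sentinel is what lets falsy values like 0 pass through, which a bare `while next(i, None):` would not.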
You just need to arrange for your iterator to return a false-like value when it expires. E.g., if we reverse the range so that it counts down to 0:
>>> i = iter(range(5, -1, -1))
>>> while val := next(i):
...     print('doing something here with value', val)
...
This will result in:
doing something here with value 5
doing something here with value 4
doing something here with value 3
doing something here with value 2
doing something here with value 1
a = iter(range(10))
try:
    next(a)
    while True:
        print(next(a))
except StopIteration:
    print("Stop iteration")
You can do:
a = iter(range(10))
try:
    next(a)
    while True and {True or False logic}:
        print("Bonjour")
        next(a)
except StopIteration:
    print("next(a) Stop iteration")
So I'm trying to figure out this problem and I can't figure out why it isn't working.
The premise is that you're given an input list and you have to find the second-lowest value. The list can have any number of integers and can repeat values; you can't change the list.
My code:
def second_min(x):
    input_list = list(x)
    print input_list
    list_copy = list(input_list)
    list_set = set(list_copy)
    if len(list_set) > 1:
        list_copy2 = list(list_set)
        list_copy2 = list_copy2.sort()
        return list_copy2[1]
    else:
        return None

print second_min([4,3,1,5,1])
print second_min([1,1,1])
The expected outputs for those two inputs are:
3
None
It's giving me errors on lines 9 and 13.
TypeError: 'NoneType' object has no attribute '__getitem__'
Thanks!
list_copy2 = list_copy2.sort()
.sort() sorts the list in place and returns None. So you're sorting the list, then throwing it away. You want just:
list_copy2.sort()
Or:
list_copy2 = sorted(list_set)
sorted always returns a list, so you can use it to sort the set and convert it to a list in one step!
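Put together, the one-step version might look like the sketch below. It is written in Python 3 syntax (the question uses Python 2 print statements) and assumes, as the question's use of set suggests, that "second-lowest" means the second-lowest distinct value.

```python
def second_min(values):
    """Return the second-lowest distinct value, or None if there is none."""
    distinct = sorted(set(values))  # set() drops duplicates, sorted() returns a new list
    return distinct[1] if len(distinct) > 1 else None


print(second_min([4, 3, 1, 5, 1]))  # 3
print(second_min([1, 1, 1]))        # None
```

The input list is never modified, since both set() and sorted() build new objects.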
You need to use sorted instead of sort. sorted returns a new list, that is a sorted version of the original. sort will sort the list in-place, and returns None upon doing so.
def second_min(x):
    if len(x) > 1:
        return sorted(x)[1]
    else:
        return None
>>> second_min([4,3,1,5,1])
1
Help, I can't use sorted! It's not allowed!
def second_min(li):
    if len(li) < 2:
        return None
    it = iter(li)
    a, b = next(it), next(it)
    next_lowest, lowest = max(a, b), min(a, b)
    for x in it:
        if x < next_lowest:
            if x < lowest:
                lowest, next_lowest = x, lowest
            else:
                next_lowest = x
    return next_lowest
My code recreates the function filter() and uses it with a function that filters words longer than 5 characters. It worked with the actual filter function when I tried it, by the way. I'm using Python 3+.
def filter1(fn, a):
    i = 0
    while i != len(a):
        u = i - 1
        a[i] = fn(a[i], a[u])
        i += 1
    return a

def filter_long_words(l):
    if len[l] > 5:
        return [l]

listered = ['blue', 'hdfdhsf', 'dsfjbdsf', 'jole']
print(list(filter1(filter_long_words, listered)))
getting error
TypeError: filter_long_words() takes 1 positional argument but 2 were given
You are passing two parameters to fn (which refers to filter_long_words) here:
a[i] = fn(a[i], a[u])
But filter_long_words only accepts one parameter.
Notes:
- You can loop through lists using for item in my_list, or, if you want the index as well, for index, item in enumerate(my_list).
- u will be -1 in the first round of your loop. That does not raise an IndexError; a[-1] silently refers to the last element of the list, which is probably not what you intended.
- The filter function can also be expressed as a generator expression: (item for item in listered if filter_long_words(item))
My version of filter would look like this, if I have to use a for loop:
def my_filter(fn, sequence):
    if fn is None:
        fn = lambda x: x
    for item in sequence:
        if fn(item):
            yield item
Since you have stated that you are using Python 3, this returns a generator instead of a list. If you want it to return a list:
def my_filter(fn, sequence):
    if fn is None:
        fn = lambda x: x
    acc = []
    for item in sequence:
        if fn(item):
            acc.append(item)
    return acc
If you don't need to use a for loop:
def my_filter(fn, sequence):
    if fn is None:
        fn = lambda x: x
    return (item for item in sequence if fn(item))
You're calling fn with 2 parameters in filter1(fn, a), and since you've passed filter_long_words to filter1 as fn, that triggers the error.
But there's more weird stuff:
I don't understand the magic of filter1 or what you were trying to accomplish, but it seems to me that you don't have a clear idea of what to do.
But if you want to mimic (somehow) how filter works, you have to return a list which contains only the items for which the fn function returns true. Once you know this, you can rewrite it. Here are a few suggestions for a rewrite:
# explicit, inefficient and long, but straightforward version:
def filter1(fn, a):
    new_list = []
    for item in a:
        if fn(item):
            new_list.append(item)
    return new_list

# shorter version using list comprehensions:
def filter1(fn, a):
    return [item for item in a if fn(item)]
The filter_long_words function is wrong too: it should return True or False. The only reason it could work is that Python treats any non-empty list as true, and the default return value of a function is None, which counts as false. That is confusing, and len[l] is simply wrong: it raises a TypeError, because len is a function and has to be called as len(l).
Here are a few suggestions for a rewrite, all of which return explicit boolean values:
# unnecessarily long, but self-explanatory:
def filter_long_words(l):
    if len(l) > 5:
        return True
    else:
        return False

# short variant
def filter_long_words(l):
    return len(l) > 5
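As a quick check, the short variants of both functions can be run against the list from the question and compared with the built-in filter. This is a minimal sketch; the parameter names items and word are chosen here for readability.

```python
def filter1(fn, items):
    """Return a new list with the items for which fn(item) is true."""
    return [item for item in items if fn(item)]


def filter_long_words(word):
    """True for words longer than 5 characters."""
    return len(word) > 5


listered = ['blue', 'hdfdhsf', 'dsfjbdsf', 'jole']
result = filter1(filter_long_words, listered)
print(result)
# ['hdfdhsf', 'dsfjbdsf']

# the built-in filter gives the same items (as an iterator in Python 3)
print(result == list(filter(filter_long_words, listered)))
```

Unlike the original filter1, this version builds a new list and leaves the input list untouched.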
You are calling filter_long_words with 2 parameters, fn(a[i], a[u]). There is also an error here:
def filter_long_words(l):
    if len[l] > 5:   # should be len(l)
        return [l]
len is a built-in function, so it should be called as len(l).
I am trying to use iterators more for looping since I heard it is faster than index looping. One thing I am not sure about is how to treat the end of the sequence nicely. The way I can think of is to use try and except StopIteration, which looks ugly to me.
To be more concrete, suppose we are asked to print the merged sorted list of two sorted lists a and b. I would write the following
aNull = False
I = iter(a)
try:
    tmp = I.next()
except StopIteration:
    aNull = True

for x in b:
    if aNull:
        print x
    else:
        if x < tmp:
            print x
        else:
            print tmp,x
            try:
                tmp = I.next()
            except StopIteration:
                aNull = True

while not aNull:
    print tmp
    try:
        tmp = I.next()
    except StopIteration:
        aNull = True
How would you code it to make it neater?
I think handling a and b more symmetrically would make it easier to read. Also, using the built-in next function in Python 2.6 with a default value avoids the need to handle StopIteration:
def merge(a, b):
    """Merges two iterators a and b, returning a single iterator that yields
    the elements of a and b in non-decreasing order. a and b are assumed to each
    yield their elements in non-decreasing order."""
    done = object()
    aNext = next(a, done)
    bNext = next(b, done)
    while (aNext is not done) or (bNext is not done):
        if (bNext is done) or ((aNext is not done) and (aNext < bNext)):
            yield aNext
            aNext = next(a, done)
        else:
            yield bNext
            bNext = next(b, done)

for i in merge(iter(a), iter(b)):
    print i
The following function generalizes the approach to work for arbitrarily many iterators.
def merge(*iterators):
    """Merges a collection of iterators, returning a single iterator that yields
    the elements of the original iterators in non-decreasing order. Each of
    the original iterators is assumed to yield its elements in non-decreasing
    order."""
    done = object()
    n = [next(it, done) for it in iterators]
    while any(v is not done for v in n):
        v, i = min((v, i) for (i, v) in enumerate(n) if v is not done)
        yield v
        n[i] = next(iterators[i], done)
You're missing the whole point of iterators. You don't manually call I.next(), you just iterate through I.
for tmp in I:
    print tmp
Edited
To merge two iterators, use the very handy functions in the itertools module. The one you want is probably izip:
import itertools

merged = []
for x, y in itertools.izip(a, b):
    if x < y:
        merged.append(x)
        merged.append(y)
    else:
        merged.append(y)
        merged.append(x)
Edit again
As pointed out in the comments, this won't actually work, because there could be multiple items from list a that are smaller than the next item in list b. However, I realised that there is another built-in function that deals with this: heapq.merge.
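A minimal sketch of heapq.merge on two sorted lists follows, written in Python 3 syntax (the rest of this thread uses Python 2 print statements). The sample lists a and b are made up for illustration.

```python
import heapq

# both inputs must already be sorted
a = [1, 4, 4, 9]
b = [2, 3, 4, 10, 11]

# heapq.merge returns a lazy iterator over the merged, sorted items
merged = list(heapq.merge(a, b))
print(merged)
# [1, 2, 3, 4, 4, 4, 9, 10, 11]
```

Since the result is an iterator, it can also be consumed with a plain for loop without building the full list first.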
The function sorted works with lists and iterators. Maybe it is not what you desire, but the following code works.
a.extend(b)
print sorted(iter(a))