Python islice class is not working as expected - python

I am writing a function that divides a list into n (almost) equal parts. I want this function to return a generator, but there appears to be an issue when the input is itself a generator. The function works just fine with lists and tuples. Take a look at this snippet:
import itertools
def divide_list(array, n, gen_length=None):
    """
    :param array: some iterable that you wish to divide
    :param n: the number of lists you would like to return
    :param gen_length: The length of the generator if array is a generator. Not necessary for lists and tuples.
    :return: a generator of the divided list
    Example:
    In: list(divide_list([1, 2, 3, 4, 5, 6, 7, 8, 9], 4))
    Out: [[1, 2, 3], [4, 5], [6, 7], [8, 9]]
    """
    if isinstance(array, (list, tuple)):
        floor, rem = divmod(len(array), n)
        items_index = (0, floor)
        for _ in range(n):
            prev, next_ = items_index[0], items_index[1] + 1 if rem > 0 else items_index[1]
            yield array[prev:next_]
            items_index = (next_, next_ + floor)
            rem -= 1
    else:
        floor, rem = divmod(gen_length, n)
        items_index = (0, floor)
        for _ in range(n):
            prev, next_ = items_index[0], items_index[1] + 1 if rem > 0 else items_index[1]
            yield itertools.islice(array, prev, next_)
            items_index = (next_, next_ + floor)
            rem -= 1
if __name__ == '__main__':
    array_ = iter([12, 7, 9, 31, 13, 11, 7, 3])
    print('Generator:')
    print('----------')
    for value in divide_list(array_, 3, gen_length=8):
        print(list(value))
    print('')
    array_ = [12, 7, 9, 31, 13, 11, 7, 3]
    print('List:')
    print('-----')
    for value in divide_list(array_, 3):
        print(value)
Here is the output:
Generator:
----------
[12, 7, 9]
[7, 3]
[]
List:
-----
[12, 7, 9]
[31, 13, 11]
[7, 3]
Why is the last generator exhausted? Sometimes, it exhausts the last two generators.

This fails because every islice call works on the same, partially consumed iterator. Each call with a non-zero start first skips that many items from wherever the iterator currently is, so passing absolute indices discards elements. With an iterator you should only advance by the size of the next chunk at each yield, never skip. This differs from the sequence case, where slicing with explicit indices always counts from the start of the list.
However, note that you don't need to handle these cases differently. Here's a simple approach that handles both; the key is to always use an iterator:
from itertools import islice
def divide(iterable, n, length=None):
    if length is None:
        length = len(iterable)
    it = iter(iterable)
    floor, rem = divmod(length, n)
    # The first `rem` chunks get one extra item; stop once the iterator is empty.
    while result := list(islice(it, floor + bool(rem))):
        yield result
        rem = max(rem - 1, 0)
In the REPL:
>>> from itertools import islice
>>> def divide(iterable, n, length=None):
...     if length is None:
...         length = len(iterable)
...     it = iter(iterable)
...     floor, rem = divmod(length, n)
...     while result := list(islice(it, floor + bool(rem))):
...         yield result
...         rem = max(rem - 1, 0)
...
>>> list(divide([1, 2, 3, 4, 5, 6, 7, 8, 9], 4))
[[1, 2, 3], [4, 5], [6, 7], [8, 9]]
>>> list(divide(iter([1, 2, 3, 4, 5, 6, 7, 8, 9]), 4, length=9))
[[1, 2, 3], [4, 5], [6, 7], [8, 9]]

The problem is that you don't take into account that you have already consumed part of the iterator:
>>> import itertools
>>> array_ = iter([12, 7, 9, 31, 13, 11, 7, 3])
>>> list(itertools.islice(array_,0,3))
[12, 7, 9]
>>> list(itertools.islice(array_,3,6)) # where are 31, 13 and 11? They were skipped; see below
[7, 3]
>>> array_ = iter([12, 7, 9, 31, 13, 11, 7, 3])
>>> list(itertools.islice(array_,0,3))
[12, 7, 9]
>>> list(array_) #this is what remains in the iterator
[31, 13, 11, 7, 3]
>>>

Thanks to @KellyBundy, I was able to modify the code to get the expected results. I was not aware that islice consumes the items it returns, so that whatever remains in the generator starts again at index 0. I had been treating it as if the consumed positions stayed in place as empty slots. Here is the modified code:
import itertools
def divide_list(array, n, gen_length=None):
    """
    :param array: some iterable that you wish to divide
    :param n: the number of lists you would like to return
    :param gen_length: The length of the generator if array is a generator. Not necessary for lists and tuples.
    :return: a generator of the divided list
    Example:
    In: list(divide_list([1, 2, 3, 4, 5, 6, 7, 8, 9], 4))
    Out: [[1, 2, 3], [4, 5], [6, 7], [8, 9]]
    """
    if isinstance(array, (list, tuple)):
        floor, rem = divmod(len(array), n)
        items_index = (0, floor)
        for _ in range(n):
            prev, next_ = items_index[0], items_index[1] + 1 if rem > 0 else items_index[1]
            yield array[prev:next_]
            items_index = (next_, next_ + floor)
            rem -= 1
    else:
        floor, rem = divmod(gen_length, n)
        up_to = floor
        for _ in range(n):
            up_to = up_to + 1 if rem > 0 else up_to
            yield itertools.islice(array, up_to)
            up_to = floor
            rem -= 1
if __name__ == '__main__':
    array_ = iter([12, 7, 9, 31, 13, 11, 7, 3])
    print('Generator:')
    print('----------')
    for value in divide_list(array_, 3, gen_length=8):
        print(list(value))
    print('')
    array_ = [12, 7, 9, 31, 13, 11, 7, 3]
    print('List:')
    print('-----')
    for value in divide_list(array_, 3):
        print(value)
This will output the expected result:
Generator:
----------
[12, 7, 9]
[31, 13, 11]
[7, 3]
List:
-----
[12, 7, 9]
[31, 13, 11]
[7, 3]
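One caveat with yielding islice objects, as the generator branch does: they are lazy and all draw from the same underlying iterator, so they only line up with the intended chunks if each one is consumed in the order it was produced. The sketch below illustrates this with a hypothetical helper, lazy_chunks, which is not part of the code above.

```python
from itertools import islice

def lazy_chunks(iterable, sizes):
    """Yield one lazy islice per requested size, all sharing one iterator."""
    it = iter(iterable)
    for size in sizes:
        yield islice(it, size)

data = [12, 7, 9, 31, 13, 11, 7, 3]

# Consuming each chunk as soon as it is produced gives the intended split.
in_order = [list(chunk) for chunk in lazy_chunks(data, [3, 3, 2])]

# Collecting the lazy chunks first and consuming them in reverse does not:
# each islice simply takes the next items from the shared iterator.
chunks = list(lazy_chunks(data, [3, 3, 2]))
out_of_order = [list(chunk) for chunk in reversed(chunks)]

print(in_order)      # [[12, 7, 9], [31, 13, 11], [7, 3]]
print(out_of_order)  # [[12, 7], [9, 31, 13], [11, 7, 3]]
```

If callers may consume chunks in any order, materializing each chunk with list(...) before yielding it avoids the problem at the cost of laziness.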

Related

Splitting a list on non-sequential numbers

I have an ordered list of entities, numbered in a broken sequence:
[1, 2, 3, 6, 7, 11, 17, 18, 19]
I'd like to break the list where there's a gap, and collect the results in a new list:
[[1, 2, 3], [6, 7], [11], [17, 18, 19]]
I have the feeling there's a name for what I want to do and probably a nice library function for it - but I can't think of it. Can anyone shine some light before I possibly reinvent a wheel?
edit: Thanks, folks, but I was asking if there's a name for this operation and an existing algorithm, not for implementations - this is what I came up with:
def group_adjoining(elements, key=lambda x: x):
    """Returns list of lists of contiguous elements
    :key: function to get key integer from list element
    """
    if not elements:
        return elements
    result = [[elements[0]]]
    for a, b in zip(elements, elements[1:]):
        if key(a) + 1 == key(b):
            result[-1].append(b)
        else:
            result.append([b])
    return result
Plain itertools.groupby approach:
from itertools import groupby
lst = [1, 2, 3, 6, 7, 11, 17, 18, 19]
out = []
for _, g in groupby(enumerate(lst), lambda x: x[0] - x[1]):
    out.append([v for _, v in g])
print(out)
Prints:
[[1, 2, 3], [6, 7], [11], [17, 18, 19]]
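To see why grouping on index minus value works: inside a run of consecutive numbers the index and the value both grow by one at each step, so their difference stays constant, and it changes only where there is a gap. A small sketch that prints the keys next to the resulting groups (the variable names keys and runs are illustrative only):

```python
from itertools import groupby

lst = [1, 2, 3, 6, 7, 11, 17, 18, 19]

# Index minus value: constant inside a consecutive run, different after a gap.
keys = [i - v for i, v in enumerate(lst)]

# Group the (index, value) pairs on that key and keep only the values.
runs = [
    [v for _, v in group]
    for _, group in groupby(enumerate(lst), key=lambda pair: pair[0] - pair[1])
]

print(keys)  # [-1, -1, -1, -3, -3, -6, -11, -11, -11]
print(runs)  # [[1, 2, 3], [6, 7], [11], [17, 18, 19]]
```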
Try greedy approach:
lst = [1, 2, 3, 6, 7, 11, 17, 18, 19]
res = []
tmp = []
prv = lst[0]
for l in lst:
    if l-prv > 1:
        res.append(tmp)
        tmp = []
    tmp.append(l)
    prv = l
res.append(tmp)
print(res)
Output: [[1, 2, 3], [6, 7], [11], [17, 18, 19]]
I first came across more_itertools today, and I think this package is useful for this problem.
pip install more-itertools
from more_itertools import split_when
l = [1, 2, 3, 6, 7, 11, 17, 18, 19]
res = list(split_when(l, lambda a, b: a + 1 != b))
print(res)
You could use a simple generator.
def split(lst):
    result = []
    for item in lst:
        if (not result) or result[-1] + 1 == item:
            result.append(item)
        else:
            yield result
            result = [item]
    if result:
        yield result
foo = [1, 2, 3, 6, 7, 11, 17, 18, 19]
result = [i for i in split(foo)]
print(result) # [[1, 2, 3], [6, 7], [11], [17, 18, 19]]
This assumes a sorted homogeneous list of int.
You could always avoid the sorted assumption with for item in sorted(lst):.
It's pretty easy by using this simple function:
li = [1, 2, 3, 6, 7, 9, 10, 11, 12, 14, 16, 17, 18]
def split(li):
    result = []
    temp = [li[0]]
    for i in range(1, len(li)):
        if li[i] - temp[-1] == 1:
            temp.append(li[i])
        else:
            result.append(temp)
            temp = [li[i]]
    result.append(temp)
    return result
print(split(li))

reverse ascending sequences in a list

Trying to figure out how to reverse multiple ascending sequences in a list.
For instance: input = [1,2,2,3] to output = [2,1,3,2].
I have used mylist.reverse() but of course it reverses to [3,2,2,1]. Not sure which approach to take?
Example in detail:
So let's say [5, 7, 10, 2, 7, 8, 1, 3] is the input; the output should be [10, 7, 5, 8, 7, 2, 3, 1]. In this example the first three elements 5, 7, 10 are in ascending order, 2, 7, 8 is likewise in ascending order, and 1, 3 is also in ascending order. The function should recognize this pattern, reverse each sequence, and return a new list.
All you need is to find all non-decreasing subsequences and reverse them:
In [47]: l = [5, 7, 10, 2, 7, 8, 1, 3]
In [48]: res = []
In [49]: start_idx = 0
In [50]: for idx in range(1, len(l)):
    ...:     if l[idx] >= l[idx - 1]:
    ...:         continue
    ...:     step = l[start_idx:idx]
    ...:     step.reverse()
    ...:     res.extend(step)
    ...:     start_idx = idx
    ...:
In [51]: step = l[start_idx:]
In [52]: step.reverse()
In [53]: res.extend(step)
In [54]: print(res)
[10, 7, 5, 8, 7, 2, 3, 1]
For strictly increasing subsequences, change if l[idx] >= l[idx - 1] to if l[idx] > l[idx - 1].
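The difference between the two comparisons matters for the question's first example, [1, 2, 2, 3], where the repeated 2 should start a new run. A minimal sketch of the same idea as a function, assuming a hypothetical strict flag to switch between non-decreasing and strictly increasing runs:

```python
def reverse_runs(seq, strict=False):
    """Reverse each ascending run of seq and return a new list.

    strict=False treats equal neighbours as part of the same run
    (non-decreasing); strict=True starts a new run on equal neighbours.
    """
    result, run = [], []
    for item in seq:
        # A run ends when the next item is smaller, or equal in strict mode.
        if run and (item < run[-1] or (strict and item == run[-1])):
            result.extend(reversed(run))
            run = []
        run.append(item)
    result.extend(reversed(run))  # flush the final run
    return result

print(reverse_runs([5, 7, 10, 2, 7, 8, 1, 3]))   # [10, 7, 5, 8, 7, 2, 3, 1]
print(reverse_runs([1, 2, 2, 3]))                # [3, 2, 2, 1]
print(reverse_runs([1, 2, 2, 3], strict=True))   # [2, 1, 3, 2]
```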
Walk the list making a bigger and bigger window from x to y positions. When you find a place where the next number is not ascending, or reach the end, reverse-slice the window you just covered and add it to the end of an output list:
data = [5, 7, 10, 2, 7, 8, 1, 3]
output = []
x = None
for y in range(len(data)):
    if y == len(data) - 1 or data[y] >= data[y+1]:
        output.extend(data[y:x:-1])
        x = y
print(output)
There is probably a more elegant way to do this, but one approach would be to use itertools.zip_longest along with enumerate to iterate over sequential element pairs in your list and keep track of each index where the sequence is no longer ascending or the list is exhausted in order to slice, reverse, and extend your output list with the sliced items.
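from itertools import zip_longest
d = [5, 7, 10, 2, 7, 8, 1, 3]
results = []
stop = None
for i, (a, b) in enumerate(zip_longest(d, d[1:])):
    # b is None only for the last element (zip_longest padding);
    # testing "is None" keeps a genuine 0 in the data from ending a run early.
    if b is None or b <= a:
        results.extend(d[i:stop:-1])
        stop = i
print(results)
# [10, 7, 5, 8, 7, 2, 3, 1]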
data = [5, 7, 10, 2, 7, 8, 1, 3, 2]
def func(data):
    result = []
    temp = []
    # Sentinel so the final run is flushed; note this mutates the caller's list.
    data.append(data[-1])
    for i in range(1, len(data)):
        if data[i] >= data[i-1]:
            temp.append(data[i-1])
        else:
            temp.append(data[i-1])
            temp.reverse()
            result.extend(temp)
            temp = []
    if len(temp) != 0:
        temp.reverse()
        result.extend(temp)
        temp.clear()
    return result
print(func(data))
# output [10, 7, 5, 8, 7, 2, 3, 1, 2]
You could define a general handy method which returns slices of an array based on condition (predicate).
def slice_when(predicate, iterable):
    i, x, size = 0, 0, len(iterable)
    while i < size-1:
        if predicate(iterable[i], iterable[i+1]):
            yield iterable[x:i+1]
            x = i + 1
        i += 1
    yield iterable[x:size]
Now, the slice has to be made when the next element is smaller than the previous one, for example:
array = [5, 7, 10, 2, 7, 8, 1, 3]
slices = slice_when(lambda x,y: x > y, array)
print(list(slices))
#=> [[5, 7, 10], [2, 7, 8], [1, 3]]
So you can use it as simply as:
res = []
for e in slice_when(lambda x,y: x > y, array):
    res.extend(e[::-1])
res #=> [10, 7, 5, 8, 7, 2, 3, 1]

Stretch lists to fit each others sizes

Say I have 3 mismatched sizes of lists, [3, 7, 6], [12, 67, 89, 98], and [1, 2, 3, 4, 5, 6, 7]
I want a function to do this:
>>> stretch([3, 7, 6], [12, 67, 89, 2], [1, 2, 3, 4, 5, 6, 7])
[3, 3, 3, 7, 7, 6, 6], [12, 67, 67, 89, 89, 2, 2], [1, 2, 3, 4, 5, 6, 7]
So, what I want is for all the smaller lists to be stretched to the length of the largest list. If two lists tie for the largest size, leave them both at that length. I tried using one function, but it only worked with ranges. I would like it to work with everything. Here it is for reference:
import numpy
def zipstretch(*args):
    range_tups = [(x[0], x[-1]) for x in args]
    shifts = [x[0] for x in range_tups]
    range_tups = [(x[0]-y, x[1]-y, n) for x, y in zip(range_tups, shifts)]
    ranges = []
    for x, s in zip(range_tups, shifts):
        h = s
        temp = list()
        for y in range(max(range_tups, key=lambda z: len(z))[0], max(range_tups, key=lambda z: len(z))[1]):
            temp.append(h)
            h+=x[1]/max(range_tups, key=lambda z: len(z))[1]
        ranges.append(numpy.array(temp))
    return ranges
I came up with this:
def stretch(*lists):
    length = max([len(l) for l in lists])
    return [[l[i * len(l) // length] for i in range(length)]
            for l in lists]
It computes the target length as the maximum over all lists, and then stretches each list by mapping every target index back to a source index, similar to how you would implement naïve, one-dimensional image scaling.
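To make the index mapping concrete, the sketch below lists which source index each target position reads from. The helper name source_indices is hypothetical and only isolates the expression i * len(l) // length used above.

```python
def source_indices(src_len, target_len):
    """Source index read for each position of the stretched list."""
    return [i * src_len // target_len for i in range(target_len)]

print(source_indices(3, 7))  # [0, 0, 0, 1, 1, 2, 2]
print(source_indices(4, 7))  # [0, 0, 1, 1, 2, 2, 3]

# Applying the 4 -> 7 mapping to the second list from the question.
second = [12, 67, 89, 2]
print([second[j] for j in source_indices(4, 7)])
# [12, 12, 67, 67, 89, 89, 2]
```

Note that for the four-element list this mapping repeats the first three items and keeps the last one single, which is a slightly different distribution from the [12, 67, 67, 89, 89, 2, 2] shown in the question; both reach the target length.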
Not super Pythonic, Florian's answer is much better, but I wanted to post the answer I came up with anyway.
def stretch(*lists):
    max_len = max([len(l) for l in lists])
    stretched = []
    for l in lists:
        init_factor = factor = (max_len - 1) / len(l)
        new_l = []
        i = 0
        j = 1
        while j <= max_len:
            new_l.append(l[i])
            if j > factor:
                i += 1
                factor += init_factor
            j += 1
        stretched.append(new_l)
    return stretched
stretch([3, 7, 6], [12, 67, 89, 2], [1, 2, 3, 4, 5, 6, 7])
# => [[3, 3, 3, 7, 7, 6, 6], [12, 12, 67, 67, 89, 2, 2], [1, 2, 3, 4, 5, 6, 7]]

trouble appending lists to another list

I have this code which takes a list of numbers and groups together all those that add up to 21. My problem is that in the end I want the groups to all be lists in one list, but I am having trouble achieving that. Any advice would be appreciated.
def twentyone(seq, groups = []):
    goal = 21
    s = sum(groups)
    final = []
    if s == goal:
        final.append(groups)
        print (final)
    if s >= goal:
        return
    for i in range(len(seq)):
        n = seq[i]
        remaining = seq[i+1:]
        twentyone(remaining, groups + [n])
#
seq = [1, 5, 6, 7, 10, 2, 11]
(twentyone(seq))
Current output is:
[[1, 5, 6, 7, 2]]
[[1, 7, 2, 11]]
[[5, 6, 10]]
[[10, 11]]
I want the output to be:
[[1, 5, 6, 7, 2], [1, 7, 2, 11], [5, 6, 10], [10, 11]]
You are creating a new final list every time the function calls itself recursively. One fix is to make it a default argument, so that every call shares the same list.
def twentyone(seq, groups = [], final = []): #default final list
    goal = 21
    s = sum(groups)
    if s == goal:
        final.append(groups)
    if s >= goal:
        return
    for i in range(len(seq)):
        n = seq[i]
        remaining = seq[i+1:]
        twentyone(remaining, groups + [n])
    return final
seq = [1, 5, 6, 7, 10, 2, 11]
print(twentyone(seq))
Results:
[[1, 5, 6, 7, 2], [1, 7, 2, 11], [5, 6, 10], [10, 11]]
However, with the above solution the final list keeps growing every time twentyone is called, because the default list is shared between calls. So we can create a new final list only on the first call, using a first_call flag, as follows:
def twentyone(seq, groups = None, final = None, first_call=True):
    if not groups:
        groups = []
    if first_call:
        final = []
    goal = 21
    s = sum(groups)
    if s == goal:
        final.append(groups)
    if s >= goal:
        return
    for i in range(len(seq)):
        n = seq[i]
        remaining = seq[i+1:]
        twentyone(remaining, groups + [n], final, False)
    return final
seq = [1, 5, 6, 7, 10, 2, 11]
print(twentyone(seq))
print(twentyone(seq))
Yields:
[[1, 5, 6, 7, 2], [1, 7, 2, 11], [5, 6, 10], [10, 11]]
[[1, 5, 6, 7, 2], [1, 7, 2, 11], [5, 6, 10], [10, 11]]
Extending Tanveer's answer (which is the best approach in my opinion), you could also move the final variable outside the function or use a static variable. The main problem in your code is that you create a new final variable in each recursive call. The code below fixes that.
Global variable approach:
final = []
def twentyone(seq, groups = []):
    goal = 21
    s = sum(groups)
    if s == goal:
        final.append(groups)
    if s >= goal:
        return
    for i in range(len(seq)):
        n = seq[i]
        remaining = seq[i+1:]
        twentyone(remaining, groups + [n])
seq = [1, 5, 6, 7, 10, 2, 11]
twentyone(seq)
print (final)
Static variable approach:
class myfinal:
    final=[]
def twentyone(seq, groups = []):
    goal = 21
    s = sum(groups)
    if s == goal:
        myfinal.final.append(groups)
    if s >= goal:
        return
    for i in range(len(seq)):
        n = seq[i]
        remaining = seq[i+1:]
        twentyone(remaining, groups + [n])
seq = [1, 5, 6, 7, 10, 2, 11]
twentyone(seq)
print (myfinal.final)
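Another way to avoid shared state altogether is to let the recursion yield each match and collect the results at the call site. The following is only a sketch of that idea, not taken from the answers above; the name groups_summing_to and its goal and prefix parameters are made up for illustration, and it assumes Python 3.3+ for yield from and positive numbers in seq (as in the question) so that pruning at the goal is valid.

```python
def groups_summing_to(seq, goal=21, prefix=()):
    """Yield every group of items (in original order) whose sum equals goal."""
    total = sum(prefix)
    if total == goal:
        yield list(prefix)
    if total >= goal:
        return  # with positive numbers, adding more can only overshoot
    for i, n in enumerate(seq):
        # Only items after position i may follow n, so each group appears once.
        yield from groups_summing_to(seq[i + 1:], goal, prefix + (n,))

seq = [1, 5, 6, 7, 10, 2, 11]
print(list(groups_summing_to(seq)))
# [[1, 5, 6, 7, 2], [1, 7, 2, 11], [5, 6, 10], [10, 11]]
```

Because nothing is stored outside the call, repeated calls return the same result without any first-call bookkeeping.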

Python grouping elements in a list in increasing size

my_list = [my_list[int((i**2 + i)/2):int((i**2 + 3*i + 3)/2)] for i in range(int((-1 + (1 + 8*len(my_list))**0.5)/2))]
Is there a neater solution to grouping the elements of a list into subgroups of increasing size than this?
Examples:
[1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11] --> [[1], [2, 3], [4, 5, 6], [7, 8, 9, 10]]
[1, 2, 3, 4] --> [[1], [2, 3]]
[1, 2, 3, 4, 5, 6] --> [[1], [2, 3], [4, 5, 6]]
EDIT
Here are the results from timeit:
from timeit import Timer
from itertools import count
def martijn(it):
    it = iter(it)
    return list([next(it) for _ in range(s)] for s in count(1))
def mathematical(it):
    upper_bound = int(((1 + 8*len(it))**0.5 + 1)//2)
    return [it[i*(i-1)//2:i*(i+1)//2] for i in range(1, upper_bound)]
def time(test, n):
    a = Timer(lambda: martijn(test)).timeit(n)
    b = Timer(lambda: mathematical(test)).timeit(n)
    return round(a, 3), round(b, 3)
>>> for i in range(8):
        loops = 10**max(0, (6-i))
        print(time([n for n in range(10**i)], loops), loops)
(6.753, 4.416) 1000000
(1.166, 0.629) 100000
(0.366, 0.123) 10000
(0.217, 0.036) 1000
(0.164, 0.017) 100
(0.157, 0.017) 10
(0.167, 0.021) 1
(1.749, 0.251) 1
>>> for i in range(8):
        loops = 10**max(0, (6-i))
        print(time(range(10**i), loops), loops)
(6.721, 4.779) 1000000
(1.184, 0.796) 100000
(0.367, 0.173) 10000
(0.218, 0.051) 1000
(0.202, 0.015) 100
(0.178, 0.005) 10
(0.207, 0.002) 1
(1.872, 0.005) 1
Using a generator expression:
from itertools import count
try:
    _range = xrange
except NameError:
    # Python 3
    _range = range
def incremental_window(it):
    """Produce monotonically increasing windows on an iterable.
    Only complete windows are yielded, if the last elements do not form
    a complete window they are ignored.
    incremental_window('ABCDEF') -> ['A'], ['B', 'C'], ['D', 'E', 'F']
    incremental_window('ABCDE') -> ['A'], ['B', 'C']
    """
    it = iter(it)
    return ([next(it) for _ in _range(s)] for s in count(1))
Demo:
>>> list(incremental_window([1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11]))
[[1], [2, 3], [4, 5, 6], [7, 8, 9, 10]]
>>> list(incremental_window([1, 2, 3, 4]))
[[1], [2, 3]]
>>> list(incremental_window([1, 2, 3, 4, 5, 6]))
[[1], [2, 3], [4, 5, 6]]
This is a generator that'll work with any iterable, including endless iterables:
>>> from itertools import count
>>> for window in incremental_window(count()):
...     print(window)
...     if 25 in window:
...         break
...
[0]
[1, 2]
[3, 4, 5]
[6, 7, 8, 9]
[10, 11, 12, 13, 14]
[15, 16, 17, 18, 19, 20]
[21, 22, 23, 24, 25, 26, 27]
You could make that a one-liner with a little cheating to 'inline' the iter() call on your list object:
list([next(it) for _ in _range(s)] for it in (iter(my_list),) for s in count(1))
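One caveat for current Python versions: the generator expression above relies on the StopIteration from next(it) silently ending the generator. Since Python 3.7 (PEP 479), a StopIteration raised inside a generator or generator expression is converted into a RuntimeError, so on a finite iterable this approach fails once the input runs out. A sketch of an equivalent generator that checks the window length instead, under the hypothetical name incremental_window_safe:

```python
from itertools import count, islice

def incremental_window_safe(iterable):
    """Yield windows of size 1, 2, 3, ...; an incomplete last window is dropped."""
    it = iter(iterable)
    for size in count(1):
        window = list(islice(it, size))
        if len(window) < size:
            return  # input exhausted before the window was complete
        yield window

print(list(incremental_window_safe([1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11])))
# [[1], [2, 3], [4, 5, 6], [7, 8, 9, 10]]
```

Like the original, it works on any iterable, including endless ones, because it never needs the length up front.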
I'm not honestly totally clear why you want to do this, which I mention purely because there's likely a task-specific way to answer your question, but I would argue that the following is at least clearer:
def increasing_groups(l):
    current_size = 1
    while l:
        yield l[:current_size]
        l = l[current_size:]
        current_size += 1
at which point you can get it via list(increasing_groups(some_list)).
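Note that, unlike the examples in the question, this version also yields a trailing incomplete group: list(increasing_groups([1, 2, 3, 4])) gives [[1], [2, 3], [4]]. If only complete groups are wanted, a variant along these lines could be used. It is a sketch under the hypothetical name increasing_groups_complete, and it tracks a start index rather than re-slicing the remainder on every step.

```python
def increasing_groups_complete(items):
    """Yield groups of size 1, 2, 3, ... and stop before an incomplete group."""
    start, size = 0, 1
    while start + size <= len(items):
        yield items[start:start + size]
        start += size
        size += 1

print(list(increasing_groups_complete([1, 2, 3, 4])))
# [[1], [2, 3]]
print(list(increasing_groups_complete([1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11])))
# [[1], [2, 3], [4, 5, 6], [7, 8, 9, 10]]
```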
You can keep track of the number of items to slice with itertools.count and you can pick the items with itertools.islice.
# Initializations and declarations
data = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11]
from itertools import count, islice
counter, it = count(0), iter(data)
# Actual list construction
result = [[item] + list(islice(it, next(counter))) for item in it]
# Making sure that the last item of the list is consistent with the previous item
if len(result) > 1 and len(result[-1]) <= len(result[-2]): del result[-1]
print(result)
# [[1], [2, 3], [4, 5, 6], [7, 8, 9, 10]]
The important line is
if len(result) > 1 and len(result[-1]) <= len(result[-2]): del result[-1]
which makes sure that the last group stays only if it is longer than the one before it, that is, only if it is complete.
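def incr_grouped(iterable):
    it, n = iter(iterable), 1
    while True:
        try:
            group = [next(it) for _ in range(n)]
        except StopIteration:
            return
        yield group
        n += 1
The key here is that next(it) raises StopIteration once the iterator runs out, which ends the loop. This means that you may lose the last elements if they do not fill a complete group. The exception is caught explicitly because, since Python 3.7 (PEP 479), a StopIteration that escapes a generator body is turned into a RuntimeError.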
>>> list(incr_grouped('ABCDEF'))
[['A'], ['B', 'C'], ['D', 'E', 'F']]
>>> list(incr_grouped([1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11]))
[[1], [2, 3], [4, 5, 6], [7, 8, 9, 10]]
It can be made even more compact using itertools. Check Martijn Pieters' answer.
Yes, there is a simple answer.
>>> test = [1, 2, 3, 4, 5, 6, 7]
>>> bound = int((-1 + (1 + 8 * len(test)) ** 0.5) / 2)
>>> res = [test[(i + 1) * i // 2 : (i + 1) * (i + 2) // 2] for i in range(bound)]
>>> res
[[1], [2, 3], [4, 5, 6]]
This works because the slice sizes form an arithmetic sequence (1, 2, 3, ...), and the sum of the first k terms has a known closed form, k(k + 1)/2. So we can compute the begin and end index of each slice directly from that formula.
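The bound comes from solving k(k + 1)/2 <= N for the largest integer k, which gives k = floor((sqrt(8N + 1) - 1) / 2). A sketch of the same computation using integer arithmetic only, assuming Python 3.8+ for math.isqrt; the function name triangular_groups is made up for illustration:

```python
from math import isqrt

def triangular_groups(seq):
    """Split seq into complete groups of size 1, 2, 3, ... using index arithmetic."""
    # Largest k with k * (k + 1) // 2 <= len(seq), computed without floats.
    k = (isqrt(8 * len(seq) + 1) - 1) // 2
    return [seq[i * (i + 1) // 2:(i + 1) * (i + 2) // 2] for i in range(k)]

print(triangular_groups([1, 2, 3, 4, 5, 6, 7]))
# [[1], [2, 3], [4, 5, 6]]
print(triangular_groups([1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11]))
# [[1], [2, 3], [4, 5, 6], [7, 8, 9, 10]]
```

Using isqrt avoids any floating-point rounding in the square root for very long sequences.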
This
(n * (n - 1) // 2, n * (n + 1) // 2)
gives you, according to Gauss, the start and end indices of the nth element of your new list.
Therefore
my_list[n * (n - 1) // 2 : n * (n + 1) // 2]
is the nth element of the list, and with a bit of blunt filtering:
my_list = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11]
[my_list[n * (n - 1) // 2: n * (n + 1) // 2] for n in range(1, len(my_list) + 1) if n * (n + 1) // 2 <= len(my_list)]
# [[1], [2, 3], [4, 5, 6], [7, 8, 9, 10]]
A proper loop with an actual break would probably be better, though.
Edit
Now that I know how StopIteration is caught by list (thank you, Martijn), a simple closing condition can be done using:
list(my_list[n * (n - 1) // 2: n * (n + 1) // 2] for n in count(1) if next(iter(my_list[n * (n + 1) // 2 - 1:])) > -1)
provided -1 is lower than any item in your list. (The floor divisions keep the indices integers in Python 3.) Note that this trick only works on Python versions before 3.7; since PEP 479, a StopIteration raised inside a generator expression surfaces as a RuntimeError.
