I am writing a function that divides a list into n (almost) equal parts. I want this function to return a generator, but there appears to be an issue when the input is itself a generator or iterator. The function works just fine with lists and tuples. Take a look at this snippet:
import itertools

def divide_list(array, n, gen_length=None):
    """
    :param array: some iterable that you wish to divide
    :param n: the number of lists you would like to return
    :param gen_length: The length of the generator if array is a generator. Not necessary for lists and tuples.
    :return: a generator of the divided list
    Example:
    In: list(divide_list([1, 2, 3, 4, 5, 6, 7, 8, 9], 4))
    Out: [[1, 2, 3], [4, 5], [6, 7], [8, 9]]
    """
    if isinstance(array, (list, tuple)):
        floor, rem = divmod(len(array), n)
        items_index = (0, floor)
        for _ in range(n):
            prev, next_ = items_index[0], items_index[1] + 1 if rem > 0 else items_index[1]
            yield array[prev:next_]
            items_index = (next_, next_ + floor)
            rem -= 1
    else:
        floor, rem = divmod(gen_length, n)
        items_index = (0, floor)
        for _ in range(n):
            prev, next_ = items_index[0], items_index[1] + 1 if rem > 0 else items_index[1]
            yield itertools.islice(array, prev, next_)
            items_index = (next_, next_ + floor)
            rem -= 1

if __name__ == '__main__':
    array_ = iter([12, 7, 9, 31, 13, 11, 7, 3])
    print('Generator:')
    print('----------')
    for value in divide_list(array_, 3, gen_length=8):
        print(list(value))
    print('')
    array_ = [12, 7, 9, 31, 13, 11, 7, 3]
    print('List:')
    print('-----')
    for value in divide_list(array_, 3):
        print(value)
Here is the output:
Generator:
----------
[12, 7, 9]
[7, 3]
[]
List:
-----
[12, 7, 9]
[31, 13, 11]
[7, 3]
Why is the last generator exhausted? Sometimes, it exhausts the last two generators.
This isn't working because you pass islice a non-zero start, which makes it skip elements. Every islice call consumes the same underlying iterator, so by the second call the first chunk is already gone, and a start of 3 discards the next three items as well. What you want at each yield is to advance the iterator by the chunk size without skipping anything. This is different from the sequence case, where slicing with explicit indices never consumes the list.
Note, however, that you don't need to handle the two cases differently. Here's a simple approach that handles both; the key is to always use an iterator:
from itertools import islice

def divide(iterable, n, length=None):
    if length is None:
        length = len(iterable)
    it = iter(iterable)
    floor, rem = divmod(length, n)
    # The := operator requires Python 3.8+.
    while result := list(islice(it, floor + bool(rem))):
        yield result
        rem = max(rem - 1, 0)
In the REPL:
>>> from itertools import islice
>>> def divide(iterable, n, length=None):
...     if length is None:
...         length = len(iterable)
...     it = iter(iterable)
...     floor, rem = divmod(length, n)
...     while result := list(islice(it, floor + bool(rem))):
...         yield result
...         rem = max(rem - 1, 0)
...
>>> list(divide([1, 2, 3, 4, 5, 6, 7, 8, 9], 4))
[[1, 2, 3], [4, 5], [6, 7], [8, 9]]
>>> list(divide(iter([1, 2, 3, 4, 5, 6, 7, 8, 9]), 4, length=9))
[[1, 2, 3], [4, 5], [6, 7], [8, 9]]
The problem is that you don't take into account that you have already consumed part of the iterator:
>>> import itertools
>>> array_ = iter([12, 7, 9, 31, 13, 11, 7, 3])
>>> list(itertools.islice(array_,0,3))
[12, 7, 9]
>>> list(itertools.islice(array_,3,6)) # where are 31, 13 and 11? islice skipped them; see below
[7, 3]
>>> array_ = iter([12, 7, 9, 31, 13, 11, 7, 3])
>>> list(itertools.islice(array_,0,3))
[12, 7, 9]
>>> list(array_) #this is what remains in the iterator
[31, 13, 11, 7, 3]
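A minimal sketch of the fix that follows from the session above: since every islice call consumes the shared iterator, ask for the next k items (a stop value only) instead of passing an absolute start and stop. The chunk sizes 3, 3, 2 are hard-coded here purely for illustration; they are the sizes from the question's example.

```python
import itertools

array_ = iter([12, 7, 9, 31, 13, 11, 7, 3])

# Each islice call picks up where the previous one stopped,
# so only a count is passed, never an absolute start index.
chunks = [list(itertools.islice(array_, size)) for size in (3, 3, 2)]
print(chunks)  # [[12, 7, 9], [31, 13, 11], [7, 3]]
```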
Thanks to @KellyBundy, I was able to modify the code to get the expected results. I was not aware that islice consumes the items it returns, so the next islice call starts counting from index 0 at the first item that has not been consumed yet. I was treating the iterator as if its items kept their original indices after each slice. Here is the modified code:
import itertools

def divide_list(array, n, gen_length=None):
    """
    :param array: some iterable that you wish to divide
    :param n: the number of lists you would like to return
    :param gen_length: The length of the generator if array is a generator. Not necessary for lists and tuples.
    :return: a generator of the divided list
    Example:
    In: list(divide_list([1, 2, 3, 4, 5, 6, 7, 8, 9], 4))
    Out: [[1, 2, 3], [4, 5], [6, 7], [8, 9]]
    """
    if isinstance(array, (list, tuple)):
        floor, rem = divmod(len(array), n)
        items_index = (0, floor)
        for _ in range(n):
            prev, next_ = items_index[0], items_index[1] + 1 if rem > 0 else items_index[1]
            yield array[prev:next_]
            items_index = (next_, next_ + floor)
            rem -= 1
    else:
        floor, rem = divmod(gen_length, n)
        up_to = floor
        for _ in range(n):
            up_to = up_to + 1 if rem > 0 else up_to
            yield itertools.islice(array, up_to)
            up_to = floor
            rem -= 1

if __name__ == '__main__':
    array_ = iter([12, 7, 9, 31, 13, 11, 7, 3])
    print('Generator:')
    print('----------')
    for value in divide_list(array_, 3, gen_length=8):
        print(list(value))
    print('')
    array_ = [12, 7, 9, 31, 13, 11, 7, 3]
    print('List:')
    print('-----')
    for value in divide_list(array_, 3):
        print(value)
This will output the expected result:
Generator:
----------
[12, 7, 9]
[31, 13, 11]
[7, 3]
List:
-----
[12, 7, 9]
[31, 13, 11]
[7, 3]
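The size arithmetic shared by both branches can be pulled out into a helper, which may make the logic easier to check. The sketch below is only an illustration: chunk_sizes and divide_eager are hypothetical names, not part of the code above. It also materialises each chunk as a list, because lazily yielded islice objects only give the right items if the caller consumes them in order.

```python
from itertools import islice

def chunk_sizes(length, n):
    """Sizes of n chunks whose lengths differ by at most one."""
    floor, rem = divmod(length, n)
    # The first `rem` chunks take one extra item each.
    return [floor + 1 if i < rem else floor for i in range(n)]

def divide_eager(iterable, n, length):
    """Yield n lists; works for lists and one-shot iterators alike."""
    it = iter(iterable)
    for size in chunk_sizes(length, n):
        # islice(it, size) takes the *next* `size` items from the iterator.
        yield list(islice(it, size))

print(chunk_sizes(8, 3))  # [3, 3, 2]
print(list(divide_eager(iter([12, 7, 9, 31, 13, 11, 7, 3]), 3, 8)))
# [[12, 7, 9], [31, 13, 11], [7, 3]]
```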
I need help to create two lists; one for quotient and one for remainder.
Eg. y = [20, 7, 88, 66, 18] and d = 9
After dividing each number in the list (y) by d, I want to print a single list of quotients and a single list of remainders, instead of printing a longer list on every iteration. Basically, I want the output to look like this:
This is the quotient.
[2, 0, 9, 7, 2]
This is the remainder.
[2, 7, 7, 3, 0]
Currently, my code makes it generate like this:
//(INPUT)//
#def calculate_quotient_and_remainder(y, d):
y = [20, 7, 88, 66, 18]
d = 9
r = []
q = []
print('This is the quotient.')
#def returnQuotient():
for i, one_a in enumerate(y):
    r.append(one_a // d)
    print (r)
print('\n')
#def returnRemainder():
print('This is the remainder.')
for j, one_b in enumerate(y):
    q.append(one_b % d)
    print (q)
//(OUTPUT)//
This is the quotient.
[2]
[2, 0]
[2, 0, 9]
[2, 0, 9, 7]
[2, 0, 9, 7, 2]
This is the remainder.
[2]
[2, 7]
[2, 7, 7]
[2, 7, 7, 3]
[2, 7, 7, 3, 0]
PLEASE HELP!
Calling print only once, after each loop has finished, would solve your problem. You can also wrap the logic in a function:
def calculate_quotient_and_remainder(y, d):
    quotients = []
    remainders = []
    for num in y:
        quotients.append(num // d)  # integer (floor) division
        remainders.append(num % d)
    return quotients, remainders
Then, you can call and print the quotients and the remainders:
y = [20, 7, 88, 66, 18]
d = 9
q, r = calculate_quotient_and_remainder(y, d)
print('Quotients:', q)
print('Remainders:', r)
Output:
Quotients: [2, 0, 9, 7, 2]
Remainders: [2, 7, 7, 3, 0]
Using list comprehensions:
y = [20, 7, 88, 66, 18]
d = 9
quotient = [i // d for i in y]
remainder = [i % d for i in y]
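Another option, offered only as a sketch, is the built-in divmod, which returns the quotient and remainder together so the list is traversed once. The variable names below are illustrative, and the unpacking assumes y is not empty.

```python
y = [20, 7, 88, 66, 18]
d = 9

# divmod(v, d) returns the pair (v // d, v % d);
# zip(*pairs) then regroups the pairs into two tuples.
quotients, remainders = zip(*(divmod(v, d) for v in y))

print('Quotients:', list(quotients))    # Quotients: [2, 0, 9, 7, 2]
print('Remainders:', list(remainders))  # Remainders: [2, 7, 7, 3, 0]
```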
I would like to split a list into overlapping chunks of equal size, with n-max as an argument. If n-max is greater than the list length, split the list into its largest possible overlapping chunks.
Example:
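l = [1,2,3,4,5,6,7,8,9]
n_max = 4  # "n-max"; a hyphen is not valid in a Python name

def overlapping_chunks(l, n_max):
    for k in range(len(l)):
        a = l[k:k+n_max]
        if len(a) == n_max:
            yield a

>>> list(overlapping_chunks(l, n_max))
[[1, 2, 3, 4], [2, 3, 4, 5], [3, 4, 5, 6], [4, 5, 6, 7], [5, 6, 7, 8], [6, 7, 8, 9]]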
When the length of l is less than n-max, I want it to return the maximum overlapping chunks:
l = [1,2,3]
>>> [[1,2],[2,3]]
Here the maximum overlapping chunk size is 2. I need the function to automatically reduce the chunk size to the largest size that still produces overlapping chunks whenever the list cannot be chunked at n-max.
A Pythonic way is to use a list comprehension, choosing the smaller of n and len(your_list) - 1 as the chunk size:
def chunk(n, lst):
    n = min(n, len(lst)-1)
    return [lst[i:i+n] for i in range(len(lst) - n+1)]
Demo:
In [40]: chunk(4, l)
Out[40]:
[[1, 2, 3, 4],
[2, 3, 4, 5],
[3, 4, 5, 6],
[4, 5, 6, 7],
[5, 6, 7, 8],
[6, 7, 8, 9]]
In [41]: l = [1,2,3,4,5,6,7,8,9,10,11]
In [42]: chunk(16, l)
Out[42]: [[1, 2, 3, 4, 5, 6, 7, 8, 9, 10], [2, 3, 4, 5, 6, 7, 8, 9, 10, 11]]
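If the input is a generator rather than a list, slicing is not available. A possible sketch of the same overlapping-window idea for arbitrary iterables uses collections.deque with maxlen. The function name sliding_windows is an assumption made for this example, n is assumed to be at least 1, and the chunk size is not clamped here because the length of a generator is not known up front.

```python
from collections import deque
from itertools import islice

def sliding_windows(iterable, n):
    """Yield overlapping windows of exactly n items (none if too short)."""
    it = iter(iterable)
    window = deque(islice(it, n), maxlen=n)
    if len(window) == n:
        yield list(window)
    for item in it:
        window.append(item)  # maxlen drops the oldest item automatically
        yield list(window)

print(list(sliding_windows(range(1, 6), 3)))
# [[1, 2, 3], [2, 3, 4], [3, 4, 5]]
```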
my_list = [my_list[int((i**2 + i)/2):int((i**2 + 3*i + 3)/2)] for i in range(int((-1 + (1 + 8*len(my_list))**0.5)/2))]
Is there a neater solution to grouping the elements of a list into subgroups of increasing size than this?
Examples:
[1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11] --> [[1], [2, 3], [4, 5, 6], [7, 8, 9, 10]]
[1, 2, 3, 4] --> [[1], [2, 3]]
[1, 2, 3, 4, 5, 6] --> [[1], [2, 3], [4, 5, 6]]
EDIT
Here are the results from timeit:
from timeit import Timer
from itertools import count
def martijn(it):
    it = iter(it)
    return list([next(it) for _ in range(s)] for s in count(1))

def mathematical(it):
    upper_bound = int(((1 + 8*len(it))**0.5 + 1)//2)
    return [it[i*(i-1)//2:i*(i+1)//2] for i in range(1, upper_bound)]

def time(test, n):
    a = Timer(lambda: martijn(test)).timeit(n)
    b = Timer(lambda: mathematical(test)).timeit(n)
    return round(a, 3), round(b, 3)

>>> for i in range(8):
...     loops = 10**max(0, (6-i))
...     print(time([n for n in range(10**i)], loops), loops)
(6.753, 4.416) 1000000
(1.166, 0.629) 100000
(0.366, 0.123) 10000
(0.217, 0.036) 1000
(0.164, 0.017) 100
(0.157, 0.017) 10
(0.167, 0.021) 1
(1.749, 0.251) 1
>>> for i in range(8):
...     loops = 10**max(0, (6-i))
...     print(time(range(10**i), loops), loops)
(6.721, 4.779) 1000000
(1.184, 0.796) 100000
(0.367, 0.173) 10000
(0.218, 0.051) 1000
(0.202, 0.015) 100
(0.178, 0.005) 10
(0.207, 0.002) 1
(1.872, 0.005) 1
Using a generator expression:
from itertools import count

try:
    _range = xrange
except NameError:
    # Python 3
    _range = range

def incremental_window(it):
    """Produce monotonically increasing windows on an iterable.

    Only complete windows are yielded, if the last elements do not form
    a complete window they are ignored.

    incremental_window('ABCDEF') -> ['A'], ['B', 'C'], ['D', 'E', 'F']
    incremental_window('ABCDE') -> ['A'], ['B', 'C']
    """
    it = iter(it)
    return ([next(it) for _ in _range(s)] for s in count(1))
Demo:
>>> list(incremental_window([1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11]))
[[1], [2, 3], [4, 5, 6], [7, 8, 9, 10]]
>>> list(incremental_window([1, 2, 3, 4]))
[[1], [2, 3]]
>>> list(incremental_window([1, 2, 3, 4, 5, 6]))
[[1], [2, 3], [4, 5, 6]]
This is a generator that'll work with any iterable, including endless iterables:
>>> from itertools import count
>>> for window in incremental_window(count()):
...     print(window)
...     if 25 in window:
...         break
...
[0]
[1, 2]
[3, 4, 5]
[6, 7, 8, 9]
[10, 11, 12, 13, 14]
[15, 16, 17, 18, 19, 20]
[21, 22, 23, 24, 25, 26, 27]
You could make that a one-liner with a little cheating to 'inline' the iter() call on your list object:
list([next(it) for _ in _range(s)] for it in (iter(my_list),) for s in count(1))
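One caveat worth spelling out: the generator expression ends because a StopIteration raised by next() escapes from it. Since Python 3.7 (PEP 479), a StopIteration that escapes inside a generator is converted into a RuntimeError, so on current Python versions the end of a finite input raises instead of stopping quietly. A possible sketch that avoids relying on that behaviour is shown below; incremental_window_safe is a hypothetical name, and it uses islice plus an explicit length check rather than the original technique.

```python
from itertools import count, islice

def incremental_window_safe(iterable):
    """Yield windows of size 1, 2, 3, ...; drop an incomplete last window."""
    it = iter(iterable)
    for size in count(1):
        window = list(islice(it, size))
        if len(window) < size:
            return  # ran out of items: end the generator explicitly
        yield window

print(list(incremental_window_safe([1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11])))
# [[1], [2, 3], [4, 5, 6], [7, 8, 9, 10]]
```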
I'm not honestly totally clear why you want to do this, which I mention purely because there's likely a task-specific way to answer your question, but I would argue that the following is at least clearer:
def increasing_groups(l):
    current_size = 1
    while l:
        yield l[:current_size]
        l = l[current_size:]
        current_size += 1
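at which point you can get it via list(increasing_groups(some_list)). Note that, unlike the examples in the question, this also yields a final, shorter group when the list does not divide exactly (e.g. [1, 2, 3, 4] gives [[1], [2, 3], [4]]).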
You can keep track of the number of items to slice with itertools.count and you can pick the items with itertools.islice.
# Initializations and declarations
data = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11]
from itertools import count, islice
counter, it = count(0), iter(data)
# Actual list construction
result = [[item] + list(islice(it, next(counter))) for item in it]
# Making sure that the last item of the list is consistent with the previous item
if len(result) > 1 and len(result[-1]) <= len(result[-2]): del result[-1]
print(result)
# [[1], [2, 3], [4, 5, 6], [7, 8, 9, 10]]
The important part is
if len(result) > 1 and len(result[-1]) <= len(result[-2]): del result[-1]
This line makes sure that the last group is kept only if it is longer than the second-to-last one, that is, only if it is complete.
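def incr_grouped(iterable):
    it, n = iter(iterable), 1
    while True:
        try:
            yield [next(it) for _ in range(n)]
        except StopIteration:
            return
        n += 1
The key here is that the StopIteration raised by next(it) ends the while loop as well. It is caught explicitly because, since Python 3.7 (PEP 479), a StopIteration that escapes from a generator is turned into a RuntimeError. This means that you may lose the last elements if they do not fill a complete group.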
>>> list(incr_grouped('ABCDEF'))
[['A'], ['B', 'C'], ['D', 'E', 'F']]
>>> list(incr_grouped([1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11]))
[[1], [2, 3], [4, 5, 6], [7, 8, 9, 10]]
It can be made even more compact using itertools. Check Martijn Pieters' answer.
Yes, there is a simple answer.
>>> test = [1, 2, 3, 4, 5, 6, 7]
>>> bound = int((-1 + (1 + 8 * len(test)) ** 0.5) / 2)
>>> res = [test[(i + 1) * i // 2 : (i + 1) * (i + 2) // 2] for i in range(bound)]
>>> res
[[1], [2, 3], [4, 5, 6]]
This works because the slice sizes form the arithmetic sequence 1, 2, 3, ..., and the sum of its first k terms has the well-known closed form k(k + 1)/2. So we can compute the begin and end index of each slice directly from that formula.
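To make the bound explicit: the number of complete groups is the largest k with k(k + 1)/2 <= len(test), which is floor((sqrt(8 * len + 1) - 1) / 2). The sketch below is one way to compute it with integer arithmetic only, assuming Python 3.8+ for math.isqrt; triangular_groups is a hypothetical name used for illustration.

```python
import math

def triangular_groups(seq):
    """Split seq into complete groups of sizes 1, 2, 3, ..."""
    # Largest k with k * (k + 1) // 2 <= len(seq), via integer square root.
    k = (math.isqrt(8 * len(seq) + 1) - 1) // 2
    # Group i (0-based) starts at the i-th triangular number i * (i + 1) // 2.
    return [seq[i * (i + 1) // 2:(i + 1) * (i + 2) // 2] for i in range(k)]

print(triangular_groups([1, 2, 3, 4, 5, 6, 7]))
# [[1], [2, 3], [4, 5, 6]]
```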
This
(n * (n - 1) // 2, n * (n + 1) // 2)
gives you, according to Gauss, the start and end indices of the nth element of your new list.
Therefore
my_list[n * (n - 1) // 2 : n * (n + 1) // 2]
is the nth element of the list, and with a bit of blunt filtering:
my_list = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11]
[my_list[n * (n - 1) // 2: n * (n + 1) // 2] for n in range(1, len(my_list)) if n * (n + 1) // 2 <= len(my_list)]
# [[1], [2, 3], [4, 5, 6], [7, 8, 9, 10]]
A proper loop with an actual break would probably be better, though.
Edit
Now that I know how StopIteration is caught by list (thank you, Martijn), a simple closing condition can be done using:
list(my_list[n * (n - 1) // 2: n * (n + 1) // 2] for n in count(1) if next(iter(my_list[n * (n + 1) // 2:])) > -1)
Provided -1 is lower than any item in your list. (The floor divisions keep the indices integers in Python 3.) Note that this trick depends on a StopIteration escaping from the generator expression, which Python 3.7 and later turn into a RuntimeError (PEP 479).
I am trying to come up with a function that splits a list evenly into a number of parts that depends on its original length.
So for example if I have a dataset returned that is 2000 I would like to split it into 4. Whereas if the dataset is 1500 split it into 3.
Then to call the function:
Thread_A_DATA, Thread_B_DATA = split_list( SQL_RETURN )
I would like to do something like the following:
if len(dataset) <= 1000:
    # Split in 2
    a, b = split_list(dataset, 2)
elif len(dataset) <= 1500:
    # Split in 3
    a, b, c = split_list(dataset, 3)
# etc etc...
I've managed to split a dataset in half using this code found previously on Stack Overflow:
def split_list( a_list ):
    half = len( a_list ) // 2  # floor division, so the index stays an integer
    return a_list[:half], a_list[half:]
But I can't work it out with 3,4 or 5 splits!
If anyone can help that would be great.
Thanks in advance.
As I understand the question, you don't want to split every 500 elements but instead split in 2 if there are less than 1000 elements, in 3 if less than 1500, 4 for 2000, etc. But if there are 1700 elements, you would split in 4 groups of 425 elements (that's what I understand by "split evenly").
So, here's my solution:
def split_list(a_list, number_of_splits):
    step = len(a_list) / number_of_splits + (1 if len(a_list) % number_of_splits else 0)
    return [a_list[i*step:(i+1)*step] for i in range(number_of_splits)]

l = [1, 8, 2, 3, 4, 5, 6, 7, 1, 5, 3, 1, 2, 5]
print l
print split_list(l, 3)
print split_list(l, 2)
Output
[1, 8, 2, 3, 4, 5, 6, 7, 1, 5, 3, 1, 2, 5]
[[1, 8, 2, 3, 4], [5, 6, 7, 1, 5], [3, 1, 2, 5]]
[[1, 8, 2, 3, 4, 5, 6], [7, 1, 5, 3, 1, 2, 5]]
Edit: Python 3 version:
def split_list(a_list, number_of_splits):
    step = len(a_list) // number_of_splits + (1 if len(a_list) % number_of_splits else 0)
    return [a_list[i*step:(i+1)*step] for i in range(number_of_splits)]

l = [1, 8, 2, 3, 4, 5, 6, 7, 1, 5, 3, 1, 2, 5]
print(l)
print(split_list(l, 3))
print(split_list(l, 2))
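Two pieces of the question can be sketched a little further. First, the number of splits can be derived from the length rather than chained through if statements. The rule below (one split per 500 items, at least 2) is an assumption read off the question's examples, and number_of_splits and split_balanced are hypothetical names. Second, the fixed-step slicing above can leave an empty last slice (9 items in 4 splits gives sizes 3, 3, 3, 0), so this sketch spreads the remainder over the first slices instead.

```python
import math

def number_of_splits(length, per_split=500, minimum=2):
    """Assumed rule: one split per 500 items, but never fewer than 2."""
    return max(minimum, math.ceil(length / per_split))

def split_balanced(a_list, n):
    """Split into n slices whose lengths differ by at most one."""
    floor, rem = divmod(len(a_list), n)
    out, start = [], 0
    for i in range(n):
        # The first `rem` slices each take one extra item.
        end = start + floor + (1 if i < rem else 0)
        out.append(a_list[start:end])
        start = end
    return out

data = list(range(1700))
parts = split_balanced(data, number_of_splits(len(data)))
print([len(p) for p in parts])  # [425, 425, 425, 425]
```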
Python 3
def splitList(L):
    return [L[i:i+500] for i in range(0, len(L), 500)]
Python 2
def splitList(L):
    return [L[i:i+500] for i in xrange(0, len(L), 500)]
def split_it(a_list, size_of_split):
    return zip(*[iter(a_list)] * size_of_split)
is fun
print split_it(range(100),3) # splits it into groups of 3
Unfortunately, this will truncate the end of the list if its length is not a multiple of size_of_split. You can fix it like so (Python 2, where zip returns a list):
    groups = zip(*[iter(a_list)] * size_of_split)
    leftover = len(a_list) % size_of_split
    # Only append the tail when there is one; a_list[-0:] would be the whole list.
    return groups + ([tuple(a_list[-leftover:])] if leftover else [])
If you wanted to cut it into 7 pieces, say, you can find the size of the split by
split_size = len(a_list) // num_splits
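On Python 3, zip returns an iterator, so the list concatenation above does not carry over directly. A rough equivalent, offered as a sketch under that assumption, takes size_of_split items at a time with islice and keeps the shorter tail; split_by_size is a hypothetical name.

```python
from itertools import islice

def split_by_size(iterable, size_of_split):
    """Yield tuples of size_of_split items; the last one may be shorter."""
    it = iter(iterable)
    while True:
        group = tuple(islice(it, size_of_split))
        if not group:
            return  # iterator exhausted
        yield group

print(list(split_by_size(range(7), 3)))  # [(0, 1, 2), (3, 4, 5), (6,)]
```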
Python 2.7
>>> import math
>>> lst = range(35)
>>> t = 3 # how many pieces to split the list into
>>> n = int(math.ceil(len(lst) / float(t)))
>>> res = [lst[i:i+n] for i in range(0, len(lst), n)]
>>> res
[[0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11], [12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23], [24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34]]