Split list into chunks with repeats between chunks - python

I have an array of elements, for example r = np.arange(15).
I'm trying to split this array into chunks of consecutive elements, where each chunk (except maybe the last one) has size M and there are m repeating elements between each pair of chunks.
For example: split_to_chunks(np.arange(15), M=5, m=1) should yield four lists:
[0, 1, 2, 3, 4], [4, 5, 6, 7, 8], [8, 9, 10, 11, 12], [12, 13, 14]
Obviously this can be done iteratively, but I'm looking for a more "pythonic" (and faster) way of doing this.

Something like this with list comprehension:
[l[i*(M-m):i*(M-m)+M] for i in range(math.ceil((len(l)-m)/(M-m)))]
Example:
import math
l = list(range(15))
# [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14]
m, M = 2, 5
[l[i*(M-m):i*(M-m)+M] for i in range(math.ceil((len(l)-m)/(M-m)))]
# [[0, 1, 2, 3, 4],
# [3, 4, 5, 6, 7],
# [6, 7, 8, 9, 10],
# [9, 10, 11, 12, 13],
# [12, 13, 14]]
m, M = 3, 5
[l[i*(M-m):i*(M-m)+M] for i in range(math.ceil((len(l)-m)/(M-m)))]
# [[0, 1, 2, 3, 4],
# [2, 3, 4, 5, 6],
# [4, 5, 6, 7, 8],
# [6, 7, 8, 9, 10],
# [8, 9, 10, 11, 12],
# [10, 11, 12, 13, 14]]
l = range(5)
m, M = 2, 3
[l[i*(M-m):i*(M-m)+M] for i in range(math.ceil((len(l)-m)/(M-m)))]
# [range(0, 3), range(1, 4), range(2, 5)]
Explanation:
Chunk i starts at index i*(M-m) and ends M positions later at index i*(M-m) + M.
chunk index    starts                  ends
----------------------------------------------------------------
0              0                       M
1              M-m                     M-m+M = 2*M-m
2              2*M-m-m = 2*(M-m)       2*(M-m)+M = 3*M-2*m
...
Now the problem is to determine how many chunks there are.
At each step the start index increases by M-m, so the number of steps is the length of the list divided by M-m, after first subtracting m from the length (the first chunk covers M elements without skipping anything, and every later chunk contributes only M-m new ones).
Finally, use the ceiling function to include the last incomplete chunk in case the division is not exact.
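The chunk-count formula above can be wrapped in a small function. This is only a minimal sketch: the name split_with_overlap and the argument check are assumptions added for illustration and are not part of the answer.

```python
import math

def split_with_overlap(seq, M, m):
    """Split seq into chunks of size M that share m elements with the next chunk."""
    if not 0 <= m < M:
        # M - m must be positive, otherwise the start index never advances
        raise ValueError("need 0 <= m < M")
    step = M - m
    # Number of chunks, as derived in the explanation above
    n_chunks = math.ceil((len(seq) - m) / step)
    return [seq[i * step : i * step + M] for i in range(n_chunks)]

print(split_with_overlap(list(range(15)), M=5, m=1))
# [[0, 1, 2, 3, 4], [4, 5, 6, 7, 8], [8, 9, 10, 11, 12], [12, 13, 14]]
```

Note that under this formula a sequence with m or fewer elements yields no chunks at all.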

This should do the job:
def split_to_chunks(r, M=5, m=1):
    # Chunk i starts at i*(M-m); skip any start that would add no new elements
    return [r[i*(M-m): (i+1)*M-i*m] for i in range(len(r)//(M-m)+1) if i*(M-m) < len(r)-m]
Explanation: the list comprehension loops through the chunk indexes as described in the question. Chunk i starts at i*(M-m) and ends at (i+1)*M-i*m. The condition i*(M-m) < len(r)-m drops any chunk that would start within the last m elements, because such a chunk would only repeat elements already covered by the previous one.

Related

Generate all possible sequences of N elements with sequential rules

I have a function get_appendable_values(sequence) that takes a sequence (even empty) and returns a list of all the values appendable (as a last element) to that sequence. I need to generate all the possible sequences of 4 elements, with respect to the rules defined in this function and starting with an empty sequence.
Example:
Let's say the implementation of get_appendable_values is:
def get_appendable_values(sequence):
    '''Dummy rules'''
    if len(sequence) == 2:
        return [4, 12]
    if sequence[-1] == 4:
        return [7]
    return [0, 9]
Expected output:
[[0, 0, 4, 7],
[0, 0, 12, 0],
[0, 0, 12, 9],
[0, 9, 4, 7],
[0, 9, 12, 0],
[0, 9, 12, 9],
[9, 0, 4, 7],
[9, 0, 12, 0],
[9, 0, 12, 9],
[9, 9, 4, 7],
[9, 9, 12, 0],
[9, 9, 12, 9]]
I have the feeling that recursion is the key, but I could not figure it out.
Yes, recursion is the key. To generate a sequence of size 4, you first generate all sequences of size 3, and add all possible endings to them. Likewise, to generate a sequence of size 3, you need all sequences of size 2... and so forth down to size 0.
def get_appendable_values(sequence):
    '''Dummy rules'''
    if len(sequence) == 2:
        return [4, 12]
    # need a len check here to avoid IndexError when `sequence` is empty
    if len(sequence) > 0 and sequence[-1] == 4:
        return [7]
    return [0, 9]

def generate_sequences(size):
    if size == 0:
        yield []
    else:
        for left_part in generate_sequences(size-1):
            for right_part in get_appendable_values(left_part):
                yield left_part + [right_part]

for seq in generate_sequences(4):
    print(seq)
Result:
[0, 0, 4, 7]
[0, 0, 12, 0]
[0, 0, 12, 9]
[0, 9, 4, 7]
[0, 9, 12, 0]
[0, 9, 12, 9]
[9, 0, 4, 7]
[9, 0, 12, 0]
[9, 0, 12, 9]
[9, 9, 4, 7]
[9, 9, 12, 0]
[9, 9, 12, 9]
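The same idea (extend every sequence of length k by each appendable value to get the sequences of length k+1) can also be written without recursion. The following is a sketch only; generate_sequences_iter is a hypothetical name, and get_appendable_values is repeated with the dummy rules so the snippet runs on its own.

```python
def get_appendable_values(sequence):
    """Dummy rules copied from the example above."""
    if len(sequence) == 2:
        return [4, 12]
    if len(sequence) > 0 and sequence[-1] == 4:
        return [7]
    return [0, 9]

def generate_sequences_iter(size):
    """Build all sequences level by level instead of recursing."""
    sequences = [[]]  # the single sequence of length 0
    for _ in range(size):
        # Extend every sequence of the current length by each allowed value
        sequences = [seq + [value]
                     for seq in sequences
                     for value in get_appendable_values(seq)]
    return sequences

result = generate_sequences_iter(4)
print(len(result))  # 12
print(result[0])    # [0, 0, 4, 7]
```

Unlike the generator version, this keeps all sequences of the current length in memory at once, so it trades memory for simplicity.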
I'm not sure if I understand your problem correctly. Do you want to get a list of the possible permutations of length 4, drawn from the sequence?
In that case, the itertools package might come in handy (see How do I generate all permutations of a list?):
import itertools
a = [2, 4, 6, 8, 10]
permutations_object = itertools.permutations(a, 4)
print(list( permutations_object ))
This outputs a list of tuples which are the permutations:
[(2, 4, 6, 8), (2, 4, 6, 10), (2, 4, 8, 6), (2, 4, 8, 10), ...]
Here's a solution which works using recursion, although as Adrian Usler suggests, using the itertools library probably works better. I think the following code works:
def gen_all_sequences(sequence):
    # Recursive base case:
    if len(sequence) <= 1:
        return [sequence]  # Only one possible sequence of length 0 or 1
    # Recursive general case:
    all_sequences = []
    for i, elem in enumerate(sequence):
        # Construct and append all possible sequences beginning with elem
        remaining_elements = sequence[:i] + sequence[i+1:]
        all_sequences += [[elem] + seq for seq in gen_all_sequences(remaining_elements)]
    return all_sequences

print(gen_all_sequences(["a", "b", "c", "d"]))
# Will print all arrangements (permutations) of "a", "b", "c" and "d", e.g. abcd, abdc, acbd, acdb etc.

Finding all possible combinations of elements in a list of integers, where all elements of each new list have to be at least 2 apart

I need a function that takes a list as input and returns all combinations that use the maximum possible number of integers (here 5) and contain no two adjacent integers such as 2, 3 or 6, 7.
list0 = [0, 3, 4, 6, 10, 11, 12, 13]
all_combinations = magic_function(list0)
all_combinations would be this:
[[0, 3, 6, 10, 12],
[0, 3, 6, 11, 13],
[0, 4, 6, 10, 12],
[0, 4, 6, 11, 13]]
It could be done by getting all combinations and then picking out the correct ones, but I can't have it use much memory or be slow, because it has to work with lists with length up to 98 elements.
You can use a recursive generator function:
def combos(d, c = []):
    if len(c) == 5:
        yield c
    else:
        for i in d:
            if not c or c[-1]+1 < i:
                yield from combos(d, c+[i])

list0 = [0, 3, 4, 6, 10, 11, 12, 13]
print(list(combos(list0)))
Output:
[[0, 3, 6, 10, 12],
[0, 3, 6, 10, 13],
[0, 3, 6, 11, 13],
[0, 4, 6, 10, 12],
[0, 4, 6, 10, 13],
[0, 4, 6, 11, 13]]
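The generator above hardcodes the target length 5. One possible way to derive it from the input is sketched below, assuming the list is sorted and has no duplicates: a greedy pass counts the largest number of non-adjacent values, and that count is passed to the generator. The names max_nonadjacent_count and combos_of_size are hypothetical and are not part of the answer.

```python
def max_nonadjacent_count(values):
    """Greedy count of the largest subset in which no two values differ by 1.

    Assumes `values` is sorted and contains no duplicates.
    """
    count, last = 0, None
    for v in values:
        if last is None or v - last > 1:
            count += 1
            last = v
    return count

def combos_of_size(values, size, start=0, prefix=()):
    """Yield every non-adjacent combination of exactly `size` values."""
    if len(prefix) == size:
        yield list(prefix)
        return
    # Only look at values to the right of the last one chosen
    for idx in range(start, len(values)):
        v = values[idx]
        if not prefix or v - prefix[-1] > 1:
            yield from combos_of_size(values, size, idx + 1, prefix + (v,))

list0 = [0, 3, 4, 6, 10, 11, 12, 13]
size = max_nonadjacent_count(list0)
print(size)  # 5
print(list(combos_of_size(list0, size)))
```

This still explores branches that cannot reach the target size, so it may be slow for long inputs; it is meant only to show how the length can be computed instead of hardcoded.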
My approach is as follows:
import itertools
lst = [0, 3, 4, 6, 10, 11, 12, 13] # 0 | 3 4 | 6 | 10 11 12 13
chunks, chunk = [], [] # defining chunk here is actually useless
prev = None
for x in lst:
    if prev is None or x - prev > 1: # if jump > 1
        chunks.append(chunk := []) # insert a brand-new chunk
    chunk.append(x)
    prev = x # update the previous number

def max_nonadjacents(chunk): # maximal nonadjacent sublists (given a chunk)
    if not chunk or len(chunk) % 2: # odd length is easy
        return {tuple(chunk[::2])}
    return {tuple((chunk[:i] + chunk[i+1:])[::2]) for i in range(len(chunk))}

output = [list(itertools.chain.from_iterable(prod)) # flattening
          for prod in itertools.product(*map(max_nonadjacents, chunks))]
print(output)
# [[0, 3, 6, 11, 13], [0, 3, 6, 10, 12], [0, 3, 6, 10, 13], [0, 4, 6, 11, 13], [0, 4, 6, 10, 12], [0, 4, 6, 10, 13]]
I am assuming that the input list is sorted.
My approach starts from the observation that the problem can be divided into smaller pieces: the list can be split into chunks, where each chunk consists of consecutive integers: [0], [3, 4], [6], [10, 11, 12, 13].
All the valid combinations can then be obtained by taking every maximal non-adjacent sublist of each chunk and forming the Cartesian product of these sublists across the chunks.
The code follows this procedure: (i) get the chunks, (ii) define a helper function max_nonadjacents that extracts all the maximal non-adjacent sublists of a chunk, (iii) apply it to each chunk (map(max_nonadjacents, ...)), and then (iv) take the product.
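Step (i), splitting the sorted list into runs of consecutive integers, can also be sketched with itertools.groupby. This is an alternative to the loop in the answer, not the answer's own code; consecutive_runs is a hypothetical name, and the input is assumed to be sorted with no duplicates.

```python
from itertools import groupby

def consecutive_runs(sorted_values):
    """Group a sorted list of distinct integers into runs of consecutive values."""
    runs = []
    # Within a run of consecutive integers, value - index stays constant
    for _, group in groupby(enumerate(sorted_values), key=lambda pair: pair[1] - pair[0]):
        runs.append([value for _, value in group])
    return runs

print(consecutive_runs([0, 3, 4, 6, 10, 11, 12, 13]))
# [[0], [3, 4], [6], [10, 11, 12, 13]]
```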

Efficient way to find index of elements in a large list of integers starting with max to min elements

I have a large unsorted list of integers that may contain duplicates. I would like to create another list made of sub-lists of indexes into the first list, one sub-list per distinct value, ordered from the max element down to the min.
For example, if I have a list like this:
list = [4, 1, 4, 8, 5, 13, 2, 4, 3, 7, 14, 4, 4, 9, 12, 1, 6, 14, 10, 8, 6, 4, 11, 1, 2, 11, 3, 9]
The output should be:
indexList = [[10, 17], [5], [14], [22, 25], [18], [13, 27], [3, 19], [9], [16, 20], [4], [0, 2, 7, 11, 12, 21], [8, 26], [6, 24], [1, 15, 23]]
where [10, 17] are the indexes at which 14 occurs, and so on.
My code is below. Profiling it with cProfile on a list of around 9000 elements shows that it takes about 6 seconds.
def indexList(list):
    # List with sorted elements
    sortedList = sorted(list, reverse = True)
    seen = set()
    uSortedList = [x for x in sortedList if x not in seen and not seen.add(x)]
    indexList = []
    for e in uSortedList:
        indexList.append([i for i, j in enumerate(list) if j == e])
    return indexList
Here you go:
def get_list_indices(ls):
    indices = {}
    for n, i in enumerate(ls):
        try:
            indices[i].append(n)
        except KeyError:
            indices[i] = [n]
    return [i[1] for i in sorted(indices.items(), reverse=True)]

test_list = [4, 1, 4, 8, 5, 13, 2, 4, 3, 7, 14, 4, 4, 9, 12, 1, 6, 14, 10, 8, 6, 4, 11, 1, 2, 11, 3, 9]
print(get_list_indices(test_list))
Based on some very basic testing, it is about twice as fast as the code you posted.
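The same single-pass idea (collect the indexes of each value in a dictionary, then sort the distinct values once) can be sketched with collections.defaultdict, which removes the try/except. The function name indices_by_value_desc is an assumption for illustration.

```python
from collections import defaultdict

def indices_by_value_desc(values):
    """Return lists of indexes, one per distinct value, from largest value to smallest."""
    positions = defaultdict(list)
    for index, value in enumerate(values):
        positions[value].append(index)  # one pass over the input
    # Sort only the distinct values, not the whole input
    return [positions[v] for v in sorted(positions, reverse=True)]

test_list = [4, 1, 4, 8, 5, 13, 2, 4, 3, 7, 14, 4, 4, 9, 12, 1, 6, 14, 10, 8, 6, 4, 11, 1, 2, 11, 3, 9]
print(indices_by_value_desc(test_list))
```

The original code rescans the whole list once per distinct value, whereas this version scans it once and then sorts the distinct values.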

python generate sublist with offset and condition

Hey I'm trying to generate sublists of a list. For example I've a list like this:
l = [1,2,3,4,5,6,7,8,9,10,11,12]
I want to split it into sublists of length 4, where the first element of each sublist is the same as the last element of the previous sublist. As I said, every sublist must have length 4. Like this:
l1 = [1,2,3,4]
l2 = [4,5,6,7]
l3 = [7,8,9,10]
l4 = [10, 11, 12] <-- should be ignored
Does anyone have an idea? I'm thinking about a generator, but I'm not quite sure.
A simple but flexible generator implementation:
def overlapping_sublists(l, n, overlap=1, start=0):
    while start <= len(l) - n:
        yield l[start:start+n]
        start += n - overlap
Example usage:
>>> l = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12]
>>> list(overlapping_sublists(l, 4))
[[1, 2, 3, 4], [4, 5, 6, 7], [7, 8, 9, 10]]
>>> list(overlapping_sublists(l, 4, 2, 3))
[[4, 5, 6, 7], [6, 7, 8, 9], [8, 9, 10, 11]]
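The generator above relies on slicing, so it needs a list (or another sliceable sequence). For arbitrary iterables, a similar result can be sketched with a bounded deque. This is an illustrative variant under stated assumptions, not the answer's code: overlapping_windows is a hypothetical name, and the check on overlap is an added safeguard, since a step of zero or less would never advance.

```python
from collections import deque
from itertools import islice

def overlapping_windows(iterable, n, overlap=1):
    """Yield full windows of length n; consecutive windows share `overlap` items."""
    step = n - overlap
    if n <= 0 or step <= 0:
        raise ValueError("need n > 0 and overlap < n")
    it = iter(iterable)
    window = deque(islice(it, n), maxlen=n)
    while len(window) == n:
        yield list(window)
        new_items = list(islice(it, step))
        if len(new_items) < step:
            return  # not enough items left for another full window
        window.extend(new_items)  # maxlen drops the oldest `step` items

print(list(overlapping_windows(range(1, 13), 4)))
# [[1, 2, 3, 4], [4, 5, 6, 7], [7, 8, 9, 10]]
```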
a = []
l = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12]
for i in range(0, len(l)-3, 3):
    a.append(l[i:i+4])
This gives a = [[1, 2, 3, 4], [4, 5, 6, 7], [7, 8, 9, 10]].
Or you can write it as a list comprehension:
[l[i:i+4] for i in range(0, len(l)-3, 3)]
print([l[i:i+4] for i in range(0, len(l), 3)])
Output:
[[1, 2, 3, 4], [4, 5, 6, 7], [7, 8, 9, 10], [10, 11, 12]]
Only sublists of length 4:
print([m for m in [l[i:i+4] for i in range(0, len(l), 3)] if len(m) == 4])
Output:
[[1, 2, 3, 4], [4, 5, 6, 7], [7, 8, 9, 10]]
Using generators:
for n in (m for m in (l[i:i+4] for i in range(0, len(l), 3)) if len(m) == 4):
    print(n)
Output:
[1, 2, 3, 4]
[4, 5, 6, 7]
[7, 8, 9, 10]

How can I turn a flat list into a 2D array in python?

How can I turn a list such as:
data_list = [0,1,2,3,4,5,6,7,8,9]
into an array (I'm using numpy) that looks like:
data_array = [ [0,1] , [2,3] , [4,5] , [6,7] , [8,9] ]
Can I slice segments off the beginning of the list and append them to an empty array?
Thanks
>>> import numpy as np
>>> np.array(data_list).reshape(-1, 2)
array([[0, 1],
[2, 3],
[4, 5],
[6, 7],
[8, 9]])
(The reshape method returns a new "view" on the array whenever possible, so the array data is not copied again; only np.array(data_list) makes a copy of the original list.)
def nest_list(list1, rows, columns):
    result = []
    start = 0
    end = columns
    for i in range(rows):
        result.append(list1[start:end])
        start += columns
        end += columns
    return result
For example:
list1=[0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15]
nest_list(list1,4,4)
Output:
[[0, 1, 2, 3], [4, 5, 6, 7], [8, 9, 10, 11], [12, 13, 14, 15]]
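nest_list takes the number of rows explicitly. A possible pure-Python variant that derives the row count from the list length is sketched below; nest_by_columns is a hypothetical name, and the divisibility check is an added assumption about how uneven input should be handled.

```python
def nest_by_columns(flat, columns):
    """Split a flat list into rows of `columns` items each."""
    if columns <= 0 or len(flat) % columns:
        # Refuse input that would leave a short final row
        raise ValueError("list length must be a positive multiple of columns")
    return [flat[i:i + columns] for i in range(0, len(flat), columns)]

data_list = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
print(nest_by_columns(data_list, 2))
# [[0, 1], [2, 3], [4, 5], [6, 7], [8, 9]]
```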
