You are given an array of N bits: d[0], d[1], ..., d[N - 1].
You can perform AT MOST one move on the array: choose any two indices L and R, and flip all the bits between (and including) the L-th and R-th bits. L and R are the left-most and right-most indices of the segment which you have decided to flip.
What is the maximum number of 1-bits (indicated by S) which you can obtain in the final bit-string?
'Flipping' a bit means that a 0 is transformed to a 1 and a 1 is transformed to a 0 (0->1, 1->0).
Input Format: An integer N, next line contains the N bits, separated by spaces: d[0] d[1] ... d[N - 1]
Output: S
Constraints:
1 <= N <= 100000
d[i] can only be 0 or 1
0 <= L <= R < N
Sample Input:
8
1 0 0 1 0 0 1 0
Sample Output: 6
Explanation:
We can get a maximum of 6 ones in the given binary array, for example by performing the following operation:
Flip [1, 5] ==> 1 1 1 0 1 1 1 0
Cleaned up and made Pythonic
arr1 = [1, 0, 0, 1, 0, 0, 1, 0]
arr2 = [1, 0, 0, 0, 1, 0, 0, 1, 0, 1, 0, 0, 0, 0, 1, 0, 0, 1]
arr3 = [0,0,0,1,1,0,1,0,1,1,0,0,1,1,1]
def maximum_ones(arr):
    """
    Returns max possible number of ones after flipping a span of bit array
    """
    total_one = 0
    net = 0
    maximum = 0
    for bit in arr:
        if bit:
            total_one += 1
            net -= 1
        else:
            net += 1
        maximum = max(maximum, net)
        if net < 0:
            net = 0
    return total_one + maximum
print(maximum_ones(arr1))
print(maximum_ones(arr2))
print(maximum_ones(arr3))
Output:
6
14
11
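To cross-check the linear scan, here is a minimal brute-force sketch that tries every (L, R) pair as well as the no-flip case. The name maximum_ones_bruteforce is my own and not part of the answer above; it runs in O(N^2), so it is only meant for verifying small inputs.

```python
def maximum_ones_bruteforce(bits):
    """Try every inclusive span (left, right) and return the best count of ones."""
    n = len(bits)
    ones = sum(bits)
    best = ones  # "at most one move": not flipping at all is allowed
    for left in range(n):
        gain = 0  # net change in the number of ones if we flip bits[left..right]
        for right in range(left, n):
            gain += 1 if bits[right] == 0 else -1
            best = max(best, ones + gain)
    return best


print(maximum_ones_bruteforce([1, 0, 0, 1, 0, 0, 1, 0]))  # 6
```

It should agree with maximum_ones on the three sample arrays.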
If we want the L and R indices
Not so sure about this one. It can probably be made cleaner.
arr1 = [1, 0, 0, 1, 0, 0, 1, 0]
arr2_0 = [1, 0, 0, 0, 1, 0, 0, 1, 0, 1, 0, 0, 0, 0, 1, 0, 0, 1]
arr2_1 = [1, 1, 0, 0, 1, 0, 0, 1, 0, 1, 0, 0, 0, 0, 1, 0, 0, 1]
arr2_2 = [1, 1, 1, 0, 1, 0, 0, 1, 0, 1, 0, 0, 0, 0, 1, 0, 0, 1]
arr3 = [0,0,0,1,1,0,1,0,1,1,0,0,1,1,1]
def maximum_ones(arr):
    """
    Returns max possible number of ones after flipping a span of bit array
    and the (L,R) indices (inclusive) of such a flip
    """
    total_one = 0
    net = 0
    maximum = 0
    L = R = 0
    started_flipping = False
    for i, bit in enumerate(arr):
        if bit:
            total_one += 1
            net -= 1
        else:
            net += 1
            if not started_flipping:
                started_flipping = True
                L = i
        if net > maximum:
            maximum = net
            R = i
        if net < 0:
            net = 0
            if i < R:
                L = i
    return (total_one + maximum, (L,R))
print(maximum_ones(arr1))
print(maximum_ones(arr2_0))
print(maximum_ones(arr2_1))
print(maximum_ones(arr2_2))
print(maximum_ones(arr3))
Output:
(6, (1, 5))
(14, (1, 16))
(14, (2, 16))
(14, (3, 16))
(11, (0, 2))
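Since the author was unsure about the index tracking, here is a hedged alternative sketch. It remembers where the current candidate span starts and moves that start to i + 1 whenever the running net is reset. The function name maximum_ones_with_span is my own. As far as I can tell, the version above reports (4, (0, 4)) for [0, 1, 1, 0, 0], although flipping (0, 4) only yields 3 ones; the sketch below reports (4, (3, 4)) for that input.

```python
def maximum_ones_with_span(bits):
    """Return (max ones after at most one flip, (L, R)), or None as the span if no flip helps."""
    total_ones = 0
    best_gain, best_span = 0, None
    gain, start = 0, 0  # running net gain and where the current candidate span begins
    for i, bit in enumerate(bits):
        if bit:
            total_ones += 1
            gain -= 1
        else:
            gain += 1
        if gain > best_gain:
            best_gain, best_span = gain, (start, i)
        if gain < 0:
            # A negative prefix can never help: restart just after this index.
            gain, start = 0, i + 1
    return total_ones + best_gain, best_span


print(maximum_ones_with_span([1, 0, 0, 1, 0, 0, 1, 0]))  # (6, (1, 5))
print(maximum_ones_with_span([0, 1, 1, 0, 0]))           # (4, (3, 4))
```

Returning None for the span when the array is all ones is a design choice of this sketch, reflecting the "at most one move" rule.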
First Iteration
Here is what I had originally, if you want to see the evolution of the thought processes. Here, I was essentially transliterating what I came up with on paper.
Essentially, we traverse the array and start flipping bits (ok, not really), keeping track of cumulative flipped zeros and cumulative flipped ones in two separate arrays along with the total flipped ones in an integer counter. If the difference between flipped ones and zeroes at a given index - the "net" - drops below zero, we 'reset' the cumulative counts back at zero at that index (but nothing else). Along the way, we also keep track of the maximum net we've achieved and the index at which that occurs. Thus, the total is simply the total 1's we've seen, plus the net at the maximum index.
arr = [1, 0, 0, 1, 0, 0, 1, 0]
total_one = 0
one_flip = [0 for _ in range(len(arr))]
zero_flip = [0 for _ in range(len(arr))]
# deal with first element of array
if arr[0]:
    total_one += 1
else:
    zero_flip[0] = 1
maximum = dict(index=0,value=0) #index, value
i = 1
# now deal with the rest
while i < len(arr):
    # if element is 1 we definitely increment total_one, else, we definitely flip
    if arr[i]:
        total_one += 1
        one_flip[i] = one_flip[i-1] + 1
        zero_flip[i] = zero_flip[i-1]
    else:
        zero_flip[i] = zero_flip[i-1] + 1
        one_flip[i] = one_flip[i-1]
    net = zero_flip[i] - one_flip[i]
    if net > 0:
        if maximum['value'] < net:
            maximum['value'] = net
            maximum['index'] = i
    else: # net <= 0, we restart counting our "net"
        one_flip[i] = 0
        zero_flip[i] = 0
    i += 1
maximum_flipped = total_one - one_flip[maximum['index']] + zero_flip[maximum['index']]
Results:
print(total_one, -one_flip[maximum['index']], zero_flip[maximum['index']] )
print(maximum_flipped)
print('________________________________________________')
print(zero_flip, arr, one_flip, sep='\n')
print('maximum index', maximum['index'])
Output:
3 -1 4
6
________________________________________________
[0, 1, 2, 2, 3, 4, 4, 5]
[1, 0, 0, 1, 0, 0, 1, 0]
[0, 0, 0, 1, 1, 1, 2, 2]
maximum index 5
if arr = [1, 0, 0, 0, 1, 0, 0, 1, 0, 1, 0, 0, 0, 0, 1, 0, 0, 1]
6 -4 12
14
________________________________________________
[0, 1, 2, 3, 3, 4, 5, 5, 6, 6, 7, 8, 9, 10, 10, 11, 12, 12]
[1, 0, 0, 0, 1, 0, 0, 1, 0, 1, 0, 0, 0, 0, 1, 0, 0, 1]
[0, 0, 0, 0, 1, 1, 1, 2, 2, 3, 3, 3, 3, 3, 4, 4, 4, 5]
maximum index 16
Finally, if arr = [0, 0, 0, 1, 1, 0, 1, 0, 1, 1, 0, 0, 1, 1, 1]
8 0 3
11
________________________________________________
[1, 2, 3, 3, 3, 4, 4, 5, 5, 0, 1, 2, 2, 0, 0]
[0, 0, 0, 1, 1, 0, 1, 0, 1, 1, 0, 0, 1, 1, 1]
[0, 0, 0, 1, 2, 2, 3, 3, 4, 0, 0, 0, 1, 0, 0]
maximum index 2
Great, now tear it apart, people!
Traverse the whole array. Keep a count in the following way:
Do +1 for every 0 bit encountered.
Do -1 for every 1.
If this count goes negative at any stage, reset it to 0. Keep track of the maximum value of this count. Add this max_count to the number of 1's in the input array. This will be your answer.
Code:
arr = [1, 0, 0, 1, 0, 0, 1, 0]
# I'm taking your sample case. Take the input the way you want
count,count_max,ones = 0,0,0
for i in arr:
    if i == 1:
        ones += 1
        count -= 1
    if i == 0:
        count += 1
    if count_max < count:
        count_max = count
    if count < 0:
        count = 0
print (ones + count_max)
Small and simple :)
I have a list of integers which I want to separate according to a certain condition. I want to get the sum and the count of the list elements, stopping when three or more consecutive elements are equal to 0; then the sum and count orders restart again from where they stopped.
For example, part of the list is:
[8, 2, 1, 1, 2, 0, 0, 0, 0, 0, 6, 0, 2, 0, 0, 0, 8, 0, 0, 2, 0, 0, 0, 6, 0, 0]
The process would be:
8, 2, 1, 1, 2 -> sum: 14, length: 5
0, 0, 0, 0, 0
6, 0, 2 -> sum: 8, length: 3
0, 0, 0
8, 0, 0, 2 -> sum: 10, length: 4
0, 0, 0
6, 0, 0 -> sum: 6, length: 3
So the output I want is:
[[14, 5], [8, 3], [10, 4], [6, 3]]
What I've written so far computes the sum okay, but my problem is that zeros within sections aren't counted in the lengths.
Current (incorrect) output:
[[14, 5], [8, 2], [10, 2], [6, 2]]
Code:
arr = [8, 2, 1, 1, 2, 0, 0, 0, 0, 0, 6, 0, 2, 0, 0, 0, 8, 0, 0, 2, 0, 0, 0, 6, 0, 0]
result = []
summed, count = 0, 0
for i in range(0, len(arr) - 2):
    el, el1, el2 = arr[i], arr[i + 1], arr[i + 2]
    if el != 0:
        summed = summed + el
        count = count + 1
    if el == 0 and el1 == 0 and el2 == 0:
        if summed != 0:
            result.append([summed, count])
            summed = 0
            count = 0
    elif i == len(arr) - 3:
        summed = el + el1 + el2
        count = count + 1
        result.append([summed, count])
        break
print(result)
It is quite hard to understand what your code does. Working with strings seems more straightforward and readable; your output can be achieved in just two lines (thanks to @CrazyChucky for the improvement):
import re
arr = [8, 2, 1, 1, 2, 0, 0, 0, 0, 0, 6, 0, 2, 0, 0, 0, 8, 0, 0, 2, 0, 0, 0, 6, 0, 0]
# Convert to String by joining integers, and split into substrings, the separator being three zeros or more
strings = re.split(r'0{3,}', ''.join(str(i) for i in arr))
# Sums and counts using list comprehensions
output = [[sum(int(x) for x in substring), len(substring)] for substring in strings]
Output:
>>> output
[[14, 5], [8, 3], [10, 4], [6, 3]]
Remember that readability is always the most important factor in any code. One should read your code for the first time and understand how it works.
If the full list contains numbers with more than one digit, you can do the following:
# Convert to a string by joining the integers with commas, then split into substrings, the separator being three zeros or more
strings = re.split(r',?(?:0,){3,}', ','.join(str(i) for i in arr))
# Make a list of numbers from those strings
num_lists = [string.split(',') for string in strings]
# Sums and counts using list comprehensions
output = [[sum(int(x) for x in num_list), len(num_list)] for num_list in num_lists]
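If going through strings feels fragile for multi-digit values (as far as I can tell, the comma-joined pattern can also match the final 0 of a number such as 10 when it is followed by three zeros), a sketch without any string conversion is possible with itertools.groupby. The function name split_on_zero_runs and the min_zeros parameter are my own, hypothetical choices. Runs of fewer than three zeros are kept inside the current segment, which is my reading of the expected output in the question.

```python
from itertools import groupby


def split_on_zero_runs(values, min_zeros=3):
    """Return (sum, length) for each segment separated by runs of >= min_zeros zeros."""
    segments = []
    current = []
    for is_zero, group in groupby(values, key=lambda v: v == 0):
        run = list(group)
        if is_zero and len(run) >= min_zeros:
            # A long zero run closes the current segment (if there is one).
            if current:
                segments.append((sum(current), len(current)))
            current = []
        else:
            # Non-zero values, or a short zero run, stay in the segment.
            current.extend(run)
    if current:
        segments.append((sum(current), len(current)))
    return segments


arr = [8, 2, 1, 1, 2, 0, 0, 0, 0, 0, 6, 0, 2, 0, 0, 0, 8, 0, 0, 2, 0, 0, 0, 6, 0, 0]
print(split_on_zero_runs(arr))  # [(14, 5), (8, 3), (10, 4), (6, 3)]
```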
This answer is not so much to suggest a way I'd recommend doing it, as to highlight how clever Paul Lemarchand's idea of using a regular expression is. Without Python's re module doing the heavy lifting for you, you have to either look ahead to see how many zeros are coming (as in Prakash Dahal's answer), or keep track of how many zeros you've seen as you go. I think this implementation of the latter is about the simplest and shortest way you could solve this problem "from scratch":
input_list = [8, 2, 1, 1, 2, 0, 0, 0, 0, 0, 6, 0, 2, 0, 0, 0, 8, 0, 0, 2, 0, 0,
              0, 6, 0, 0]
output_list = []
current_run = []
pending_zeros = 0
for num in input_list:
    # If the number is 0, increment the number of "pending" zeros. (We
    # don't know yet if they're part of a separating chunk or not.)
    if num == 0:
        pending_zeros += 1
    # If this is the first nonzero after three or more zeros, process
    # the existing run and start over from the current number.
    elif pending_zeros >= 3:
        output_list.append((sum(current_run), len(current_run)))
        current_run = [num]
        pending_zeros = 0
    # Otherwise, the pending zeros (if any) should be included in the
    # current run. Add them, and then the current number.
    else:
        current_run += [0] * pending_zeros
        current_run.append(num)
        pending_zeros = 0
# Once we're done looping, there will still be a run of numbers in the
# buffer (assuming the list had any nonzeros at all). It may have
# pending zeros at the end, too. Include the zeros if there are 2 or
# fewer, then process.
if current_run:
    if pending_zeros <= 2:
        current_run += [0] * pending_zeros
    output_list.append((sum(current_run), len(current_run)))
print(output_list)
[(14, 5), (8, 3), (10, 4), (6, 3)]
One note: I made each entry in the list a tuple rather than a list. Tuples and lists have a lot of overlap, and in this case either would probably work perfectly well... but a tuple is a more idiomatic choice for an immutable data structure that will always be the same length, in which each position refers to something different. (In other words, it's not a list of equivalent items, but rather a well-defined combination of (sum, length).)
Use this:
a = [8, 2, 1, 1, 2, 0, 0, 0, 0, 0, 6, 0, 2, 0, 0, 0, 8, 0, 0, 2, 0, 0, 0, 6, 0, 0]
total_list = []
i = 0
sum_int = 0
count_int = 0
for ind, ele in enumerate(a):
    if ind < (len(a) - 2):
        sum_int += ele
        if sum_int != 0:
            count_int += 1
        if (a[ind] == 0) and (a[ind+1] == 0) and (a[ind+2] == 0):
            if sum_int != 0:
                count_int -= 1
                total_list.append([sum_int, count_int])
            sum_int = 0
            count_int = 0
    else:
        sum_int += ele
        count_int += 1
        if sum_int != 0:
            total_list.append([sum_int, count_int+1])
            sum_int = 0
            count_int = 0
print(total_list)
I have an assignment where I am to count the occurrences of consecutive 1s in a sequence. This is where I'm at. Why doesn't this work?
def count11(seq):
    x = 0
    for i in seq:
        if i == 1:
            if seq[i+1] == 1:
                x += 1
    return x
print(count11([0, 0, 1, 1, 1, 0]))
Edit: The function is supposed to count the number of pairs of ones, so the given sequence should output 2.
You can also make a one-liner:
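import re
import math
def count(seq):
    return math.ceil(len(max(re.sub(r"[^1]"," ","".join(map(str, seq))).split(), key=len))/2)
print (count([0, 0, 1, 1, 1, 0]))
This returns the length of the longest run of 1s divided by 2, rounded up: 2 for your example, which matches its two (overlapping) pairs. Note that this is not the overlapping-pair count in general: a run of four 1s contains three overlapping pairs, but this returns 2.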
Code Review and fix
First let's review your code:
for i in seq
The for loop will get you each element from the sequence. So this will be values of 0s and 1s.
Inside the loop, you are checking
if seq[i+1] == 1:
Since the values of i can only be 0 or 1, and you have already checked that i == 1, you are always checking seq[2] rather than the element that follows the current one. Instead, you should iterate over the indices 0 to len(seq) - 2, so that i + 1 is always a valid index.
To do that, you can set the counter to 0 and for every iteration inside the for loop, increment the counter by 1 and use that to check seq. Python makes it easy by having enumerate(seq). This will give you a tuple (counter, value). That way you have direct access to the counter and can use that to check the counter + 1 value as well. Or you can iterate through the loop using range (0 to len of seq - 1).
Your modified code will be as follows:
def count11(seq):
    x = 0
    for i in range (len(seq)-1):
        if seq[i] == seq[i+1] == 1: #check if ith and i+1th position are both equal to 1
            x += 1
    return x
For a value of [0, 0, 1, 1, 1, 0], the result will be 2
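The enumerate approach mentioned above could look like the following minimal sketch. The name count11_enumerate is my own; it iterates over all elements except the last, so that index + 1 always stays in range.

```python
def count11_enumerate(seq):
    """Count adjacent pairs of 1s (overlapping pairs are counted separately)."""
    pairs = 0
    # seq[:-1] skips the last element, which has no right-hand neighbour.
    for index, value in enumerate(seq[:-1]):
        if value == 1 and seq[index + 1] == 1:
            pairs += 1
    return pairs


print(count11_enumerate([0, 0, 1, 1, 1, 0]))  # 2
```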
Alternate solutions:
There are three types of consecutive checks we can do. I am going to give you code for all 3 types.
Type 1: If the value is [1, 1, 1, 0, 1, 1], there are two runs of consecutive 1s in this list. So the result can be 2.
Type 2: An alternate way to look at this is to take every pair of adjacent elements and count the pair if both are 1s. If that's the case, the result will be 3 (1, 1 for the first and second, then 1, 1 for the second and third, then 1, 1 for the last two elements).
Type 3: Another check you can do is to find the longest run of 1s in the list. In this case, the longest will be 3.
In the code below,
consec_a will return 3 (check if the next element is also a 1). This solution is similar to what you have. I have made it a list comprehension and a single line.
consec_b will return 2 (check if there is a series of uninterrupted 1s)
consec_c will return 3 (find the max number of uninterrupted 1s)
def consec_a(sequ):
    return sum(sequ[i] == sequ[i+1] == 1 for i in range(len(sequ)-1))
def consec_b(sequ):
    seq_flag = False
    seq_count = 0
    for i in range (len(sequ)-1):
        if sequ[i] == sequ[i+1] == 1:
            if not seq_flag: seq_count += 1
            seq_flag = True
        else: seq_flag = False
    return seq_count
def consec_c(sequ):
    seq_flag = False
    seq_count = 1
    max_count = 0
    for i in range (len(sequ)-1):
        if sequ[i] == sequ[i+1] == 1:
            seq_count += 1
            if max_count < seq_count: max_count = seq_count
            seq_flag = True
        else:
            seq_flag = False
            seq_count = 1
    return max_count
print ('Results of consec_a function :')
print('[0, 0, 1, 1, 1, 0]', consec_a([0, 0, 1, 1, 1, 0]))
print('[0, 1, 1, 0, 1, 1]', consec_a([0, 1, 1, 0, 1, 1]))
print('[1, 1, 1, 0, 1, 1]', consec_a([1, 1, 1, 0, 1, 1]))
print('[1, 1, 1, 1, 1, 1]', consec_a([1, 1, 1, 1, 1, 1]))
print('[0, 0, 0, 0, 0, 0]', consec_a([0, 0, 0, 0, 0, 0]))
print ('\nResults of consec_b function :')
print('[0, 0, 1, 1, 1, 0]', consec_b([0, 0, 1, 1, 1, 0]))
print('[0, 1, 1, 0, 1, 1]', consec_b([0, 1, 1, 0, 1, 1]))
print('[1, 1, 1, 0, 1, 1]', consec_b([1, 1, 1, 0, 1, 1]))
print('[1, 1, 1, 1, 1, 1]', consec_b([1, 1, 1, 1, 1, 1]))
print('[0, 0, 0, 0, 0, 0]', consec_b([0, 0, 0, 0, 0, 0]))
print ('\nResults of consec_c function :')
print('[0, 0, 1, 1, 1, 0]', consec_c([0, 0, 1, 1, 1, 0]))
print('[0, 1, 1, 0, 1, 1]', consec_c([0, 1, 1, 0, 1, 1]))
print('[1, 1, 1, 0, 1, 1]', consec_c([1, 1, 1, 0, 1, 1]))
print('[1, 1, 1, 1, 1, 1]', consec_c([1, 1, 1, 1, 1, 1]))
print('[0, 0, 0, 0, 0, 0]', consec_c([0, 0, 0, 0, 0, 0]))
The output of this will be:
Results of consec_a function :
[0, 0, 1, 1, 1, 0] 2
[0, 1, 1, 0, 1, 1] 2
[1, 1, 1, 0, 1, 1] 3
[1, 1, 1, 1, 1, 1] 5
[0, 0, 0, 0, 0, 0] 0
Results of consec_b function :
[0, 0, 1, 1, 1, 0] 1
[0, 1, 1, 0, 1, 1] 2
[1, 1, 1, 0, 1, 1] 2
[1, 1, 1, 1, 1, 1] 1
[0, 0, 0, 0, 0, 0] 0
Results of consec_c function :
[0, 0, 1, 1, 1, 0] 3
[0, 1, 1, 0, 1, 1] 2
[1, 1, 1, 0, 1, 1] 3
[1, 1, 1, 1, 1, 1] 6
[0, 0, 0, 0, 0, 0] 0
Hopefully the code is easy to understand. If you need clarification, let me know.
I am using chaining to check for 1s.
if sequ[i] == sequ[i+1] == 1
This checks whether both values are equal to 1; if so, the expression evaluates to True.
sum(sequ[i] == sequ[i+1] == 1 .....)
This sums all the True values. As you know True = 1 and False = 0.
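As a small illustration of summing booleans (the variable name flags is mine, purely for demonstration):

```python
seq = [0, 0, 1, 1, 1, 0]
# One flag per adjacent pair: True when both elements of the pair are 1.
flags = [seq[i] == seq[i + 1] == 1 for i in range(len(seq) - 1)]
print(flags)       # [False, False, True, True, False]
print(sum(flags))  # 2, because bool is a subclass of int (True == 1, False == 0)
```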
You are using i as your index, but i is the current item in the sequence, not its position. At that point, i is always 1, because you have just checked it. One way to solve this is to use enumerate, which gives you a tuple of the index and the current item.
for idx, item in enumerate(seq):
# idx is the index, item is 1 or 0
The i in your code is the item in seq, not the index into seq. You can get the length of the first run of consecutive 1s with the following code:
def count11(seq):
    count = 0
    for item in seq:
        if not item and count == 0:
            continue
        elif item == 1:
            count += 1
        else:
            break
    return count
If you want to get the max consecutive 1s:
def count11(seq):
    if not seq:
        return 0
    count = 0
    max_count = 0
    for item in seq:
        if item == 1:
            count += 1
            max_count = max(max_count, count)
        else:
            count = 0
    return max_count
You can find a similar question, and variations of it for practice, on LeetCode: Max Consecutive Ones.
Below is my solution for it.
Intuition of the solution:
Maintain a counter c to track the current run of 1s. Update the maximum run res after each increment of c.
Reset c to 0 when we encounter a 0 in the array.
def findMaxConsecutiveOnes(nums):
    n = len(nums)
    if n == 1 and nums[0] == 1:
        return 1
    res = 0
    c = 0
    for i in range(n):
        if nums[i] == 1:
            c += 1
            res = max(c,res)
        else:
            c = 0
    return res
>>> findMaxConsecutiveOnes([0, 0, 1, 1, 1, 0])
3
Suppose I have an array like
a = np.array([[0, 1, 1, 1, 0, 0, 0, 0, 1, 0],
[0, 0, 1, 0, 0, 1, 0, 0, 0, 0],
[1, 0, 0, 0, 0, 1, 0, 0, 1, 0]])
I want each row to have a specific number of ones- let's say, 5 ones per row.
So in the first row I need to add 1 one, second row needs 3 ones, and the third needs 2. I need to randomly generate those ones in places where x = 0.
How do I do this?
This was a bit tricky but here is a fully vectorized solution:
import numpy as np
def add_ones_up_to(data, n):
    # Count number of ones to add to each row
    c = np.maximum(n - np.count_nonzero(data, axis=-1), 0)
    # Make row-shuffling indices
    shuffle = np.argsort(np.random.random(data.shape), axis=-1)
    # Row-shuffled data
    data_shuffled = np.take_along_axis(data, shuffle, axis=-1)
    # Sorting indices for shuffled data (indices of zeros will be first)
    sorter = np.argsort(np.abs(data_shuffled), axis=-1)
    # Sorted row-shuffled data
    data_sort = np.take_along_axis(data_shuffled, sorter, axis=-1)
    # Mask for number of ones to add
    m = c[..., np.newaxis] > np.arange(data.shape[-1])
    # Replace values with ones or previous value depending on mask
    data_sort = np.where(m, 1, data_sort)
    # Undo sorting and shuffling
    reorderer = np.empty_like(sorter)
    np.put_along_axis(reorderer, sorter, np.arange(reorderer.shape[-1]), axis=-1)
    np.put_along_axis(reorderer, shuffle, reorderer.copy(), axis=-1)
    return np.take_along_axis(data_sort, reorderer, axis=-1)
np.random.seed(100)
data = np.array([[0, 1, 1, 1, 0, 0, 0, 0, 1, 0],
[0, 0, 1, 0, 0, 1, 0, 0, 0, 0],
[1, 0, 0, 0, 0, 1, 0, 0, 1, 0]])
n = 5
print(add_ones_up_to(data, n))
# [[0 1 1 1 1 0 0 0 1 0]
# [0 1 1 1 0 1 0 1 0 0]
# [1 0 0 0 0 1 1 0 1 1]]
import numpy as np
a = np.array([[0, 1, 1, 1, 0, 0, 0, 0, 1, 0],
[0, 0, 1, 0, 0, 1, 0, 0, 0, 0],
[1, 0, 0, 0, 0, 1, 0, 0, 1, 0]])
ones = 5
to_add = ones - np.count_nonzero(a, axis=1)
for i in range(a.shape[0]):
    idx = np.random.choice(np.flatnonzero(a[i, :] == 0), size=to_add[i], replace=False)
    a[i, idx] = 1
For each row, you count the number of non-zeros to work out how many ones to add.
You then choose that many indices from the set of indices where a is zero and set those to 1.
Seemingly straightforward problem: I want to create an array that gives the count since the last occurrence of a given condition. In this case, let the condition be that a > 0:
in: [0, 0, 5, 0, 0, 2, 1, 0, 0]
out: [0, 0, 0, 1, 2, 0, 0, 1, 2]
I assume step one would be something like np.cumsum(a > 0), but not sure where to go from there.
Edit: Should clarify that I want to do this without iteration.
Numpy one-liner:
import numpy
x = numpy.array([0, 0, 5, 0, 0, 2, 1, 0, 0])
result = numpy.arange(len(x)) - numpy.maximum.accumulate(numpy.arange(len(x)) * (x > 0))
Gives
[0, 1, 0, 1, 2, 0, 0, 1, 2]
If you want to have zeros in the beginning, turn it to zero explicitly:
result[:numpy.nonzero(x)[0][0]] = 0
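The two steps can be folded into one function. This is a sketch under my own naming (count_since_last); it marks "no occurrence yet" with -1 instead of multiplying by the mask, so leading positions come out as 0 directly and an input without any positive element does not hit the IndexError that numpy.nonzero(x)[0][0] would raise.

```python
import numpy as np


def count_since_last(x):
    """Count of elements since the last position where x > 0 (0 before the first one)."""
    x = np.asarray(x)
    idx = np.arange(len(x))
    # Index of each hit, or -1 where the condition does not hold.
    hit_index = np.where(x > 0, idx, -1)
    # Running maximum = index of the most recent hit so far (-1 if none yet).
    last_hit = np.maximum.accumulate(hit_index)
    return np.where(last_hit >= 0, idx - last_hit, 0)


print(count_since_last([0, 0, 5, 0, 0, 2, 1, 0, 0]))  # [0 0 0 1 2 0 0 1 2]
```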
Split the array based on the condition and use the lengths of the remaining pieces and the condition state of the first and last element in the array.
A pure python solution:
result = []
delta = 0
for val in [0, 0, 5, 0, 0, 2, 1, 0, 0]:
    delta += 1
    if val > 0:
        delta = 0
    result.append(delta)
I have matrices with rows that need to be centralised. In other words, each row is padded with zeros at both ends (leading and trailing zeros), while the actual data sits between them. However, I need the number of leading and trailing zeros to be equal, or in other words, what I call the data (the values between the padding zeros) to be centred in the middle of the row. Here is an example:
array:
[[0, 1, 2, 0, 2, 1, 0, 0, 0],
[2, 1, 1, 0, 0, 0, 0, 0, 0],
[0, 0, 1, 0, 0, 2, 0, 0, 0]]
centred_array:
[[0, 0, 1, 2, 0, 2, 1, 0, 0],
[0, 0, 0, 2, 1, 1, 0, 0, 0],
[0, 0, 1, 0, 0, 2, 0, 0, 0]]
I hope that explains it well enough so that you can see some of the issues I am having. First, the size of the "data" may be even, so the function needs to pick a centre for even sizes in a consistent way; the same goes for the rows themselves (a row might have an even length, which means one place needs to be chosen as the centre).
EDIT: I should probably note that I have a function that does this; its just that I can get 10^3 number of rows to centralise and my function is too slow, so efficiency would really help.
@HYRY
import numpy as np
a = np.array([[0, 1, 2, 0, 2, 1, 0, 0, 0],
              [2, 1, 1, 0, 0, 0, 0, 0, 0],
              [0, 0, 1, 0, 0, 2, 0, 0, 0]])
cd = []
(x, y) = np.shape(a)
for row in a:
    trim = np.trim_zeros(row)
    to_add = y - np.size(trim)
    # Integer division, so that np.pad receives integer pad widths in Python 3
    left = to_add // 2
    right = to_add - left
    cd.append(np.pad(trim, (left, right), 'constant', constant_values=(0, 0)).tolist())
result = np.array(cd)
print(result)
[[0 0 1 2 0 2 1 0 0]
[0 0 0 2 1 1 0 0 0]
[0 0 1 0 0 2 0 0 0]]
import numpy as np
def centralise(arr):
    # Find the x and y indexes of the nonzero elements:
    x, y = arr.nonzero()
    # Find the index of the left-most and right-most elements for each row:
    nonzeros = np.bincount(x)
    nonzeros_idx = nonzeros.cumsum()
    left = y[np.r_[0, nonzeros_idx[:-1]]]
    right = y[nonzeros_idx-1]
    # Calculate how much each y has to be shifted
    shift = ((arr.shape[1] - (right-left) - 0.5)//2 - left).astype(int)
    shift = np.repeat(shift, nonzeros)
    new_y = y + shift
    # Create centered_arr
    centered_arr = np.zeros_like(arr)
    centered_arr[x, new_y] = arr[x, y]
    return centered_arr
arr = np.array([[0, 1, 2, 0, 2, 1, 0, 0, 0],
[2, 1, 1, 0, 0, 0, 0, 0, 0],
[0, 0, 1, 0, 0, 2, 0, 0, 0]])
print(centralise(arr))
yields
[[0 0 1 2 0 2 1 0 0]
[0 0 0 2 1 1 0 0 0]
[0 0 1 0 0 2 0 0 0]]
A benchmark comparing the original code to centralise:
def orig(a):
    cd = []
    (x, y) = np.shape(a)
    for row in a:
        trim = np.trim_zeros(row)
        to_add = y - np.size(trim)
        # Integer division, so that np.pad receives integer pad widths in Python 3
        left = to_add // 2
        right = to_add - left
        cd.append(np.pad(trim, (left, right), 'constant', constant_values=(0, 0)).tolist())
    result = np.array(cd)
    return result
In [481]: arr = np.tile(arr, (1000, 1))
In [482]: %timeit orig(arr)
10 loops, best of 3: 140 ms per loop
In [483]: %timeit centralise(arr)
1000 loops, best of 3: 537 µs per loop
In [486]: (orig(arr) == centralise(arr)).all()
Out[486]: True
If you only have 10^3 rows in your array, you can probably afford a python loop if you'd like a more explicit solution:
import numpy as np
a = np.array([[0, 1, 2, 0, 2, 1, 0, 0, 0],
[2, 1, 1, 0, 0, 0, 0, 0, 0],
[0, 0, 1, 0, 0, 2, 0, 0, 0]])
for i, r in enumerate(a):
    w = np.where(r!=0)[0]
    nend = len(r) - w[-1] - 1
    nstart = w[0]
    shift = (nend - nstart)//2
    a[i] = np.roll(r, shift)
print(a)
gives:
[[0 0 1 2 0 2 1 0 0]
[0 0 0 2 1 1 0 0 0]
[0 0 1 0 0 2 0 0 0]]
A solution using np.apply_along_axis:
import numpy as np
def centerRow(a):
    i = np.nonzero(a != 0)  # "<>" is Python 2 only; "!=" works in both
    ifirst = i[0][0]
    ilast = i[0][-1]
    count = ilast-ifirst+1
    padleft = (np.size(a) - count) // 2  # integer division for the pad length
    padright = np.size(a) - padleft - count
    b = np.r_ [ np.repeat(0,padleft), a[ifirst:ilast+1], np.repeat(0,padright) ]
    return b
arr = np.array(
    [[0, 1, 2, 0, 2, 1, 0, 0, 0],
     [2, 1, 1, 0, 0, 0, 0, 0, 0],
     [0, 0, 1, 0, 0, 2, 0, 0, 0]]
)
barr = np.apply_along_axis(centerRow, 1, arr)
print(barr)
Algorithm:
find positions of non-zero values on the row of length n
find the difference, d, between 1st and the last non-zero element
store meaningful vector, x, in the row given by length d
find the mid-point of d, d_m, if it is even, get the right element
find the mid-point of row length, n_m, if it is even, pick the right
subtract d_m from n_m and place x starting at this position in the row of zeros of length n
repeat for all rows
Quick Octave Prototype (Will Soon post Python version):
mat = [[0, 1, 2, 0, 2, 1, 0, 0, 0],
       [2, 1, 1, 0, 0, 0, 0, 0, 0],
       [0, 0, 1, 0, 0, 2, 0, 0, 0]];
newMat = zeros(size(mat)); %new matrix to be filled
n = size(mat, 2);
for i = 1:size(mat,1)
    newRow = newMat(i,:);
    nonZeros = find(mat(i,:));
    x = mat(i, nonZeros(1):nonZeros(end));
    d = nonZeros(end)- nonZeros(1);
    d_m = ceil(d/2);
    n_m = ceil(n/2);
    newRow(n_m-d_m:n_m-d_m+d) = x;
    newMat(i,:) = newRow;
end
newMat
> [[0 0 1 2 0 2 1 0 0]
[0 0 0 2 1 1 0 0 0]
[0 0 1 0 0 2 0 0 0]]
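A Python sketch of the same per-row idea, for readers without Octave. The name centre_rows is my own. Note that it computes the left padding as (n - width) // 2, the same rule as the trim-and-pad code in the question; this can differ from the Octave midpoint rule by one position when the row length and the data width are both even. Leaving all-zero rows untouched is an assumption of this sketch.

```python
import numpy as np


def centre_rows(mat):
    """Centre the span between the first and last non-zero value of each row."""
    mat = np.asarray(mat)
    n = mat.shape[1]
    out = np.zeros_like(mat)
    for i, row in enumerate(mat):
        nz = np.flatnonzero(row)
        if nz.size == 0:
            continue  # all-zero row: nothing to centre (assumption)
        first, last = nz[0], nz[-1]
        width = last - first + 1   # length of the "data" including inner zeros
        start = (n - width) // 2   # left padding; any extra zero goes on the right
        out[i, start:start + width] = row[first:last + 1]
    return out


arr = np.array([[0, 1, 2, 0, 2, 1, 0, 0, 0],
                [2, 1, 1, 0, 0, 0, 0, 0, 0],
                [0, 0, 1, 0, 0, 2, 0, 0, 0]])
print(centre_rows(arr))
```

For the sample matrix, this should reproduce the centred_array shown in the question.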