Numpy cumsum with boundary conditions to the sum [duplicate] - python

I have a numpy/pandas list of values:
a = np.random.randint(-100, 100, 10000)
b = a/100
I want to apply a custom cumsum function, but I haven't found a way to do it without loops. The custom function sets an upper limit of 1 and a lower limit of -1 for the cumsum values: if the "add" to the sum goes beyond these limits, the "add" becomes 0.
In the case where the sum is between the limits of -1 and 1 but the "added" value would break beyond them, the "added" value becomes the remainder to -1 or 1.
Here is the loop version:
def cumsum_with_limits(values):
    cumsum_values = []
    sum = 0
    for i in values:
        if sum+i <= 1 and sum+i >= -1:
            sum += i
            cumsum_values.append(sum)
        elif sum+i >= 1:
            d = 1-sum  # Remainder to 1
            sum += d
            cumsum_values.append(sum)
        elif sum+i <= -1:
            d = -1-sum  # Remainder to -1
            sum += d
            cumsum_values.append(sum)
    return cumsum_values
Is there any way to vectorize this? I need to run this function on large datasets and performance is my current issue. Appreciate any help!
Update: Fixed the code a bit, and a little clarification for the outputs:
Using np.random.seed(0), the first 6 values are:
b = [0.72, -0.53, 0.17, 0.92, -0.33, 0.95]
Expected output:
o = [0.72, 0.19, 0.36, 1, 0.67, 1]

Loops aren't necessarily undesirable. If performance is an issue, consider numba. There's a ~330x improvement without materially changing your logic:
import numpy as np
from numba import njit

np.random.seed(0)
a = np.random.randint(-100, 100, 10000)
b = a/100
@njit
def cumsum_with_limits_nb(values):
    n = len(values)
    res = np.empty(n)  # res holds the running clipped sum, matching cumsum_with_limits
    sum_val = 0
    for i in range(n):
        x = values[i]
        if (sum_val+x <= 1) and (sum_val+x >= -1):
            sum_val += x
            res[i] = sum_val
        elif sum_val+x >= 1:
            d = 1-sum_val  # Remainder to 1
            sum_val += d
            res[i] = sum_val
        elif sum_val+x <= -1:
            d = -1-sum_val  # Remainder to -1
            sum_val += d
            res[i] = sum_val
    return res
assert np.isclose(cumsum_with_limits(b), cumsum_with_limits_nb(b)).all()
If you don't mind sacrificing some performance, you can rewrite this loop more succinctly:
@njit
def cumsum_with_limits_nb2(values):
    n = len(values)
    res = np.empty(n)
    sum_val = 0
    for i in range(n):
        x = values[i]
        next_sum = sum_val + x
        if np.abs(next_sum) >= 1:
            x = np.sign(next_sum) - sum_val  # only add the remainder to +/-1
        sum_val += x
        res[i] = sum_val
    return res
With similar performance to nb2, here's an alternative (thanks to @jdehesa):
@njit
def cumsum_with_limits_nb3(values):
    n = len(values)
    res = np.empty(n)
    sum_val = 0
    for i in range(n):
        x = min(max(sum_val + values[i], -1), 1) - sum_val  # clamp the step so the sum stays in [-1, 1]
        sum_val += x
        res[i] = sum_val
    return res
Performance comparisons:
assert np.isclose(cumsum_with_limits(b), cumsum_with_limits_nb(b)).all()
assert np.isclose(cumsum_with_limits(b), cumsum_with_limits_nb2(b)).all()
assert np.isclose(cumsum_with_limits(b), cumsum_with_limits_nb3(b)).all()
%timeit cumsum_with_limits(b) # 12.5 ms per loop
%timeit cumsum_with_limits_nb(b) # 40.9 µs per loop
%timeit cumsum_with_limits_nb2(b) # 54.7 µs per loop
%timeit cumsum_with_limits_nb3(b) # 54 µs per loop

Start with a regular cumsum:
b = ...
s = np.cumsum(b)
Find the first clip point:
i = np.argmax((s[0:] > 1) | (s[0:] < -1))
Adjust everything that follows:
s[i:] += (np.sign(s[i]) - s[i])
Rinse and repeat. This still requires a loop, but only over the adjustment points, whose number is generally expected to be much smaller than the total array size.
b = ...
s = np.cumsum(b)
while True:
    i = np.argmax((s[0:] > 1) | (s[0:] < -1))
    if np.abs(s[i]) <= 1:
        break
    s[i:] += (np.sign(s[i]) - s[i])
I still haven't found a way to completely pre-compute the adjustment points up front, so I would have to guess that the numba solution will be faster than this, even if you compiled this with numba.
Starting with np.random.seed(0), your original example has 3090 adjustment points, which is approximately 1/3 of the array size. Unfortunately, with all the temp arrays and extra sums, that makes the algorithmic complexity of my solution tend to O(n²). This is completely unacceptable.
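For reference, here is a small sketch that counts those adjustment points by simply running the while loop above on the OP's seeded data; the counter name n_adjustments is an assumption added for illustration:
import numpy as np

np.random.seed(0)
b = np.random.randint(-100, 100, 10000) / 100

s = np.cumsum(b)
n_adjustments = 0                      # hypothetical counter, not part of the answer above
while True:
    i = np.argmax((s > 1) | (s < -1))  # first index outside [-1, 1]
    if np.abs(s[i]) <= 1:
        break                          # nothing left to clip
    s[i:] += np.sign(s[i]) - s[i]      # clamp s[i] and shift everything after it
    n_adjustments += 1

print(n_adjustments)                   # ~3090 adjustment points on this seed, per the note above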

I thought I had already answered the generic question of "cumulative sum with bounds" in the past, but I can't find it.
This solution also uses numba and is a bit more general (custom bounds) and concise than the ones given by @jpp.
It operates on the OP's problem (10K values, bounds at -1, 1) in 40 µs.
import numpy as np
from numba import njit
@njit
def cumsum_clip(a, xmin=-np.inf, xmax=np.inf):
    res = np.empty_like(a)
    c = 0
    for i in range(len(a)):
        c = min(max(c + a[i], xmin), xmax)
        res[i] = c
    return res
Example
np.random.seed(0)
x = np.random.randint(-100, 100, 10_000) / 100
>>> x[:6]
array([ 0.72, -0.53, 0.17, 0.92, -0.33, 0.95])
>>> cumsum_clip(x, -1, 1)[:6]
array([0.72, 0.19, 0.36, 1. , 0.67, 1. ])
%timeit cumsum_clip(x, -1, 1)
39.3 µs ± 31 ns per loop (mean ± std. dev. of 7 runs, 10,000 loops each)
Note: you can specify other bounds, e.g.:
>>> cumsum_clip(x, 0, 1)[:10]
array([0.72, 0.19, 0.36, 1. , 0.67, 1. , 1. , 0.09, 0. , 0. ])
Or omit one of the bounds (for example here specifying only an upper bound):
>>> cumsum_clip(x, xmax=1)[:10]
array([ 0.72, 0.19, 0.36, 1. , 0.67, 1. , 1. , 0.09, -0.7 , -1.34])
Of course, it preserves the original dtype:
np.random.seed(0)
x = np.random.randint(-10, 10, 10)
>>> cumsum_clip(x, 0, 10)
array([ 2, 7, 0, 0, 0, 0, 0, 9, 10, 4])
>>> cumsum_clip(x, 0, 10).dtype
dtype('int64')

Related

Summing list to specific threshold with zeros trace

Is there a reasonable way to get the following done in a fast compilation way?
I am trying to sum the list of numbers up to a specific threshold and replace the previous values with 0.
I'm looking for the fastest compilation way (the list has 18 kk, i.e. ~18 million, records).
For the given example the threshold is 1.
Input:
[0.2, 0.4, 0.2, 0.2, 0.1, 1.2, 3.2 ,0.2, 0.1, 0.4, 0.5, 0.1]
Output:
[0.0, 0.0, 0.0, 1.0, 0.0, 1.3, 3.2 ,0.0, 0.0, 0.0, 1.2, 0.1]
A faster approach compared to appending each interim value to the final list:
lst = [0.2, 0.4, 0.2, 0.2, 0.1, 1.2, 3.2, 0.2, 0.1, 0.4, 0.5, 0.1]

res = [0] * len(lst)  # initial zeros
L_size, t = len(lst), 0
for i, n in enumerate(lst):
    t += n
    if t >= 1 or i == L_size - 1:
        res[i] = t
        t = 0

print(res)
[0, 0, 0, 1.0, 0, 1.3, 3.2, 0, 0, 0, 1.2000000000000002, 0.1]
A list comprehension:
s = 0.0
res = [
    0.0 if (s := s + x if s < 1.0 else x) < 1.0 else s
    for x in lst
]
res[-1] = s
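Spelled out as a plain loop, the comprehension above does the equivalent of this (a sketch of the same logic, not the version used in the benchmark):
s = 0.0
res = []
for x in lst:
    if s < 1.0:
        s = s + x      # keep accumulating while the running sum is below the threshold
    else:
        s = x          # the previous position reached the threshold, so start a new sum
    res.append(0.0 if s < 1.0 else s)
res[-1] = s            # the last position always carries whatever remains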
Benchmark with ~1.8 million values, times multiplied by 10 to estimate for your 18 million:
1.74 ± 0.06 seconds Kelly
3.32 ± 0.10 seconds Roman
3.56 ± 0.10 seconds Roman_Andrej
5.17 ± 0.07 seconds mozway
Benchmark code (Attempt This Online!):
from timeit import timeit
from statistics import mean, stdev

def mozway(l):
    total = 0
    out = []
    for i, n in enumerate(l):
        new_total = total + n
        if new_total >= 1 or i+1 == len(l):
            out.append(new_total)
            total = 0
        else:
            out.append(0)
            total = new_total
    return out

def Roman(lst):
    res = [0] * len(lst)
    L_size, t = len(lst), 0
    for i, n in enumerate(lst):
        t += n
        if t >= 1 or i == L_size - 1:
            res[i] = t
            t = 0
    return res

def Roman_Andrej(lst):
    L_size, t = len(lst), 0
    for i, n in enumerate(lst):
        t += n
        if t >= 1 or i == L_size - 1:
            lst[i] = t
            t = 0
        else:
            lst[i] = 0
    return lst

def Kelly(lst):
    s = 0.0
    res = [
        0.0 if (s := s + x if s < 1.0 else x) < 1.0 else s
        for x in lst
    ]
    res[-1] = s
    return res

funcs = mozway, Roman, Roman_Andrej, Kelly

lst = [0.2, 0.4, 0.2, 0.2, 0.1, 1.2, 3.2, 0.2, 0.1, 0.4, 0.5, 0.1]
exp = [0.0, 0.0, 0.0, 1.0, 0.0, 1.3, 3.2, 0.0, 0.0, 0.0, 1.2, 0.1]
for f in funcs:
    res = [round(x, 6) for x in f(lst[:])]
    print(res == exp)
    # print(exp)
    # print(res)

times = {f: [] for f in funcs}

def stats(f):
    ts = [t for t in sorted(times[f])[:5]]
    return f'{mean(ts):4.2f} ± {stdev(ts):4.2f} seconds '

lst *= 1800000 // len(lst)
for _ in range(10):
    for f in funcs:
        copy = lst[:]
        t = timeit(lambda: f(copy), number=1) * 10
        times[f].append(t)

for f in sorted(funcs, key=stats):
    print(stats(f), f.__name__)
Not sure what you mean by "fast compilation way":
l = [0.2, 0.4, 0.2, 0.2, 0.1, 1.2, 3.2, 0.2, 0.1, 0.4, 0.5, 0.1]

total = 0
out = []
for i, n in enumerate(l):
    new_total = total + n
    if new_total >= 1 or i+1 == len(l):
        out.append(new_total)
        total = 0
    else:
        out.append(0)
        total = new_total
Output:
[0, 0, 0, 1.0, 0, 1.3, 3.2, 0, 0, 0, 1.2000000000000002, 0.1]
Running time for 18K values:
8.16 ms ± 296 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
precision
If you need precise floating point operations you might want to use Decimal:
from decimal import Decimal
from math import isclose

total = 0
out = []
for i, n in enumerate(map(Decimal, l)):
    new_total = total + n
    if new_total >= 1 or isclose(new_total, 1) or i+1 == len(l):
        out.append(float(new_total))
        total = 0
    else:
        out.append(0)
        total = new_total
Output:
[0, 0, 0, 1.0, 0, 1.3, 3.2, 0, 0, 0, 1.2, 0.1]
Running time for 18K values:
49.5 ms ± 2.93 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)

How to threshold vectors [u,v] in a 2D numpy array?

I wrote a threshold function TH(arr, threshold) that takes in a 2D array of vectors [u,v], and sets u and v to 0 if they both have an absolute value lower than a specified threshold.
The function consists of 2 for-loops and does the job, but is compute-intensive (I am running it on large datasets).
Examples:
[u, v] --> Output (threshold = 1)
[2, 2] --> [2, 2]
[2, .1] --> [2, .1]
[.1,.1] --> [0, 0]
What other methods/functions can I use to solve this problem more efficiently (Using list comprehension or other methods)?
Here's some code:
import numpy as np
import time

start = time.time()

def TH(arr, threshold):
    for idx, value in enumerate(arr):
        for i, item in enumerate(value):
            if np.abs(item[0]) < threshold and np.abs(item[1]) < threshold:
                arr[idx][i][0] = 0.0
                arr[idx][i][1] = 0.0
    return arr

a = np.array([[[.5, .8], [3, 4], [3, .1]],
              [[0, 2], [.5, .5], [.3, 3]],
              [[.4, .4], [.1, .1], [.5, 5]]])

a = TH(a, threshold=1)
print(a)

end = time.time()
print("Run time: ", end-start)
Output:
[[[0. 0. ]
[3. 4. ]
[3. 0.1]]
[[0. 2. ]
[0. 0. ]
[0.3 3. ]]
[[0. 0. ]
[0. 0. ]
[0.5 5. ]]]
Run time: 0.0009984970092773438
Simply slice the two elements along the last axis, perform the same comparisons in a vectorized manner to get a mask, and finally index into the input array with the mask to assign 0s -
mask = (np.abs(arr[...,0]) < threshold) & (np.abs(arr[...,1]) < threshold)
arr[mask] = 0
Note that arr[...,0] is another way to write arr[:,:,0] and is meant to slice a generic ndarray along the last axis. Similarly for arr[...,1].
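For example, applying this to the 3x3 sample from the question (a sketch that redefines the question's a and uses threshold = 1) reproduces the output of the loop version:
a = np.array([[[.5, .8], [3, 4], [3, .1]],
              [[0, 2], [.5, .5], [.3, 3]],
              [[.4, .4], [.1, .1], [.5, 5]]])
threshold = 1

mask = (np.abs(a[..., 0]) < threshold) & (np.abs(a[..., 1]) < threshold)
a[mask] = 0   # zero out the [u, v] pairs where both components are below the threshold
print(a)      # matches the output of TH(a, threshold=1) shown in the question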
Alternatively, pre-compute the absolute values and use them to compare against threshold and look for all matches against last axis to get the same mask -
ab = np.abs(arr)
mask = (ab < threshold).all(-1)
Or, use the same slicing method after computing absolute values -
mask = (ab[...,0] < threshold) & (ab[...,1] < threshold)
For large arrays, we can also leverage numexpr module -
import numexpr as ne
m0 = ne.evaluate('abs(arr)<threshold')
mask = m0[...,0] & m0[...,1]
Timings -
In [209]: arr = np.random.rand(1080,1920,2)
In [210]: threshold = 1
In [211]: %timeit (np.abs(arr[...,0])<threshold) & (np.abs(arr[...,1])<threshold)
100 loops, best of 3: 10.2 ms per loop
In [212]: %timeit (np.abs(arr)<threshold).all(-1)
10 loops, best of 3: 34.5 ms per loop
In [213]: %%timeit
...: ab = np.abs(arr)
...: (ab[...,0] < threshold) & (ab[...,1] < threshold)
...:
100 loops, best of 3: 11 ms per loop
In [214]: %%timeit
...: m0 = ne.evaluate('abs(arr)<threshold')
...: m0[...,0] & m0[...,1]
...:
100 loops, best of 3: 4.79 ms per loop

Shift rows of a numpy array independently

This is an extension of the question posed here (quoted below)
I have a matrix (2d numpy ndarray, to be precise):
A = np.array([[4, 0, 0],
              [1, 2, 3],
              [0, 0, 5]])
And I want to roll each row of A independently, according to roll
values in another array:
r = np.array([2, 0, -1])
That is, I want to do this:
print(np.array([np.roll(row, x) for row, x in zip(A, r)]))
[[0 0 4]
[1 2 3]
[0 5 0]]
Is there a way to do this efficiently? Perhaps using fancy indexing
tricks?
The accepted solution was:
rows, column_indices = np.ogrid[:A.shape[0], :A.shape[1]]
# Always use a negative shift, so that column_indices are valid.
# (could also use a modulo operation)
r[r < 0] += A.shape[1]
column_indices = column_indices - r[:,np.newaxis]
result = A[rows, column_indices]
I would basically like to do the same thing, except when an index gets rolled "past" the end of the row, I would like the other side of the row to be padded with a NaN, rather than the value move to the "front" of the row in a periodic fashion.
Maybe using np.pad somehow? But I can't figure out how to get that to pad different rows by different amounts.
Inspired by the solution to Roll rows of a matrix independently, here's a vectorized one based on np.lib.stride_tricks.as_strided -
from skimage.util.shape import view_as_windows as viewW

def strided_indexing_roll(a, r):
    # Concatenate with sliced to cover all rolls
    p = np.full((a.shape[0], a.shape[1]-1), np.nan)
    a_ext = np.concatenate((p, a, p), axis=1)

    # Get sliding windows; use advanced indexing to select appropriate ones
    n = a.shape[1]
    return viewW(a_ext, (1, n))[np.arange(len(r)), -r + (n-1), 0]
Sample run -
In [76]: a
Out[76]:
array([[4, 0, 0],
[1, 2, 3],
[0, 0, 5]])
In [77]: r
Out[77]: array([ 2, 0, -1])
In [78]: strided_indexing_roll(a, r)
Out[78]:
array([[nan, nan, 4.],
[ 1., 2., 3.],
[ 0., 5., nan]])
I was able to hack this together with linear indexing...it gets the right result but performs rather slowly on large arrays.
A = np.array([[4, 0, 0],
              [1, 2, 3],
              [0, 0, 5]]).astype(float)
r = np.array([2, 0, -1])

rows, column_indices = np.ogrid[:A.shape[0], :A.shape[1]]

# Always use a negative shift, so that column_indices are valid.
# (could also use a modulo operation)
r_old = r.copy()
r[r < 0] += A.shape[1]
column_indices = column_indices - r[:, np.newaxis]
result = A[rows, column_indices]

# replace rolled-over positions with NaNs
row_length = result.shape[-1]
pad_inds = []
for ind, i in enumerate(r_old):
    if i > 0:
        inds2pad = [np.ravel_multi_index((ind,) + (j,), result.shape) for j in range(i)]
        pad_inds.extend(inds2pad)
    if i < 0:
        inds2pad = [np.ravel_multi_index((ind,) + (j,), result.shape) for j in range(row_length+i, row_length)]
        pad_inds.extend(inds2pad)
result.ravel()[pad_inds] = np.nan
Gives the expected result:
print(result)
[[ nan nan 4.]
[ 1. 2. 3.]
[ 0. 5. nan]]
Based on @Seberg's and @yann-dubois's answers in the non-nan case, I've written a method that:
Is faster than the current answer
Works on ndarrays of any shape (specify the row-axis using the axis argument)
Allows for setting fill to either np.nan, any other "fill value" or False to allow regular rolling across the array edge.
Benchmarking
cols, rows = 1024, 2048
arr = np.stack(rows*(np.arange(cols,dtype=float),))
shifts = np.random.randint(-cols, cols, rows)
np.testing.assert_array_almost_equal(row_roll(arr, shifts), strided_indexing_roll(arr, shifts))
# True
%timeit row_roll(arr, shifts)
# 25.9 ms ± 161 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)
%timeit strided_indexing_roll(arr, shifts)
# 29.7 ms ± 446 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)
def row_roll(arr, shifts, axis=1, fill=np.nan):
    """Apply an independent roll for each dimension of a single axis.

    Parameters
    ----------
    arr : np.ndarray
        Array of any shape.
    shifts : np.ndarray, dtype int. Shape: `(arr.shape[:axis],)`.
        Amount to roll each row by. Positive shifts row right.
    axis : int
        Axis along which elements are shifted.
    fill : bool or float
        Value filled in at the positions rolled past the edge.
        If False, just rolls across edges (regular roll).
    """
    if np.issubdtype(arr.dtype, int) and isinstance(fill, float):
        arr = arr.astype(float)

    shifts2 = shifts.copy()
    arr = np.swapaxes(arr, axis, -1)
    all_idcs = np.ogrid[[slice(0, n) for n in arr.shape]]

    # Convert to a positive shift
    shifts2[shifts2 < 0] += arr.shape[-1]
    all_idcs[-1] = all_idcs[-1] - shifts2[:, np.newaxis]

    result = arr[tuple(all_idcs)]

    if fill is not False:
        # Create a mask of row positions below positive shifts
        # or above negative shifts. Then set them to `fill`.
        *_, nrows, ncols = arr.shape

        mask_neg = shifts < 0
        mask_pos = shifts >= 0

        shifts_pos = shifts.copy()
        shifts_pos[mask_neg] = 0
        shifts_neg = shifts.copy()
        shifts_neg[mask_pos] = ncols + 1  # needs to be bigger than the biggest positive shift
        shifts_neg[mask_neg] = shifts[mask_neg] % ncols

        indices = np.stack(nrows*(np.arange(ncols),))
        nanmask = (indices < shifts_pos[:, None]) | (indices >= shifts_neg[:, None])
        result[nanmask] = fill

    arr = np.swapaxes(result, -1, axis)
    return arr
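For instance, on the 3x3 example from the original question (a sketch; A and the shifts are redefined here so the snippet is self-contained), row_roll gives the NaN-padded result:
A = np.array([[4, 0, 0],
              [1, 2, 3],
              [0, 0, 5]], dtype=float)
shifts = np.array([2, 0, -1])

print(row_roll(A, shifts))
# [[nan nan  4.]
#  [ 1.  2.  3.]
#  [ 0.  5. nan]]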

Fast way to round array values based on condition

I have an array like this:
a = np.array([
    [0.02, 1.01, 4.01, 3.00, 5.12],
    [2.11, 1.50, 3.98, 0.52, 5.01]])
and a "condition" array:
c = np.array([0, 1, 4, 5])
I want to round a[i][j] to c[k] if c[k] - const < a[i][j] < c[k] + const, and otherwise set a[i][j] = 0.
For example, if const = 0.05. The result could be:
a_result = [[0 1 4 0 0]
[0 0 4 0 5]]
The naive way is to use 3 for-loops to check each a[i][j] against each c[k]. However, it's very slow when a is big. Is there a fast "python way" to do this?
For loop (slow) solution:
a_result = np.full(a.shape, 0)
const = 0.05
mh, mw = a.shape
for i in range(mh-1):
for j in range(mw-1):
for k in range(1, len(c)):
if a[i][j] > (c[k] - const) and a[i][j] < (c[k] + const):
a_result[i][j] = c[k]
Approach #1
One vectorized approach would be with broadcasting -
c[(np.abs(a - c[:,None,None]) < const).argmax(0)]
Sample run -
In [312]: a
Out[312]:
array([[ 0.02, 1.01, 4.01, 3. , 5.12],
[ 2.11, 1.5 , 3.98, 0.52, 5.01]])
In [313]: c
Out[313]: array([0, 1, 4, 5])
In [314]: c[(np.abs(a - c[:,None,None]) < const).argmax(0)]
Out[314]:
array([[0, 1, 4, 0, 0],
[0, 0, 4, 0, 5]])
Approach #2
Another one that would be closer to what we had in the question, but vectorized, like so -
mask = ((c[:,None,None] - const) < a) & (a < (c[:,None,None] + const))
out = c[mask.argmax(0)]
Approach #3
Here's another with memory efficiency in mind, based on this post -
idx = np.searchsorted(c, a, side="left").clip(max=c.size-1)
mask = (idx > 0) & \
       ((idx == len(c)) | (np.fabs(a - c[idx-1]) < np.fabs(a - c[idx])))
idx0 = idx - mask
out = c[idx0]
out[np.abs(c[idx0] - a) >= const] = 0
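As a quick sanity check (a sketch reusing a, c, const, and the out computed just above; the names out1, mask2, out2 are mine), all three approaches agree on the sample arrays:
out1 = c[(np.abs(a - c[:, None, None]) < const).argmax(0)]               # approach #1
mask2 = ((c[:, None, None] - const) < a) & (a < (c[:, None, None] + const))
out2 = c[mask2.argmax(0)]                                                # approach #2

assert np.array_equal(out1, out2)
assert np.array_equal(out1, out)                                         # out from approach #3 above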

Numpy sum running length of non-zero values

Looking for a fast vectorized function that returns the rolling number of consecutive non-zero values. The count should start over at 0 whenever encountering a zero. The result should have the same shape as the input array.
Given an array like this:
x = np.array([2.3, 1.2, 4.1 , 0.0, 0.0, 5.3, 0, 1.2, 3.1])
The function should return this:
array([1, 2, 3, 0, 0, 1, 0, 1, 2])
This post lists a vectorized approach which basically consists of two steps:
Initialize a zeros vector of the same size as the input vector x and set ones at the places corresponding to non-zeros of x.
Next, in that vector, put the negative of each "island's" run length right after that island's ending/stop position. The intention is to use cumsum later on, which then results in sequential numbers inside the "islands" and zeros elsewhere.
Here's the implementation -
import numpy as np
#Append zeros at the start and end of input array, x
xa = np.hstack([[0],x,[0]])
# Get an array of ones and zeros, with ones for nonzeros of x and zeros elsewhere
xa1 =(xa!=0)+0
# Find consecutive differences on xa1
xadf = np.diff(xa1)
# Find start and stop+1 indices and thus the lengths of "islands" of non-zeros
starts = np.where(xadf==1)[0]
stops_p1 = np.where(xadf==-1)[0]
lens = stops_p1 - starts
# Mark indices where "minus ones" are to be put for applying cumsum
put_m1 = stops_p1[stops_p1 < x.size]
# Setup vector with ones for nonzero x's, "minus lens" at stops +1 & zeros elsewhere
vec = xa1[1:-1] # Note: this will change xa1, but it's okay as not needed anymore
vec[put_m1] = -lens[0:put_m1.size]
# Perform cumsum to get the desired output
out = vec.cumsum()
Sample run -
In [116]: x
Out[116]: array([ 0. , 2.3, 1.2, 4.1, 0. , 0. , 5.3, 0. , 1.2, 3.1, 0. ])
In [117]: out
Out[117]: array([0, 1, 2, 3, 0, 0, 1, 0, 1, 2, 0], dtype=int32)
Runtime tests -
Here are some runtime tests comparing the proposed approach against the itertools.groupby-based approach -
In [21]: N = 1000000
...: x = np.random.rand(1,N)
...: x[x>0.5] = 0.0
...: x = x.ravel()
...:
In [19]: %timeit sumrunlen_vectorized(x)
10 loops, best of 3: 19.9 ms per loop
In [20]: %timeit sumrunlen_loopy(x)
1 loops, best of 3: 2.86 s per loop
You can use itertools.groupby and np.hstack :
>>> import numpy as np
>>> x = np.array([2.3, 1.2, 4.1 , 0.0, 0.0, 5.3, 0, 1.2, 3.1])
>>> from itertools import groupby
>>> np.hstack([[i if j!=0 else j for i,j in enumerate(g,1)] for _,g in groupby(x,key=lambda x: x!=0)])
array([ 1., 2., 3., 0., 0., 1., 0., 1., 2.])
We group the array elements based on whether they are non-zero, then use a list comprehension with enumerate to replace the non-zero sub-arrays with their 1-based positions, and finally flatten the list with np.hstack.
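For readability, here is the same idea written as an explicit loop (a sketch equivalent to the one-liner above, reusing x and groupby from the snippet):
out = []
for is_nonzero, group in groupby(x, key=lambda v: v != 0):
    group = list(group)
    if is_nonzero:
        out.extend(range(1, len(group) + 1))   # 1, 2, 3, ... within each run of non-zeros
    else:
        out.extend([0] * len(group))           # zeros reset the count
print(np.array(out))                           # [1 2 3 0 0 1 0 1 2]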
This sub-problem came up in Kick Start 2021 Round A for me. My solution:
def current_run_len(a):
    a_ = np.hstack([0, a != 0, 0])  # first in starts and last in stops defined
    d = np.diff(a_)
    starts = np.where(d == 1)[0]
    stops = np.where(d == -1)[0]
    a_[stops + 1] = -(stops - starts)  # +1 for behind-last
    return a_[1:-1].cumsum()
In fact, the problem also required a version where you count down consecutive sequences. Thus here another version with an optional keyword argument which does the same for rev=False:
def current_run_len(a, rev=False):
    a_ = np.hstack([0, a != 0, 0])  # first in starts and last in stops defined
    d = np.diff(a_)
    starts = np.where(d == 1)[0]
    stops = np.where(d == -1)[0]
    if rev:
        a_[starts] = -(stops - starts)
        cs = -a_.cumsum()[:-2]
    else:
        a_[stops + 1] = -(stops - starts)  # +1 for behind-last
        cs = a_.cumsum()[1:-1]
    return cs
Results:
a = np.array([1, 1, 1, 1, 0, 0, 0, 1, 1, 0, 1, 0, 0, 0, 1])
print('a = ', a)
print('current_run_len(a) = ', current_run_len(a))
print('current_run_len(a, rev=True) = ', current_run_len(a, rev=True))
a = [1 1 1 1 0 0 0 1 1 0 1 0 0 0 1]
current_run_len(a) = [1 2 3 4 0 0 0 1 2 0 1 0 0 0 1]
current_run_len(a, rev=True) = [4 3 2 1 0 0 0 2 1 0 1 0 0 0 1]
For an array that consists of 0s and 1s only, you can simplify [0, a != 0, 0] to [0, a, 0]. But the version as-posted also works for arbitrary non-zero numbers.
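A quick way to convince yourself of that note (a sketch; a01 is a made-up 0/1 array):
a01 = np.array([1, 1, 0, 1])
assert (np.hstack([0, a01, 0]) == np.hstack([0, a01 != 0, 0])).all()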
