checking non-ambiguity of strides in numpy array - python

For a numpy array X, the location of its element X[k[0], ..., k[d-1]] is offset from the location of X[0,..., 0] by k[0]*s[0] + ... + k[d-1]*s[d-1], where (s[0],...,s[d-1]) is the tuple representing X.strides.
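To make the formula concrete, here is a quick sanity check on a small C-contiguous array (a minimal sketch; the names are mine):
import numpy as np

X = np.arange(24, dtype=np.int64).reshape(2, 3, 4)
k = (1, 2, 3)
# byte offset of X[k] from X[0, 0, 0] predicted by the strides formula
offset = sum(ki * si for ki, si in zip(k, X.strides))
# for a C-contiguous array this equals the flat index times the item size
assert offset == np.ravel_multi_index(k, X.shape) * X.itemsize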
As far as I understand, nothing in the numpy array specs requires that distinct indexes of array X correspond to distinct addresses in memory, the simplest instance of this being a zero-valued stride; see, e.g., the advanced NumPy section of the scipy lectures.
Does numpy have a built-in predicate to test whether the strides and the shape are such that distinct indexes map to distinct memory addresses?
If not, how does one write one, preferably so as to avoid sorting the strides?

edit: It took me a bit to figure out what you are asking about. With striding tricks it's possible to index the same element in a data buffer in different ways, and broadcasting actually does this under the covers. Normally we don't worry about it because it is either hidden or intentional.
Recreating the strided mapping and looking for duplicates may be the only way to test this. I'm not aware of any existing function that checks it.
==================
I'm not quite sure what you're concerned with. But let me illustrate how shape and strides work.
Define a 3x4 array:
In [453]: X=np.arange(12).reshape(3,4)
In [454]: X.shape
Out[454]: (3, 4)
In [455]: X.strides
Out[455]: (16, 4)
Index an item
In [456]: X[1,2]
Out[456]: 6
I can get its index in a flattened version of the array (e.g. the original arange) with ravel_multi_index:
In [457]: np.ravel_multi_index((1,2),X.shape)
Out[457]: 6
I can also get this location using strides - keeping in mind that strides are in bytes (here 4 bytes per item):
In [458]: 1*16+2*4
Out[458]: 24
In [459]: (1*16+2*4)/4
Out[459]: 6.0
All these numbers are relative to the start of the data buffer. We can get the data buffer address from X.data or X.__array_interface__['data'], but usually don't need to.
So these strides tell us: to go from one entry to the next, step 4 bytes; to go from one row to the next, step 16. The 6 is located one row down and 2 over, i.e. 24 bytes into the buffer.
In the as_strided example of your link, strides=(1*2, 0) produces repeated indexing of specific values.
With my X:
In [460]: y=np.lib.stride_tricks.as_strided(X,strides=(16,0), shape=(3,4))
In [461]: y
Out[461]:
array([[0, 0, 0, 0],
       [4, 4, 4, 4],
       [8, 8, 8, 8]])
y is a 3x4 that repeatedly indexes the 1st column of X.
Changing one item in y ends up changing one value in X but a whole row in y:
In [462]: y[1,2]=10
In [463]: y
Out[463]:
array([[ 0,  0,  0,  0],
       [10, 10, 10, 10],
       [ 8,  8,  8,  8]])
In [464]: X
Out[464]:
array([[ 0,  1,  2,  3],
       [10,  5,  6,  7],
       [ 8,  9, 10, 11]])
as_strided can produce some weird effects if you aren't careful.
OK, maybe I've figured out what's bothering you - can I identify a situation like this, where two different indexing tuples end up pointing to the same location in the data buffer? Not that I'm aware of. That y's strides contain a 0 is a pretty good indicator.
as_strided is often used to create overlapping windows:
In [465]: y=np.lib.stride_tricks.as_strided(X,strides=(8,4), shape=(3,4))
In [466]: y
Out[466]:
array([[ 0,  1,  2,  3],
       [ 2,  3, 10,  5],
       [10,  5,  6,  7]])
In [467]: y[1,2]=20
In [469]: y
Out[469]:
array([[ 0,  1,  2,  3],
       [ 2,  3, 20,  5],
       [20,  5,  6,  7]])
Again changing 1 item in y ends up changing 2 values in y, but only 1 in X. Note that here y.strides contains no 0, so a zero stride is a sufficient but not a necessary sign of this kind of overlap.
Ordinary array creation and indexing does not have this duplicate indexing issue. Broadcasting may do something like this under the covers, where a (4,) array is changed to (1,4) and then to (3,4), effectively replicating rows. I think there's another stride_tricks function that does this explicitly.
In [475]: x,y=np.lib.stride_tricks.broadcast_arrays(X,np.array([.1,.2,.3,.4]))
In [476]: x
Out[476]:
array([[ 0,  1,  2,  3],
       [20,  5,  6,  7],
       [ 8,  9, 10, 11]])
In [477]: y
Out[477]:
array([[ 0.1,  0.2,  0.3,  0.4],
       [ 0.1,  0.2,  0.3,  0.4],
       [ 0.1,  0.2,  0.3,  0.4]])
In [478]: y.strides
Out[478]: (0, 8)
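For reference, np.broadcast_to produces the same zero-stride view directly (a minimal sketch; broadcast_to exists in numpy 1.10+):
z = np.broadcast_to(np.array([.1, .2, .3, .4]), (3, 4))
z.strides      # (0, 8): every "row" reads the same 4 floats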
In any case, in normal array use we don't have to worry about this ambiguity. We get it only with intentional actions, not accidental ones.
==============
How about this for a test:
def dupstrides(x):
    uniq = {sum(s*j for s, j in zip(x.strides, i)) for i in np.ndindex(x.shape)}
    print(uniq)
    print(len(uniq))
    print(x.size)
    return len(uniq) < x.size
In [508]: dupstrides(X)
{0, 32, 4, 36, 8, 40, 12, 44, 16, 20, 24, 28}
12
12
Out[508]: False
In [509]: dupstrides(y)
{0, 4, 8, 12, 16, 20, 24, 28}
8
12
Out[509]: True

It turns out this test is already implemented in numpy, see mem_overlap.c:842.
The test is exposed as numpy.core.multiarray_tests.internal_overlap(x).
Example:
>>> import numpy as np
>>> from numpy.core.multiarray_tests import internal_overlap
>>> from numpy.lib.stride_tricks import as_strided
Now create a contiguous array, use as_strided to create an array with internal overlap, and confirm this with the test:
>>> x = np.arange(3*4, dtype=np.float64).reshape((3,4))
>>> y = as_strided(x, shape=(5,4), strides=(16, 8))
>>> y
array([[  0.,   1.,   2.,   3.],
       [  2.,   3.,   4.,   5.],
       [  4.,   5.,   6.,   7.],
       [  6.,   7.,   8.,   9.],
       [  8.,   9.,  10.,  11.]])
>>> internal_overlap(x)
False
>>> internal_overlap(y)
True
The function is optimized to quickly return False for Fortran- or C-contiguous arrays.
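Note that this module is private, and its name varies across numpy versions (later releases renamed it to numpy.core._multiarray_tests); if the import above fails, a defensive import along these lines may help:
try:
    from numpy.core.multiarray_tests import internal_overlap
except ImportError:
    # newer numpy releases renamed the private test module
    from numpy.core._multiarray_tests import internal_overlap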

Related

Python Numpy: How to reverse a slice operation?

I'd like to 'reverse' a slice operation. For example, the original array is:
import numpy as np
a=np.array([[1,2,3,4],[5,6,7,8],[9,10,11,12]])
>>> array([[ 1,  2,  3,  4],
           [ 5,  6,  7,  8],
           [ 9, 10, 11, 12]])
Then, I slice this array, and do some operations on it (times 2):
b = 2 * a[::-1,::2]
>>> array([[18, 22],
           [10, 14],
           [ 2,  6]])
How can I put all elements in b back to their original position in a (Note axis 0 is flipped), i.e. the final result should be:
>>> array([[ 2,  0,  6,  0],
           [10,  0, 14,  0],
           [18,  0, 22,  0]])
I figured something like as_strided and flip should be used, but I can't seem to get it working. Thanks for any help!!
import numpy as np
a=np.array([[1,2,3,4],[5,6,7,8],[9,10,11,12]])
req_array = np.zeros(a.shape)
req_array[::-1,::2] = 2 * a[::-1,::2]
print(req_array) ## your required array
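Running this prints the desired array from the question (as floats, since np.zeros defaults to float64):
[[ 2.  0.  6.  0.]
 [10.  0. 14.  0.]
 [18.  0. 22.  0.]]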
You could just create np.zeros(shape=(3,4)) and index into it, but you would need to reverse [::-1] before doing so. Seems like a lot of pain for something that can be solved as simply as saving the array beforehand under a different variable, as you are always going to need the shape anyway, e.g.
var1 = beforeArray
var2 = 2 * var1 [::-1,::2]
to_reverse_var2 = np.zeros(var1.shape)
then index
But it would be much easier to just save the variable instead, because you will always need array.shape anyway. Just use self. or a global if it is inside a function.
Another idea would be to play about with matrix methods outside of numpy to speed up the process.

Create matrix with numpy meshgrid

I want to create a matrix with M rows and N columns. The increment along the columns is always 1, whilst the increment along rows is a constant value, c. For example, to create this matrix:
 1  2
10 11
19 20
28 29
The number of rows is 4, the number of columns is 2, and the shift between rows is c = 8. One way to perform this could be:
# Indices of columns
coord_x = np.arange(0, 2)
# Indices of rows
coord_y = np.arange(1, 37, 9)
# Creates 2 matrices with the coordinates
x, y = np.meshgrid(coord_x, coord_y)
# To perform the shift between columns
idx_left = x + y
And the output is:
print(idx_left)
[[ 1  2]
 [10 11]
 [19 20]
 [28 29]]
Can I perform this without the addition idx_left = x + y? I've already seen other functions but I can't find any that considers a shift along the rows and columns...
Stride tricks
You can use np.lib.stride_tricks for this purpose.
arr = np.arange(1,100)
shape = (4,2)
strides = (arr.strides[0]*9, arr.strides[0]*1)  # 8 bytes x 9 steps on axis=0, 8 bytes x 1 step on axis=1
np.lib.stride_tricks.as_strided(arr, shape=shape, strides=strides)
array([[ 1,  2],
       [10, 11],
       [19, 20],
       [28, 29]])
Another example, with a shift of 2 on axis=0 and 3 on axis=1:
arr = np.arange(1,100)
shape = (4,2)
strides = (arr.strides[0]*2, arr.strides[0]*3)  # 8 bytes x 2 steps on axis=0, 8 bytes x 3 steps on axis=1
np.lib.stride_tricks.as_strided(arr, shape=shape, strides=strides)
array([[ 1,  4],
       [ 3,  6],
       [ 5,  8],
       [ 7, 10]])
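A caution: as_strided does no bounds checking, so a wrong shape/strides pair silently reads arbitrary memory. When the view is read-only, passing writeable=False (supported in numpy 1.12+) is a cheap safeguard:
np.lib.stride_tricks.as_strided(arr, shape=shape, strides=strides, writeable=False)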
Broadcasting
You could simply do this the same way you are doing it, without meshgrid, using only broadcasting -
#Your original example
coord_x = np.arange(0, 2, 1) #start, stop, step
coord_y = np.arange(1, 37, 9)
coord_x[None,:] + coord_y[:,None]
array([[ 1,  2],
       [10, 11],
       [19, 20],
       [28, 29]])
Linspace
You could use linspace if you know the extreme limits. So, in your case, you can create a 4-row array from (1,2) to (28,29).
np.linspace((1,2),(28,29),4)
array([[ 1.,  2.],
       [10., 11.],
       [19., 20.],
       [28., 29.]])
Mgrid
Mgrid is more convenient than meshgrid for your purpose. You can do -
np.mgrid[0:2:1, 1:37:9].sum(0).T
array([[ 1,  2],
       [10, 11],
       [19, 20],
       [28, 29]])
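All of these generalize; here is a small helper parameterized by rows M, columns N and shift c (the function name and signature are mine, a sketch):
import numpy as np

def shifted_grid(M, N, c, start=1):
    # each row starts (N - 1) + c after the previous row's start:
    # N - 1 steps to reach the row's last column, then the shift c
    step = (N - 1) + c
    return start + step * np.arange(M)[:, None] + np.arange(N)

print(shifted_grid(4, 2, 8))
# [[ 1  2]
#  [10 11]
#  [19 20]
#  [28 29]]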

sum of squares (4 values forming a square) within a 2d numpy array. Python

I am looking to sum each 4-point combination in a 2d array that forms a square within the matrix:
in4x4 = np.array(([1,2,3,4],[2,3,4,1],[3,4,1,2],[4,3,1,2]))
print(in4x4)
array([[1, 2, 3, 4],
       [2, 3, 4, 1],
       [3, 4, 1, 2],
       [4, 3, 1, 2]])
Expected output:
print(out3x3)
array([[ 8, 12, 12],
       [12, 12,  8],
       [14,  9,  6]])
Currently I am using numpy.diff in a several-step process. It feels like there should be a cleaner approach. Example of the calculation I currently do:
d3x4 = np.diff(in4x4, axis=0)
a3x4 = in4x4[0:3,:] * 2 + d3x4
d3x3 = np.diff(a3x4)
a3x3 = a3x4[:,0:3] * 2 + d3x3
Is there a clean, possibly vectorized, approach? Speed is a concern. I looked at scipy convolve but it does not seem to suit my purposes; maybe I am just configuring the kernel wrong.
You can do 2-D convolution with a kernel of ones to achieve the desired result.
Simple code that implements 2D convolution and a test on the input you gave:
import numpy as np

def conv2d(a, f):
    # 4-D view: kernel offsets first, then one entry per output position
    s = f.shape + tuple(np.subtract(a.shape, f.shape) + 1)
    strd = np.lib.stride_tricks.as_strided
    subM = strd(a, shape=s, strides=a.strides * 2)
    # multiply each window by the kernel and sum
    return np.einsum('ij,ijkl->kl', f, subM)
in4x4 = np.array(([1,2,3,4],[2,3,4,1],[3,4,1,2],[4,3,1,2]))
k = np.ones((2,2))
print(conv2d(in4x4, k))
output:
[[ 8. 12. 12.]
 [12. 12.  8.]
 [14.  9.  6.]]
In case you can use a built-in function, you can do the following:
import numpy as np
from scipy import signal
in4x4 = np.array(([1,2,3,4],[2,3,4,1],[3,4,1,2],[4,3,1,2]))
k = np.ones((2,2))
signal.convolve2d(in4x4, k, 'valid')
Which outputs:
array([[ 8., 12., 12.],
       [12., 12.,  8.],
       [14.,  9.,  6.]])
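On recent numpy (1.20+, an assumption about your version), sliding_window_view expresses the same computation without manual stride arithmetic and keeps the integer dtype:
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

in4x4 = np.array([[1,2,3,4],[2,3,4,1],[3,4,1,2],[4,3,1,2]])
# all 2x2 windows, then sum within each window
out3x3 = sliding_window_view(in4x4, (2, 2)).sum(axis=(2, 3))
print(out3x3)
# [[ 8 12 12]
#  [12 12  8]
#  [14  9  6]]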

Comparing value with neighbor elements in numpy

Let's say I have a numpy array
    [a b c]
A = [i j k]
    [u v w]
I want to compare the value of the central element with some of its eight neighbor elements (along the axes or along the diagonals). Is there any faster way than a nested for loop (it's too slow for big matrices)?
To be more specific, what I want to do is compare the value of an element with its neighbors and assign new values.
For example:
if (j == 1):
    if (j > i) & (j > k):
        j = 999
    else:
        j = 0
if (j == 2):
    if (j > c) & (j > u):
        j = 999
    else:
        j = 0
...
something like this.
Your operation contains lots of conditionals, so the most efficient way to do it in the general case (any kind of conditionals, any kind of operations) is using loops. This could be done efficiently using numba or cython. In special cases, you can implement it using higher level functions in numpy/scipy. I'll show a solution for the specific example you gave, and hopefully you can generalize from there.
Start with some fake data:
import numpy as np
import scipy.ndimage

A = np.asarray([
    [1, 1, 1, 2, 0],
    [1, 0, 2, 2, 2],
    [0, 2, 0, 1, 0],
    [1, 2, 2, 1, 0],
    [2, 1, 1, 1, 2]
])
We'll find locations in A where various conditions apply.
1a) The value is 1
1b) The value is greater than its horizontal neighbors
2a) The value is 2
2b) The value is greater than its diagonal neighbors
Find locations in A where the specified values occur:
cond1a = A == 1
cond2a = A == 2
This gives matrices of boolean values, of the same size as A. The value is true where the condition holds, otherwise false.
Find locations in A where each element has the specified relationships to its neighbors:
# condition 1b: value greater than horizontal neighbors
f1 = np.asarray([[1, 0, 1]])
cond1b = A > scipy.ndimage.maximum_filter(
    A, footprint=f1, mode='constant', cval=-np.inf)

# condition 2b: value greater than diagonal neighbors
f2 = np.asarray([
    [0, 0, 1],
    [0, 0, 0],
    [1, 0, 0]
])
cond2b = A > scipy.ndimage.maximum_filter(
    A, footprint=f2, mode='constant', cval=-np.inf)
As before, this gives matrices of boolean values indicating where the conditions are true. This code uses scipy.ndimage.maximum_filter(). This function iteratively shifts a 'footprint' to be centered over each element of A. The returned value for that position is the maximum of all elements for which the footprint is 1. The mode argument specifies how to treat implicit values outside boundaries of the matrix, where the footprint falls off the edge. Here, we treat them as negative infinity, which is the same as ignoring them (since we're using the max operation).
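To see the filter in isolation, here is a small sketch (float input so that the -inf fill value is representable):
import numpy as np
import scipy.ndimage

a = np.array([[1., 5., 2.],
              [4., 3., 6.]])
# max over left/right neighbors only (the element itself is excluded)
print(scipy.ndimage.maximum_filter(a, footprint=np.array([[1, 0, 1]]),
                                   mode='constant', cval=-np.inf))
# [[5. 2. 5.]
#  [3. 6. 3.]]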
Set values of the result according to the conditions. The value is 999 if conditions 1a and 1b are both true, or if conditions 2a and 2b are both true. Else, the value is 0.
result = np.zeros(A.shape)
result[(cond1a & cond1b) | (cond2a & cond2b)] = 999
The result is:
[[  0,   0,   0,   0,   0],
 [999,   0,   0, 999, 999],
 [  0,   0,   0, 999,   0],
 [  0,   0, 999,   0,   0],
 [  0,   0,   0,   0, 999]]
You can generalize this approach to other patterns of neighbors by changing the filter footprint. You can generalize to other operations (minimum, median, percentiles, etc.) using other kinds of filters (see scipy.ndimage). For operations that can be expressed as weighted sums, use 2d cross correlation.
This approach should be much faster than looping in python. But, it does perform unnecessary computations (for example, it's only necessary to compute the max when the value is 1 or 2, but we're doing it for all elements). Looping manually would let you avoid these computations. Looping in python would probably be much slower than the code here. But, implementing it in numba or cython would probably be faster because these tools generate compiled code.
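For instance, the "sum of the eight neighbors" computed by the next answer is exactly such a weighted sum; a sketch using scipy.ndimage.correlate with zero padding at the edges:
import numpy as np
import scipy.ndimage

A = np.arange(25).reshape(5, 5)
kernel = np.array([[1, 1, 1],
                   [1, 0, 1],   # zero center: exclude the element itself
                   [1, 1, 1]])
print(scipy.ndimage.correlate(A, kernel, mode='constant', cval=0))
# matches the custom_roll result shown below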
I used numpy's:
concatenate to pad with zeroes
dstack and roll to align correctly
Apply custom_roll twice along different dimensions and subtract the original.
import numpy as np

def custom_roll(a, axis=0):
    n = 3
    a = a.T if axis == 1 else a
    pad = np.zeros((n-1, a.shape[1]))
    a = np.concatenate([a, pad], axis=0)
    ad = np.dstack([np.roll(a, i, axis=0) for i in range(n)])
    a = ad.sum(2)[1:-1, :]
    a = a.T if axis == 1 else a
    return a
Consider the following ndarray:
A = np.arange(25).reshape(5, 5)
A
array([[ 0,  1,  2,  3,  4],
       [ 5,  6,  7,  8,  9],
       [10, 11, 12, 13, 14],
       [15, 16, 17, 18, 19],
       [20, 21, 22, 23, 24]])
sum_of_eight_around_me = custom_roll(custom_roll(A), axis=1) - A
sum_of_eight_around_me
array([[ 12.,  20.,  25.,  30.,  20.],
       [ 28.,  48.,  56.,  64.,  42.],
       [ 53.,  88.,  96., 104.,  67.],
       [ 78., 128., 136., 144.,  92.],
       [ 52.,  90.,  95., 100.,  60.]])

Vector operations with numpy

I have three numpy arrays:
X: a 3073 x 49000 matrix
W: a 10 x 3073 matrix
y: a 49000 x 1 vector
y contains values between 0 and 9, each value represents a row in W.
I would like to add the first column of X to the row in W given by the first element in y. I.e. if the first element in y is 3, add the first column of X to the fourth row of W. Then add the second column of X to the row in W given by the second element in y, and so on, until all columns of X have been added to the rows in W specified by y, which means a total of 49000 added vectors.
W[y] += X.T does not work for me, because this will not add more than one vector to a row in W.
Please note: I'm only looking for vectorized solutions. I.e. no for-loops.
EDIT: To clarify I'll add an example with small matrix sizes adapted from Salvador Dali's example below.
In [1]: import numpy as np
In [2]: a, b, c = 3, 4, 5
In [3]: np.random.seed(0)
In [4]: X = np.random.randint(10, size=(b,c))
In [5]: W = np.random.randint(10, size=(a,b))
In [6]: y = np.random.randint(a, size=(c,1))
In [7]: X
Out[7]:
array([[5, 0, 3, 3, 7],
       [9, 3, 5, 2, 4],
       [7, 6, 8, 8, 1],
       [6, 7, 7, 8, 1]])
In [8]: W
Out[8]:
array([[5, 9, 8, 9],
       [4, 3, 0, 3],
       [5, 0, 2, 3]])
In [9]: y
Out[9]:
array([[0],
       [1],
       [1],
       [2],
       [0]])
In [10]: W[y.ravel()] + X.T
Out[10]:
array([[10, 18, 15, 15],
       [ 4,  6,  6, 10],
       [ 7,  8,  8, 10],
       [ 8,  2, 10, 11],
       [12, 13,  9, 10]])
In [11]: W[y.ravel()] = W[y.ravel()] + X.T
In [12]: W
Out[12]:
array([[12, 13,  9, 10],
       [ 7,  8,  8, 10],
       [ 8,  2, 10, 11]])
The problem is to get BOTH column 0 and column 4 in X added to row 0 in W, as well as both column 1 and 2 in X added to row 1 in W.
The desired outcome is thus:
W = [[17, 22, 16, 16],
     [ 7, 11, 14, 17],
     [ 8,  2, 10, 11]]
First, the straightforward loop solution as a reference:
In [65]: for i,j in enumerate(y):
   ....:     W[j] += X[:,i]
   ....:
In [66]: W
Out[66]:
array([[17, 22, 16, 16],
       [ 7, 11, 14, 17],
       [ 8,  2, 10, 11]])
An add.at solution:
In [67]: W=W1.copy()   # W1 holds a copy of the original W
In [68]: np.add.at(W,(y.ravel()),X.T)
In [69]: W
Out[69]:
array([[17, 22, 16, 16],
       [ 7, 11, 14, 17],
       [ 8,  2, 10, 11]])
add.at does an unbuffered calculation, getting around the buffering that prevents W[y.ravel()] += X.T from working. It is still iterative, but the loop has been moved to compiled code. It isn't true vectorization because the order of application matters. The addition for one row of X.T depends on the results from the previous rows.
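A minimal illustration of the buffering issue (a sketch):
import numpy as np

w = np.zeros(3)
idx = np.array([0, 0, 1])
w[idx] += 1           # buffered: index 0 only gets one increment
print(w)              # [1. 1. 0.]

w = np.zeros(3)
np.add.at(w, idx, 1)  # unbuffered: duplicate indexes accumulate
print(w)              # [2. 1. 0.]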
https://stackoverflow.com/a/20811014/901925 is the answer I gave a couple of years ago to a similar question (for 1d arrays).
But when dealing with your large arrays:
X: a 3073 x 49000 matrix
W: a 10 x 3073 matrix
y: a 49000 x 1 vector
this can run into speed issues. Note that W[y.ravel()] is the same size as X.T (why did you pick these sizes that require transpose?). And it's a copy, not a view. So there's already a time penalty.
bincount has been suggested in previous questions, and I think it is faster. See: Making for loop with index arrays faster (both bincount and add.at solutions).
Iterating over the small 3073 dimension could also have speed advantages. Or better yet on the size 10 dimension as Divakar demonstrates.
For the small test case, a,b,c=3,4,5, the add.at solution is fastest, with Divakar's bincount and einseum next. For a larger a,b,c=10,1000,20000, add.at gets very slow, with bincount being the fastest.
Related SO answers
https://stackoverflow.com/a/28205888/901925 (notes that bincount requires complete coverage for y).
https://stackoverflow.com/a/30041823/901925 (where Divakar again shows that bincount rules!)
Vectorized approaches
Approach #1
Based on this answer, here's a vectorized solution using np.bincount -
N = y.max()+1
id = y.ravel() + np.arange(X.shape[0])[:,None]*N
W[:N] += np.bincount(id.ravel(), weights=X.ravel()).reshape(-1,N).T
Approach #2
You can make good use of boolean indexing and np.einsum to get the job done in a concise vectorized manner -
N = y.max()+1
W[:N] += np.einsum('ijk,lk->il',(np.arange(N)[:,None,None] == y.ravel()),X)
Loopy approaches
Approach #3
Since you are selecting and adding up a huge number of columns from X per unique y, it might be better, performance-wise, to run a loop whose complexity equals the number of such unique y's, which is at most the number of rows in W - in your case just 10. Thus the loop has just 10 iterations, not bad! Here's the implementation to fulfill those aspirations -
for k in range(W.shape[0]):
    W[k] += X[:,(y==k).ravel()].sum(1)
Approach #4
You can bring in np.einsum to do the columnwise summations and have the final output like so -
for k in range(W.shape[0]):
    W[k] += np.einsum('ij->i', X[:,(y==k).ravel()])
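As a sanity check, Approach #3 reproduces the desired result on the small example from the question (a sketch):
import numpy as np

X = np.array([[5,0,3,3,7],[9,3,5,2,4],[7,6,8,8,1],[6,7,7,8,1]])
W = np.array([[5,9,8,9],[4,3,0,3],[5,0,2,3]])
y = np.array([[0],[1],[1],[2],[0]])

for k in range(W.shape[0]):
    W[k] += X[:, (y == k).ravel()].sum(1)
print(W)
# [[17 22 16 16]
#  [ 7 11 14 17]
#  [ 8  2 10 11]]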
This will achieve what you want: X + W[y.ravel()].T
To see that this really works, here is a reproducible example:
import numpy as np
np.random.seed(0)
a, b, c = 3, 5, 4 # you can use your 3073, 49000, 10 later
X = np.random.rand(a, b)
W = np.random.rand(c, a)
y = np.random.randint(c, size=(b, 1))
Now your matrices are (W, y and X, respectively):
[[ 0.0871293   0.0202184   0.83261985]
 [ 0.77815675  0.87001215  0.97861834]
 [ 0.79915856  0.46147936  0.78052918]
 [ 0.11827443  0.63992102  0.14335329]]
[[3]
 [0]
 [3]
 [2]
 [0]]
[[ 0.5488135   0.71518937  0.60276338  0.54488318  0.4236548 ]
 [ 0.64589411  0.43758721  0.891773    0.96366276  0.38344152]
 [ 0.79172504  0.52889492  0.56804456  0.92559664  0.07103606]]
And W[y.ravel()] gives you the rows of W selected by the elements of y. By transposing it, you get a matrix ready to be added to X:
[[ 0.11827443  0.0871293   0.11827443  0.79915856  0.0871293 ]
 [ 0.63992102  0.0202184   0.63992102  0.46147936  0.0202184 ]
 [ 0.14335329  0.83261985  0.14335329  0.78052918  0.83261985]]
While I can't say that this is very pythonic, it is a solution (I think). Note the +=, since plain = would overwrite rather than accumulate:
for column in range(x.shape[1]):
    w[y[column]] += x[:,column]
