I have a question. Is there an efficient way to sum all neighbors of a cell in a NumPy matrix without using several conditions?
This is an example:
array([[5, 4, 8, 3, 1, 4, 3, 2, 2, 3],
[2, 7, 4, 5, 8, 5, 4, 7, 1, 1],
[5, 2, 6, 4, 5, 5, 6, 1, 7, 3],
[6, 1, 4, 1, 3, 3, 6, 1, 4, 6],
[6, 3, 5, 7, 3, 8, 5, 4, 7, 8],
[4, 1, 6, 7, 5, 2, 4, 6, 4, 5],
[2, 1, 7, 6, 8, 4, 1, 7, 2, 1],
[6, 8, 8, 2, 8, 8, 1, 1, 3, 4],
[4, 8, 4, 6, 8, 4, 8, 5, 5, 4],
[5, 2, 8, 3, 7, 5, 1, 5, 2, 6]])
When I run m[0][-1] it returns 3 rather than raising an error, because a negative index counts from the end of the row. So if I want to add 1 to all neighbors of a value, I need a lot of conditions: for cells on the edges and in the corners, an index like m[0][-1] just returns a "false neighbor".
IIUC, you want to add 1 to each neighbour of a cell with a given value.
For the example, let's add 1 to each cell in the neighborhood of a 7:
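import numpy as np
from scipy.signal import convolve2d
# 3x3 kernel that counts the eight neighbors and skips the center cell
v = np.array([[1,1,1],[1,0,1],[1,1,1]])
# a is the array from the question (called m there)
a + convolve2d(a==7, v, mode='same')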
output:
array([[6, 5, 9, 3, 1, 4, 4, 3, 3, 3],
[3, 7, 5, 5, 8, 5, 5, 8, 3, 2],
[6, 3, 7, 4, 5, 5, 7, 3, 8, 4],
[6, 1, 5, 2, 4, 3, 6, 3, 6, 8],
[6, 3, 7, 8, 5, 8, 5, 5, 7, 9],
[4, 2, 9, 9, 7, 2, 5, 8, 6, 6],
[2, 2, 8, 8, 9, 4, 2, 7, 3, 1],
[6, 9, 9, 3, 8, 8, 2, 2, 4, 4],
[4, 8, 4, 7, 9, 5, 8, 5, 5, 4],
[5, 2, 8, 4, 7, 6, 1, 5, 2, 6]])
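The same neighbor count can be sketched with NumPy alone, which may help show why no edge conditions are needed: once the mask is padded with a border of zeros, every cell (corners included) has eight in-bounds neighbors. The helper name add_one_to_neighbors and the small 3x3 array are illustrative assumptions, not part of the answer above.

```python
import numpy as np

def add_one_to_neighbors(a, value):
    """Return a + (number of 8-neighbors equal to `value`) for every cell."""
    mask = (a == value).astype(a.dtype)
    # Zero border: positions outside the array count as "not equal to value"
    padded = np.pad(mask, 1)
    rows, cols = a.shape
    counts = np.zeros_like(a)
    for dr in (-1, 0, 1):
        for dc in (-1, 0, 1):
            if dr == 0 and dc == 0:
                continue  # skip the center cell itself
            # Shifted view of the padded mask, aligned with the original array
            counts += padded[1 + dr : 1 + dr + rows, 1 + dc : 1 + dc + cols]
    return a + counts

small = np.array([[7, 1, 1],
                  [1, 1, 1],
                  [1, 1, 7]])
result = add_one_to_neighbors(small, 7)
print(result)
```

The center cell at (1, 1) touches both 7s, so it gains 2; the two 7s are not neighbors of each other and stay unchanged.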
In addition to the good @mozway solution, one very efficient solution is to use the Numba stencil decorator combined with parallel execution. Here is an example:
import numba as nb
import numpy as np
# parallel=True is only useful for quite-big arrays
@nb.njit(parallel=True)
def kernel(v):
    # Pad the boolean mask with a False border so edge cells need no special cases
    cond = np.zeros((v.shape[0]+2, v.shape[1]+2), dtype=np.bool_)
    cond[1:-1, 1:-1] = v == 7
    res = nb.stencil(lambda c: c[-1,-1]+c[-1,0]+c[-1,1]+c[0,-1]+c[0,1]+c[1,-1]+c[1,0]+c[1,1])(cond)
    return v + res[1:-1, 1:-1]
kernel(m)
An even faster solution consists of working in place (using v += res[1:-1, 1:-1] instead of return v + res[1:-1, 1:-1]). Here are the performance results for a 2000x2000 integer array on my 6-core machine:
scipy.signal.convolve2d: 124 ms
Numba out-of-place: 20 ms
Numba in-place: 15 ms
Note that the first call to kernel is slower due to the compilation time.
I also got a similar speed-up for smaller arrays (200x200).
I am trying to create an output that will be an array that contains 5 "sub-arrays". Every array should include 10 random numbers between 0 and 10.
I have this code:
def count_tweets():
    big_array = []
    for i in range(5):
        array = []
        for p in range(10):
            array.append(random.randint(0,10))
        big_array.append(array)
        print(big_array)
I get a result like:
[[4, 2, 7, 1, 3, 2, 6, 9, 3, 10]]
[[4, 2, 7, 1, 3, 2, 6, 9, 3, 10], [5, 10, 7, 10, 7, 2, 1, 4, 8, 3]]
[[4, 2, 7, 1, 3, 2, 6, 9, 3, 10], [5, 10, 7, 10, 7, 2, 1, 4, 8, 3], [2, 7, 1, 3, 8, 5, 7, 6, 0, 0]]
[[4, 2, 7, 1, 3, 2, 6, 9, 3, 10], [5, 10, 7, 10, 7, 2, 1, 4, 8, 3], [2, 7, 1, 3, 8, 5, 7, 6, 0, 0], [0, 1, 9, 9, 4, 2, 10, 4, 3, 8]]
[[4, 2, 7, 1, 3, 2, 6, 9, 3, 10], [5, 10, 7, 10, 7, 2, 1, 4, 8, 3], [2, 7, 1, 3, 8, 5, 7, 6, 0, 0], [0, 1, 9, 9, 4, 2, 10, 4, 3, 8], [3, 7, 3, 5, 4, 0, 2, 8, 6, 2]]
But instead it should be like:
[[0,2,6,7,9,4,6,1,10,5],[1,3,5,9,8,7,6,9,0,10],[3,5,1,7,9,4,7,2,7,9],[10,2,8,5,6,9,2,3,5,9],[4,5,2,9,8,7,5,1,3,5]]
I cannot seem to get the indentation correct. How do I fix the code?
You put the print() statement inside the outer loop, so it prints on every iteration. Move it out of the loop so that it runs once, after the list is complete:
import random
def count_tweets():
    big_array = []
    for i in range(5):
        array = []
        for p in range(10):
            array.append(random.randint(0,10))
        big_array.append(array)
    print(big_array)
count_tweets()
Hope this helps :)
You almost have it right; just move the print out of the for loop (delete the four spaces before print()).
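As an aside, the same nested list can be sketched with a list comprehension, which leaves no loop body for print() to land in by accident. The function name make_random_rows and its parameters are hypothetical; random.randint includes both endpoints, as in the question.

```python
import random

def make_random_rows(n_rows=5, n_cols=10, low=0, high=10):
    # Build each inner list in one expression; randint(low, high) includes both ends
    return [[random.randint(low, high) for _ in range(n_cols)]
            for _ in range(n_rows)]

big_array = make_random_rows()
print(big_array)  # printed once, after the whole structure is built
```

Returning the list instead of printing inside the function also makes the result reusable by the caller.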
I have a NumPy array, say 10x10 size. I want to find a patch in this array. The patch is defined by a mean value and a sigma such that the patch contains values in the following range: [μ ± σ]
Please see below pseudo-code:
a = np.array([[2, 1, 6, 7, 6, 5, 9, 1, 5, 6],
[1, 7, 6, 0, 1, 9, 8, 1, 2, 0],
[4, 4, 5, 1, 7, 8, 8, 7, 3, 3],
[5, 6, 4, 4, 5, 4, 2, 2, 2, 7],
[3, 4, 4, 5, 5, 4, 8, 6, 1, 9],
[4, 4, 5, 5, 4, 6, 1, 9, 4, 5],
[8, 4, 6, 4, 4, 5, 2, 1, 8, 0],
[4, 5, 5, 5, 5, 4, 6, 2, 2, 4],
[3, 6, 1, 7, 7, 3, 2, 3, 5, 1],
[5, 1, 8, 3, 1, 4, 5, 9, 5, 0]])
patch_mu = 5
patch_sigma = 1
patch_size = (5, 5)  # 5 rows, 5 cols
def find_patch_index(arr, mu, sigma, size):
    # magic happens here
    return idx
idx = find_patch_index(a, patch_mu, patch_sigma, patch_size)
print(idx) # should give 3, 1 (i.e., row index 3, column index 1)
"Patch" has no specific definition here. Basically, in a 2D array (a in this case), I want to find a square whose elements are all in a given range, i.e., [μ ± σ].
I am thinking of using np.where, but I can't come up with a condition that describes the patch. Any leads, please?
You can use numpy.lib.stride_tricks.sliding_window_view in the following way:
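from numpy.lib.stride_tricks import sliding_window_view
# True where the value lies in [patch_mu - patch_sigma, patch_mu + patch_sigma]
valid = np.logical_and(
    patch_mu - patch_sigma <= a,
    patch_mu + patch_sigma >= a,
)
# top-left corners of the windows in which every element is valid
idx = np.argwhere(
    sliding_window_view(valid, patch_size).all(axis=(-2, -1))
)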
For NumPy < 1.20 you can use stride_tricks.as_strided instead of sliding_window_view:
from numpy.lib.stride_tricks import as_strided
as_strided(
    valid,
    shape=(valid.shape[0]-patch_size[0]+1, valid.shape[1]-patch_size[1]+1, *patch_size),
    strides=2*valid.strides,
)
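Putting the pieces together, a minimal self-contained sketch (assuming NumPy 1.20 or later for sliding_window_view) might look like the following. It wraps the approach above in the find_patch_index function from the question; passing the patch size as a (rows, cols) tuple is an assumption.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def find_patch_index(arr, mu, sigma, size):
    # True where the element lies within [mu - sigma, mu + sigma]
    valid = (arr >= mu - sigma) & (arr <= mu + sigma)
    # windows has shape (rows - h + 1, cols - w + 1, h, w)
    windows = sliding_window_view(valid, size)
    # A window qualifies only if every element in it is valid
    return np.argwhere(windows.all(axis=(-2, -1)))

a = np.array([[2, 1, 6, 7, 6, 5, 9, 1, 5, 6],
              [1, 7, 6, 0, 1, 9, 8, 1, 2, 0],
              [4, 4, 5, 1, 7, 8, 8, 7, 3, 3],
              [5, 6, 4, 4, 5, 4, 2, 2, 2, 7],
              [3, 4, 4, 5, 5, 4, 8, 6, 1, 9],
              [4, 4, 5, 5, 4, 6, 1, 9, 4, 5],
              [8, 4, 6, 4, 4, 5, 2, 1, 8, 0],
              [4, 5, 5, 5, 5, 4, 6, 2, 2, 4],
              [3, 6, 1, 7, 7, 3, 2, 3, 5, 1],
              [5, 1, 8, 3, 1, 4, 5, 9, 5, 0]])

idx = find_patch_index(a, mu=5, sigma=1, size=(5, 5))
print(idx)  # [[3 1]]
```

The function returns the top-left corner of every matching window, one per row of the result, so an empty array means no patch was found.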
Suppose I have the following numpy array:
array = np.array([[1, 2, 3, 4, 5, 6, 7, 8, 9, 10], [1, 2, 3, 4, 5, 6, 7, 8, 9, 10],[1, 2, 3, 4, 5, 6, 7, 8, 9, 10],[1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
,[1, 2, 3, 4, 5, 6, 7, 8, 9, 10],[1, 2, 3, 4, 5, 6, 7, 8, 9, 10],[1, 2, 3, 4, 5, 6, 7, 8, 9, 10],[1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
,[1, 2, 3, 4, 5, 6, 7, 8, 9, 10],[1, 2, 3, 4, 5, 6, 7, 8, 9, 10]], np.int32)
Then if I use slicing I get:
array[1:5,1:5]
array([[2, 3, 4, 5],
[2, 3, 4, 5],
[2, 3, 4, 5],
[2, 3, 4, 5]], dtype=int32)
I want similar results when selecting rows and columns that have "gaps" (for example 1, 3 and 5).
So I want to select rows and columns 1, 3, 5 and get:
array([[2, 4, 6],
[2, 4, 6],
[2, 4, 6]], dtype=int32)
But I don't know how to do it.
I want to do the same in TensorFlow 2.0, but tf.gather doesn't help.
EDIT: Slicing doesn't solve the problem when there is no pattern in the row and column numbers.
If you want to index with a given list of indices and get the behavior you would get with slicing, use np.ix_:
ix = [1,3,5]
array[np.ix_(ix,ix)]
array([[2, 4, 6],
[2, 4, 6],
[2, 4, 6]])
I'm not sure exactly how this is done in TensorFlow, but the idea is to add a new axis to one of the index arrays (np.ix_ handles this internally). In PyTorch, with ix as a tensor, this could be done with:
a[ix.view(-1,1), ix]
tensor([[2, 4, 6],
[2, 4, 6],
[2, 4, 6]], dtype=torch.int32)
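To make the "new axis" idea concrete, here is a small NumPy sketch comparing np.ix_ with explicit broadcasting. The variable names via_ix, via_newaxis and paired are illustrative, and np.tile is used only as a compact way to rebuild the array from the question.

```python
import numpy as np

# Ten rows of 1..10, as in the question
array = np.tile(np.arange(1, 11, dtype=np.int32), (10, 1))
ix = np.array([1, 3, 5])

# np.ix_ returns index arrays of shapes (3, 1) and (1, 3) that broadcast to (3, 3)
rows, cols = np.ix_(ix, ix)
via_ix = array[rows, cols]

# Same result with an explicit new axis on the row indices
via_newaxis = array[ix[:, None], ix]

# Without the new axis the indices are paired up: elements (1,1), (3,3), (5,5)
paired = array[ix, ix]

print(via_ix)
print(paired)
```

The last line shows why array[ix, ix] alone is not enough: it picks three single elements rather than the 3x3 block.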
When we have a 1D numpy array, we can sort it the following way:
>>> temp = np.random.randint(1,10, 10)
>>> temp
array([5, 1, 1, 9, 5, 2, 8, 7, 3, 9])
>>> sort_inds = np.argsort(temp)
>>> sort_inds
array([1, 2, 5, 8, 0, 4, 7, 6, 3, 9], dtype=int64)
>>> temp[sort_inds]
array([1, 1, 2, 3, 5, 5, 7, 8, 9, 9])
Note: I know I can do this using np.sort, but I need the sorting indices for a different array; this is just a simple example. Now on to my actual question.
I tried to apply the same approach for a 2D array:
>>> d = np.random.randint(1,10,(5,10))
>>> d
array([[1, 6, 8, 4, 4, 4, 4, 4, 4, 8],
[3, 6, 1, 4, 5, 5, 2, 1, 8, 2],
[1, 2, 6, 9, 8, 6, 9, 2, 5, 8],
[8, 5, 1, 6, 6, 2, 4, 3, 7, 1],
[5, 1, 4, 4, 4, 2, 5, 9, 7, 9]])
>>> sort_inds = np.argsort(d)
>>> sort_inds
array([[0, 3, 4, 5, 6, 7, 8, 1, 2, 9],
[2, 7, 6, 9, 0, 3, 4, 5, 1, 8],
[0, 1, 7, 8, 2, 5, 4, 9, 3, 6],
[2, 9, 5, 7, 6, 1, 3, 4, 8, 0],
[1, 5, 2, 3, 4, 0, 6, 8, 7, 9]], dtype=int64)
This result looks good - notice that we can sort each row of d using the indices of the corresponding row from sort_inds as demonstrated in the 1D example. However, trying to get a sorted array using the same approach I used in the 1D example, I got this exception:
>>> d[sort_inds]
---------------------------------------------------------------------------
IndexError Traceback (most recent call last)
<ipython-input-63-e480a9fb309c> in <module>
----> 1 d[ind]
IndexError: index 5 is out of bounds for axis 0 with size 5
So I have 2 questions:
1. What just happened? How did numpy interpret this code?
2. How can I still achieve what I want - that is, sorting d - or any other array of the same dimensions - using sort_inds?
Thanks
You need a little extra work to properly index the 2d array. Here's a way using advanced indexing, where np.arange is used in the first axis so that each row in sort_inds extracts values from the corresponding row in d:
d[np.arange(d.shape[0])[:,None], sort_inds]
Output (from a different random d than the one shown in the question):
array([[1, 1, 2, 3, 3, 4, 4, 7, 8, 9],
[1, 3, 4, 5, 5, 5, 6, 8, 8, 9],
[1, 2, 3, 4, 5, 6, 7, 8, 8, 8],
[2, 2, 4, 7, 7, 8, 8, 9, 9, 9],
[1, 1, 2, 4, 4, 7, 7, 8, 8, 8]])
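On the first question: indexing with a single integer array, as in d[sort_inds], applies every index to axis 0 (the rows), which is why any index of 5 or more fails on an array with 5 rows. As a sketch of an alternative to the explicit np.arange indexing, np.take_along_axis (available since NumPy 1.15) does the per-row lookup directly. The array d below is the one from the question.

```python
import numpy as np

d = np.array([[1, 6, 8, 4, 4, 4, 4, 4, 4, 8],
              [3, 6, 1, 4, 5, 5, 2, 1, 8, 2],
              [1, 2, 6, 9, 8, 6, 9, 2, 5, 8],
              [8, 5, 1, 6, 6, 2, 4, 3, 7, 1],
              [5, 1, 4, 4, 4, 2, 5, 9, 7, 9]])

sort_inds = np.argsort(d)  # sorts along the last axis by default

# Picks d[i, sort_inds[i, j]] for every i, j
sorted_d = np.take_along_axis(d, sort_inds, axis=1)

# Equivalent explicit form with a broadcast row index
same = d[np.arange(d.shape[0])[:, None], sort_inds]

print(sorted_d)
```

The same sort_inds can be passed to np.take_along_axis with any other array of the same shape, which is the use case described in the question.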
I have a 2D array, and I need to make it into a 3D array - with the next layer starting with the second row of the first layer.
This is my best attempt to visually show what I want to do, with four 'layers':
# original array
dat = np.array([[0, 0, 0, 0, 9],
[1, 1, 1, 1, 9],
[2, 2, 2, 2, 9],
[3, 3, 3, 3, 9],
[4, 4, 4, 4, 9],
[5, 5, 5, 5, 9],
[6, 6, 6, 6, 9],
[7, 7, 7, 7, 9],
[8, 8, 8, 8, 9]], np.int32
)
#dat.shape
#(9, 5)
layers = 4
# new 3d array
# first 'layer'
[0, 0, 0, 0, 9],
[1, 1, 1, 1, 9],
[2, 2, 2, 2, 9],
[3, 3, 3, 3, 9],
[4, 4, 4, 4, 9],
[5, 5, 5, 5, 9]
# second 'layer'
[1, 1, 1, 1, 9],
[2, 2, 2, 2, 9],
[3, 3, 3, 3, 9],
[4, 4, 4, 4, 9],
[5, 5, 5, 5, 9],
[6, 6, 6, 6, 9]
# third 'layer'
[2, 2, 2, 2, 9],
[3, 3, 3, 3, 9],
[4, 4, 4, 4, 9],
[5, 5, 5, 5, 9],
[6, 6, 6, 6, 9],
[7, 7, 7, 7, 9]
# fourth 'layer'
[3, 3, 3, 3, 9],
[4, 4, 4, 4, 9],
[5, 5, 5, 5, 9],
[6, 6, 6, 6, 9],
[7, 7, 7, 7, 9],
[8, 8, 8, 8, 9]
# new shape: (rows, layers, columns)
#dat.shape
#(6, 4, 5)
I realize my visual representation of the layers might not be the way I say it is at the end, but that is the shape that I'm trying to get it in.
Solutions that I've tried include a variation of np.repeat(dat[:, :, np.newaxis], steps, axis=2) but for some reason I struggle once it's more than two dimensions.
Appreciate any help!
Approach #1: Here's one approach using broadcasting -
layers = 4
L = dat.shape[0]-layers+1
out = dat[np.arange(L) + np.arange(layers)[:,None]]
If you actually need a (6,4,5) shaped array, a slight modification is needed:
out = dat[np.arange(L)[:,None] + np.arange(layers)]
Approach #2: Here's another with NumPy strides -
strided = np.lib.stride_tricks.as_strided
m,n = dat.strides
N = dat.shape[1]
out = strided(dat, shape = (layers,L,N), strides= (m,N*n,n))
For (6,4,5) shaped output array,
out = strided(dat, shape = (L,layers,N), strides= (N*n,m,n))
Note that this second method creates a view into the input array dat, so it is very cheap to create. If you need a copy instead, append .copy() at the end: out.copy().
Sample output for (6,4,5) output -
In [267]: out[:,0,:]
Out[267]:
array([[0, 0, 0, 0, 9],
[1, 1, 1, 1, 9],
[2, 2, 2, 2, 9],
[3, 3, 3, 3, 9],
[4, 4, 4, 4, 9],
[5, 5, 5, 5, 9]])
In [268]: out[:,1,:]
Out[268]:
array([[1, 1, 1, 1, 9],
[2, 2, 2, 2, 9],
[3, 3, 3, 3, 9],
[4, 4, 4, 4, 9],
[5, 5, 5, 5, 9],
[6, 6, 6, 6, 9]])
In [269]: out[:,2,:]
Out[269]:
array([[2, 2, 2, 2, 9],
[3, 3, 3, 3, 9],
[4, 4, 4, 4, 9],
[5, 5, 5, 5, 9],
[6, 6, 6, 6, 9],
[7, 7, 7, 7, 9]])
In [270]: out[:,3,:]
Out[270]:
array([[3, 3, 3, 3, 9],
[4, 4, 4, 4, 9],
[5, 5, 5, 5, 9],
[6, 6, 6, 6, 9],
[7, 7, 7, 7, 9],
[8, 8, 8, 8, 9]])
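On NumPy 1.20 or later, a similar result can be sketched with sliding_window_view, which avoids computing strides by hand. This is an alternative to the two approaches above rather than part of them. It assumes the 9-row dat from the question, rebuilt here with a list comprehension, and checks the result against the broadcasting approach.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

# Rows [i, i, i, i, 9] for i = 0..8, shape (9, 5)
dat = np.array([[i, i, i, i, 9] for i in range(9)], dtype=np.int32)
layers = 4
L = dat.shape[0] - layers + 1  # 6 rows per layer

# Windows of length `layers` along axis 0: shape (L, columns, layers),
# with windows[r, c, k] == dat[r + k, c]
windows = sliding_window_view(dat, layers, axis=0)

# Move the window axis to the middle to get (rows, layers, columns)
out = windows.transpose(0, 2, 1)

# Reference result from the broadcasting approach
expected = dat[np.arange(L)[:, None] + np.arange(layers)]

print(out.shape)  # (6, 4, 5)
```

sliding_window_view returns a read-only view, so call .copy() on the result if it needs to be modified.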