I have a 2D grid with q rows and p columns.
The elements of the grid are the indices of the elements themselves.
So the element a[i,j]=(i,j).
Now I need to create pairwise distances between these points with a circularity constraint, meaning that distance(a[i,p-1],a[i,0])=1 (I am using a 0-indexed matrix, as in Python). Analogously, distance(a[q-1,j],a[0,j])=1.
Observe that distance(a[q-2,j],a[0,j]) is the shorter of two vertical paths: the one going directly from 0 to q-2, and the one going from q-2 to 0 by leveraging the circularity of the grid. The same thing happens horizontally.
My question is: is there a NumPy function that can quickly calculate such a pairwise distance matrix?
I don't know of a function to do this, but it's pretty easy to compute manually:
import numpy as np
q = 6 # rows
p = 5 # columns
# Create array of indices
a = np.mgrid[0:q, 0:p].transpose(1, 2, 0)
# The array `a` now has shape (q, p, 2) :
a[1, 2] # array([1, 2])
a[3, 0] # array([3, 0])
# Create a new array measuring the i,j difference between all pairs of
# locations in the array. `diff` will have shape (q, p, q, p, 2) :
diff = a[None, None, :, :, :] - a[:, :, None, None, :]
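# Modify diff to obey circularity constraint. `di` and `dj` are views
# into `diff`, so the in-place updates below change `diff` itself.
# Offsets must be wrapped in both directions (positive and negative).
di = diff[..., 0]
dj = diff[..., 1]
di[di > q//2] -= q
di[di < -(q//2)] += q
dj[dj > p//2] -= p
dj[dj < -(p//2)] += p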
# compute distance from diff
dist = (diff[..., 0]**2 + diff[..., 1]**2)**0.5
# test:
dist[1, 0, 1, 0] # 0.0
dist[1, 0, 1, 1] # 1.0
dist[1, 0, 1, 2] # 2.0
dist[1, 0, 1, 3] # 2.0
dist[1, 0, 1, 4] # 1.0
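An equivalent and somewhat more compact formulation takes the absolute offset along each axis and keeps the smaller of the direct route and the wrapped-around route. The following is only a minimal sketch of that idea; the helper name `circular_distances` is hypothetical and is not a NumPy function.

```python
import numpy as np

def circular_distances(q, p):
    """Pairwise Euclidean distances on a q x p grid that wraps on both axes."""
    i = np.arange(q)
    j = np.arange(p)
    # Absolute row/column offsets between every pair of indices
    di = np.abs(i[:, None] - i[None, :])   # shape (q, q)
    dj = np.abs(j[:, None] - j[None, :])   # shape (p, p)
    # On a circular axis the offset is the shorter of the two routes
    di = np.minimum(di, q - di)
    dj = np.minimum(dj, p - dj)
    # Broadcast to shape (q, p, q, p), indexed as dist[i1, j1, i2, j2]
    return np.hypot(di[:, None, :, None], dj[None, :, None, :])

dist = circular_distances(6, 5)
print(dist[1, 0, 1, 4])  # 1.0
print(dist[1, 4, 1, 0])  # 1.0
```

Because the offsets are made non-negative before wrapping, the result is symmetric by construction, and the per-axis tables stay small (q x q and p x p) until the final broadcast.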
Imagine a 3-dimensional grid in space, where each grid point has a binary value.
The values of the grid points are represented by a 3d numpy array.
For each grid point that has a value of 1, we want to know where the nearest 0-valued point is located, in 14 different directions (±x, ±y, ±z, and 8 diagonals).
That means, with a numpy array of shape (nx, ny, nz) as input, the output should be an array of shape (nx, ny, nz, 14), where each value in the last dimension of the output corresponds to distance of that point to the nearest 0-valued neighbor in one direction (but we don't need to calculate this for 0-valued grid points, so their values can be set to zero).
What is the most efficient way to calculate this in numpy?
My current approach is looping over the grid points (three nested for-loops), and for each point, first checking whether it's a 1, and if so, slicing the array 14 times to get the points in one of the directions starting from the current point, and taking the index of first 0 element as distance to nearest 0-valued neighbor in that direction.
Here's a way that should work for all the non-diagonal cases. The example is just in 1d, but should work along each axis in 3d. This just computes the distance to the nearest 0 to the left:
import numpy as np
a = np.array([0, 1, 1, 1, 1, 0, 0, 0, 1, 1, 0, 1, 1])
n = len(a)
zero_index = np.arange(0, n) * (1 - a)
one_index = np.arange(0, n) * a
(one_index - np.maximum.accumulate(zero_index)) * a
# -> returns array([0, 1, 2, 3, 4, 0, 0, 0, 1, 2, 0, 1, 2])
The idea is that zero_index has the integer index wherever a is 0, and one_index has the integer index wherever a is 1. np.maximum.accumulate does a cumulative maximum, which fills the zero values in zero_index with the values from the left. The final * a just sets the final distance to 0 wherever there was a 0.
This doesn't handle the case when a[0] is 1. We can fix that by offsetting the zero_index by 1 (so it starts with 1 if a[0] is 0):
a = np.array([1, 1, 1, 1, 1, 0, 0, 0, 1, 1, 0, 1, 1])
n = len(a)
zero_index = np.arange(1, n+1) * (1 - a)
one_index = np.arange(0, n) * a
(one_index - np.maximum.accumulate(zero_index) + 1) * a
# -> returns array([1, 2, 3, 4, 5, 0, 0, 0, 1, 2, 0, 1, 2])
You can handle the opposite direction with reverse indexing.
Unfortunately I can't propose a vectorized way of doing the diagonals. Perhaps there's a way (some kind of shape or roll) that skews the diagonal to make it aligned with an axis.
Basically, I have three arrays that I multiply with values from 0 to 2, expanding the number of rows to the number of scale values (the values to be multiplied are the same for each array). From there, I want to calculate the sum of every combination of scaled rows from all three arrays. So I have three arrays
A = np.array([1, 2, 3])
B = np.array([1, 2, 3])
C = np.array([1, 2, 3])
and I'm trying to reduce the operation given below
search_range = np.linspace(0, 2, 11)
results = np.array([[0, 0, 0]])
for i in search_range:
    for j in search_range:
        for k in search_range:
            sm = i*A + j*B + k*C
            results = np.append(results, [sm], axis=0)
What I tried doing:
A = np.array([[1, 2, 3]])
B = np.array([[1, 2, 3]])
C = np.array([[1, 2, 3]])
n = 11
scale = np.linspace(0, 2, n).reshape(-1, 1)
A = np.repeat(A, n, axis=0) * scale
B = np.repeat(B, n, axis=0) * scale
C = np.repeat(C, n, axis=0) * scale
results = np.array([[0, 0, 0]])
for i in range(n):
    A_i = A[i]
    for j in range(n):
        B_j = B[j]
        C_k = C
        sm = A_i + B_j + C_k
        results = np.append(results, sm, axis=0)
which only removes the last for loop. How do I reduce the other for loops?
You can get the same result like this:
search_range = np.linspace(0, 2, 11)
search_range = np.array(np.meshgrid(search_range, search_range, search_range))
search_range = search_range.T.reshape(-1, 3)
sm = search_range[:, 0, None]*A + search_range[:, 1, None]*B + search_range[:, 2, None]*C
results = np.concatenate(([[0, 0, 0]], sm))
Instead of using three nested loops to get every combination of elements in the "search_range" array, I used the meshgrid function to convert "search_range" into a 2D array containing every possible combination, one per row. The three items in each row then take the place of i, j and k.
And finally, as suggested by @Mercury, you can index the new "search_range" array to generate the result. For example, search_range[:, 1, None] is an array of shape (1331, 1) holding, as singleton arrays, the element at index 1 of every row of "search_range". The concatenate is only there because you wanted the results array to start with the default row [[0, 0, 0]], so I appended sm to it; otherwise, the sm array already contains the answer.
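One caveat, as far as I can tell: with the default `indexing="xy"` followed by `.T.reshape(-1, 3)`, the rows are not produced in the same (i, j, k) order as the nested loops. That goes unnoticed in this example only because A, B and C are identical. The sketch below is a variant, not the answer's exact code: it uses `indexing="ij"` with `np.stack` so that the row order matches the loops, and it deliberately uses different values for B and C (an assumption made purely so that the order becomes visible).

```python
import numpy as np

A = np.array([1, 2, 3])
B = np.array([10, 20, 30])      # different from A so that row order matters
C = np.array([100, 200, 300])
s = np.linspace(0, 2, 11)

# indexing="ij" keeps the first argument on the slowest-varying axis
I, J, K = np.meshgrid(s, s, s, indexing="ij")
# Each row is one (i, j, k) combination, with k varying fastest
coeffs = np.stack([I, J, K], axis=-1).reshape(-1, 3)

sm = coeffs[:, 0, None] * A + coeffs[:, 1, None] * B + coeffs[:, 2, None] * C

# Reference: the original triple loop, written as a comprehension
expected = np.array([i * A + j * B + k * C for i in s for j in s for k in s])
print(sm.shape)                    # (1331, 3)
print(np.allclose(sm, expected))   # True
```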
I have a matrix of matrices with some arbitrary shape (N1,N2,k,k), meaning N1*N2 matrices of shape k*k.
I wish to calculate the sum of each matrix (of shape (k,k)) and replace the matrix itself with that sum.
The resulting array would be of shape (N1,N2), where the element at index i,j is the sum of the corresponding matrix at that index.
Is there a way of doing so with NumPy operations? (That is, no looping over range(N1) and range(N2).)
Here's a simple example (I'm using * between the first array and the transposed second array just to create the example):
m = np.array([[0, 0, 0, 0]]).reshape(2, 2) # matrix element of size k*k (k=2)
a = np.array([m, m + 1, m + 2, m + 3])
b = np.array([m, m + 1, m + 2, m + 3])
reshaped1 = a[:, np.newaxis] # (N1,1,k,k) where N1=4
reshaped2 = b[np.newaxis, :] # (1,N2,k,k) where N2=4
mult = reshaped1 * reshaped2 # (N1,N2,k,k)=(4,4,2,2)
I wish to create a new array res that will contain the sum of all mult elements. that can somewhat be done with the following pseudo:
for i in range(N1):
    for j in range(N2):
        res[i, j] = np.sum(mult[i, j])
appreciate your help!
If I understand you correctly, you can use np.sum with multiple axes:
np.sum(mult, axis=(2, 3))
Output:
array([[ 0, 0, 0, 0],
[ 0, 4, 8, 12],
[ 0, 8, 16, 24],
[ 0, 12, 24, 36]])
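If the (N1, N2, k, k) product is only needed as an intermediate for this sum, `np.einsum` can compute the same (N1, N2) result directly from `a` and `b`. This is just a sketch of an alternative, under the assumption that `mult` itself is not needed afterwards.

```python
import numpy as np

m = np.zeros((2, 2), dtype=int)          # matrix element of size k*k (k=2)
a = np.array([m, m + 1, m + 2, m + 3])   # (N1, k, k)
b = np.array([m, m + 1, m + 2, m + 3])   # (N2, k, k)

# Reference: build the full product, then sum over the last two axes
mult = a[:, np.newaxis] * b[np.newaxis, :]   # (N1, N2, k, k)
res_sum = mult.sum(axis=(2, 3))

# Same result without materialising the (N1, N2, k, k) intermediate:
# res[i, j] = sum over k, l of a[i, k, l] * b[j, k, l]
res_einsum = np.einsum("ikl,jkl->ij", a, b)
print(np.array_equal(res_sum, res_einsum))   # True
```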
Try using np.sum(np.sum(mult, axis=3), axis=2):
import numpy as np
N1=4
N2=4
m = np.array([[0, 0, 0, 0]]).reshape(2, 2) # matrix element of size k*k (k=2)
a = np.array([m, m + 1, m + 2, m + 3])
b = np.array([m, m + 1, m + 2, m + 3])
reshaped1 = a[:, np.newaxis] # (N1,1,k,k) where N1=4
reshaped2 = b[np.newaxis, :] # (1,N2,k,k) where N2=4
mult = reshaped1 * reshaped2 # (N1,N2,k,k)=(4,4,2,2)
np.sum(mult,axis=3)
res=np.zeros((4,4))
for i in range(N1):
    for j in range(N2):
        res[i,j] = np.sum(mult[i,j])
print(np.array_equal(np.sum(np.sum(mult,axis=3),axis=2),res))
>>> True
Can I index NumPy N-D array with fallback to default values for out-of-bounds indexes? Example code below for some imaginary np.get_with_default(a, indexes, default):
import numpy as np
print(np.get_with_default(
np.array([[1,2,3],[4,5,6]]), # N-D array
[(np.array([0, 0, 1, 1, 2, 2]), np.array([1, 2, 2, 3, 3, 5]))], # N-tuple of indexes along each axis
13, # Default for out-of-bounds fallback
))
should print
[2 3 6 13 13 13]
I'm looking for a built-in function for this. If no such function exists, then at least a short and efficient implementation.
I arrived at this question because I was looking for exactly the same thing. I came up with the following function, which does what you ask for in 2 dimensions. It could likely be generalised to N dimensions.
def get_with_defaults(a, xx, yy, nodata):
    # get values from a, clipping the index values to valid ranges
    res = a[np.clip(yy, 0, a.shape[0] - 1), np.clip(xx, 0, a.shape[1] - 1)]
    # compute a mask for both x and y, where all invalid index values are set to true
    myy = np.ma.masked_outside(yy, 0, a.shape[0] - 1).mask
    mxx = np.ma.masked_outside(xx, 0, a.shape[1] - 1).mask
    # replace all values in res with NODATA, where either the x or y index are invalid
    np.choose(myy + mxx, [res, nodata], out=res)
    return res
xx and yy are the index arrays; a is indexed by (y, x).
This gives:
>>> a=np.zeros((3,2),dtype=int)
>>> get_with_defaults(a, (-1, 1000, 0, 1, 2), (0, -1, 0, 1, 2), -1)
array([-1, -1, 0, 0, -1])
As an alternative, the following implementation achieves the same and is more concise:
def get_with_default(a, xx, yy, nodata):
    # convert to arrays so that tuples/lists support the comparisons below
    xx, yy = np.asarray(xx), np.asarray(yy)
    # get values from a, clipping the index values to valid ranges
    res = a[np.clip(yy, 0, a.shape[0] - 1), np.clip(xx, 0, a.shape[1] - 1)]
    # replace all values in res with NODATA (gets broadcasted to the result array), where
    # either the x or y index are invalid
    res[(yy < 0) | (yy >= a.shape[0]) | (xx < 0) | (xx >= a.shape[1])] = nodata
    return res
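A related variant skips the clipping step altogether: start from an array filled with the default value and index `a` only at the positions that are in bounds. This is a minimal sketch rather than either implementation above; the name `get_with_default_2d` and the (rows, cols) argument order are assumptions made for illustration.

```python
import numpy as np

def get_with_default_2d(a, rows, cols, nodata):
    """Look up a[rows, cols], returning `nodata` for out-of-bounds index pairs."""
    rows, cols = np.asarray(rows), np.asarray(cols)
    # True where both the row and the column index are inside the array
    valid = (rows >= 0) & (rows < a.shape[0]) & (cols >= 0) & (cols < a.shape[1])
    # Start from the default everywhere, then fill in the valid lookups
    res = np.full(valid.shape, nodata, dtype=a.dtype)
    res[valid] = a[rows[valid], cols[valid]]
    return res

a = np.array([[1, 2, 3], [4, 5, 6]])
out = get_with_default_2d(a, [0, 0, 1, 1, 2, 2], [1, 2, 2, 3, 3, 5], 13)
print(out.tolist())  # [2, 3, 6, 13, 13, 13]
```

Note that the result takes the dtype of `a`, so the default value has to be representable in that dtype.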
I don't know if there is anything in NumPy to do that directly, but you can always implement it yourself. This is not particularly smart or efficient, as it requires multiple advanced indexing operations, but does what you need:
import numpy as np
def get_with_default(a, indices, default=0):
    # Ensure inputs are arrays
    a = np.asarray(a)
    indices = tuple(np.broadcast_arrays(*indices))
    if len(indices) <= 0 or len(indices) > a.ndim:
        raise ValueError('invalid number of indices.')
    # Make mask of indices out of bounds
    # (the builtin bool is used because the np.bool alias was removed in NumPy 1.24)
    mask = np.zeros(indices[0].shape, bool)
    for ind, s in zip(indices, a.shape):
        mask |= (ind < 0) | (ind >= s)
    # Only do masking if necessary
    n_mask = np.count_nonzero(mask)
    # Shortcut for the case where all is masked: the result has the shape of
    # the index arrays plus any trailing, unindexed axes of a
    if n_mask == mask.size:
        return np.full(mask.shape + a.shape[len(indices):], default, dtype=a.dtype)
    if n_mask > 0:
        # Copy the index arrays so the caller's arrays are not modified
        indices = tuple(np.array(ind) for ind in indices)
        for ind in indices:
            # Replace masked indices with zeros
            ind[mask] = 0
    # Get values
    res = a[indices]
    if n_mask > 0:
        # Replace values of masked indices with default value
        res[mask] = default
    return res
# Test
print(get_with_default(
np.array([[1,2,3],[4,5,6]]),
(np.array([0, 0, 1, 1, 2, 2]), np.array([1, 2, 2, 3, 3, 5])),
13
))
# [ 2 3 6 13 13 13]
I also needed a solution to this, but I wanted a solution that worked in N dimensions. I made Markus' solution work for N-dimensions, including selecting from an array with more dimensions than the coordinates point to.
import numpy as np

def get_with_defaults(arr, coords, nodata):
    coords, shp = np.array(coords), np.array(arr.shape)
    # Get values from arr, clipping to valid ranges
    res = arr[tuple(np.clip(c, 0, s-1) for c, s in zip(coords, shp))]
    # Set any output where one of the coords was out of range to nodata
    res[np.any(~((0 <= coords) & (coords < shp[:len(coords), None])), axis=0)] = nodata
    return res

if __name__ == '__main__':
    A = np.array([[1,2,3],[4,5,6]])
    B = np.array([[[1, -9],[2, -8],[3, -7]],[[4, -6],[5, -5],[6, -4]]])
    coords1 = [[0, 0, 1, 1, 2, 2], [1, 2, 2, 3, 3, 5]]
    coords2 = [[0, 0, 1, 1, 2, 2], [1, 2, 2, 3, 3, 5], [1, 1, 1, 1, 1, 1]]
    out1 = get_with_defaults(A, coords1, 13)
    out2 = get_with_defaults(B, coords1, 13)
    out3 = get_with_defaults(B, coords2, 13)
    print(out1)
    # [ 2  3  6 13 13 13]
    print(out2)
    # [[ 2 -8]
    #  [ 3 -7]
    #  [ 6 -4]
    #  [13 13]
    #  [13 13]
    #  [13 13]]
    print(out3)
    # [-8 -7 -4 13 13 13]
Imagine a matrix A whose first column holds inequality/equality operators (≥, =, ≤), and a vector b, where the number of rows in A equals the number of elements in b. In my setting, one row would then be evaluated as, e.g.,
dot(A[0, 1:], x) ≥ b[0]
where x is some vector and column A[,0] holds all the operators, so we'd know that row 0 is supposed to be evaluated using the ≥ operator (i.e. A[0,0] == "≥" is true). Now, is there a way to dynamically evaluate all rows in the following, so far imaginary, way?
dot(A[, 1:], x) A[, 0] b
My hope was for a dynamic evaluation of each row where we evaluate which operator is used for each row.
Example, let
A = [
[">=", -2, 1, 1],
[">=", 0, 1, 0],
["==", 0, 1, 1]
]
b = [0, 1, 1]
and let x be some given vector, e.g. x = [1,1,0]. We wish to compute the following:
A[,1:] x A[,0] b
dot([-2, 1, 1], [1, 1, 0]) >= 0
dot([0, 1, 0], [1, 1, 0]) >= 1
dot([0, 1, 1], [1, 1, 0]) == 1
The output would be [False, True, True]
If I understand correctly, this is a way to do that operation:
import numpy as np
# Input data
a = [
[">=", -2, 1, 1],
[">=", 0, 1, 0],
["==", 0, 1, 1]
]
b = np.array([0, 1, 1])
x = np.array([1, 1, 0])
# Split in comparison and data
a0 = np.array([lst[0] for lst in a])
a1 = np.array([lst[1:] for lst in a])
# Compute dot product
c = a1 @ x
# Compute comparisons
leq = c <= b
eq = c == b
geq = c >= b
# Find comparison index for each row
cmps = np.array(["<=", "==", ">="]) # This array is lex sorted
cmp_idx = np.searchsorted(cmps, a0)
# Select the right result for each row
result = np.choose(cmp_idx, [leq, eq, geq])
# Convert to numeric type if preferred
result = result.astype(np.int32)
print(result)
# [0 1 1]
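The searchsorted/choose step relies on the operator strings being lexicographically sorted. A variant that avoids that dependency is `np.select`, which picks each row's result by matching the operator string directly. This is only a sketch of that alternative; the variable names `ops` and `coef` are chosen here for illustration, and the result is left as booleans.

```python
import numpy as np

ops = np.array([">=", ">=", "=="])                       # operator for each row
coef = np.array([[-2, 1, 1], [0, 1, 0], [0, 1, 1]])      # coefficients for each row
b = np.array([0, 1, 1])
x = np.array([1, 1, 0])

c = coef @ x
# For each row, take the comparison whose condition matches its operator
result = np.select(
    [ops == "<=", ops == "==", ops == ">="],
    [c <= b, c == b, c >= b],
    default=False,
)
print(result.tolist())  # [False, True, True]
```

Rows with an operator that is not in the list fall through to `default=False`; raising an error for unknown operators might be preferable in real code.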