Longest subarray of same number - python

If I have a 2D array of shape (n,n), a given point (x,y) and a value v, I want to find the length of the longest subarray of v in each of the subarrays that pass through that point, e.g.:
n=3
x=1
y=1
v=2
array = [
[1,0,2],
[2,2,0],
[2,1,2]
]
Then the following arrays would be checked:
[2,2,0] (horizontal) returns 2
[0,2,1] (vertical) returns 1
[1,2,2] (left diag) returns 2
[2,2,2] (right diag) returns 3

If you have an (n*n) matrix, you can use np.diag to get the values on the main diagonal. np.fliplr can be combined with np.diag to get the values on the reverse diagonal. A specified row or column can be obtained by indexing the NumPy array. The condition can be checked with == v, which gives a Boolean array that is True where the value equals v. Summing a Boolean array gives the number of Trues, i.e. the number of times v occurs in that line. Note that this counts occurrences of v rather than the longest consecutive run; for the example above the two coincide:
import numpy as np
array = np.array([[1, 0, 2],
                  [2, 2, 0],
                  [2, 1, 2]])
v = 2
x = 1
y = 1
horz = (array[x, :] == v).sum() # --> 2
vert = (array[:, y] == v).sum() # --> 1
diag = (np.diag(array) == v).sum() # --> 2
reversed_diag = (np.diag(np.fliplr(array)) == v).sum() # --> 3
This works on (n*n) arrays where (x, y) is the center of the matrix. For other points, the diagonals can be selected with the k parameter of np.diag, provided we know which diagonals pass through the point; there is a good example of this on SO.
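The k offset mentioned above can be sketched as follows. This is a minimal illustration, not the answerer's code: the helper names lines_through and longest_run are made up for this example, and x is assumed to be the row index and y the column index, as in the indexing above. It measures the longest consecutive run of v in each line, which is not necessarily a run that contains the point itself.

```python
import numpy as np

def longest_run(line, v):
    """Length of the longest consecutive run of v in a 1D sequence."""
    best = current = 0
    for item in line:
        current = current + 1 if item == v else 0
        best = max(best, current)
    return best

def lines_through(array, x, y):
    """Row, column and both diagonals passing through (row x, column y)."""
    array = np.asarray(array)
    n_cols = array.shape[1]
    return {
        "horizontal": array[x, :],
        "vertical": array[:, y],
        # np.diag offset k is (column - row) of any element on the diagonal
        "diagonal": np.diag(array, k=y - x),
        # after a left-right flip, column y becomes column n_cols - 1 - y
        "anti_diagonal": np.diag(np.fliplr(array), k=(n_cols - 1 - y) - x),
    }

array = np.array([[1, 0, 2],
                  [2, 2, 0],
                  [2, 1, 2]])
runs = {name: longest_run(line, 2)
        for name, line in lines_through(array, 1, 1).items()}
print(runs)
```

For the example in the question this gives 2, 1, 2 and 3 for the horizontal, vertical, diagonal and anti-diagonal lines, matching the expected values listed there.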
As an alternative, the two diagonals passing through the point, diag and reverse_diag, can be sliced directly from the flattened array with the following relations. The correctness of the resulting arrays should be checked further:
a_mod = array.ravel()
size = array.shape[0]
if y >= x:
    diag = a_mod[y-x:(x+size-y)*size:size+1]
else:
    diag = a_mod[(x-y)*size::size+1]
if x-(size-1-y) >= 0:
    reverse_diag = array[:, ::-1].ravel()[(x-(size-1-y))*size::size+1]
else:
    # the anti-diagonal through (x, y) starts in row 0 at flat index x + y
    reverse_diag = a_mod[x+y:(x+y)*size+1:size-1]
diag = (diag == v).sum()
reverse_diag = (reverse_diag == v).sum()
I think these solutions will help solve the problem. If other results are needed, the OP should clarify so they can be worked on further.

Related

Creating 3d Tensor Array from 2d Array (Python)

I have two numpy arrays (4x4 each). I would like to concatenate them to a tensor of (4x4x2) in which the first 'sheet' is the first array, second 'sheet' is the second array, etc. However, when I try np.stack the output of d[1] is not showing the correct values of the first matrix.
import numpy as np
x = np.array([[ 3.38286851e-02, -6.11905173e-05, -9.08147798e-03,
-2.46860166e-02],
[-6.11905173e-05, 1.74237508e-03, -4.52140165e-04,
-1.22904439e-03],
[-9.08147798e-03, -4.52140165e-04, 1.91939979e-01,
-1.82406361e-01],
[-2.46860166e-02, -1.22904439e-03, -1.82406361e-01,
2.08321422e-01]])
print(np.shape(x)) # 4 x 4
y = np.array([[ 6.76573701e-02, -1.22381035e-04, -1.81629560e-02,
-4.93720331e-02],
[-1.22381035e-04, 3.48475015e-03, -9.04280330e-04,
-2.45808879e-03],
[-1.81629560e-02, -9.04280330e-04, 3.83879959e-01,
-3.64812722e-01],
[-4.93720331e-02, -2.45808879e-03, -3.64812722e-01,
4.16642844e-01]])
print(np.shape(y)) # 4 x 4
d = np.dstack((x,y))
np.shape(d) # indeed it is 4,4,2... but if I do d[1] then it is not the first x matrix.
d[1] # should be y
If you do np.dstack((x, y)), which is the same as the more explicit np.stack((x, y), axis=-1), you are concatenating along the last, not the first axis (i.e., the one with size 2):
(x == d[..., 0]).all()
(y == d[..., 1]).all()
Ellipsis (...) is a Python object that means ": as many times as necessary" when used in an index. For a 3D array, you can equivalently access the leaves as
d[:, :, 0]
d[:, :, 1]
If you want to access the leaves along the first axis, your array must be (2, 4, 4):
d = np.stack((x, y), axis=0)
(x == d[0]).all()
(y == d[1]).all()
Use np.stack instead of np.dstack:
>>> d = np.stack([y, x])
>>> np.all(d[0] == y)
True
>>> np.all(d[1] == x)
True
>>> d.shape
(2, 4, 4)
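To make the axis difference concrete, here is a small sketch with made-up integer arrays (not the matrices from the question) that compares stacking along the last axis with stacking along the first. np.moveaxis is shown as one way to convert between the two layouts without restacking.

```python
import numpy as np

# Stand-in 4x4 arrays; the values are illustrative only.
x = np.arange(16).reshape(4, 4)
y = x * 10

last = np.stack((x, y), axis=-1)   # shape (4, 4, 2), same layout as np.dstack
first = np.stack((x, y), axis=0)   # shape (2, 4, 4)

print(last.shape, first.shape)
print(np.array_equal(last[..., 1], y))   # sheets live on the last axis
print(np.array_equal(first[1], y))       # sheets live on the first axis

# last[1] is row 1 of x paired with row 1 of y, not a whole sheet
print(last[1].shape)

# Move the sheet axis to the front to index sheets as moved[0], moved[1]
moved = np.moveaxis(last, -1, 0)
```

With the (4, 4, 2) layout, d[1] selects the second row of both inputs at once, which is why it does not look like either original matrix.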

Change all values equal to x to y

I have a very simple task that I cannot figure out how to do in numpy. I have a 3 channel array and wherever the array value does not equal (1,1,1) I want to convert that array value to (0,0,0).
So the following:
[[0,1,1],
[1,1,1],
[1,0,1]]
Should change to:
[[0,0,0],
[1,1,1],
[0,0,0]]
How can I achieve this in numpy? The following is not achieving the desired results:
# my_arr.dtype = uint8
my_arr[my_arr != (1,1,1)] = 0
my_arr = np.where(my_arr == (1,1,1), my_arr, (0,0,0))
Use numpy.ndarray.all(axis=1) to build a row mask and assign 0:
import numpy as np
arr = np.array([[0,1,1],
[1,1,1],
[1,0,1]])
arr[~(arr == 1).all(1)] = 0
Output:
array([[0, 0, 0],
[1, 1, 1],
[0, 0, 0]])
Explanation:
arr == 1: returns a Boolean array marking where the condition holds (here, where the value is 1)
all(axis=1): returns a Boolean array that is True for each row in which every entry is True (i.e. rows that are all 1)
~(arr == 1).all(1): selects the rows that are not all 1
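The question mentions a 3-channel array, which often means a shape of (height, width, 3) rather than the 2D array used above. Assuming that layout (an assumption, since the question only shows a 2D example), the same idea can be sketched by reducing over the last axis instead of axis 1. The array img below is made up for illustration.

```python
import numpy as np

# Hypothetical 2x2 image with 3 channels per pixel
img = np.array([[[1, 1, 1], [0, 1, 1]],
                [[1, 0, 1], [1, 1, 1]]], dtype=np.uint8)

# True for pixels whose three channels are all 1; shape (2, 2)
keep = (img == 1).all(axis=-1)

# Boolean indexing with a (2, 2) mask selects whole pixels,
# so every channel of a non-matching pixel is set to 0
img[~keep] = 0
print(img)
```

Comparing against a different target such as (255, 0, 0) would work the same way with (img == target).all(axis=-1), because the comparison broadcasts over the channel axis.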
With plain lists, this is just a matter of comparing each row to [1,1,1]:
x = [[0,1,1],
[1,1,1],
[1,0,1]]
for i in range(len(x)):
    if x[i] != [1,1,1]:
        x[i] = [0,0,0]

Comparing Numpy Arrays

So I have two 2D numpy arrays of equal size, both obtained using the pygame.surfarray.array3d method on two different surfaces.
Each value in the array is also an array in the form [a, b, c] (so I basically have a 2D array with 1D elements).
I'd essentially like to compare the two based on the condition:
if any(val1 != val2) and all(val1 != [0, 0, 0]):
    # can't be equal and val1 can't be [0, 0, 0]
Is there any more efficient way of doing this without simply iterating through either array as shown below?
for y in range(len(array1)):
    for x in range(len(array1[y])):
        val1 = array1[y,x]
        val2 = array2[y,x]
        if any(val1 != val2) and all(val1 != [0, 0, 0]):
            pass  # do something
import numpy as np
if np.any(array1 != array2) and not np.any(np.all(array1 == 0, axis=-1)):
    pass  # do something
np.any(array1 != array2) is comparing each element of the "big" 3D array. This is, however, equivalent to comparing val1 to val2 for every x and y.
The other condition, np.any(np.all(array1 == 0, axis=-1)), is a little more complicated. The innermost np.all(array1 == 0, axis=-1) creates a 2D array of Boolean values. Each value is True or False depending on whether all values in the last dimension are 0. The outer np.any checks whether any value in that 2D array is True, which would mean that some element array1[y, x] was equal to [0, 0, 0].
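The expression above yields a single True or False for the whole pair of arrays. If the goal is instead to act on each pixel that satisfies the condition, as the loop in the question does, a per-pixel Boolean mask is one option. The sketch below uses small made-up arrays and mirrors the question's condition literally, so all(val1 != [0, 0, 0]) is read as "every channel of val1 is nonzero".

```python
import numpy as np

# Made-up (2, 2, 3) arrays standing in for the two surfaces
array1 = np.array([[[1, 2, 3], [0, 0, 0]],
                   [[4, 5, 6], [7, 0, 9]]])
array2 = np.array([[[1, 2, 3], [1, 1, 1]],
                   [[4, 5, 0], [7, 1, 9]]])

# any(val1 != val2) for every pixel
differs = np.any(array1 != array2, axis=-1)
# all(val1 != [0, 0, 0]) for every pixel: every channel is nonzero
all_nonzero = np.all(array1 != 0, axis=-1)

mask = differs & all_nonzero
ys, xs = np.nonzero(mask)   # coordinates of the matching pixels
print(mask)
```

If the intent is only that val1 is not exactly [0, 0, 0], np.any(array1 != 0, axis=-1) would be the matching test for the second condition.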

Calculate mean, variance, covariance of different length matrices in a split list

I have an array with 5 columns: 4 value columns and one index column. I sort and split the array by the index, which gives me splits of different lengths. For every split I want to calculate the mean and variance of the fourth value column and the covariance of the first 3 value columns. My current approach uses a for loop, which I would like to replace with matrix operations, but I am struggling with the different sizes of the splits.
import numpy as np
A = np.random.rand(10,5)
A[:,-1] = np.random.randint(4, size=10)
sorted_A = A[np.argsort(A[:,4])]
splits = np.split(sorted_A, np.where(np.diff(sorted_A[:,4]))[0]+1)
My current for loop looks like this:
result = np.zeros((len(splits), 5))
for idx, values in enumerate(splits):
    if len(values) > 0:
        result[idx, 0] = np.mean(values[:,3])
        result[idx, 1] = np.var(values[:,3])
        result[idx, 2:5] = np.cov(values[:,0:3].transpose(), ddof=0).diagonal()
    else:
        result[idx, 0] = values[:,3]
I tried to work with masked arrays without success, since I couldn't load the matrices into the masked arrays in a proper form. Maybe someone knows how to do this or has a different suggestion.
You can use np.add.reduceat as follows:
>>> idx = np.concatenate([[0], np.where(np.diff(sorted_A[:,4]))[0]+1, [A.shape[0]]])
>>> result2 = np.empty((idx.size-1, 5))
>>> result2[:, 0] = np.add.reduceat(sorted_A[:, 3], idx[:-1]) / np.diff(idx)
>>> result2[:, 1] = np.add.reduceat(sorted_A[:, 3]**2, idx[:-1]) / np.diff(idx) - result2[:, 0]**2
>>> result2[:, 2:5] = np.add.reduceat(sorted_A[:, :3]**2, idx[:-1], axis=0) / np.diff(idx)[:, None]
>>> result2[:, 2:5] -= (np.add.reduceat(sorted_A[:, :3], idx[:-1], axis=0) / np.diff(idx)[:, None])**2
>>>
>>> np.allclose(result, result2)
True
Note that the diagonal of the covariance matrix is just the variances, which simplifies this vectorization quite a bit.
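The reduceat pattern can be easier to follow on a small deterministic input. The sketch below wraps it in a helper named grouped_mean_var, which is a made-up name for this illustration; it assumes a non-empty input and uses the identity Var(x) = E[x²] − E[x]², which gives the population variance (ddof=0) but can lose precision when the variance is tiny relative to the mean.

```python
import numpy as np

def grouped_mean_var(values, labels):
    """Per-group mean and population variance using np.add.reduceat."""
    values = np.asarray(values, dtype=float)
    labels = np.asarray(labels)
    # Sort so that equal labels form contiguous runs
    order = np.argsort(labels, kind="stable")
    values, labels = values[order], labels[order]
    # Index at which each run of equal labels starts
    starts = np.concatenate(([0], np.flatnonzero(np.diff(labels)) + 1))
    # Number of items in each run
    counts = np.diff(np.append(starts, len(labels)))
    mean = np.add.reduceat(values, starts) / counts
    var = np.add.reduceat(values ** 2, starts) / counts - mean ** 2
    return labels[starts], mean, var

groups, mean, var = grouped_mean_var(
    [1.0, 2.0, 3.0, 4.0, 5.0, 6.0],
    [0, 0, 1, 1, 1, 2],
)
print(groups, mean, var)
```

np.add.reduceat sums each slice values[starts[i]:starts[i+1]], so dividing by the run lengths turns the segment sums into segment means without padding the groups to a common size.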

Nearest neighbor 1 dimensional data with a specified range

I have two nested lists A and B:
A = [[50,140],[51,180],[54,500],......]
B = [[50.1, 170], [51,200],[55,510].....]
The 1st element in each inner list runs from 0 to around 1e5, and the 0th element runs from around 50 up to around 700; the elements are unsorted. What I want to do is run through each element A[n][1] and find the closest element B[n][1], but when searching for the nearest neighbor I only want to search within an interval defined by A[n][0] plus or minus 0.5.
I have been using the function:
def find_nearest_vector(array, value):
    idx = np.array([np.linalg.norm(x+y) for (x,y) in array-value]).argmin()
    return array[idx]
This finds the nearest neighbor between the coordinates A[0][:] and B[0][:], for example. However, I need to confine the search range to a rectangle around some small shift in the value A[0][0]. Also, I do not want to reuse elements: I want a unique bijection between each value A[n][1] and B[n][1] within the interval A[n][0] +/- 0.5.
I have been trying to use Scipy's KDTree, but this reuses elements and I don't know how to confine the search range. Effectively, I want to do a one dimensional NNN search on a two dimensional nested list along a specific axis where the neighborhood in which the NNN search is within a hyper-rectangle defined by the 0th element in each inner list plus or minus some small shift.
I use numpy.argsort(), numpy.searchsorted(), numpy.argmin() to do the search.
%pylab inline
import numpy as np
np.random.seed(0)
A = np.random.rand(5, 2)
B = np.random.rand(100, 2)
xaxis_range = 0.02
order = np.argsort(B[:, 0])
bx = B[order, 0]
sidx = np.searchsorted(bx, A[:, 0] - xaxis_range, side="right")
eidx = np.searchsorted(bx, A[:, 0] + xaxis_range, side="left")
result = []
for s, e, ay in zip(sidx, eidx, A[:, 1]):
    section = order[s:e]
    by = B[section, 1]
    idx = np.argmin(np.abs(ay-by))
    result.append(B[section[idx]])
result = np.array(result)
I plot the result as follows:
plot(A[:, 0], A[:, 1], "o")
plot(B[:, 0], B[:, 1], ".")
plot(result[:, 0], result[:, 1], "x")
My understanding of your problem is that, for each A[n][1], you are trying to find the closest element in another set of points (B[i][1], restricted to points where A[n][0] is within +/- 0.5 of B[i][0]).
(I'm not familiar with numpy or scipy, and I'm sure that there's a better way to do this with their algorithms.)
That being said, here's my naive implementation, which runs in O(a*b) time since it scans all of b once for each point in a.
def main(a, b):
    for a_bound, a_val in a:
        # map distance -> B point, keeping only B points inside the window
        dist_to_valid_b_points = {abs(a_val - b_val): (b_bound, b_val)
                                  for b_bound, b_val in b
                                  if are_within_bounds(a_bound, b_bound)}
        print(get_closest_point((a_bound, a_val), dist_to_valid_b_points))
def are_within_bounds(a_bound, b_bound):
    return abs(b_bound - a_bound) < 0.5
def get_closest_point(a_point, point_dict):
    # the keys are distances, so the smallest key belongs to the nearest point
    return (a_point, point_dict[min(point_dict)] if point_dict else None)
main([[50,140],[51,180],[54,500]],[[50.1, 170], [51,200],[55,510]]) yields the following output:
((50, 140), (50.1, 170))
((51, 180), (51, 200))
((54, 500), None)
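Neither answer above addresses the question's requirement that elements of B not be reused. One simple way to sketch that is a greedy pass that marks each chosen row of B as used. The function name match_in_window and the -1 marker for "no match" are inventions for this example, the window test here is inclusive (<=) rather than strict, and a greedy pass depends on the order of A, so it is not guaranteed to give the best overall assignment.

```python
import numpy as np

def match_in_window(A, B, half_width=0.5):
    """Greedy sketch: index into B of the match for each row of A, or -1.

    A row of B is a candidate if its column 0 lies within half_width of
    A's column 0 and it has not been matched already; among candidates,
    the one whose column 1 is closest to A's column 1 is chosen.
    """
    A = np.asarray(A, dtype=float)
    B = np.asarray(B, dtype=float)
    used = np.zeros(len(B), dtype=bool)
    matches = np.full(len(A), -1)
    for i, (ax, ay) in enumerate(A):
        in_window = np.abs(B[:, 0] - ax) <= half_width
        candidates = np.flatnonzero(in_window & ~used)
        if candidates.size == 0:
            continue  # nothing available inside the window
        best = candidates[np.argmin(np.abs(B[candidates, 1] - ay))]
        matches[i] = best
        used[best] = True
    return matches

A = [[50, 140], [51, 180], [54, 500]]
B = [[50.1, 170], [51, 200], [55, 510]]
print(match_in_window(A, B))
```

On the sample data this pairs the first two rows of A with the first two rows of B and leaves the third unmatched, consistent with the output shown above. This version scans all of B for every row of A; sorting B by its first column and using np.searchsorted, as in the earlier answer, would narrow each scan to the window.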
