cut some rows and columns where values are 255 - python

I am trying to get rid of all rows and columns in a grayscale numpy array where the values are 255.
My array could be:
arr = [[255,255,255,255],
[255,0,0,255],
[255,255,255,255]]
The result should be:
arr = [0,0]
I could just iterate over the array, but there should be a more Pythonic way to solve the problem.
For the rows I tried:
arr = arr[~(arr==255).all(1)]
This works really well, but I cannot find an equivalent solution for columns.

Given boolean arrays for rows and columns:
In [26]: rows
Out[26]: array([False, True, False], dtype=bool)
In [27]: cols
Out[27]: array([False, True, True, False], dtype=bool)
np.ix_ creates ordinal indexers which can be used to index arr:
In [32]: np.ix_(rows, cols)
Out[32]: (array([[1]]), array([[1, 2]]))
In [33]: arr[np.ix_(rows, cols)]
Out[33]: array([[0, 0]])
Therefore you could use
import numpy as np
arr = np.array([[255,255,255,255],
[255,0,0,255],
[255,255,255,255]])
mask = (arr != 255)
rows = mask.any(axis=1)  # keep rows with at least one non-255 value
cols = mask.any(axis=0)  # keep columns with at least one non-255 value
print(arr[np.ix_(rows, cols)])
which yields the 2D array
[[0 0]]

For the columns, you can simply transpose the array:
arr = arr.T[~(arr.T==255).all(1)].T
arr = arr[~(arr==255).all(1)]
which results in
>> print(arr)
[[0 0]]
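The transposes can also be avoided by reducing along axis 0 for the columns. A minimal sketch, assuming the same example array as in the question; the names is_white, keep_rows, keep_cols and cropped are made up for illustration:

```python
import numpy as np

arr = np.array([[255, 255, 255, 255],
                [255,   0,   0, 255],
                [255, 255, 255, 255]])

is_white = (arr == 255)
keep_rows = ~is_white.all(axis=1)   # rows that are not entirely 255
keep_cols = ~is_white.all(axis=0)   # columns that are not entirely 255

# Filter the rows first, then the columns of what is left
cropped = arr[keep_rows][:, keep_cols]
print(cropped)   # [[0 0]]
```

Both masks are computed from the original array, so the order in which rows and columns are filtered does not matter.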

Related

How to get the remaining indexes after a random set of indexes is selected in Python?

Use the following code to illustrate my question.
import numpy as np
np.random.seed(200)
a = np.array([1,21,6,41,8]) # given an array with 5 elements
idx = np.random.choice(5, 3, replace=False) # randomly select 3 indexes between 0 and 4
idx.sort() # sort indexes
print(idx) # [0 3 4]
print(a[idx]) # get random selected subset using the indexes, [ 1 41 8]
How to get the remaining indexes [1,2]?
In [123]: np.random.seed(200)
...: a = np.array([1,21,6,41,8]) # given an array with 5 elements
...: idx = np.random.choice(5, 3, replace=False) # randomly select 3 indexes between 0 and 4
...: idx.sort() # sort indexes
In [124]: idx
Out[124]: array([0, 3, 4])
In [125]: a[idx]
Out[125]: array([ 1, 41, 8])
We could make a boolean mask, and find the True indices:
In [126]: mask = np.ones(a.shape, bool)
In [127]: mask[idx]=False
In [128]: mask
Out[128]: array([False, True, True, False, False])
In [129]: np.nonzero(mask)[0]
Out[129]: array([1, 2])
In [131]: np.arange(a.shape[0])[mask]
Out[131]: array([1, 2])
np.delete does this same sort of masking:
In [132]: np.delete(np.arange(a.shape[0]), idx)
Out[132]: array([1, 2])
One way to do it:
inverted_idx = [x not in idx for x in range(0, len(a))]
print(a[inverted_idx])
Result:
[21 6]
That creates a boolean mask. If you prefer integer indexes, like the ones you had:
inverted_idx = [x for x in range(0, len(a)) if x not in idx]
print(a[inverted_idx])
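Another option that might be worth considering is np.setdiff1d, which returns the sorted values of its first argument that do not appear in the second. A small sketch, with idx hard-coded to the value printed in the question rather than drawn at random:

```python
import numpy as np

a = np.array([1, 21, 6, 41, 8])
idx = np.array([0, 3, 4])   # the selected indexes shown in the question

# Indexes in 0..len(a)-1 that are not in idx, returned in sorted order
remaining = np.setdiff1d(np.arange(len(a)), idx)
print(remaining)      # [1 2]
print(a[remaining])   # the elements that were not selected
```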

Numpy getting row indices of last two elements of each column in mask

I have a boolean mask shaped (M, N). Each column in the mask may have a different number of True elements, but is guaranteed to have at least two. I want to find the row index of the last two such elements as efficiently as possible.
If I only wanted one element, I could do something like (M - 1) - np.argmax(mask[::-1, :], axis=0). However, that won't help me get the second-to-last index.
I've come up with an iterative solution using np.where or np.nonzero:
M = 4
N = 3
mask = np.array([
[False, True, True],
[True, False, True],
[True, False, True],
[False, True, False]
])
result = np.zeros((2, N), dtype=np.intp)
for col in range(N):
    result[:, col] = np.flatnonzero(mask[:, col])[-2:]
This creates the expected result:
array([[1, 0, 1],
[2, 3, 2]], dtype=int64)
I would like to avoid the final loop. Is there a reasonably vectorized form of the above? I am looking for specifically two rows, which are always guaranteed to exist. A general solution for arbitrary element counts is not required.
An argsort does it -
In [9]: np.argsort(mask,axis=0,kind='stable')[-2:]
Out[9]:
array([[1, 0, 1],
[2, 3, 2]])
Another with cumsum -
c = mask.cumsum(0)
out = np.where((mask & (c>=c[-1]-1)).T)[1].reshape(-1,2).T
Specifically for exactly two rows, one way with argmax -
c = mask.copy()
idx = len(c)-c[::-1].argmax(0)-1
c[idx,np.arange(len(idx))] = 0
idx2 = len(c)-c[::-1].argmax(0)-1
out = np.vstack((idx2,idx))
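As a sanity check, the vectorized variants can be compared against the loop from the question. This is only a sketch on the question's example mask; the names expected, by_argsort and by_cumsum are made up here:

```python
import numpy as np

mask = np.array([
    [False, True,  True],
    [True,  False, True],
    [True,  False, True],
    [False, True,  False],
])

# Reference result: the per-column loop from the question
expected = np.stack([np.flatnonzero(col)[-2:] for col in mask.T], axis=1)

# argsort: a stable sort puts the True entries last, still in row order
by_argsort = np.argsort(mask, axis=0, kind='stable')[-2:]

# cumsum: keep True entries whose running count is within 1 of the column total
c = mask.cumsum(0)
by_cumsum = np.where((mask & (c >= c[-1] - 1)).T)[1].reshape(-1, 2).T

assert np.array_equal(by_argsort, expected)
assert np.array_equal(by_cumsum, expected)
print(expected)
```

The stable sort matters: with the default sort kind, the relative order of equal elements is not guaranteed, so the last two entries would not necessarily be the last two True rows.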

How to add element to empty 2d numpy array

I'm trying to insert elements to an empty 2d numpy array. However, I am not getting what I want.
I tried np.hstack but it is giving me a normal array only. Then I tried using append but it is giving me an error.
Error:
ValueError: all the input arrays must have same number of dimensions
randomReleaseAngle1 = np.random.uniform(20.0, 77.0, size=(5, 1))
randomVelocity1 = np.random.uniform(40.0, 60.0, size=(5, 1))
randomArray =np.concatenate((randomReleaseAngle1,randomVelocity1),axis=1)
arr1 = np.empty((2,2), float)
arr = np.array([])
for i in randomArray:
    data = [[170, 68.2, i[0], i[1]]]
    df = pd.DataFrame(data, columns = ['height', 'release_angle', 'velocity', 'holding_angle'])
    test_y_predictions = model.predict(df)
    print(test_y_predictions)
    if (np.any(test_y_predictions == 1)):
        arr = np.hstack((arr, np.array([i[0], i[1]])))
        arr1 = np.append(arr1, np.array([i[0], i[1]]), axis=0)
print(arr)
print(arr1)
I wanted to get something like
[[1.5,2.2],
[3.3,4.3],
[7.1,7.3],
[3.3,4.3],
[3.3,4.3]]
However, I'm getting
[56.60290125 49.79106307 35.45102444 54.89380834 47.09359271 49.19881675
22.96523274 44.52753514 67.19027156 54.10421167]
The recommended list append approach:
In [39]: alist = []
In [40]: for i in range(3):
...: alist.append([i, i+10])
...:
In [41]: alist
Out[41]: [[0, 10], [1, 11], [2, 12]]
In [42]: np.array(alist)
Out[42]:
array([[ 0, 10],
[ 1, 11],
[ 2, 12]])
If we start with an empty((2,2)) array:
In [47]: arr = np.empty((2,2),int)
In [48]: arr
Out[48]:
array([[139934912589760, 139934912589784],
[139934871674928, 139934871674952]])
In [49]: np.concatenate((arr, [[1,10]],[[2,11]]), axis=0)
Out[49]:
array([[139934912589760, 139934912589784],
[139934871674928, 139934871674952],
[ 1, 10],
[ 2, 11]])
Note that empty does not mean the same thing as the list []. It's a real 2x2 array, with 'unspecified' values. And those values remain when we add other arrays to it.
I could start with an array with a 0 dimension:
In [51]: arr = np.empty((0,2),int)
In [52]: arr
Out[52]: array([], shape=(0, 2), dtype=int64)
In [53]: np.concatenate((arr, [[1,10]],[[2,11]]), axis=0)
Out[53]:
array([[ 1, 10],
[ 2, 11]])
That looks more like the list append approach. But why start with the (0,2) array in the first place?
np.concatenate takes a list of arrays (or lists that can be made into arrays). I used nested lists that make (1,2) arrays. With this I can join them on axis 0.
Each concatenate makes a new array. So if done iteratively it is more expensive than the list append.
np.append just takes 2 arrays and does a concatenate. So doesn't add much. hstack tweaks shapes and joins on the 2nd (horizontal) dimension. vstack is another variant. But they all end up using concatenate.
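Applied to the loop in the question, the list append approach might look roughly like the sketch below. The model from the question is not available here, so accept() is a hypothetical stand-in for the np.any(model.predict(df) == 1) test, and random_array stands in for randomArray:

```python
import numpy as np

rng = np.random.default_rng(0)
# Stand-in for the question's randomArray: 5 rows of (release angle, velocity)
random_array = np.column_stack((rng.uniform(20.0, 77.0, size=5),
                                rng.uniform(40.0, 60.0, size=5)))

def accept(angle, velocity):
    # Hypothetical placeholder for the model prediction check in the question
    return angle > 40.0

rows = []
for angle, velocity in random_array:
    if accept(angle, velocity):
        rows.append([angle, velocity])   # cheap list append inside the loop

# One conversion at the end; reshape(-1, 2) keeps two columns
# even when no row was accepted (the shape is then (0, 2))
result = np.array(rows, dtype=float).reshape(-1, 2)
print(result.shape)
```

The array is built once after the loop, so no intermediate arrays are copied on each iteration.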
With the hstack method, you can just reshape after you get the final array:
arr = arr.reshape(-1, 2)
print(arr)
The other method can be more easily done in a similar way:
arr1 = np.append(arr1, np.array([i[0], i[1]])) # in the loop
arr1 = arr1.reshape(-1, 2)
print(arr1)

Use 1d boolean index to select out of 2d array

Sometimes I'll have an ND array out of which I need to select data, but the data criterion has only M < N dimensions. Take for example
## generate some matrix
test = np.arange(9).reshape((3, 3))
## some condition based on first-dimension only
selectMe = np.array([ True, True, False], dtype=bool)
Now, I would like to do
test[selectMe[:, None]]
but that leads to an IndexError:
IndexError: boolean index did not match indexed array along dimension 1; dimension is 3 but corresponding boolean dimension is 1
Naturally, if I repeat the boolean index on the second dimension, everything works -- the following is the expected output:
test[np.repeat(selectMe[:, None], 3, axis=1)]
Out[41]: array([0, 1, 2, 3, 4, 5])
However, this is quite inefficient. What's the natural way of achieving this with numpy without having to repeat the matrix?
If I understand your problem, you can use ellipsis (...) to cover unfiltered dimensions:
import numpy as np
test = np.arange(10000).reshape((100, 100))
# condition
selectMe = np.random.randint(0, 2, 100).astype(bool)
assert (test[selectMe, ...].ravel() == test[np.repeat(selectMe[:, None], 100, axis=1)]).all()
%timeit test[selectMe, ...].ravel() # 11.6 µs
%timeit test[np.repeat(selectMe[:, None], 100, axis=1)] # 103 µs
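On the small 3x3 example from the question, the difference between the two forms can be seen directly. A short sketch; note that the trailing ellipsis is optional, so test[selectMe] behaves the same as test[selectMe, ...]:

```python
import numpy as np

test = np.arange(9).reshape((3, 3))
selectMe = np.array([True, True, False])

# A 1D boolean index on the first axis selects whole rows; the result stays 2D
rows = test[selectMe]
print(rows.shape)     # (2, 3)

# Flattening reproduces the output of the repeated-mask version in the question
flat = rows.ravel()
repeated = test[np.repeat(selectMe[:, None], 3, axis=1)]
print(np.array_equal(flat, repeated))   # True
```

If the 2D row structure is what you need downstream, the ravel() call can simply be dropped.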

How to index and assign to a tensor in tensorflow?

I have a tensor as follows and a numpy 2D array
k = 1
mat = np.array([[1,2],[3,4],[5,6]])
for row in mat:
    values_zero, indices_zero = tf.nn.top_k(row, len(row) - k)
    row[indices_zero] = 0 #????
I want to assign the elements in that row to be zero at those indices. However I can't index a tensor and assign to it as well. I have tried using the tf.gather function but how can I do an assignment? I want to keep it as a tensor and then run it in a session at the end if that is possible.
I guess you are trying to mask the maximum in each row to zero? If so, I would do it like this. The idea is to create the tensor by construction rather than assignment.
import numpy as np
import tensorflow as tf
mat = np.array([[1, 2], [3, 4], [5, 6]])
# All tensorflow from here
tmat = tf.convert_to_tensor(mat)
# Get index of maximum
max_inds = tf.argmax(tmat, axis=1)
# Create an array of column indices in each row
shape = tmat.get_shape()
inds = tf.range(0, shape[1], dtype=max_inds.dtype)[None, :]
# Create boolean mask of maximums
bmask = tf.equal(inds, max_inds[:, None])
# Convert boolean mask to ones and zeros
imask = tf.where(bmask, tf.zeros_like(tmat), tf.ones_like(tmat))
# Create new tensor that is masked with maximums set to zero
newmat = tmat * imask
with tf.Session() as sess:
    print(newmat.eval())
which outputs
[[1 0]
[3 0]
[5 0]]
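The same build-a-mask-instead-of-assigning idea can be tried out in plain NumPy, which may help when checking the logic before moving it into a graph. This is a NumPy analogue, not TensorFlow code, and the input is changed from the question so that one row has its maximum in a different column:

```python
import numpy as np

mat = np.array([[1, 2], [4, 3], [5, 6]])   # row 1 has its maximum in column 0

max_inds = mat.argmax(axis=1)              # column of the maximum in each row
cols = np.arange(mat.shape[1])[None, :]    # column indices, shape (1, n_cols)
bmask = cols == max_inds[:, None]          # True where each row's maximum sits
newmat = np.where(bmask, 0, mat)           # build a new array instead of assigning
print(newmat)
```

Here np.where plays the role that the tf.where and multiplication steps play in the TensorFlow version.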
One way to do this is by advanced indexing:
In [87]: k = 1
In [88]: mat = np.array([[1,2],[3,4],[5,6]])
# `sess` is tf.InteractiveSession()
In [89]: vals, idxs = sess.run(tf.nn.top_k(mat, k=1))
In [90]: idxs
Out[90]:
array([[1],
[1],
[1]], dtype=int32)
In [91]: mat[np.arange(len(mat)), np.squeeze(idxs, axis=1)] = 0  # per-row column index (k = 1)
In [92]: mat
Out[92]:
array([[1, 0],
[3, 0],
[5, 0]])
