numpy.where() returns inconsistent dimensions

I pass an array of shape (734, 814, 3) to a function, but numpy.where() gives a one-dimensional result instead of the two-dimensional one I expect for a 2D array:
import numpy as np
def hsi2rgb(img):
    img_rgb = np.empty_like(img)
    h = img[:,:,0] #(734,814)
    s = img[:,:,1] #(734,814)
    i = img[:,:,2] #(734,814)
    l1 = 0.00
    l2 = 2*3.14/3
    l3 = 4*3.14/3
    l4 = 3.14
    r1 = np.where(np.logical_and(h>=l1, h<l2)) #(99048,)
    r2 = np.where(np.logical_and(h>=l2, h<l3))
    r3 = np.where(np.logical_and(h>=l3, h<l4))
    hs = h[r1]
    return img_rgb
r1 is shown to be a tuple, and r1[0] and r1[1] each have size 99048, which shouldn't be the case. r1 should hold the row indices and column indices of the values that satisfy the condition. I tried it without np.logical_and, using just one condition, but the problem persists.

I followed your code, and np.where returned the expected result: a tuple with two 1D arrays containing the indexes where the condition is met:
import numpy as np
h = np.random.uniform(size=(734, 814))
r1 = np.where(np.logical_and(h >= 0.1, h < 0.9))
print(r1[0].shape, r1[1].shape) # (478129,) (478129,)
This means that 478129 elements met the condition. For each of them, r1[0] holds its row index and r1[1] holds its column index. Namely, if r1 looks like
(array([ 0, 0, 0, ..., 733, 733, 733]), array([ 0, 1, 2, ..., 808, 809, 811]))
then I know that h[0, 0], h[0, 1], h[0, 2], etc satisfy the conditions: the row index comes from the first array, the column index from the second. This structure may be less readable, but it's usable for indexing the array h.
The transposed form of the output is more readable, being a 2D array with row-column index pairs:
array([[  0,   0],
       [  0,   1],
       [  0,   2],
       ...,
       [733, 808],
       [733, 809],
       [733, 811]])
It can be obtained by transposing r1 (if you need the original r1 as well), or directly with np.argwhere:
r1 = np.argwhere(np.logical_and(h >= 0.1, h < 0.9))
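The two index layouts can be compared side by side on a tiny array. This is only a sketch: the 2x3 array and the thresholds 0.3 and 0.8 are made up for illustration and stand in for the (734, 814) hue channel from the question.

```python
import numpy as np

# Small made-up array standing in for the (734, 814) hue channel.
h = np.array([[0.1, 0.5, 0.9],
              [0.7, 0.2, 0.4]])
mask = np.logical_and(h >= 0.3, h < 0.8)

# np.where: a tuple of 1D arrays, one per dimension, one entry per match.
rows, cols = np.where(mask)

# np.argwhere: a single 2D array with one (row, col) pair per match.
pairs = np.argwhere(mask)

print(rows.tolist(), cols.tolist())  # [0, 1, 1] [1, 0, 2]
print(pairs.tolist())                # [[0, 1], [1, 0], [1, 2]]

# The tuple form can index h directly; the boolean mask gives the same values.
selected = h[rows, cols]
print(selected.tolist())             # [0.5, 0.7, 0.4]
```

Either layout carries the same information; the tuple form is the one that can be passed straight back into h[...] as an index.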

Related

comparing numpy arrays such that they equal each other if they both fall within the same range of values

Suppose that I have a 3D numpy array where at each canvas[row, col] there is another numpy array in the format [R, G, B, A]. I want to check whether the numpy array at canvas[row, col] is equal to another numpy array [0, 0, 0, 240~255], where the last element is a range of values that will be accepted as "equal". For example, both [0,0,0, 242] and [0,0,0, 255] will pass this check. Below, I have it so that it only accepts the latter case.
(canvas[row,col] == np.array([0,0,0,255])).all()
How might I write this condition so it does as I described previously?

You can compare slices:
(
    (canvas[row, col, :3] == 0).all()  # checking that color is [0, 0, 0]
    and
    (canvas[row, col, 3] >= 240)  # checking that alpha >= 240
)
Also, if you need to check this on a lot of values, you can optimize it with vectorization, producing a 2D array of boolean values:
np.logical_and(
    (canvas[..., :3] == 0).all(axis=-1),  # checking that color is [0, 0, 0]
    (canvas[..., 3] >= 240)  # checking that alpha >= 240
)
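As a quick check of the vectorized form, here is a minimal sketch on a made-up 2x2 RGBA canvas (the pixel values are assumptions chosen to cover each case, not data from the question):

```python
import numpy as np

# Made-up 2x2 RGBA canvas: two matching pixels, one with low alpha,
# one with a non-black color.
canvas = np.array([
    [[0, 0, 0, 242], [0, 0, 0, 255]],
    [[0, 0, 0, 100], [5, 0, 0, 250]],
], dtype=np.uint8)

is_black = (canvas[..., :3] == 0).all(axis=-1)  # R, G and B are all 0
alpha_ok = canvas[..., 3] >= 240                # alpha within 240..255
match = np.logical_and(is_black, alpha_ok)      # one boolean per pixel

print(match.tolist())  # [[True, True], [False, False]]
```

Since the alpha channel of a uint8 image cannot exceed 255, checking only the lower bound is enough here; for other dtypes an upper bound would have to be added.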

Using tf.where (or np.where) to draw randomly conditional on an input

I have a TensorFlow vector that only contains 1s and 0s, like a = [0, 0, 0, 1, 0, 1], and conditional on the value of a, I want to draw new random values 0 or 1. If the value of a is 1, I want to draw a new value but if the value of a is 0 I want to leave it alone. So I've tried this:
import tensorflow as tf
import tensorflow_probability as tfp
tfd = tfp.distributions
# random draw of zeros and ones
a = tfd.Binomial(total_count = 1.0, probs = 0.5).sample(6)
which gives me <tf.Tensor: shape=(6,), dtype=float32, numpy=array([0., 0., 0., 1., 0., 1.], dtype=float32)> then if I redraw
# redraw with a different probability if value is 1. in the original draw
b = tf.where(a == 1.0, tfd.Binomial(total_count = 1., probs = 0.5).sample(1), a)
I would expect tf.where to give me a new vector b that has, on average, half of the 1s become 0s but instead it either returns a copy of a or a vector of all 0s. Example output would be one of b = [0, 0, 0, 0, 0, 0], b = [0, 0, 0, 0, 0, 1], b = [0, 0, 0, 1, 0, 0], or b = [0, 0, 0, 1, 0, 1] . I could of course just use b = tfd.Binomial(total_count = 1.0, probs = 0.25).sample(6) but in my particular case the order of the original vector matters.
A more general situation might use a different distribution so that bit-wise operations can't be easily used. For example
# random draw of normals
a = tfd.Normal(loc = 0., scale = 1.).sample(6)
# redraw with a different scale if value is greater than zero in the original draw
b = tf.where(a > 0, tfd.Normal(loc = 0., scale = 2.).sample(1), a)

APPROACH 1:
Not tested, but I think the middle param should be a tensor that matches the original one. E.g. 6 elements:
First, make a second random sequence, of same length:
a2 = tfd.Binomial(total_count = 1.0, probs = 0.5).sample(6)
NOTE: If you need a different probability, you simply use that probability when creating a2.
prob = 0.3
a2 = tfd.Binomial(total_count = 1.0, probs = prob).sample(6)
Then:
b = tf.where(a == 1.0, a2, a)
Explanation:
The values in a2 are irrelevant where a is 0, and are "prob" on average where a is 1.
APPROACH 2:
If that doesn't work, then first param needs to be mapped to a tensor of [true, false, ..]:
def pos(n):
    return n > 0
cond = list(map(pos,a)) # I don't have TensorFlow handy; may need to replace `list` with appropriate function to create a Tensor.
b = tf.where(cond, a2, 0.0)
APPROACH 3:
Tested. Doesn't use tf.where.
First, make a second random sequence, of same length:
a2 = tfd.Binomial(total_count = 1.0, probs = prob).sample(6)
Then combine the two, "bitwise-and"ing corresponding elements:
def and2(a, b):
    return (a & b)
b = list(map(and2, a, a2))
NOTE: could alternatively use any other function to combine the two corresponding elements.
Example data:
a = [0,0,1,1]
a2 = [0,1,0,1]
Result:
b = [0,0,0,1]
Explanation:
The values in a2 are irrelevant where a is 0, and are "prob" on average where a is 1.

tf.where broadcasts your second argument, which has shape [1], up to a vector of shape [6]. That single sampled value is put in the place of every existing 1: it is either 1 (in which case the output matches a) or 0 (in which case the output is all zeros). Draw six samples instead to get an independently resampled value at each position.
Consider tfd.Bernoulli for 0/1 samples.
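The broadcasting behaviour can be reproduced without TensorFlow. The sketch below is a NumPy analogue, not the TensorFlow Probability code from the question: it assumes np.where and NumPy's Generator.binomial as stand-ins for tf.where and tfd.Binomial, and the seed is arbitrary.

```python
import numpy as np

rng = np.random.default_rng(0)           # arbitrary seed, for repeatability
a = rng.binomial(n=1, p=0.5, size=6)     # original draw of 0s and 1s

# One shared draw of shape (1,): it is broadcast to every position where
# a == 1, so the result is either a copy of a or all zeros.
shared = rng.binomial(n=1, p=0.5, size=1)
b_shared = np.where(a == 1, shared, a)

# Six independent draws: each position where a == 1 gets its own redraw,
# and positions where a == 0 are left alone.
fresh = rng.binomial(n=1, p=0.5, size=a.shape)
b_indep = np.where(a == 1, fresh, a)

print(a, b_shared, b_indep)
```

The same shape rule applies to the normal-distribution variant in the question: the replacement values need one sample per element of a.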

Replacing array at i-th dimension

Let's say I have a two-dimensional array
import numpy as np
a = np.array([[1, 1, 1], [2,2,2], [3,3,3]])
and I would like to replace the third vector (in the second dimension) with zeros. I would do
a[:, 2] = np.array([0, 0, 0])
But what if I would like to be able to do that programmatically? I mean, let's say that variable x = 1 contained the dimension on which I wanted to do the replacing. How would the function replace(arr, dimension, value, arr_to_be_replaced) have to look if I wanted to call it as replace(a, x, 2, np.array([0, 0, 0]))?
numpy has a similar function, insert. However, it doesn't replace at dimension i, it returns a copy with an additional vector.
All solutions are welcome, but I do prefer a solution that doesn't recreate the array, so as to save memory.

arr[:, 1]
is basically shorthand for
arr[(slice(None), 1)]
that is, a tuple with slice elements and integers.
Knowing that, you can construct a tuple of slice objects manually, adjust the values depending on an axis parameter and use that as your index. So for
import numpy as np
arr = np.array([[1, 1, 1], [2, 2, 2], [3, 3, 3]])
axis = 1
idx = 2
arr[:, idx] = np.array([0, 0, 0])
#      ^- axis position
you can use
slices = [slice(None)] * arr.ndim
slices[axis] = idx
arr[tuple(slices)] = np.array([0, 0, 0])
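Wrapped into a function, the slice-tuple idea might look like the sketch below. The name replace_along_axis and its parameter order are hypothetical, chosen for illustration; the assignment happens in place, so no copy of the array is made.

```python
import numpy as np

def replace_along_axis(arr, axis, index, values):
    """Overwrite, in place, the entries at position `index` along `axis`.

    Hypothetical helper for illustration; `values` must broadcast to the
    shape of the selected sub-array.
    """
    slices = [slice(None)] * arr.ndim   # equivalent to ":" on every axis
    slices[axis] = index                # pick one position on the chosen axis
    arr[tuple(slices)] = values
    return arr

a = np.array([[1, 1, 1], [2, 2, 2], [3, 3, 3]])
replace_along_axis(a, 1, 2, np.array([0, 0, 0]))   # same as a[:, 2] = 0
print(a.tolist())  # [[1, 1, 0], [2, 2, 0], [3, 3, 0]]

b = np.array([[1, 1, 1], [2, 2, 2], [3, 3, 3]])
replace_along_axis(b, 0, 2, np.array([0, 0, 0]))   # same as b[2, :] = 0
print(b.tolist())  # [[1, 1, 1], [2, 2, 2], [0, 0, 0]]
```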

Change all values equal to x to y

I have a very simple task that I cannot figure out how to do in numpy. I have a 3 channel array and wherever the array value does not equal (1,1,1) I want to convert that array value to (0,0,0).
So the following:
[[0,1,1],
 [1,1,1],
 [1,0,1]]
Should change to:
[[0,0,0],
 [1,1,1],
 [0,0,0]]
How can I achieve this in numpy? The following is not achieving the desired results:
# my_arr.dtype = uint8
my_arr[my_arr != (1,1,1)] = 0
my_arr = np.where(my_arr == (1,1,1), my_arr, (0,0,0))

Use numpy.ndarray.all(1) to filter and assign 0:
import numpy as np
arr = np.array([[0,1,1],
                [1,1,1],
                [1,0,1]])
arr[~(arr == 1).all(1)] = 0
Output:
array([[0, 0, 0],
       [1, 1, 1],
       [0, 0, 0]])
Explanation:
arr == 1: returns an array of bools marking the elements that satisfy the condition (here, being equal to 1)
all(axis=1): returns an array of bools that is True for each row whose elements are all True (i.e. the rows that are all 1)
~(arr == 1).all(1): selects the rows that are not all 1
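If the data is an image of shape (height, width, 3) rather than a 2D array of rows, the same idea should carry over by reducing over the last axis. The sketch below assumes that layout; the 2x2 image is made up for illustration.

```python
import numpy as np

# Made-up 2x2 image with 3 channels, dtype uint8 as in the question.
img = np.array([
    [[0, 1, 1], [1, 1, 1]],
    [[1, 0, 1], [1, 1, 1]],
], dtype=np.uint8)

# True where all three channels of a pixel equal 1; shape (2, 2).
all_ones = (img == 1).all(axis=-1)

# Boolean-mask assignment zeroes every channel of the other pixels in place.
img[~all_ones] = 0

print(img.tolist())
# [[[0, 0, 0], [1, 1, 1]], [[0, 0, 0], [1, 1, 1]]]
```

Using axis=-1 instead of axis=1 keeps the reduction on the channel axis regardless of how many leading dimensions the array has.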

This is just comparing the two lists.
x = [[0,1,1],
     [1,1,1],
     [1,0,1]]
for i in range(len(x)):
    if x[i] != [1,1,1]:
        x[i] = [0,0,0]

Einsum for high dimensions

Considering the 3 arrays below:
np.random.seed(0)
X = np.random.randint(10, size=(4,5))
W = np.random.randint(10, size=(3,4))
y = np.random.randint(3, size=(5,1))
I want to add each column of the matrix X to the row of W given by y as an index. So, for example, if the first element in y is 3, I'll add the first column of X to the fourth row of W (index 3 in Python). I'll repeat this until every column of X has been added to its specific row of W.
I could do it in different ways:
1 - using a for loop:
for i,j in enumerate(y):
    W[j]+=X[:,i]
2 - using the add.at function:
np.add.at(W,(y.ravel()),X.T)
3 - but I can't understand how to do it using einsum.
I was given a solution, but I really can't understand it:
N = y.max()+1
W[:N] += np.einsum('ijk,lk->il',(np.arange(N)[:,None,None] == y.ravel()),X)
Could anyone explain this structure to me?
1 - (np.arange(N)[:,None,None] == y.ravel(),X). I imagine this part refers to adding the columns of X to the specific rows of W, according to y. But where is W? And why do we have to transform W into 4 dimensions in this case?
2 - 'ijk,lk->il' - I didn't understand this either.
i - refers to the rows,
j - columns,
k - each element,
l - what does 'l' refer to?
If anyone can understand this and explain it to me, I would really appreciate it.
Thanks in advance.

Let's simplify the problem by dropping one dimension and using values that are easy to verify manually:
W = np.zeros(3, int)  # np.int was removed from recent NumPy releases; the builtin int works
y = np.array([0, 1, 1, 2, 2])
X = np.array([1, 2, 3, 4, 5])
Values in the vector W get added values from X by looking up with y:
for i, j in enumerate(y):
    W[j] += X[i]
W is calculated as [1, 5, 9], (check quickly by hand).
Now, how could this code be vectorized? We can't do a simple W[y] += X, as y has duplicate values in it and the different sums would overwrite each other at indices 1 and 2.
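The overwriting problem is easy to see by running the naive fancy-indexed update next to np.add.at, which the question already mentions. This is a small sketch using the same y and X as above; the variable names W_naive and W_at are made up for the comparison.

```python
import numpy as np

y = np.array([0, 1, 1, 2, 2])
X = np.array([1, 2, 3, 4, 5])

# Buffered update: for a repeated index only one of the contributions
# survives, so indices 1 and 2 do not accumulate both of their values.
W_naive = np.zeros(3, dtype=int)
W_naive[y] += X

# Unbuffered update: every contribution is added, duplicates included.
W_at = np.zeros(3, dtype=int)
np.add.at(W_at, y, X)

print(W_naive.tolist())  # differs from [1, 5, 9] at indices 1 and 2
print(W_at.tolist())     # [1, 5, 9]
```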
What could be done is to broadcast the values into a new dimension of len(y) and then sum up over this newly created dimension.
N = W.shape[0]
select = (np.arange(N) == y[:, None]).astype(int)
This takes the index range of W ([0, 1, 2]) and, in a new dimension, sets the values to 1 where they match y and to 0 otherwise. select contains this array:
array([[1, 0, 0],
       [0, 1, 0],
       [0, 1, 0],
       [0, 0, 1],
       [0, 0, 1]])
It has len(y) == len(X) rows and len(W) columns and shows for every y/row, what index of W it contributes to.
Let's multiply X with this array, mult = select * X[:, None]:
array([[1, 0, 0],
       [0, 2, 0],
       [0, 3, 0],
       [0, 0, 4],
       [0, 0, 5]])
We have effectively spread out X into a new dimension, and sorted it in a way we can get it into shape W by summing over the newly created dimension. The sum over the rows is the vector we want to add to W:
sum_Xy = np.sum(mult, axis=0) # [1, 5, 9]
W += sum_Xy
The computation of select and mult can be combined with np.einsum:
# `select` has shape (len(y)==len(X), len(W)), or `yw`
# `X` has shape len(X)==len(y), or `y`
# we want something `len(W)`, or `w`, and to reduce the other dimension
sum_Xy = np.einsum("yw,y->w", select, X)
And that's it for the one-dimensional example. For the two-dimensional problem posed in the question it is exactly the same approach: introduce an additional dimension, broadcast the y indices, and then reduce the additional dimension with einsum.
If you internalize how every step works for the one-dimensional example, I'm sure you can work out how the code is doing it in two dimensions, as it is just a matter of getting the indices right (W rows, X columns).
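For readers who want to check their result, here is one possible sketch of the two-dimensional case. It is not the einsum expression from the question: the subscript letters y, w and c are my own labels (sample index, W row, W column), and the data is drawn with np.random.default_rng, so the values differ from the question's seeded arrays. The result is verified against np.add.at.

```python
import numpy as np

rng = np.random.default_rng(0)        # arbitrary seed
X = rng.integers(10, size=(4, 5))     # 5 columns, each of length 4
W = rng.integers(10, size=(3, 4))     # 3 rows, each of length 4
y = rng.integers(3, size=(5, 1))      # target row of W for each column of X

N = W.shape[0]
# select[k, r] is 1 when column k of X goes to row r of W; shape (5, 3).
select = (np.arange(N) == y.ravel()[:, None]).astype(int)

# "yw": select (sample, W row); "cy": X (W column, sample).
# Summing over the shared sample axis y leaves a (W row, W column) array.
sum_Xy = np.einsum("yw,cy->wc", select, X)
result = W + sum_Xy

# Reference result using the unbuffered np.add.at from the question.
expected = W.copy()
np.add.at(expected, y.ravel(), X.T)
print(np.array_equal(result, expected))  # True
```

The only change from the one-dimensional version is the extra axis c, which passes through einsum untouched.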
