Finding the logits with respect to labels Tensorflow Python

I have the label array and logits array as:
label = [1,1,0,1,-1,-1,1,0,-1,0,-1,-1,0,0,0,1,1,1,-1,1]
logits = [0.2,0.3,0.4,0.1,-1.4,-2,0.4,0.5,-0.231,1.9,1.4,-1.456,0.12,-0.45,0.5,0.3,0.4,0.2,1.2,12]
Using Tensorflow, I want to get the values from label and logits where:
1> label is greater than zero
2> label is less than zero
3> label is equal to zero
I would like a result something like this:
label1,logits1 = some_Condition_logic_Where(label > 0) _ returns respective labels and logits
Can anyone suggest how this can be achieved?
EDITED:
>>> label = [1,1,0,1,-1,-1,1,0,-1,0,-1,-1,0,0,0,1,1,1,-1,1]
>>> logits = [0.2,0.3,0.4,0.1,-1.4,-2,0.4,0.5,-0.231,1.9,1.4,-1.456,0.12,-0.45,0.5,0.3,0.4,0.2,1.2,12]
>>> label1 = []; logits1 = []
>>> for l1, l2 in zip(label, logits):
...     if l1 > 0:
...         label1.append(l1)
...         logits1.append(l2)
...
>>> label1
[1, 1, 1, 1, 1, 1, 1, 1]
>>> logits1
[0.2, 0.3, 0.1, 0.4, 0.3, 0.4, 0.2, 12]
I want this logic implemented in Tensorflow, and likewise for the labels -1 and 0. How can I achieve this?

You can use tf.boolean_mask. The example below uses the TensorFlow 1.x session API (tf.Session):
import tensorflow as tf
label = tf.constant([1,1,0,1,-1,-1,1,0,-1,0,-1,-1,0,0,0,1,1,1,-1,1],dtype=tf.float32)
logits = tf.constant([0.2,0.3,0.4,0.1,-1.4,-2,0.4,0.5,-0.231,1.9,1.4,-1.456,0.12,-0.45,0.5,0.3,0.4,0.2,1.2,12],dtype=tf.float32)
# label>0
label1 = tf.boolean_mask(label,tf.greater(label,0))
logits1 = tf.boolean_mask(logits,tf.greater(label,0))
# label<0
label2 = tf.boolean_mask(label,tf.less(label,0))
logits2 = tf.boolean_mask(logits,tf.less(label,0))
# label=0
label3 = tf.boolean_mask(label,tf.equal(label,0))
logits3 = tf.boolean_mask(logits,tf.equal(label,0))
with tf.Session() as sess:
    print(sess.run(label1))
    print(sess.run(logits1))
    print(sess.run(label2))
    print(sess.run(logits2))
    print(sess.run(label3))
    print(sess.run(logits3))
[1. 1. 1. 1. 1. 1. 1. 1.]
[ 0.2 0.3 0.1 0.4 0.3 0.4 0.2 12. ]
[-1. -1. -1. -1. -1. -1.]
[-1.4 -2. -0.231 1.4 -1.456 1.2 ]
[0. 0. 0. 0. 0. 0.]
[ 0.4 0.5 1.9 0.12 -0.45 0.5 ]
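To check the expected groups without a TensorFlow session, the same boolean-mask selection can be sketched in NumPy. This is a NumPy analogue, not TensorFlow code, and the helper name split_by_label_sign is hypothetical:

```python
import numpy as np

label = np.array([1, 1, 0, 1, -1, -1, 1, 0, -1, 0, -1, -1, 0, 0, 0, 1, 1, 1, -1, 1],
                 dtype=np.float32)
logits = np.array([0.2, 0.3, 0.4, 0.1, -1.4, -2, 0.4, 0.5, -0.231, 1.9, 1.4, -1.456,
                   0.12, -0.45, 0.5, 0.3, 0.4, 0.2, 1.2, 12], dtype=np.float32)


def split_by_label_sign(label, logits):
    """Hypothetical helper: group (label, logits) pairs by the sign of the label."""
    masks = {"pos": label > 0, "neg": label < 0, "zero": label == 0}
    # Each boolean mask is applied to both arrays, so the pairs stay aligned.
    return {name: (label[m], logits[m]) for name, m in masks.items()}


groups = split_by_label_sign(label, logits)
label1, logits1 = groups["pos"]
label2, logits2 = groups["neg"]
label3, logits3 = groups["zero"]
```

Building each mask once and applying it to both arrays keeps the label and logits selections in step, which is the same idea as passing the same condition to both tf.boolean_mask calls.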

Related

how to reverse index a 2-d array

I have a 2d MxN array A , each row of which is a sequence of indices, padded by -1's at the end e.g.:
[[ 2 1 -1 -1 -1]
[ 1 4 3 -1 -1]
[ 3 1 0 -1 -1]]
I have another MxN array of float values B:
[[ 0.7 0.4 1.5 2.0 4.4 ]
[ 0.8 4.0 0.3 0.11 0.53]
[ 0.6 7.4 0.22 0.71 0.06]]
and I want to use the indices in A to filter B i.e. for each row, only the indices present in A retain their values, and the values at all other locations are set to 0.0, i.e. the result would look like:
[[ 0.0 0.4 1.5 0.0 0.0 ]
[ 0.0 4.0 0.0 0.11 0.53 ]
[ 0.6 7.4 0.0 0.71 0.0]]
What's a good way to do this in "pure" numpy? (I would like to do this in pure numpy so I can jit it in jax.)
Numpy supports fancy indexing. Ignoring the "-1" entries for the moment, you can do something like this:
index = (np.arange(B.shape[0]).reshape(-1, 1), A)
result = np.zeros_like(B)
result[index] = B[index]
This works because the indices are broadcast. The column np.arange(B.shape[0]).reshape(-1, 1) matches all the elements of a given row of A to the corresponding row in B and result.
This example does not address the fact that -1 is a valid numpy index. You need to clear the elements that correspond to -1 in A when 4 (the last column) is not present in that row:
mask = (A == -1).any(axis=1) & (A != A.shape[1] - 1).all(axis=1)
result[mask, -1] = 0.0
Here, the mask is [True, False, True], indicating that even though the second row has a -1 in it, it also contains a 4.
This approach is fairly efficient. It will create no more than a couple of boolean arrays of the same shape as A for the mask.
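Putting the two steps together on the arrays from the question gives a small runnable sketch. The variable names rows and last are introduced here for readability and are not part of the original answer:

```python
import numpy as np

A = np.array([[2, 1, -1, -1, -1],
              [1, 4, 3, -1, -1],
              [3, 1, 0, -1, -1]])
B = np.array([[0.7, 0.4, 1.5, 2.0, 4.4],
              [0.8, 4.0, 0.3, 0.11, 0.53],
              [0.6, 7.4, 0.22, 0.71, 0.06]])

# Column vector of row numbers; it broadcasts against A to pair each index with its row.
rows = np.arange(B.shape[0]).reshape(-1, 1)
result = np.zeros_like(B)
result[rows, A] = B[rows, A]  # the -1 padding also copies the last column here

# Clear the last column in rows that contain padding but never name the last index.
last = A.shape[1] - 1
mask = (A == -1).any(axis=1) & (A != last).all(axis=1)
result[mask, -1] = 0.0
print(result)
```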
You can use broadcasting, but note that it will create a large intermediate array of shape (M, N, N) (in pure numpy at least):
import numpy as np
A = ...
B = ...
M, N = A.shape
out = np.where(np.any(A[..., None] == np.arange(N), axis=1), B, 0.0)
out:
array([[0. , 0.4 , 1.5 , 0. , 0. ],
[0. , 4. , 0. , 0.11, 0.53],
[0.6 , 7.4 , 0. , 0.71, 0. ]])
Another possible solution:
maxr = np.max(A, axis=1)
A = np.where(A == -1, maxr.reshape(-1,1), A)
mask = np.zeros(np.shape(B), dtype=bool)
np.put_along_axis(mask, A, True, axis=1)
np.where(mask, B, 0)
Output:
array([[0. , 0.4 , 1.5 , 0. , 0. ],
[0. , 4. , 0. , 0.11, 0.53],
[0.6 , 7.4 , 0. , 0.71, 0. ]])
EDIT (when there are rows containing only -1)
The following code handles the possibility, raised by @MadPhysicist (whom I thank), of rows containing only -1. It is only necessary to add two lines to my previous code.
A = np.array([[ 2, 1, -1, -1, -1],
[ -1, -1, -1, -1, -1],
[ 3, 1, 0, -1, -1]])
B = np.array([[ 0.7, 0.4, 1.5, 2.0, 4.4 ],
[ 0.8, 4.0, 0.3, 0.11, 0.53],
[ 0.6, 7.4, 0.22, 0.71, 0.06]])
rminus1 = np.all(A == -1, axis=1) # new
maxr = np.max(A, axis=1)
A = np.where(A == -1, maxr.reshape(-1,1), A)
mask = np.zeros(np.shape(B), dtype=bool)
np.put_along_axis(mask, A, True, axis=1)
C = np.where(mask, B, 0)
C[rminus1, :] = 0 # new
Output:
array([[0. , 0.4 , 1.5 , 0. , 0. ],
[0. , 0. , 0. , 0. , 0. ],
[0.6 , 7.4 , 0. , 0.71, 0. ]])

Generating three new features (membership class) based on probability clustering

I have a column of 100,000 temperatures with a minimum of 0°F and maximum of 130°F. I want to create three new columns (features) based on that temperature column for my model based on probability of membership to a cluster (I think it is also called fuzzy clustering or soft k means clustering).
As illustrated in the plot below: I want to create 3 class memberships with overlap (cold, medium, hot) each with probability of data points belonging to each class of temperature. For example: a temperature of 39°F might have a class 1 (hot) membership of 0.05, a class 2 (medium) membership of 0.20 and a class 3 (cold) membership of 0.75 (note the sum of three would be 1). Is there any way to do this in Python?
cluster_1 = 0 to 30
Cluster_2 = 50 to 80
Cluster_3 = 100 to 130
Based on the image and description: this is more of an assignment problem based on known soft clusters, rather than a clustering problem in itself.
If you have a vector of temperatures: [20, 30, 40, 50, 60, ...] that you want to convert to probabilities of being cold, warm, or hot based on the image above, you can achieve this with linear interpolation:
import numpy as np
def discretize(vec):
    out = np.zeros((len(vec), 3))
    for i, v in enumerate(vec):
        if v < 30:
            out[i] = [1.0, 0.0, 0.0]
        elif v <= 50:
            out[i] = [(50 - v) / 20, (v - 30) / 20, 0.0]
        elif v <= 80:
            out[i] = [0.0, 1.0, 0.0]
        elif v <= 100:
            out[i] = [0.0, (100 - v) / 20, (v - 80) / 20]
        else:
            out[i] = [0.0, 0.0, 1.0]
    return out
result = discretize(np.arange(20, 120, step=5))
This expands a length-N vector into an N×3 array (one row per temperature, with columns cold, warm, hot):
[[1. 0. 0. ]
[1. 0. 0. ]
[1. 0. 0. ]
[0.75 0.25 0. ]
[0.5 0.5 0. ]
[0.25 0.75 0. ]
[0. 1. 0. ]
...
[0. 1. 0. ]
[0. 0.75 0.25]
[0. 0.5 0.5 ]
[0. 0.25 0.75]
[0. 0. 1. ]
...
[0. 0. 1. ]]
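With 100,000 temperatures, a loop-free variant may be preferable. The following is a minimal sketch that assumes the same breakpoints (30, 50, 80, 100) as discretize above; the function name soft_membership is hypothetical, and np.interp is used because it clamps to the end values outside the given range:

```python
import numpy as np


def soft_membership(temps):
    """Hypothetical vectorized variant: columns are (cold, warm, hot)."""
    temps = np.asarray(temps, dtype=float)
    # Cold falls linearly from 1 to 0 between 30 and 50, and is clamped outside.
    cold = np.interp(temps, [30, 50], [1.0, 0.0])
    # Hot rises linearly from 0 to 1 between 80 and 100, and is clamped outside.
    hot = np.interp(temps, [80, 100], [0.0, 1.0])
    # Warm takes whatever probability is left, so each row sums to 1.
    warm = 1.0 - cold - hot
    return np.column_stack([cold, warm, hot])


probs = soft_membership(np.arange(20, 120, step=5))
```

Defining the middle class as the remainder guarantees that the three memberships sum to 1 for every temperature.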
If you don't know the clusters ahead of time, a Gaussian mixture performs something similar to this idea.
For example, consider a multimodal distribution X with modes at 25, 65, and 115 (to correspond roughly with the temperature example):
from numpy.random import default_rng
rng = default_rng(42)
X = np.c_[
    rng.normal(loc=25, scale=15, size=1000),
    rng.normal(loc=65, scale=15, size=1000),
    rng.normal(loc=115, scale=15, size=1000),
].reshape(-1, 1)
Fitting a Gaussian mixture corresponds to trying to estimate where the means are:
from sklearn.mixture import GaussianMixture
model = GaussianMixture(n_components=3, random_state=42)
model.fit(X)
print(model.means_)
Here, the estimated means are close to the values used to generate the synthetic data:
[[115.85580935]
[ 25.33925571]
[ 65.35465989]]
Finally, the .predict_proba() method estimates the probability that each value belongs to each cluster:
>>> np.round(model.predict_proba(X), 3)
array([[0. , 0.962, 0.038],
[0.002, 0.035, 0.963],
[0.989, 0. , 0.011],
...,
[0. , 0.844, 0.156],
[0.88 , 0. , 0.12 ],
[0.993, 0. , 0.007]])

Index array with the result of .nonzero()

I am having difficulty selecting rows using two conditions in NumPy. The following code does not return the intended output:
tot_length=0.3
steps=0.1
start_val=0.0
list_no =np.arange(start_val, tot_length, steps)
x, y, z = np.meshgrid(*[list_no for _ in range(3)], sparse=True)
a = ((x>=y) & (y>=z)).nonzero() # this maybe the problem
output
(array([0, 0, 0, 1, 1, 1, 1, 2, 2, 2]), array([0, 1, 2, 1, 1, 2, 2, 2, 2, 2]), array([0, 0, 0, 0, 1, 0, 1, 0, 1, 2]))
whereas, the intended output
[[0. 0. 0. ]
[0.1 0. 0. ]
[0.1 0.1 0. ]
[0.1 0.1 0.1]
[0.2 0. 0. ]
[0.2 0.1 0. ]
[0.2 0.1 0.1]
[0.2 0.2 0. ]
[0.2 0.2 0.1]
[0.2 0.2 0.2]]
ndarray.nonzero, as well as np.where, returns a tuple of index arrays, one per axis. This makes it easy to unpack those indices into separate arrays, which can then be used to index along a given axis. Stacking them into a 2D array is simple, though: build a new array from the tuple and transpose it:
ix = np.array(((x>=y) & (y>=z)).nonzero()).T
Then you can easily use the array of indices to index list_no as:
list_no[ix]
array([[0. , 0. , 0. ],
[0. , 0.1, 0. ],
[0. , 0.2, 0. ],
[0.1, 0.1, 0. ],
[0.1, 0.1, 0.1],
[0.1, 0.2, 0. ],
[0.1, 0.2, 0.1],
[0.2, 0.2, 0. ],
[0.2, 0.2, 0.1],
[0.2, 0.2, 0.2]])
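Note that the columns above come out in (y, x, z) order, because np.meshgrid defaults to indexing="xy", which swaps the first two axes. A minimal sketch that reproduces the row order from the question, assuming it is acceptable to pass indexing="ij" (this argument is not in the original code), could look like this:

```python
import numpy as np

tot_length = 0.3
steps = 0.1
start_val = 0.0
list_no = np.arange(start_val, tot_length, steps)

# indexing="ij" keeps the result axes in (x, y, z) order (assumption, see lead-in).
x, y, z = np.meshgrid(list_no, list_no, list_no, sparse=True, indexing="ij")

# np.argwhere returns one row of indices per True element, like nonzero() transposed.
ix = np.argwhere((x >= y) & (y >= z))
triples = list_no[ix]
print(triples)
```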

Using for loop to replace values in a matrix but only the last replaced value is kept

x_n = np.arange(0, 1.0, 0.25)
u_m = np.arange(0, 1.0, 0.5)
for x in range(len(x_n)):
    for u in range(len(u_m)):
        zeros_array = np.zeros( (len(x_n), len(u_m)) )
        zeros_array[x,u] = x_n[x] - u_m[u]
zeros_array
#result
array([[ 0. , 0. ],
[ 0. , 0. ],
[ 0. , 0. ],
[ 0. , 0.25]])
Only the last replaced value is kept. I want to know how to keep all the replaced values.
You're initializing a new zeros_array on every iteration of the loop, so when the loop ends only the value written in the last iteration survives. To fix this, define zeros_array once, outside the loop, and keep updating it inside:
x_n = np.arange(0, 1.0, 0.25)
u_m = np.arange(0, 1.0, 0.5)
zeros_array = np.zeros((len(x_n), len(u_m)))
for x in range(len(x_n)):
    for u in range(len(u_m)):
        zeros_array[x, u] = x_n[x] - u_m[u]
print(zeros_array)
Output:
[[ 0. -0.5 ]
[ 0.25 -0.25]
[ 0.5 0. ]
[ 0.75 0.25]]
The initialization of zeros_array is inside the loop, so the array is reset on every iteration.
Do this instead:
zeros_array = np.zeros((len(x_n),len(u_m)))
for x in range(len(x_n)):
    for u in range(len(u_m)):
        zeros_array[x,u] = x_n[x] - u_m[u]
Output:
array([[ 0. , -0.5 ],
[ 0.25, -0.25],
[ 0.5 , 0. ],
[ 0.75, 0.25]])
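As a side note, the same table of differences can be built without an explicit loop. This is a sketch using NumPy broadcasting rather than the approach in the answers above; the name diff is chosen here for illustration:

```python
import numpy as np

x_n = np.arange(0, 1.0, 0.25)
u_m = np.arange(0, 1.0, 0.5)

# A (4, 1) column minus a (1, 2) row broadcasts to the full (4, 2) table,
# so diff[x, u] == x_n[x] - u_m[u] for every pair of indices.
diff = x_n[:, None] - u_m[None, :]
print(diff)
```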

Flip non-zero values along each row of a lower triangular numpy array

I have a lower triangular array, like B:
B = np.array([[1,0,0,0],[.25,.75,0,0], [.1,.2,.7,0],[.2,.3,.4,.1]])
>>> B
array([[ 1. , 0. , 0. , 0. ],
[ 0.25, 0.75, 0. , 0. ],
[ 0.1 , 0.2 , 0.7 , 0. ],
[ 0.2 , 0.3 , 0.4 , 0.1 ]])
I want to flip it to look like:
array([[ 1. , 0. , 0. , 0. ],
[ 0.75, 0.25, 0. , 0. ],
[ 0.7 , 0.2 , 0.1 , 0. ],
[ 0.1 , 0.4 , 0.3 , 0.2 ]])
That is, I want to take all the positive values, and reverse within the positive values, leaving the trailing zeros in place. This is not what fliplr does:
>>> np.fliplr(B)
array([[ 0. , 0. , 0. , 1. ],
[ 0. , 0. , 0.75, 0.25],
[ 0. , 0.7 , 0.2 , 0.1 ],
[ 0.1 , 0.4 , 0.3 , 0.2 ]])
Any tips? Also, the actual array I am working with would be something like B.shape = (200,20,4,4) instead of (4,4). Each (4,4) block looks like the above example (with different numbers across the 200, 20 different entries).
How about this:
# row, column indices of the lower triangle of B
r, c = np.tril_indices_from(B)
# flip the column indices by subtracting them from r, which is equal to the number
# of nonzero elements in each row minus one
B[r, c] = B[r, r - c]
print(repr(B))
# array([[ 1. , 0. , 0. , 0. ],
# [ 0.75, 0.25, 0. , 0. ],
# [ 0.7 , 0.2 , 0.1 , 0. ],
# [ 0.1 , 0.4 , 0.3 , 0.2 ]])
The same approach will generalize to any arbitrary N-dimensional array that consists of multiple lower triangular submatrices:
# creates a (200, 20, 4, 4) array consisting of tiled copies of B
B2 = np.tile(B[None, None, ...], (200, 20, 1, 1))
print(repr(B2[100, 10]))
# array([[ 1. , 0. , 0. , 0. ],
# [ 0.25, 0.75, 0. , 0. ],
# [ 0.1 , 0.2 , 0.7 , 0. ],
# [ 0.2 , 0.3 , 0.4 , 0.1 ]])
r, c = np.tril_indices_from(B2[0, 0])
B2[:, :, r, c] = B2[:, :, r, r - c]
print(repr(B2[100, 10]))
# array([[ 1. , 0. , 0. , 0. ],
# [ 0.75, 0.25, 0. , 0. ],
# [ 0.7 , 0.2 , 0.1 , 0. ],
# [ 0.1 , 0.4 , 0.3 , 0.2 ]])
For an upper triangular matrix you could simply subtract r from c instead, which reverses the nonzero values along each column, e.g.:
r, c = np.triu_indices_from(B.T)
B.T[r, c] = B.T[c - r, c]
Here's one approach for a 2D array case -
mask = np.tril(np.ones((4,4),dtype=bool))
out = np.zeros_like(B)
out[mask] = B[:,::-1][mask[:,::-1]]
You can extend it to a 3D array case using the same 2D mask by masking the last two axes with it, like so -
out = np.zeros_like(B)
out[:,mask] = B[:,:,::-1][:,mask[:,::-1]]
.. and similarly for a 4D array case, like so -
out = np.zeros_like(B)
out[:,:,mask] = B[:,:,:,::-1][:,:,mask[:,::-1]]
As one can see, we are keeping the masking process to the last two axes of (4,4) and the solution basically stays the same.
Sample run -
In [95]: B
Out[95]:
array([[ 1. , 0. , 0. , 0. ],
[ 0.25, 0.75, 0. , 0. ],
[ 0.1 , 0.2 , 0.7 , 0. ],
[ 0.2 , 0.3 , 0.4 , 0.1 ]])
In [96]: mask = np.tril(np.ones((4,4),dtype=bool))
...: out = np.zeros_like(B)
...: out[mask] = B[:,::-1][mask[:,::-1]]
...:
In [97]: out
Out[97]:
array([[ 1. , 0. , 0. , 0. ],
[ 0.75, 0.25, 0. , 0. ],
[ 0.7 , 0.2 , 0.1 , 0. ],
[ 0.1 , 0.4 , 0.3 , 0.2 ]])
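As a sanity check, the index-arithmetic approach and the mask approach can be compared on a batch of random lower triangular blocks. This is a sketch under assumptions: the batch shape (5, 3, 4, 4) and the seed are arbitrary, and an Ellipsis is used in place of the explicit leading slices so the same lines work for any number of leading axes:

```python
import numpy as np

rng = np.random.default_rng(0)
# Random batch of lower triangular 4x4 blocks (np.tril acts on the last two axes).
B = np.tril(rng.random((5, 3, 4, 4)))

# Approach 1: lower-triangle indices, with the column index reflected within each row.
r, c = np.tril_indices(4)
out1 = B.copy()
out1[..., r, c] = B[..., r, r - c]

# Approach 2: boolean mask on the last two axes of the left-right flipped array.
mask = np.tril(np.ones((4, 4), dtype=bool))
out2 = np.zeros_like(B)
out2[..., mask] = B[..., ::-1][..., mask[:, ::-1]]

same = np.array_equal(out1, out2)
print(same)
```

Working on a copy in the first approach leaves B untouched, so both results are computed from the same input.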
