Keras one-hot-encoder - python

I have an array and use the to_categorical function in Keras:
labels = np.array([1,7,7,1,7])
keras.utils.to_categorical(labels)
I get this response:
array([[0., 1., 0., 0., 0., 0., 0., 0.],
[0., 0., 0., 0., 0., 0., 0., 1.],
[0., 0., 0., 0., 0., 0., 0., 1.],
[0., 1., 0., 0., 0., 0., 0., 0.],
[0., 0., 0., 0., 0., 0., 0., 1.]], dtype=float32)
How can I get only two columns? One for the 1 and one for the 7.
This is a possible way, but not a very good one:
labels = keras.utils.to_categorical(labels)
labels = np.delete(labels, np.s_[0:1], axis=1)
np.delete(labels, np.s_[1:6], axis=1)
that gives:
array([[1., 0.],
[0., 1.],
[0., 1.],
[1., 0.],
[0., 1.]], dtype=float32)
Is there a better way to achieve this? Preferably by some "hidden" function in Keras utils or similar?

IIUC, you can just index your array by any column that has a value:
cat = keras.utils.to_categorical(labels)
>>> cat
array([[0., 1., 0., 0., 0., 0., 0., 0.],
[0., 0., 0., 0., 0., 0., 0., 1.],
[0., 0., 0., 0., 0., 0., 0., 1.],
[0., 1., 0., 0., 0., 0., 0., 0.],
[0., 0., 0., 0., 0., 0., 0., 1.]])
# Select only the columns that contain at least one nonzero value:
>>> cat[:,cat.any(0)]
array([[1., 0.],
[0., 1.],
[0., 1.],
[1., 0.],
[0., 1.]])
You could also use pandas:
import pandas as pd
cat = pd.get_dummies(labels).values
>>> cat
array([[1, 0],
[0, 1],
[0, 1],
[1, 0],
[0, 1]], dtype=uint8)
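One caveat, depending on your pandas version: get_dummies returns uint8 (or bool in newer releases), so if your loss function expects floats you may want to cast, for example:
cat = pd.get_dummies(labels).to_numpy(dtype="float32")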

Use np.unique with the return_inverse flag -
# Get unique IDs mapped to each group of elements
In [73]: unql, idx = np.unique(labels, return_inverse=True)
# Perform outer comparison for idx against range of unique groups
In [74]: (idx[:,None] == np.arange(len(unql))).astype(float)
Out[74]:
array([[1., 0.],
[0., 1.],
[0., 1.],
[1., 0.],
[0., 1.]])
Alternatively, compare directly against the unique labels -
In [96]: (labels[:,None] == np.unique(labels)).astype(float)
Out[96]:
array([[1., 0.],
[0., 1.],
[0., 1.],
[1., 0.],
[0., 1.]])
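As far as I know there is no hidden option in to_categorical itself that drops unused classes, but if you want to stay close to Keras you can remap the labels to 0..k-1 first and then encode. A minimal sketch (assuming the standalone keras package; with TensorFlow it would be tf.keras.utils.to_categorical):
import numpy as np
from keras.utils import to_categorical

labels = np.array([1, 7, 7, 1, 7])
# Map each label to its index among the sorted unique labels (0..k-1), then one-hot encode
_, idx = np.unique(labels, return_inverse=True)
cat = to_categorical(idx)   # shape (5, 2): one column per distinct label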

Related

How to sort a one hot tensor according to a tensor of indices

Given the below tensor:
tensor = torch.Tensor([[1., 0., 0., 0., 0.],
[0., 1., 0., 0., 0.],
[0., 0., 1., 0., 0.],
[0., 0., 0., 0., 1.],
[1., 0., 0., 0., 0.],
[1., 0., 0., 0., 0.],
[0., 0., 0., 1., 0.],
[0., 0., 0., 0., 1.]])
and below is the tensor containing the indices:
indices = torch.tensor([2, 6, 7, 5, 4, 0, 3, 1])
How can I sort tensor using the values inside indices?
Trying sorted gives the error:
TypeError: 'Tensor' object is not callable
while numpy.sort gives:
ValueError: Cannot specify order when the array has no fields.
You can use the indices like this:
tensor = torch.Tensor([[1., 0., 0., 0., 0.],
[0., 1., 0., 0., 0.],
[0., 0., 1., 0., 0.],
[0., 0., 0., 0., 1.],
[1., 0., 0., 0., 0.],
[1., 0., 0., 0., 0.],
[0., 0., 0., 1., 0.],
[0., 0., 0., 0., 1.]])
indices = torch.tensor([2, 6, 7, 5, 4, 0, 3, 1])
sorted_tensor = tensor[indices]
print(sorted_tensor)
# output
tensor([[0., 0., 1., 0., 0.],
[0., 0., 0., 1., 0.],
[0., 0., 0., 0., 1.],
[1., 0., 0., 0., 0.],
[1., 0., 0., 0., 0.],
[1., 0., 0., 0., 0.],
[0., 0., 0., 0., 1.],
[0., 1., 0., 0., 0.]])
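One thing to double-check: tensor[indices] gathers rows, i.e. row i of the result is tensor[indices[i]]. If indices[i] is instead meant as the destination position of row i (a scatter rather than a gather), invert the permutation first, for example:
# Invert the permutation so that row i of `tensor` ends up at position indices[i]
inverse = torch.argsort(indices)
sorted_tensor = tensor[inverse]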

create one-hot encoding for values of histogram bins

Given the tensor below of size torch.Size([22])
tensor([-20.1659, -19.7022, -17.4124, -16.7115, -16.4696, -15.6848, -15.5201, -14.5384, -12.5017, -12.4227, -11.0946, -10.7844, -10.5467, -9.3933, -4.2351, -4.0521, -3.8844, -3.8668, -3.7337, -3.7002, -3.6242, -3.5820])
and the histogram below:
hist = torch.histogram(tensor, 5)
hist
torch.return_types.histogram(
hist=tensor([3., 5., 5., 1., 8.]),
bin_edges=tensor([-20.1659, -16.8491, -13.5323, -10.2156, -6.8988, -3.5820]))
For each value of the tensor, how can I create a one-hot encoding that corresponds to its bin number, so that the output is a tensor of size torch.Size([22, 5])?
You can use torch.repeat_interleave
import torch
bins = torch.tensor([3, 5, 5, 1, 8])
one_hots = torch.eye(len(bins))
one_hots = torch.repeat_interleave(one_hots, bins, dim=0)
print(one_hots)
output
tensor([[1., 0., 0., 0., 0.],
[1., 0., 0., 0., 0.],
[1., 0., 0., 0., 0.],
[0., 1., 0., 0., 0.],
[0., 1., 0., 0., 0.],
[0., 1., 0., 0., 0.],
[0., 1., 0., 0., 0.],
[0., 1., 0., 0., 0.],
[0., 0., 1., 0., 0.],
[0., 0., 1., 0., 0.],
[0., 0., 1., 0., 0.],
[0., 0., 1., 0., 0.],
[0., 0., 1., 0., 0.],
[0., 0., 0., 1., 0.],
[0., 0., 0., 0., 1.],
[0., 0., 0., 0., 1.],
[0., 0., 0., 0., 1.],
[0., 0., 0., 0., 1.],
[0., 0., 0., 0., 1.],
[0., 0., 0., 0., 1.],
[0., 0., 0., 0., 1.],
[0., 0., 0., 0., 1.]])
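Note that this works because the values in the input tensor are already sorted, so the members of each bin are contiguous and the bin counts can simply be repeated in order. If the values were in arbitrary order, one general approach (a sketch, assuming tensor is the 22-element tensor from the question) is to map each value to its bin with torch.bucketize and one-hot encode the bin index:
import torch
import torch.nn.functional as F

hist = torch.histogram(tensor, 5)
# right=True assigns values to half-open bins [edge_i, edge_i+1); the clamp keeps
# the maximum value (which equals the last edge) inside the last bin
bin_idx = torch.bucketize(tensor, hist.bin_edges, right=True) - 1
bin_idx = bin_idx.clamp(0, hist.hist.numel() - 1)
one_hots = F.one_hot(bin_idx, num_classes=hist.hist.numel()).float()   # shape [22, 5]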

TensorFlow to_categorical problem: how do I map my masks for segmentation?

I have a problem with labels for segmentation: a label can take these values: 0, 200, 210, 220, 230, 240. I use this code:
mask = tf.keras.utils.to_categorical(y, 241)
The code works, but I want to map the mask to only 6 classes. Is this possible?
mask = tf.keras.utils.to_categorical(y,6)
One solution is to first replace your labels with ordered indices and then make them categorical, because to_categorical expects class indices.
Here is example code if you have a limited set of categories:
y = [0, 200,210,0,240,230,200,0,210,220,240,0]
replacements = {
0: 0,
200: 1,
210: 2,
220: 3,
230: 4,
240: 5,
}
y = [replacements.get(x, x) for x in y]
y = tf.keras.utils.to_categorical(y)
Or you can do it in a simpler way:
from sklearn.preprocessing import LabelEncoder
y = tf.keras.utils.to_categorical(LabelEncoder().fit_transform(y))
If you print y:
array([[1., 0., 0., 0., 0., 0.],
[0., 1., 0., 0., 0., 0.],
[0., 0., 1., 0., 0., 0.],
[1., 0., 0., 0., 0., 0.],
[0., 0., 0., 0., 0., 1.],
[0., 0., 0., 0., 1., 0.],
[0., 1., 0., 0., 0., 0.],
[1., 0., 0., 0., 0., 0.],
[0., 0., 1., 0., 0., 0.],
[0., 0., 0., 1., 0., 0.],
[0., 0., 0., 0., 0., 1.],
[1., 0., 0., 0., 0., 0.]], dtype=float32)
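If you prefer not to add a scikit-learn dependency, np.unique with return_inverse does the same remapping. A minimal sketch with the same y as above:
import numpy as np
import tensorflow as tf

y = [0, 200, 210, 0, 240, 230, 200, 0, 210, 220, 240, 0]
# return_inverse maps each label to its index among the sorted unique labels (0..5)
_, y_idx = np.unique(y, return_inverse=True)
mask = tf.keras.utils.to_categorical(y_idx, 6)
For a 2-D segmentation mask, reshape y_idx back to the mask's shape before calling to_categorical, since np.unique flattens its input.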

pytorch: how to apply function over all cells of 4-d tensor

I'm trying to apply a function over a 4-D tensor (I think about it as a 2-D matrix with a 2-D matrix in each cell) with the following dimensions: [N x N x N x N].
The apply function returns a [1 x N] tensor, so after applying it I expect a tensor with the following dimensions: [N x N x 1 x N].
Example: let's define a [4 x 4 x 4 x 4] tensor:
tensor_4d = torch.tensor([[[[1., 0., 0., 0.],
[0., 0., 0., 0.],
[0., 0., 0., 0.],
[0., 0., 0., 0.]],
[[1., 0., 0., 0.],
[1., 0., 0., 0.],
[0., 0., 0., 0.],
[0., 0., 0., 0.]],
[[1., 0., 0., 0.],
[0., 0., 0., 0.],
[0., 0., 0., 0.],
[0., 0., 0., 0.]],
[[1., 0., 0., 0.],
[0., 0., 0., 0.],
[0., 0., 0., 0.],
[0., 0., 0., 0.]]],
[[[0., 0., 0., 0.],
[0., 1., 0., 0.],
[0., 0., 0., 0.],
[0., 0., 0., 0.]],
[[0., 0., 0., 0.],
[0., 1., 0., 0.],
[0., 0., 0., 0.],
[0., 0., 0., 0.]],
[[0., 0., 0., 0.],
[0., 1., 0., 0.],
[0., 0., 0., 0.],
[0., 0., 0., 0.]],
[[0., 0., 0., 0.],
[0., 1., 0., 0.],
[0., 0., 0., 0.],
[0., 0., 0., 0.]]],
[[[0., 0., 1., 0.],
[0., 0., 0., 0.],
[0., 0., 1., 0.],
[0., 0., 0., 0.]],
[[0., 0., 0., 0.],
[0., 0., 1., 0.],
[0., 0., 1., 0.],
[0., 0., 0., 0.]],
[[0., 0., 0., 0.],
[0., 0., 0., 0.],
[0., 0., 1., 0.],
[0., 0., 0., 0.]],
[[0., 0., 0., 0.],
[0., 0., 0., 0.],
[0., 0., 1., 0.],
[0., 0., 0., 0.]]],
[[[0., 0., 1., 0.],
[0., 0., 0., 0.],
[0., 0., 0., 1.],
[0., 0., 0., 1.]],
[[0., 0., 0., 0.],
[0., 0., 1., 0.],
[0., 0., 0., 1.],
[0., 0., 0., 1.]],
[[0., 0., 0., 0.],
[0., 0., 0., 0.],
[0., 0., 0., 1.],
[0., 0., 0., 1.]],
[[0., 0., 0., 0.],
[0., 0., 0., 0.],
[0., 0., 0., 0.],
[0., 0., 0., 1.]]]], dtype=torch.float64)
Let's look at tensor_4d[3][0]:
tensor_4d[3][0]
tensor([[0., 0., 1., 0.],
[0., 0., 0., 0.],
[0., 0., 0., 1.],
[0., 0., 0., 1.]], dtype=torch.float64)
This is my apply function:
def apply_function(tensor_2d):
    eigenvalues, eigenvectors = torch.eig(input=tensor_2d, eigenvectors=True)
    return eigenvectors[:, 2]
and this is the result of the apply function:
apply_function(tensor_4d[3][0])
tensor([-1.0000e+00, 0.0000e+00, 4.0083e-292, 0.0000e+00],
dtype=torch.float64)
So apply_function works on each cell.
Next, I try to use apply_function on the whole tensor, expecting each cell to contain the result of applying the function to it. But when I do, I get the following error:
apply_function(tensor_4d)
Traceback (most recent call last):
File "C:\Program Files\JetBrains\PyCharm Community Edition 2019.2.3\helpers\pydev\_pydevd_bundle\pydevd_exec2.py", line 3, in Exec
exec(exp, global_vars, local_vars)
File "<input>", line 1, in <module>
File "C:/Users/LiavB/PycharmProjects/RBC/Components/RBC_torch.py", line 41, in apply_function
eigenvalues, eigenvectors = torch.eig(input=tensor_2d, eigenvectors=True)
RuntimeError: invalid argument 1: A should be 2 dimensional at ..\aten\src\TH/generic/THTensorLapack.cpp:206
Let's try flattening the first two dimensions, applying the function to each 2-D block, and reshaping back:
new_shape = (-1,) + tensor_4d.shape[2:]                     # [N*N, N, N]
results = [apply_function(t) for t in tensor_4d.reshape(new_shape)]
out = torch.stack(results).reshape(tensor_4d.shape[:2] + (1,) + tensor_4d.shape[3:])
# out.shape == torch.Size([4, 4, 1, 4])
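As an aside, torch.eig has since been removed from PyTorch; its replacement, torch.linalg.eig, accepts batched input, so the loop can be avoided entirely. A sketch (assuming you want the third eigenvector of every trailing 2-D block, as in apply_function, and are fine with taking the real part of the complex result):
import torch

def apply_function_batched(tensor_4d):
    # Eigendecompose every trailing [N, N] block in a single batched call
    eigenvalues, eigenvectors = torch.linalg.eig(tensor_4d)
    # Take column 2 (the third eigenvector) of each block -> [N, N, 1, N]
    return eigenvectors[..., :, 2].real.unsqueeze(-2)

out = apply_function_batched(tensor_4d)   # shape [4, 4, 1, 4]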

How to update value in ndarray by index that is inside another array [duplicate]

For example, I have a 10x7 ndarray of zeros: x = np.zeros((10, 7))
array([[0., 0., 0., 0., 0., 0., 0.],
[0., 0., 0., 0., 0., 0., 0.],
[0., 0., 0., 0., 0., 0., 0.],
[0., 0., 0., 0., 0., 0., 0.],
[0., 0., 0., 0., 0., 0., 0.],
[0., 0., 0., 0., 0., 0., 0.],
[0., 0., 0., 0., 0., 0., 0.],
[0., 0., 0., 0., 0., 0., 0.],
[0., 0., 0., 0., 0., 0., 0.],
[0., 0., 0., 0., 0., 0., 0.]])
and I want to randomly assign one '1' in each row. Say I create another array of shape (10, 1) whose values are between 0 and 6: r = np.random.randint(0, 7, (10, 1))
array([[6],
[2],
[5],
[1],
[2],
[4],
[6],
[3],
[0],
[1]])
Using r, I want to set the elements x[0,6], x[1,2], x[2,5], x[3,1], etc. to 1, so x should become something like
array([[0., 0., 0., 0., 0., 0., 1.],
[0., 0., 1., 0., 0., 0., 0.],
[0., 0., 0., 0., 0., 1., 0.],
[0., 1., 0., 0., 0., 0., 0.],
[0., 0., 1., 0., 0., 0., 0.],
[0., 0., 0., 0., 1., 0., 0.],
[0., 0., 0., 0., 0., 0., 1.],
[0., 0., 0., 1., 0., 0., 0.],
[1., 0., 0., 0., 0., 0., 0.],
[0., 1., 0., 0., 0., 0., 0.]])
How to do it efficiently?
Use a one-dimensional array for r and use it as the column index. For the row indices you can simply use a range:
In [25]: r=np.random.randint(0, 7, 10)
In [26]: x=np.zeros( (10,7) )
In [27]: x[np.arange(10), r] = 1
In [28]: x
Out[28]:
array([[0., 1., 0., 0., 0., 0., 0.],
[0., 0., 0., 0., 0., 0., 1.],
[0., 0., 0., 0., 0., 0., 1.],
[0., 1., 0., 0., 0., 0., 0.],
[0., 0., 1., 0., 0., 0., 0.],
[0., 0., 0., 1., 0., 0., 0.],
[0., 0., 0., 0., 1., 0., 0.],
[0., 0., 1., 0., 0., 0., 0.],
[0., 0., 1., 0., 0., 0., 0.],
[0., 1., 0., 0., 0., 0., 0.]])
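An equivalent way to build the array from scratch, rather than filling an existing one, is to index an identity matrix with the 1-D r:
r = np.random.randint(0, 7, 10)
x = np.eye(7)[r]   # row i is the one-hot vector for r[i]; shape (10, 7)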
