create one-hot encoding for values of histogram bins

Given the tensor below of size torch.Size([22])
tensor([-20.1659, -19.7022, -17.4124, -16.7115, -16.4696, -15.6848, -15.5201, -14.5384, -12.5017, -12.4227, -11.0946, -10.7844, -10.5467, -9.3933, -4.2351, -4.0521, -3.8844, -3.8668, -3.7337, -3.7002, -3.6242, -3.5820])
and the histogram below:
hist = torch.histogram(tensor, 5)
hist
torch.return_types.histogram(
hist=tensor([3., 5., 5., 1., 8.]),
bin_edges=tensor([-20.1659, -16.8491, -13.5323, -10.2156, -6.8988, -3.5820]))
For each value of the tensor, how can I create a one-hot encoding that corresponds to its bin number, so that the output is a tensor of size torch.Size([22, 5])?

You can use torch.repeat_interleave
import torch
bins = torch.tensor([3, 5, 5, 1, 8])
one_hots = torch.eye(len(bins))
one_hots = torch.repeat_interleave(one_hots, bins, dim=0)
print(one_hots)
output
tensor([[1., 0., 0., 0., 0.],
[1., 0., 0., 0., 0.],
[1., 0., 0., 0., 0.],
[0., 1., 0., 0., 0.],
[0., 1., 0., 0., 0.],
[0., 1., 0., 0., 0.],
[0., 1., 0., 0., 0.],
[0., 1., 0., 0., 0.],
[0., 0., 1., 0., 0.],
[0., 0., 1., 0., 0.],
[0., 0., 1., 0., 0.],
[0., 0., 1., 0., 0.],
[0., 0., 1., 0., 0.],
[0., 0., 0., 1., 0.],
[0., 0., 0., 0., 1.],
[0., 0., 0., 0., 1.],
[0., 0., 0., 0., 1.],
[0., 0., 0., 0., 1.],
[0., 0., 0., 0., 1.],
[0., 0., 0., 0., 1.],
[0., 0., 0., 0., 1.],
[0., 0., 0., 0., 1.]])
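Note that the repeat_interleave approach above lines up with the input only because the values in the question happen to be sorted in ascending order. A minimal sketch of an order-independent alternative, assuming torch.bucketize and torch.nn.functional.one_hot are available in your PyTorch version (the names values and bin_idx are mine, not from the question):

```python
import torch
import torch.nn.functional as F

values = torch.tensor([-20.1659, -19.7022, -17.4124, -16.7115, -16.4696,
                       -15.6848, -15.5201, -14.5384, -12.5017, -12.4227,
                       -11.0946, -10.7844, -10.5467, -9.3933, -4.2351,
                       -4.0521, -3.8844, -3.8668, -3.7337, -3.7002,
                       -3.6242, -3.5820])
hist = torch.histogram(values, 5)

# Use only the interior edges so that indices run from 0 to 4.
# right=True sends a value equal to an edge into the bin on its right,
# which matches half-open [left, right) bins with a closed last bin.
bin_idx = torch.bucketize(values, hist.bin_edges[1:-1], right=True)

# One row per value, one column per bin.
one_hots = F.one_hot(bin_idx, num_classes=5)
print(one_hots.shape)  # torch.Size([22, 5])
```

Summing one_hots over dim 0 should reproduce the bin counts in hist.hist, which is a quick sanity check.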

Related

How does dim argument of "Tensor.scatter_" method in PyTorch work?

Could anyone explain why the code below uses dim=1 in the scatter_ method? The code performs one-hot encoding. After reading the example in the PyTorch documentation, I thought I should use dim=0 to get the desired result, but it turns out that dim=1 is the correct choice.
>>> target = torch.tensor([3, 5, 0, 2, 7, 5])
>>> target
tensor([3, 5, 0, 2, 7, 5])
>>> onehot = torch.zeros(target.shape[0], 8)
>>> onehot.scatter_(1, target.unsqueeze(1), 1.0)
tensor([[0., 0., 0., 1., 0., 0., 0., 0.],
[0., 0., 0., 0., 0., 1., 0., 0.],
[1., 0., 0., 0., 0., 0., 0., 0.],
[0., 0., 1., 0., 0., 0., 0., 0.],
[0., 0., 0., 0., 0., 0., 0., 1.],
[0., 0., 0., 0., 0., 1., 0., 0.]])

You are applying scatter_ to the zero tensor onehot, shaped (len(target), 8), along dim=1, using target.unsqueeze(1) as the index and 1. as the value. Writing target for the unsqueezed index, this has the following effect on onehot:
onehot[i][target[i][j]] = 1.
This means that for every row of target it looks at the single value in that row (j is always equal to 0, because the unsqueezed target has only one column) and uses it to index the 2nd axis of onehot. In other words, for every row, it takes the value from target to position the 1. among the columns of onehot.
Step by step illustration would be:
>>> for i in range(len(target)):
...     k = target[i]  # k depends on the values of target, i.e. dim=1
...     onehot[i, k] = 1
...     print(onehot)
tensor([[0., 0., 0., 1., 0., 0., 0., 0.], # i=0; k=3
[0., 0., 0., 0., 0., 0., 0., 0.],
[0., 0., 0., 0., 0., 0., 0., 0.],
[0., 0., 0., 0., 0., 0., 0., 0.],
[0., 0., 0., 0., 0., 0., 0., 0.],
[0., 0., 0., 0., 0., 0., 0., 0.]])
tensor([[0., 0., 0., 1., 0., 0., 0., 0.],
[0., 0., 0., 0., 0., 1., 0., 0.], # i=1; k=5
[0., 0., 0., 0., 0., 0., 0., 0.],
[0., 0., 0., 0., 0., 0., 0., 0.],
[0., 0., 0., 0., 0., 0., 0., 0.],
[0., 0., 0., 0., 0., 0., 0., 0.]])
tensor([[0., 0., 0., 1., 0., 0., 0., 0.],
[0., 0., 0., 0., 0., 1., 0., 0.],
[1., 0., 0., 0., 0., 0., 0., 0.], # i=2; k=0
[0., 0., 0., 0., 0., 0., 0., 0.],
[0., 0., 0., 0., 0., 0., 0., 0.],
[0., 0., 0., 0., 0., 0., 0., 0.]])
tensor([[0., 0., 0., 1., 0., 0., 0., 0.],
[0., 0., 0., 0., 0., 1., 0., 0.],
[1., 0., 0., 0., 0., 0., 0., 0.],
[0., 0., 1., 0., 0., 0., 0., 0.], # i=3; k=2
[0., 0., 0., 0., 0., 0., 0., 0.],
[0., 0., 0., 0., 0., 0., 0., 0.]])
tensor([[0., 0., 0., 1., 0., 0., 0., 0.],
[0., 0., 0., 0., 0., 1., 0., 0.],
[1., 0., 0., 0., 0., 0., 0., 0.],
[0., 0., 1., 0., 0., 0., 0., 0.],
[0., 0., 0., 0., 0., 0., 0., 1.], # i=4; k=7
[0., 0., 0., 0., 0., 0., 0., 0.]])
tensor([[0., 0., 0., 1., 0., 0., 0., 0.],
[0., 0., 0., 0., 0., 1., 0., 0.],
[1., 0., 0., 0., 0., 0., 0., 0.],
[0., 0., 1., 0., 0., 0., 0., 0.],
[0., 0., 0., 0., 0., 0., 0., 1.],
[0., 0., 0., 0., 0., 1., 0., 0.]]) # i=5; k=5
Notice that onehot.scatter_(0, target.unsqueeze(1), 1.0) would have produced:
onehot[target[i][j]][j] = 1.
Which is a valid operation only if you initialize onehot the other way around:
>>> onehot = torch.zeros(8, len(target))
>>> onehot.scatter_(0, target.unsqueeze(1), 1.)
tensor([[1., 0., 0., 0., 0., 0.],
[0., 0., 0., 0., 0., 0.],
[1., 0., 0., 0., 0., 0.],
[1., 0., 0., 0., 0., 0.],
[0., 0., 0., 0., 0., 0.],
[1., 0., 0., 0., 0., 0.],
[0., 0., 0., 0., 0., 0.],
[1., 0., 0., 0., 0., 0.]])
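Because the index has shape (6, 1), only column 0 is written here, so this is not yet the transpose of the other matrix. To get the transpose, use target.unsqueeze(0) as the index instead.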
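To check how dim and the index shape interact, one can build the one-hot matrix in both layouts and compare them. A minimal sketch using the same target; the names by_rows and by_cols are illustrative and not from the original answer. Note that the index is unsqueezed along dim 0 for the column layout:

```python
import torch

target = torch.tensor([3, 5, 0, 2, 7, 5])

# One sample per row: the index has shape (6, 1) and we scatter along dim=1,
# i.e. by_rows[i][index[i][0]] = 1.
by_rows = torch.zeros(len(target), 8)
by_rows.scatter_(1, target.unsqueeze(1), 1.0)

# One sample per column: the index has shape (1, 6) and we scatter along dim=0,
# i.e. by_cols[index[0][j]][j] = 1.
by_cols = torch.zeros(8, len(target))
by_cols.scatter_(0, target.unsqueeze(0), 1.0)

print(torch.equal(by_rows.T, by_cols))  # True
```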

numpy reshape tensor into matrix

I would like to know if the following is correct.
I would like to do the dyadic product between two identity tensors.
import numpy as np
a = np.tensordot(np.eye(3),np.eye(3),axes=0)
The output is:
array([[[[1., 0., 0.],
[0., 1., 0.],
[0., 0., 1.]],
[[0., 0., 0.],
[0., 0., 0.],
[0., 0., 0.]],
[[0., 0., 0.],
[0., 0., 0.],
[0., 0., 0.]]],
[[[0., 0., 0.],
[0., 0., 0.],
[0., 0., 0.]],
[[1., 0., 0.],
[0., 1., 0.],
[0., 0., 1.]],
[[0., 0., 0.],
[0., 0., 0.],
[0., 0., 0.]]],
[[[0., 0., 0.],
[0., 0., 0.],
[0., 0., 0.]],
[[0., 0., 0.],
[0., 0., 0.],
[0., 0., 0.]],
[[1., 0., 0.],
[0., 1., 0.],
[0., 0., 1.]]]])
Which I assume is calculated as: https://ibb.co/5nvG2Nm
I would like to "flatten" the tensor into a matrix:
If I do:
np.tensordot(np.eye(3),np.eye(3),axes=0).reshape(9,9)
array([[1., 0., 0., 0., 1., 0., 0., 0., 1.],
[0., 0., 0., 0., 0., 0., 0., 0., 0.],
[0., 0., 0., 0., 0., 0., 0., 0., 0.],
[0., 0., 0., 0., 0., 0., 0., 0., 0.],
[1., 0., 0., 0., 1., 0., 0., 0., 1.],
[0., 0., 0., 0., 0., 0., 0., 0., 0.],
[0., 0., 0., 0., 0., 0., 0., 0., 0.],
[0., 0., 0., 0., 0., 0., 0., 0., 0.],
[1., 0., 0., 0., 1., 0., 0., 0., 1.]])
But I was hoping to get:
array([[1., 0., 0., 0., 0., 0., 0., 0., 0.],
[0., 1., 0., 0., 0., 0., 0., 0., 0.],
[0., 0., 1., 0., 0., 0., 0., 0., 0.],
[0., 0., 0., 1., 0., 0., 0., 0., 0.],
[0., 0., 0., 0., 1., 0., 0., 0., 0.],
[0., 0., 0., 0., 0., 1., 0., 0., 0.],
[0., 0., 0., 0., 0., 0., 1., 0., 0.],
[0., 0., 0., 0., 0., 0., 0., 1., 0.],
[0., 0., 0., 0., 0., 0., 0., 0., 1.]])
Is my understanding of the tensor structure wrong? Or is the reshape function not reshaping correctly?
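One way to see what is going on: the 4-D result is a[i, j, k, l] = I[i, j] * I[k, l], and reshape(9, 9) groups (i, j) into rows and (k, l) into columns, so ones land only where i == j and k == l (rows and columns 0, 4 and 8). A sketch of one possible fix, assuming the desired matrix pairs (i, k) as rows and (j, l) as columns, is to swap the two middle axes before reshaping; np.kron should give the same matrix directly:

```python
import numpy as np

a = np.tensordot(np.eye(3), np.eye(3), axes=0)  # shape (3, 3, 3, 3)

# Swap axes 1 and 2 so that rows are indexed by (i, k) and columns by (j, l).
flat = a.transpose(0, 2, 1, 3).reshape(9, 9)

# The Kronecker product builds the same 9x9 matrix without the 4-D detour.
kron = np.kron(np.eye(3), np.eye(3))

print(np.array_equal(flat, np.eye(9)))  # True
print(np.array_equal(kron, np.eye(9)))  # True
```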

How to sort a one hot tensor according to a tensor of indices

Given the below tensor:
tensor = torch.Tensor([[1., 0., 0., 0., 0.],
[0., 1., 0., 0., 0.],
[0., 0., 1., 0., 0.],
[0., 0., 0., 0., 1.],
[1., 0., 0., 0., 0.],
[1., 0., 0., 0., 0.],
[0., 0., 0., 1., 0.],
[0., 0., 0., 0., 1.]])
and below is the tensor containing the indices:
indices = torch.tensor([2, 6, 7, 5, 4, 0, 3, 1])
How can I sort tensor using the values inside of indices?
Trying with sorted gives the error:
TypeError: 'Tensor' object is not callable
While numpy.sort gives:
ValueError: Cannot specify order when the array has no fields.

You can use the indices like this:
tensor = torch.Tensor([[1., 0., 0., 0., 0.],
[0., 1., 0., 0., 0.],
[0., 0., 1., 0., 0.],
[0., 0., 0., 0., 1.],
[1., 0., 0., 0., 0.],
[1., 0., 0., 0., 0.],
[0., 0., 0., 1., 0.],
[0., 0., 0., 0., 1.]])
indices = torch.tensor([2, 6, 7, 5, 4, 0, 3, 1])
sorted_tensor = tensor[indices]
print(sorted_tensor)
# output
tensor([[0., 0., 1., 0., 0.],
[0., 0., 0., 1., 0.],
[0., 0., 0., 0., 1.],
[1., 0., 0., 0., 0.],
[1., 0., 0., 0., 0.],
[1., 0., 0., 0., 0.],
[0., 0., 0., 0., 1.],
[0., 1., 0., 0., 0.]])
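Indexing with a tensor of indices gathers rows, so row k of the result is tensor[indices[k]]. A small sketch of two related operations, assuming indices is a permutation of the row numbers: torch.index_select as an equivalent spelling, and torch.argsort to undo the reordering (the names same and restored are mine):

```python
import torch

tensor = torch.tensor([[1., 0., 0., 0., 0.],
                       [0., 1., 0., 0., 0.],
                       [0., 0., 1., 0., 0.],
                       [0., 0., 0., 0., 1.],
                       [1., 0., 0., 0., 0.],
                       [1., 0., 0., 0., 0.],
                       [0., 0., 0., 1., 0.],
                       [0., 0., 0., 0., 1.]])
indices = torch.tensor([2, 6, 7, 5, 4, 0, 3, 1])

sorted_tensor = tensor[indices]

# Equivalent gather along dim 0.
same = torch.index_select(tensor, 0, indices)

# argsort of a permutation is its inverse, so this recovers the original order.
restored = sorted_tensor[torch.argsort(indices)]

print(torch.equal(sorted_tensor, same))  # True
print(torch.equal(restored, tensor))     # True
```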

pytorch: how to apply function over all cells of 4-d tensor

I'm trying to apply a function over a 4-D tensor (I think about it as a 2-D matrix with a 2-D matrix in each cell) with the following dimensions: [N x N x N x N].
The apply function returns a [1 x N] tensor, so after applying it I expect a tensor with the following dimensions: [N x N x 1 x N].
Example: let's define a [4 x 4 x 4 x 4] tensor:
tensor_4d = torch.tensor([[[[1., 0., 0., 0.],
[0., 0., 0., 0.],
[0., 0., 0., 0.],
[0., 0., 0., 0.]],
[[1., 0., 0., 0.],
[1., 0., 0., 0.],
[0., 0., 0., 0.],
[0., 0., 0., 0.]],
[[1., 0., 0., 0.],
[0., 0., 0., 0.],
[0., 0., 0., 0.],
[0., 0., 0., 0.]],
[[1., 0., 0., 0.],
[0., 0., 0., 0.],
[0., 0., 0., 0.],
[0., 0., 0., 0.]]],
[[[0., 0., 0., 0.],
[0., 1., 0., 0.],
[0., 0., 0., 0.],
[0., 0., 0., 0.]],
[[0., 0., 0., 0.],
[0., 1., 0., 0.],
[0., 0., 0., 0.],
[0., 0., 0., 0.]],
[[0., 0., 0., 0.],
[0., 1., 0., 0.],
[0., 0., 0., 0.],
[0., 0., 0., 0.]],
[[0., 0., 0., 0.],
[0., 1., 0., 0.],
[0., 0., 0., 0.],
[0., 0., 0., 0.]]],
[[[0., 0., 1., 0.],
[0., 0., 0., 0.],
[0., 0., 1., 0.],
[0., 0., 0., 0.]],
[[0., 0., 0., 0.],
[0., 0., 1., 0.],
[0., 0., 1., 0.],
[0., 0., 0., 0.]],
[[0., 0., 0., 0.],
[0., 0., 0., 0.],
[0., 0., 1., 0.],
[0., 0., 0., 0.]],
[[0., 0., 0., 0.],
[0., 0., 0., 0.],
[0., 0., 1., 0.],
[0., 0., 0., 0.]]],
[[[0., 0., 1., 0.],
[0., 0., 0., 0.],
[0., 0., 0., 1.],
[0., 0., 0., 1.]],
[[0., 0., 0., 0.],
[0., 0., 1., 0.],
[0., 0., 0., 1.],
[0., 0., 0., 1.]],
[[0., 0., 0., 0.],
[0., 0., 0., 0.],
[0., 0., 0., 1.],
[0., 0., 0., 1.]],
[[0., 0., 0., 0.],
[0., 0., 0., 0.],
[0., 0., 0., 0.],
[0., 0., 0., 1.]]]], dtype=torch.float64)
Let's look at tensor_4d at [3][0]:
tensor_4d[3][0]
tensor([[0., 0., 1., 0.],
[0., 0., 0., 0.],
[0., 0., 0., 1.],
[0., 0., 0., 1.]], dtype=torch.float64)
This is my apply function:
def apply_function(tensor_2d):
    # Eigen-decompose one 2-D cell and return the third eigenvector (column 2)
    eigenvalues, eigenvectors = torch.eig(input=tensor_2d, eigenvectors=True)
    return eigenvectors[:, 2]
and this is the result of the apply function:
apply_function(tensor_4d[3][0])
tensor([-1.0000e+00, 0.0000e+00, 4.0083e-292, 0.0000e+00],
dtype=torch.float64)
So apply_function works on a single cell.
Next, I tried to call apply_function on the whole tensor, expecting each cell to contain the result of applying apply_function to that cell. However, I get the following error:
apply_function(tensor_4d)
Traceback (most recent call last):
File "C:\Program Files\JetBrains\PyCharm Community Edition 2019.2.3\helpers\pydev\_pydevd_bundle\pydevd_exec2.py", line 3, in Exec
exec(exp, global_vars, local_vars)
File "<input>", line 1, in <module>
File "C:/Users/LiavB/PycharmProjects/RBC/Components/RBC_torch.py", line 41, in apply_function
eigenvalues, eigenvectors = torch.eig(input=tensor_2d, eigenvectors=True)
RuntimeError: invalid argument 1: A should be 2 dimensional at ..\aten\src\TH/generic/THTensorLapack.cpp:206

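Let's try:
new_shape = (-1,) + tuple(tensor_4d.shape[2:])
# Flatten the first two dims, apply the function to each 2-D cell, and stack the results row by row
out = torch.stack([apply_function(t) for t in tensor_4d.view(new_shape)], dim=0)
# Restore the two leading dims to get [N x N x 1 x N]
out = out.reshape(*tensor_4d.shape[:2], 1, -1)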
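For what it is worth, torch.eig has since been deprecated, and newer PyTorch releases provide torch.linalg.eig, which accepts batched input (the last two dimensions are treated as the matrix), so the loop may not be needed at all. A minimal sketch, assuming a PyTorch version that has torch.linalg.eig and using random data as a hypothetical stand-in for tensor_4d. Note that it returns complex tensors, and the ordering of the eigenvectors is not guaranteed to match what torch.eig produced:

```python
import torch

torch.manual_seed(0)
# Hypothetical stand-in for tensor_4d: random values instead of the 0/1 data above.
batch = torch.rand(4, 4, 4, 4, dtype=torch.float64)

# The last two dims are the matrices; the first two are batch dims.
eigenvalues, eigenvectors = torch.linalg.eig(batch)

# Column 2 of every eigenvector matrix, kept as an [N x N x 1 x N] tensor.
out = eigenvectors[..., :, 2].unsqueeze(2)
print(out.shape)  # torch.Size([4, 4, 1, 4])
```

If real-valued output is needed, whether out.real is meaningful depends on the matrices involved, so treat that step as something to verify on your own data.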

How to update value in ndarray by index that is inside another array [duplicate]

For example, I have a 10x7 ndarray of zeros, x=np.zeros( (10,7) )
array([[0., 0., 0., 0., 0., 0., 0.],
[0., 0., 0., 0., 0., 0., 0.],
[0., 0., 0., 0., 0., 0., 0.],
[0., 0., 0., 0., 0., 0., 0.],
[0., 0., 0., 0., 0., 0., 0.],
[0., 0., 0., 0., 0., 0., 0.],
[0., 0., 0., 0., 0., 0., 0.],
[0., 0., 0., 0., 0., 0., 0.],
[0., 0., 0., 0., 0., 0., 0.],
[0., 0., 0., 0., 0., 0., 0.]])
and I want to randomly assign a single 1 in each row. Say I create another array of shape (10,1) whose values are between 0 and 6: r=np.random.randint(0, 7, (10,1))
array([[6],
[2],
[5],
[1],
[2],
[4],
[6],
[3],
[0],
[1]])
I want r to give, for each row, the column of the element to set to 1, i.e. x[0,6], x[1,2], x[2,5], x[3,1], etc., so x should become something like
array([[0., 0., 0., 0., 0., 0., 1.],
[0., 0., 1., 0., 0., 0., 0.],
[0., 0., 0., 0., 0., 1., 0.],
[0., 1., 0., 0., 0., 0., 0.],
[0., 0., 1., 0., 0., 0., 0.],
[0., 0., 0., 0., 1., 0., 0.],
[0., 0., 0., 0., 0., 0., 1.],
[0., 0., 0., 1., 0., 0., 0.],
[1., 0., 0., 0., 0., 0., 0.],
[0., 1., 0., 0., 0., 0., 0.]])
How to do it efficiently?

Use a one dimensional array for r and use it as the column index. For row indexes you can simply use a range:
In [25]: r=np.random.randint(0, 7, 10)
In [26]: x=np.zeros( (10,7) )
In [27]: x[np.arange(10), r] = 1
In [28]: x
Out[28]:
array([[0., 1., 0., 0., 0., 0., 0.],
[0., 0., 0., 0., 0., 0., 1.],
[0., 0., 0., 0., 0., 0., 1.],
[0., 1., 0., 0., 0., 0., 0.],
[0., 0., 1., 0., 0., 0., 0.],
[0., 0., 0., 1., 0., 0., 0.],
[0., 0., 0., 0., 1., 0., 0.],
[0., 0., 1., 0., 0., 0., 0.],
[0., 0., 1., 0., 0., 0., 0.],
[0., 1., 0., 0., 0., 0., 0.]])
