I'm trying to figure out how to do the following broadcast:
I have two tensors, of sizes (n1,N) and (n2,N)
What I want to do is multiply each row of the first tensor with each row of the second tensor, and then sum each of these multiplied rows, so that my final tensor has shape (n1, n2).
I tried this:
x1*torch.reshape(x2,(x2.size(dim=0),x2.size(dim=1),1))
But obviously this doesn't work. I can't figure out how to do this.
What you are looking for is the tensordot operation, available in both PyTorch and NumPy.
Since you want to compute the dot product along N, which is dimension 1 of both x1 and x2, you need to perform a contraction along axis 1 of both tensors by supplying ([1], [1]) to the dims argument of tensordot. This means Torch will sum products of x1 and x2 elements over the specified x1 axis 1 and x2 axis 1, respectively. The dims argument is quite confusing; there's a useful thread to help understand how to use tensordot.
x1 = torch.arange(6.).reshape(2, 3)
>>> tensor([[0., 1., 2.],
            [3., 4., 5.]])
# x1 is a tensor of shape (2, 3)
x2 = torch.arange(9.).reshape(3, 3)
>>> tensor([[0., 1., 2.],
            [3., 4., 5.],
            [6., 7., 8.]])
# x2 is a tensor of shape (3, 3)
x = torch.tensordot(x1, x2, dims=([1], [1]))
>>> tensor([[ 5., 14., 23.],
            [14., 50., 86.]])
# x is a tensor of shape (2, 3)
What you describe seems to be effectively the same as performing a matrix multiplication between the first tensor and the transpose of the second tensor. This can be done as:
torch.matmul(x1, x2.T)
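As a quick sanity check (reusing the example tensors from the tensordot answer above), the two approaches agree:

import torch

x1 = torch.arange(6.).reshape(2, 3)   # (n1, N) = (2, 3)
x2 = torch.arange(9.).reshape(3, 3)   # (n2, N) = (3, 3)

# Pairwise row dot products, result of shape (n1, n2):
assert torch.allclose(torch.matmul(x1, x2.T),
                      torch.tensordot(x1, x2, dims=([1], [1])))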
I have a problem with a numpy array.
In particular, suppose to have a matrix
x = np.array([[1., 2., 3.], [4., 5., 6.]])
with shape (2,3), I want to convert the float numbers into list so to obtain the array [[[1.], [2.], [3.]], [[4.], [5.], [6.]]] with shape (2,3,1).
I tried to convert each float number to a list (i.e., x[0][0] = [x[0][0]]) but it does not work.
Can anyone help me? Thanks
What you want is to add another dimension to your NumPy array. One way of doing it is with reshape:
x = x.reshape(2,3,1)
output:
[[[1.]
  [2.]
  [3.]]

 [[4.]
  [5.]
  [6.]]]
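If you'd rather not hard-code the dimensions, the same idea works generically; a minimal sketch, assuming only that a trailing axis of length 1 should be appended:

import numpy as np

x = np.array([[1., 2., 3.], [4., 5., 6.]])
x = x.reshape(x.shape + (1,))  # appends a length-1 axis for any input shape
print(x.shape)  # (2, 3, 1)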
There is a function in NumPy to perform exactly what @Valdi_Bo mentions. You can use np.expand_dims and add a new dimension along axis 2, as follows:
x = np.expand_dims(x, axis=2)
Refer:
np.expand_dims
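For example, a quick check of the resulting shape:

import numpy as np

x = np.array([[1., 2., 3.], [4., 5., 6.]])
x = np.expand_dims(x, axis=2)
print(x.shape)  # (2, 3, 1)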
Actually, you want to add a dimension (not level).
To do it, run:
result = x[...,np.newaxis]
Its shape is just (2, 3, 1).
Or save the result back under x.
You are trying to add a new dimension to the numpy array. There are multiple ways of doing this; as other answers mentioned, there are np.expand_dims, np.newaxis, np.reshape, etc. But I usually use the following, as I find it the most readable, especially when you are vectorizing multiple tensors and doing complex operations involving broadcasting (check this Bounty question that I solved with this method).
>>> x[:,:,None].shape
(2, 3, 1)
>>> x[None,:,None,:,None].shape
(1, 2, 1, 3, 1)
Well, maybe this is overkill for the array you have, but the most memory-efficient solution is to use np.lib.stride_tricks.as_strided. This way no data is copied (though note that reshape and np.expand_dims also return views here).
import numpy as np
x = np.array([[1., 2., 3.], [4., 5., 6.]])
newshape = x.shape[:-1] + (x.shape[-1], 1)
newstrides = x.strides + x.strides[-1:]
a = np.lib.stride_tricks.as_strided(x, shape=newshape, strides=newstrides)
results in:
array([[[1.],
        [2.],
        [3.]],

       [[4.],
        [5.],
        [6.]]])
>>> a.shape
(2, 3, 1)
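To confirm that no data is copied, you can check that the strided view shares memory with the original array (np.shares_memory is a standard NumPy utility):

>>> np.shares_memory(x, a)
True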
For quick debugging purposes, I'm trying to print out the SparseTensor I've just initialized.
The built-in print function just says it's a SparseTensor object, and tf.Print() gives an error. The error message does print the contents of the object, but not in a way that shows the actual entries (unless it's telling me it's empty; there are some :0s whose significance I don't know).
rows = tf.Print(rows, [rows])
TypeError: Failed to convert object of type <class 'tensorflow.python.framework.sparse_tensor.SparseTensor'> to Tensor. Contents: SparseTensor(indices=Tensor("SparseTensor/indices:0", shape=(6, 2), dtype=int64), values=Tensor("SparseTensor/values:0", shape=(6,), dtype=float32), dense_shape=Tensor("SparseTensor/dense_shape:0", shape=(2,), dtype=int64)). Consider casting elements to a supported type.
Way 0: Run the SparseTensor and print the result
Running the graph (in this case just the SparseTensor object) returns a SparseTensorValue object which prints in the same format as the call used to initialize the SparseTensor, which is ultimately what I wanted.
with tf.Session() as sess:
    rows = sess.run(rows)
    print(rows)
Way 1: Use Print after conversion to dense matrix
To use the Print function, I could convert to a dense matrix in my case. But Print only executes when you run the graph:
rows = tf.sparse_tensor_to_dense(rows)
rows = tf.Print(rows, [rows], summarize=100)
with tf.Session() as sess:
    sess.run(rows)
Note the "summarize" argument: the default setting just printed out zeroes, since it shows the first few entries of a sparse matrix represented in dense form!
Way 2: Use tf.test.TestCase
I found out that the TestCase.evaluate method gives me the kind of nice format I want, the same as Way 0 above:
print(str(self.evaluate(rows)))
Outputs e.g.:
SparseTensorValue(indices=array([[1, 2],
       [1, 7],
       [1, 8],
       [2, 2],
       [3, 4],
       [3, 5]]), values=array([1., 1., 1., 1., 1., 1.], dtype=float32), dense_shape=array([4, 9]))
You're seeing this error because SparseTensor is not really a Tensor; it's effectively a "meta-tensor" that wraps three dense Tensors.
Try using print() on your SparseTensor and you'll see the internal details:
SparseTensor(indices=Tensor(…), values=Tensor(…), dense_shape=Tensor(…))
You can print any of these "internal" tensors using tf.Print. For example, tf.Print(my_sparse_tensor.values, [my_sparse_tensor.values]) will succeed.
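For example, a minimal TF1-style sketch (the sparse tensor here is made up for illustration):

import tensorflow as tf

st = tf.SparseTensor(indices=[[0, 0], [1, 2]], values=[1., 2.], dense_shape=[3, 4])
# Attach the print op to one component; it fires when the graph is run:
values = tf.Print(st.values, [st.indices, st.values, st.dense_shape],
                  message="sparse components: ")
with tf.Session() as sess:
    sess.run(values)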
The SparseTensor documentation describes the internal data structure:
https://www.tensorflow.org/api_docs/python/tf/sparse/SparseTensor
TensorFlow represents a sparse tensor as three separate dense tensors: indices, values, and dense_shape. In Python, the three tensors are collected into a SparseTensor class for ease of use. If you have separate indices, values, and dense_shape tensors, wrap them in a SparseTensor object before passing to the ops below.
Concretely, the sparse tensor SparseTensor(indices, values, dense_shape) comprises the following components, where N and ndims are the number of values and number of dimensions in the SparseTensor, respectively:
indices: A 2-D int64 tensor of dense_shape [N, ndims], which specifies the indices of the elements in the sparse tensor that contain nonzero values (elements are zero-indexed). For example, indices=[[1,3], [2,4]] specifies that the elements with indexes of [1,3] and [2,4] have nonzero values.
values: A 1-D tensor of any type and dense_shape [N], which supplies the values for each element in indices. For example, given indices=[[1,3], [2,4]], the parameter values=[18, 3.6] specifies that element [1,3] of the sparse tensor has a value of 18, and element [2,4] of the tensor has a value of 3.6.
dense_shape: A 1-D int64 tensor of dense_shape [ndims], which specifies the dense_shape of the sparse tensor. Takes a list indicating the number of elements in each dimension. For example, dense_shape=[3,6] specifies a two-dimensional 3x6 tensor, dense_shape=[2,3,4] specifies a three-dimensional 2x3x4 tensor, and dense_shape=[9] specifies a one-dimensional tensor with 9 elements.
The corresponding dense tensor satisfies:
dense.shape = dense_shape
dense[tuple(indices[i])] = values[i]
By convention, indices should be sorted in row-major order (or equivalently lexicographic order on the tuples indices[i]). This is not enforced when SparseTensor objects are constructed, but most ops assume correct ordering. If the ordering of sparse tensor st is wrong, a fixed version can be obtained by calling tf.sparse_reorder(st).
Example: The sparse tensor
SparseTensor(indices=[[0, 0], [1, 2]], values=[1, 2], dense_shape=[3, 4])
represents the dense tensor:
[[1, 0, 0, 0]
[0, 0, 2, 0]
[0, 0, 0, 0]]
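You can verify this correspondence with the same TF1 API used elsewhere in this thread:

import tensorflow as tf

st = tf.SparseTensor(indices=[[0, 0], [1, 2]], values=[1, 2], dense_shape=[3, 4])
with tf.Session() as sess:
    print(sess.run(tf.sparse_tensor_to_dense(st)))
# [[1 0 0 0]
#  [0 0 2 0]
#  [0 0 0 0]]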
The function torch.nn.functional.softmax takes two parameters: input and dim. According to its documentation, the softmax operation is applied to all slices of input along the specified dim, and will rescale them so that the elements lie in the range (0, 1) and sum to 1.
Let input be:
input = torch.randn((3, 4, 5, 6))
Suppose I want the following, so that every entry of sum is 1:
sum = torch.sum(input, dim=3)  # sum's size is (3, 4, 5)
How should I apply softmax?
softmax(input, dim = 0) # Way Number 0
softmax(input, dim = 1) # Way Number 1
softmax(input, dim = 2) # Way Number 2
softmax(input, dim = 3) # Way Number 3
My intuition tells me that it's the last one, but I am not sure. English is not my first language, and the use of the word "along" seemed confusing to me because of that.
I am not very clear on what "along" means, so I will use an example that could clarify things: suppose we have a tensor of size (s1, s2, s3, s4), and I want the sum along the last dimension to be 1, as in the snippet above.
Steven's answer is not correct; it is actually the reverse way. See the example below, transcribed from the original screenshot:
>>> x = torch.tensor([[1, 2], [3, 4]], dtype=torch.float)
>>> F.softmax(x, dim=0)
tensor([[0.1192, 0.1192],
        [0.8808, 0.8808]])
>>> F.softmax(x, dim=1)
tensor([[0.2689, 0.7311],
        [0.2689, 0.7311]])
The easiest way I can think of to make you understand is: say you are given a tensor of shape (s1, s2, s3, s4) and as you mentioned you want to have the sum of all the entries along the last axis to be 1.
sum = torch.sum(input, dim = 3) # input is of shape (s1, s2, s3, s4)
Then you should call the softmax as:
softmax(input, dim = 3)
To understand easily, you can consider a 4d tensor of shape (s1, s2, s3, s4) as a 2d tensor or matrix of shape (s1*s2*s3, s4). Now if you want the matrix to contain values in each row (axis=0) or column (axis=1) that sum to 1, then, you can simply call the softmax function on the 2d tensor as follows:
softmax(input, dim = 0) # normalizes values along axis 0
softmax(input, dim = 1) # normalizes values along axis 1
You can see the example that Steven mentioned in his answer.
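To make the flattening intuition concrete, here is a small check (shapes are just examples) that softmax over the last dim of a 4-D tensor matches softmax over the rows of its 2-D flattening:

import torch
import torch.nn.functional as F

input = torch.randn(3, 4, 5, 6)                    # (s1, s2, s3, s4)
flat = input.reshape(-1, input.shape[-1])          # (s1*s2*s3, s4)
out4d = F.softmax(input, dim=3)
out2d = F.softmax(flat, dim=1).reshape(input.shape)
assert torch.allclose(out4d, out2d)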
Let's consider the example in two dimensions
x = [[1, 2],
     [3, 4]]
do you want your final result to be
y = [[0.27, 0.73],
     [0.27, 0.73]]
or
y = [[0.12, 0.12],
     [0.88, 0.88]]
If it's the first option then you want dim = 1. If it's the second option you want dim = 0.
Notice that in the second example the entries along the zeroth dimension (the columns) sum to 1, hence it is normalized along the zeroth dimension.
Updated 2018-07-10 to reflect that the zeroth dimension refers to the columns in PyTorch.
I am not 100% sure what your question means, but I think your confusion is simply that you don't understand what the dim parameter means. So I will explain it and provide examples.
If we have:
m0 = nn.Softmax(dim=0)
what that means is that m0 will normalize elements along the zeroth coordinate of the tensor it receives. Formally, given a tensor b of size, say, (d0, d1), the following will hold:
$$\sum_{i_0=0}^{d_0-1} b[i_0, i_1] = 1 \quad \forall i_1 \in \{0, \dots, d_1 - 1\}$$
you can easily check this with a Pytorch example:
>>> b = torch.arange(0, 4, 1.0).view(-1, 2)
>>> b
tensor([[0., 1.],
        [2., 3.]])
>>> m0 = nn.Softmax(dim=0)
>>> b0 = m0(b)
>>> b0
tensor([[0.1192, 0.1192],
        [0.8808, 0.8808]])
Now, since dim=0 means going through i0 ∈ {0, 1} (i.e. going down the rows), if we choose any column i1 and sum its elements over the rows, we should get 1. Check it:
>>> b0[:,0].sum()
tensor(1.0000)
>>> b0[:,1].sum()
tensor(1.0000)
as expected.
Note we can check that every column sums to 1 in one go by "summing out the rows" with torch.sum(b0, dim=0); check it out:
>>> torch.sum(b0,0)
tensor([1.0000, 1.0000])
We can create a more complicated example to make sure it's really clear.
>>> a = torch.arange(0, 24, 1.0).view(-1, 3, 4)
>>> a
tensor([[[ 0.,  1.,  2.,  3.],
         [ 4.,  5.,  6.,  7.],
         [ 8.,  9., 10., 11.]],

        [[12., 13., 14., 15.],
         [16., 17., 18., 19.],
         [20., 21., 22., 23.]]])
>>> a0 = m0(a)
>>> a0[:,0,0].sum()
tensor(1.0000)
>>> a0[:,1,0].sum()
tensor(1.0000)
>>> a0[:,2,0].sum()
tensor(1.0000)
>>> a0[:,1,1].sum()
tensor(1.0000)
>>> a0[:,2,3].sum()
tensor(1.0000)
so as we expected: if we sum all the elements along the zeroth coordinate, from the first value to the last, we get 1. So everything is normalized along the zeroth dimension (coordinate i0).
>>> torch.sum(a0,0)
tensor([[1.0000, 1.0000, 1.0000, 1.0000],
[1.0000, 1.0000, 1.0000, 1.0000],
[1.0000, 1.0000, 1.0000, 1.0000]])
Also, "along dimension 0" means that you vary the coordinate along that dimension and consider each element. Sort of like having a for loop going through the values the zeroth coordinate can take, i.e.
for i0 in range(0, d0):
    a[i0, i1, i2, i3]
import torch
import torch.nn.functional as F

x = torch.tensor([[1, 2], [3, 4]], dtype=torch.float)

s1 = F.softmax(x, dim=0)
# tensor([[0.1192, 0.1192],
#         [0.8808, 0.8808]])

s2 = F.softmax(x, dim=1)
# tensor([[0.2689, 0.7311],
#         [0.2689, 0.7311]])

torch.sum(s1, dim=0)
# tensor([1., 1.])

torch.sum(s2, dim=1)
# tensor([1., 1.])
Think of what softmax is trying to achieve: it outputs the probability of one outcome against the others. Say you are trying to predict between two outcomes: is it A or is it B? If p(A) is greater than p(B), then the next step is to convert the outcome into a Boolean (i.e. the outcome would be A if p(A) > 50%, or B if p(B) > 50%). Since we are dealing with probabilities, they should add up to 1.
Therefore, what you want is for the probabilities OF EACH ROW to sum to 1, so you specify dim=1, the row sum.
On the other hand, if your model is designed to predict more than two variables, the output tensor will look something like [p(a), p(b), p(c), ..., p(i)].
What matters here is that p(a) + p(b) + p(c) + ... + p(i) = 1.
In that case you would use dim = 0.
It all depends on how you define your output layer.
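A short sketch contrasting the two cases (shapes are hypothetical):

import torch
import torch.nn.functional as F

batch_logits = torch.randn(8, 3)              # 8 samples, 3 classes each
row_probs = F.softmax(batch_logits, dim=1)    # each row sums to 1
print(row_probs.sum(dim=1))                   # eight ones

single_logits = torch.randn(3)                # one sample, 1-D output
print(F.softmax(single_logits, dim=0).sum())  # tensor(1.)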
I am trying to learn TensorFlow. In the given example, how can we define rank and shape? I mean, how do we find the rank and the shape?
3 # a rank 0 tensor; this is a scalar with shape []
[1. ,2., 3.] # a rank 1 tensor; this is a vector with shape [3]
[[1., 2., 3.], [4., 5., 6.]] # a rank 2 tensor; a matrix with shape [2, 3]
[[[1., 2., 3.]], [[7., 8., 9.]]] # a rank 3 tensor with shape [2, 1, 3]
Rank is the number of dimensions in the tensor. Refer to:
https://en.wikipedia.org/wiki/Tensor
The total number of indices required to identify each component uniquely is equal to the dimension of the array, and is called the order, degree or rank of the tensor.
Shape describes the number of elements in each dimension of the tensor.
In the given example,
[1. ,2., 3.]
is a set of numbers with only one dimension. This is called a vector and is generally used to represent a line.
[[1., 2., 3.], [4., 5., 6.]]
is a set of numbers with two dimensions. This is called a matrix and generally represents a set of lines geometrically. (Each line is described by the elements in one of the inner brackets.)
This can be generalized to more than two dimensions.
More generally, all these sets of numbers are known as Tensors. TensorFlow uses these sets of numbers as a data structure.
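To query rank and shape programmatically, TensorFlow provides tf.rank and tf.shape; a minimal sketch using the TF1-style session API (the tensor is the rank-3 example from above):

import tensorflow as tf

t = tf.constant([[[1., 2., 3.]], [[7., 8., 9.]]])
with tf.Session() as sess:
    print(sess.run(tf.rank(t)))   # 3
    print(sess.run(tf.shape(t)))  # [2 1 3]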