For quick debugging purposes, I'm trying to print out the SparseTensor I've just initialized.
The built-in print function just says it's a SparseTensor object, and tf.Print() gives an error. The error message does print the contents of the object, but not in a way that shows the actual entries (unless it's telling me the tensor is empty; there are some :0s whose significance I don't know).
rows = tf.Print(rows, [rows])
TypeError: Failed to convert object of type <class 'tensorflow.python.framework.sparse_tensor.SparseTensor'> to Tensor. Contents: SparseTensor(indices=Tensor("SparseTensor/indices:0", shape=(6, 2), dtype=int64), values=Tensor("SparseTensor/values:0", shape=(6,), dtype=float32), dense_shape=Tensor("SparseTensor/dense_shape:0", shape=(2,), dtype=int64)). Consider casting elements to a supported type.
Way 0: Run the SparseTensor and print the result
Running the graph (in this case just the SparseTensor object) returns a SparseTensorValue object, which prints in the same format as the call used to initialize the SparseTensor. That is ultimately what I wanted.
with tf.Session() as sess:
    rows = sess.run(rows)
    print(rows)
Way 1: Use Print after conversion to dense matrix
To use the Print function, I could convert to a dense matrix in my case. But Print only executes when you run the graph:
rows = tf.sparse_tensor_to_dense(rows)
rows = tf.Print(rows, [rows], summarize=100)
with tf.Session() as sess:
    sess.run(rows)
Note the "summarize"--the default setting just printed out zeroes since it's getting the first few entries of a sparse matrix represented in dense form!
Way 2: Use tf.test.TestCase
I found out that the TestCase.evaluate method gives me the kind of nice format I want, the same as Way 0 above:
print(str(self.evaluate(rows)))
Outputs e.g.:
SparseTensorValue(indices=array([[1, 2],
[1, 7],
[1, 8],
[2, 2],
[3, 4],
[3, 5]]), values=array([1., 1., 1., 1., 1., 1.], dtype=float32), dense_shape=array([4, 9]))
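For context, self.evaluate is only available inside a tf.test.TestCase subclass; here is a minimal sketch of that setup (the class name, test name, and tensor values are my own):
import tensorflow as tf

class SparseDebugTest(tf.test.TestCase):
    def test_print_sparse(self):
        rows = tf.SparseTensor(indices=[[1, 2], [3, 4]],
                               values=[1.0, 1.0], dense_shape=[4, 9])
        # prints a SparseTensorValue, in the same format as Way 0
        print(str(self.evaluate(rows)))

if __name__ == "__main__":
    tf.test.main()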
You're seeing this error because SparseTensor is not really a Tensor; it's a composite object that wraps three dense tensors.
Try using print() on your SparseTensor and you'll see the internal details:
SparseTensor(indices=Tensor(…), values=Tensor(…), dense_shape=Tensor(…))
You can print any of these "internal" tensors using tf.Print. For example, tf.Print(my_sparse_tensor.values, [my_sparse_tensor.values]) will succeed.
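For instance, a minimal graph-mode sketch (TF 1.x assumed; the tensor values are mine) that prints the values component when the graph runs:
import tensorflow as tf  # TF 1.x graph mode assumed

my_sparse_tensor = tf.SparseTensor(indices=[[0, 0], [1, 2]],
                                   values=[1.0, 2.0], dense_shape=[3, 4])
# tf.Print accepts the dense component tensors, even though it rejects the SparseTensor itself
vals = tf.Print(my_sparse_tensor.values, [my_sparse_tensor.values])
with tf.Session() as sess:
    sess.run(vals)  # logs [1 2] to stderr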
The SparseTensor documentation describes the internal data structure:
https://www.tensorflow.org/api_docs/python/tf/sparse/SparseTensor
TensorFlow represents a sparse tensor as three separate dense tensors: indices, values, and dense_shape. In Python, the three tensors are collected into a SparseTensor class for ease of use. If you have separate indices, values, and dense_shape tensors, wrap them in a SparseTensor object before passing to the ops below.
Concretely, the sparse tensor SparseTensor(indices, values, dense_shape) comprises the following components, where N and ndims are the number of values and number of dimensions in the SparseTensor, respectively:
indices: A 2-D int64 tensor of dense_shape [N, ndims], which specifies the indices of the elements in the sparse tensor that contain nonzero values (elements are zero-indexed). For example, indices=[[1,3], [2,4]] specifies that the elements with indexes of [1,3] and [2,4] have nonzero values.
values: A 1-D tensor of any type and dense_shape [N], which supplies the values for each element in indices. For example, given indices=[[1,3], [2,4]], the parameter values=[18, 3.6] specifies that element [1,3] of the sparse tensor has a value of 18, and element [2,4] of the tensor has a value of 3.6.
dense_shape: A 1-D int64 tensor of dense_shape [ndims], which specifies the dense_shape of the sparse tensor. Takes a list indicating the number of elements in each dimension. For example, dense_shape=[3,6] specifies a two-dimensional 3x6 tensor, dense_shape=[2,3,4] specifies a three-dimensional 2x3x4 tensor, and dense_shape=[9] specifies a one-dimensional tensor with 9 elements.
The corresponding dense tensor satisfies:
dense.shape = dense_shape
dense[tuple(indices[i])] = values[i]
By convention, indices should be sorted in row-major order (or equivalently lexicographic order on the tuples indices[i]). This is not enforced when SparseTensor objects are constructed, but most ops assume correct ordering. If the ordering of sparse tensor st is wrong, a fixed version can be obtained by calling tf.sparse_reorder(st).
Example: The sparse tensor
SparseTensor(indices=[[0, 0], [1, 2]], values=[1, 2], dense_shape=[3, 4])
represents the dense tensor:
[[1, 0, 0, 0]
[0, 0, 2, 0]
[0, 0, 0, 0]]
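A quick way to check this mapping yourself, assuming TF 2.x eager mode:
import tensorflow as tf

st = tf.SparseTensor(indices=[[0, 0], [1, 2]], values=[1, 2], dense_shape=[3, 4])
print(tf.sparse.to_dense(st).numpy())
# [[1 0 0 0]
#  [0 0 2 0]
#  [0 0 0 0]]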
Related
I have a tensor a = torch.arange(6).reshape(2,3), and another tensor b = (torch.rand(a.size()) > 0.5).int().nonzero().
I want to create a new tensor that contains only values from a of the indices that are indicated by b.
For example:
a = torch.arange(6).reshape(2,3) # tensor([[0, 1, 2],
# [3, 4, 5]])
b = (torch.rand(a.size())> 0.5).int().nonzero() # tensor([[0, 1],
# [0, 2],
# [1, 0],
# [1, 1]])
The desired output is:
tensor([1,2,3,4])
I know that I can iterate over the values of b and access those values in a as indices, but I wanted to know if there is a better PyTorch way to do this (using tensor operations only).
Note: the shape of the output tensor doesn't really matter; I just need a tensor containing only the values indicated by b.
If I understand you correctly, you can do:
a[b[:,0], b[:,1]]
This will produce a 1D tensor with the values at the indices specified by b. The indexing itself is deterministic; in your example the output only varies between runs because b is generated randomly with torch.rand.
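For instance, with a fixed b matching the example above (so the output is reproducible):
import torch

a = torch.arange(6).reshape(2, 3)
b = torch.tensor([[0, 1], [0, 2], [1, 0], [1, 1]])  # fixed copy of the example's indices
print(a[b[:, 0], b[:, 1]])  # tensor([1, 2, 3, 4])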
If you don't know the number of dimensions in advance, you'll need to use map() to generate the desired slices:
a[tuple(map(lambda x: b[:,x], range(a.dim())))]
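For example, a sketch with a 3-D tensor (the example data is mine):
import torch

a = torch.arange(24).reshape(2, 3, 4)
b = (a % 7 == 0).nonzero()  # shape [K, 3]: one row of indices per matching element
print(a[tuple(map(lambda x: b[:, x], range(a.dim())))])  # tensor([ 0,  7, 14, 21])
# an equivalent shorthand: a[tuple(b.t())]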
I am trying to convert a sparse adjacency matrix/list that only contains the indices of the non-zero elements ([[rows], [columns]]) into a dense matrix that contains 1s at those indices and 0s otherwise. I found a solution using to_dense_adj from PyTorch Geometric (Documentation), but it does not do exactly what I want, since the shape of the dense matrix is not as expected. Here is an example:
sparse_adj = torch.tensor([[0, 1, 2, 1, 0], [0, 1, 2, 3, 4]])
So the dense matrix should be of size 3x5 (the second array "stores" the columns, with non-zero elements at (0,0), (1,1), (2,2), (1,3) and (0,4)), because the elements in the first array are at most 2.
However,
dense_adj = to_dense_adj(sparse_adj)[0]
outputs a dense matrix, but of shape (5,5). Is it possible to define the output shape or is there a different solution to get what I want?
Edit: I now have a working solution that converts it back to the sparse representation:
dense_adj = torch.sparse.FloatTensor(sparse_adj, torch.ones(5), torch.Size([3,5])).to_dense()
ind = dense_adj.nonzero(as_tuple=False).t().contiguous()
sparse_adj = torch.stack((ind[1], ind[0]), dim=0)
Or is there any alternative way that is better?
You can achieve this by first constructing a sparse matrix with torch.sparse and then converting it to a dense matrix. For this you will need to provide torch.sparse.FloatTensor with a 2D tensor of indices, a tensor of values, as well as an output size:
sparse_adj = torch.tensor([[0, 1, 2, 1, 0], [0, 1, 2, 3, 4]])
torch.sparse.FloatTensor(sparse_adj, torch.ones(5), torch.Size([3,5])).to_dense()
You can get the size of the output matrix dynamically with
sparse_adj.max(axis=1).values + 1
So it becomes:
torch.sparse.FloatTensor(
    sparse_adj,
    torch.ones(sparse_adj.shape[1]),
    (sparse_adj.max(axis=1).values + 1).tolist())
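For the example input above, this sketch (mirroring the snippet just shown) should print the 3x5 matrix the question asks for, with the output shown as comments:
import torch

sparse_adj = torch.tensor([[0, 1, 2, 1, 0], [0, 1, 2, 3, 4]])
dense = torch.sparse.FloatTensor(
    sparse_adj,
    torch.ones(sparse_adj.shape[1]),
    (sparse_adj.max(axis=1).values + 1).tolist()).to_dense()
print(dense)
# tensor([[1., 0., 0., 0., 1.],
#         [0., 1., 0., 1., 0.],
#         [0., 0., 1., 0., 0.]])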
I'm trying to understand how tf.reshape works. Let's have an example:
embeddings = tf.placeholder(tf.float32, shape=[N0,N1])
M_2D = tf.placeholder(tf.float32, shape=[N0,None])
M_3D = tf.reshape(M_2D, [-1,N0,1])
weighted_embeddings = tf.multiply(embeddings, M_3D)
Here I have a 2D tensor M_2D whose columns represent coefficients for the N0 embeddings of dimension N1. I want to create a 3D tensor where each column of M_2D is placed in the first dimension of M_3D, with the columns kept in the same order. My final goal is to create a 3D tensor of 2D embeddings, each weighted by the columns of M_2D.
How can I be sure that reshape actually places each column in the new dimension of M_3D? Is it possible that it places the rows instead? Is there somewhere in the TensorFlow documentation a clear explanation of the internal workings of tf.reshape, particularly when -1 is provided?
Tensors before and after tf.reshape have the same flattened order.
In the TensorFlow runtime, a Tensor consists of raw data (a byte array), a shape, and a dtype; tf.reshape only changes the shape, leaving the raw data and dtype unchanged. A -1 (or None) in tf.reshape marks a dimension whose size is inferred from the total number of elements.
For example,
# a tensor with 6 elements, with shape [3,2]
a = tf.constant([[1,2], [3,4], [5,6]])
# reshape tensor to [2, 3, 1], 2 is calculated by 6/3/1
b = tf.reshape(a, [-1, 3, 1])
In this example, a and b have the same flattened order, namely [1,2,3,4,5,6]; a has shape [3,2] with value [[1,2], [3,4], [5,6]], and b has shape [2,3,1] with value [[[1],[2],[3]],[[4],[5],[6]]].
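One consequence worth noting for the original question: since reshape preserves this flattened (row-major) order, it will not move the columns of M_2D into the first dimension by itself; transposing first should do it. A minimal sketch (the example values are mine):
import tensorflow as tf

N0 = 4
M_2D = tf.constant([[1., 5.],
                    [2., 6.],
                    [3., 7.],
                    [4., 8.]])  # shape [N0, K], K columns of coefficients
# transpose first so each column becomes contiguous, then reshape
M_3D = tf.reshape(tf.transpose(M_2D), [-1, N0, 1])  # shape [K, N0, 1]
# M_3D[0] is [[1.], [2.], [3.], [4.]], i.e. the first column of M_2D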
I am having troubles understanding the meaning and usages for Tensorflow Tensors and Sparse Tensors.
According to the documentation
Tensor
Tensor is a typed multi-dimensional array. For example, you can represent a mini-batch of images as a 4-D array of floating point numbers with dimensions [batch, height, width, channels].
Sparse Tensor
TensorFlow represents a sparse tensor as three separate dense tensors: indices, values, and shape. In Python, the three tensors are collected into a SparseTensor class for ease of use. If you have separate indices, values, and shape tensors, wrap them in a SparseTensor object before passing to the ops below.
My understanding is that Tensors are used for operations, inputs, and outputs, and a SparseTensor is just another representation of a (dense?) Tensor. I hope someone can further explain the differences and the use cases for them.
Matthew did a great job, but I would love to give an example to shed more light on sparse tensors.
If a tensor has lots of values that are zero, it can be called sparse.
Let's consider a sparse 1-D tensor:
[0, 7, 0, 0, 8, 0, 0, 0, 0]
A sparse representation of the same tensor will focus only on the non-zero values
values = [7,8]
We also have to remember where those values occur, by their indices:
indices = [1,4]
The one-dimensional indices form will work with some methods, for this one-dimensional example, but in general indices have multiple dimensions, so it will be more consistent (and work everywhere) to represent indices like this:
indices = [[1], [4]]
With values and indices alone we don't have quite enough information: how many zeros are there? For that we also record the dense shape of the tensor.
dense_shape = [9]
These three things together (values, indices, and dense_shape) are a sparse representation of the tensor.
In TensorFlow 2.0 it can be written as:
x = tf.SparseTensor(values=[7, 8], indices=[[1], [4]], dense_shape=[9])
x
#o/p: <tensorflow.python.framework.sparse_tensor.SparseTensor at 0x7ff04a58c4a8>
print(x.values)
print(x.dense_shape)
print(x.indices)
#o/p:
tf.Tensor([7 8], shape=(2,), dtype=int32)
tf.Tensor([9], shape=(1,), dtype=int64)
tf.Tensor(
[[1]
[4]], shape=(2, 1), dtype=int64)
EDITED to correct indices as pointed out in the comments.
The difference involves computational speed. If a large tensor has many, many zeroes, it's faster to perform computation by iterating through the non-zero elements. Therefore, you should store the data in a SparseTensor and use the special operations for SparseTensors.
The relationship is similar for matrices and sparse matrices. Sparse matrices are common in dynamic systems, and mathematicians have developed many special methods for operating on them.
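As a concrete illustration of such a special operation (TF 2.x assumed; the example values are mine), tf.sparse.sparse_dense_matmul multiplies a SparseTensor by a dense matrix without materializing the zeros:
import tensorflow as tf

sp = tf.SparseTensor(indices=[[0, 1], [1, 0]],
                     values=[2.0, 3.0], dense_shape=[2, 2])
dense = tf.constant([[1.0, 0.0], [0.0, 1.0]])  # identity, so the product equals sp densified
print(tf.sparse.sparse_dense_matmul(sp, dense))
# tf.Tensor([[0. 2.]
#            [3. 0.]], shape=(2, 2), dtype=float32)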
I want to do something like this.
Let's say we have a tensor A.
A = [[1,0],[0,4]]
And I want to get nonzero values and their indices from it.
Nonzero values: [1,4]
Nonzero indices: [[0,0],[1,1]]
There are similar operations in Numpy.
np.flatnonzero(A) returns the indices that are non-zero in the flattened A.
x.ravel()[np.flatnonzero(x)] extracts the elements according to the non-zero indices.
Here's a link for these operations.
How can I do something like the above NumPy operations in TensorFlow with Python?
(Whether a matrix is flattened or not doesn't really matter.)
You can achieve the same result in TensorFlow using the not_equal and where methods.
zero = tf.constant(0, dtype=tf.float32)
where = tf.not_equal(A, zero)
where is a tensor of the same shape as A holding True or False; in the following case it is
[[True, False],
[False, True]]
This would be sufficient to select the zero or non-zero elements of A. If you want to obtain the indices, you can use the where method as follows:
indices = tf.where(where)
The where tensor has two True values, so the indices tensor will have two entries; since where has rank two, each entry will have two indices:
[[0, 0],
[1, 1]]
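If you also want to pull out the non-zero values themselves, not just the indices, tf.boolean_mask or tf.gather_nd can be combined with the above; a short sketch:
import tensorflow as tf

A = tf.constant([[1., 0.], [0., 4.]])
where = tf.not_equal(A, 0.)
values = tf.boolean_mask(A, where)      # [1., 4.]
indices = tf.where(where)               # [[0, 0], [1, 1]]
same_values = tf.gather_nd(A, indices)  # also [1., 4.]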
# assume the array contains 0, 3.069711, and 3.167817
array = tf.constant([0., 3.069711, 3.167817])
mask = tf.greater(array, 0)  # keeps only positive entries; use tf.not_equal(array, 0) to keep all non-zeros
non_zero_array = tf.boolean_mask(array, mask)
What about using sparse tensors?
>>> A = [[1,0],[0,4]]
>>> sparse = tf.sparse.from_dense(A)
>>> sparse.values.numpy(), sparse.indices.numpy()
(array([1, 4], dtype=int32), array([[0, 0],
[1, 1]]))