Removing empty rows in tf sparse tensor - python

This is a follow-up question to "Tensorflow sparse tensor row-wise mask". It seems TensorFlow doesn't provide a convenient way to remove empty rows from a sparse tensor, e.g. to go from
SparseTensor(indices=tf.Tensor(
[[0 0]
[2 0]
[2 1]
[3 0]
[4 0]], shape=(5, 2), dtype=int64), values=tf.Tensor([b'a', b'b', b'c', b'd', b'e'], shape=(5,), dtype=string), dense_shape=tf.Tensor([5 2], shape=(2,), dtype=int64))
to
SparseTensor(indices=tf.Tensor(
[[0 0]
[1 0]
[1 1]
[2 0]
[3 0]], shape=(5, 2), dtype=int64), values=tf.Tensor([b'a', b'b', b'c', b'd', b'e'], shape=(5,), dtype=string), dense_shape=tf.Tensor([4 2], shape=(2,), dtype=int64))
How can I handle a case like the one above without converting the sparse tensor to a dense one?
Thx, J
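One possible approach, offered as a sketch rather than an established recipe: renumber the row indices with tf.unique so that only rows that actually hold values remain. The helper name drop_empty_rows is made up for this illustration, and the sketch assumes a 2-D SparseTensor whose indices are in the usual row-major order (tf.sparse.reorder can restore that order if needed).

```python
import tensorflow as tf

def drop_empty_rows(sp):
    """Renumber the rows of a 2-D SparseTensor so that empty rows disappear.

    Hypothetical helper; assumes sp.indices is sorted in row-major order.
    """
    rows = sp.indices[:, 0]
    cols = sp.indices[:, 1]
    # tf.unique returns the distinct row ids in order of first appearance and,
    # for every entry, the position of its row id in that list.
    kept_rows, new_rows = tf.unique(rows, out_idx=tf.int64)
    new_indices = tf.stack([new_rows, cols], axis=1)
    # The number of rows shrinks to the number of non-empty rows.
    new_shape = tf.stack([tf.size(kept_rows, out_type=tf.int64), sp.dense_shape[1]])
    return tf.sparse.SparseTensor(new_indices, sp.values, new_shape)

sp = tf.sparse.SparseTensor(
    indices=[[0, 0], [2, 0], [2, 1], [3, 0], [4, 0]],
    values=["a", "b", "c", "d", "e"],
    dense_shape=[5, 2],
)
compact = drop_empty_rows(sp)
print(compact.indices.numpy().tolist())
print(compact.dense_shape.numpy().tolist())
```

The values are left untouched and nothing is converted to dense; only the first column of the indices and the first entry of dense_shape change.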

Related

Tensorflow gives 0 results

I am learning TensorFlow from this notebook on GitHub:
https://colab.research.google.com/github/instillai/TensorFlow-Course/blob/master/codes/ipython/1-basics/tensors.ipynb#scrollTo=TKX2U0Imcm7d
Here is a simple example from the tutorial:
import numpy as np
import tensorflow as tf
x = tf.constant([[1, 1],
[1, 1]])
y = tf.constant([[2, 4],
[6, 8]])
# Add two tensors
print(tf.add(x, y), "\n")
# Multiply two tensors (matrix product)
print(tf.matmul(x, y), "\n")
What I expect is
tf.Tensor(
[[3 5]
[7 9]], shape=(2, 2), dtype=int32)
tf.Tensor(
[[ 8 12]
[ 8 12]], shape=(2, 2), dtype=int32)
However, the results are
Tensor("Add_3:0", shape=(2, 2), dtype=int32)
Tensor("MatMul_3:0", shape=(2, 2), dtype=int32)
It does not mean that the values of the tensors are zero. Add_3:0 and MatMul_3:0 are just the names of the tensors; print only shows a tensor's values under eager execution. In graph mode, use tf.print instead, and you should see the results:
import tensorflow as tf
x = tf.constant([[1, 1],
[1, 1]])
y = tf.constant([[2, 4],
[6, 8]])
print(tf.add(x, y), "\n")
print(tf.matmul(x, y), "\n")
# Graph mode
@tf.function
def calculate():
    x = tf.constant([[1, 1],
                     [1, 1]])
    y = tf.constant([[2, 4],
                     [6, 8]])
    # tf.print runs as part of the graph, so it shows the actual values
    tf.print(tf.add(x, y), "\n")
    tf.print(tf.matmul(x, y), "\n")
    return x, y

_, _ = calculate()
tf.Tensor(
[[3 5]
[7 9]], shape=(2, 2), dtype=int32)
tf.Tensor(
[[ 8 12]
[ 8 12]], shape=(2, 2), dtype=int32)
[[3 5]
[7 9]]
[[8 12]
[8 12]]
Without tf.print, you will see the following output from the function calculate:
Tensor("Add:0", shape=(2, 2), dtype=int32)
Tensor("MatMul:0", shape=(2, 2), dtype=int32)
See this guide for more information.
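To check which mode a piece of code is running in, tf.executing_eagerly() can be queried. The sketch below assumes TensorFlow 2.x with default settings; the function name traced_sum is made up for this illustration.

```python
import tensorflow as tf

x = tf.constant([[1, 1], [1, 1]])
y = tf.constant([[2, 4], [6, 8]])

# True by default in TensorFlow 2.x; in 1.x it is False unless eager
# execution has been enabled explicitly.
eager_outside = tf.executing_eagerly()
print("eager outside tf.function:", eager_outside)

@tf.function
def traced_sum(a, b):
    # This Python print runs while the function is being traced,
    # where the check reports graph mode.
    print("eager inside tf.function:", tf.executing_eagerly())
    return tf.add(a, b)

result = traced_sum(x, y)
# The value returned to the caller is an eager tensor, so .numpy() exposes it.
print(result.numpy())
```

If the first check reports False, output such as Tensor("Add_3:0", ...) from a plain print is expected.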

what is the difference between matrix multiplication methods and functions in tensorflow?

What are the differences between these three ways to multiply two matrices in TensorFlow? The three ways are:
@
tf.tensordot()
tf.matmul()
I have tested them and they give the same result, but I want to know if there is any underlying difference.
Let us look at this with the example below. I have taken two matrices, a and b, to apply these functions to:
import tensorflow as tf
a = tf.constant([[1, 2],
[3, 4]])
b = tf.constant([[1, 1],
[1, 1]]) # or `tf.ones([2,2])`
tf.matmul(a, b) and (a @ b) both perform matrix multiplication:
print(tf.matmul(a, b), "\n") # matrix - multiplication
Output:
tf.Tensor(
[[3 3]
[7 7]], shape=(2, 2), dtype=int32)
You can see the same output for the same matrices with the @ operator:
print(a @ b, "\n") # @ used as the matrix multiplication operator
Output:
tf.Tensor(
[[3 3]
[7 7]], shape=(2, 2), dtype=int32)
tf.tensordot(): Tensordot (also known as tensor contraction) sums the product of elements from a and b over the indices specified by axes.
If we take axes=0 (no axes are contracted):
print(tf.tensordot(a, b, axes=0), "\n")
# Each element (scalar) of the first matrix multiplies the whole second matrix, and each of these products is kept as a separate matrix.
Output:
tf.Tensor(
[[[[1 1]
[1 1]]
[[2 2]
[2 2]]]
[[[3 3]
[3 3]]
[[4 4]
[4 4]]]], shape=(2, 2, 2, 2), dtype=int32)
If we change to axes=1:
print(tf.tensordot(a, b, axes=1), "\n")
# performs matrix-multiplication
Output:
tf.Tensor(
[[3 3]
[7 7]], shape=(2, 2), dtype=int32)
And for axes=2:
print(tf.tensordot(a, b, axes=2), "\n")
# performs element-wise multiplication and sums the result into a scalar
Output:
tf.Tensor(10, shape=(), dtype=int32)
You can explore more about tf.tensordot() and basic details on axes in given links.
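The three axes cases can be cross-checked against more familiar operations. The sketch below uses NumPy's np.tensordot as a stand-in, on the assumption that it follows the same axes convention as tf.tensordot for integer axes; the variable names are chosen for this illustration.

```python
import numpy as np

a = np.array([[1, 2], [3, 4]])
b = np.array([[1, 1], [1, 1]])

# axes=0: nothing is contracted, so every element of a scales all of b.
outer = np.tensordot(a, b, axes=0)
# axes=1: the last axis of a is contracted with the first axis of b.
product = np.tensordot(a, b, axes=1)
# axes=2: both axes are contracted, leaving a single number.
total = np.tensordot(a, b, axes=2)

print(outer.shape)
print(np.array_equal(outer, a[:, :, None, None] * b[None, None, :, :]))
print(np.array_equal(product, a @ b))
print(int(total) == int(np.sum(a * b)))
```

So matmul and @ correspond to the axes=1 case only; the other settings give an outer product or a full contraction.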

Loop through rows of input matrices with vectorize

I have a 4x2 matrix and a 2x2 matrix. I would like to loop each combination of rows (vectors of dimension 2) through a function foo using vectorize.
Here are the matrices:
X = np.array([[1, 0], [2, 0], [3, 0], [4,0]])
Y = np.array([[1, 0], [2, 0]])
Here's how I'm trying to run it:
def foo(x, y):
    print("inputs:", x, y)
    return x[0] * y[0]
bar = np.vectorize(foo, signature="???")
output = bar(X, Y)
print(output)
I'm looking for the following output, where bar returns a 4x2 matrix:
inputs: [1,0] [1,0]
inputs: [1,0] [2,0]
inputs: [2,0] [1,0]
inputs: [2,0] [2,0]
inputs: [3,0] [1,0]
inputs: [3,0] [2,0]
inputs: [4,0] [1,0]
inputs: [4,0] [2,0]
[[1,2], [2,4], [3,6], [4,8]]
I've tried various combinations of signature, but I'm just not grokking how to use it given the output I'm looking for.
NB: I am aware vectorize just uses Python for loops under the hood and offers no performance benefit. I just want to understand how to use it.
The basic use of vectorize broadcasts the inputs against each other and passes scalar tuples to your function. A (4,2) can't broadcast with a (2,2). signature is an addition that should make it possible to pass "rows" of your arrays. It's even slower, and I haven't seen it used much (nor have I recommended it).
In [536]: bar = np.vectorize(foo, signature="(n),(n)->()")
In [533]: bar(X,Y[0,:])
inputs: [1 0] [1 0]
inputs: [2 0] [1 0]
inputs: [3 0] [1 0]
inputs: [4 0] [1 0]
Out[533]: array([1, 2, 3, 4])
In [537]: bar(X[:,None],Y[None])
inputs: [1 0] [1 0]
inputs: [1 0] [2 0]
inputs: [2 0] [1 0]
inputs: [2 0] [2 0]
inputs: [3 0] [1 0]
inputs: [3 0] [2 0]
inputs: [4 0] [1 0]
inputs: [4 0] [2 0]
Out[537]:
array([[1, 2],
[2, 4],
[3, 6],
[4, 8]])
So this gives bar a (4,1,2) and a (1,2,2), which broadcast as (4,2,2). Or, with this signature, it's broadcasting a (4,1) with a (1,2) => (4,2). It's the signature that determines how the last dimensions match.
It may in some cases be convenient, but I wouldn't recommend devoting too much time to understanding vectorize.
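For comparison, here is a sketch of what the signature version computes, written once as explicit Python loops and once with plain broadcasting. foo is repeated without its print call, and the names looped, broadcast and vectorized are chosen for this illustration.

```python
import numpy as np

X = np.array([[1, 0], [2, 0], [3, 0], [4, 0]])
Y = np.array([[1, 0], [2, 0]])

def foo(x, y):
    return x[0] * y[0]

# Explicit loops: every row of X is paired with every row of Y.
looped = np.array([[foo(x, y) for y in Y] for x in X])

# Plain broadcasting: a (4,1) column of first elements times a (1,2) row.
broadcast = X[:, None, 0] * Y[None, :, 0]

# The vectorize call from above, for reference.
vectorized = np.vectorize(foo, signature="(n),(n)->()")(X[:, None], Y[None])

print(looped)
print(np.array_equal(looped, broadcast), np.array_equal(looped, vectorized))
```

When the function can be expressed with array operations, as here, the broadcasting form avoids the Python-level loop entirely.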

Tensorflow 2D Matrices multiplication returning list of matrix products

I'm struggling with this problem in keras/tensorflow.
I'm implementing a user-defined loss function and I have this problem: I have to multiply 2 matrices, obtaining a list of matrix products in the form
[column_0_matrix_1 x row_0_matrix_2], [column_1_matrix_1 x row_1_matrix_2] etc.
Let's say I have
A = [[1 1]
[3 2]]
B = [[4 1]
[1 3]]
Then I want to have a list of products in the form
C = |[1] x [4 1]|, |[1] x [1 3]|
|[3] | |[2] |
Any idea? I tried by myself but always get back the product of the 2 starting matrices.
Any help would be appreciated. Thank you
You could split each tensor and then use tf.linalg.matmul in a list comprehension to achieve what you want:
import tensorflow as tf
a = tf.constant([[1, 1], [3, 2]])
b = tf.constant([[4, 1], [1, 3]])
a_split = tf.split(a, 2, 1)
b_split = tf.split(b, 2, 0)
[tf.linalg.matmul(x, y) for x, y in zip(a_split, b_split)]
# [<tf.Tensor: shape=(2, 2), dtype=int32, numpy=
# array([[ 4, 1],
# [12, 3]], dtype=int32)>,
# <tf.Tensor: shape=(2, 2), dtype=int32, numpy=
# array([[1, 3],
# [2, 6]], dtype=int32)>]
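An alternative sketch that avoids the Python list: each product is an outer product of column k of the first matrix with row k of the second, so all of them can be stacked into one array with einsum. The example uses NumPy so it runs without TensorFlow; tf.einsum should accept the same equation string, but treat that as an assumption to verify in your setup.

```python
import numpy as np

A = np.array([[1, 1], [3, 2]])
B = np.array([[4, 1], [1, 3]])

# products[k] is the outer product of column k of A with row k of B.
products = np.einsum("ik,kj->kij", A, B)

# The same result with explicit broadcasting, as a cross-check.
check = A.T[:, :, None] * B[:, None, :]

print(products)
print(np.array_equal(products, check))
```

Summing products over its first axis gives back the ordinary matrix product A @ B, which is why a plain matmul always returned the product of the two starting matrices.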

Problems sorting a 2D array

I have the following 2D numpy array:
array([[1 0]
[2 0]
[4 0]
[1 1]
[2 1]
[3 1]
[4 2]])
I want to sort by the ID in the first column, and then by the value in the second column, such that I get back:
array([[1 0]
[1 1]
[2 0]
[2 1]
[3 1]
[4 0]
[4 2]])
I am getting O(n^2) complexity and want to improve it further.
A better way to sort a list of lists:
import numpy as np
a = np.array([[1, 0], [2 ,0], [4 ,0], [1 ,1], [2 ,1], [3 ,1], [4 ,2]])
s_a = np.asarray(sorted(a, key=lambda x: x[0]))
print(s_a)
Output:
[[1 0]
[1 1]
[2 0]
[2 1]
[3 1]
[4 0]
[4 2]]
Try the code below; I hope it helps:
a = np.array([[1, 0],
[2 ,0],
[4 ,0],
[1 ,1],
[2 ,1],
[3 ,1],
[4 ,2]])
# Assumes a.dtype is int64; np.int was removed from recent NumPy releases, so np.int64 is used here.
np.sort(a.view('i8,i8'), order=['f0'], axis=0).view(np.int64)
Output will be:
array([[1, 0],
[1, 1],
[2, 0],
[2, 1],
[3, 1],
[4, 0],
[4, 2]])
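Another option, sketched here with np.lexsort: it sorts by several keys at once in O(n log n) time and makes the tie-breaking column explicit. Note that lexsort treats the last key in the tuple as the primary one; the names order and sorted_a are chosen for this illustration.

```python
import numpy as np

a = np.array([[1, 0], [2, 0], [4, 0], [1, 1], [2, 1], [3, 1], [4, 2]])

# The last key is the primary one: sort by column 0, break ties with column 1.
order = np.lexsort((a[:, 1], a[:, 0]))
sorted_a = a[order]

print(sorted_a)
```

This works for any integer dtype, since it never reinterprets the array's memory.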
