how to count the nonzero mismatches between two tensors

Suppose I have two tensors in tensorflow, A, B (of the same shape). Suppose these are both sparse. I need to know a count of the instances where one of these tensors has a nonzero value at a given index, while the other tensor has a zero value. So, I am looking for a number of locations (i,j pairs) where one matrix has a nonzero value there and the other matrix has a zero value there. How do I do this efficiently?

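I would do as follows: build an "is zero" mask for each tensor, then count the positions where the two masks differ.
import tensorflow as tf
tensor1 = tf.constant([[0, 1], [0, 2]])
tensor2 = tf.constant([[1, 0], [0, 2]])
a = tf.math.equal(tensor1, tf.zeros_like(tensor1))
b = tf.math.equal(tensor2, tf.zeros_like(tensor2))
# not_equal (rather than equal) keeps the positions where exactly one tensor is zero
c = tf.math.not_equal(a, b)
c = tf.cast(c, tf.int32)
c = tf.math.reduce_sum(c)  # 2 for these inputs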

import tensorflow as tf
a = tf.sparse.SparseTensor(
    [[0,1], [1,1]], [1,2], [2,2]
)
b = tf.sparse.SparseTensor(
    [[0,0], [0,1],[1,0]], [1,2,1], [2,2]
)
res = tf.reduce_sum(
    tf.cast(tf.math.logical_xor(
        tf.math.not_equal(tf.sparse.to_dense(a), 0),
        tf.math.not_equal(tf.sparse.to_dense(b), 0)
    ), 'int32')
)
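The snippet above converts both tensors to dense form first. If the tensors are large and very sparse, it may be worth staying in sparse form. The following is only a sketch, assuming both inputs are tf.sparse.SparseTensor objects of the same dense shape and dtype with indices in the usual row-major order; the helper name count_nonzero_mismatches is made up for illustration.

```python
import tensorflow as tf

def count_nonzero_mismatches(a, b):
    """Count positions where exactly one of two SparseTensors is nonzero.

    Hypothetical helper; assumes same dense shape, same dtype, ordered indices.
    """
    def indicator(sp):
        # Drop explicitly stored zeros, then replace every remaining value with 1.
        sp = tf.sparse.retain(sp, tf.not_equal(sp.values, 0))
        return tf.SparseTensor(sp.indices, tf.ones_like(sp.values), sp.dense_shape)

    # In the sum of the two indicators, a stored 1 means only one tensor was
    # nonzero at that position, while a stored 2 means both were.
    total = tf.sparse.add(indicator(a), indicator(b))
    return tf.reduce_sum(tf.cast(tf.equal(total.values, 1), tf.int32))

a = tf.sparse.SparseTensor([[0, 1], [1, 1]], [1, 2], [2, 2])
b = tf.sparse.SparseTensor([[0, 0], [0, 1], [1, 0]], [1, 2, 1], [2, 2])
res = count_nonzero_mismatches(a, b)
```

For these inputs the count agrees with the dense logical_xor version, since only position (0, 1) is nonzero in both tensors.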

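This sums the True cases where all of these conditions hold, element-wise:
a and b have non-equal values
a is not zero
b is not zero
Note that this counts positions where both tensors are nonzero but hold different values, not positions where exactly one of them is zero. tf.logical_and takes only two tensors (a third positional argument is treated as the op name), so the three conditions are combined with a nested call:
dense_a = tf.sparse.to_dense(a)
dense_b = tf.sparse.to_dense(b)
tf.reduce_sum(
    tf.cast(
        tf.logical_and(
            tf.not_equal(dense_a, dense_b),
            tf.logical_and(tf.cast(dense_a, tf.bool), tf.cast(dense_b, tf.bool))),
        tf.int32))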
Based on these two sparse tensors:
<tf.Tensor: shape=(3, 3), dtype=int32, numpy=
array([[1, 0, 2],
[0, 1, 0],
[0, 2, 1]])>
<tf.Tensor: shape=(3, 3), dtype=int32, numpy=
array([[2, 2, 2],
[0, 0, 0],
[0, 2, 1]])>
Complete example:
import tensorflow as tf
a = tf.SparseTensor(indices=[[0, 0], [0, 2], [1, 1], [2, 1], [2, 2]],
                    values=[1, 2, 1, 2, 1], dense_shape=[3, 3])
b = tf.SparseTensor(indices=[[0, 0], [0, 1], [0, 2], [2, 1], [2, 2]],
                    values=[2, 2, 2, 2, 1], dense_shape=[3, 3])
dense_a = tf.sparse.to_dense(a)
dense_b = tf.sparse.to_dense(b)
# tf.logical_and accepts two tensors, so the calls are nested to combine three conditions
tf.reduce_sum(
    tf.cast(
        tf.logical_and(
            tf.not_equal(dense_a, dense_b),
            tf.logical_and(tf.cast(dense_a, tf.bool), tf.cast(dense_b, tf.bool))),
        tf.int32))

Related

Create a PyTorch tensor of sequences which excludes specified value

I have a 1d PyTorch tensor containing integers between 0 and n-1. Now I need to create a 2d PyTorch tensor with n-1 columns, where each row is a sequence from 0 to n-1 excluding the value in the first tensor. How can I achieve this efficiently?
Ex:
n = 3
a = torch.Tensor([0, 1, 2, 1, 2, 0])
# desired output
b = [
[1, 2],
[0, 2],
[0, 1],
[0, 2],
[0, 1],
[1, 2]
]
Typically, a.numel() >> n.
Detailed Explanation:
The first element of a is 0, hence it has to map to the sequence [0, 1, 2] excluding 0, which is [1, 2].
Similarly, the second element of a is 1, hence it has to map to [0, 2] and so on.
PS: I actually have an additional batch dimension, which I've excluded here for simplicity. Hence, I need the solution to be easily extendable to one additional dimension.

We can construct a tensor with the desired sequences and index with tensor a.
import torch
n = 3
a = torch.Tensor([0, 1, 2, 1, 2, 0]) # using torch.tensor is recommended
def exclude_gather(a, n):
    # Row i of sequences holds 0..n-1 with the value i removed
    sequences = torch.nonzero(torch.arange(n) != torch.arange(n)[:,None], as_tuple=True)[1].reshape(-1, n-1)
    return sequences[a.long()]
exclude_gather(a, n)
Output
tensor([[1, 2],
[0, 2],
[0, 1],
[0, 2],
[0, 1],
[1, 2]])
We can add a batch dimension with functorch.vmap (available as torch.vmap in PyTorch 2.0 and later):
from functorch import vmap
n = 4
b = torch.Tensor([[0, 1, 2, 1, 3, 0],[0, 3, 1, 0, 2, 1]])
vmap(exclude_gather, in_dims=(0, None))(b, n)
Output
tensor([[[1, 2, 3],
[0, 2, 3],
[0, 1, 3],
[0, 2, 3],
[0, 1, 2],
[1, 2, 3]],
[[1, 2, 3],
[0, 1, 2],
[0, 2, 3],
[1, 2, 3],
[0, 1, 3],
[0, 2, 3]]])

All you have to do is initialize a multi-dimension array with all possible indices using torch.arange(). After that, purge indices that you don't want from each tensor using a boolean mask.
import torch
a = torch.Tensor([0, 1, 2, 1, 2, 0])
n = 3
b = [torch.arange(n) for i in range(len(a))]
c = [b[i]!=a[i] for i in range(len(b))]
# use the boolean array as a mask to apply on b
d = [[b[i][c[i]] for i in range(len(b))]]
print(d) # this can be converted to a list of numbers or torch tensor
This prints the output - [[tensor([1, 2]), tensor([0, 2]), tensor([0, 1]), tensor([0, 2]), tensor([0, 1]), tensor([1, 2])]] which you can convert to int/numpy/torch array/tensor easily.
This can be extended to multiple dimensions as well.

The following does the trick:
b = []
for i in range(n-1):
    # Column i holds the value i, shifted up by one wherever a <= i
    b.append(i * torch.ones_like(a) + (a <= i))
b = torch.stack(b, dim=1)
Since n << size(a), the for loop should not be very costly.
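The same shift-by-one idea can also be written without the Python loop by broadcasting against the column indices. This is a sketch rather than a benchmarked solution; the function name exclude_value is made up here, and it assumes every entry of a lies in 0..n-1. Because it only appends a trailing dimension, it should accept any number of leading batch dimensions.

```python
import torch

def exclude_value(a, n):
    """For each entry v of `a`, return 0..n-1 with v removed (hypothetical helper)."""
    a = a.long().unsqueeze(-1)                    # shape (..., 1)
    cols = torch.arange(n - 1, device=a.device)   # candidate values 0..n-2
    # Candidates at or above the excluded value are shifted up by one.
    return cols + (a <= cols)

a = torch.tensor([0, 1, 2, 1, 2, 0])
out = exclude_value(a, 3)                         # shape (6, 2)

batched = torch.tensor([[0, 1, 2, 1, 3, 0], [0, 3, 1, 0, 2, 1]])
out_batched = exclude_value(batched, 4)           # shape (2, 6, 3)
```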

how to perform multiplication of a tridiagonal matrix with a tensor of different rank and outer dimensions in tensorflow

The code below (a modified tensorflow example) produces the error "All input tensors must have the same rank.". A similar error is raised by the multiplication operations of tf.linalg.LinearOperatorTridiag. I need to multiply an input by a tridiagonal matrix in a Keras layer, and the ranks of the tensors differ because the layer input has additional batch dimensions. Is there a known practical solution for this?
import tensorflow as tf
superdiag = tf.constant([-1, -1, 0], dtype=tf.float64)
maindiag = tf.constant([2, 2, 2], dtype=tf.float64)
subdiag = tf.constant([0, -1, -1], dtype=tf.float64)
diagonals = [superdiag, maindiag, subdiag]
rhs = tf.constant([[[1, 1], [1, 1], [1, 1]]], dtype=tf.float64)
x = tf.linalg.tridiagonal_matmul(diagonals, rhs, diagonals_format='sequence')

You have to add a leading dimension to each diagonal so that the diagonals have the same batch dimensions as rhs:
superdiag = tf.constant([-1, -1, 0], dtype=tf.float64)
maindiag = tf.constant([2, 2, 2], dtype=tf.float64)
subdiag = tf.constant([0, -1, -1], dtype=tf.float64)
diagonals = [tf.expand_dims(superdiag,0), tf.expand_dims(maindiag,0), tf.expand_dims(subdiag,0)]
rhs = tf.constant([[[1, 1], [1, 1], [1, 1]]], dtype=tf.float64)
x = tf.linalg.tridiagonal_matmul(diagonals, rhs, diagonals_format='sequence')
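When rhs carries more than one batch dimension, or a batch size larger than 1, one option is to broadcast each diagonal to the batch shape of rhs before the call. The sketch below assumes the diagonals are shared across the whole batch; the helper name to_batch and the rhs shape (2, 4, 3, 2) are made up for illustration.

```python
import tensorflow as tf

superdiag = tf.constant([-1, -1, 0], dtype=tf.float64)
maindiag = tf.constant([2, 2, 2], dtype=tf.float64)
subdiag = tf.constant([0, -1, -1], dtype=tf.float64)

# Illustrative input with two leading batch dimensions: (2, 4, M=3, N=2).
rhs = tf.ones([2, 4, 3, 2], dtype=tf.float64)

# Everything except the last two (matrix) dimensions is the batch shape.
batch_shape = tf.shape(rhs)[:-2]

def to_batch(diag):
    # Repeat the length-M diagonal for every batch entry: (M,) -> (2, 4, M).
    return tf.broadcast_to(diag, tf.concat([batch_shape, tf.shape(diag)], axis=0))

diagonals = [to_batch(d) for d in (superdiag, maindiag, subdiag)]
x = tf.linalg.tridiagonal_matmul(diagonals, rhs, diagonals_format='sequence')
```

Inside a Keras layer the same broadcast can be done in call(), since batch_shape is read from the input at run time.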

How to construct a matrix that contains all pairs of rows of a matrix in tensorflow

I need to construct a matrix z that would contain combinations of pairs of rows of a matrix x.
x = tf.constant([[1, 3],
[2, 4],
[0, 2],
[0, 1]], dtype=tf.int32)
z=[[[1,2],
    [1,0],
    [1,0],
    [2,0],
    [2,0],
    [0,0]],
   [[3,4],
    [3,2],
    [3,1],
    [4,2],
    [4,1],
    [2,1]]]
It pairs each value with each of the values below it in the same column of x.
I could not find any function or come up with a good idea to do that.
Update 1
So I need the final shape to be 2*6*2, like the z above.

Unfortunately, it's a bit more complex than one would like using tensorflow operators only. I would go with creating the indices for all combinations with a while_loop then use tf.gather to collect values:
import tensorflow as tf
x = tf.constant([[1, 3],
[2, 4],
[3, 2],
[0, 1]], dtype=tf.int32)
m = tf.constant([], shape=(0,2), dtype=tf.int32)
_, idxs = tf.while_loop(
lambda i, m: i < tf.shape(x)[0] - 1,
lambda i, m: (i + 1, tf.concat([m, tf.stack([tf.tile([i], (tf.shape(x)[0] - 1 - i,)), tf.range(i + 1, tf.shape(x)[0])], axis=1)], axis=0)),
loop_vars=(0, m),
shape_invariants=(tf.TensorShape([]), tf.TensorShape([None, 2])))
z = tf.reshape(tf.transpose(tf.gather(x, idxs), (2,0,1)), (-1, 2))
# <tf.Tensor: shape=(12, 2), dtype=int32, numpy=
# array([[1, 2],
# [1, 3],
# [1, 0],
# [2, 3],
# [2, 0],
# [3, 0],
# [3, 4],
# [3, 2],
# [3, 1],
# [4, 2],
# [4, 1],
# [2, 1]])>
This should work in both TF1 and TF2.
If the length of x is known in advance, you don't need the while_loop and could simply precompute the indices in python then place them in a constant.
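As a sketch of that last suggestion, assuming the number of rows is fixed at 4, the index pairs can be generated with itertools.combinations and stored in a constant. The transpose at the end is one way to obtain the 2*6*2 layout asked for in the question.

```python
import itertools
import tensorflow as tf

x = tf.constant([[1, 3],
                 [2, 4],
                 [0, 2],
                 [0, 1]], dtype=tf.int32)

n = 4  # number of rows, assumed to be known in advance
# All (i, j) row pairs with i < j, in lexicographic order: shape (6, 2).
idx = tf.constant(list(itertools.combinations(range(n), 2)), dtype=tf.int32)

# tf.gather gives shape (pairs, 2, columns); move the column axis to the front.
z = tf.transpose(tf.gather(x, idx), (2, 0, 1))  # shape (2, 6, 2)
```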

Here is a way to do that without a loop:
import tensorflow as tf
x = tf.constant([[1, 3],
[2, 4],
[0, 2],
[0, 1]], dtype=tf.int32)
# Number of rows
n = tf.shape(x)[0]
# Grid of indices
ri = tf.range(0, n - 1)
rj = ri + 1
ii, jj = tf.meshgrid(ri, rj, indexing='ij')
# Stack together
grid = tf.stack([ii, jj], axis=-1)
# Get upper triangular part
m = ii < jj
idx = tf.boolean_mask(grid, m)
# Get values
g = tf.gather(x, idx, axis=0)
# Rearrange result
result = tf.transpose(g, [2, 0, 1])
print(result.numpy())
# [[[1 2]
# [1 0]
# [1 0]
# [2 0]
# [2 0]
# [0 0]]
#
# [[3 4]
# [3 2]
# [3 1]
# [4 2]
# [4 1]
# [2 1]]]

keras-gcn fit model ValueError

I'm using this library to create a model to learn graphs. Here is the code (from repository):
import numpy as np
from keras_gcn.backend import keras
from keras_gcn import GraphConv
# feature matrix
input_data = np.array([[[0, 1, 2],
[2, 3, 4],
[4, 5, 6],
[7, 7, 8]]])
# adjacency matrix
input_edge = np.array([[[1, 1, 1, 0],
[1, 1, 0, 0],
[1, 0, 1, 0],
[0, 0, 0, 1]]])
labels = np.array([[[1],
[0],
[1],
[0]]])
data_layer = keras.layers.Input(shape=(None, 3), name='Input-Data')
edge_layer = keras.layers.Input(shape=(None, None), dtype='int32', name='Input-Edge')
conv_layer = GraphConv(units=4, step_num=1, kernel_initializer='ones',
bias_initializer='ones', name='GraphConv')([data_layer, edge_layer])
model = keras.models.Model(inputs=[data_layer, edge_layer], outputs=conv_layer)
model.compile(optimizer='adam', loss='mae', metrics=['mae'])
model.fit([input_data, input_edge], labels)
However, when I run the code I get the following error:
ValueError: Error when checking target: expected GraphConv to have 3 dimensions, but got array with shape (4, 1)
while the shape of labels is (1, 4, 1)

You should one-hot encode your labels, something like the following:
labels = np.array([[[0, 1],
                    [1, 0],
                    [0, 1],
                    [1, 0]]])
Also, the number of units in the GraphConv layer should equal the number of unique labels, which is 2 in your case.
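If you go the one-hot route, the encoded array does not have to be typed by hand. A small NumPy sketch, assuming the labels are integer class ids in 0..num_classes-1 (num_classes is a name introduced here):

```python
import numpy as np

labels = np.array([[[1],
                    [0],
                    [1],
                    [0]]])          # shape (1, 4, 1)

num_classes = 2  # assumed number of distinct labels
# Drop the trailing axis, then pick the matching row of the identity matrix.
one_hot = np.eye(num_classes, dtype=int)[labels[..., 0]]   # shape (1, 4, 2)
```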

I think the issue is a mismatch between the shapes of your edge_layer and data_layer.
With keras.layers.Input you are giving data_layer shape=(None, 3) and edge_layer shape=(None, None).
Match the shapes and let me know how it goes.

Converting a list of numpy array to a single int numpy array

I have a list of numpy arrays, that I want to convert into a single int numpy array.
For example if I have 46 4 x 4 numpy arrays in a list of dimension 2 x 23, I want to convert it into a single integer numpy array of 2 x 23 x 4 x 4 dimension. I have found a way to do this by going through every single element and using numpy.stack(). Is there any better way?

You can simply use np.asarray like so
import numpy as np
list_of_lists = [[np.random.normal(0, 1, (4, 4)) for _ in range(23)]
for _ in range(2)]
a = np.asarray(list_of_lists)
a.shape
The function will infer the shape of the list of lists for you and create an appropriate array.
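Since the question asks for an integer array, note that np.asarray keeps the dtype of the inputs, so float arrays stay float. A minimal sketch of the extra conversion step, using made-up float data whose values happen to be whole numbers:

```python
import numpy as np

# Illustrative data: 2 x 23 nested list of 4 x 4 float arrays.
list_of_lists = [[np.full((4, 4), i * 23 + j, dtype=float) for j in range(23)]
                 for i in range(2)]

a = np.asarray(list_of_lists)   # shape (2, 23, 4, 4), dtype float64
a_int = a.astype(np.int64)      # astype truncates any fractional part toward zero
```

If the values are not already whole numbers, rounding with np.rint before astype may be preferable to plain truncation.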

Stack works for me:
In [191]: A,B,C = np.zeros((2,2),int),np.ones((2,2),int),np.arange(4).reshape(2,2)
In [192]: x = [[A,B,C],[C,B,A]]
In [193]: x
Out[193]:
[[array([[0, 0],
[0, 0]]), array([[1, 1],
[1, 1]]), array([[0, 1],
[2, 3]])], [array([[0, 1],
[2, 3]]), array([[1, 1],
[1, 1]]), array([[0, 0],
[0, 0]])]]
In [194]: np.stack(x)
Out[194]:
array([[[[0, 0],
[0, 0]],
[[1, 1],
[1, 1]],
[[0, 1],
[2, 3]]],
[[[0, 1],
[2, 3]],
[[1, 1],
[1, 1]],
[[0, 0],
[0, 0]]]])
In [195]: _.shape
Out[195]: (2, 3, 2, 2)
stack views x as a list of 2 items, and applies np.asarray to each.
In [198]: np.array(x[0]).shape
Out[198]: (3, 2, 2)
Then adds a dimension, (1,3,2,2), and concatenates on the first axis.
In this case np.array(x) works just as well
In [201]: np.array(x).shape
Out[201]: (2, 3, 2, 2)
