tf.gather_nd without "flattening" shape? - python

I'm still playing around with TensorFlow and have been trying to use the gather_nd op, but the return value is not in the shape/format I want...
Input Tensor: - shape: (2, 7, 4)
array([[[ 0., 0., 1., 2.],
[ 0., 0., 2., 2.],
[ 0., 0., 3., 3.],
[ 0., 0., 4., 3.],
[ 0., 0., 5., 4.],
[ 0., 0., 6., 4.],
[ 0., 0., 7., 5.]],
[[ 1., 1., 0., 2.],
[ 1., 2., 0., 2.],
[ 1., 3., 0., 3.],
[ 1., 4., 0., 3.],
[ 1., 5., 0., 4.],
[ 1., 6., 0., 5.],
[ 1., 7., 0., 5.]]], dtype=float32)
Indices returned by tf.where op: - shape: (3, 2)
array([[0, 0],
[0, 1],
[1, 0]])
tf.gather_nd results: (shape = [3, 4])
array([[ 0., 0., 1., 2.],
[ 0., 0., 2., 2.],
[ 1., 1., 0., 2.]], dtype=float32)
desired results: shape (2, ragged, 4)
array([[[ 0., 0., 1., 2.],
[ 0., 0., 2., 2.]],
[[ 1., 1., 0., 2.]]], dtype=float32)
What's the best way to achieve this, keeping in mind that tf.where produces dynamic shapes, with no guarantee of a consistent size along the second dimension (axis=1)?
NB: Ignore this question - See my answer

I think it's a TensorFlow version problem. In my version (1.2.1), I get exactly the desired output from your inputs. However, I also tried the following code, written against the older API.
import tensorflow as tf
indices = [[[0, 0, 0], [0, 0, 1], [0, 0, 2], [0, 0, 3]],
[[0, 1, 0], [0, 1, 1], [0, 1, 2], [0, 1, 3]],
[[1, 0, 0], [1, 0, 1], [1, 0, 2], [1, 0, 3]]]
params = [[[ 0., 0., 1., 2.],
[ 0., 0., 2., 2.],
[ 0., 0., 3., 3.],
[ 0., 0., 4., 3.],
[ 0., 0., 5., 4.],
[ 0., 0., 6., 4.],
[ 0., 0., 7., 5.]],
[[ 1., 1., 0., 2.],
[ 1., 2., 0., 2.],
[ 1., 3., 0., 3.],
[ 1., 4., 0., 3.],
[ 1., 5., 0., 4.],
[ 1., 6., 0., 5.],
[ 1., 7., 0., 5.]]]
output = tf.gather_nd(params, indices)
with tf.Session() as sess:
    print(sess.run(output))
Hope this helps.

I realized the flaw in my question: the number of index tuples whose first dimension is 0 differs from the number whose first dimension is 1, so the result cannot be a regular dense tensor.
I'm not sure that what I'm asking is feasible ...
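For completeness: in newer TensorFlow (2.x), this kind of variable-length result is exactly what tf.RaggedTensor represents (e.g. via tf.ragged.boolean_mask). The same grouping can also be sketched in plain NumPy by splitting the dense gather result wherever the leading index changes — a sketch using the arrays from the question:

```python
import numpy as np

params = np.array([[[0., 0., 1., 2.], [0., 0., 2., 2.], [0., 0., 3., 3.],
                    [0., 0., 4., 3.], [0., 0., 5., 4.], [0., 0., 6., 4.],
                    [0., 0., 7., 5.]],
                   [[1., 1., 0., 2.], [1., 2., 0., 2.], [1., 3., 0., 3.],
                    [1., 4., 0., 3.], [1., 5., 0., 4.], [1., 6., 0., 5.],
                    [1., 7., 0., 5.]]], dtype=np.float32)
indices = np.array([[0, 0], [0, 1], [1, 0]])  # as returned by tf.where

gathered = params[indices[:, 0], indices[:, 1]]   # dense (3, 4) block
# Split into a ragged list wherever the leading index changes
cuts = np.flatnonzero(np.diff(indices[:, 0])) + 1
ragged = np.split(gathered, cuts)                 # list of (2, 4) and (1, 4)
```

The result is a Python list of arrays with differing first dimensions, which is the closest NumPy analogue of the desired (2, ragged, 4) shape.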

Related

How to distribute a Numpy array along the diagonal of an array of higher dimension?

I have three two-dimensional NumPy arrays x, w, d and want to create a fourth one called a. w and d define only the shape of a, which is d.shape + w.shape. I want to place x in the entries of a, with zeros elsewhere.
Specifically, I want a loop-free version of this code:
a = np.zeros(d.shape + w.shape)
for j in range(d.shape[1]):
a[:,j,:,j] = x
For example, given:
x = np.array([
[2, 3],
[1, 1],
[8,10],
[0, 1]
])
w = np.array([
[ 0, 1, 1],
[-1,-2, 1]
])
d = np.matmul(x,w)
I want a to be
array([[[[ 2., 0., 0.],
[ 3., 0., 0.]],
[[ 0., 2., 0.],
[ 0., 3., 0.]],
[[ 0., 0., 2.],
[ 0., 0., 3.]]],
[[[ 1., 0., 0.],
[ 1., 0., 0.]],
[[ 0., 1., 0.],
[ 0., 1., 0.]],
[[ 0., 0., 1.],
[ 0., 0., 1.]]],
[[[ 8., 0., 0.],
[10., 0., 0.]],
[[ 0., 8., 0.],
[ 0., 10., 0.]],
[[ 0., 0., 8.],
[ 0., 0., 10.]]],
[[[ 0., 0., 0.],
[ 1., 0., 0.]],
[[ 0., 0., 0.],
[ 0., 1., 0.]],
[[ 0., 0., 0.],
[ 0., 0., 1.]]]])
This answer inspired the following solution:
# shape a: (4, 3, 2, 3)
# shape x: (4, 2)
a = np.zeros(d.shape + w.shape)
a[:, np.arange(a.shape[1]), :, np.arange(a.shape[3])] = x
It uses NumPy's broadcasting (see here or here) in combination with advanced indexing to enlarge x to fit the slicing.
I happen to have an even simpler solution: a = np.tensordot(x, np.identity(3), axes = 0).swapaxes(1,2)
The size of the identity matrix is determined by the number of times you wish to repeat the elements of x.
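As a sanity check (a hypothetical self-contained script, not part of either answer), both constructions can be run on the example data and compared:

```python
import numpy as np

x = np.array([[2, 3], [1, 1], [8, 10], [0, 1]])
w = np.array([[0, 1, 1], [-1, -2, 1]])
d = np.matmul(x, w)                      # shape (4, 3)

# Advanced-indexing solution: place x on the "diagonal" of axes 1 and 3
a1 = np.zeros(d.shape + w.shape)
a1[:, np.arange(a1.shape[1]), :, np.arange(a1.shape[3])] = x

# tensordot solution: outer product with the identity, then swap axes 1 and 2
a2 = np.tensordot(x, np.identity(3), axes=0).swapaxes(1, 2)
```

Both produce the same (4, 3, 2, 3) array, since a[i, j, k, l] = x[i, k] when j == l and 0 otherwise in either formulation.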

np.ufunc.at for 2D array

In order to compute a confusion matrix (not the accuracy), a loop over the predicted and true labels may seem to be needed. How can this be done in a NumPy manner, given that the following code does not give the needed result?
>> a = np.zeros((5, 5))
>> indices = np.array([
[0, 0],
[2, 2],
[4, 4],
[0, 0],
[2, 2],
[4, 4],
])
np.add.at(a, indices, 1)
>> a
>> array([
[4., 4., 4., 4., 4.],
[0., 0., 0., 0., 0.],
[4., 4., 4., 4., 4.],
[0., 0., 0., 0., 0.],
[4., 4., 4., 4., 4.]
])
# Wanted
>> array([
[2., 0., 0., 0., 0.],
[0., 0., 0., 0., 0.],
[0., 0., 2., 0., 0.],
[0., 0., 0., 0., 0.],
[0., 0., 0., 0., 2.]
])
The docs say: "If first operand has multiple dimensions, indices can be a tuple of array like index objects or slice objects." Passing the row and column indices as such a tuple gives the wanted result:
np.add.at(a, (indices[:, 0], indices[:, 1]), 1)
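Put together as a runnable snippet (same data as in the question):

```python
import numpy as np

a = np.zeros((5, 5))
indices = np.array([[0, 0], [2, 2], [4, 4],
                    [0, 0], [2, 2], [4, 4]])

# A (rows, cols) tuple addresses individual cells, so duplicate index
# pairs accumulate in place instead of broadcasting over whole rows
np.add.at(a, (indices[:, 0], indices[:, 1]), 1)
```

Passing the (6, 2) array directly instead treats each row as an index into axis 0 only, which is why the original attempt incremented entire rows.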

Alternatives to np.newaxis() for saving memory when comparing arrays

I want to compare each vector from one array with all vectors from another array, and count how many symbols match per vector. Let me show an example.
I have two arrays, a and b.
For each vector in a, I want to compare it with each vector in b. I then want to return a new array of shape (len(a), 14), where each row holds the number of times the corresponding vector in a had 0, 1, 2, ..., 12, 13 matches with vectors from b. The wished-for results are shown in array c below.
I have already solved this problem using np.newaxis (see my function below), but it takes up so much memory that my computer can't handle it when a and b get larger. Hence, I am looking for a more efficient way to do this calculation, as adding dimensions to the vectors hurts my memory big time. One alternative is a normal for loop, but that method is rather slow.
Is it possible to make these calculations more efficient?
a = array([[1., 1., 1., 2., 1., 1., 2., 1., 0., 2., 2., 2., 2.],
[0., 2., 2., 0., 1., 1., 0., 1., 1., 0., 2., 1., 2.],
[0., 0., 0., 1., 1., 0., 2., 1., 2., 0., 1., 2., 2.],
[1., 2., 2., 0., 1., 1., 0., 2., 0., 1., 1., 0., 2.],
[1., 2., 0., 2., 2., 0., 2., 0., 0., 1., 2., 0., 0.]])
b = array([[0., 2., 0., 0., 0., 0., 0., 1., 1., 1., 0., 2., 2.],
[1., 0., 1., 2., 2., 0., 1., 1., 1., 1., 2., 1., 2.],
[1., 2., 1., 2., 0., 0., 0., 1., 1., 2., 2., 0., 2.],
[0., 1., 2., 0., 2., 1., 0., 1., 2., 0., 0., 0., 2.],
[0., 2., 2., 1., 2., 1., 0., 1., 1., 1., 2., 2., 2.],
[0., 2., 2., 1., 0., 1., 1., 0., 1., 0., 2., 2., 1.],
[1., 0., 2., 2., 0., 1., 0., 1., 0., 1., 1., 2., 2.],
[1., 1., 0., 2., 1., 1., 1., 1., 0., 2., 0., 2., 2.],
[1., 2., 0., 0., 0., 1., 2., 1., 0., 1., 2., 0., 1.],
[1., 2., 1., 2., 2., 1., 2., 0., 2., 0., 0., 1., 1.]])
c = array([[0, 0, 0, 2, 1, 2, 2, 2, 0, 0, 1, 0, 0, 0],
[0, 0, 0, 0, 2, 3, 1, 2, 1, 1, 0, 0, 0, 0],
[0, 0, 0, 3, 2, 4, 1, 0, 0, 0, 0, 0, 0, 0],
[0, 0, 0, 3, 0, 3, 2, 1, 1, 0, 0, 0, 0, 0],
[0, 0, 1, 1, 4, 0, 3, 0, 1, 0, 0, 0, 0, 0]])
My solution:
def new_method_test(a, b):
    test = (a[:, np.newaxis] == b).sum(axis=2)
    zero = (test == 0).sum(axis=1)
    one = (test == 1).sum(axis=1)
    two = (test == 2).sum(axis=1)
    three = (test == 3).sum(axis=1)
    four = (test == 4).sum(axis=1)
    five = (test == 5).sum(axis=1)
    six = (test == 6).sum(axis=1)
    seven = (test == 7).sum(axis=1)
    eight = (test == 8).sum(axis=1)
    nine = (test == 9).sum(axis=1)
    ten = (test == 10).sum(axis=1)
    eleven = (test == 11).sum(axis=1)
    twelve = (test == 12).sum(axis=1)
    thirteen = (test == 13).sum(axis=1)
    c = np.concatenate((zero, one, two, three, four, five, six, seven,
                        eight, nine, ten, eleven, twelve, thirteen),
                       axis=0).reshape(14, len(a)).T
    return c
Thank you for your help.
Welcome to Stack Overflow! I think a for loop is the way to go if you want to save memory (and it's really not that slow). Additionally, you can go directly from one row of test to your c output matrix with np.bincount. This method should be approximately as fast as yours while using significantly less memory.
import numpy as np
c = np.empty(a.shape, dtype=int)
for i in range(a.shape[0]):
test_one_vector = (a[i,:]==b).sum(axis=1)
c[i,:] = np.bincount(test_one_vector, minlength=a.shape[1])
Small side note: if you are really dealing with floating-point numbers in a and b, you should consider dropping the equality check (==) in favor of a proximity check such as np.isclose.
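To illustrate, a hypothetical end-to-end check on random data, comparing the row-at-a-time bincount loop against the memory-hungry broadcast version:

```python
import numpy as np

rng = np.random.default_rng(0)
a = rng.integers(0, 3, size=(5, 13)).astype(float)
b = rng.integers(0, 3, size=(10, 13)).astype(float)

# Broadcast reference: allocates a (len(a), len(b), 13) boolean array
test = (a[:, np.newaxis] == b).sum(axis=2)
c_ref = np.stack([(test == k).sum(axis=1) for k in range(14)], axis=1)

# Loop + bincount: peak extra memory is one (len(b), 13) comparison
c = np.empty((a.shape[0], 14), dtype=int)
for i in range(a.shape[0]):
    matches = (a[i, :] == b).sum(axis=1)
    c[i, :] = np.bincount(matches, minlength=14)
```

Each vector of b lands in exactly one match-count bucket per row of a, so every row of c sums to len(b).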

Summing positive and negative elements from two NumPy arrays

>>> x1
array([[ 0., -1., 2.],
[ 3., -4., 2.],
[ -2., 1., -8.]])
>>> x3
array([[ 0., -5., 2.],
[ 3., 0., -3.],
[ 3., 2., 8.]])
I need two matrices to be output, S and T, such that S is the element-wise sum of all positive values in x1 and x3, and T is the element-wise sum of all negative values in x1 and x3.
For example:
S = array([ [ 0., 0., 4.],
[ 6., 0., 2.],
[ 3., 3., 8.]])
T = array([ [ 0., -6., 0.],
[ 0., -4., -3.],
[ -2., 0., -8.]])
I am using Python 2.6.7.
You can use np.clip() to selectively add
In [140]: x1.clip(min=0) + x3.clip(min=0)
Out[140]:
array([[ 0., 0., 4.],
[ 6., 0., 2.],
[ 3., 3., 8.]])
In [141]: x1.clip(max=0) + x3.clip(max=0)
Out[141]:
array([[ 0., -6., 0.],
[ 0., -4., -3.],
[-2., 0., -8.]])
As well as clip, you can do this by multiplying by boolean arrays:
>>> x1 * (x1 > 0) + x3 * (x3 > 0)
array([[ 0., -0., 4.],
[ 6., 0., 2.],
[ 3., 3., 8.]])
>>> x1 * (x1 <= 0) + x3 * (x3 <= 0)
array([[ 0., -6., 0.],
[ 0., -4., -3.],
[-2., 0., -8.]])
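A third spelling of the same idea (my sketch, not taken from the answers above) uses np.where to zero out the unwanted sign before adding:

```python
import numpy as np

x1 = np.array([[0., -1., 2.], [3., -4., 2.], [-2., 1., -8.]])
x3 = np.array([[0., -5., 2.], [3., 0., -3.], [3., 2., 8.]])

S = np.where(x1 > 0, x1, 0) + np.where(x3 > 0, x3, 0)  # positive parts
T = np.where(x1 < 0, x1, 0) + np.where(x3 < 0, x3, 0)  # negative parts
```

Like the boolean-multiply version, this avoids the -0. entries that x1 * (x1 > 0) can produce, and it reads closest to the verbal specification.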

Iterate with binary structure over numpy array to get cell sums

The scipy package has a function to define a binary structure (such as a taxicab (2,1) or a chessboard (2,2) neighborhood).
import numpy
from scipy import ndimage
a = numpy.zeros((6,6), dtype=int)
a[1:5, 1:5] = 1;a[3,3] = 0 ; a[2,2] = 2
s = ndimage.generate_binary_structure(2,2) # Binary structure
#.... Calculate Sum of
result_array = numpy.zeros_like(a)
What I want is to iterate over all cells of this array with the given structure s. For each cell, I want to apply a function (for example, sum) to the values of all cells covered by the binary structure and store the result at the corresponding index of an empty array.
For example:
array([[0, 0, 0, 0, 0, 0],
[0, 1, 1, 1, 1, 0],
[0, 1, 2, 1, 1, 0],
[0, 1, 1, 0, 1, 0],
[0, 1, 1, 1, 1, 0],
[0, 0, 0, 0, 0, 0]])
# The array a. The value in cell (1, 2) is currently 1. Given the structure s and an example function such as sum, the value in the resulting array (result_array) becomes 7 (or 6 if the current cell value is excluded).
Someone got an idea?
For the particular case of sums, you could use ndimage.convolve:
In [42]: import numpy as np
In [43]: a = np.zeros((6,6), dtype=int)
a[1:5, 1:5] = 1;
a[3,3] = 0;
a[2,2] = 2
In [48]: s = ndimage.generate_binary_structure(2,2) # Binary structure
In [49]: ndimage.convolve(a,s)
Out[49]:
array([[1, 2, 3, 3, 2, 1],
[2, 5, 7, 7, 4, 2],
[3, 7, 9, 9, 5, 3],
[3, 7, 9, 9, 5, 3],
[2, 4, 5, 5, 3, 2],
[1, 2, 3, 3, 2, 1]])
For the particular case of products, you could use the fact that log(a*b) = log(a)+log(b) to convert the problem back to one involving sums. For example, if we wanted to "product-convolve" b:
b = a[1:-1, 1:-1]
print(b)
# [[1 1 1 1]
# [1 2 1 1]
# [1 1 0 1]
# [1 1 1 1]]
we could compute:
print(np.exp(ndimage.convolve(np.log(b), s, mode = 'constant')))
# [[ 2. 2. 2. 1.]
# [ 2. 0. 0. 0.]
# [ 2. 0. 0. 0.]
# [ 1. 0. 0. 0.]]
The situation becomes more complicated if b includes negative values:
b[0,1] = -1
print(b)
# [[ 1 -1 1 1]
# [ 1 2 1 1]
# [ 1 1 0 1]
# [ 1 1 1 1]]
but not impossible:
logb = np.log(b.astype('complex'))
real, imag = logb.real, logb.imag
print(np.real_if_close(
    np.exp(sum(j * ndimage.convolve(x, s, mode='constant')
               for x, j in zip((real, imag), (1, 1j))))))
# [[-2. -2. -2. 1.]
# [-2. -0. -0. 0.]
# [ 2. 0. 0. 0.]
# [ 1. 0. 0. 0.]]
It's easier if you use a 2-deep wall of zeroes:
In [11]: a0
Out[11]:
array([[ 0., 0., 0., 0., 0., 0., 0., 0.],
[ 0., 0., 0., 0., 0., 0., 0., 0.],
[ 0., 0., 1., 1., 1., 1., 0., 0.],
[ 0., 0., 1., 2., 1., 1., 0., 0.],
[ 0., 0., 1., 1., 0., 1., 0., 0.],
[ 0., 0., 1., 1., 1., 1., 0., 0.],
[ 0., 0., 0., 0., 0., 0., 0., 0.],
[ 0., 0., 0., 0., 0., 0., 0., 0.]])
In [12]: b0 = np.zeros_like(a0)
In [13]: for i in range(1,len(a0)-1):
   ....:     for j in range(1,len(a0)-1):
   ....:         b0[i,j] = (a0[i-1:i+2, j-1:j+2] * s).sum()
This enables you to multiply the two sub-matrices together and sum, as desired. (You could also do something more elaborate here...)
In [14]: b0
Out[14]:
array([[ 0., 0., 0., 0., 0., 0., 0., 0.],
[ 0., 1., 2., 3., 3., 2., 1., 0.],
[ 0., 2., 5., 7., 7., 4., 2., 0.],
[ 0., 3., 7., 9., 9., 5., 3., 0.],
[ 0., 3., 7., 9., 9., 5., 3., 0.],
[ 0., 2., 4., 5., 5., 3., 2., 0.],
[ 0., 1., 2., 3., 3., 2., 1., 0.],
[ 0., 0., 0., 0., 0., 0., 0., 0.]])
In [15]: b0[1:len(b0)-1, 1:len(b0)-1]
Out[15]:
array([[ 1., 2., 3., 3., 2., 1.],
[ 2., 5., 7., 7., 4., 2.],
[ 3., 7., 9., 9., 5., 3.],
[ 3., 7., 9., 9., 5., 3.],
[ 2., 4., 5., 5., 3., 2.],
[ 1., 2., 3., 3., 2., 1.]])
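For reductions other than sums, scipy.ndimage.generic_filter applies an arbitrary callable to the cells selected by a footprint; with np.sum it reproduces the convolve result above (a sketch, using zero padding at the border):

```python
import numpy as np
from scipy import ndimage

a = np.zeros((6, 6), dtype=int)
a[1:5, 1:5] = 1
a[3, 3] = 0
a[2, 2] = 2
s = ndimage.generate_binary_structure(2, 2)  # 3x3 chessboard neighborhood

# generic_filter hands the footprint's cell values to the callable,
# so any reduction (sum, max, median, a custom function) can be used
result = ndimage.generic_filter(a, np.sum, footprint=s,
                                mode='constant', cval=0)
```

generic_filter is much slower than convolve (it calls back into Python per cell), but it answers the general "apply a function over a structure" question directly.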
