Tensorflow split or unstack to work with interleaved values - python

Say I have a Tensorflow tensor l with shape [20,] and these are 10 coordinates packed as [x1,y1,x2,y2,...]. I need access to [x1,x2,...] and [y1,y2,...] to modify their values (e.g., rotate, scale, shift) and then repackage as [x1',y1',x2',y2',...].
I can reshape, tf.reshape(l, (10, 2)), but then I'm not sure whether to use split or unstack and what the arguments should be. When should one use split instead of unstack? And then how should the modified values be repacked so they're in the original format?

This is the kind of thing that can easily be verified with TensorFlow's eager execution mode:
import numpy as np
import tensorflow as tf
tf.enable_eager_execution()  # TF 1.x only; in TF 2.x eager execution is enabled by default, so drop this line
l = np.arange(20)
y = tf.reshape(l, [10, 2])
a = tf.split(y, num_or_size_splits=2, axis=1)
b = tf.unstack(y, axis=1)
print('reshaped:', y, sep='\n', end='\n\n')
for operation, c in zip(('split', 'unstack'), (a, b)):
    print('%s:' % operation, c, sep='\n', end='\n\n')
reshaped:
tf.Tensor(
[[ 0 1]
[ 2 3]
...
[16 17]
[18 19]], shape=(10, 2), dtype=int64)
split:
[<tf.Tensor: id=5, shape=(10, 1), dtype=int64, numpy=
array([[ 0],
[ 2],
...
[16],
[18]])>,
<tf.Tensor: id=6, shape=(10, 1), dtype=int64, numpy=
array([[ 1],
[ 3],
...
[17],
[19]])>]
unstack:
[<tf.Tensor: id=7, shape=(10,), dtype=int64, numpy=array([ 0, 2, ... 16, 18])>,
<tf.Tensor: id=8, shape=(10,), dtype=int64, numpy=array([ 1, 3, ... 17, 19])>]
So with these parameters the two are nearly the same, except for the following:
tf.split always splits the tensor along axis into num_or_size_splits pieces, a number that can differ from the dimension size shape[axis]. It therefore has to retain the original rank, outputting tensors of shape [10, n / num_or_size_splits] = [10, 2 / 2] = [10, 1].
Repacking can be performed by concatenating all split parts in a:
c=tf.concat(a, axis=1)
print(c)
array([[ 0, 1],
[ 2, 3],
...
[16, 17],
[18, 19]])>
tf.unstack splits the tensor along axis into exactly shape[axis] pieces, and can therefore unambiguously reduce the rank by 1, resulting in tensors of shape [10].
Repacking can be performed by stacking all split parts in b:
c=tf.stack(b, axis=1)
print(c)
array([[ 0, 1],
[ 2, 3],
...
[16, 17],
[18, 19]])>
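Putting the pieces together for the original question, here is a minimal sketch of the full round trip: de-interleave, transform, re-interleave. It assumes TensorFlow 2.x with eager execution, and the transform (scale x by 2 and shift by 1, negate y) is a made-up example, not something from the question.

```python
import tensorflow as tf

# Ten interleaved coordinates: [x1, y1, x2, y2, ...]
l = tf.range(20, dtype=tf.float32)

# De-interleave: reshape to (10, 2), then unstack the last axis into x and y.
x, y = tf.unstack(tf.reshape(l, [-1, 2]), axis=1)

# Illustrative transform (an assumption): scale and shift x, mirror y.
x_new = 2.0 * x + 1.0
y_new = -y

# Re-interleave: stack back to (10, 2), then flatten to the original layout.
packed = tf.reshape(tf.stack([x_new, y_new], axis=1), [-1])
print(packed.numpy())
```

tf.unstack is convenient here because it yields rank-1 tensors that can be used directly in arithmetic; tf.split followed by tf.concat would work just as well but keeps a trailing axis of size 1.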

Related

Tensorflow gives 0 results

I am learning Tensorflow from this github
https://colab.research.google.com/github/instillai/TensorFlow-Course/blob/master/codes/ipython/1-basics/tensors.ipynb#scrollTo=TKX2U0Imcm7d
Here is an easy tutorial
import numpy as np
import tensorflow as tf
x = tf.constant([[1, 1],
[1, 1]])
y = tf.constant([[2, 4],
[6, 8]])
# Add two tensors
print(tf.add(x, y), "\n")
# Multiply two tensors (matrix product)
print(tf.matmul(x, y), "\n")
What I expect is
tf.Tensor(
[[3 5]
[7 9]], shape=(2, 2), dtype=int32)
tf.Tensor(
[[ 8 12]
[ 8 12]], shape=(2, 2), dtype=int32)
However, the results are
Tensor("Add_3:0", shape=(2, 2), dtype=int32)
Tensor("MatMul_3:0", shape=(2, 2), dtype=int32)
It does not mean that the values of the tensors are zero. Add_3:0 and MatMul_3:0 are just names of the tensors and you can only use print in Eager Execution to see the values of the tensors. In Graph mode you should use tf.print and you should see the results:
import tensorflow as tf
x = tf.constant([[1, 1],
[1, 1]])
y = tf.constant([[2, 4],
[6, 8]])
print(tf.add(x, y), "\n")
print(tf.matmul(x, y), "\n")
# Graph mode
@tf.function
def calculate():
    x = tf.constant([[1, 1],
                     [1, 1]])
    y = tf.constant([[2, 4],
                     [6, 8]])
    tf.print(tf.add(x, y), "\n")
    tf.print(tf.matmul(x, y), "\n")
    return x, y
_, _ = calculate()
tf.Tensor(
[[3 5]
[7 9]], shape=(2, 2), dtype=int32)
tf.Tensor(
[[ 8 12]
[ 8 12]], shape=(2, 2), dtype=int32)
[[3 5]
[7 9]]
[[8 12]
[8 12]]
If you use Python's print instead of tf.print inside calculate, you will see only the symbolic tensors:
Tensor("Add:0", shape=(2, 2), dtype=int32)
Tensor("MatMul:0", shape=(2, 2), dtype=int32)
See this guide for more information.
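A short sketch of the difference, assuming TensorFlow 2.x (where eager execution is the default); the function name add_and_matmul is made up for illustration. Inside a tf.function, Python's print runs only while the function is being traced and sees symbolic tensors, whereas the values returned from the call are concrete and can be inspected with .numpy():

```python
import tensorflow as tf

x = tf.constant([[1, 1], [1, 1]])
y = tf.constant([[2, 4], [6, 8]])

@tf.function
def add_and_matmul(a, b):
    s = tf.add(a, b)
    p = tf.matmul(a, b)
    # Python print runs at trace time and shows a symbolic tensor, not values.
    print("traced:", s)
    # tf.print is part of the graph and prints the actual values on each call.
    tf.print("sum:", s)
    return s, p

# The returned tensors are concrete, so .numpy() gives their values.
total, product = add_and_matmul(x, y)
print(total.numpy())
print(product.numpy())
```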

How to do numpy like conditional assignment in Tensorflow

The following is how it works in Numpy
import numpy as np
vals_for_fives = [12, 18, 22, 33]
arr = np.array([5, 2, 3, 5, 5, 5])
arr[arr == 5] = vals_for_fives # It is guaranteed that length of vals_for_fives is equal to the number of fives in arr
# now the value of arr is [12, 2, 3, 18, 22, 33]
For broadcastable or constant assignment we can use where() and assign() in Tensorflow. How can we achieve the above scenario in TF?
tf.experimental.numpy.where is a thing in tensorflow v2.5.
But for now you could do this:
First find the positions of the 5's:
arr = np.array([5, 2, 3, 5, 5, 5])
where = tf.where(arr==5)
where = tf.cast(where, tf.int32)
print(where)
# <tf.Tensor: id=91, shape=(4, 1), dtype=int32, numpy=
array([[0],
[3],
[4],
[5]])>
Then use scatter_nd to "replace" elements by index:
tf.scatter_nd(where, tf.constant([12,18,22,33]), tf.constant([6]))
# <tf.Tensor: id=94, shape=(6,), dtype=int32, numpy=array([12, 0, 0, 18, 22, 33])>
Do a similar thing for the entries that were not 5 to find the missing tensor:
tf.scatter_nd(tf.constant([[1], [2]]), tf.constant([2,3]), tf.constant([6]))
# <tf.Tensor: id=98, shape=(6,), dtype=int32, numpy=array([0, 2, 3, 0, 0, 0])>
Then sum the two tensors to get:
<tf.Tensor: id=113, shape=(6,), dtype=int32, numpy=array([12, 2, 3, 18, 22, 33])>
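The same result can be obtained in one step with tf.tensor_scatter_nd_update, which writes the replacement values at the positions found by tf.where and leaves all other entries untouched. This is a minimal sketch, assuming TensorFlow 2.x with eager execution; the variable names are illustrative.

```python
import tensorflow as tf

arr = tf.constant([5, 2, 3, 5, 5, 5])
vals_for_fives = tf.constant([12, 18, 22, 33])

# Indices of the entries equal to 5, shape (4, 1).
idx = tf.where(tf.equal(arr, 5))

# Write the replacement values at those indices; other entries are kept.
result = tf.tensor_scatter_nd_update(arr, idx, vals_for_fives)
print(result.numpy())
```

As in the NumPy version, this relies on vals_for_fives having exactly as many elements as there are fives in arr.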

Applying a cutoff value to a Tensor

I want to apply a threshold to 1 column in a 2D tensor. Any value below the cutoff would be listed as null or zero. I have tried to avoid looping through the tensor and I want the input & output tensor to have the same shape.
Here is the code:
NFValue = tf.Variable(1.,dtype=tf.float64,constraint=lambda t: tf.clip_by_value(t, 10, 20))
col1 = tf.gather(x, [0], axis=0)
col2 = tf.gather(x, [1], axis=0)
y = tf.fill(tf.shape(col2), NFValue) # creates a tensor of the same size as X, with Cutoff
y = tf.cast(y, np.float32) # converts that tensor into the correct type for comparision.
NewCol2 = tf.boolean_mask(col2, tf.math.greater(col2, y))
return tf.concat([col1[0,:], NewCol2], axis=0)
The problem is that tf.boolean_mask() returns a tensor with just the values which were greater than NFValue, so the shape has changed. tf.math.greater returns a boolean tensor of the correct shape, but then I would need to loop through the tensor to apply it.
I have tried several different options around this. I have looked at slicing, tf.scan and a couple of different functions. I am expecting there to be a canned solution here.
Use tf.where
import tensorflow as tf
x = tf.reshape(tf.range(9), (3, 3))
<tf.Tensor: shape=(3, 3), dtype=int32, numpy=
array([[0, 1, 2],
[3, 4, 5],
[6, 7, 8]])>
tf.where(x > 5, x, 0)
<tf.Tensor: shape=(3, 3), dtype=int32, numpy=
array([[0, 0, 0],
[0, 0, 0],
[6, 7, 8]])>
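To apply the cutoff to a single column only, as the question asks, one option is to threshold that column with tf.where and stack the columns back together. This is a sketch under assumptions: the data, the cutoff value and the choice of column 1 along axis 1 are all made up for illustration.

```python
import tensorflow as tf

x = tf.constant([[1.0, 5.0],
                 [2.0, 15.0],
                 [3.0, 25.0]])
cutoff = 10.0  # hypothetical cutoff value

col0 = x[:, 0]
col1 = x[:, 1]

# Keep values above the cutoff, replace the rest with zero; shape is unchanged.
col1_cut = tf.where(col1 > cutoff, col1, tf.zeros_like(col1))

# Reassemble the two columns into a tensor with the original shape.
result = tf.stack([col0, col1_cut], axis=1)
print(result.numpy())
```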

Combine arbitrary shaped tensors

I'd like to combine two variable length tensors.
Since they don't match in shape I can't use tf.concat or tf.stack.
So I thought I'd flatten one and then append it to each element of the other - but I don't see how to do that.
For example,
a = [ [1,2], [3,4] ]
flat_b = [5, 6]
combine(a, flat_b) would be [ [ [1,5,6], [2,5,6] ],
[ [3,5,6], [4,5,6] ] ]
Is there a method like this?
Use tf.map_fn with tf.concat. Example code:
import tensorflow as tf
a = tf.constant([ [1,2], [3,4] ])
flat_b = [5, 6]
flat_a = tf.reshape(a, (tf.reduce_prod(a.shape).numpy(), ))[:, tf.newaxis]
print(flat_a)
c = tf.map_fn(fn=lambda t: tf.concat([t, flat_b], axis=0), elems=flat_a)
c = tf.reshape(c, (-1, a.shape[1], c.shape[1]))
print(c)
Outputs:
tf.Tensor(
[[1]
[2]
[3]
[4]], shape=(4, 1), dtype=int32)
tf.Tensor(
[[[1 5 6]
[2 5 6]]
[[3 5 6]
[4 5 6]]], shape=(2, 2, 3), dtype=int32)
Here's a somewhat simpler version of the previous answer. Rather than reshaping several times, I prefer to use tf.expand_dims and tf.stack. The latter adds a dimension so that's one less call to tf.reshape, which can be confusing.
import tensorflow as tf
a = tf.constant([[1,2], [3,4]])
b = [5, 6]
flat_a = tf.reshape(a, [-1])
c = tf.map_fn(lambda x: tf.concat([[x], b], axis=0), flat_a)
c = tf.stack(tf.split(c, num_or_size_splits=len(a)), axis=0)
<tf.Tensor: shape=(2, 2, 3), dtype=int32, numpy=
array([[[1, 5, 6],
[2, 5, 6]],
[[3, 5, 6],
[4, 5, 6]]])>
You could go through element-wise. In list form you would do something like out[i][j] = [a[i][j]]+flat_b starting from out being the same shape as a. This gets to the form you wanted. I'm not sure if there is this sort of element-wise concatenation in the tensorflow library.
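One way to get this element-wise concatenation without tf.map_fn is to broadcast flat_b to the shape of a and concatenate along a new last axis. This is a minimal sketch, assuming TensorFlow 2.x; the intermediate names a_exp and b_tiled are illustrative.

```python
import tensorflow as tf

a = tf.constant([[1, 2], [3, 4]])
flat_b = tf.constant([5, 6])

# Give every element of a its own trailing axis: shape (2, 2, 1).
a_exp = tf.expand_dims(a, axis=-1)

# Repeat flat_b for every element of a: shape (2, 2, 2).
target_shape = tf.concat([tf.shape(a), tf.shape(flat_b)], axis=0)
b_tiled = tf.broadcast_to(flat_b, target_shape)

# Concatenate along the last axis: shape (2, 2, 3).
c = tf.concat([a_exp, b_tiled], axis=-1)
print(c.numpy())
```

Because everything is expressed as whole-tensor operations, this avoids the per-element Python lambda that tf.map_fn needs.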

Broadcasting with ragged tensor

Define x as:
>>> import tensorflow as tf
>>> x = tf.constant([1, 2, 3])
Why does this normal tensor multiplication work fine with broadcasting:
>>> tf.constant([[1, 2, 3], [4, 5, 6]]) * tf.expand_dims(x, axis=0)
<tf.Tensor: shape=(2, 3), dtype=int32, numpy=
array([[ 1, 4, 9],
[ 4, 10, 18]], dtype=int32)>
while this one with a ragged tensor does not?
>>> tf.ragged.constant([[1, 2, 3], [4, 5, 6]]) * tf.expand_dims(x, axis=0)
*** tensorflow.python.framework.errors_impl.InvalidArgumentError: Expected 'tf.Tensor(False, shape=(), dtype=bool)' to be true. Summarized data: b'Unable to broadcast: dimension size mismatch in dimension'
1
b'lengths='
3
b'dim_size='
3, 3
How can I get a 1-D tensor to broadcast over a 2-D ragged tensor? (I am using TensorFlow 2.1.)
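The problem is resolved if you pass ragged_rank=0 to tf.ragged.constant, as shown below. Note that with ragged_rank=0 the result is an ordinary tf.Tensor (see the output further down), so this applies only when all rows have the same length: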
tf.ragged.constant([[1, 2, 3], [4, 5, 6]], ragged_rank=0) * tf.expand_dims(x, axis=0)
Complete working code is:
%tensorflow_version 2.x
import tensorflow as tf
x = tf.constant([1, 2, 3])
print(tf.ragged.constant([[1, 2, 3], [4, 5, 6]], ragged_rank=0) * tf.expand_dims(x, axis=0))
Output of the above code is:
tf.Tensor(
[[ 1 4 9]
[ 4 10 18]], shape=(2, 3), dtype=int32)
One more point.
Broadcasting is the process of making tensors with different shapes have compatible shapes for elementwise operations, so there is no need to call tf.expand_dims explicitly; TensorFlow takes care of it.
So the code below works and demonstrates broadcasting well:
%tensorflow_version 2.x
import tensorflow as tf
x = tf.constant([1, 2, 3])
print(tf.ragged.constant([[1, 2, 3], [4, 5, 6]], ragged_rank=0) * x)
Output of the above code is:
tf.Tensor(
[[ 1 4 9]
[ 4 10 18]], shape=(2, 3), dtype=int32)
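If the ragged tensor already exists and its rows happen to have equal length, another option is to convert it to a dense tensor with to_tensor() before multiplying. This is a sketch, assuming TensorFlow 2.x; it is not a general solution for rows of differing lengths, because to_tensor() pads shorter rows with a default value.

```python
import tensorflow as tf

x = tf.constant([1, 2, 3])

# A RaggedTensor whose rows all have length 3.
rt = tf.ragged.constant([[1, 2, 3], [4, 5, 6]])

# Convert to an ordinary dense tensor of shape (2, 3).
dense = rt.to_tensor()

# Regular broadcasting now applies: (2, 3) * (3,).
product = dense * x
print(product.numpy())
```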
For more information, please refer to this link.
Hope this helps. Happy Learning!
