I have a ragged tensor of dimensions [BATCH_SIZE, TIME_STEPS, EMBEDDING_DIM]. I want to augment the last axis with data from another tensor of shape [BATCH_SIZE, AUG_DIM]. Each time step of a given example gets augmented with the same value.
If the tensor wasn't ragged with varying TIME_STEPS for each example, I could simply reshape the second tensor with tf.repeat and then use tf.concat:
import tensorflow as tf
# create data
# shape: [BATCH_SIZE, TIME_STEPS, EMBEDDING_DIM]
emb = tf.constant([[[1, 2, 3], [4, 5, 6]], [[1, 2, 3], [0, 0, 0]]])
# shape: [BATCH_SIZE, 1, AUG_DIM]
aug = tf.constant([[[8]], [[9]]])
# concat
aug = tf.repeat(aug, emb.shape[1], axis=1)
emb_aug = tf.concat([emb, aug], axis=-1)
This approach doesn't work when emb is ragged, since emb.shape[1] is unknown and varies across examples:
# rag and remove padding
emb = tf.RaggedTensor.from_tensor(emb, padding=(0, 0, 0))
# reshape for augmentation - this doesn't work
aug = tf.repeat(aug, emb.shape[1], axis=1)
ValueError: Attempt to convert a value (None) with an unsupported type (<class 'NoneType'>) to a Tensor.
The goal is to create a ragged tensor emb_aug which looks like this:
<tf.RaggedTensor [[[1, 2, 3, 8], [4, 5, 6, 8]], [[1, 2, 3, 9]]]>
Any ideas?
The easiest way to do this is to just make your ragged tensor a regular tensor by using tf.RaggedTensor.to_tensor() and then do the rest of your solution. I'll assume that you need the tensor to remain ragged. The key is to find the row_lengths of each batch in your ragged tensor, and then use this information to make your augmentation tensor ragged.
Example:
import tensorflow as tf
# data
emb = tf.constant([[[1, 2, 3], [4, 5, 6]], [[1, 2, 3], [0, 0, 0]]])
aug = tf.constant([[[8]], [[9]]])
# make embeddings ragged for testing
emb_r = tf.RaggedTensor.from_tensor(emb, padding=(0, 0, 0))
print(emb_r.shape)
# (2, None, 3)
Here we'll use a combination of row_lengths and sequence_mask to create a new ragged tensor.
# find the row lengths of the embeddings
rl = emb_r.row_lengths()
print(rl)
# tf.Tensor([2 1], shape=(2,), dtype=int64)
# find the biggest row length
max_rl = tf.math.reduce_max(rl)
print(max_rl)
# tf.Tensor(2, shape=(), dtype=int64)
# repeat the augmented data `max_rl` number of times
aug_t = tf.repeat(aug, repeats=max_rl, axis=1)
print(aug_t)
# tf.Tensor(
# [[[8]
# [8]]
#
# [[9]
# [9]]], shape=(2, 2, 1), dtype=int32)
# create a mask
msk = tf.sequence_mask(rl)
print(msk)
# tf.Tensor(
# [[ True True]
# [ True False]], shape=(2, 2), dtype=bool)
From here we can use tf.ragged.boolean_mask to make the augmented data ragged:
# make the augmented data a ragged tensor
aug_r = tf.ragged.boolean_mask(aug_t, msk)
print(aug_r)
# <tf.RaggedTensor [[[8], [8]], [[9]]]>
# concatenate!
output = tf.concat([emb_r, aug_r], 2)
print(output)
# <tf.RaggedTensor [[[1, 2, 3, 8], [4, 5, 6, 8]], [[1, 2, 3, 9]]]>
You can find the list of tensorflow methods that support ragged tensors here
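For comparison, the simpler route mentioned at the start of this answer (densify with to_tensor(), then reuse the dense recipe) can be sketched as follows. This is only a sketch on the same toy data; it assumes padded time steps can be discarded afterwards by re-ragging with the saved row lengths, and the variable names are illustrative.

```python
import tensorflow as tf

# Same toy data as above; aug has shape [BATCH_SIZE, 1, AUG_DIM].
emb = tf.constant([[[1, 2, 3], [4, 5, 6]], [[1, 2, 3], [0, 0, 0]]])
aug = tf.constant([[[8]], [[9]]])
emb_r = tf.RaggedTensor.from_tensor(emb, padding=(0, 0, 0))

# Remember the true number of time steps per example before densifying.
lengths = emb_r.row_lengths()

# Densify (short rows are zero-padded), then apply the dense repeat + concat.
dense = emb_r.to_tensor()
aug_t = tf.repeat(aug, repeats=tf.shape(dense)[1], axis=1)
dense_aug = tf.concat([dense, aug_t], axis=-1)

# Drop the padded time steps again by re-ragging with the saved lengths.
emb_aug = tf.RaggedTensor.from_tensor(dense_aug, lengths=lengths)
print(emb_aug)
```

The trade-off is that the padded positions are materialised and augmented before being thrown away, which may matter when row lengths vary a lot.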
Ragged Tensors can be constructed from row lengths directly.
The values input is a tensor that is flat with respect to the future ragged dimension (the other dimensions are kept). It can be built with tf.repeat, again using the row_lengths to get the right number of repeats per sample. This assumes emb is already ragged and aug has shape [BATCH_SIZE, AUG_DIM], as in the question text:
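# emb is the ragged tensor; aug has shape [BATCH_SIZE, AUG_DIM]
ragged_lengths = emb.row_lengths()
# repeat each example's aug row once per time step, then regroup by row length
aug = tf.RaggedTensor.from_row_lengths(
    values=tf.repeat(aug, ragged_lengths, axis=0),
    row_lengths=ragged_lengths)
emb_aug = tf.concat([emb, aug], axis=-1)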
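A self-contained version of the row-lengths approach might look like the sketch below. It assumes aug has shape [BATCH_SIZE, AUG_DIM] (not the [BATCH_SIZE, 1, AUG_DIM] used in the question's code), and the toy values are chosen to match the desired output from the question.

```python
import tensorflow as tf

# Ragged embeddings: [BATCH_SIZE, (TIME_STEPS), EMBEDDING_DIM]
emb = tf.ragged.constant(
    [[[1, 2, 3], [4, 5, 6]], [[1, 2, 3]]], ragged_rank=1)
# Augmentation data: [BATCH_SIZE, AUG_DIM] (assumed shape)
aug = tf.constant([[8], [9]])

# Number of time steps per example, here [2, 1].
ragged_lengths = emb.row_lengths()

# Repeat each example's aug row once per time step -> [[8], [8], [9]],
# then regroup the flat rows into a ragged tensor with the same row lengths.
aug_r = tf.RaggedTensor.from_row_lengths(
    values=tf.repeat(aug, ragged_lengths, axis=0),
    row_lengths=ragged_lengths)

emb_aug = tf.concat([emb, aug_r], axis=-1)
print(emb_aug)
```

Unlike the mask-based version, this never builds the padded [BATCH_SIZE, max_len, AUG_DIM] tensor.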
I am working with BERT context vectors and I am trying to extract specific layer activations for specific tokens. I can extract sequential layers fine using ":" slice notation but I want specific layers given by a list (or some other method) e.g. first and fourth layers only.
# (num_target_tokens, num_tokens_in_sequence, num_bert_layers, size_of_bert_vector)
example = torch.randn([3, 12, 13, 768])
indices = torch.tensor([[0, 1], [1, 10], [2, 11]])
a = example[indices[:, 0], indices[:, 1], -4:]
b = example[indices[:, 0], indices[:, 1], 1:5]
# (num_target_tokens, num_specified_layers, size_of_bert_vector)
a.shape
>> torch.Size([3, 4, 768])
b.shape
>> torch.Size([3, 4, 768])
# Desired output shape: torch.Size([3, 4, 768])
c = example[indices[:, 0], indices[:, 1], [1, 3, 5, 7]] # Desired usage
>> shape mismatch: indexing tensors could not be broadcast together with shapes [3], [3], [4]
Is there some elegant way to achieve this with indexing, or will I need to split my tensors to achieve this result?
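One possible way to get the desired shape, sketched here under the assumption that the layer list is the same for every target token, is to give the first two index tensors a trailing axis of size 1 so that they broadcast against the layer indices. The name layers is illustrative.

```python
import torch

# (num_target_tokens, num_tokens_in_sequence, num_bert_layers, size_of_bert_vector)
example = torch.randn([3, 12, 13, 768])
indices = torch.tensor([[0, 1], [1, 10], [2, 11]])
layers = torch.tensor([1, 3, 5, 7])  # hypothetical list of wanted layers

# Index shapes (3, 1), (3, 1) and (4,) broadcast to (3, 4), so every
# (target, token) pair is combined with every requested layer.
c = example[indices[:, 0, None], indices[:, 1, None], layers]
print(c.shape)  # torch.Size([3, 4, 768])
```

The original attempt fails because index shapes (3,), (3,) and (4,) cannot be broadcast together; adding the extra axis is what makes them compatible.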
Using PyTorch, torch.combinations will only take a 1D tensor as input but I would like to apply it to each 1D tensor in a multidimensional tensor.
inp = torch.tensor([[1, 2, 3],
[2, 3, 4]])
torch.combinations((inp), r=2)
The result is an error saying I can't apply it to that shape but I want to apply it to [1, 2, 3] and [2, 3, 4] individually. I can't do it one by one because the idea is to apply this to large sets of data.
inp = torch.tensor([[1,2,3],[2,3,4]])
inp_tuple = torch.unbind(inp)
print(inp_tuple)
(tensor([1, 2, 3]), tensor([2, 3, 4]))
torch.combinations((inp_tuple), r=2)
I also tried unbinding the tensor and applying it to the tuple of tensors but it gives an error saying it can't be applied to a tuple.
Is there any way that I can get torch.combinations to automatically apply to each individual 1D tensor in a multidimensional tensor or each tensor in a tuple of tensors? If not are there any alternatives to achieve all combinations of each individual part of a multidimensional tensor?
The function torch.combinations returns all possible combinations of size r of the elements contained in the 1D input vector. The reason why multi-dimensional inputs are not supported is probably that you have no guarantee that the different vectors in your input have the exact same number of unique elements. Obviously if one of the vectors has a duplicate element then you would end up with one set of combinations bigger than another, which is simply not possible to represent with a homogeneous PyTorch tensor.
So from there on, I will assume that the input tensor inp is a 2D tensor shaped (N, C) where each of its N vectors contains C unique elements. The example you gave would fit to this requirement since both vectors have three unique elements each: {1, 2, 3} and {2, 3, 4}.
>>> inp = torch.tensor([[1,2,3],[2,3,4]])
The idea is to apply torch.combinations to a range of positions (torch.arange) whose length equals that of our vectors. We can then use those position combinations as indices to gather values from each vector of the input tensor.
We can retrieve all combinations of positions with the following:
>>> c = torch.combinations(torch.arange(inp.size(1)), r=2)
tensor([[0, 1],
[0, 2],
[1, 2]])
Then we need to reshape and expand both inp and c such that they match in number of dimensions:
>>> x = inp[:,None].expand(-1,len(c),-1)
tensor([[[1, 2, 3],
[1, 2, 3],
[1, 2, 3]],
[[2, 3, 4],
[2, 3, 4],
[2, 3, 4]]])
>>> idx = c[None].expand(len(x), -1, -1)
tensor([[[0, 1],
[0, 2],
[1, 2]],
[[0, 1],
[0, 2],
[1, 2]]])
Finally we can apply torch.gather on x and idx on dim=2. This will return a 3D tensor out such that:
out[i][j][k] = x[i][j][index[i][j][k]]
Let's make our call on torch.gather:
>>> x.gather(dim=2, index=idx)
tensor([[[1, 2],
[1, 3],
[2, 3]],
[[2, 3],
[2, 4],
[3, 4]]])
Which is the desired result.
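Put together, the steps above can be sketched as a small helper. The function name batched_combinations is hypothetical; the sketch also shows that plain advanced indexing along dim 1 with the same position combinations gives the same tensor as the expand + gather route.

```python
import torch

def batched_combinations(inp: torch.Tensor, r: int = 2) -> torch.Tensor:
    """Return the r-combinations of each row of a 2D tensor, shape (N, n_comb, r)."""
    # Combinations of column positions, shared by every row.
    c = torch.combinations(torch.arange(inp.size(1), device=inp.device), r=r)
    # Expand both tensors to (N, n_comb, ...) and gather along the last dim.
    x = inp[:, None].expand(-1, len(c), -1)
    idx = c[None].expand(len(x), -1, -1)
    return x.gather(dim=2, index=idx)

inp = torch.tensor([[1, 2, 3], [2, 3, 4]])
out = batched_combinations(inp)
print(out)

# Shorter alternative: index the columns directly with the position pairs.
c = torch.combinations(torch.arange(inp.size(1)), r=2)
out_indexed = inp[:, c]
```

Both variants combine positions rather than values, so every row yields the same number of combinations.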
Given a 1-d tensor:
A = torch.tensor([1, 2, 3, 4])
suppose we have some "indexer tensor"
ind1 = torch.tensor([3, 0, 1])
ind2 = torch.tensor([[3, 0], [1, 2]])
Running A[ind1] and A[ind2] gives tensor([4, 1, 2]) and tensor([[4, 1], [2, 3]]),
which have the same shape as the indexer tensors (ind1 and ind2), with values mapped from tensor A.
How can I index higher-dimensional tensors in the same way?
Currently I have one solution:
For a N-d tensor A, suppose we have the indexer tensor IND,
IND is like [[i11, i12, ... i1N], [i21, i22, ... i2N], ... [iM1, iM2, ... iMN]], where M is the number of indexed elements.
We can divide IND into N tensors, where
IND_1 = torch.tensor([i11, i21, ... iM1])
...
IND_N = torch.tensor([i1N, i2N, ... iMN])
When we run A[IND_1, ... IND_N], we get tensor([v1, v2, ... vM]).
Example:
A = tensor([[[1, 2], [3, 4]], [[5, 6], [7, 8]]]) # [2 * 2 * 2]
ind1 = tensor([1, 0, 1])
ind2 = tensor([1, 1, 0])
ind3 = tensor([0, 1, 0])
A[ind1, ind2, ind3]
=> tensor([7, 4, 5])
# and the good thing is you can control the shape of result tensor by modifying the inds' shape.
ind1 = tensor([[0, 0], [1, 0]])
ind2 = tensor([[1, 1], [0, 1]])
ind3 = tensor([[0, 1], [0, 0]])
A[ind1, ind2, ind3]
=> tensor([[3, 4],[5, 3]]) # same as inds' shape
Does anyone have a more elegant solution?
1- Manual approach using unraveled indices on flattened input.
If you want to index on an arbitrary number of axes (all axes of A) then one straightforward approach is to flatten all dimensions and unravel the indices. Let's assume that A is 3D and we want to index it using a stack of ind1, ind2, and ind3:
>>> ind = torch.stack((ind1, ind2, ind3))
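You can first unravel the indices, that is, turn each multi-index into an offset into the flattened tensor, using A's strides (this assumes A is contiguous):
>>> unraveled = torch.tensor(A.stride()) @ ind.flatten(1)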
Then flatten A, index it with unraveled and reshape to the final form:
>>> A.flatten()[unraveled].reshape_as(ind[0])
2- Using a simple split of ind.
You can actually perform the same operation using torch.chunk:
>>> A[ind.chunk(len(ind))][0]
Or alternatively torch.split which is identical:
>>> A[ind.split(1)][0]
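As a quick check, the sketch below runs the approaches above on the 3D example from the question and confirms that they agree. It assumes A is contiguous (required for the stride-based variant); the unbind variant is an extra alternative that is not part of the original answer.

```python
import torch

A = torch.tensor([[[1, 2], [3, 4]], [[5, 6], [7, 8]]])  # shape (2, 2, 2)
ind1 = torch.tensor([1, 0, 1])
ind2 = torch.tensor([1, 1, 0])
ind3 = torch.tensor([0, 1, 0])
ind = torch.stack((ind1, ind2, ind3))  # one row of indices per axis of A

# 1) Flat offsets: dot product of A's strides with each multi-index.
flat = torch.tensor(A.stride()) @ ind.flatten(1)
res_flat = A.flatten()[flat].reshape_as(ind[0])

# 2) Split ind into one index tensor per axis; each chunk keeps a leading
#    axis of size 1, which the trailing [0] removes again.
res_chunk = A[ind.chunk(len(ind))][0]

# 3) Alternative: unbind drops the leading axis directly.
res_unbind = A[ind.unbind()]

print(res_flat, res_chunk, res_unbind)
```

All three are equivalent to A[ind1, ind2, ind3]; the split-based forms also work on non-contiguous tensors because they do not rely on strides.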
3- Initial answer for single-axis indexing.
Let's take a minimal multi-dimensional example with A being a 2-D tensor defined as:
>>> A = torch.tensor([[1, 2, 3, 4],
[5, 6, 7, 8]])
From your description of the problem:
the same shape of index tensor and its value are mapped from tensor A.
In that case the indexer tensor needs to have the same number of dimensions as the indexed tensor A, since A is no longer flat. Otherwise, what would the result of A (shaped (2, 4)) indexed by ind1 (shaped (3,)) be?
If you are indexing on a single dimension then you can utilize torch.gather:
>>> A.gather(1, ind2)
tensor([[4, 1],
[6, 7]])
I want to apply a threshold to 1 column in a 2D tensor. Any value below the cutoff would be listed as null or zero. I have tried to avoid looping through the tensor and I want the input & output tensor to have the same shape.
Here is the code:
NFValue = tf.Variable(1.,dtype=tf.float64,constraint=lambda t: tf.clip_by_value(t, 10, 20))
col1 = tf.gather(x, [0], axis=0)
col2 = tf.gather(x, [1], axis=0)
y = tf.fill(tf.shape(col2), NFValue) # creates a tensor of the same size as X, with Cutoff
y = tf.cast(y, np.float32) # converts that tensor into the correct type for comparison.
NewCol2 = tf.boolean_mask(col2, tf.math.greater(col2, y))
return tf.concat([col1[0,:], NewCol2], axis=0)
The problem is that tf.boolean_mask() returns a tensor with just the values that were greater than NFValue, so the shape has changed. tf.math.greater returns a boolean tensor of the correct shape, but then I would need to loop through the tensor to apply it.
I have tried several different options around this. I have looked at slicing, tf.scan and a couple of other functions. I am expecting there to be a canned solution here.
Use tf.where
import tensorflow as tf
x = tf.reshape(tf.range(9), (3, 3))
<tf.Tensor: shape=(3, 3), dtype=int32, numpy=
array([[0, 1, 2],
[3, 4, 5],
[6, 7, 8]])>
tf.where(x > 5, x, 0)
<tf.Tensor: shape=(3, 3), dtype=int32, numpy=
array([[0, 0, 0],
[0, 0, 0],
[6, 7, 8]])>
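To restrict the threshold to a single column, as the question asks, one option is to combine the comparison with a column mask before calling tf.where. The sketch below uses made-up data, and the names cutoff and col_mask are illustrative.

```python
import tensorflow as tf

x = tf.constant([[1., 15.],
                 [2., 5.],
                 [3., 12.]])
cutoff = 10.0  # hypothetical threshold

# Only column 1 is thresholded; the mask broadcasts over the rows.
col_mask = tf.constant([False, True])
below = tf.logical_and(x < cutoff, col_mask)

# Zero out the masked entries and keep everything else, so the shape is preserved.
y = tf.where(below, tf.zeros_like(x), x)
print(y)
```

Because tf.where selects elementwise, the output always has the same shape as x, which avoids the shrinking behaviour of tf.boolean_mask.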
Define x as:
>>> import tensorflow as tf
>>> x = tf.constant([1, 2, 3])
Why does this normal tensor multiplication work fine with broadcasting:
>>> tf.constant([[1, 2, 3], [4, 5, 6]]) * tf.expand_dims(x, axis=0)
<tf.Tensor: shape=(2, 3), dtype=int32, numpy=
array([[ 1, 4, 9],
[ 4, 10, 18]], dtype=int32)>
while this one with a ragged tensor does not?
>>> tf.ragged.constant([[1, 2, 3], [4, 5, 6]]) * tf.expand_dims(x, axis=0)
*** tensorflow.python.framework.errors_impl.InvalidArgumentError: Expected 'tf.Tensor(False, shape=(), dtype=bool)' to be true. Summarized data: b'Unable to broadcast: dimension size mismatch in dimension'
1
b'lengths='
3
b'dim_size='
3, 3
How can I get a 1-D tensor to broadcast over a 2-D ragged tensor? (I am using TensorFlow 2.1.)
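The problem is resolved if you pass ragged_rank=0 to tf.ragged.constant, as shown below. Note that with ragged_rank=0 the call returns an ordinary tf.Tensor rather than a RaggedTensor (see the output further down), so this applies only when all rows have the same length: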
tf.ragged.constant([[1, 2, 3], [4, 5, 6]], ragged_rank=0) * tf.expand_dims(x, axis=0)
Complete working code is:
%tensorflow_version 2.x
import tensorflow as tf
x = tf.constant([1, 2, 3])
print(tf.ragged.constant([[1, 2, 3], [4, 5, 6]], ragged_rank=0) * tf.expand_dims(x, axis=0))
Output of the above code is:
tf.Tensor(
[[ 1 4 9]
[ 4 10 18]], shape=(2, 3), dtype=int32)
One more correction.
Broadcasting is the process of making tensors with different shapes have compatible shapes for elementwise operations, so there is no need to call tf.expand_dims explicitly; TensorFlow takes care of it.
So the code below works and demonstrates broadcasting well:
%tensorflow_version 2.x
import tensorflow as tf
x = tf.constant([1, 2, 3])
print(tf.ragged.constant([[1, 2, 3], [4, 5, 6]], ragged_rank=0) * x)
Output of the above code is:
tf.Tensor(
[[ 1 4 9]
[ 4 10 18]], shape=(2, 3), dtype=int32)
For more information, please refer to this link.
Hope this helps. Happy Learning!