What is the TensorFlow equivalent of PyTorch's conv1d? - python

Just wondering how I can perform 1D convolution in TensorFlow. Specifically, I'm looking to translate this code to TensorFlow:
inputs = F.pad(inputs, (kernel_size-1,0), 'constant', 0)
output = F.conv1d(inputs, weight, padding=0, groups=num_heads)

The TensorFlow equivalent of PyTorch's torch.nn.functional.conv1d() is tf.nn.conv1d(), and the equivalent of torch.nn.functional.pad() is tf.pad().
For example:
(PyTorch code)
import torch
import torch.nn.functional as F
inputs = torch.tensor([1, 0, 2, 3, 0, 1, 1], dtype=torch.float32)
filters = torch.tensor([2, 1, 3], dtype=torch.float32)
inputs = inputs.unsqueeze(0).unsqueeze(0) # torch.Size([1, 1, 7])
filters = filters.unsqueeze(0).unsqueeze(0) # torch.Size([1, 1, 3])
conv_res = F.conv1d(inputs, filters, padding=0, groups=1) # torch.Size([1, 1, 5])
pad_res = F.pad(conv_res, (1, 1), mode='constant', value=0) # torch.Size([1, 1, 7])
output:
tensor([[[ 0., 8., 11., 7., 9., 4., 0.]]])
(TensorFlow code)
import tensorflow as tf
tf.enable_eager_execution()  # TF 1.x only; eager execution is the default in TF 2.x
i = tf.constant([1, 0, 2, 3, 0, 1, 1], dtype=tf.float32)
k = tf.constant([2, 1, 3], dtype=tf.float32, name='k')
data = tf.reshape(i, [1, int(i.shape[0]), 1], name='data')
kernel = tf.reshape(k, [int(k.shape[0]), 1, 1], name='kernel')
res = tf.nn.conv1d(data, kernel, 1, 'VALID')
res = tf.pad(res[0], [[1, 1], [0, 0]], "CONSTANT")
output:
<tf.Tensor: id=555, shape=(7, 1), dtype=float32, numpy=
array([[ 0.],
[ 8.],
[11.],
[ 7.],
[ 9.],
[ 4.],
[ 0.]], dtype=float32)>
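To mirror the exact pattern in the original question (left-pad by kernel_size-1, then a VALID convolution), here is a minimal sketch assuming TF 2.x eager execution, a single channel, and no groups; the variable names are illustrative:
import tensorflow as tf

kernel_size = 3
inputs = tf.constant([1., 0., 2., 3., 0., 1., 1.])   # (width,)
weight = tf.constant([2., 1., 3.])                   # (kernel_size,)

x = tf.reshape(inputs, [1, -1, 1])   # NWC: (batch, width, channels)
w = tf.reshape(weight, [-1, 1, 1])   # (filter_width, in_channels, out_channels)

# Pad only on the left by kernel_size-1; a VALID conv then keeps the
# output the same length as the input (causal convolution).
x = tf.pad(x, [[0, 0], [kernel_size - 1, 0], [0, 0]])
out = tf.nn.conv1d(x, w, 1, 'VALID')
print(out[0, :, 0])   # length 7, same as the input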

Related

How to implement Multinomial conditional distributions depending on the conditional binary value in Tensorflow Probability?

I am trying to build a graphical model in TensorFlow Probability, where we first sample a number of positive (1) and negative (0) examples (count_i) from a Categorical distribution and then construct a Multinomial distribution (Y_i) depending on the value of count_i. These events (Y_i) are mutually exclusive:
Y_1 ~ Multinomial([0.9, 0.1, 0.05, 0.05, 0.1], total_count = tf.reduce_sum(tf.cast(count == 1, tf.float32)))
Y_2 ~ Multinomial([0.99, 0.01, 0., 0., 0.], total_count = tf.reduce_sum(tf.cast(count == 0, tf.float32)))
I have read these tutorials; however, I am stuck on two issues:
This code generates two arrays of length 500, whereas I only need one array of 500. What should I change so that we get only one sample from the Categorical distribution, and the Multinomial is then constructed from the overall count of the value we are conditioning on?
The sample from the Categorical distribution gives only values of 0, whereas it should be a mix of 0s and 1s. What am I doing wrong here?
My code is as follows. You can run these to replicate the behaviour:
def simplified_model():
    return tfd.JointDistributionSequential([
        tfd.Uniform(low=0., high=1., name='e'),  # e
        lambda e: tfd.Sample(tfd.Categorical(probs=tf.stack([e, 1.-e], 0)), sample_shape=500),  # count; should it be independent?
        lambda count: tfd.Multinomial(
            probs=tf.constant([[.9, 0.1, 0.05, 0.05, 0.1], [0.99, 0.01, 0., 0., 0.]]),
            total_count=tf.cast(tf.stack([tf.reduce_sum(tf.cast(count == 1, tf.float32)),
                                          tf.reduce_sum(tf.cast(count == 0, tf.float32))], 0), dtype=tf.float32))
    ])

tt = simplified_model()
tt.resolve_graph()
tt.sample(1)
The first row will be your Y_1 and the second will be your Y_2. The key is that your output will always be of shape (2, 5), because that is the shape of the probs you are passing to tfd.Multinomial (two rows of five probabilities each).
Code:
import tensorflow as tf
import tensorflow_probability as tfp
from tensorflow_probability import distributions as tfd
# helper function
def _get_counts(vec):
    zeros = tf.reduce_sum(tf.cast(vec == 0, tf.float32))
    ones = tf.reduce_sum(tf.cast(vec == 1, tf.float32))
    return tf.stack([ones, zeros], 0)

joint = tfd.JointDistributionSequential([
    tfd.Sample(  # sample from uniform to make it 2D
        tfd.Uniform(0., 1., name="e"), 1),
    lambda e: tfd.Sample(
        tfd.Categorical(probs=tf.stack([e, 1.-e], -1)), 500),
    lambda c: tfd.Multinomial(
        probs=[
            [0.9, 0.1, 0.05, 0.05, 0.1],
            [0.99, 0.01, 0., 0., 0.],
        ],
        total_count=_get_counts(c),
    )
])

joint.sample(5)  # or however many you want to sample
Output:
# [<tf.Tensor: shape=(5, 1), dtype=float32, numpy=
# array([[0.5611458 ],
# [0.48223293],
# [0.6097224 ],
# [0.94013655],
# [0.14861858]], dtype=float32)>,
# <tf.Tensor: shape=(5, 1, 500), dtype=int32, numpy=
# array([[[1, 0, 0, ..., 1, 0, 1]],
#
# [[1, 1, 1, ..., 1, 0, 0]],
#
# [[0, 0, 0, ..., 1, 0, 0]],
#
# [[0, 0, 0, ..., 0, 0, 0]],
#
# [[1, 0, 1, ..., 1, 0, 1]]], dtype=int32)>,
# <tf.Tensor: shape=(2, 5), dtype=float32, numpy=
# array([[ 968., 109., 0., 0., 0.],
# [1414., 9., 0., 0., 0.]], dtype=float32)>]
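Since the question asks for a single draw, here is a short usage sketch (assuming TF 2.x eager and the joint defined above): unpack one sample and read the two rows of the last tensor as Y_1 and Y_2.
# Usage sketch (assumption: the `joint` defined above).
e, count, y = joint.sample()   # one draw: e -> (1,), count -> (1, 500), y -> (2, 5)
y1, y2 = y[0], y[1]            # row 0 ~ Y_1 (count == 1), row 1 ~ Y_2 (count == 0)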

Add blocks of values to a tensor at specific locations in PyTorch

I have a list of indices:
indx = torch.LongTensor([
[ 0, 2, 0],
[ 0, 2, 4],
[ 0, 4, 0],
[ 0, 10, 14],
[ 1, 4, 0],
[ 1, 8, 2],
[ 1, 12, 0]
])
And I have a tensor of 2x2 blocks:
blocks = torch.FloatTensor([
[[1.5818, 2.3108],
[2.6742, 3.0024]],
[[2.0472, 1.6651],
[3.2807, 2.7413]],
[[1.5587, 2.1905],
[1.9231, 3.5083]],
[[1.6007, 2.1426],
[2.4802, 3.0610]],
[[1.9087, 2.1021],
[2.7781, 3.2282]],
[[1.5127, 2.6322],
[2.4233, 3.6836]],
[[1.9645, 2.3831],
[2.8675, 3.3770]]
])
What I want to do is to add each block at an index position to another tensor (i.e. so that it starts at that index). Let's assume that I want to add it to the following tensor:
a = torch.ones([2,18,18])
Is there any efficient way to do so? So far I came up only with:
i = 0
for b, x, y in indx:
    a[b, x:x+2, y:y+2] += blocks[i]
    i += 1
It is quite inefficient. I also tried to use index_add, but it did not work properly.
You are looking to index on three different dimensions at the same time. I had a look around in the documentation; torch.index_add will only receive a vector as index. My hopes were on torch.scatter, but it doesn't seem to fit this problem well. As it turns out, you can achieve this pretty easily with a little work, the most difficult parts being the setup and teardown. Please hang on tight.
I'll use a simplified example here, but the same can be applied with larger tensors.
>>> indx
tensor([[ 0, 2, 0],
        [ 0, 2, 4],
        [ 0, 4, 0]])
>>> blocks
tensor([[[1.5818, 2.3108],
[2.6742, 3.0024]],
[[2.0472, 1.6651],
[3.2807, 2.7413]],
[[1.5587, 2.1905],
[1.9231, 3.5083]]])
>>> a
tensor([[[0., 0., 0., 0., 0., 0.],
[0., 0., 0., 0., 0., 0.],
[0., 0., 0., 0., 0., 0.],
[0., 0., 0., 0., 0., 0.],
[0., 0., 0., 0., 0., 0.],
[0., 0., 0., 0., 0., 0.]]])
The main issue here is that you are looking to index with slicing. That is not possible in a vectorized form. To counter that, though, you can convert your a tensor into 2x2 chunks. This will be particularly handy since we will be able to access sub-tensors such as a[0, 2:4, 4:6] with just a[0, 1, 2]: the 2:4 slice on dim=1 gets grouped together on index=1, while the 4:6 slice on dim=2 gets grouped on index=2.
First we will convert a to a tensor made up of 2x2 chunks. Then we will update it with blocks. Finally, we will stitch the resulting tensor back into the original shape.
1. Converting a to a 2x2-chunks tensor
You can use a combination of torch.chunk and torch.cat twice: on dim=1 and dim=2. The shape of a is (1, h, w), so we're looking for a result of shape (1, h//2, w//2, 2, 2).
To do so we will unsqueeze two axes on a:
>>> a_ = a[:, None, :, None, :]
>>> a_.shape
torch.Size([1, 1, 6, 1, 6])
Then make 3 chunks on dim=2, then concatenate on dim=1:
>>> a_row_chunks = torch.cat(torch.chunk(a_, 3, dim=2), dim=1)
>>> a_row_chunks.shape
torch.Size([1, 3, 2, 1, 6])
And make 3 chunks on dim=4, then concatenate on dim=3:
>>> a_col_chunks = torch.cat(torch.chunk(a_row_chunks, 3, dim=4), dim=3)
>>> a_col_chunks.shape
torch.Size([1, 3, 2, 3, 2])
Finally reshape all.
>>> a_chunks = a_col_chunks.reshape(1, 3, 3, 2, 2)
Create a new index with values adjusted for our new chunked tensor. Essentially we divide all values by 2, except for the first column, which is the dim=0 index in a and is left unchanged. There's some fiddling around with the types (in short: the index has to be a float in order to divide by 2, but needs to be cast back to a long for the indexing to work):
>>> indx_ = indx.clone().float()
>>> indx_[:, 1:] /= 2
>>> indx_ = indx_.long()
>>> indx_
tensor([[0, 1, 0],
        [0, 1, 2],
        [0, 2, 0]])
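As a small aside (not part of the original answer), you can skip the float round-trip entirely with integer floor division, assuming the indices are non-negative:
>>> indx_ = indx.clone()
>>> indx_[:, 1:] = indx_[:, 1:] // 2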
2. Updating with blocks
We will simply index and accumulate with:
>>> a_chunks[indx_[:, 0], indx_[:, 1], indx_[:, 2]] += blocks
3. Putting it back together
I thought that was it, but actually converting a_chunks back to a 6x6 tensor is way trickier than it seems. Apparently torch.cat can only receive a tuple of tensors, not a single stacked tensor. I won't go into too much detail: tuple() will only split along the first axis, so as a workaround you can use torch.permute to move the axis you want to split to the front. This, combined with two torch.cat calls, will do:
>>> a_row_cat = torch.cat(tuple(a_chunks.permute(1, 0, 2, 3, 4)), dim=2)
>>> a_row_cat.shape
torch.Size([1, 3, 6, 2])
>>> A = torch.cat(tuple(a_row_cat.permute(1, 0, 2, 3)), dim=2)
>>> A.shape
torch.Size([1, 6, 6])
>>> A
tensor([[[0.0000, 0.0000, 0.0000, 0.0000, 0.0000, 0.0000],
[0.0000, 0.0000, 0.0000, 0.0000, 0.0000, 0.0000],
[1.5818, 2.3108, 0.0000, 0.0000, 2.0472, 1.6651],
[2.6742, 3.0024, 0.0000, 0.0000, 3.2807, 2.7413],
[1.5587, 2.1905, 0.0000, 0.0000, 0.0000, 0.0000],
[1.9231, 3.5083, 0.0000, 0.0000, 0.0000, 0.0000]]])
Et voilà.
If you didn't quite get how the chunks work, run this:
for x in range(0, 6, 2):
    for y in range(0, 6, 2):
        a *= 0
        a[:, x:x+2, y:y+2] = 1
        print(a)
And see for yourself: each 2x2 block of 1s corresponds to a chunk in a_chunks.
So you can do the same with:
for x in range(3):
    for y in range(3):
        a_chunks *= 0
        a_chunks[:, x, y] = 1
        print(a_chunks)
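As a closing aside (a sketch, not part of the answer above): torch.Tensor.index_put_ with accumulate=True can also scatter-add every cell of every block in one call, by expanding each block origin into per-cell indices, which avoids the chunking and stitching entirely:
import torch

indx = torch.LongTensor([[0, 2, 0],
                         [0, 2, 4],
                         [0, 4, 0]])
blocks = torch.rand(3, 2, 2)   # placeholder 2x2 blocks
a = torch.zeros(1, 6, 6)

off = torch.arange(2)
# Broadcast each block origin over a 2x2 grid of offsets to get per-cell indices.
batch = indx[:, 0, None, None].expand(-1, 2, 2)
rows = (indx[:, 1, None, None] + off[None, :, None]).expand(-1, 2, 2)
cols = (indx[:, 2, None, None] + off[None, None, :]).expand(-1, 2, 2)

# Accumulate all block values into `a` in one vectorized call.
a.index_put_((batch.reshape(-1), rows.reshape(-1), cols.reshape(-1)),
             blocks.reshape(-1), accumulate=True)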

Upsampling2D output shape

I don't really understand the Keras UpSampling2D output shape. For example, the following code should output a tensor of shape=(1, 1, 6, 6); however, it outputs a tensor of shape=(1, 2, 6, 3). The output channels look correct in terms of the data, but I am confused about the shape:
import numpy as np
import tensorflow as tf

input_shape = (1, 1, 3, 3)
x = np.arange(np.prod(input_shape)).reshape(input_shape)
y = tf.keras.layers.UpSampling2D(size=(2, 2))(x)
Tensor:
[[[[0 1 2]
[3 4 5]
[6 7 8]]]]
Output:
tf.Tensor(
[[[[0 1 2]
[0 1 2]
[3 4 5]
[3 4 5]
[6 7 8]
[6 7 8]]
[[0 1 2]
[0 1 2]
[3 4 5]
[3 4 5]
[6 7 8]
[6 7 8]]]], shape=(1, 2, 6, 3), dtype=int64)
Expected output:
tf.Tensor([[0., 0., 1., 1., 2., 2.],
[0., 0., 1., 1., 2., 2.],
[3., 3., 4., 4., 5., 5.],
[3., 3., 4., 4., 5., 5.],
[6., 6., 7., 7., 8., 8.],
[6., 6., 7., 7., 8., 8.]], dtype=float32)
I understand how to use tf.concat to get the desired output; I am trying to understand the behavior of the layer.
It is about the position of the channel index.
By default, UpSampling2D expects the input to be shaped like (nb_samples, height, width, channels), not (nb_samples, channels, height, width).
You would get the output you expect with the following code:
input_shape = (1, 3, 3, 1)
x = np.arange(np.prod(input_shape)).reshape(input_shape)
y = tf.keras.layers.UpSampling2D(size=(2, 2))(x)
In this case, the shape of y is (1, 6, 6, 1), and y[0, :, :, 0] is equal to:
array([[0, 0, 1, 1, 2, 2],
[0, 0, 1, 1, 2, 2],
[3, 3, 4, 4, 5, 5],
[3, 3, 4, 4, 5, 5],
[6, 6, 7, 7, 8, 8],
[6, 6, 7, 7, 8, 8]])
Edit: as @xdurch0 points out, it is about using the channels_first or channels_last data format, see here.
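Alternatively, if you want to keep the (batch, channels, height, width) layout from the question, UpSampling2D accepts a data_format argument; a short sketch (assuming TF 2.x):
import numpy as np
import tensorflow as tf

input_shape = (1, 1, 3, 3)   # (batch, channels, height, width)
x = np.arange(np.prod(input_shape)).reshape(input_shape)
y = tf.keras.layers.UpSampling2D(size=(2, 2), data_format="channels_first")(x)
print(y.shape)               # (1, 1, 6, 6)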

How do you efficiently sum the occurrences of a value in one array at positions in another array

I'm looking for an efficient solution, avoiding for loops, to an array-related problem I'm having. I want to use a huge 1D array (A, size = 250,000) of values between 0 and 40 for indexing in one dimension, and an array (B) of the same size with values between 0 and 9995 for indexing in a second dimension.
The result should be an array of size (41, 9996) holding, at each index pair, the number of times a value from array A occurs together with a value from array B.
Example:
A = [0, 3, 2, 4, 3]
B = [1, 2, 2, 0, 2]
which should result in:
[[0, 1, 0],
 [0, 0, 0],
 [0, 0, 1],
 [0, 0, 2],
 [1, 0, 0]]
The dirty way is too slow, as the amount of data is huge. What you could do is:
out = np.zeros((41, 9996))
for i in A:
    for j in B:
        out[i, j] += 1
which will take 238,000 * 238,000 iterations...
I've tried this, which works partially:
out = np.zeros((41, 9996))
out[A, B] += 1
This puts a 1 at every indexed position, regardless of how many times the index pair occurs.
Does anyone have a clue how to fix this? Thanks in advance!
You are looking for a sparse tensor:
import torch
A = [0, 3, 2, 4, 3]
B = [1, 2, 2, 0, 2]
idx = torch.LongTensor([A, B])
torch.sparse.FloatTensor(idx, torch.ones(idx.shape[1]), torch.Size([5,3])).to_dense()
Output:
tensor([[0., 1., 0.],
[0., 0., 0.],
[0., 0., 1.],
[0., 0., 2.],
[1., 0., 0.]])
You can also do the same with scipy sparse matrix:
import numpy as np
from scipy.sparse import coo_matrix
coo_matrix((np.ones(len(A)), (np.array(A), np.array(B))), shape=(5,3)).toarray()
output:
array([[0., 1., 0.],
[0., 0., 0.],
[0., 0., 1.],
[0., 0., 2.],
[1., 0., 0.]])
Sometimes it is better to leave the matrix in its sparse representation, rather than forcing it to be "dense" again.
Use numpy.add.at:
import numpy as np
A = [0, 3, 2, 4, 3]
B = [1, 2, 2, 0, 2]
arr = np.zeros((5, 3))
np.add.at(arr, (A, B), 1)
print(arr)
Output
[[0. 1. 0.]
[0. 0. 0.]
[0. 0. 1.]
[0. 0. 2.]
[1. 0. 0.]]
Given that the numbers are in a small range, bincount would be a good choice for bin-based summing -
def accumulate_coords(A, B):
    nrows = A.max()+1
    ncols = B.max()+1
    return np.bincount(A*ncols+B, minlength=nrows*ncols).reshape(-1, ncols)
Sample run -
In [55]: A
Out[55]: array([0, 3, 2, 4, 3])
In [56]: B
Out[56]: array([1, 2, 2, 0, 2])
In [58]: accumulate_coords(A,B)
Out[58]:
array([[0, 1, 0],
[0, 0, 0],
[0, 0, 1],
[0, 0, 2],
[1, 0, 0]])

Specifying dtype=object for numpy.gradient

Is there a way to specify the dtype for numpy.gradient?
I'm using an array of subarrays and it's throwing the following error:
ValueError: setting an array element with a sequence.
Here is an example:
import numpy as np
a = np.empty([3, 3], dtype=object)
it = np.nditer(a, flags=['multi_index', 'refs_ok'])
while not it.finished:
    i = it.multi_index[0]
    j = it.multi_index[1]
    a[it.multi_index] = np.array([i, j])
    it.iternext()
print(a)
which outputs
[[array([0, 0]) array([0, 1]) array([0, 2])]
[array([1, 0]) array([1, 1]) array([1, 2])]
[array([2, 0]) array([2, 1]) array([2, 2])]]
I would like print(np.gradient(a)) to return
array(
[[array([[1, 0],[0, 1]]), array([[1, 0], [0, 1]]), array([[1, 0], [0, 1]])],
[array([[1, 0], [0, 1]]), array([[1, 0], [0, 1]]), array([[1, 0],[0, 1]])],
[array([[1, 0], [0, 1]]), array([[1, 0], [0, 1]]), array([[1, 0],[0, 1]])]],
dtype=object)
Notice that, in this case, the gradient of the vector field is an identity tensor field.
Why are you working with an array of dtype object? That's more work than using a 2d array.
e.g.
In [53]: a1=np.array([[1,2],[3,4],[5,6]])
In [54]: a1
Out[54]:
array([[1, 2],
[3, 4],
[5, 6]])
In [55]: np.gradient(a1)
Out[55]:
[array([[ 2., 2.],
[ 2., 2.],
[ 2., 2.]]),
array([[ 1., 1.],
[ 1., 1.],
[ 1., 1.]])]
or working column by column, or row by row
In [61]: [np.gradient(i) for i in a1.T]
Out[61]: [array([ 2., 2., 2.]), array([ 2., 2., 2.])]
In [62]: [np.gradient(i) for i in a1]
Out[62]: [array([ 1., 1.]), array([ 1., 1.]), array([ 1., 1.])]
dtype=object only makes sense if the subarrays/lists differ in type and/or shape. And even then it doesn't add much over a regular Python list.
==============================
I can take your 2d a, and make a 3d array with:
In [126]: a1=np.zeros((3,3,2),int)
In [127]: a1.flat[:]=[i for i in a.flatten()]
In [128]: a1
Out[128]:
array([[[0, 0],
[0, 1],
[0, 2]],
[[1, 0],
[1, 1],
[1, 2]],
[[2, 0],
[2, 1],
[2, 2]]])
Or I could produce the same thing with meshgrid:
In [129]: X,Y=np.meshgrid(np.arange(3),np.arange(3),indexing='ij')
In [130]: a2=np.array([Y,X]).T
When I apply np.gradient to that I get 3 arrays, each (3,3,2) in shape.
In [136]: ga1=np.gradient(a1)
In [137]: len(ga1)
Out[137]: 3
In [138]: ga1[0].shape
Out[138]: (3, 3, 2)
It looks like the first two arrays have the values you want, so it's just a matter of rearranging them.
In [141]: np.array(ga1[:2]).shape
Out[141]: (2, 3, 3, 2)
In [143]: gga1=np.array(ga1[:2]).transpose([1,2,0,3])
In [144]: gga1.shape
Out[144]: (3, 3, 2, 2)
In [145]: gga1[0,0]
Out[145]:
array([[ 1., -0.],
[-0., 1.]])
If they must go back into a (3,3) object array, I could do:
In [146]: goa1=np.empty([3,3],dtype=object)
In [147]: for i in range(3):
   .....:     for j in range(3):
   .....:         goa1[i,j] = gga1[i,j]
   .....:
In [148]: goa1
Out[148]:
array([[array([[ 1., -0.],
[-0., 1.]]),
array([[ 1., -0.],
[ 0., 1.]]),
array([[ 1., -0.],
...
[ 0., 1.]]),
array([[ 1., 0.],
[ 0., 1.]])]], dtype=object)
I still wonder what the point is of working with an object array.
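For completeness, here is a compact sketch (assuming the same 3x3 grid as above) that skips the object array entirely and recovers the identity Jacobian at each grid point:
import numpy as np

# Build the vector field directly as a (3, 3, 2) float array: field[i, j] == [i, j].
X, Y = np.meshgrid(np.arange(3), np.arange(3), indexing='ij')
field = np.stack([X, Y], axis=-1).astype(float)

# Differentiate along the two grid axes only; the third gradient (along the
# component axis) is not part of the Jacobian and is discarded.
d_di, d_dj, _ = np.gradient(field)
jacobian = np.stack([d_di, d_dj], axis=-2)   # shape (3, 3, 2, 2)
print(jacobian[0, 0])                        # identity matrix at each grid point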
