In the following code, what does torch.cat really do? I know it concatenates the batch contained in the sample, but why do we have to do that, and what does concatenate really mean?
# memory is just a list of events
def sample(self, batch_size):
    samples = zip(*random.sample(self.memory, batch_size))
    return map(lambda x: Variable(torch.cat(x,0)))
torch.cat concatenates, as the name suggests, along the specified dimension.
The example from the documentation tells you most of what you need to know:
x = torch.randn(2, 3) # shape (2, 3)
catted = torch.cat((x, x, x), dim=0) # shape (6, 3), i.e. three copies of x stacked on top of each other
Remember that the concatenated tensors need to have the same shape in every dimension except the one along which you are concatenating.
In the code above it doesn't do anything, though, and isn't even valid, as map lacks its second argument (the iterable to apply the function to).
Assuming you would do this mapping instead:
map(lambda x: Variable(torch.cat(x,0)), samples)
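For each group x in samples, it would create a new tensor whose size along dimension 0 is the sum of the sizes of the tensors in x along that dimension (so [batch_size, x_dim_1, x_dim_2, ...] if every stored tensor has a leading dimension of 1), provided all tensors in x have the same shape in every dimension except 0.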
Still, it is a pretty convoluted example and definitely shouldn't be done like that (torch.autograd.Variable is deprecated). This should be enough:
# random.sample returns a `list`, which torch.cat accepts; assumes self.memory holds tensors
def sample(self, batch_size):
    return torch.cat(random.sample(self.memory, batch_size), dim=0)
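If each entry in memory is a tuple of tensors rather than a single tensor, which is what the zip(*...) in the question suggests, the batch can be built one field at a time. This is a minimal sketch under that assumption: the ReplayMemory class, its push method, and the state/reward shapes are hypothetical, and every stored tensor is assumed to carry a leading dimension of 1.

```python
import random
import torch

class ReplayMemory:
    """Hypothetical container; each event is a tuple of tensors."""

    def __init__(self):
        self.memory = []

    def push(self, *event):
        self.memory.append(event)

    def sample(self, batch_size):
        batch = random.sample(self.memory, batch_size)
        # zip(*batch) regroups [(s1, r1), (s2, r2), ...] into (s1, s2, ...) and (r1, r2, ...)
        fields = zip(*batch)
        # Concatenate each field along dim 0, giving one batched tensor per field.
        return tuple(torch.cat(field, dim=0) for field in fields)

memory = ReplayMemory()
for step in range(10):
    state = torch.randn(1, 4)              # made-up state with a leading dimension of 1
    reward = torch.tensor([float(step)])   # made-up reward, shape (1,)
    memory.push(state, reward)

states, rewards = memory.sample(batch_size=3)
print(states.shape, rewards.shape)
```

Here concatenation is what turns three separate (1, 4) states into a single (3, 4) batch that a network can process in one call.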
I need to run a loop in parallel on GPUs over a function that computes the rows of a matrix independently. I was using map_fn, but, as far as I understand, to have parallel computing enabled with eager execution I have to use the while_loop function.
Unfortunately I don't find this function very intuitive to use, so I'm kindly asking how to convert map_fn to while_loop in my code. Here is a simplified version of the code:
*some 1-D float tensors*
def compute_row(ithStep):
    *operations on the 1-D tensors that return a 1-D tensor with fixed length*
    return values
image = tf.map_fn(compute_row, tf.range(0,nRows))
The version with while_loop that I wrote, following the example in the documentation and other questions here on Stack Overflow, is:
*some 1-D float tensors*
def compute_row(i):
    *operations on the 1-D tensors that return a 1-D tensor with fixed length*
    return values
def condition(i):
    return tf.less(i, nRows)
i = tf.constant(0)
image = tf.while_loop(condition, compute_row, [i])
But in this case what I obtain is:
ValueError: The two structures don't have the same nested structure.
First structure: type=list str=[TensorSpec(shape=(), dtype=tf.int32, name=None)]
Second structure: type=list ... *a long list of tensors*
Where is the mistake? Thanks in advance. If needed I can provide a simplified runnable code.
EDIT: adding below the runnable code
import numpy
import tensorflow as tf
from matplotlib import pyplot
#Defining the data which normally are loaded from file:
#1- matrix of x position-time values, with weights, in sparse format
matrix = numpy.random.randint(2, size = 100).astype(float).reshape(10,10)
x = numpy.nonzero(matrix)[0]
times = numpy.nonzero(matrix)[1]
weights = numpy.random.rand(x.size)
#2- array of y positions
nStepsY = 5
y = numpy.arange(1,nStepsY+1)
#3- the size of the final matrix
nRows = nStepsY
nColumns = 80
# Building the TF tensors
x = tf.constant(x, dtype = tf.float32)
times = tf.constant(times, dtype = tf.float32)
weights = tf.constant(weights, dtype = tf.float32)
y = tf.constant(y, dtype = tf.float32)
# the function to iterate
def compute_row(i):
    yTimed = tf.multiply(y[i],times)
    positions = tf.round((x-yTimed)+50)
    positions = tf.cast(positions, dtype=tf.int32)
    values = tf.math.unsorted_segment_sum(weights, positions, nColumns)
    return values
image = tf.map_fn(compute_row, tf.range(0,nRows), dtype=tf.float32)
%matplotlib inline
pyplot.imshow(image, aspect = 10)
pyplot.colorbar(shrink = 0.75,aspect = 10)
The output image is:
To construct a while loop, you need to define two functions:
the condition function: when this function returns false, the loop stops
the loop body function, which performs the wanted operations. In your case, because you want to build a Tensor, you can see it as an accumulation function: the function takes the Tensor as an argument and appends a new row at the end.
Knowing that, we can define the two functions.
First, the loop body. Let's reuse the compute_row function to compute the value of the new row based on the value of i, and append the new row to our accumulator using tf.concat. We make sure that the shapes are compatible for the concatenation by adding one dimension to the new row. We also increase the value of the counter i by 1.
def loop_body(i, accumulator):
    new_row = compute_row(i)
    accumulator = tf.concat([accumulator, new_row[tf.newaxis,:]],axis=0)
    return i+1, accumulator
Next, the condition: in this case, we just need to check that the value of i is less than the number of rows wanted.
def cond(i,accumulator):
    return tf.less(i,nRows)
Note that the two functions, loop_body and cond, must have the same signature. (That explains why cond takes a second, unused argument.)
Now, we can put that together in the while_loop call:
i0 = tf.constant(0) # we initialize the counter i at 0
# we initialize the accumulator with an empty Tensor of shape (0, nColumns), i.e. with zero rows
accumulator = tf.zeros((0, nColumns))
final_i, image = tf.while_loop(cond, loop_body, loop_vars=[i0, accumulator])
To make sure that it reproduces the same values as the map_fn version, we can compare the two results:
>>> image_map = tf.map_fn(compute_row_map, tf.range(0, nRows), dtype=tf.float32)
>>> tf.reduce_all(tf.equal(image, image_map))
<tf.Tensor: shape=(), dtype=bool, numpy=True>
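As a possible alternative to growing the accumulator with tf.concat, the rows can be written into a tf.TensorArray and stacked once at the end. The sketch below assumes TensorFlow 2 with eager execution; n_rows, n_cols, and this compute_row are hypothetical stand-ins rather than the code from the question.

```python
import tensorflow as tf

n_rows, n_cols = 5, 4  # hypothetical sizes

def compute_row(i):
    # Hypothetical stand-in for the real computation: row i is filled with the value i.
    return tf.fill([n_cols], tf.cast(i, tf.float32))

def cond(i, rows):
    # Same signature as the body; the TensorArray argument is unused here.
    return i < n_rows

def body(i, rows):
    # write() returns a new TensorArray that must be passed on to the next iteration.
    return i + 1, rows.write(i, compute_row(i))

rows0 = tf.TensorArray(dtype=tf.float32, size=n_rows)
_, rows = tf.while_loop(cond, body, loop_vars=[tf.constant(0), rows0])
image = rows.stack()  # one tensor of shape (n_rows, n_cols)
```

The loop variables keep the same structure on every iteration (a counter and a TensorArray), which is the property the ValueError in the question complains about.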
I have a problem with a task I was given:
Create a LinearRegression from X to Y.
As fit() accepts properly shaped arrays, reshape the X and Y vectors to the new shape (-1, 1).
This is part of my code:
tuple1 = tuple(zip(X,Y))
np.reshape(tuple1, (-1, 1))
reg = LinearRegression().fit(tuple1)
I don't understand the task. The problem is in the last three lines of my code. Should I first merge X and Y into a tuple in order to reshape them? But then I must use linear regression, for which I need X and Y unmerged. I don't get it.
"As the method fit() accepts properly shaped arrays, ..."
The way it is defined, X is a 1D vector (X.shape gives (5,)).
"as scikit-learn fit() methods in general expect an array of vectors"
So X is a problem, because it is not an array of vectors, just a 1D vector.
"reshape X and Y vectors by using the method reshape() and passing to it a tuple with a new shape: (-1, 1)"
X.reshape(-1, 1).shape gives (5, 1), which is what we need. I see where you got confused: the "tuple" refers to the argument of the reshape function (literally the tuple (-1, 1)), not to the result of the transformation.
"Perform the reshaping on site (in the function call), keep the original vectors as they are."
Reshape in the function call: reg = LinearRegression().fit(X.reshape(-1, 1), Y), i.e. don't mess with the variables beforehand.
Note: Y can stay the way it is, because a 1D vector is fine for the target (only one dependent variable); so "you will have to reshape X and Y vectors" is not correct.
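The points above can be checked with a small sketch. The five values below are made up for illustration (the question's actual X and Y are not shown); Y is placed on an exact line so the fitted coefficients are easy to verify.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

X = np.array([1.0, 2.0, 3.0, 4.0, 5.0])  # made-up data, shape (5,)
Y = 2.0 * X + 1.0                         # made-up targets on an exact line

# -1 lets NumPy infer the number of rows, so the result has shape (5, 1).
X_2d = X.reshape(-1, 1)

# The reshape could equally be written inline in the call; Y stays 1-D.
reg = LinearRegression().fit(X_2d, Y)

print(X.shape, X_2d.shape)
print(reg.coef_, reg.intercept_)
```

reshape returns a new array object, so X itself keeps its shape (5,), which is what "keep the original vectors as they are" asks for.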
I have 2 arrays to concatenate:
X_train's shape is (3072, 50000)
y_train's shape is (50000,)
I'd like to concatenate them so I can shuffle the indices all in one go. I have tried the following, but neither works:
np.concatenate([X_train, np.transpose(y_train)])
np.column_stack([X_train, np.transpose(y_train)])
How can I concatenate them?
A recommendation that targets the task rather than your immediate problem: don't do this!
Assuming X are your samples / observations, y are your targets:
Just generate a random permutation and index both arrays with it (the originals are not modified; note that indexing with an integer array returns copies, not views), e.g. (untested):
import numpy as np
X = np.random.random(size=(50000, 3072))
y = np.random.random(size=50000)
perm = np.random.permutation(X.shape[0]) # assuming X.shape[0] == y.shape[0]
X_perm = X[perm] # a copy with the rows in permuted order
y_perm = y[perm]
Reminder: your starting shapes are not compatible with most Python-based ML tools, as the usual interpretation is:
first-dim / rows: samples
second-dim / cols: features
As the number of samples needs to equal the number of target values in y, my example is correct in this regard, while yours needs a transpose on X.
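A small sketch of shuffling both arrays in unison, starting from the (features, samples) layout in the question. The sizes are shrunk stand-ins for (3072, 50000), and y_train is set to the sample index so the pairing is easy to inspect.

```python
import numpy as np

rng = np.random.default_rng(0)

# Small stand-ins for the question's arrays: (features, samples) and (samples,)
n_features, n_samples = 4, 6
X_train = rng.random((n_features, n_samples))
y_train = np.arange(n_samples)  # made-up targets: the sample index itself

X = X_train.T                     # (samples, features), the usual layout
perm = rng.permutation(n_samples)

X_perm = X[perm]        # integer-array indexing returns a copy
y_perm = y_train[perm]  # the same permutation keeps samples and targets paired

# Row k of X_perm still belongs to target y_perm[k].
print(X_perm.shape, y_perm)
```

Because one permutation indexes both arrays, no concatenation is needed to keep each sample next to its target.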
As DavidG said, I realized the answer is that y_train has shape (50000,), so I needed to reshape it before concatenating:
np.concatenate([X_train,
                np.reshape(y_train, (1, 50000))])
Still, this evaluated very slowly in Jupyter. If there's a faster answer, I'd be grateful to have it.
I want to perform a check for even and odd elements of the batch and swap them if needed. I managed to end up with two tensors that I want to interleave:
def tf_oplu(x, name=None):
    even = x[:,::2] #slicing into odd and even parts on the batch
    odd = x[:,1::2]
    even_flatten = tf.reshape(even, [-1]) # flatten tensors
    #in row-major order to apply function across them
    odd_flatten = tf.reshape(odd, [-1])
    compare = tf.to_float(even_flatten<odd_flatten)
    compare_not = tf.to_float(even_flatten>=odd_flatten)
    #def oplu(x,y): # trivial function
    #    if x<y : # (x<y)==1
    #        return y, x
    #    else:
    #        return x, y # (x<y)==0
    even_flatten_new = odd_flatten * compare + even_flatten * compare_not
    odd_flatten_new = odd_flatten * compare_not + even_flatten * compare
    # convolute back
    even_new = tf.reshape(even_flatten_new,[100,128])
    odd_new = tf.reshape(odd_flatten_new,[100,128])
Now I want to get back a [100,256] tensor with the even and odd places filled. In numpy I would of course do:
    y = np.empty((even_new.shape[0], even_new.shape[1] + odd_new.shape[1]), dtype=even_new.dtype)
    y[:,0::2] = even_new
    y[:,1::2] = odd_new
    return y
But such a thing is not possible in TensorFlow, as a tensor is not modifiable. I suppose it is possible with either a sparse tensor or tf.gather_nd, but both require generating an array of indices, which is again a non-trivial task for me.
One more note: I do not want to use any Python functions via tf.py_func, as I checked that they run on CPU only. Maybe lambda and tf.map_fn may help somehow? Thanks!
To interleave two matrices vertically, you do not need big guns such as gather or map_fn. You can simply interleave them as follows:
tf.reshape(
    tf.stack([even_new, odd_new], axis=1),
    [-1, tf.shape(even_new)[1]])
EDIT
To interleave them horizontally:
tf.reshape(
    tf.concat([even_new[...,tf.newaxis], odd_new[...,tf.newaxis]], axis=-1),
    [tf.shape(even_new)[0],-1])
The idea is to use stack to interleave them in memory. The dimension where the stack occurs gives the granularity of the interleaving. If we stack at the last axis (which is what the concat over new trailing axes above amounts to), the interleaving occurs at each element, mixing columns. If we stack at axis=1, entire input rows remain contiguous and the interleaving occurs between rows.
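The stack-then-reshape idea can be checked on small arrays. The sketch below uses NumPy as a stand-in for the TensorFlow calls (np.stack and reshape follow the same row-major layout), with made-up 2x3 inputs.

```python
import numpy as np

even = np.array([[0, 2, 4],
                 [10, 12, 14]])
odd = np.array([[1, 3, 5],
                [11, 13, 15]])

# Stack on a new last axis, then fold that axis into the columns:
# even and odd entries alternate within each row.
by_column = np.stack([even, odd], axis=-1).reshape(even.shape[0], -1)

# Stack on axis 1, then fold that axis into the rows:
# whole rows of even and odd alternate.
by_row = np.stack([even, odd], axis=1).reshape(-1, even.shape[1])

print(by_column)
print(by_row)
```

For the question's case (filling even and odd columns of a [100,256] result), the first variant is the relevant one.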
You can use tf.dynamic_stitch, which takes as its first argument a list of index tensors, one for each tensor to interleave, and as its second argument the list of tensors to interleave. The tensors are interleaved along the first dimension, so we need to transpose them first and then transpose the result back. Here is the code:
even_new = tf.transpose(even_new,perm=[1,0])
odd_new = tf.transpose(odd_new,perm=[1,0])
even_pos = tf.convert_to_tensor(list(range(0,256,2)),dtype=tf.int32)
odd_pos = tf.convert_to_tensor(list(range(1,256,2)),dtype=tf.int32)
interleaved = tf.dynamic_stitch([even_pos,odd_pos],[even_new,odd_new])
interleaved = tf.transpose(interleaved,perm=[1,0])
You can use assign to assign into slices.
odd_new = tf.constant([1,3,5])
even_new = tf.constant([2,4,6])
y=tf.Variable(tf.zeros(6, dtype=tf.int32))
sess = tf.InteractiveSession()
sess.run(tf.global_variables_initializer())
y[0::2].assign(odd_new).eval()
y[1::2].assign(even_new).eval()
I am attempting to implement an RNN and have output predictions p_y of shape (batch_size, time_points, num_classes). I also have a target_output of shape (batch_size, time_points), where the value at a given index of target_output is an integer denoting the class (a value between 0 and num_classes-1). How can I index p_y with target_output to get the probabilities of the given class I need to compute Cross-Entropy?
I'm not even sure how to do this in numpy. The expression p_y[target_output] does not give the desired results.
You need to use advanced indexing (search for "advanced indexing" here). But Theano advanced indexing behaves differently to numpy so knowing how to do this in numpy may not be all that helpful!
Here's a function which does this for my setup, but note that the order of my dimensions differs from yours. I use (time points, batch_size, num_classes). This also assumes you want to use the 1-of-N categorical cross-entropy variant. You may not want sequence length padding either.
def categorical_crossentropy_3d(coding_dist, true_dist, lengths):
    # Zero out the false probabilities and sum the remaining true probabilities to remove the third dimension.
    indexes = theano.tensor.arange(coding_dist.shape[2])
    mask = theano.tensor.neq(indexes, true_dist.reshape((true_dist.shape[0], true_dist.shape[1], 1)))
    predicted_probabilities = theano.tensor.set_subtensor(coding_dist[theano.tensor.nonzero(mask)], 0.).sum(axis=2)
    # Pad short sequences with 1's (the pad locations are implicitly correct!)
    indexes = theano.tensor.arange(predicted_probabilities.shape[0]).reshape((predicted_probabilities.shape[0], 1))
    mask = indexes >= lengths
    predicted_probabilities = theano.tensor.set_subtensor(predicted_probabilities[theano.tensor.nonzero(mask)], 1.)
    return -theano.tensor.log(predicted_probabilities)
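For the NumPy half of the question, here is a minimal sketch of picking the target-class probability at every (batch, time) position. It uses the question's (batch_size, time_points, num_classes) layout with made-up random data; it is plain NumPy and, as noted above, does not carry over to Theano unchanged.

```python
import numpy as np

rng = np.random.default_rng(0)
batch_size, time_points, num_classes = 2, 3, 4

# Made-up probabilities: normalise random scores over the class axis.
scores = rng.random((batch_size, time_points, num_classes))
p_y = scores / scores.sum(axis=-1, keepdims=True)
target_output = rng.integers(0, num_classes, size=(batch_size, time_points))

# Option 1: one index array per axis, broadcast to (batch_size, time_points).
b = np.arange(batch_size)[:, None]
t = np.arange(time_points)[None, :]
picked = p_y[b, t, target_output]

# Option 2: take_along_axis needs an index with the same number of dims as p_y.
picked_alt = np.take_along_axis(p_y, target_output[..., None], axis=-1)[..., 0]

cross_entropy = -np.log(picked)  # shape (batch_size, time_points)
```

p_y[target_output] fails to do this because it indexes only the first axis with the whole target array instead of pairing each target with its own batch and time position.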