I am trying to implement a custom RBF kernel function, but I am getting the following error. I am not sure why it expects a certain number of data points.
Error occurs in this line of code:
rbf_y = rbf_kernel.predict(X_test)
Code
def myKernel(x,y):
    pairwise_dists = squareform(pdist(x, 'euclidean'))
    K = scip.exp(-pairwise_dists ** 2 / .08 ** 2)
    return K
rbf_kernel = svm.SVC(kernel=myKernel, C=1).fit(X_train, Y_train.ravel())
rbf_y = rbf_kernel.predict(X_test)
rbf_accuracy = accuracy_score(Y_test, rbf_y)
Error:
ValueError: X.shape[1] = 15510 should be equal to 31488, the number of samples at training time
Data Shape
X_train shape: (31488, 128)
X_test shape: (15510, 128)
Y_train shape: (31488, 1)
Y_test shape: (15510, 1)
Return Shape from Kernel
myKernel(X_train, X_train).shape = (31488, 31488)
A custom kernel kernel(X, Y) should compute a similarity measure between the matrix X and the matrix Y, and the output should be of shape [X.shape[0], Y.shape[0]]. Your kernel function ignores Y, and returns a matrix of shape [X.shape[0], X.shape[0]], which leads to the error you are seeing.
To fix the issue, implement a kernel function that computes a kernel matrix of the correct shape. Scikit-learn's custom kernels documentation has some simple examples of how this might work.
In the case of your specific kernel, you might try cdist(x, y) in place of squareform(pdist(x)).
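As a minimal sketch of that suggestion (not a definitive implementation), the kernel below uses cdist so that the result has one row per sample in x and one column per sample in y. The function name rbf_kernel_xy, the sigma parameter, and the small random arrays are assumptions made for illustration; the scaling exp(-d^2 / 0.08^2) is copied from the question.

```python
import numpy as np
from scipy.spatial.distance import cdist


def rbf_kernel_xy(x, y, sigma=0.08):
    """RBF-style kernel between the rows of x and the rows of y.

    Returns a matrix of shape (x.shape[0], y.shape[0]).
    """
    # Squared Euclidean distance between every row of x and every row of y.
    sq_dists = cdist(x, y, "sqeuclidean")
    # Same scaling as the kernel in the question: exp(-d^2 / sigma^2).
    return np.exp(-sq_dists / sigma ** 2)


# Small hypothetical stand-ins for X_train and X_test.
rng = np.random.default_rng(0)
X_small = rng.normal(size=(6, 4))
Z_small = rng.normal(size=(3, 4))

K_train = rbf_kernel_xy(X_small, X_small)  # shape (6, 6), as needed when fitting
K_test = rbf_kernel_xy(Z_small, X_small)   # shape (3, 6), as needed when predicting
```

A function like this could be passed as svm.SVC(kernel=rbf_kernel_xy). Keep in mind that with 31488 training samples the training kernel matrix alone has about 9.9e8 entries, roughly 8 GB in float64.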
Related
I've encountered a problem while writing a Siamese net. The net takes as input 2 vectors that represent 2 pieces of text. The vectors are padded, and their length differs between batches (in batch 1 the vector length is 32, in batch 2 it is 64, and so on).
# model definition
def create_model(vocab_size=512, d_model=128):
    def normalize(x):
        norm = tf.norm(x, axis=-1, keepdims=True)
        return tf.divide(x, norm)
    component = tf.keras.Sequential([
        tf.keras.layers.Embedding(vocab_size, d_model),
        tf.keras.layers.LSTM(d_model),
        tf.keras.layers.Lambda(lambda x: tf.reduce_mean(x, axis=1)),
        tf.keras.layers.Lambda(normalize),
    ])
    # due to the variability in text, input shape differs with respect to batch
    inputs = [tf.keras.Input(shape=(None,)) for _ in range(2)]
    outputs = tf.tuple([component(ins) for ins in inputs])
    return tf.keras.Model(inputs=inputs, outputs=outputs)
# loss function
class MyLoss(tf.keras.losses.Loss):
    def __init__(self):
        super().__init__(name='TripletLoss')
    def call(self, y_true, y_pred):
        # >>> HERE IS THE PROBLEM: y_pred has a different shape than I'd expect,
        # its shape is (batch_size,) instead of (2, batch_size)
        l, r = y_pred
        # compute and return loss
        return loss
When calling Model#fit(loss=MyLoss(), ...), the parameter passed to MyLoss#call is a projection onto the first coordinate of the model prediction. That is, model.predict(z) returns [x, y], where x and y are vectors with length equal to the batch size. I expected the y_pred passed to Loss#call to have exactly that value, [x, y], but it equals the first vector of the list, x. Furthermore, I inspected the call stack and saw that y_pred has the expected value ([x, y]) before it is passed to MyLoss#call, and that it changes to x inside the body of Keras' Loss.__call__.
I tried to reshape the input, but other problems arose.
I'm doing sequence classification, I've got batch sizes of 1, 5 outcomes, and variable time steps (14 in this example). My sample weights w are the same shape as my label y:
y = tf.convert_to_tensor(np.ones(shape = (1,14,5)))
w = tf.convert_to_tensor(np.random.uniform(size = (1,14,5)))
y.shape
Out[53]: TensorShape([1, 14, 5])
w.shape
Out[54]: TensorShape([1, 14, 5])
When I try to run this through the loss function, I get the following error:
loss_object = tf.keras.losses.BinaryCrossentropy(from_logits=False)
loss_object(y_true = y,
            y_pred = y,
            sample_weight = w)
InvalidArgumentError: Can not squeeze dim[2], expected a dimension of 1, got 5 [Op:Squeeze]
What's going on? It should be a straightforward multiplication of the loss matrix (pre-reduction) with the weights. How can I fix this?
Super simple fix! TensorFlow squeezes the last dimension of the sample weights because they are supposed to be applied per sample. Therefore, all you need to do is add one dimension to your weight matrix along the last axis:
y = tf.convert_to_tensor(np.ones(shape = (1,14,5)))
w = tf.convert_to_tensor(np.random.uniform(size = (1,14,5,1))) # Change made here
loss_object = tf.keras.losses.BinaryCrossentropy(from_logits=False)
loss_object(y_true = y,
            y_pred = y,
            sample_weight = w)
You can also just change the shape of the weights matrix after creation:
w = tf.expand_dims(w, axis=-1)
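One caveat worth keeping in mind: Keras' BinaryCrossentropy averages the cross-entropy over the last axis before sample_weight is applied, so the weights act per sample (or per time step) rather than per outcome. If the goal really is one weight per element of the pre-reduction loss, as the question describes, one option is to compute the unreduced loss by hand. The following is a NumPy sketch of that idea, not the Keras implementation; the function name weighted_bce and the eps clipping value are assumptions made for illustration.

```python
import numpy as np


def weighted_bce(y_true, y_pred, weights, eps=1e-7):
    """Binary cross-entropy with one weight per element, averaged over all elements."""
    # Clip probabilities so that log(0) never occurs.
    p = np.clip(y_pred, eps, 1.0 - eps)
    # Unreduced loss: same shape as y_true, here (1, 14, 5).
    per_element = -(y_true * np.log(p) + (1.0 - y_true) * np.log(1.0 - p))
    # Element-wise weighting, then a single mean over everything.
    return np.mean(per_element * weights)


y = np.ones((1, 14, 5))
w = np.random.default_rng(0).uniform(size=(1, 14, 5))
loss_value = weighted_bce(y, y, w)
```

The same arithmetic can be written with TensorFlow ops inside a custom loss if it needs to be differentiable.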
I am writing a custom loss function for semi-supervised learning on the CIFAR-10 dataset, for which I need to duplicate columns of my tensor to create a sort of mask, which I then multiply with the activation values and later sum over.
My loss function is a sum of entropy and cross entropy for unlabelled and labeled samples. I add an extra class and set it to 1 for unlabelled samples.
I then create a mask identifying the row indices of unlabelled samples from the y_true tensor. From that I should get a (n_samples, 1) tensor, which I need to repeat/duplicate/copy to a (n_samples, 11) tensor that I can multiply with the activation values in y_pred.
Loss function code:
a = np.ones((mini_batch_size, 1)) * 10
a_var = K.variable(value=a)
v = K.cast(K.equal(K.cast(K.argmax(y_true, axis=1), 'float32'), a_var), 'float32')
e_loss = K.sum(K.concatenate([v,v,v,v,v,v,v,v,v,v,v], axis=-1) * K.log(y_pred) * y_pred)
m_u = K.sum(K.cast(K.equal(K.cast(K.argmax(y_true, axis=1), 'float32'), a_var), 'float32'))
b = np.ones((mini_batch_size, 1)) * 10
b_var = K.variable(value=b)
v2 = K.cast(K.not_equal(K.cast(K.argmax(y_true, axis=1), 'float32'), b_var), 'float32')
ce_loss = K.sum(K.concatenate([v2, v2, v2, v2, v2, v2, v2, v2, v2, v2, v2], axis=1) * K.log(y_pred))
m_l = K.variable(value=float(mini_batch_size), dtype='float32') #- m_u
return -((e_loss/m_u) + (ce_loss/m_l))
The error I get is:
InvalidArgumentError: 2 root error(s) found.
(0) Invalid argument: Incompatible shapes: [40,11] vs. [40,440]
[[{{node loss_36/dense_74_loss/mul_2}}]]
[[metrics_28/acc/Mean/_2627]]
(1) Invalid argument: Incompatible shapes: [40,11] vs. [40,440]
[[{{node loss_36/dense_74_loss/mul_2}}]]
0 successful operations.
0 derived errors ignored.
My batch size is 40.
I need my concatenated tensor to be of size [40, 11] not [40, 440]
I don't have real data to test whether the loss works properly, but this got rid of the InvalidArgumentError and worked with model.fit() for a dense model.
A few changes I made:
You don't have to repeat your v 11 times to multiply it with y_pred. All you need to do is reshape it to (-1,1), which will also save you memory.
I got rid of all the K.variables. This is something I want to check with you: you are not trying to optimize a_var and b_var, right (i.e. they are not part of the model)? (Apparently, that's what's causing the issue. I need to dive deeper to see why.) It seems the whole point of a_var and b_var is to perform the boolean operations equal and not_equal, which work just fine with a constant.
I made m_l a K.constant.
def loss_fn(y_true, y_pred):
    v = K.cast(K.equal(K.cast(K.argmax(y_true, axis=-1), 'float32'), 10), 'float32')
    e_loss = K.sum(K.reshape(v, (-1,1)) * K.log(y_pred) * y_pred)
    m_u = K.sum(K.cast(K.equal(K.cast(K.argmax(y_true, axis=-1), 'float32'), 10), 'float32'))
    v2 = K.cast(K.not_equal(K.cast(K.argmax(y_true, axis=-1), 'float32'), 10), 'float32')
    ce_loss = K.sum(K.reshape(v2, (-1,1)) * K.log(y_pred))
    m_l = K.constant(value=float(mini_batch_size), dtype='float32') #- m_u
    return -((e_loss/m_u) + (ce_loss/m_l))
Note: Depending on the batch size within the loss function is a bad idea. Try to get rid of any batch-size-dependent operations (especially for the shapes of tensors). You can see that I have kept mini_batch_size only to set m_l, but I would suggest setting this to some constant instead of mini_batch_size. Otherwise, if a batch with fewer than 40 samples comes through, you are effectively using a different loss function for that batch, and your results aren't comparable between different batch sizes.
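One plausible explanation for the [40, 440] shape in the error, assuming the Keras backend follows standard broadcasting rules: argmax returns a tensor of shape (40,), and comparing it with a (40, 1) tensor broadcasts to (40, 40). Concatenating 11 copies of that along the last axis then gives (40, 440). The NumPy sketch below reproduces the shapes with randomly generated stand-in data (labels, y_true, and y_pred here are hypothetical) and shows that a (40, 1) mask broadcasts against (40, 11) without any tiling.

```python
import numpy as np

n, n_classes = 40, 11
rng = np.random.default_rng(0)
labels = rng.integers(0, n_classes, size=n)
y_true = np.eye(n_classes)[labels]                   # (40, 11) one-hot labels
y_pred = rng.dirichlet(np.ones(n_classes), size=n)   # (40, 11), rows sum to 1

idx = np.argmax(y_true, axis=1)                      # shape (40,)
a = np.ones((n, 1)) * 10                             # shape (40, 1), like a_var

# (40,) compared with (40, 1) broadcasts to (40, 40).
v_bad = (idx == a).astype("float32")
bad_mask = np.concatenate([v_bad] * n_classes, axis=-1)   # (40, 440)

# Comparing with a scalar keeps shape (40,); reshaping to (40, 1)
# lets the mask broadcast across all 11 columns of y_pred.
v = (idx == 10).astype("float32").reshape(-1, 1)
masked = v * y_pred                                       # (40, 11)
tiled = np.concatenate([v] * n_classes, axis=-1) * y_pred # same values
```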
I am trying to filter a TensorFlow tensor of shape (N_batch, N_data), where N_batch is the batch size (e.g. 32), and N_data is the size of the (noisy) timeseries array. I have a Gaussian kernel (taken from here), which is one-dimensional. I then want to use tensorflow.nn.conv1d to convolve this kernel with my signal.
I have been trying for most of the morning to get the dimensions of the input signal and the kernel right, but obviously with no success. From what I gathered from the interwebs, the dimensions of both the input signal and the kernel need to be aligned in some finicky way, and I just can't figure out which way that is. The TensorFlow error messages aren't particularly meaningful either (Shape must be rank 4 but is rank 3 for 'conv1d/Conv2D' (op: 'Conv2D') with input shapes: [?,1,1000], [1,81]). Below I've included a little piece of code to reproduce the situation:
import tensorflow as tf
import numpy as np
import matplotlib.pyplot as plt
# Based on: https://stackoverflow.com/a/52012658/1510542
# Credits to #zephyrus
def gaussian_kernel(size, mean, std):
    d = tf.distributions.Normal(tf.cast(mean, tf.float32), tf.cast(std, tf.float32))
    vals = d.prob(tf.range(start=-size, limit=size+1, dtype=tf.float32))
    kernel = vals # Some reshaping is required here
    return kernel / tf.reduce_sum(kernel)
def gaussian_filter(input, sigma):
    size = int(4*sigma + 0.5)
    x = input # Some reshaping is required here
    kernel = gaussian_kernel(size=size, mean=0.0, std=sigma)
    conv = tf.nn.conv1d(x, kernel, stride=1, padding="SAME")
    return conv
def run_filter():
    tf.reset_default_graph()
    # Define size of data, batch sizes
    N_batch = 32
    N_data = 1000
    noise = 0.2 * (np.random.rand(N_batch, N_data) - 0.5)
    x = np.linspace(0, 2*np.pi, N_data)
    y = np.tile(np.sin(x), N_batch).reshape(N_batch, N_data)
    y_noisy = y + noise
    input = tf.placeholder(tf.float32, shape=[None, N_data])
    smooth_input = gaussian_filter(input, sigma=10)
    with tf.Session() as sess:
        sess.run(tf.global_variables_initializer())
        y_smooth = smooth_input.eval(feed_dict={input: y_noisy})
        plt.plot(y_noisy[0])
        plt.plot(y_smooth[0])
        plt.show()
if __name__ == "__main__":
    run_filter()
Any ideas?
You need to add channel dimensions to your input/kernel, since TF convolutions are generally used for multi-channel inputs/outputs. As you are working with simple 1-channel input/output this amounts to just adding some size-1 "dummy" axes.
Since by default convolution expects channels to come last, your placeholder should have shape [None, N_data, 1] and your input be modified like
y_noisy = y + noise
y_noisy = y_noisy[:, :, np.newaxis]
Similarly, you need to add input and output channel dimensions to your filter:
kernel = gaussian_kernel(size=size, mean=0.0, std=sigma)
kernel = kernel[:, tf.newaxis, tf.newaxis]
That is, the filter is expected to have shape [width, in_channels, out_channels].
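To make the shape contract concrete without depending on a particular TensorFlow version, here is a NumPy stand-in (not tf.nn.conv1d itself). The helpers gaussian_kernel_np and conv1d_same_np are hypothetical names written for this sketch; they mimic a stride-1, zero-padded "SAME" convolution over [batch, width, in_channels] inputs and [width, in_channels, out_channels] filters. Because the Gaussian kernel is symmetric, convolution and cross-correlation give the same result here.

```python
import numpy as np


def gaussian_kernel_np(size, std):
    """Normalized 1-D Gaussian with 2*size + 1 taps."""
    t = np.arange(-size, size + 1, dtype=np.float64)
    vals = np.exp(-0.5 * (t / std) ** 2)
    return vals / vals.sum()


def conv1d_same_np(x, kernel):
    """Stride-1, zero-padded 1-D convolution on channels-last arrays.

    x:      [batch, width, in_channels]
    kernel: [kernel_width, in_channels, out_channels]
    """
    batch, width, c_in = x.shape
    _, k_in, c_out = kernel.shape
    assert c_in == k_in, "input and filter channel counts must match"
    out = np.zeros((batch, width, c_out))
    for b in range(batch):
        for i in range(c_in):
            for o in range(c_out):
                # mode="same" keeps the output as long as the signal.
                out[b, :, o] += np.convolve(x[b, :, i], kernel[:, i, o], mode="same")
    return out


N_batch, N_data, sigma = 32, 1000, 10
rng = np.random.default_rng(0)
t = np.linspace(0, 2 * np.pi, N_data)
y = np.tile(np.sin(t), (N_batch, 1))
y_noisy = y + 0.2 * (rng.random((N_batch, N_data)) - 0.5)

x3 = y_noisy[:, :, np.newaxis]                          # [32, 1000, 1]
k1 = gaussian_kernel_np(int(4 * sigma + 0.5), sigma)    # 81 taps
k3 = k1[:, np.newaxis, np.newaxis]                      # [81, 1, 1]
y_smooth = conv1d_same_np(x3, k3)[:, :, 0]              # back to [32, 1000]
```

Dropping the trailing channel axis at the end returns the result to the original (N_batch, N_data) layout, which is what the plotting code in the question expects.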
I'm using Theano for classification (convolutional neural networks)
Previously, I've been using the pixel values of the (flattened) image as the features of the NN.
Now, I want to add additional features. I've been told that I can concatenate that vector of additional features to the flattened image features and then use that as input to the fully-connected layer, but I'm having trouble with that.
First of all, is that the right approach?
Here are some code snippets and my errors:
Similar to the provided example from their site with some modifications
(from the class that builds the model)
# allocate symbolic variables for the data
self.x = T.matrix('x') # the data is presented as rasterized images
self.y = T.ivector('y') # the labels are presented as 1D vector of [int] labels
self.f = T.matrix('f') # additional features
Below, variables v and rng are defined previously. What's important is layer2_input:
layer2_input = self.layer1.output.flatten(2)
layer2_input = T.concatenate([layer2_input, self.f.flatten(2)])
self.layer2 = HiddenLayer(rng, input=layer2_input, n_in=v, n_out=200, activation=T.tanh)
(from the class that trains)
train_model = theano.function([index], cost, updates=updates,
    givens={
        model.x: train_set_x[index * batch_size: (index + 1) * batch_size],
        model.y: train_set_y[index * batch_size: (index + 1) * batch_size],
        model.f: train_set_f[index * batch_size: (index + 1) * batch_size]
    })
However, I get an error when the train_model is called:
ValueError: GpuJoin: Wrong inputs for input 1 related to inputs 0.!
Apply node that caused the error: GpuJoin(TensorConstant{0}, GpuElemwise{tanh,no_inplace}.0, GpuFlatten{2}.0)
Inputs shapes: [(), (5, 11776), (5, 2)]
Inputs strides: [(), (11776, 1), (2, 1)]
Inputs types: [TensorType(int8, scalar), CudaNdarrayType(float32, matrix), CudaNdarrayType(float32, matrix)]
Do the input shapes represent the shapes of x, y and f, respectively?
If so, the third seems correct (batchsize=5, 2 extra features), but why is the first a scalar and the second a matrix?
More details:
train_set_x.shape = (61, 19200) [61 flattened images (160x120), 19200 pixels]
train_set_y.shape = (61,) [61 integer labels]
train_set_f.shape = (61,2) [2 additional features per image]
batch_size = 5
Do I have the right idea or is there a better way of accomplishing this?
Any insights into why I'm getting an error?
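The issue was that I was concatenating along the wrong axis. T.concatenate defaults to axis=0, so it tried to stack the (5, 11776) and (5, 2) matrices row-wise, which requires equal column counts.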
layer2_input = T.concatenate([layer2_input, self.f.flatten(2)])
should have been
layer2_input = T.concatenate([layer2_input, self.f.flatten(2)], axis=1)
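The shapes in the error message are the inputs of the GpuJoin node, not x, y and f: the scalar is the join axis (the constant 0), followed by the two matrices being joined. NumPy's concatenate follows the same axis convention, so the failure and the fix can be sketched with plain arrays. The zero and one arrays below are stand-ins for the flattened layer output and the extra features, with shapes taken from the error message.

```python
import numpy as np

batch_size = 5
# Stand-ins with the shapes reported in the error message.
conv_features = np.zeros((batch_size, 11776), dtype="float32")
extra_features = np.ones((batch_size, 2), dtype="float32")

try:
    # Default axis=0 stacks rows, which needs equal column counts.
    np.concatenate([conv_features, extra_features])
    axis0_failed = False
except ValueError:
    axis0_failed = True

# axis=1 appends the extra features as additional columns per sample.
layer2_input = np.concatenate([conv_features, extra_features], axis=1)
n_in = layer2_input.shape[1]  # 11776 + 2 = 11778
```

With the concatenation on axis 1, the n_in passed to the hidden layer (v in the question) presumably also has to count the two extra columns.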