I am trying to run a grid search over my hyperparameters to tune a deep learning architecture. The model has multiple inputs, and I am trying to use sklearn's grid search API. The problem is that the grid search API only takes a single array as input, and the code fails when it checks the data dimensions (my input has shape 5 * number of data points, while according to the sklearn API it should be number of data points * feature dimension). My code looks something like this:
from keras.layers import Concatenate, Reshape, Input, Embedding, Dense, Dropout
from keras.models import Model
from keras.wrappers.scikit_learn import KerasClassifier
def model(hyperparameters):
    a = Input(shape=(1,))
    b = Input(shape=(1,))
    c = Input(shape=(1,))
    d = Input(shape=(1,))
    e = Input(shape=(1,))
    # some operations on the inputs produce a single output --> out
    model = Model([a, b, c, d, e], out)
    model.compile(optimizer='rmsprop',
                  loss='categorical_crossentropy',
                  metrics=['accuracy'])
    return model
from sklearn.model_selection import GridSearchCV

k_model = KerasClassifier(build_fn=model, epochs=150, batch_size=512, verbose=2)
# define the grid search parameters
param_grid = ...  # hyperparameter options dict
grid = GridSearchCV(estimator=k_model, param_grid=param_grid, n_jobs=-1)
grid_result = grid.fit([a_input, b_input, c_input, d_input, e_input], encoded_outputs)
This is a workaround to use GridSearchCV with a Keras model that has multiple inputs. The trick is to merge all the inputs into a single array. I create a dummy model that receives a SINGLE input and then splits it into the desired parts using Lambda layers. The procedure can easily be adapted to your own data structure.
import numpy as np
import tensorflow as tf
from tensorflow.keras.layers import Input, Lambda, Concatenate, Dense
from tensorflow.keras.models import Model
from sklearn.model_selection import GridSearchCV

def createMod(optimizer='Adam'):
    combi_input = Input((3,))  # (None, 3)
    a_input = Lambda(lambda x: tf.expand_dims(x[:, 0], -1))(combi_input)  # (None, 1)
    b_input = Lambda(lambda x: tf.expand_dims(x[:, 1], -1))(combi_input)  # (None, 1)
    c_input = Lambda(lambda x: tf.expand_dims(x[:, 2], -1))(combi_input)  # (None, 1)

    ## do something
    c = Concatenate()([a_input, b_input, c_input])
    x = Dense(32)(c)

    out = Dense(1, activation='sigmoid')(x)
    model = Model(combi_input, out)
    model.compile(loss='binary_crossentropy', optimizer=optimizer, metrics=['accuracy'])
    return model
## recreate multiple inputs
n_sample = 1000
a_input, b_input, c_input = [np.random.uniform(0,1, n_sample) for _ in range(3)]
y = np.random.randint(0,2, n_sample)
## merge inputs
combi_input = np.stack([a_input, b_input, c_input], axis=-1)
model = tf.keras.wrappers.scikit_learn.KerasClassifier(build_fn=createMod, verbose=0)
batch_size = [10, 20]
epochs = [10, 5]
optimizer = ['adam', 'SGD']
param_grid = dict(batch_size=batch_size, epochs=epochs, optimizer=optimizer)
grid = GridSearchCV(estimator=model, param_grid=param_grid, n_jobs=-1, cv=3)
grid_result = grid.fit(combi_input, y)
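The same trick carries over to the five-input model in the question: stack the five arrays column-wise into one matrix and split it again inside the build function. Below is a minimal sketch under that assumption; createMod5 and its layers are placeholders rather than the asker's actual architecture, and it assumes a_input ... e_input are the asker's five 1-D arrays of per-sample scalars:

def createMod5(optimizer='rmsprop'):
    combi_input = Input((5,))                                  # (None, 5)
    # split the single input back into five (None, 1) tensors
    parts = [Lambda(lambda t, i=i: tf.expand_dims(t[:, i], -1))(combi_input)
             for i in range(5)]
    x = Dense(32, activation='relu')(Concatenate()(parts))
    out = Dense(1, activation='sigmoid')(x)
    model = Model(combi_input, out)
    model.compile(optimizer=optimizer, loss='binary_crossentropy', metrics=['accuracy'])
    return model

# merge the five original inputs into a single (n_samples, 5) array
combi_input = np.stack([a_input, b_input, c_input, d_input, e_input], axis=-1)
k_model = tf.keras.wrappers.scikit_learn.KerasClassifier(build_fn=createMod5, verbose=0)
grid = GridSearchCV(estimator=k_model, param_grid=param_grid, cv=3)
grid_result = grid.fit(combi_input, encoded_outputs)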
I am creating a custom loss function following the instructions found here. When I add validation_data, I get a ValueError. When I set validation_data=None, the error disappears. I found a similar question on Stack Overflow, but I think my issue is different because I am trying to use a custom loss function.
Here is my code:
from tensorflow.keras.layers import *
from tensorflow.keras.models import Model
import numpy as np
import tensorflow.keras.backend as K
from tensorflow.keras import regularizers
def loss_fcn(y_true, y_pred, w):
    loss = K.mean(K.square((y_true - y_pred) * w))
    return loss
# TensorFlow's default batch_size is 32, so the number of samples should be a multiple of 32 when it is greater than 32
data_x = np.random.rand(32, 51)
data_w = np.random.rand(32, 5)
data_y = np.random.rand(32, 5)
val_x = np.random.rand(4, 51)
val_w = np.random.rand(4, 5)
val_y = np.random.rand(4, 5)
input_x = Input(shape=(51,), name="input")
y_true = Input(shape=(5,), name="true_y")
w = Input(shape=(5,), name="weights")
out = Dense(128, kernel_regularizer=regularizers.l2(0.001), name="HL1")(input_x)
y = Dense(5, name="HL2", activation="tanh")(out)
model = Model(inputs=[input_x, y_true, w], outputs=y)
model.add_loss(loss_fcn(y_true, y, w))
model.compile()
model.fit((data_x, data_y, data_w), validation_data=(val_x, val_y, val_w))
The error message:
ValueError: Error when checking model input: the list of Numpy arrays that you are passing to your model is not the size the model expected. Expected to see 3 array(s), but instead got the following list of 1 arrays: [array([[0.74785946, 0.63599707, 0.45929641, 0.98855504, 0.84815295,
0.28217452, 0.93502174, 0.23942027, 0.11885888, 0.32092279,
0.47407394, 0.19737623, 0.85962504, 0.35906666, 0.22262...
Instead of tuples, pass the training and validation data as lists:
model.fit([data_x, data_y, data_w], validation_data=[val_x, val_y, val_w])
Your model has 3 inputs and one output.
The arguments to model.fit should be:
x = a list (or tuple) of 3 tensors / arrays
y = the output values
validation_data = a tuple of (list of 3 inputs, output values)
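Once the fit call works, note that the y_true and w inputs exist only to feed the loss; for inference you can build a prediction model that shares the trained layers but takes only input_x. A minimal sketch under that assumption (pred_model is a new name, not part of the question's code):

pred_model = Model(inputs=input_x, outputs=y)   # reuses the trained Dense layers
preds = pred_model.predict(val_x)               # no weights or targets needed at prediction time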
I am trying to develop a network and use a Python generator as the data provider. Everything looks OK until the model starts to fit; then I receive this error:
ValueError: `y` argument is not supported when using dataset as input.
I checked every line and I think the problem is in the format of x_test and y_test fed to the network. After hours of googling and changing the format several times, the error is still there.
Can you help me to fix it? You can find the whole code below:
import os
import numpy as np
import pandas as pd
import re # To match regular expression for extracting labels
import tensorflow as tf
print(tf.__version__)
def xfiles(filename):
    if re.match(r'^\w{12}_x\.csv$', filename) is None:
        return False
    else:
        return True
def data_generator():
    folder = "i:/Stockpred/csvdbase/datasets/DS0002"
    file_list = os.listdir(folder)
    x_files = list(filter(xfiles, file_list))
    x_files.sort()
    np.random.seed(1729)
    np.random.shuffle(x_files)
    for file in x_files:
        filespec = folder + '/' + file
        xs = pd.read_csv(filespec, header=None)
        yfile = file.replace('_x', '_y')
        yfilespec = folder + '/' + yfile
        ys = pd.read_csv(open(yfilespec, 'r'), header=None, usecols=[1])
        xs = np.asarray(xs, dtype=np.float32)
        ys = np.asarray(ys, dtype=np.float32)
        for i in range(xs.shape[0]):
            yield xs[i][1:169], ys[i][0]
dataset = tf.data.Dataset.from_generator(
    data_generator,
    (tf.float32, tf.float32),
    (tf.TensorShape([168, ]), tf.TensorShape([])))
dataset = dataset.shuffle(buffer_size=16000, seed=1729)
# dataset = dataset.batch(4000, drop_remainder=True)
dataset = dataset.cache('R:/Temp/model')
def is_test(i, d):
    return i % 4 == 0

def is_train(i, d):
    return not is_test(i, d)

recover = lambda i, d: d
test_dataset = dataset.enumerate().filter(is_test).map(recover)
train_dataset = dataset.enumerate().filter(is_train).map(recover)
x_test = test_dataset.map(lambda x, y: x)
y_test = test_dataset.map(lambda x, y: y)
x_train = train_dataset.map(lambda x, y: x)
y_train = train_dataset.map(lambda x, y: y)
print(x_train.element_spec)
print(y_train.element_spec)
print(x_test.element_spec)
print(y_test.element_spec)
# define an object (initializing RNN)
model = tf.keras.models.Sequential()
# first LSTM layer
model.add(tf.keras.layers.LSTM(units=168, activation='relu', return_sequences=True, input_shape=(168, 1)))
# dropout layer
model.add(tf.keras.layers.Dropout(0.2))
# second LSTM layer
model.add(tf.keras.layers.LSTM(units=168, activation='relu', return_sequences=True))
# dropout layer
model.add(tf.keras.layers.Dropout(0.2))
# third LSTM layer
model.add(tf.keras.layers.LSTM(units=80, activation='relu', return_sequences=True))
# dropout layer
model.add(tf.keras.layers.Dropout(0.2))
# fourth LSTM layer
model.add(tf.keras.layers.LSTM(units=120, activation='relu'))
# dropout layer
model.add(tf.keras.layers.Dropout(0.2))
# output layer
model.add(tf.keras.layers.Dense(units=1))
model.summary()
# compile the model
model.compile(optimizer='adam', loss='mean_squared_error')
model.fit(x_train.as_numpy_iterator(), y_train.as_numpy_iterator(), batch_size=32, epochs=100)
predicted_stock_price = model.predict(x_test)
Everything looks OK until the model starts to fit, and then I receive this error:
ValueError: `y` argument is not supported when using dataset as input.
Can you help me fix it?
As the docs say:
y - Target data. Like the input data x, it could be either Numpy array(s) or TensorFlow tensor(s). It should be consistent with x (you cannot have Numpy inputs and tensor targets, or inversely). If x is a dataset, generator, or keras.utils.Sequence instance, y should not be specified (since targets will be obtained from x).
So, I suppose you should have one generator serving tuples of sample and label.
If you are providing a Dataset as input, then type(train_dataset) should be tensorflow.python.data.ops.dataset_ops.BatchDataset. If so, simply feed this Dataset (which already bundles your X and y) into the model:
model.fit(train_dataset, epochs=100)
(Note that batch_size must not be passed to fit when x is a dataset; batch the dataset itself instead. Yes, this is a slightly different convention from sklearn, where X and y are passed separately.)
Meanwhile, if you want TensorFlow to explicitly use a separate dataset for validation, pass it with the validation_data keyword argument:
model.fit(train_dataset, validation_data=val_dataset, epochs=100)
where val_dataset is a separate dataset you have set aside for validation during training (not the test set).
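Applied to the code in the question, a minimal sketch of that approach (train_ds, test_ds and add_channel are new names; it assumes the train_dataset/test_dataset built above): keep the (x, y) pairs together instead of splitting them, add the trailing feature dimension the LSTM expects (input_shape=(168, 1)), and batch the datasets rather than passing batch_size:

def add_channel(x, y):
    # (168,) -> (168, 1): add the feature dimension the LSTM layer expects
    return tf.expand_dims(x, -1), y

train_ds = train_dataset.map(add_channel).batch(32)
test_ds = test_dataset.map(add_channel).batch(32)

model.fit(train_ds, epochs=100)
predicted_stock_price = model.predict(test_ds.map(lambda x, y: x))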
Alternatively, use model.fit_generator and supply tuples (x, y) of input data and labels. So altogether:
model.fit_generator(train_dataset.as_numpy_iterator(), epochs=100)
I want to get the activation values for a given input of a trained LSTM network, specifically the values for the cell, the input gate, the output gate and the forget gate. According to this Keras issue and this Stackoverflow question I'm able to get some activation values with the following code:
(basically I'm trying to classify 1-dimensional timeseries using one label per timeseries, but that doesn't really matter for this general question)
import random
from pprint import pprint
import keras.backend as K
import numpy as np
from keras.layers import Dense
from keras.layers.recurrent import LSTM
from keras.models import Sequential
from keras.utils import to_categorical
def getOutputLayer(layerNumber, model, X):
    return K.function([model.layers[0].input],
                      [model.layers[layerNumber].output])([X])
model = Sequential()
model.add(LSTM(10, batch_input_shape=(1, 1, 1), stateful=True))
model.add(Dense(2, activation='softmax'))
model.compile(
    loss='categorical_crossentropy', metrics=['accuracy'], optimizer='adam')
# generate some test data
for i in range(10):
    # generate a random timeseries of 10 numbers
    X = np.random.rand(10)
    X = X.reshape(10, 1, 1)

    # generate a random label for the whole timeseries between 0 and 1
    y = to_categorical([random.randint(0, 1)] * 10, num_classes=2)

    # train the lstm for this one timeseries
    model.fit(X, y, epochs=1, batch_size=1, verbose=0)
    model.reset_states()
# to keep the output simple use only 5 steps for the input of the timeseries
X_test = np.random.rand(5)
X_test = X_test.reshape(5, 1, 1)
# get the activations for the output lstm layer
pprint(getOutputLayer(0, model, X_test))
Using that I get the following activation values for the LSTM layer:
[array([[-0.04106992, -0.00327154, -0.01524276, 0.0055838 , 0.00969929,
-0.01438944, 0.00211149, -0.04286387, -0.01102304, 0.0113989 ],
[-0.05771339, -0.00425535, -0.02032563, 0.00751972, 0.01377549,
-0.02027745, 0.00268653, -0.06011265, -0.01602218, 0.01571197],
[-0.03069103, -0.00267129, -0.01183739, 0.00434298, 0.00710012,
-0.01082268, 0.00175544, -0.0318702 , -0.00820942, 0.00871707],
[-0.02062054, -0.00209525, -0.00834482, 0.00310852, 0.0045242 ,
-0.00741894, 0.00141046, -0.02104726, -0.0056723 , 0.00611038],
[-0.05246543, -0.0039417 , -0.01877101, 0.00691551, 0.01250046,
-0.01839472, 0.00250443, -0.05472757, -0.01437504, 0.01434854]],
dtype=float32)]
So for each input value I get 10 values, because I specified an LSTM with 10 neurons in the Keras model. But which one is the cell, which the input gate, which the output gate, and which the forget gate?
Well, these are the output values. To get and inspect the value of each gate, look into this issue.
I paste the essential part here:
for i in range(epochs):
    print('Epoch', i, '/', epochs)
    model.fit(cos,
              expected_output,
              batch_size=batch_size,
              verbose=1,
              nb_epoch=1,
              shuffle=False)
    for layer in model.layers:
        if 'LSTM' in str(layer):
            print('states[0] = {}'.format(K.get_value(layer.states[0])))
            print('states[1] = {}'.format(K.get_value(layer.states[1])))

            print('Input')
            print('b_i = {}'.format(K.get_value(layer.b_i)))
            print('W_i = {}'.format(K.get_value(layer.W_i)))
            print('U_i = {}'.format(K.get_value(layer.U_i)))

            print('Forget')
            print('b_f = {}'.format(K.get_value(layer.b_f)))
            print('W_f = {}'.format(K.get_value(layer.W_f)))
            print('U_f = {}'.format(K.get_value(layer.U_f)))

            print('Cell')
            print('b_c = {}'.format(K.get_value(layer.b_c)))
            print('W_c = {}'.format(K.get_value(layer.W_c)))
            print('U_c = {}'.format(K.get_value(layer.U_c)))

            print('Output')
            print('b_o = {}'.format(K.get_value(layer.b_o)))
            print('W_o = {}'.format(K.get_value(layer.W_o)))
            print('U_o = {}'.format(K.get_value(layer.U_o)))

    # output of the first element of the batch after the first fit()
    first_batch_element = np.expand_dims(cos[0], axis=1)  # (1, 1) to (1, 1, 1)
    # get_LSTM_output is a K.function defined earlier in the linked issue
    print('output = {}'.format(get_LSTM_output([first_batch_element])[0].flatten()))
    model.reset_states()

print('Predicting')
predicted_output = model.predict(cos, batch_size=batch_size)

print('Plotting Results')
plt.subplot(2, 1, 1)
plt.plot(expected_output)
plt.title('Expected')
plt.subplot(2, 1, 2)
plt.plot(predicted_output)
plt.title('Predicted')
plt.show()
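Note that the b_i / W_i / U_i attributes above only exist in older Keras versions; in recent Keras the gate parameters are packed into the LSTM layer's kernel, recurrent_kernel and bias in the order input, forget, cell, output. A minimal sketch for slicing them out of the 10-unit model from the question (the variable names here are my own):

layer = model.layers[0]            # the LSTM layer
units = layer.units
W, U, b = layer.get_weights()      # kernel, recurrent_kernel, bias
# each is packed as [input | forget | cell | output] along the last axis
W_i, W_f, W_c, W_o = [W[:, i * units:(i + 1) * units] for i in range(4)]
U_i, U_f, U_c, U_o = [U[:, i * units:(i + 1) * units] for i in range(4)]
b_i, b_f, b_c, b_o = [b[i * units:(i + 1) * units] for i in range(4)]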
My model weights (I output them to weights_before.txt and weights_after.txt) are precisely the same before and after the training, i.e. the training doesn't change anything, there's no fitting happening.
My data look like this (I basically want the model to predict the sign of feature, result is 0 if feature is negative, 1 if positive):
,feature,zerosColumn,result
0,-5,0,0
1,5,0,1
2,-3,0,0
3,5,0,1
4,3,0,1
5,3,0,1
6,-3,0,0
...
Brief summary of my approach:
Load the data.
Split it column-wise into x (feature) and y (result), then split these two row-wise into test and validation sets.
Transform these sets into TimeseriesGenerators (not necessary in this scenario but I want to get this setup working and I don't see any reason why it shouldn't).
Create and compile a simple Sequential model with a few Dense layers and softmax activation on its output layer; use binary_crossentropy as the loss function.
Train the model... nothing happens!
Complete code follows:
import keras
import pandas as pd
import numpy as np
np.random.seed(570)
TIMESERIES_LENGTH = 1
TIMESERIES_SAMPLING_RATE = 1
TIMESERIES_BATCH_SIZE = 1024
TEST_SET_RATIO = 0.2 # the portion of total data to be used as test set
VALIDATION_SET_RATIO = 0.2 # the portion of total data to be used as validation set
RESULT_COLUMN_NAME = 'feature'
FEATURE_COLUMN_NAME = 'result'
def create_network(csv_path, save_model):
    before_file = open("weights_before.txt", "w")
    after_file = open("weights_after.txt", "w")

    data = pd.read_csv(csv_path)
    data[RESULT_COLUMN_NAME] = data[RESULT_COLUMN_NAME].shift(1)
    data = data.dropna()

    x = data.ix[:, 1:2]
    y = data.ix[:, 3]

    test_set_length = int(round(len(x) * TEST_SET_RATIO))
    validation_set_length = int(round(len(x) * VALIDATION_SET_RATIO))
    x_train_and_val = x[:-test_set_length]
    y_train_and_val = y[:-test_set_length]
    x_train = x_train_and_val[:-validation_set_length].values
    y_train = y_train_and_val[:-validation_set_length].values
    x_val = x_train_and_val[-validation_set_length:].values
    y_val = y_train_and_val[-validation_set_length:].values

    train_gen = keras.preprocessing.sequence.TimeseriesGenerator(
        x_train,
        y_train,
        length=TIMESERIES_LENGTH,
        sampling_rate=TIMESERIES_SAMPLING_RATE,
        batch_size=TIMESERIES_BATCH_SIZE
    )

    val_gen = keras.preprocessing.sequence.TimeseriesGenerator(
        x_val,
        y_val,
        length=TIMESERIES_LENGTH,
        sampling_rate=TIMESERIES_SAMPLING_RATE,
        batch_size=TIMESERIES_BATCH_SIZE
    )

    model = keras.models.Sequential()
    model.add(keras.layers.Dense(10, activation='relu', input_shape=(TIMESERIES_LENGTH, 1)))
    model.add(keras.layers.Dropout(0.2))
    model.add(keras.layers.Dense(10, activation='relu'))
    model.add(keras.layers.Dropout(0.2))
    model.add(keras.layers.Flatten())
    model.add(keras.layers.Dense(1, activation='softmax'))

    for item in model.get_weights():
        before_file.write("%s\n" % item)

    model.compile(
        loss=keras.losses.binary_crossentropy,
        optimizer="adam",
        metrics=[keras.metrics.binary_accuracy]
    )

    history = model.fit_generator(
        train_gen,
        epochs=10,
        verbose=1,
        validation_data=val_gen
    )

    for item in model.get_weights():
        after_file.write("%s\n" % item)

    before_file.close()
    after_file.close()

create_network("data/sign_data.csv", False)
Do you have any ideas?
The problem is that you are using softmax as the activation function of the last layer. Essentially, softmax normalizes its input so that the elements sum to one. Therefore, if you use it on a layer with only one unit (i.e. Dense(1, ...)), it will always output 1. To fix this, change the activation function of the last layer to sigmoid, which outputs a value in the range (0, 1).
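A quick way to see this numerically (a standalone sketch, not part of the question's code): softmax over a single-element row is always exactly 1, regardless of the weights, so the model's output can never change during training.

import numpy as np

logits = np.array([[-3.2], [0.0], [7.1]])                       # any one-unit outputs
softmax = np.exp(logits) / np.exp(logits).sum(axis=1, keepdims=True)
print(softmax)                                                  # [[1.], [1.], [1.]]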
I'm trying to build a class to quickly initialize and train an autoencoder for rapid prototyping. One thing I'd like to be able to do is quickly adjust the number of epochs I train for. However, it seems like no matter what I do, the model trains each layer for 100 epochs! I'm using the tensorflow backend.
Here is the code from the two offending methods.
def pretrain(self, X_train, nb_epoch=10):
    data = X_train
    for ae in self.pretrains:
        ae.fit(data, data, nb_epoch=nb_epoch)
        ae.layers[0].output_reconstruction = False
        ae.compile(optimizer='sgd', loss='mse')
        data = ae.predict(data)
.........
def fine_train(self, X_train, nb_epoch):
    weights = [ae.layers[0].get_weights() for ae in self.pretrains]
    dims = self.dims

    encoder = containers.Sequential()
    decoder = containers.Sequential()

    ## add special input encoder
    encoder.add(Dense(output_dim=dims[1], input_dim=dims[0],
                      weights=weights[0][0:2], activation='linear'))

    ## add the rest of the encoders
    for i in range(1, len(dims) - 1):
        encoder.add(Dense(output_dim=dims[i + 1],
                          weights=weights[i][0:2], activation=self.act))

    ## add the decoders from the end
    decoder.add(Dense(output_dim=dims[len(dims) - 2], input_dim=dims[len(dims) - 1],
                      weights=weights[len(dims) - 2][2:4], activation=self.act))

    for i in range(len(dims) - 2, 1, -1):
        decoder.add(Dense(output_dim=dims[i - 1],
                          weights=weights[i - 1][2:4], activation=self.act))

    ## add the output layer decoder
    decoder.add(Dense(output_dim=dims[0],
                      weights=weights[0][2:4], activation='linear'))

    masterAE = AutoEncoder(encoder=encoder, decoder=decoder)
    masterModel = models.Sequential()
    masterModel.add(masterAE)
    masterModel.compile(optimizer='sgd', loss='mse')
    masterModel.fit(X_train, X_train, nb_epoch=nb_epoch)
    self.model = masterModel
Any suggestions on how to fix the problem would be appreciated. My original suspicion was that it was something to do with tensorflow, so I tried running with the theano backend but encountered the same problem.
Here is a link to the full program.
Following the Keras doc, the fit method uses a default of 100 training epochs (nb_epoch=100):
fit(X, y, batch_size=128, nb_epoch=100, verbose=1, callbacks=[], validation_split=0.0, validation_data=None, shuffle=True, show_accuracy=False, class_weight=None, sample_weight=None)
I'm not sure how you are running these methods, but following the "Typical usage" from the original code, you should be able to run something like the following (adjusting the variable num_epoch as required):
#Typical usage:
num_epoch = 10
ae = JPAutoEncoder(dims)
ae.pretrain(X_train, nb_epoch = num_epoch)
ae.train(X_train, nb_epoch = num_epoch)
ae.predict(X_val)