I tried to make a confusion matrix for a model I built. Everything seemed fine up through training the model, until I hit an error that says
ValueError: Found input variables with inconsistent numbers of samples: [4, 304]
Here is the code I use:
# Convert lists to numpy arrays, for Keras use
Train_label = np.eye(n_labels)[label]  # one-hot encoding via np.eye
Train_data = np.array(data)
print("Dataset shape is", Train_data.shape, "(size, timestep, column, row, channel)")
print("Label shape is", Train_label.shape, "(size, label onehot vector)")
# Shuffle the dataset before passing it to fit();
# otherwise the model won't train on the full class distribution
x = np.arange(Train_label.shape[0])
np.random.shuffle(x)
# data and labels must be shuffled in the same order
Train_label = Train_label[x]
Train_data = Train_data[x]
train_size = 0.9
X_train = Train_data[:int(Totalnb * 0.9), :]
Y_train = Train_label[:int(Totalnb * 0.9)]
X_test = Train_data[int(Totalnb * 0.1):, :]
Y_test = Train_label[int(Totalnb * 0.1):]
# 2. Building the Model
# declare input layer for CNN+LSTM architecture
video = Input(shape=(timesteps,img_col,img_row,img_channel))
STEPS_PER_EPOCH = 120
# AlexNet layers
model = tf.keras.models.Sequential([
    # 1st conv
    tf.keras.layers.Conv2D(96, (11, 11), strides=(4, 4), activation='relu', input_shape=(img_col, img_row, img_channel)),
    tf.keras.layers.BatchNormalization(),
    tf.keras.layers.MaxPooling2D(2, strides=(2, 2)),
    # 2nd conv
    tf.keras.layers.Conv2D(256, (5, 5), strides=(1, 1), activation='relu', padding="same"),
    tf.keras.layers.BatchNormalization(),
    tf.keras.layers.MaxPooling2D(2, strides=(2, 2)),
    # 3rd conv
    tf.keras.layers.Conv2D(384, (3, 3), strides=(1, 1), activation='relu', padding="same"),
    tf.keras.layers.BatchNormalization(),
    # 4th conv
    tf.keras.layers.Conv2D(384, (3, 3), strides=(1, 1), activation='relu', padding="same"),
    tf.keras.layers.BatchNormalization(),
    # 5th conv
    tf.keras.layers.Conv2D(256, (3, 3), strides=(1, 1), activation='relu', padding="same"),
    tf.keras.layers.BatchNormalization(),
    tf.keras.layers.MaxPooling2D(2, strides=(2, 2)),
])
model.trainable = True
# FC Dense Layer
x = model.output
x = Flatten()(x)
cnn_out = Dense(128)(x)
# Construct CNN model
Lstm_inp = Model(model.input, cnn_out)
# Distribute CNN output by timesteps
encoded_frames = TimeDistributed(Lstm_inp)(video)
# Construct LSTM model
encoded_sequence = LSTM(256)(encoded_frames)
hidden_Drop = Dropout(0.2)(encoded_sequence)
hidden_layer = Dense(128)(hidden_Drop)
outputs = Dense(n_labels, activation="softmax")(hidden_layer)
# Construct CNN+LSTM model
model = Model([video], outputs)
# 3. Setting up the Model Learning Process
# Model Compile
opt = SGD(lr=0.01)
model.compile(loss = "categorical_crossentropy", optimizer = opt, metrics=['accuracy'])
model.summary()
# 4. Training the Model
hist = model.fit(X_train, Y_train, batch_size=batch_size, validation_split=validation_ratio, shuffle=True, epochs=num_epochs)
Y_pred2 = model.predict(X_test)
y_pred = np.argmax(Y_pred2, axis=1)  # predictions
y_test = np.argmax(Y_test, axis=0)
from sklearn.metrics import confusion_matrix
confusion_matrix(y_test, y_pred)
import seaborn as sns
import matplotlib.pyplot as plt
f, ax = plt.subplots(figsize=(8,5))
sns.heatmap(confusion_matrix(y_test, y_pred), annot=True, fmt=".0f", ax=ax)
plt.xlabel("Y_head")
plt.ylabel("Y_true")
plt.show()
from sklearn.metrics import classification_report
print(classification_report(y_test, y_pred))
Everything seems fine and works, but the error comes up when I try to build the confusion matrix, on the line confusion_matrix(y_test, y_pred).
I still can't figure out what the problem might be.
I hope someone can help me with this. Thank you so much, guys!
Posting my comments as an answer for completeness:
One thing that looks a bit odd is that you take different axes when calculating the argmax for y_pred and y_test. That might be OK depending on your data layout, but here y_test and y_pred seem to end up with different lengths. Check the shapes of Y_pred2 and Y_test and verify that the axes over which you calculate the argmax are correct.
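Based on that diagnosis, here is a minimal sketch of the fix, assuming Y_pred2 and Y_test from the code above are both shaped (n_samples, n_labels):
# Take the argmax over the class axis (axis=1) for BOTH arrays,
# so each becomes a 1-D vector with one class index per sample
y_pred = np.argmax(Y_pred2, axis=1)
y_test = np.argmax(Y_test, axis=1)  # was axis=0, which returned only n_labels entries
print(y_pred.shape, y_test.shape)   # the lengths should now match
print(confusion_matrix(y_test, y_pred))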
Related
I'm trying to build a NARX NN with Keras. I'm still not 100% sure about the use of the argument return_sequences=True in the LSTM neurons but, before I can check that, I need to make the code work. When I try to run it I get the following message:
ValueError: Error when checking input: expected lstm_84_input to have 3 dimensions, but got array with shape (6686, 3)
See my code below. The error is raised while running the model.fit command. My data has the shape 40101 time steps x 6 features (3 exogenous inputs, 3 system responses).
import numpy as np
import pandas as pd
from sklearn.model_selection import TimeSeriesSplit
import tensorflow as tf
from tensorflow.keras import initializers
# --- main
data = pd.read_excel('example.xlsx',usecols=['wave','wind','current','X','Y','RZ'])
data.plot(subplots=True, figsize=[15,10])
x_data = np.array(data.loc[:,['wave','wind','current']])
y_data = np.array(data.loc[:,['X','Y','RZ']])
timeSeriesCrossValidation = TimeSeriesSplit(n_splits=5)
for train, validation in timeSeriesCrossValidation.split(x_data, y_data):
    # create model
    model = tf.keras.models.Sequential()
    # input layer
    model.add(tf.keras.layers.LSTM(units=50,
                                   input_shape=(40101, 3),
                                   dropout=0.01,
                                   recurrent_dropout=0.2,
                                   kernel_initializer=initializers.RandomNormal(mean=0, stddev=.5),
                                   bias_initializer=initializers.Zeros(),
                                   return_sequences=True))
    # 1st hidden layer
    model.add(tf.keras.layers.LSTM(units=50,
                                   dropout=0.01,
                                   recurrent_dropout=0.2,
                                   kernel_initializer=initializers.RandomNormal(mean=0, stddev=.5),
                                   bias_initializer=initializers.Zeros(),
                                   return_sequences=True))
    # 2nd hidden layer
    model.add(tf.keras.layers.LSTM(units=50,
                                   dropout=0.01,
                                   recurrent_dropout=0.2,
                                   kernel_initializer=initializers.RandomNormal(mean=0, stddev=.5),
                                   bias_initializer=initializers.Zeros(),
                                   return_sequences=False))
    # output layer
    model.add(tf.keras.layers.Dense(3))
    model.compile(loss='mse', optimizer='nadam', metrics=['accuracy'])
    model.fit(x_data[train], y_data[train],
              verbose=2,
              batch_size=None,
              epochs=10,
              validation_data=(x_data[validation], y_data[validation])
              # callbacks=early_stop
              )
    prediction = model.predict(x_data[validation])
    y_validation = y_data[validation]
LSTM layers need input in 3 dimensions:
(n_samples, time_steps, features)
You passed data with this format:
(n_samples, features)
Since you don't have a function to create time steps, the easiest solution would be to change your input to shape:
(40101, 1, 3)
Bogus data:
x_data = np.random.rand(40101, 1, 3)
y_data = np.random.rand(40101, 3)
Also, you shouldn't pass the number of samples in the input_shape argument of a Keras layer. Just use this:
input_shape=(1, 3)
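Alternatively, instead of bogus arrays you can just reshape the real data you loaded; a one-line sketch, assuming x_data is the (40101, 3) exogenous-input array from your code:
x_data = x_data.reshape(40101, 1, 3)  # one time step per sample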
So here is the corrected code (with bogus data):
import numpy as np
from sklearn.model_selection import TimeSeriesSplit
import tensorflow as tf
from tensorflow.keras import initializers
from tensorflow.keras.layers import *
x_data = np.random.rand(40101, 1, 3)
y_data = np.random.rand(40101, 3)
timeSeriesCrossValidation = TimeSeriesSplit(n_splits=5)
for train, validation in timeSeriesCrossValidation.split(x_data, y_data):
    # create model
    model = tf.keras.models.Sequential()
    # input layer
    model.add(LSTM(units=5,
                   input_shape=(1, 3),
                   dropout=0.01,
                   recurrent_dropout=0.2,
                   kernel_initializer=initializers.RandomNormal(mean=0, stddev=.5),
                   bias_initializer=initializers.Zeros(),
                   return_sequences=True))
    # 1st hidden layer
    model.add(LSTM(units=5,
                   dropout=0.01,
                   recurrent_dropout=0.2,
                   kernel_initializer=initializers.RandomNormal(mean=0, stddev=.5),
                   bias_initializer=initializers.Zeros(),
                   return_sequences=True))
    # 2nd hidden layer
    model.add(LSTM(units=50,
                   dropout=0.01,
                   recurrent_dropout=0.2,
                   kernel_initializer=initializers.RandomNormal(mean=0, stddev=.5),
                   bias_initializer=initializers.Zeros(),
                   return_sequences=False))
    # output layer
    model.add(tf.keras.layers.Dense(3))
    model.compile(loss='mse', optimizer='nadam', metrics=['accuracy'])
    model.fit(x_data[train], y_data[train],
              verbose=2,
              batch_size=None,
              epochs=1,
              validation_data=(x_data[validation], y_data[validation])
              # callbacks=early_stop
              )
    prediction = model.predict(x_data[validation])
    y_validation = y_data[validation]
If you want a function to create time steps, use this:
def multivariate_data(dataset, target, start_index, end_index, history_size,
                      target_size, step, single_step=False):
    data = []
    labels = []
    start_index = start_index + history_size
    if end_index is None:
        end_index = len(dataset) - target_size
    for i in range(start_index, end_index):
        indices = range(i - history_size, i, step)
        data.append(dataset[indices])
        if single_step:
            labels.append(target[i + target_size])
        else:
            labels.append(target[i:i + target_size])
    return np.array(data), np.array(labels)
It will give you the right shape, e.g.:
multivariate_data(dataset=np.random.rand(40101, 3),
                  target=np.random.rand(40101, 3),
                  start_index=0, end_index=len(x_data),
                  history_size=5, target_size=0, step=1,
                  single_step=True)[0].shape
(40096, 5, 3)
You lose 5 data points because at the beginning you can't look 5 steps back into the past.
EDIT: it seems like I did not even run the model for enough epochs, so I will try that out and return with my results
I am trying to create a CNN that classifies 3D brain images. However, the CNN always predicts the same class when I run it, and I am not sure what else I can do to prevent this. I have searched for this problem and found many plausible solutions, but they did not work.
So far, I have tried:
Decreasing the learning rate
Normalizing the data to [0, 1]
Changing optimizers
Using only sigmoid and binary_crossentropy
Adding/removing dropout layers
Changing to a simpler CNN model
Balancing the dataset
Adding augmented data using a custom 3D ImageDataGenerator (link: https://github.com/dhuy228/augmented-volumetric-image-generator)
For context, I am classifying between two groups, using a total of 200 3D brain images (about 100 per category). To increase my training set size, I used a custom data augmentation I found on GitHub.
Looking at the learning curves, the accuracy and loss are completely random: on some runs they decrease, on some they increase, and on some they fluctuate within a range.
Any help would be appreciated!
import os
import csv
import tensorflow as tf # 2.0
import nibabel as nib
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import OneHotEncoder, LabelEncoder
from keras.models import Model
from keras.layers import Conv3D, MaxPooling3D, Dense, Dropout, Activation, Flatten
from keras.layers import Input, concatenate
from keras import optimizers
from keras.utils import to_categorical
from sklearn.metrics import accuracy_score
from sklearn.metrics import confusion_matrix
import seaborn as sns
import matplotlib.pyplot as plt
from augmentedvolumetricimagegenerator.generator import customImageDataGenerator
from keras.callbacks import EarlyStopping
# Administrative items
os.environ['TF_CPP_MIN_LOG_LEVEL'] = '2'
# Where the file is located
path = r'C:\Users\jesse\OneDrive\Desktop\Research\PD\decline'
folder = os.listdir(path)
target_size = (96, 96, 96)
# creating x - converting images to array
def read_image(path, folder):
    mri = []
    for i in range(len(folder)):
        files = os.listdir(path + '\\' + folder[i])
        for j in range(len(files)):
            image = np.array(nib.load(path + '\\' + folder[i] + '\\' + files[j]).get_fdata())
            image = np.resize(image, target_size)
            image = np.expand_dims(image, axis=3)
            image /= 255.
            mri.append(image)
    return mri
# creating y - one hot encoder
def create_y():
    excel_file = r'C:\Users\jesse\OneDrive\Desktop\Research\PD\decline_label.xlsx'
    excel_read = pd.read_excel(excel_file)
    excel_array = np.array(excel_read['Label'])
    label = LabelEncoder().fit_transform(excel_array)
    label = label.reshape(len(label), 1)
    onehot = OneHotEncoder(sparse=False).fit_transform(label)
    return onehot
# Splitting image train/test
x = np.asarray(read_image(path, folder))
y = np.asarray(create_y())
x_split, x_test, y_split, y_test = train_test_split(x, y, test_size=.2, stratify=y)
x_train, x_val, y_train, y_val = train_test_split(x_split, y_split, test_size=.25, stratify=y_split)
print(x_train.shape, x_val.shape, x_test.shape, y_train.shape, y_val.shape, y_test.shape)
batch_size = 10
num_classes = len(folder)
inputs = Input((96, 96, 96, 1))
conv1 = Conv3D(32, [3, 3, 3], padding='same', activation='relu')(inputs)
conv1 = Conv3D(32, [3, 3, 3], padding='same', activation='relu')(conv1)
pool1 = MaxPooling3D(pool_size=(2, 2, 2), padding='same')(conv1)
drop1 = Dropout(0.5)(pool1)
conv2 = Conv3D(64, [3, 3, 3], padding='same', activation='relu')(drop1)
conv2 = Conv3D(64, [3, 3, 3], padding='same', activation='relu')(conv2)
pool2 = MaxPooling3D(pool_size=(2, 2, 2), padding='same')(conv2)
drop2 = Dropout(0.5)(pool2)
conv3 = Conv3D(128, [3, 3, 3], padding='same', activation='relu')(drop2)
conv3 = Conv3D(128, [3, 3, 3], padding='same', activation='relu')(conv3)
pool3 = MaxPooling3D(pool_size=(2, 2, 2), padding='same')(conv3)
drop3 = Dropout(0.5)(pool3)
flat1 = Flatten()(drop3)
dense1 = Dense(128, activation='relu')(flat1)
drop5 = Dropout(0.5)(dense1)
dense2 = Dense(num_classes, activation='sigmoid')(drop5)
model = Model(inputs=[inputs], outputs=[dense2])
opt = optimizers.Adagrad(lr=1e-5)
model.compile(loss='binary_crossentropy', optimizer=opt, metrics=['accuracy'])
train_datagen = customImageDataGenerator(
    horizontal_flip=True
)
val_datagen = customImageDataGenerator()
training_set = train_datagen.flow(x_train, y_train, batch_size=batch_size)
validation_set = val_datagen.flow(x_val, y_val, batch_size=batch_size)
callbacks = EarlyStopping(monitor='val_loss', patience=3)
history = model.fit_generator(training_set,
                              steps_per_epoch=10,
                              epochs=20,
                              validation_steps=5,
                              callbacks=[callbacks],
                              validation_data=validation_set)
score = model.evaluate(x_test, y_test, batch_size=batch_size)
print(score)
y_pred = model.predict(x_test, batch_size=batch_size)
y_test = np.argmax(y_test, axis=1)
y_pred = np.argmax(y_pred, axis=1)
confusion = confusion_matrix(y_test, y_pred)
map = sns.heatmap(confusion, annot=True)
print(map)
acc = history.history['accuracy']
val_acc = history.history['val_accuracy']
loss = history.history['loss']
val_loss = history.history['val_loss']
plt.figure(1)
plt.plot(acc)
plt.plot(val_acc)
plt.ylabel('accuracy')
plt.xlabel('epoch')
plt.legend(['train', 'test'], loc='best')
plt.title('Accuracy')
plt.figure(2)
plt.plot(loss)
plt.plot(val_loss)
plt.ylabel('loss')
plt.xlabel('epoch')
plt.legend(['train', 'test'], loc='best')
plt.title('Loss')
You can find the outputs here: https://i.stack.imgur.com/FF13P.jpg
It is kind of hard to help without the dataset itself. Still, there are one or two things I would test (a quick sketch of the first and third points follows this list):
I find the ReLU activation inappropriate for the Dense layer, and it could lead to the mono-class prediction. Try replacing the relu in your Dense(128) layer with something else (sigmoid, tanh).
Dropout is not really appropriate for images in general; you might want to look at DropBlock.
The initial learning rate is pretty low; I would start with something between 1e-3 and 1e-4.
A stupid thing that has happened to me way too often: have you visualized the image/label combinations to make sure each image has the right label?
Again, not sure this will fix everything, but I hope it might help!
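A minimal sketch of the first and third suggestions, reusing flat1, num_classes, and inputs from the question's code; the tanh activation and the 1e-3 learning rate are illustrative starting points, not tuned values:
dense1 = Dense(128, activation='tanh')(flat1)  # tanh instead of relu
drop5 = Dropout(0.5)(dense1)
dense2 = Dense(num_classes, activation='sigmoid')(drop5)
model = Model(inputs=[inputs], outputs=[dense2])
model.compile(loss='binary_crossentropy',
              optimizer=optimizers.Adagrad(lr=1e-3),  # higher initial learning rate
              metrics=['accuracy'])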
This could be any number of things, but it is possible that the misbehaviour is being caused by the data itself.
Just from looking at the code, it seems like you haven't normalized the testing data before calling model.predict or model.evaluate in the same way as you have done for the training and validation data.
I had a similar problem once and it turned out this was the cause. As a quick test you can just rescale the test data and see if that helps.
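For example, a quick sketch, assuming x_test were still holding raw, un-normalized test images:
x_test = x_test / 255.0  # match the training-data normalization
score = model.evaluate(x_test, y_test, batch_size=batch_size)
print(score)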
My CNN code in keras is as follows:
from keras.models import Sequential
from keras.layers import Convolution2D
from keras.layers import MaxPooling2D
from keras.layers import Flatten
from keras.layers import Dense
from keras.layers import Dropout
classifier = Sequential()
#1st Conv layer
classifier.add(Convolution2D(64, (9, 9), input_shape=(64, 64, 3), activation='relu'))
classifier.add(MaxPooling2D(pool_size=(4,4)))
#2nd Conv layer
classifier.add(Convolution2D(32, (3, 3), activation='relu'))
classifier.add(MaxPooling2D(pool_size=(2,2)))
#Flattening
classifier.add(Flatten())
# Step 4 - Full connection
classifier.add(Dense(units = 128, activation = 'relu'))
classifier.add(Dropout(0.1))
classifier.add(Dense(units = 128, activation = 'relu'))
classifier.add(Dropout(0.2))
classifier.add(Dense(units = 128, activation = 'relu'))
classifier.add(Dense(units = 2, activation = 'softmax'))
classifier.compile(optimizer = 'adam', loss = 'categorical_crossentropy', metrics = ['accuracy'])
#Fitting dataset
from keras.preprocessing.image import ImageDataGenerator
train_datagen = ImageDataGenerator(rescale=1./255,
                                   shear_range=0.2,
                                   zoom_range=0.2,
                                   horizontal_flip=True)
test_datagen = ImageDataGenerator(rescale = 1./255)
training_set = train_datagen.flow_from_directory('dataset/training_set',
                                                 target_size=(64, 64),
                                                 batch_size=32,
                                                 class_mode='categorical')
test_set = test_datagen.flow_from_directory('dataset/test_set',
                                            target_size=(64, 64),
                                            batch_size=32,
                                            class_mode='categorical')
classifier.fit_generator(
    training_set,
    steps_per_epoch=(1341 + 3875) / 32,
    epochs=15,
    validation_data=test_set,
    validation_steps=(234 + 390) / 32)
Wherever I see roc_curve from sklearn.metrics used, it takes parameters like x_train, y_train, x_test, y_test, which I know can be pandas DataFrames, but that is not the case here. How do I plot the ROC curve and get the AUC score for a CNN trained like this?
I got it working. All I had to do was match the datatype of preds, obtained from preds = classifier.predict(test_set), with the true_labels I got from labels = test_set. preds is basically a numpy.ndarray containing single-element lists of np.float32 values. Converting labels to that same format and shape got roc_curve working.
Also, I had to unpack a third variable, threshold, in fpr, tpr, threshold = roc_curve(true_labels, preds), so that no ValueError: too many values to unpack popped up.
Actually, if you look at the docs of sklearn.metrics.roc_curve (as for almost every sklearn metric), it doesn't take the inputs of your model (images) as arguments; it just takes the true labels and the predicted scores. So after you run inference on the test set, which in Keras (here I'm just guessing) is something like
preds = classifier.predict(batch)
you call roc_curve as
fpr, tpr, thresholds = roc_curve(true_labels, preds)
Probably you have to change the type though, because they are tensors.
EDIT: I've checked the Keras documentation on flow_from_directory; it yields an iterator over (x, y) = (images, labels), so if you want to do some kind of post-training analysis you should get the labels using something like this:
labels = []
for _, y in test_set:
    labels.extend(list(y))
    if len(labels) >= test_set.samples:  # the iterator cycles forever, so stop after one pass
        break
And if you only have two classes, change the class_mode to binary.
To compute ROC AUC, you need scores, not the result of the final classification/decision.
Since your model has a softmax with two neurons (classifier.add(Dense(units = 2, activation = 'softmax'))) as its last layer, this is a kind of multi-class classification where the number of classes is 2. But the function roc_curve is restricted to binary classification problems, so you cannot use it with softmax.
You can replace your 2 neurons and softmax with one neuron and a sigmoid. Then it is safe to use roc_curve in a binary classification problem.
There is another function named roc_auc_score, which has an argument multi_class that converts a multiclass classification problem into multiple binary problems, e.g. auc_roc = roc_auc_score(labels, classifier.predict(...), multi_class='ovr'). However, this only returns the AUC score; it cannot help you plot the ROC curve.
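Putting this together, a minimal sketch, assuming you have switched to one sigmoid output neuron with class_mode='binary', so that true_labels and preds are 1-D arrays of the same length:
from sklearn.metrics import roc_curve, auc
import matplotlib.pyplot as plt
fpr, tpr, thresholds = roc_curve(true_labels, preds)
plt.plot(fpr, tpr, label='AUC = %.3f' % auc(fpr, tpr))
plt.plot([0, 1], [0, 1], linestyle='--')  # chance line
plt.xlabel('False positive rate')
plt.ylabel('True positive rate')
plt.legend()
plt.show()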
I understand that the features extracted from an auto-encoder can be fed into an MLP for classification or regression. This is something I did earlier.
But what if I have 2 auto-encoders? Can I extract the features from the bottleneck layers of the 2 auto-encoders and feed them into an MLP that performs classification based on those features? If yes, then how? I am not sure how to concatenate the two feature sets. I tried numpy.hstack(), which gives me an 'unhashable slice' error, whereas using tf.concat() gives me the error 'Input tensors to a Model must be Keras tensors.' The bottleneck layers of the two auto-encoders have dimension (None, 100) each, so if I stack them horizontally I should get (None, 200). The hidden layer of the MLP may contain some (num_hidden=100) neurons. Could anyone please help?
x1 = autoencoder1.get_layer('encoder2').output
x2 = autoencoder2.get_layer('encoder2').output
#inp = np.hstack((x1, x2))
inp = tf.concat([x1, x2], 1)
x = tf.concat([x1, x2], 1)
h = Dense(num_hidden, activation='relu', name='hidden')(x)
y = Dense(1, activation='sigmoid', name='prediction')(h)
mymlp = Model(inputs=inp, outputs=y)
# Compile model
mymlp.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])
# Train model
mymlp.fit(x_train, y_train, epochs=20, batch_size=8)
Updated as per @twolffpiggott's suggestion:
from keras.layers import Input, Dense, Dropout
from keras import layers
from keras.models import Model
from sklearn.preprocessing import MinMaxScaler
from sklearn.model_selection import train_test_split
import numpy as np
x1 = Data1
x2 = Data2
y = Data3
num_neurons1 = x1.shape[1]
num_neurons2 = x2.shape[1]
# Train-test split
x1_train, x1_test, x2_train, x2_test, y_train, y_test = train_test_split(x1, x2, y, test_size=0.2)
# scale data within [0-1] range
scalar = MinMaxScaler()
x1_train = scalar.fit_transform(x1_train)
x1_test = scalar.transform(x1_test)
x2_train = scalar.fit_transform(x2_train)
x2_test = scalar.transform(x2_test)
x_train = np.concatenate([x1_train, x2_train], axis=-1)
x_test = np.concatenate([x1_test, x2_test], axis=-1)
# Auto-encoder1
encoding_dim1 = 500
encoding_dim2 = 100
input_data = Input(shape=(num_neurons1,))
encoded = Dense(encoding_dim1, activation='relu', name='encoder1')(input_data)
encoded1 = Dense(encoding_dim2, activation='relu', name='encoder2')(encoded)
decoded = Dense(encoding_dim2, activation='relu', name='decoder1')(encoded1)
decoded = Dense(num_neurons1, activation='sigmoid', name='decoder2')(decoded)
# this model maps an input to its reconstruction
autoencoder1 = Model(inputs=input_data, outputs=decoded)
autoencoder1.compile(optimizer='sgd', loss='mse')
# training
autoencoder1.fit(x1_train, x1_train,
                 epochs=100,
                 batch_size=8,
                 shuffle=True,
                 validation_data=(x1_test, x1_test))
# Auto-encoder2
encoding_dim1 = 500
encoding_dim2 = 100
input_data = Input(shape=(num_neurons2,))
encoded = Dense(encoding_dim1, activation='relu', name='encoder1')(input_data)
encoded2 = Dense(encoding_dim2, activation='relu', name='encoder2')(encoded)
decoded = Dense(encoding_dim2, activation='relu', name='decoder1')(encoded2)
decoded = Dense(num_neurons2, activation='sigmoid', name='decoder2')(decoded)
# this model maps an input to its reconstruction
autoencoder2 = Model(inputs=input_data, outputs=decoded)
autoencoder2.compile(optimizer='sgd', loss='mse')
# training
autoencoder2.fit(x2_train, x2_train,
                 epochs=100,
                 batch_size=8,
                 shuffle=True,
                 validation_data=(x2_test, x2_test))
# MLP
num_hidden = 100
encoded1.trainable = False
encoded2.trainable = False
encoded1 = autoencoder1(autoencoder1.inputs)
encoded2 = autoencoder2(autoencoder2.inputs)
concatenated = layers.concatenate([encoded1, encoded2], axis=-1)
x = Dropout(0.2)(concatenated)
h = Dense(num_hidden, activation='relu', name='hidden')(x)
h = Dropout(0.5)(h)
y = Dense(1, activation='sigmoid', name='prediction')(h)
myMLP = Model(inputs=[autoencoder1.inputs, autoencoder2.inputs], outputs=y)
# Compile model
myMLP.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])
# Training
myMLP.fit(x_train, y_train, epochs=200, batch_size=8)
# Testing
myMLP.predict(x_test)
giving me an error: unhashable type: 'list' from the line:
myMLP = Model(inputs=[autoencoder1.inputs, autoencoder2.inputs], outputs=y)
The problem is that you're mixing numpy arrays with Keras tensors. This can't work.
There are two approaches:
Predict numpy arrays from each autoencoder, concatenate the arrays, and send them to the third model.
Connect all models, probably making the autoencoders untrainable, and fit with one input per autoencoder.
Personally, I'd go for the first (assuming the autoencoders are already trained and don't need to change).
First approach
numpyOutputFromAuto1 = autoencoder1.predict(numpyInputs1)
numpyOutputFromAuto2 = autoencoder2.predict(numpyInputs2)
inputDataForThird = np.concatenate([numpyOutputFromAuto1, numpyOutputFromAuto2], axis=-1)
inputTensorForMlp = Input(inputDataForThird.shape[1:])
h = Dense(num_hidden, activation='relu', name='hidden')(inputTensorForMlp)
y = Dense(1, activation='sigmoid', name='prediction')(h)
mymlp = Model(inputs=inputTensorForMlp, outputs=y)
....
mymlp.fit(inputDataForThird, someY)
Second Approach
This is a little more complicated, and at first I don't see much reason to do this. (But of course there may be cases where it's a good choice)
Now we're totally forgetting numpy and working with keras tensors.
Creating the mlp on its own (good if you will use it later without the autoencoders):
inputTensorForMlp = Input(input_shape_compatible_with_concatenated_encoder_outputs)
x = Dropout(0.2)(inputTensorForMlp)
h = Dense(num_hidden, activation='relu', name='hidden')(x)
h = Dropout(0.5)(h)
y = Dense(1, activation='sigmoid', name='prediction')(h)
myMLP = Model(inputs=inputTensorForMlp, outputs=y)
We probably want the bottleneck features of the autoencoders, right? If you happened to create the autoencoders properly, with a separate encoder model and decoder model joined together, then it's easier to use just the encoder model. Otherwise:
encodedOutput1 = autoencoder1.layers[bottleneckLayer].output  # or encoder1.output
encodedOutput2 = autoencoder2.layers[bottleneckLayer].output  # or encoder2.output
Creating a joined model. The concatenation must use a keras layer (we're working with keras tensors):
concatenated = Concatenate()([encodedOutput1,encodedOutput2])
output = myMLP(concatenated)
joinedModel = Model([autoencoder1.input,autoencoder2.input],output)
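Then the joined model is compiled and fit like any other Keras model; a usage sketch, assuming x1_train, x2_train, and y_train from the question (the epoch count is arbitrary):
joinedModel.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])
joinedModel.fit([x1_train, x2_train], y_train, epochs=20, batch_size=8)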
I'd also go with Daniel's first approach (for simplicity and efficiency), but if you're interested in the second, for instance if you want to run the network end-to-end, you'd approach it like this:
# make autoencoders not trainable
autoencoder1.trainable = False
autoencoder2.trainable = False
encoded1 = autoencoder1(input_data1)
encoded2 = autoencoder2(input_data2)
concatenated = layers.concatenate([encoded1, encoded2], axis=-1)
h = Dense(num_hidden, activation='relu', name='hidden')(concatenated)
y = Dense(1, activation='sigmoid', name='prediction')(h)
myMLP = Model([input_data1, input_data2], y)
myMLP.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])
# Training
myMLP.fit([x1_train, x2_train], y_train, epochs=200, batch_size=8)
# Testing
myMLP.predict([x1_test, x2_test])
Key edits
The weights of both autoencoders should be frozen end-to-end (otherwise early-stage gradient updates from the randomly initialized MLP will likely result in the loss of much of their learning).
The autoencoder input layers should be assigned to separate variables, input_data1 and input_data2, one per autoencoder (instead of both to input_data). Even though autoencoder1.inputs looks like a single tf tensor, it is actually a list of tensors; this is the source of the unhashable type: 'list' exception, and replacing it with [input_data1, input_data2] solves the issue.
When fitting the MLP for the end-to-end model, the input should be a list of x1_train and x2_train rather than the concatenated inputs. Same when predicting.
I am trying to use Keras to make a neural network. The data I am using is https://archive.ics.uci.edu/ml/datasets/Yacht+Hydrodynamics. My code is as follows:
import numpy as np
from keras.layers import Dense, Activation
from keras.models import Sequential
from sklearn.model_selection import train_test_split
data = np.genfromtxt(r"""file location""", delimiter=',')
model = Sequential()
model.add(Dense(32, activation = 'relu', input_dim = 6))
model.add(Dense(1,))
model.compile(optimizer='adam', loss='mean_squared_error', metrics = ['accuracy'])
Y = data[:,-1]
X = data[:, :-1]
From here I have tried using model.fit(X, Y), but the accuracy of the model remains at 0. I am new to Keras, so this probably has an easy solution; apologies in advance.
My question is: what is the best way to add regression to the model so that the accuracy increases? Thanks in advance.
First of all, you have to split your dataset into a training set and a test set using the train_test_split function from sklearn.model_selection.
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size = 0.08, random_state = 0)
Also, you have to scale your values using the StandardScaler class.
from sklearn.preprocessing import StandardScaler
sc = StandardScaler()
X_train = sc.fit_transform(X_train)
X_test = sc.transform(X_test)
Then, you should add more layers in order to get better results.
Note
Usually it's good practice to apply the following formula to get a rough estimate of the total number of hidden neurons needed:
Nh = Ns / (α · (Ni + No))
where
Ni = number of input neurons.
No = number of output neurons.
Ns = number of samples in training data set.
α = an arbitrary scaling factor usually 2-10.
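For example, with this yacht dataset (308 samples, so roughly Ns = 283 after the 8% test split), Ni = 6, No = 1, and α = 2, the formula gives Nh = 283 / (2 · (6 + 1)) ≈ 20 hidden neurons.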
So our classifier becomes:
# Initialising the ANN
model = Sequential()
# Adding the input layer and the first hidden layer
model.add(Dense(32, activation = 'relu', input_dim = 6))
# Adding the second hidden layer
model.add(Dense(units = 32, activation = 'relu'))
# Adding the third hidden layer
model.add(Dense(units = 32, activation = 'relu'))
# Adding the output layer
model.add(Dense(units = 1))
The metric that you use, metrics=['accuracy'], corresponds to a classification problem. If you want to do regression, remove metrics=['accuracy']. That is, just use
model.compile(optimizer = 'adam',loss = 'mean_squared_error')
Here is a list of keras metrics for regression and classification
Also, you have to define the batch_size and epochs values for fit method.
model.fit(X_train, y_train, batch_size = 10, epochs = 100)
After you trained your network you can predict the results for X_test using model.predict method.
y_pred = model.predict(X_test)
Now you can compare y_pred, which we obtained from the neural network's prediction, with y_test, which is the real data. For this, you can create a plot using the matplotlib library.
plt.plot(y_test, color = 'red', label = 'Real data')
plt.plot(y_pred, color = 'blue', label = 'Predicted data')
plt.title('Prediction')
plt.legend()
plt.show()
It seems that our neural network learns very well.
Here is how the plot looks.
Here is the full code
import numpy as np
from keras.layers import Dense, Activation
from keras.models import Sequential
from sklearn.model_selection import train_test_split
import matplotlib.pyplot as plt
# Importing the dataset
dataset = np.genfromtxt("data.txt", delimiter='')  # empty delimiter falls back to whitespace splitting
X = dataset[:, :-1]
y = dataset[:, -1]
# Splitting the dataset into the Training set and Test set
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size = 0.08, random_state = 0)
# Feature Scaling
from sklearn.preprocessing import StandardScaler
sc = StandardScaler()
X_train = sc.fit_transform(X_train)
X_test = sc.transform(X_test)
# Initialising the ANN
model = Sequential()
# Adding the input layer and the first hidden layer
model.add(Dense(32, activation = 'relu', input_dim = 6))
# Adding the second hidden layer
model.add(Dense(units = 32, activation = 'relu'))
# Adding the third hidden layer
model.add(Dense(units = 32, activation = 'relu'))
# Adding the output layer
model.add(Dense(units = 1))
#model.add(Dense(1))
# Compiling the ANN
model.compile(optimizer = 'adam', loss = 'mean_squared_error')
# Fitting the ANN to the Training set
model.fit(X_train, y_train, batch_size = 10, epochs = 100)
y_pred = model.predict(X_test)
plt.plot(y_test, color = 'red', label = 'Real data')
plt.plot(y_pred, color = 'blue', label = 'Predicted data')
plt.title('Prediction')
plt.legend()
plt.show()