I am still a beginner in AI and deep learning but I wanted to test whether a neural network will be able to calculate the sum of two numbers so I generated a dataset of 5000 numbers and made test size = 0.3 so the training dataset will be equal to 3500 but what was weird that I found the model is training only on 110 input instead of 3500.
The code used:
import tensorflow as tf
from sklearn.model_selection import train_test_split
import numpy as np
from random import random
def generate_dataset(num_samples, test_size=0.33):
"""Generates train/test data for sum operation
:param num_samples (int): Num of total samples in dataset
:param test_size (int): Ratio of num_samples used as test set
:return x_train (ndarray): 2d array with input data for training
:return x_test (ndarray): 2d array with input data for testing
:return y_train (ndarray): 2d array with target data for training
:return y_test (ndarray): 2d array with target data for testing
"""
# build inputs/targets for sum operation: y[0][0] = x[0][0] + x[0][1]
x = np.array([[random()/2 for _ in range(2)] for _ in range(num_samples)])
y = np.array([[i[0] + i[1]] for i in x])
# split dataset into test and training sets
x_train, x_test, y_train, y_test = train_test_split(x, y, test_size=test_size)
return x_train, x_test, y_train, y_test
if __name__ == "__main__":
# create a dataset with 2000 samples
x_train, x_test, y_train, y_test = generate_dataset(5000, 0.3)
# build model with 3 layers: 2 -> 5 -> 1
model = tf.keras.models.Sequential([
tf.keras.layers.Dense(5, input_dim=2, activation="sigmoid"),
tf.keras.layers.Dense(1, activation="sigmoid")
])
# choose optimiser
optimizer = tf.keras.optimizers.SGD(learning_rate=0.1)
# compile model
model.compile(optimizer=optimizer, loss='mse')
# train model
model.fit(x_train, y_train, epochs=100)
# evaluate model on test set
print("\nEvaluation on the test set:")
model.evaluate(x_test, y_test, verbose=2)
# get predictions
data = np.array([[0.1, 0.2], [0.2, 0.2]])
predictions = model.predict(data)
# print predictions
print("\nPredictions:")
for d, p in zip(data, predictions):
print("{} + {} = {}".format(d[0], d[1], p[0]))
The 110/110 you are seeing in your image is actually the batch count, not the sample count. So 110 batches * the default batch size of 32 gives you ~3500 training samples, which matches what you'd expect as 70% of 5000.
You can see by backing into it the other way that the last batch would be a partial batch, since it's not evenly divisible by 32:
>>> (.7 * 5000) / 110
31.818181818181817
In neural networks, an epoch is a full pass over the data. It trains in small batches (also called steps), and this is the way Keras logs them.
Related
I have datasets that have more than 2000 rows and 23 columns including the age column. I have completed all of the processes for SVR. Now I want to predict the trained SVR model is where I need to input X_test to the model? Have faced an error that is
ValueError: X.shape[1] = 1 should be equal to 22, the number of features at training time
How may I resolve this problem? How may I write code for making predictions on the trained SVR model?
import pandas as pd
import numpy as np
# Make fake dataset
dataset = pd.DataFrame(data= np.random.rand(2000,22))
dataset['age'] = np.random.randint(2, size=2000)
# Separate the target from the other features
target = dataset['age']
data = dataset.drop('age', axis = 1)
X_train, y_train = data.loc[:1000], target.loc[:1000]
X_test, y_test = data.loc[1001], target.loc[1001]
X_test = np.array(X_test).reshape((len(X_test), 1))
print(X_test.shape)
SupportVectorRefModel = SVR()
SupportVectorRefModel.fit(X_train, y_train)
y_pred = SupportVectorRefModel.predict(X_test)
Output:
ValueError: X.shape[1] = 1 should be equal to 22, the number of features at training time
Your reshaping of X_test is not correct; it should be:
X_test = np.array(X_test).reshape(1, -1)
print(X_test.shape)
# (1, 22)
With that change, the rest of your code runs OK:
y_pred = SupportVectorRefModel.predict(X_test)
y_pred
# array([0.90156667])
UPDATE
In the case as you show it in your code, obviously X_test consists of one single sample, as defined here:
X_test, y_test = data.loc[1001], target.loc[1001]
But if (as I suspect) this is not what you actually want, but in fact you want the rest of your data as your test set, you should change the definition to:
X_test, y_test = data.loc[1001:], target.loc[1001:]
X_test.shape
# (999, 22)
and without any reshaping
y_pred = SupportVectorRefModel.predict(X_test)
y_pred.shape
# (999,)
i.e. a y_pred of 999 predictions.
Currently, I'm playing with Stocks Predictions task which I try to solve using LSTM/GRU.
Problem: After training LSTM/GRU I get huge drop predicted values
Model training process
Train, test data is simply generated using pd.shift in series_to_supervised function.
df['Mid'] = df['Low'] + df['High'] / 2
n_lag = 1 # Lag columns back
n_seq = 1*50 # TimeSteps to predict
seq_col = 'Mid'
seq_col_t = f'{seq_col}(t)'
split_date = '2018-01-01'
def series_to_supervised(data: pd.DataFrame,
seq_col: str,
n_in: int = 1,
n_out: int = 1,
drop_seq_col: bool = True,
dropna: bool = True):
"""Convert time series into supervised learning problem
{input sequence, forecast sequence}
"""
# input sequence (t-n, ... t-1) -> pisitive shift
for i in range(n_in, 0, -1):
data[f'{seq_col}(t-{i})'] = data[seq_col].shift(i)
# no sequence -> no shift
data[f'{seq_col}(t)'] = data[seq_col]
for i in range(1, n_out+1):
# forecast sequence (t, t+1, ... t+n) -> negative shift
data[f'{seq_col}(t+{i})'] = data[seq_col].shift(-i)
if drop_seq_col:
data = data.drop(seq_col, axis=1)
if dropna:
data.dropna(inplace=True)
return data
df = series_to_supervised(df, seq_col=seq_col, n_in=n_lag, n_out=n_seq)
mask = df.index < split_date
train, test = df[mask], df[~mask]
X_cols = ['Mid(t-1)']
y_cols = train.filter(like='Mid(t+').columns
X_train, y_train, X_test, y_test = train[X_cols], train[y_cols], test[X_cols], test[y_cols]
from sklearn.preprocessing import MinMaxScaler
scaler = MinMaxScaler(feature_range=(-1, 1))
# also returns np.ndarray
X_train = scaler.fit_transform(X_train)
X_test = scaler.fit_transform(X_test)
y_train = y_train.values
y_test = y_test.values
from keras.models import Sequential
from keras.layers import Dense, Dropout, LSTM, GRU
from keras.optimizers import Adam, RMSprop, Adamax
from keras.callbacks import ModelCheckpoint
def get_model(X, y, n_batch):
num_classes=y.shape[1]
# design network
model = Sequential()
# For Stock Predictions has to be used LSTM stateful=True
model.add(GRU(10, batch_input_shape=(n_batch, X.shape[1], X.shape[2]), stateful=True))
model.add(Dropout(0.3))
model.add(Dense(num_classes))
opt = Adam(learning_rate=0.01)
# opt = RMSprop(learning_rate=0.001)
model.compile(loss='mean_squared_error', optimizer=opt)
return model
def reshape_batch(X_train, y_train, X_test, y_test, n_batch):
# reshape training into [samples, timesteps, features]
X_train = X_train.reshape(X_train.shape[0], 1, X_train.shape[1])
X_test = X_test.reshape(X_test.shape[0], 1, X_test.shape[1])
# cut to equally divided n_batches (without reminder).
# needed for LSTM stateful=True
train_cut = X_train.shape[0] % n_batch
test_cut = X_test.shape[0] % n_batch
if train_cut > 0:
X_train = X_train[:-train_cut]
y_train = y_train[:-train_cut]
if test_cut > 0:
X_test = X_test[:-test_cut]
y_test = y_test[:-test_cut]
return X_train, y_train, X_test, y_test
# fit an LSTM network to training data
def fit_lstm(X_train: np.ndarray,
y_train: np.ndarray,
n_lag: int,
n_seq: int,
n_batch: int,
nb_epoch: int,
X_test: np.ndarray=None,
y_test: np.ndarray=None):
model = get_model(X_train, y_train, n_batch)
# fit network
history = model.fit(X_train, y_train, validation_data=(X_test, y_test), callbacks=None,
epochs=nb_epoch, batch_size=n_batch, verbose=1, shuffle=False)
print('Predict:', model.predict(X_test, batch_size=n_batch))
model.reset_states()
return model, history
n_batch = 32
nb_epoch = 40
X_train, y_train, X_test, y_test = reshape_batch(X_train, y_train, X_test, y_test, n_batch)
model, history = fit_lstm(X_train, y_train, n_lag, n_seq, n_batch, nb_epoch, X_test=X_test, y_test=y_test)
What I Have tried
Different optimizers (kinda all available in keras)
DIfferent recurrent network structures (GRU/LSTM)
Different learning rates
Different epochs from 1 to 1500
Adding/Removing Drop layers with different params (0.1-0.7)
Different LSTM/GRU amount of neurons (1-100)
Number of LSTM/GRU layers, via return_sequences params with more Drop layers.
Different number of forecasts(t+1,t+2 ... t+n) features from 1-365
Different number of lag (t-1, t-2, t-n ...) features from 1-5
Different scale normalization borders (0,1) and (-1,1)
Different n_batch values: 1,8,16,32
What can affect LSTM/GRU give so strange behaviour? And What else should I try to make it work the normal way?
I have a CNN model built in Keras. I then took out its last layer as a feature and retrained an SVM with it.
Is it possible to now find the gradient of the SVMs output wrt the CNN model's input?
I know of this method (Getting gradient of model output w.r.t weights using Keras) and am able to use it to get the gradient wrt input for the layer i am pulling the features out of. I can also get the numerical gradient of the SVM wrt to its input, albeit at the moment its a bit of a mess. Would appreciate some input here as well actually.
But now I need to somehow combine these two to get the gradient of the SVM to the input of the entire CNN model.
"""
Main CNN script
"""
# Imports ##
# general
import matplotlib.pyplot as plt
import numpy as np
# ML libraries
from tensorflow.keras.datasets import mnist
# ML utilities
from tensorflow.keras.utils import to_categorical
# Python scripts used
import train_CNN
import load_CNN
import train_subSVMs
import load_subSVMs
import train_finalSVM
import load_finalSVM
import joblib
def save_array(array, name):
joblib.dump(array, name+'.pkl', compress = 3)
return
def load_array(array, name):
array = joblib.load(array, name)
return array
def show_data_example(i, dataset):
# show some of the images in the dataset
# call multiple times for multiple images
# squeeze is necessary here to get rid of the extra dimension introduced in rehsaping
print('\nExample Image: %s from selected dataset' %i)
plt.imshow(np.squeeze(dataset[i]), cmap=plt.get_cmap('gray'))
plt.show()
return
def load_and_encode(target_shape):
# load dataset
(X_train, y_train), (X_test, y_test) = mnist.load_data()
X_train, y_train = X_train[:,:,:],y_train[:]
X_test, y_test = X_test[:,:,:], y_test[:]
print('Loaded Mnist dataset')
print('Train: X=%s, y=%s' % (X_train.shape, y_train.shape))
print('Test: X=%s, y=%s' % (X_test.shape, y_test.shape))
# encode y data
y_train = to_categorical(y_train)
y_test = to_categorical(y_test)
# normalise X data (X/255 -> [0,1])
X_train = X_train/255.0
X_test = X_test/255.0
# currently dimensions are (m x 28 x 28)
# making them into (m x 28x28x1) 3Dimensional for convolution networks
X_train = X_train.reshape(X_train.shape[0], target_shape[0], target_shape[1], target_shape[2])
X_test = X_test.reshape(X_test.shape[0], target_shape[0], target_shape[1], target_shape[2])
# show an arbitary example image from training set
show_data_example(12, X_train)
return X_train, y_train, X_test, y_test
image_shape = (28,28,1)
# load and encode mnist data
X_train, y_train, X_test, y_test = load_and_encode(image_shape)
# hyper-parameters
learning_rate = 0.1
momentum = 0.9
dropout = 0.5
batch_size = 128
epochs = 50
decay = 1e-6
number_of_classes = 10
# store required data into a packet to send to various imports
packet = [learning_rate, momentum, dropout, batch_size, epochs, decay,
number_of_classes, image_shape,
X_train, y_train, X_test, y_test]
data = [X_train, y_train, X_test, y_test]
#CNN_model = train_CNN.train_model(packet, save_model = 'True')
CNN_model = load_CNN.load_model(packet) # keras sequential model
#subSVM1, subSVM2, subSVM3, features = train_subSVMs.train(CNN_model, data, c=0.1, save_model = 'True', get_accuracies= 'True')
subSVM1, subSVM2, subSVM3, features = load_subSVMs.load(CNN_model, data, c=0.1, get_accuracies='False')
subSVMs = [subSVM1, subSVM2, subSVM3]
feature1_train, feature1_test,\
feature2_train, feature2_test,\
feature3_train, feature3_test = features
final_SVM = joblib.load('saved_finalSVM.pkl') # sklearn svm trained from features
NUMBER = 48
plt.imshow(np.squeeze(X_train[NUMBER,:,:,:]), cmap=plt.get_cmap('binary'))
# gradients of features wrt to input
import tensorflow.keras.backend as K
gradients = K.gradients(CNN_model.get_layer(name='feature1').output, CNN_model.input) # K.gradients(y,x) for dy/dx
f = K.function([CNN_model.input], gradients)
x = np.expand_dims(X_train[NUMBER,:,:,:],axis=0)
a=f([x])
Using Keras, I have images in X and labels in Y. Then I do:
train_datagen = ImageDataGenerator(validation_split = 0.25)
train_generator = train_datagen.flow(X, Y, subset = 'training')
My question is: when train_generator is used within fit_generator of a model, how many samples from each class are actually presented as training samples?
For example, if I have 1000 (x, y) pairs for 3 classes: 500 for class A, 300 for class B and 200 for class C, how many samples from class A, B and C do fit_generator really see as training samples? Or all we can do is: 500*(1.0 - 0.25) and so on?
If we inspect the relevant part of the source code, we would realize that the last validation_split * num_samples samples in the X (and y) will be used for the validation and the others will be used for training:
split_idx = int(len(x) * image_data_generator._validation_split)
# ...
if subset == 'validation':
x = x[:split_idx]
x_misc = [np.asarray(xx[:split_idx]) for xx in x_misc]
if y is not None:
y = y[:split_idx]
else:
x = x[split_idx:]
x_misc = [np.asarray(xx[split_idx:]) for xx in x_misc]
if y is not None:
y = y[split_idx:]
So it is your responsibility if you want to make sure the proportion of classes is the same in both training and validation subsets (i.e. Keras does not guarantee that when using this functionality). The only thing that Keras verifies is that at least one sample from each class is included in both training and validation subsets:
if not np.array_equal(
np.unique(y[:split_idx]),
np.unique(y[split_idx:])):
raise ValueError('Training and validation subsets '
'have different number of classes after '
'the split. If your numpy arrays are '
'sorted by the label, you might want '
'to shuffle them.')
So the solution to have a stratified split (i.e. preserving the porportion of samples for each class in training and validation splits) is to use sklearn.model_selection.train_test_split with stratify argument set:
from sklearn.model_selection import train_test_split
val_split = 0.25
X_train, X_val, y_train, y_val = train_test_split(X, y, test_size=val_split, stratify=y)
X = np.concatenate((X_train, X_val))
y = np.concatenate((y_train, y_val))
Now you can pass validation_split=val_split to ImageDataGenerator and it is guaranteed that the proportion of classes is the same in both training and validation subsets.
I have built an autoencoder (1 encoder 8:5, 1 decoder 5:8) which takes the Pima-Indian-Diabetes dataset (https://raw.githubusercontent.com/jbrownlee/Datasets/master/pima-indians-diabetes.data.csv) and reduces its dimension (from 8 to 5). I would now like to use these reduced features to classify the data using an mlp. Now, here, I have some problems with the basic understanding of the architecture. How do I use the weights of the autoencoder and feed them into the mlp? I have checked these threads - https://github.com/keras-team/keras/issues/91 and https://www.codementor.io/nitinsurya/how-to-re-initialize-keras-model-weights-et41zre2g. The question here is which weight matrix should I consider? the one for the encoder part or the decoder part? When I add the layers for the mlp, how do I initialise the weights with these saved weights, not getting the exact syntax. Also, should my mlp start with 5 neurons since my reduced dimension is 5? What are the possible dimensions of the mlp for this binary classification problem? If anyone could elaborate please?
The deep autoencoder code is as follows:
# from keras.models import Sequential
from keras.layers import Input, Dense
from keras.models import Model
from sklearn.preprocessing import MinMaxScaler
from sklearn.model_selection import train_test_split
import numpy
# Data pre-processing...
# load pima indians dataset
dataset = numpy.loadtxt("C:/Users/dibsa/Python Codes/pima.csv", delimiter=",")
# split into input (X) and output (Y) variables
X = dataset[:, 0:8]
Y = dataset[:, 8]
# Split data into training and testing datasets
x_train, x_test, y_train, y_test = train_test_split(
X, Y, test_size=0.2, random_state=42)
# scale the data within [0-1] range
scalar = MinMaxScaler()
x_train = scalar.fit_transform(x_train)
x_test = scalar.fit_transform(x_test)
# Autoencoder code begins here...
encoding_dim1 = 5 # size of encoded representations
encoding_dim2 = 3 # size of encoded representations in the bottleneck layer
# this is our input placeholder
input_data = Input(shape=(8,))
# "encoded" is the first encoded representation of the input
encoded = Dense(encoding_dim1, activation='relu', name='encoder1')(input_data)
# "enc" is the second encoded representation of the input
enc = Dense(encoding_dim2, activation='relu', name='encoder2')(encoded)
# "dec" is the lossy reconstruction of the input
dec = Dense(encoding_dim1, activation='sigmoid', name='decoder1')(enc)
# "decoded" is the final lossy reconstruction of the input
decoded = Dense(8, activation='sigmoid', name='decoder2')(dec)
# this model maps an input to its reconstruction
autoencoder = Model(inputs=input_data, outputs=decoded)
autoencoder.compile(optimizer='sgd', loss='mse')
# training
autoencoder.fit(x_train, x_train,
epochs=300,
batch_size=10,
shuffle=True,
validation_data=(x_test, x_test)) # need more tuning
# test the autoencoder by encoding and decoding the test dataset
reconstructions = autoencoder.predict(x_test)
print('Original test data')
print(x_test)
print('Reconstructed test data')
print(reconstructions)
#The stacked autoencoder code is as follows:
# from keras.models import Sequential
from keras.layers import Input, Dense
from keras.models import Model
from sklearn.preprocessing import MinMaxScaler
from sklearn.model_selection import train_test_split
import numpy
# Data pre-processing...
# load pima indians dataset
dataset = numpy.loadtxt("C:/Users/dibsa/Python Codes/pima.csv", delimiter=",")
# split into input (X) and output (Y) variables
X = dataset[:, 0:8]
Y = dataset[:, 8]
# Split data into training and testing datasets
x_train, x_test, y_train, y_test = train_test_split(
X, Y, test_size=0.2, random_state=42)
# scale the data within [0-1] range
scalar = MinMaxScaler()
x_train = scalar.fit_transform(x_train)
x_test = scalar.fit_transform(x_test)
# Autoencoder code goes here...
encoding_dim1 = 5 # size of encoded representations
encoding_dim2 = 3 # size of encoded representations in the bottleneck layer
# this is our input placeholder
input_data1 = Input(shape=(8,))
# the first encoded representation of the input
encoded1 = Dense(encoding_dim1, activation='relu',
name='encoder1')(input_data1)
# the first lossy reconstruction of the input
decoded1 = Dense(8, activation='sigmoid', name='decoder1')(encoded1)
# this model maps an input to its first layer of reconstructions
autoencoder1 = Model(inputs=input_data1, outputs=decoded1)
# this is the first encoder model
enc1 = Model(inputs=input_data1, outputs=encoded1)
autoencoder1.compile(optimizer='sgd', loss='mse')
# training
autoencoder1.fit(x_train, x_train, epochs=300,
batch_size=10, shuffle=True,
validation_data=(x_test, x_test))
FirstAEoutput = autoencoder1.predict(x_train)
input_data2 = Input(shape=(encoding_dim1,))
# the second encoded representations of the input
encoded2 = Dense(encoding_dim2, activation='relu',
name='encoder2')(input_data2)
# the final lossy reconstruction of the input
decoded2 = Dense(encoding_dim1, activation='sigmoid',
name='decoder2')(encoded2)
# this model maps an input to its second layer of reconstructions
autoencoder2 = Model(inputs=input_data2, outputs=decoded2)
# this is the second encoder
enc2 = Model(inputs=input_data2, outputs=encoded2)
autoencoder2.compile(optimizer='sgd', loss='mse')
# training
autoencoder2.fit(FirstAEoutput, FirstAEoutput, epochs=300,
batch_size=10, shuffle=True)
# this is the overall autoencoder mapping an input to its final reconstructions
autoencoder = Model(inputs=input_data1, outputs=encoded2)
# test the autoencoder by encoding and decoding the test dataset
reconstructions = autoencoder.predict(x_test)
print('Original test data')
print(x_test)
print('Reconstructed test data')
print(reconstructions)
If your decoder is trying to reconstruct the input, then it doesn't really make sense to me to attach your classifier to its output. I mean, why not just attach it to the input in the first time? So if you are set on using an auto-encoder, I'd say it's pretty clear that you should attach your classifier to the output of the encoder pipe.
I'm not quite sure what you mean with "use the weights of the autoencoder and feed them into the mlp". You don't feed a layer with another layer's weights, but with it's output signal. This is pretty easy to do on Keras. Let's say you defined your auto-encoder and trained it as such:
from keras Input, Model
from keras import backend as K
from keras.layers import Dense
x = Input(shape=[8])
y = Dense(5, activation='sigmoid' name='encoder')(x)
y = Dense(8, name='decoder')(y)
ae = Model(inputs=x, outputs=y)
ae.compile(loss='mse', ...)
ae.fit(x_train, x_train, ...)
K.models.save_model(ae, './autoencoder.h5')
Then you can attach a classifying layer at the encoder and create a classifier model with the following code:
# load the model from the disk if you
# are in a different execution.
ae = K.models.load_model('./autoencoder.h5')
y = ae.get_layer('encoder').output
y = Dense(1, activation='sigmoid', name='predictions')(y)
classifier = Model(inputs=ae.inputs, outputs=y)
classifier.compile(loss='binary_crossentropy', ...)
classifier.fit(x_train, y_train, ...)
That's it, really. The classifier model will now have the first embedding layer encoder of the ae model as its first layer, followed by a sigmoid decision layer predictions.
If what you are really trying to do is to use the weights learned by the auto-encoder to initialize the weights from the classifier (I'm not positive I recommend this approach):
You can take the weight matrices with layer#get_weights, prune it (because the encoder has 5 units and the classifier only has 1) and finally set the classifier weights. Something in the following lines:
w, b = ae.get_layer('encoder').get_weights()
# remove all units except by one.
neuron_to_keep = 2
w = w[:, neuron_to_keep:neuron_to_keep + 1]
b = b[neuron_to_keep:neuron_to_keep + 1]
classifier.get_layer('predictions').set_weights(w, b)
Idavid, this is for your reference - MLP using Autoencoder reduced features. I need to understand which figure is the correct one? Sorry, I had to upload the picture as an answer as there was no option of uploading an image via comment. I think you are saying figure B is the correct one. Here is the code snippet for the same. Please let me know if I going right.
# This is a mlp classification code with features reduced by an Autoencoder
# from keras.models import Sequential
from keras.layers import Input, Dense
from keras.models import Model
from sklearn.preprocessing import MinMaxScaler
from sklearn.model_selection import train_test_split
import numpy
# Data pre-processing...
# load pima indians dataset
dataset = numpy.loadtxt("C:/Users/dibsa/Python Codes/pima.csv", delimiter=",")
# split into input (X) and output (Y) variables
X = dataset[:, 0:8]
Y = dataset[:, 8]
# Split data into training and testing datasets
x_train, x_test, y_train, y_test = train_test_split(
X, Y, test_size=0.2, random_state=42)
# scale the data within [0-1] range
scalar = MinMaxScaler()
x_train = scalar.fit_transform(x_train)
x_test = scalar.fit_transform(x_test)
# Autoencoder code goes here...
encoding_dim = 5 # size of our encoded representations
# this is our input placeholder
input_data = Input(shape=(8,))
# "encoded" is the encoded representation of the input
encoded = Dense(encoding_dim, activation='relu', name='encoder')(input_data)
# "decoded" is the lossy reconstruction of the input
decoded = Dense(8, activation='sigmoid', name='decoder')(encoded)
# this model maps an input to its reconstruction
autoencoder = Model(inputs=input_data, outputs=decoded)
autoencoder.compile(optimizer='sgd', loss='mse')
# training
autoencoder.fit(x_train, x_train,
epochs=300,
batch_size=10,
shuffle=True,
validation_data=(x_test, x_test)) # need more tuning
# test the autoencoder by encoding and decoding the test dataset
reconstructions = autoencoder.predict(x_test)
print('Original test data')
print(x_test)
print('Reconstructed test data')
print(reconstructions)
# MLP code goes here...
# create model
x = autoencoder.get_layer('encoder').output
# h = Dense(3, activation='relu', name='hidden')(x)
y = Dense(1, activation='sigmoid', name='predictions')(x)
classifier = Model(inputs=autoencoder.inputs, outputs=y)
# Compile model
classifier.compile(loss='binary_crossentropy', optimizer='adam',
metrics=['accuracy'])
# Fit the model
classifier.fit(x_train, y_train, epochs=250, batch_size=10)
print('Now making predictions')
predictions = classifier.predict(x_test)
# round predictions
rounded_predicted_classes = [round(x[0]) for x in predictions]
temp = sum(y_test == rounded_predicted_classes)
acc = temp/len(y_test)
print(acc)