I am experimenting with the Keras library, trying to predict a time series, and I am getting very bad results. I would like to know why the neural network can't handle even a simple scenario. My (engineered) data look like this:
(The pattern is very simple: the result always has exactly the same value as the feature. There are 10000 lines like this.)
0, 1, 1
1, 1, 1
2, 0, 0
3, 1, 1
4, 1, 1
5, 1, 1
6, 1, 1
7, 0, 0
8, 1, 1
9, 0, 0
10, 1, 1
My Keras code:
data = pd.read_csv("data/" + csv_path)
x = data.iloc[:, 1:2]  # .ix was removed in pandas 1.0; .iloc selects by position
y = data.iloc[:, 2]
test_set_length = int(round(len(x) * TEST_SET_RATIO))
validation_set_length = int(round(len(x) * VALIDATION_SET_RATIO))
x_train_and_val = x[:-test_set_length]
y_train_and_val = y[:-test_set_length]
x_train = x_train_and_val[:-validation_set_length].values
y_train = y_train_and_val[:-validation_set_length].values
x_val = x_train_and_val[-validation_set_length:].values
y_val = y_train_and_val[-validation_set_length:].values
x_test = x[-test_set_length:].values
y_test = y[-test_set_length:].values
scaler = sklearn.preprocessing.StandardScaler().fit(x_train_and_val)
train_gen = keras.preprocessing.sequence.TimeseriesGenerator(
val_gen = keras.preprocessing.sequence.TimeseriesGenerator(
test_gen = keras.preprocessing.sequence.TimeseriesGenerator(
model = keras.models.Sequential()
model.add(keras.layers.Dense(100, activation='relu', input_shape=(TIMESERIES_LENGTH, 1)))
model.add(keras.layers.Dense(1000, activation='relu'))
model.add(keras.layers.Dense(1, activation='sigmoid'))
history = model.fit_generator(
plt.legend(['training accuracy', 'validation accuracy', 'training loss', 'validation loss'], loc='upper left')
I have tried LSTM layers, but they perform similarly badly.
Any idea what I am doing wrong? Thank you very much.
It turns out that keras.preprocessing.sequence.TimeseriesGenerator expects y (y_train in my example) to be shifted by one relative to X (x_train in my case).
The input data must be arranged so that the subsequence of X ending at index n predicts the value at index n + 1 in y. My original mistake was assuming that it predicted the value at index n.
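The alignment described above can be reproduced without Keras. The sketch below is only an illustration: make_windows is a hypothetical helper that imitates the documented pairing of data[i - length:i] with targets[i] (stride and sampling rate of 1 assumed), and the random 0/1 feature stands in for the engineered data from the question.

```python
import numpy as np

def make_windows(data, targets, length):
    """Hypothetical helper: pair the window data[i-length:i] with targets[i]."""
    xs = np.stack([data[i - length:i] for i in range(length, len(data))])
    ys = np.asarray(targets[length:])
    return xs, ys

rng = np.random.default_rng(0)
feature = rng.integers(0, 2, size=50)
label = feature.copy()  # engineered data: the label equals the feature on every row

# Unshifted targets: the window never contains the row whose label it must predict.
xs, ys = make_windows(feature, label, length=5)
unshifted_match = np.mean(xs[:, -1] == ys)

# Shifted targets: move every label one step later, so the label of row n
# sits at index n + 1 and lines up with the window that ends at row n.
shifted = np.empty_like(label)
shifted[1:] = label[:-1]
shifted[0] = label[0]  # never used as a target because length >= 1
xs_shifted, ys_shifted = make_windows(feature, shifted, length=5)
shifted_match = np.mean(xs_shifted[:, -1] == ys_shifted)

print(unshifted_match, shifted_match)
```

With the shifted targets, the last element of every window equals its target, which is the trivially learnable pattern the question intended.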
Thanks to Daniel Möller, for pointing me in the right direction.
What is the mean value of the target data? Is it zero? From my experience, the default configuration of a NN does not have a constant value, which can be obtained by having a last layer with an affine or linear activation function.
I have built a multi-class classification model with Keras, and now that the model is trained I would like to predict the class for one of my test inputs.
This is the part where I scaled features:
x = dataframe.drop("workTime", axis = 1)
x = dataframe.drop("creation", axis = 1)
from sklearn.preprocessing import StandardScaler
sc = StandardScaler()
x = pd.DataFrame(sc.fit_transform(x))
y = dataframe["workTime"]
import seaborn as sb
corr = dataframe.corr()
sb.heatmap(corr, cmap="Blues", annot=True)
print("Scaled features:", x.head(3))
Then I did:
y_cat = to_categorical(y)
x_train, x_test, y_train, y_test = train_test_split(x.values, y_cat, test_size=0.2)
And built model:
model = Sequential()
model.add(Dense(16, input_shape = (9,), activation = "relu"))
model.add(Dense(8, activation = "relu"))
model.add(Dense(6, activation = "softmax"))
model.compile(Adam(lr = 0.0001), "categorical_crossentropy", metrics = ["categorical_accuracy"])
model.fit(x_train, y_train, verbose=1, batch_size = 8, epochs=100, shuffle=True)
After training finished, I wanted to take the first element from the test data and predict its value (classify it).
print(x_test.shape, x_train.shape)  # (1550, 9) (6196, 9)
firstTest = x_test[:1]  # [[ 2.76473141 1.21064165 0.18816548 -0.94077449 -0.30981017 -0.37723917
                        #   -0.44471711 -1.44141792 0.20222467]]
prediction = model.predict(firstTest)
print(prediction)  # [[7.5265622e-01 2.4710520e-01 2.3643016e-04 2.1405797e-06 3.8411264e-19
print(prediction[0])  # [7.5265622e-01 2.4710520e-01 2.3643016e-04 2.1405797e-06 3.8411264e-19
unscaled = sc.inverse_transform(prediction)
print("prediction", unscaled)
Running this raises:
ValueError: operands could not be broadcast together with shapes (1,6) (9,) (1,6)
I think it may be related to my scalers.
Please correct me if I am wrong, but what I want to get here is either a single output value that tells me how this entry was classified, or an array of probabilities, one for each classification label.
Thank you for any hints.
Your StandardScaler was fitted on the input features; you can't apply it (or its inverse) to the outputs!
If you are looking for the probabilities of the test sample being in each class, you already have them in prediction[0].
If you want the final class predicted, just take the one with the largest probability with argmax: tf.math.argmax(prediction[0]).
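As a small sketch of the last two points, here is the same idea with plain NumPy. The probabilities are made-up illustrative values for one sample and six classes, not the asker's actual output.

```python
import numpy as np

# Illustrative softmax output for one sample and six classes
# (made-up values, not the prediction from the question).
prediction = np.array([[0.75, 0.24, 0.01, 0.0, 0.0, 0.0]])

class_probabilities = prediction[0]                     # one probability per class
predicted_class = int(np.argmax(class_probabilities))   # index of the most likely class
confidence = float(class_probabilities[predicted_class])

print(predicted_class, confidence)
```

No inverse scaling is involved: the index returned by argmax is the class label in the encoding that was passed to to_categorical.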
I am working on a multiclass problem (5 classes, highly imbalanced dataset). I would like to implement an ensemble of convolutional auto-encoders where each auto-encoder is trained on a single class, and then combine them to obtain the final classification results.
However, I am stuck at training one encoder per class. I'm getting an error that I believe has to do with how my logic handles the arrays of class labels:
IndexError: boolean index did not match indexed array along dimension 1; dimension is 1 but corresponding boolean dimension is 5
I am working with a really huge dataset, but below I provide an MWE for a 3-class problem that reproduces a similar situation:
#..scikitlearn, keras, numpy ....libraries import
class SingleAED:
def __init__(self, train, test):
self.x_train = train
self.x_test = test
def setSingleModel(self):
autoencoder = Sequential()
activ = 'relu'
autoencoder.add(Conv2D(32, (1, 3), strides=(1, 1), padding='same', activation=activ, input_shape=(1, Threshold, 4)))
autoencoder.add(BatchNormalization(axis = 3))
autoencoder.add(Conv2D(32, (1, 3), strides=(1, 1), padding='same', activation=activ ))
autoencoder.add(BatchNormalization(axis = 3))
autoencoder.add(MaxPooling2D(pool_size=(1, 2) ))
autoencoder.compile(optimizer='adam', loss='mae', metrics=['mean_squared_error'])
filepath = "weights.best.hdf5"
checkpoint = ModelCheckpoint(filepath, monitor='loss', verbose=1, save_best_only=True, mode='min')  # a loss is "best" when it is lowest
callbacks_list = [checkpoint]
autoencoder.fit(self.x_train, self.x_train, epochs=250, batch_size=256, shuffle=True,callbacks=callbacks_list)
return autoencoder
#generate dummy data
X = np.random.randn(20, 1, 5, 4)
a,b,c = np.repeat(0, 7), np.repeat(1, 7), np.repeat(2, 6)
y = np.hstack((a,b,c))
LABELS= list(set(np.ndarray.flatten(y)))
Threshold = len(X[0, 0, :, 0])
NoClass = len(LABELS)
#train-test split
x_train, x_test, y_train, y_test = train_test_split(X, y,
test_size=0.20, random_state=7)
#...to categorical
y_train = keras.utils.to_categorical(y_train, num_classes=NoClass)
y_test = keras.utils.to_categorical(y_test, num_classes=NoClass)
#train an auto-encoder per class
ensemble = []
for i in range(len(LABELS)):
sub_train = x_train[y_train == i]
sub_test = x_test[y_test == i]
autoencoder = SingleAED(sub_train, sub_test)
autoencoder = autoencoder.setSingleModel()
IndexError Traceback (most recent call last)
<ipython-input-98-e00f5454d8b5> in <module>()
2 for i in range(len(LABELS)):
3 print(LABELS[i])
----> 4 sub_train = x_train[y_train == i]
5 sub_test = x_test[y_test == i]
IndexError: boolean index did not match indexed array along dimension 1; dimension is 1 but corresponding boolean dimension is 3
In this case, I want to loop through the classes 0..2 to train an encoder per class. I am not sure why I get this error. Can someone help me sort this out?
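You want to index the x_train array with the integer y_train labels, before they are converted to categorical. After to_categorical, y_train has shape (n_samples, n_classes), so y_train == i is a 2-D boolean mask that no longer lines up with the first axis of x_train. Keep the one-hot labels in separate variables: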
x_train, x_test, y_train, y_test = train_test_split(X, y,
y_train_cat = keras.utils.to_categorical(y_train, num_classes=NoClass)
y_test_cat = keras.utils.to_categorical(y_test, num_classes=NoClass)
#train an auto-encoder per class
ensemble = []
for i in range(len(LABELS)):
sub_train = x_train[y_train == i]
sub_test = x_test[y_test == i]
autoencoder = SingleAED(sub_train, sub_test)
autoencoder = autoencoder.setSingleModel()
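The shape mismatch can be seen in isolation with NumPy alone. This is a minimal sketch using the dummy data from the question; np.eye(3)[y] is used here only as a stand-in for to_categorical.

```python
import numpy as np

X = np.random.randn(20, 1, 5, 4)
y = np.hstack((np.repeat(0, 7), np.repeat(1, 7), np.repeat(2, 6)))
one_hot = np.eye(3)[y]          # stand-in for to_categorical(y, num_classes=3)

mask_int = (y == 0)             # shape (20,): one boolean per sample
mask_one_hot = (one_hot == 0)   # shape (20, 3): does not match X's axes

sub_train = X[mask_int]         # selects the 7 samples of class 0

try:
    X[mask_one_hot]             # axis 1 of X has size 1, the mask has size 3
    raised = False
except IndexError:
    raised = True

# If only one-hot labels are available, recover the integer labels first.
recovered = one_hot.argmax(axis=1)
sub_train_again = X[recovered == 0]
```

If the integer labels were already overwritten, the argmax step at the end is one way to get them back before building the per-class subsets.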
I'm trying to classify images of PCBs into two categories (defective and non-defective) using categorical cross-entropy as the loss function. The code is below:
import numpy as np
import matplotlib.pyplot as plt
import tensorflow
from tensorflow.keras.applications import ResNet50
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense
from tensorflow.keras.optimizers import SGD
from tensorflow.keras.callbacks import EarlyStopping, ModelCheckpoint
from tensorflow.keras.applications.resnet50 import preprocess_input
from tensorflow.keras.preprocessing.image import ImageDataGenerator
from sklearn.model_selection import train_test_split
def create_compiled_model():
model = Sequential()
model.add(ResNet50(include_top=False, weights=RESNET50_WEIGHTS, input_shape=(IMAGE_SIZE, IMAGE_SIZE, 3), pooling=RESNET50_POOLING_AVERAGE))
model.add(Dense(NUM_CLASSES, activation=DENSE_LAYER_ACTIVATION))
model.layers[0].trainable = False
sgd = SGD(lr = 0.01, decay = 1e-6, momentum = 0.9, nesterov = True)
model.compile(optimizer = sgd, loss = OBJECTIVE_FUNCTION, metrics = LOSS_METRICS)
return model
def data_splitor():
x = np.load("/content/data/xtrain.npy")
y = np.load("/content/data/ytrain.npy")
# Getting the Test and Train splits
x_train, x_test, y_train, y_test = train_test_split(x, y, test_size= TRAIN_TEST_SPLIT, shuffle= True)
# Getting the Train and Validation splits
x__train, x__valid, y__train, y__valid = train_test_split(x_train, y_train, test_size= TRAIN_TEST_SPLIT, shuffle= True)
return x__train, x__valid, x_test, y__train, y__valid, y_test
def data_generator(x, y, batch_size, seed=None, shuffle=True):
    # Augment the images and apply the ResNet50 preprocessing
    augmenter = ImageDataGenerator(horizontal_flip=True, vertical_flip=True, rotation_range=180, brightness_range=[0.3, 1.0], preprocessing_function=preprocess_input)
    # Use the arrays passed in (x, y), not x_train / y_train from another scope
    generator = augmenter.flow(x, y, batch_size=batch_size, seed=seed, shuffle=shuffle)
    return generator
def run_program():
x_train, x_valid, x_test, y_train, y_valid, y_test = data_splitor()
train_generator = data_generator(x_train, y_train, BATCH_SIZE_TRAINING)
validation_generator = data_generator(x_valid, y_valid, BATCH_SIZE_VALIDATION)
cb_early_stopper = EarlyStopping(monitor = 'val_loss', patience = EARLY_STOP_PATIENCE)
cb_checkpointer = ModelCheckpoint(filepath = '/content/model/best.hdf5', monitor = 'val_loss', save_best_only = True, mode = 'auto')
model = create_compiled_model()
fit_history = model.fit_generator(
epochs = NUM_EPOCHS,
callbacks=[cb_checkpointer, cb_early_stopper]
plt.figure(1, figsize = (15,8))
plt.title('model accuracy')
plt.legend(['train', 'valid'])
plt.title('model loss')
plt.legend(['train', 'valid'])
# Testing
test_generator = data_generator(x_test, y_test, BATCH_SIZE_TESTING, 123, False)
pred = model.predict_generator(test_generator, steps = len(test_generator), verbose = 1)
predicted_class_indices = np.argmax(pred, axis = 1)
# Running the program
with tensorflow.device('/device:GPU:0'):
except RuntimeError as e:
And upon executing this, I get the ValueError seen below:
ValueError: in user code:
/usr/local/lib/python3.6/dist-packages/tensorflow/python/keras/engine/training.py:571 train_function *
outputs = self.distribute_strategy.run(
/usr/local/lib/python3.6/dist-packages/tensorflow/python/distribute/distribute_lib.py:951 run **
return self._extended.call_for_each_replica(fn, args=args, kwargs=kwargs)
/usr/local/lib/python3.6/dist-packages/tensorflow/python/distribute/distribute_lib.py:2290 call_for_each_replica
return self._call_for_each_replica(fn, args, kwargs)
/usr/local/lib/python3.6/dist-packages/tensorflow/python/distribute/distribute_lib.py:2649 _call_for_each_replica
return fn(*args, **kwargs)
/usr/local/lib/python3.6/dist-packages/tensorflow/python/keras/engine/training.py:533 train_step **
y, y_pred, sample_weight, regularization_losses=self.losses)
/usr/local/lib/python3.6/dist-packages/tensorflow/python/keras/engine/compile_utils.py:204 __call__
loss_value = loss_obj(y_t, y_p, sample_weight=sw)
/usr/local/lib/python3.6/dist-packages/tensorflow/python/keras/losses.py:143 __call__
losses = self.call(y_true, y_pred)
/usr/local/lib/python3.6/dist-packages/tensorflow/python/keras/losses.py:246 call
return self.fn(y_true, y_pred, **self._fn_kwargs)
/usr/local/lib/python3.6/dist-packages/tensorflow/python/keras/losses.py:1527 categorical_crossentropy
return K.categorical_crossentropy(y_true, y_pred, from_logits=from_logits)
/usr/local/lib/python3.6/dist-packages/tensorflow/python/keras/backend.py:4561 categorical_crossentropy
/usr/local/lib/python3.6/dist-packages/tensorflow/python/framework/tensor_shape.py:1117 assert_is_compatible_with
raise ValueError("Shapes %s and %s are incompatible" % (self, other))
ValueError: Shapes (None, 1) and (None, 2) are incompatible
I have already looked at this, this and this, but could not resolve the error.
I really appreciate the help in fixing this.
Thanks Praveen
Here is the complete traceback... link
It seems your y_train data has shape (None, 1) while your network outputs (None, 2). There are two options to solve this:
1) Change your model output to 1 unit with a sigmoid activation and change the loss to binary cross-entropy
2) Change your y_train data to categorical. See this
If you can post your model.summary() and your dataset shapes here, it will help us to help you.
Your traceback link isn't working.
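However, since you have only two classes, try replacing categorical cross-entropy with binary cross-entropy (together with a single sigmoid output unit, so that the output shape matches the (None, 1) labels).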
I had the same issue, but in my case I was using labels decoded to int64 format from TFRecord files. Changing my loss function from 'CategoricalCrossentropy' to 'SparseCategoricalCrossentropy' resolved the issue.
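To see why both the one-hot route and the sparse loss resolve the shape error, the two losses can be written out by hand. This is only a NumPy sketch of the per-sample formulas with made-up probabilities; it is not the Keras implementation.

```python
import numpy as np

# Made-up softmax outputs of a 2-unit output layer for three samples.
probs = np.array([[0.9, 0.1],
                  [0.2, 0.8],
                  [0.6, 0.4]])

int_labels = np.array([[0], [1], [1]])       # shape (3, 1): integer class ids
one_hot = np.eye(2)[int_labels.ravel()]      # shape (3, 2): matches the output

# Categorical cross-entropy needs labels with the same shape as the output.
categorical_loss = -np.sum(one_hot * np.log(probs), axis=1)

# The sparse variant picks the predicted probability of the true class directly.
sparse_loss = -np.log(probs[np.arange(len(probs)), int_labels.ravel()])

print(categorical_loss, sparse_loss)
```

Both give the same per-sample values; the only difference is the label format each one expects.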
I encountered a similar issue and the solutions mentioned above did not work. The main reason we get this error is failing to establish a 1-to-1 mapping between X_train and Y_train. This means Y_train should have a shape like (no_of_sequences, no_of_classes).
Example -
Let's say my dataset has 2000 rows and 5 features, where 1 sequence = 100 rows of data.
So before reshaping x_train will look like this
X_train.shape = (2000,5)
Before feeding it into the LSTM we should reshape it to 3D (usually), hence
X_train.shape = (20, 100, 5)
On the other hand, our Y_train will initially look like this (if it is 2D, change it to 1D by flattening):
Y_train.shape = (2000, )
So, before feeding into LSTM we should change the Y_train shape like
Y_train.shape =(20, 5)
The 20 gives the 1:1 mapping with the training set, while the 5 maps to the final dense layer of the classification model, where we are supposed to use categorical cross-entropy.
Also note that Y_train should be 2D. So how do we reshape it like that?
Check how the Y_train data are stored.
If they are strings, use a one-hot representation.
If they are integers for each class, convert them to categorical (refer).
After converting to categorical, check the shape of Y_train again.
If the number of columns equals the number of classes, use the following code to reduce the rows to 20 (like X_train)
for eachRowTemp in range(df_Y_Labels.__len__()):
if(eachRowTemp%20 == 1):
Y_Label = np.asarray(Y_Label_Array)
This should work. Also, you should change Y_test in a similar way.
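The reshaping steps above can be sketched end to end with NumPy. The numbers follow the example (2000 rows, 5 features, 100 rows per sequence); the assumptions that there are 5 classes and that every row within a sequence carries the same label are mine, made so that one label per sequence is well defined.

```python
import numpy as np

n_rows, n_features, seq_len, n_classes = 2000, 5, 100, 5
n_sequences = n_rows // seq_len                       # 20

X_train = np.random.randn(n_rows, n_features)         # (2000, 5)
# Assumed labels: every row of a sequence shares that sequence's class id.
row_labels = np.repeat(np.arange(n_sequences) % n_classes, seq_len)   # (2000,)

# Group consecutive rows into sequences: (20, 100, 5).
X_seq = X_train.reshape(n_sequences, seq_len, n_features)

# Keep one label per sequence, then one-hot encode it: (20, 5).
seq_labels = row_labels.reshape(n_sequences, seq_len)[:, 0]
Y_seq = np.eye(n_classes)[seq_labels]

print(X_seq.shape, Y_seq.shape)
```

If labels vary within a sequence, taking the first row's label is no longer appropriate and another rule (for example the last row or a majority vote) would have to be chosen.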
Thanks #Augusto maillo the link was useful to fix error.
Multiclass classification
For multi-class classification the labels have to be converted to a matrix. Use the tensorflow.keras.utils.to_categorical method to convert the labels to a matrix.
tf.keras.utils.to_categorical(y, num_classes=None, dtype='float32')
y = [0, 1, 2, 0, 2, 2, 1, 0, 1, 1, 0, 0, 1, 0, 2, 2, 0] # we have 3 classes 0, 1 & 2
y = tf.keras.utils.to_categorical(y, num_classes=3, dtype='int')
array([[1, 0, 0],
[0, 1, 0],
[0, 0, 1],
[1, 0, 0],
[0, 0, 1],
[0, 0, 1],
[0, 1, 0],
[1, 0, 0],
[0, 1, 0],
[0, 1, 0],
[1, 0, 0],
[1, 0, 0],
[0, 1, 0],
[1, 0, 0],
[0, 0, 1],
[0, 0, 1],
[1, 0, 0]])
I have used MinMax normalization to normalize my dataset, both features and label. My question is: is it correct to also normalize the label? If yes, how can I denormalize the output of the neural network (the predictions made on the normalized test set)?
I can't upload the dataset, but it consists of 18 features and 1 label. It is a regression task; the features and the label are physical quantities.
So the problem is that the y_train_pred and y_test_pred are between 0 and 1. How can I predict the "real value"?
The code:
dataset = pd.read_csv('DataSet.csv', decimal=',', delimiter = ";")
label = dataset.iloc[:,-1]
features = dataset.drop(columns = ['Label'])
features = features[best_features]
X_train1, X_test1, y_train1, y_test1 = train_test_split(features, label, test_size = 0.25, random_state = 1, shuffle = True)
y_test2 = y_test1.to_frame()
y_train2 = y_train1.to_frame()
scaler1 = preprocessing.MinMaxScaler()
scaler2 = preprocessing.MinMaxScaler()
X_train = scaler1.fit_transform(X_train1)
X_test = scaler2.fit_transform(X_test1)
scaler3 = preprocessing.MinMaxScaler()
scaler4 = preprocessing.MinMaxScaler()
y_train = scaler3.fit_transform(y_train2)
y_test = scaler4.fit_transform(y_test2)
optimizer = tf.keras.optimizers.Adamax(lr=0.001)
model = Sequential()
model.add(Dense(80, input_shape = (X_train.shape[1],), activation = 'relu',kernel_initializer='random_normal'))
model.add(Dense(120, activation = 'relu',kernel_initializer='random_normal'))
model.add(Dense(80, activation = 'relu',kernel_initializer='random_normal'))
model.add(Dense(1,activation = 'linear'))
model.compile(loss = 'mse', optimizer = optimizer, metrics = ['mse'])
history = model.fit(X_train, y_train, epochs = 300,
validation_split = 0.1, shuffle=False, batch_size=120
history_dict = history.history
loss_values = history_dict['loss']
val_loss_values = history_dict['val_loss']
y_train_pred = model.predict(X_train)
y_test_pred = model.predict(X_test)
You should denormalize so that you get real-world predictions from your neural network, rather than a number between 0 and 1.
Min-max normalization is defined by:
z = (x - min)/(max - min)
With z being the normalized value, x being the label value, max being the max x value, and min being the min x value. So if we have z, min, and max we can solve for x as follows:
x = z(max - min) + min
Thus, before you normalize your data, define variables for the max and min value of the label if it is continuous. Then, after you get your predicted values, you can use the following function:
y_max_pre_normalize = max(label)
y_min_pre_normalize = min(label)
def denormalize(y):
    # Invert min-max scaling: x = z * (max - min) + min
    final_value = y * (y_max_pre_normalize - y_min_pre_normalize) + y_min_pre_normalize
    return final_value
And apply this function to your y_test/y_pred to get the corresponding value.
You can use this link here to better visualize this.
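A quick round-trip check of the formula, using made-up label values. With scikit-learn, calling inverse_transform on the MinMaxScaler that was fitted on the training labels performs the same inversion; the sketch below uses plain NumPy so that the arithmetic is visible.

```python
import numpy as np

# Made-up label values standing in for a physical quantity.
label = np.array([12.0, 30.0, 18.0, 24.0])

y_min, y_max = label.min(), label.max()

# Forward: z = (x - min) / (max - min), giving values in [0, 1].
normalized = (label - y_min) / (y_max - y_min)

# Inverse: x = z * (max - min) + min, applied to (here: perfect) predictions.
restored = normalized * (y_max - y_min) + y_min

print(normalized, restored)
```

The same y_min and y_max, taken from the training labels, would be used to denormalize predictions on the test set.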
I'm having trouble with an LSTM in Keras.
I am trying to classify domain names as normal or fake.
My dataset is like this:
google, 0
with 50% normal and 50% fake domains
Here's my LSTM model:
def build_model(max_features, maxlen):
"""Build LSTM model"""
model = Sequential()
model.add(Embedding(max_features, 128, input_length=maxlen))
sgd = optimizers.SGD(lr=0.01, decay=1e-6, momentum=0.9, nesterov=True)
model.compile(loss='binary_crossentropy', optimizer=sgd, metrics=['acc'])
return model
Then I preprocess my text data to transform it into numbers:
"""Run train/test on logistic regression model"""
indata = data.get_data()
# Extract data and labels
X = [x[1] for x in indata]
labels = [x[0] for x in indata]
# Generate a dictionary of valid characters
valid_chars = {x:idx+1 for idx, x in enumerate(set(''.join(X)))}
max_features = len(valid_chars) + 1
maxlen = 100
# Convert characters to int and pad
X = [[valid_chars[y] for y in x] for x in X]
X = sequence.pad_sequences(X, maxlen=maxlen)
# Convert labels to 0-1
y = [0 if x == 'benign' else 1 for x in labels]
Then I split my data into training, testing and validation sets:
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2)
print("Build model...")
model = build_model(max_features, maxlen)
X_train, X_holdout, y_train, y_holdout = train_test_split(X_train, y_train, test_size=0.2)
And then I train my model on training data and validation data, and evaluate on test data:
history = model.fit(X_train, y_train, epochs=max_epoch, validation_data=(X_holdout, y_holdout), shuffle=False)
scores = model.evaluate(X_test, y_test, batch_size=batch_size)
At the end of my training/testing I have these results:
And these scores when evaluating on test dataset:
loss = 0.060554939906234596
accuracy = 0.978109902033532
However when I predict on a sample of the dataset like this:
LSTM_model = load_model('LSTMmodel_64_sgd.h5')
data = pickle.load(open('traindata.pkl', 'rb'))
#### LSTM ####
"""Run train/test on logistic regression model"""
# Extract data and labels
X = [x[1] for x in data]
labels = [x[0] for x in data]
X1, _, labels1, _ = train_test_split(X, labels, test_size=0.9)
# Generate a dictionary of valid characters
valid_chars = {x:idx+1 for idx, x in enumerate(set(''.join(X1)))}
max_features = len(valid_chars) + 1
maxlen = 100
# Convert characters to int and pad
X1 = [[valid_chars[y] for y in x] for x in X1]
X1 = sequence.pad_sequences(X1, maxlen=maxlen)
# Convert labels to 0-1
y = [0 if x == 'benign' else 1 for x in labels1]
y_pred = LSTM_model.predict(X1)
I have very poor performance:
accuracy = 0.5934741842730341
confusion matrix = [[25201 14929]
[17589 22271]]
F1-score = 0.5780171295094731
Can someone explain to me why?
I have tried 64 instead of 128 LSTM units, the adam and rmsprop optimizers, and a larger batch_size; however, performance remains very low.
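OK, so I have found the answer.
It is this line:
valid_chars = {x:idx+1 for idx, x in enumerate(set(''.join(X1)))}
In Python 3, set seems to produce a different ordering every time a new python3 console is opened (string hashing is randomized per interpreter process by default), so the character-to-index mapping built at prediction time no longer matches the one the model was trained with.
Running the code in Python 2 resolved my issue!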
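An alternative that stays on Python 3 is to make the mapping independent of set ordering, and to save it alongside the model instead of rebuilding it at prediction time. The sketch below is one way to do that; build_vocab and the sample domain names are hypothetical, and JSON is just one possible storage format.

```python
import json

def build_vocab(domains):
    """Hypothetical helper: map each character to a stable index (0 is left for padding)."""
    chars = sorted(set(''.join(domains)))   # sorting removes the dependence on set order
    return {ch: idx + 1 for idx, ch in enumerate(chars)}

train_domains = ["google", "example", "xkqzv"]   # made-up sample data
valid_chars = build_vocab(train_domains)

# Persist the mapping at training time and load it again before predicting,
# rather than rebuilding it from whatever subset is being scored.
serialized = json.dumps(valid_chars)
restored = json.loads(serialized)

encoded = [restored[ch] for ch in "google"]
print(encoded)
```

Rebuilding the dictionary from a 10% sample, as in the prediction script, could also miss characters or shift indices even with a stable ordering, which is another reason to reuse the saved mapping.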