I built a model for text summarization. I created a small document (a text file) together with its summary and trained the model on it. I then created a second document of the same type for testing; the training and test documents have the same structure but contain different data.
For example, the training document contains:
name : train
family name : train
The test document:
name : test
family name : test
I was hoping that after training, the model would remember the structure of the important sentences. After testing, I got an accuracy of 100%.
The problem is that when I train the model on another document, the previous test gives a lower accuracy, as if the model had forgotten the earlier training.
Here is my model:
model = Sequential()
model.add(Embedding(200,64, input_length=max_sent_length))
model.add(Conv1D(filters=64, kernel_size=3, padding='same', activation='relu'))
model.add(MaxPooling1D(pool_size=2))
model.add(LSTM(100, dropout=0.2, recurrent_dropout=0.2))
model.add(Dense(1, activation='sigmoid'))
model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])
for i in range(len(xtrains)):
    model.fit(xtrains[i], ytrains[i], epochs=200, batch_size=64, shuffle=False)
I've searched for this, and the answers I found say that refitting the model doesn't reset the weights. So why do I get lower accuracy on the previous tests whenever I train the model on new documents, when at the beginning I got an accuracy of 100%?
How can I solve this problem?
I have been trying to save the weights of my neural network model so that I could use a few of its layers for another neural network model to be trained on another dataset.
pre-trained model:
model = Sequential()
model.add(tf.keras.layers.Dense(100, input_shape=(X_train_orig_sm.shape)))
model.add(tf.keras.layers.Activation('relu'))
model.add(tf.keras.layers.Dropout(0.2))
model.add(tf.keras.layers.Dense(10))
model.add(tf.keras.layers.Activation('relu'))
model.add(tf.keras.layers.Dropout(0.2))
model.add(tf.keras.layers.Dense(1))
model.add(tf.keras.layers.Activation('sigmoid'))
model.summary()
# need sparse otherwise shape is wrong. check why
model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])
print('Fitting the data to the model')
batch_size = 20
epochs = 10
history = model.fit(X_train_orig_sm, Y_train_orig_sm, batch_size=batch_size, epochs=epochs, verbose=1, validation_split=0.2)
print('Evaluating the test data on the model')
How I saved the weights of neural network:
model.save_weights("dnn_model.h5")
How I try to load the weights of neural network:
dnn_model=model.load_weights("dnn_model.h5")
dnn_model.layers[5]
While trying to load the model, I get the following error:
AttributeError: 'NoneType' object has no attribute 'layers'
I don't understand why the layers of the neural network are not recognised, even though the network was trained before its weights were saved. Any advice, solution or direction will be highly appreciated. Thank you.
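When you call model.save_weights("dnn_model.h5"), you only save the weights of the model, not its structure. Correspondingly, model.load_weights("dnn_model.h5") restores those weights into the existing model in place; for an HDF5 file it returns None rather than a new model, so dnn_model ends up as None and dnn_model.layers raises the AttributeError.
To save the whole model (architecture and weights), you can call model.save() as below.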
# save
model.save('dnn_model') # save as pb
model.save('dnn_model.h5') # save as HDF5
# load
dnn_model = tf.keras.models.load_model('dnn_model') # load as pb
dnn_model = tf.keras.models.load_model('dnn_model.h5') # load as HDF5
Note: You do not need to add an extension to the name to save as pb.
Source: https://www.tensorflow.org/tutorials/keras/save_and_load
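If you only want to keep the weights, the point to remember is that load_weights fills an existing model in place, so you rebuild the architecture first and then keep using that object. A minimal sketch, assuming TensorFlow 2.x is installed; build_model, the layer sizes and the temporary path are made up for illustration, and the `.weights.h5` suffix is used because newer Keras releases require it for weight files:

```python
import os
import tempfile

import numpy as np
import tensorflow as tf


def build_model():
    # Hypothetical small architecture; it must match on the save and load side.
    return tf.keras.Sequential([
        tf.keras.Input(shape=(4,)),
        tf.keras.layers.Dense(8, activation="relu"),
        tf.keras.layers.Dense(1, activation="sigmoid"),
    ])


model = build_model()
weights_path = os.path.join(tempfile.mkdtemp(), "dnn_model.weights.h5")
model.save_weights(weights_path)      # stores weights only, no architecture

restored = build_model()              # recreate the same architecture
restored.load_weights(weights_path)   # loads in place; do not assign the result

# The restored model now carries the saved weights and exposes its layers.
weights_match = all(
    np.allclose(a, b)
    for a, b in zip(model.get_weights(), restored.get_weights())
)
print(weights_match, restored.layers[-1].name)
```

After this, restored.layers can be indexed to reuse individual layers in another model.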
I am writing a program for classifying images into two categories: "Wires" and "non-Wires". I have hand-labeled around 5000 microscope images, examples:
non-wire
wire
The neural network I am using is adapted from the chapter on convolutional networks in "Deep Learning with Python" (I don't think convolutional networks are necessary here because there are no obvious hierarchies; Dense networks should be more suitable):
model = models.Sequential()
model.add(layers.Dense(32, activation='relu',input_shape=(200,200,3)))
model.add(layers.MaxPooling2D((2,2)))
model.add(layers.Dense(32, activation='relu'))
model.add(layers.MaxPooling2D((2,2)))
model.add(layers.Flatten())
model.add(layers.Dense(256, activation='relu'))
model.add(layers.Dense(2, activation='softmax'))
However, test accuracy after 10 epochs of training does not go over 92%, however I adjust the parameters of the network. The training images are about 1/3 wires and 2/3 non-wires. My question: Do you see any obvious mistakes in this neural network design that inhibit accuracy, or do you think I am limited by the image quality? I have about 4000 train and 1000 test images.
You might get some improvement by trying to handle the class imbalance using a weights dictionary. If the label of non wire is 0 and the label for wire is 1 then the weight dictionary would be
weight_dict= { 0:.5, 1:1}
in model.fit set
class_weight=weight_dict .
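If you would rather derive the weights from the data than type them in, one option is inverse class frequency, scaled so that the rarest class gets weight 1. A small sketch; the helper name and the label counts are hypothetical and only mirror the 1/3 to 2/3 split described in the question:

```python
from collections import Counter


def inverse_frequency_weights(labels):
    """Return {class: weight}, with the rarest class at 1.0 and the others scaled down."""
    counts = Counter(labels)
    rarest = min(counts.values())
    return {cls: rarest / n for cls, n in sorted(counts.items())}


# Made-up labels: 0 = non-wire (2/3 of the images), 1 = wire (1/3 of the images)
labels = [0] * 2000 + [1] * 1000
weight_dict = inverse_frequency_weights(labels)
print(weight_dict)  # {0: 0.5, 1: 1.0}
```

The resulting dictionary has the same form as the hand-written one and can be passed as class_weight in the same way.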
Without seeing the results of training (training loss and validation loss), I can't tell what else to do. If you are overfitting, try adding some dropout layers. I also recommend trying an adjustable learning rate using the Keras callback ReduceLROnPlateau, and early stopping using the Keras callback EarlyStopping. Documentation is here. Set each callback to monitor validation loss. My suggested code is shown below:
reduce_lr = tf.keras.callbacks.ReduceLROnPlateau(
    monitor="val_loss", factor=0.5, patience=2, verbose=1)
e_stop = tf.keras.callbacks.EarlyStopping(
    monitor="val_loss", patience=5, verbose=0, restore_best_weights=True)
callbacks = [reduce_lr, e_stop]
In model.fit include
callbacks=callbacks
If you want to give a convolutional network a try, I recommend transfer learning using the MobileNet model. Documentation for that is here. My recommended code for that is below:
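import tensorflow as tf
from tensorflow.keras.layers import BatchNormalization, Dense, Dropout
from tensorflow.keras.models import Model
from tensorflow.keras.optimizers import Adamax

# pretrained MobileNet without its classifier head; pooling='max' yields a flat feature vector
base_model = tf.keras.applications.mobilenet.MobileNet(
    include_top=False, input_shape=(200, 200, 3), pooling='max',
    weights='imagenet', dropout=.4)
x = base_model.output
x = BatchNormalization(axis=-1, momentum=0.99, epsilon=0.001)(x)
x = Dense(1024, activation='relu')(x)
x = Dropout(rate=.3, seed=123)(x)
output = Dense(2, activation='softmax')(x)
model = Model(inputs=base_model.input, outputs=output)
# older Keras versions spell this argument lr instead of learning_rate
model.compile(Adamax(learning_rate=.001), loss='categorical_crossentropy',
              metrics=['accuracy'])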
In model.fit include the callbacks as shown above.
I am trying to build a regression model that predicts the 'Ratings' for movies using the dataset https://www.kaggle.com/shubhammehta21/movie-lens-small-latest-dataset. However, after training the model, it predicts the same value for all test samples. I have read similar earlier questions that suggested adjusting the learning rate and the number of features, and checking that the model used for prediction is the same as the trained model. None of these has worked for me.
I load the data and process it:
links= pd.read_csv('../input/movie-lens-small-latest-dataset/links.csv')
movies=pd.read_csv('../input/movie-lens-small-latest-dataset/movies.csv')
...
dataset=movies.merge(ratings,on='movieId').merge(tags,on='movieId').merge(links,on='movieId')
to_drop=['title','genres','timestamp_x','timestamp_y','userId_y','imdbId','tmdbId']
dataset.drop(columns=to_drop,inplace=True)
dataset=pd.get_dummies(dataset)
The code below shows how I build the regression model. I have tried adjusting the number of neurons and layers; however, that has not influenced the output.
from keras.models import Sequential
from keras.layers.core import Dense, Activation
from keras.optimizers import Adam
model = Sequential()
model.add(Dense(13, input_dim=1586, kernel_initializer='zero', activation='relu'))
model.add(Dense(6, kernel_initializer='normal', activation='relu'))
model.add(Dense(1, kernel_initializer='normal',activation='linear'))
# Compile model
adam = Adam(lr=0.001)
model.compile(loss='mean_squared_error', optimizer=adam,metrics=['mse','mae'])
model.summary()
history = model.fit(train_dataset,train_labels,batch_size=30, epochs=10,verbose=1, validation_split=0.3)
score = model.evaluate(validation_dataset,validation_labels)
print("Test score:", score)
Whenever I try to predict the test dataset:
model.predict(test_dataset)
It predicts the value of
3.97
on all values. I am expecting a range of values between 0 - 5.
You should never (I mean, never) use kernel_initializer='zero' - to be honest, I am surprised that the option even exists in Keras!
Also, kernel_initializer='normal' is not recommended.
As a first step, remove all kernel_initializer arguments, so as to revert to the default and recommended one, kernel_initializer='glorot_uniform'; keep in mind that defaults are there for a reason (usually they work well), and you should change them only if you really have a reason to do so (which I trust you don't have here) and you know what you are doing.
If you still don't get what you would expect, experiment with other parameters (no. of layers/neurons, more epochs etc); you should leave the learning rate (lr) of Adam optimizer as is for starters (it's also one of these default values that seem to work nicely across cases).
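To see why a zero-initialised ReLU layer is so harmful, here is a rough NumPy sketch of one forward and backward pass through a tiny two-layer network. The data, the layer sizes and the variable names are made up, and it assumes the ReLU gradient is taken as 0 at exactly 0; it is only meant to illustrate the mechanism, not to reproduce the model in the question:

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(5, 4))              # 5 made-up samples with 4 features
y = rng.uniform(0.5, 5.0, size=(5, 1))   # made-up ratings

# Hidden layer initialised to zero, as kernel_initializer='zero' would do
W1, b1 = np.zeros((4, 3)), np.zeros(3)
# Output layer with random weights and a zero bias
W2, b2 = rng.normal(size=(3, 1)), np.zeros(1)

# Forward pass
z1 = X @ W1 + b1
h1 = np.maximum(z1, 0.0)                 # ReLU: all zeros, whatever X is
pred = h1 @ W2 + b2                      # so every prediction equals b2

# Backward pass for the mean squared error
grad_pred = 2.0 * (pred - y) / len(X)
grad_b2 = grad_pred.sum(axis=0)
grad_W2 = h1.T @ grad_pred
grad_z1 = (grad_pred @ W2.T) * (z1 > 0)  # ReLU gradient taken as 0 at z = 0
grad_W1 = X.T @ grad_z1

print("predictions:", pred.ravel())
print("nonzero entries in grad_W1:", np.count_nonzero(grad_W1))
print("nonzero entries in grad_W2:", np.count_nonzero(grad_W2))
print("gradient of output bias:", grad_b2)
```

In this sketch only the output bias receives a gradient, so training can do no more than move one constant toward the average label. That would be consistent with a single value such as 3.97 being predicted for every input.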
My model is like
print('Build main model...')
model = Sequential()
model.add(Merge([left, right], mode='sum'))
model.add(Dense(14, activation='softmax'))
model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])
When I use model.evaluate([x_test1, x_test2], y_test), I get an accuracy of 90%, but when I use model.predict_classes([x_test1, x_test2]), I get totally wrong class labels, and the accuracy computed from them is significantly lower. What is the difference between model.evaluate and model.predict_classes? Where am I making the mistake?
Since you ask for loss='binary_crossentropy' and metrics=['accuracy'] in your model compilation, Keras infers that you are interested in the binary accuracy, and this is what it returns in model.evaluate(); in fact, since you have 14 classes, you are actually interested in the categorical accuracy, which is the one you get when you compare the labels from model.predict_classes() with the true classes.
So, you should change the loss function in your model compilation to categorical_crossentropy:
model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
If, for whatever reason, you want to stick with loss='binary_crossentropy' (admittedly a very unusual choice), you should change the model compilation to state explicitly that you want the categorical accuracy, as follows:
from keras.metrics import categorical_accuracy
model.compile(loss='binary_crossentropy', optimizer='adam', metrics=[categorical_accuracy])
In either of these cases, you will find that the accuracies reported by model.evaluate() and model.predict_classes() are the same, as they should be.
For a more detailed explanation and an example using the MNIST data, see my answer here.
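The gap between the two metrics is easy to reproduce by hand. The sketch below uses made-up one-hot labels and made-up predictions for 14 classes (no Keras involved) and computes both accuracies with NumPy, mirroring the usual definitions: binary accuracy thresholds every output at 0.5 and compares element by element, while categorical accuracy compares the argmax per sample:

```python
import numpy as np

n_classes = 14
# Four made-up samples whose true classes are 0, 1, 2 and 3 (one-hot encoded)
y_true = np.eye(n_classes)[[0, 1, 2, 3]]

# A made-up model that always favours class 0; each row sums to 1
y_pred = np.full((4, n_classes), 0.02)
y_pred[:, 0] = 0.74

# Binary accuracy: every one of the 14 outputs counts as its own yes/no decision
binary_acc = np.mean((y_pred > 0.5) == (y_true == 1))

# Categorical accuracy: one decision per sample, the most probable class
categorical_acc = np.mean(y_pred.argmax(axis=1) == y_true.argmax(axis=1))

print(f"binary accuracy:      {binary_acc:.3f}")       # 0.893
print(f"categorical accuracy: {categorical_acc:.3f}")  # 0.250
```

Only one of the four samples is classified correctly, yet the binary accuracy is close to 90% because most of the 14 outputs are correctly "off" for every sample.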
I just started learning about neural networks in Python with Keras, and I have a pretty obvious question whose answer nobody ever seems to mention.
The question is very simple.
What happens after you get the data, build the model and train your network?
Every tutorial goes through these steps thoroughly, but never mentions how to store your trained model or use it afterwards.
For example, I have written this simple code with Keras to train a network on MNIST:
model = Sequential()
model.add(Convolution2D(32, kernel_size=3,data_format="channels_first",
activation='relu', input_shape=(1,28,28)))
model.add(Convolution2D(32, (3 ,3), activation='relu'))
model.add(MaxPooling2D(pool_size=(2,2)))
model.add(Dropout(0.25))
model.add(Flatten())
model.add(Dense(128, activation='relu'))
model.add(Dropout(0.5))
model.add(Dense(10, activation='softmax'))
#compiling
model.compile(loss='categorical_crossentropy',optimizer='adam',metrics=['accuracy'])
#fitting and training
model.fit(X_train, Y_train,batch_size=32, epochs=1, verbose=1)
Now how do I store the final network and reuse it after I close the editor?
For example, suppose I want to build a simple web interface where a user uploads an MNIST picture, which is then run through the pre-trained model to detect the digit.
How can I store the trained model with Python, access it with JS or PHP, run the uploaded picture through it, and return the output to the user?
Thanks, and sorry if my question seems stupid or obvious.
This is an example of how you can save your neural network in Keras as JSON (architecture) and HDF5 (weights):
# serialize model to JSON
model_json = model.to_json()
with open("model.json", "w") as json_file:
    json_file.write(model_json)
# serialize weights to HDF5
model.save_weights("model.h5")
print("Saved model to disk")
and this is how you can load it again:
# load json and create model
from keras.models import model_from_json
with open('model.json', 'r') as json_file:
    loaded_model_json = json_file.read()
loaded_model = model_from_json(loaded_model_json)
# load weights into new model
loaded_model.load_weights("model.h5")
print("Loaded model from disk")
Finally you can evaluate the loaded model on new test data:
# evaluate loaded model on test data
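# compile with the same loss as in training; the MNIST model above uses categorical_crossentropy
loaded_model.compile(loss='categorical_crossentropy', optimizer='rmsprop', metrics=['accuracy'])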
score = loaded_model.evaluate(X, Y, verbose=0)
print("%s: %.2f%%" % (loaded_model.metrics_names[1], score[1]*100))