As an experiment, I am building a Keras model to approximate the determinant of a matrix. However, when I run it the training loss goes down at every epoch while the validation loss goes up! For example:
8s - loss: 7573.9168 - val_loss: 21831.5428
Epoch 21/50
8s - loss: 7345.0197 - val_loss: 23594.8540
Epoch 22/50
13s - loss: 7087.7454 - val_loss: 24718.3967
Epoch 23/50
7s - loss: 6851.8714 - val_loss: 25624.8609
Epoch 24/50
6s - loss: 6637.8168 - val_loss: 26616.7835
Epoch 25/50
7s - loss: 6446.8898 - val_loss: 28856.9654
Epoch 26/50
7s - loss: 6255.7414 - val_loss: 30122.7924
Epoch 27/50
7s - loss: 6054.5280 - val_loss: 32458.5306
Epoch 28/50
Here is the complete code:
import numpy as np
import sys
from scipy.stats import pearsonr
from scipy.linalg import det
from sklearn.model_selection import train_test_split
from tqdm import tqdm
from sklearn.preprocessing import StandardScaler
from sklearn.pipeline import Pipeline
import math
import tensorflow as tf
from keras.models import Sequential
from keras.layers import Dense
from keras.wrappers.scikit_learn import KerasRegressor
from keras import backend as K
def baseline_model():
    # create model
    model = Sequential()
    model.add(Dense(200, input_dim=n**2, kernel_initializer='normal', activation='relu'))
    model.add(Dense(1, input_dim=n**2))
    # model.add(Dense(1, kernel_initializer='normal'))
    # Compile model
    model.compile(loss='mean_squared_error', optimizer='adam')
    return model
n = 15
print("Making the input data using seed 7", file=sys.stderr)
np.random.seed(7)
U = np.random.choice([0, 1], size=(n**2,n))
# U is a random 0/1 matrix; random subsets of its rows form the n x n matrices below
X =[]
Y =[]
# print(U)
for i in tqdm(range(100000)):
    I = np.random.choice(n**2, size=n)
    # Pick out the random rows and sort the rows of the matrix lexicographically.
    A = U[I][np.lexsort(np.rot90(U[I]))]
    X.append(A.ravel())
    Y.append(det(A))
X = np.array(X)
Y = np.array(Y)
print("Data created")
estimators = []
estimators.append(('standardize', StandardScaler()))
estimators.append(('mlp', KerasRegressor(build_fn=baseline_model, epochs=50, batch_size=32, verbose=2)))
pipeline = Pipeline(estimators)
X_train, X_test, y_train, y_test = train_test_split(X, Y,
train_size=0.75, test_size=0.25)
pipeline.fit(X_train, y_train, mlp__validation_split=0.3)
How can I stop it overfitting so badly?
Update 1
I tried adding more layers and L2 regularization, but it makes little or no difference.
from keras import regularizers

def baseline_model():
    # create model
    model = Sequential()
    model.add(Dense(n**2, input_dim=n**2, kernel_initializer='glorot_normal', activation='relu'))
    model.add(Dense(int((n**2)/2.0), kernel_initializer='glorot_normal', activation='relu', kernel_regularizer=regularizers.l2(0.01)))
    model.add(Dense(int((n**2)/2.0), kernel_initializer='glorot_normal', activation='relu', kernel_regularizer=regularizers.l2(0.01)))
    model.add(Dense(int((n**2)/2.0), kernel_initializer='glorot_normal', activation='relu', kernel_regularizer=regularizers.l2(0.01)))
    model.add(Dense(1, kernel_initializer='glorot_normal'))
    # Compile model
    model.compile(loss='mean_squared_error', optimizer='adam')
    return model
I increased the number of epochs to 100 and it finishes with:
19s - loss: 788.9504 - val_loss: 18423.2807
Epoch 97/100
24s - loss: 760.2046 - val_loss: 18305.9273
Epoch 98/100
20s - loss: 806.0941 - val_loss: 18174.8706
Epoch 99/100
24s - loss: 780.0487 - val_loss: 18356.7482
Epoch 100/100
27s - loss: 749.2595 - val_loss: 18331.5859
Is it possible to approximate the determinant of a matrix using keras?
I tested your code and got the same result. But let's go back to a basic understanding of the matrix determinant (DET). The determinant is a sum of n! signed products, so you cannot really approximate it with n*n weights in a few layers of a neural network: that would require a number of weights that does not scale to n = 15, since 15! = 1,307,674,368,000 terms appear in the expansion of the DET.
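To put that growth in perspective, here is a tiny sketch (my addition, not from the original answer) that counts the terms in the Leibniz expansion of an n x n determinant:

import math

# Each term in the Leibniz expansion is a signed product of n entries,
# and there is one term per permutation of the rows: n! in total.
for n in (5, 10, 15):
    print(f"n = {n:2d}: {math.factorial(n):,} terms")

# n =  5: 120 terms
# n = 10: 3,628,800 terms
# n = 15: 1,307,674,368,000 terms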
Related
I have a file with a set of "words" for example:
1a 9( 9j = 2453
3a 4( 6j 0s = 2309
1 7( 8ll = 4934
It looks like random data but it isn't; it has a score for each set of "words". My file consists of about 1 million lines, and there are definitely patterns in it. There are about 3600 unique individual words.
The end column contains a score for that particular arrangement of words.
I have encoded each line to ints, padded them with 0's, and put them in a file called words.txt.
An example line from that file would be:
475,12,2495,2934,105,0,0,0,9384 (last column being the output score)
Now I have this code:
When I run it, its loss/accuracy is very bad; the loss is something like 70000000.
What am I doing wrong?
from numpy import loadtxt
from itertools import islice
from keras.models import Sequential
from keras.layers import Dense
# load the dataset
dataset = loadtxt('words.txt', delimiter=',')
X = dataset[:,0:8]
y = dataset[:,8]
model = Sequential()
model.add(Dense(12, input_dim=8, activation='relu'))
model.add(Dense(8, activation='relu'))
model.add(Dense(1))
model.compile(loss='mse', optimizer='adam', metrics=['accuracy'])
model.fit(X, y, epochs=150, batch_size=10000)
_, accuracy = model.evaluate(X, y)
print('Accuracy: %.2f' % (accuracy*100))
My goal is to predict the score of random combinations of words that I generate.
Log of fit:
:\AI>python main.py
Using TensorFlow backend.
2021-02-14 08:52:48.350476: I tensorflow/core/platform/cpu_feature_guard.cc:142] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2
C:\Users\fordy\AppData\Local\Programs\Python\Python37\lib\site-packages\tensorflow_core\python\framework\indexed_slices.py:424: UserWarning: Converting sparse IndexedSlices to a dense Tensor of unknown shape. This may consume a large amount of memory.
"Converting sparse IndexedSlices to a dense Tensor of unknown shape. "
Epoch 1/150
1047711/1047711 [==============================] - 7s 7us/step - loss: 72945595445.1503 - accuracy: 0.2351
Epoch 2/150
1047711/1047711 [==============================] - 3s 3us/step - loss: 72940365091.2725 - accuracy: 0.0016
Epoch 3/150
1047711/1047711 [==============================] - 3s 3us/step - loss: 72922327250.8712 - accuracy: 0.0016
Epoch 4/150
1047711/1047711 [==============================] - 2s 2us/step - loss: 72883151430.7776 - accuracy: 0.0030
Epoch 5/150
1047711/1047711 [==============================] - 2s 2us/step - loss: 72815216732.1170 - accuracy: 0.0041
Epoch 6/150
1047711/1047711 [==============================] - 2s 2us/step - loss: 72711719248.6156 - accuracy: 0.0012
Epoch 7/150
1047711/1047711 [==============================] - 2s 2us/step - loss: 72566884174.8089 - accuracy: 1.5271e-05
Your model is too small. Try adding an embedding layer and an LSTM:
model = Sequential()
model.add(tf.keras.layers.Embedding(3600, 12, input_length=8)) # <= adjust vocab size
model.add(tf.keras.layers.LSTM(8))
# model.add(tf.keras.layers.Dense(12, input_dim=8, activation='relu'))
# model.add(tf.keras.layers.Dense(8, activation='relu'))
model.add(Dense(1))
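A minimal way to compile and train that sketch (my addition; it assumes the X and y arrays from the question, uses mean squared error as the loss, and tracks MAE, which is more readable than accuracy for a regression target):

# X: integer-encoded, zero-padded word ids (shape [samples, 8]); y: the scores
model.compile(loss='mse', optimizer='adam', metrics=['mae'])
model.fit(X, y, epochs=20, batch_size=1024, validation_split=0.1)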
I am currently working on a project that involves training a regression model, saving it and then loading it to make further predictions using that model. However, I'm having a problem: each time I call model.predict on images, it gives out the same prediction for every image. I am not entirely sure what the problem is; maybe it's in the training stage, or maybe I'm just doing something wrong.
I was following this tutorial
All of the files are in this github repo
Here are some bits from the code:
(This part is training the model and saving it)
model = create_cnn(400, 400, 3, regress=True)
opt = Adam(lr=1e-3, decay=1e-3 / 200)
model.compile(loss="mean_absolute_percentage_error", optimizer=opt)
model.fit(X, Y, epochs=70, batch_size=8)
model.save("D:/statispic2/final-statispic_model.hdf5")
The next code part is from loading the model and making predictions.
model = load_model("D:/statispic2/statispic_model.hdf5") # Loading the model
prediction = model.predict(images_ready_for_prediction) #images ready for prediction include a numpy array
#that is loaded with the images just like I loaded them for the training stage.
print(prediction)
After trying it out this is the output prediction from the model:
[[0.05169942] # I gave it 5 images as parameters
[0.05169942]
[0.05169942]
[0.05169942]
[0.05169942]]
If anything is unclear, or you would like to see some more code, please let me know.
People saying regression and CNNs are two completely different things have clearly missed some basics in their ML course. Yes, they are different things, but they should not be compared as alternatives ;)
A CNN is a type of deep neural network that became famous for its use on images. It is an architecture for solving problems, and it can solve both regression AND classification problems.
Regression refers to the type of output you are predicting, so comparing the two directly makes little sense, to be honest.
I can't comment on the specific people misleading you in this thread, since I need a certain number of reputation points to do so.
However, back to the problem. Do you encounter this problem before or after saving the model? If it happens before, I would try scaling your output values to an easier distribution (see the sketch below). If it happens only after saving, I would look into the versions of your framework and the documentation on how models are saved.
It could also just be that there is no information in the pictures.
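A minimal sketch of the output-scaling idea (my addition; X, Y and images_ready_for_prediction are the names from the question, the scaler itself is an assumption):

from sklearn.preprocessing import MinMaxScaler
import numpy as np

# Squash the regression targets into [0, 1] so the loss is better behaved.
y_scaler = MinMaxScaler()
Y_scaled = y_scaler.fit_transform(np.asarray(Y).reshape(-1, 1))
model.fit(X, Y_scaled, epochs=70, batch_size=8)

# At prediction time, map the outputs back to the original scale.
prediction = y_scaler.inverse_transform(model.predict(images_ready_for_prediction))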
No, no, no! Regression is completely different from CNN. Do a little research and the differences will quickly become apparent. In the meantime, I'll share two code samples with you right here.
Regression:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
#%matplotlib inline
import sklearn
from sklearn.datasets import load_boston
boston = load_boston()
# Now we will load the data into a pandas dataframe and then will print the first few rows of the data using the head() function.
bos = pd.DataFrame(boston.data)
bos.head()
bos.columns = ['CRIM', 'ZN', 'INDUS', 'CHAS', 'NOX', 'RM', 'AGE', 'DIS', 'RAD', 'TAX', 'PTRATIO', 'B', 'LSTAT']
bos.head()
bos['MEDV'] = boston.target
bos.describe()
bos.isnull().sum()
sns.distplot(bos['MEDV'])
plt.show()
sns.pairplot(bos)
corr_mat = bos.corr().round(2)
sns.heatmap(data=corr_mat, annot=True)
sns.lmplot(x = 'RM', y = 'MEDV', data = bos)
X = bos[['CRIM', 'ZN', 'INDUS', 'CHAS', 'NOX', 'RM', 'AGE', 'DIS', 'RAD', 'TAX','PTRATIO', 'B', 'LSTAT']]
y = bos['MEDV']
from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size = 0.2, random_state = 10)
# Training the Model
# We will now train our model using the LinearRegression function from the sklearn library.
from sklearn.linear_model import LinearRegression
lm = LinearRegression()
lm.fit(X_train, y_train)
# Prediction
# We will now make prediction on the test data using the LinearRegression function and plot a scatterplot between the test data and the predicted value.
prediction = lm.predict(X_test)
plt.scatter(y_test, prediction)
df1 = pd.DataFrame({'Actual': y_test, 'Predicted':prediction})
df2 = df1.head(10)
df2
df2.plot(kind = 'bar')
from sklearn import metrics
from sklearn.metrics import r2_score
print('MAE', metrics.mean_absolute_error(y_test, prediction))
print('MSE', metrics.mean_squared_error(y_test, prediction))
print('RMSE', np.sqrt(metrics.mean_squared_error(y_test, prediction)))
print('R squared error', r2_score(y_test, prediction))
Result:
MAE 4.061419182954711
MSE 34.413968453138565
RMSE 5.866341999333023
R squared error 0.6709339839115628
CNN:
# keras imports for the dataset and building our neural network
from keras.datasets import mnist
from keras.models import Sequential
from keras.layers import Dense, Dropout, Conv2D, MaxPool2D, Flatten
from keras.utils import np_utils
# to calculate accuracy
from sklearn.metrics import accuracy_score
# loading the dataset
(X_train, y_train), (X_test, y_test) = mnist.load_data()
# building the input vector from the 28x28 pixels
X_train = X_train.reshape(X_train.shape[0], 28, 28, 1)
X_test = X_test.reshape(X_test.shape[0], 28, 28, 1)
X_train = X_train.astype('float32')
X_test = X_test.astype('float32')
# normalizing the data to help with the training
X_train /= 255
X_test /= 255
# one-hot encoding using keras' numpy-related utilities
n_classes = 10
print("Shape before one-hot encoding: ", y_train.shape)
Y_train = np_utils.to_categorical(y_train, n_classes)
Y_test = np_utils.to_categorical(y_test, n_classes)
print("Shape after one-hot encoding: ", Y_train.shape)
# building a linear stack of layers with the sequential model
model = Sequential()
# convolutional layer
model.add(Conv2D(25, kernel_size=(3,3), strides=(1,1), padding='valid', activation='relu', input_shape=(28,28,1)))
model.add(MaxPool2D(pool_size=(1,1)))
# flatten output of conv
model.add(Flatten())
# hidden layer
model.add(Dense(100, activation='relu'))
# output layer
model.add(Dense(10, activation='softmax'))
# compiling the sequential model
model.compile(loss='categorical_crossentropy', metrics=['accuracy'], optimizer='adam')
# training the model for 10 epochs
model.fit(X_train, Y_train, batch_size=128, epochs=10, validation_data=(X_test, Y_test))
Result:
Train on 60000 samples, validate on 10000 samples
Epoch 1/10
60000/60000 [==============================] - 27s 451us/step - loss: 0.2037 - accuracy: 0.9400 - val_loss: 0.0866 - val_accuracy: 0.9745
Epoch 2/10
60000/60000 [==============================] - 27s 451us/step - loss: 0.0606 - accuracy: 0.9819 - val_loss: 0.0553 - val_accuracy: 0.9812
Epoch 3/10
60000/60000 [==============================] - 27s 445us/step - loss: 0.0352 - accuracy: 0.9892 - val_loss: 0.0533 - val_accuracy: 0.9824
Epoch 4/10
60000/60000 [==============================] - 27s 446us/step - loss: 0.0226 - accuracy: 0.9930 - val_loss: 0.0572 - val_accuracy: 0.9825
Epoch 5/10
60000/60000 [==============================] - 27s 448us/step - loss: 0.0148 - accuracy: 0.9959 - val_loss: 0.0516 - val_accuracy: 0.9834
Epoch 6/10
60000/60000 [==============================] - 27s 443us/step - loss: 0.0088 - accuracy: 0.9976 - val_loss: 0.0574 - val_accuracy: 0.9824
Epoch 7/10
60000/60000 [==============================] - 26s 442us/step - loss: 0.0089 - accuracy: 0.9973 - val_loss: 0.0526 - val_accuracy: 0.9847
Epoch 8/10
60000/60000 [==============================] - 26s 440us/step - loss: 0.0047 - accuracy: 0.9988 - val_loss: 0.0593 - val_accuracy: 0.9838
Epoch 9/10
60000/60000 [==============================] - 28s 469us/step - loss: 0.0056 - accuracy: 0.9986 - val_loss: 0.0559 - val_accuracy: 0.9836
Epoch 10/10
60000/60000 [==============================] - 27s 449us/step - loss: 0.0059 - accuracy: 0.9981 - val_loss: 0.0663 - val_accuracy: 0.9820
A CNN is a deep learning architecture. You use regression models for predicting a number, like the price of a car.
I had the exact same issue after pickle.dump and pickle.load of my model. The problem I was missing was that I was not normalizing the features (vector X) before predicting with the model. I hope this helps you.
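A minimal sketch of that fix (my addition; scaler, X_train, y_train and X_new are hypothetical names, not from the original post):

import pickle
from sklearn.preprocessing import StandardScaler

# Fit the scaler on the training features and train on the normalized data.
scaler = StandardScaler().fit(X_train)
model.fit(scaler.transform(X_train), y_train)

# Persist the scaler alongside the model so prediction code applies the same normalization.
with open('scaler.pkl', 'wb') as f:
    pickle.dump(scaler, f)

# ... later, before predicting on new data:
with open('scaler.pkl', 'rb') as f:
    scaler = pickle.load(f)
predictions = model.predict(scaler.transform(X_new))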
Changing the optimizer from Adam() to RMSprop() with a learning rate of >0.001 worked for me.
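For reference, a minimal sketch of that change (my addition; it assumes a TF2-era Keras where the argument is called learning_rate, and the 0.002 value is only illustrative):

from tensorflow.keras.optimizers import RMSprop

# Swap Adam for RMSprop with a learning rate above the 0.001 default.
model.compile(loss="mean_absolute_percentage_error",
              optimizer=RMSprop(learning_rate=0.002))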
I tried to train my NN on the Breast Cancer Wisconsin dataset
(I added the "id" column as an index and changed the "diagnosis" column to 0 and 1 with sklearn.preprocessing.LabelEncoder), but my NN is not reducing the loss.
I tried other optimizers and losses, but it isn't working.
This is my NN:
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, BatchNormalization, InputLayer
import tensorflow.nn as tfnn
model = Sequential()
model.add(Dense(30, activation = tfnn.relu, input_dim = 30))
model.add(BatchNormalization(axis = 1))
model.add(Dense(60, activation = tfnn.relu))
model.add(BatchNormalization(axis = 1))
model.add(Dense(1, activation = tfnn.softmax))
model.compile(loss = 'binary_crossentropy', optimizer = 'adam', metrics = ['accuracy'])
model.fit(data, target, epochs = 6)
And my output:
Epoch 1/6
569/569 [==============================] - 2s 3ms/sample - loss: 10.0025 - acc: 0.3726
Epoch 2/6
569/569 [==============================] - 0s 172us/sample - loss: 10.0025 - acc: 0.3726
Epoch 3/6
569/569 [==============================] - 0s 176us/sample - loss: 10.0025 - acc: 0.3726
Epoch 4/6
569/569 [==============================] - 0s 167us/sample - loss: 10.0025 - acc: 0.3726
Epoch 5/6
569/569 [==============================] - 0s 163us/sample - loss: 10.0025 - acc: 0.3726
Epoch 6/6
569/569 [==============================] - 0s 169us/sample - loss: 10.0025 - acc: 0.3726
It seems that the NN stops learning after a few iterations (look at the epoch times: the first epoch takes 2s and the others take 0s, and in the first epoch the processing speed is in ms/sample, while in the other epochs it is in us/sample).
Thank you for your time!
Softmax outputs sum to 1.
You can't use softmax with a single unit: its output will always be 1.
Use 'sigmoid' instead.
Also be careful with 'relu'. It may (by bad luck) fall into an "all-zeros" region and stop evolving.
Ideally, the batch normalization should come before it (this way you guarantee that there will always be some positive numbers):
from tensorflow.keras.layers import Activation  # needed for the separate Activation layers below

model = Sequential()
model.add(Dense(30, input_dim=30))
model.add(BatchNormalization(axis=1))
model.add(Activation(tfnn.relu))
model.add(Dense(60))
model.add(BatchNormalization(axis=1))
model.add(Activation(tfnn.relu))
model.add(Dense(1, activation=tfnn.sigmoid))
Since you have a binary classification task with a single-unit final layer, you should not use tfnn.softmax as an activation for this layer. Use tfnn.sigmoid instead, i.e.
model.add(Dense(1, activation = tfnn.sigmoid)) # last layer
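With a single sigmoid unit, the binary_crossentropy loss from the question works as intended; at prediction time you threshold the probabilities (a small illustrative sketch, my addition):

probs = model.predict(data)            # probabilities in [0, 1]
labels = (probs > 0.5).astype(int)     # threshold at 0.5 to get class labels 0/1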
I am learning how to train a Keras neural network on the MNIST dataset. However, when I run this code, I get only 10% accuracy after 10 epochs of training. This means that the neural network is predicting only one class, since there are 10 classes. I am sure it is a bug in data preparation rather than a problem with the network architecture, because I got the architecture from a tutorial (a Medium tutorial). Any idea why the model is not training?
My code:
from skimage import io
import numpy as np
from numpy import array
from PIL import Image
import csv
import random
from keras.preprocessing.image import ImageDataGenerator
import pandas as pd
from keras.utils import multi_gpu_model
import tensorflow as tf
train_datagen = ImageDataGenerator()
train_generator = train_datagen.flow_from_directory(
directory="./trainingSet",
class_mode="categorical",
target_size=(50, 50),
color_mode="rgb",
batch_size=1,
shuffle=True,
seed=42
)
print(str(train_generator.class_indices) + " class indices")
from keras.models import Sequential
from keras.layers.core import Dense, Dropout, Activation, Flatten
from keras.layers.convolutional import Conv2D
from keras.layers.pooling import MaxPooling2D, GlobalAveragePooling2D
from keras.optimizers import SGD
from keras import backend as K
from keras.layers import Input
from keras.models import Model
import keras
from keras.layers.normalization import BatchNormalization
K.clear_session()
K.set_image_dim_ordering('tf')
reg = keras.regularizers.l1_l2(1e-5, 0.0)
def conv_layer(channels, kernel_size, input):
    output = Conv2D(channels, kernel_size, padding='same', kernel_regularizer=reg)(input)
    output = BatchNormalization()(output)
    output = Activation('relu')(output)
    output = Dropout(0)(output)
    return output
model = Sequential()
model.add(Conv2D(28, kernel_size=(3,3), input_shape=(50, 50, 3)))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Flatten()) # Flattening the 2D arrays for fully connected layers
model.add(Dense(128, activation=tf.nn.relu))
model.add(Dropout(0.2))
model.add(Dense(10, activation=tf.nn.softmax))
from keras.optimizers import Adam
import tensorflow as tf
model.compile(loss='categorical_crossentropy',
optimizer='adam',
metrics=['accuracy'])
from keras.callbacks import ModelCheckpoint
epochs = 10
checkpoint = ModelCheckpoint('mnist.h5', save_best_only=True)
STEP_SIZE_TRAIN=train_generator.n/train_generator.batch_size
model.fit_generator(generator=train_generator,
steps_per_epoch=STEP_SIZE_TRAIN,
epochs=epochs,
callbacks=[checkpoint]
)
The output I am getting is as follows:
Using TensorFlow backend.
Found 42000 images belonging to 10 classes.
{'0': 0, '1': 1, '2': 2, '3': 3, '4': 4, '5': 5, '6': 6, '7': 7, '8': 8, '9': 9} class indices
Epoch 1/10
42000/42000 [==============================] - 174s 4ms/step - loss: 14.4503 - acc: 0.1035
/home/ec2-user/anaconda3/envs/tensorflow_p36/lib/python3.6/site-packages/keras/callbacks.py:434: RuntimeWarning: Can save best model only with val_loss available, skipping.
'skipping.' % (self.monitor), RuntimeWarning)
Epoch 2/10
42000/42000 [==============================] - 169s 4ms/step - loss: 14.4487 - acc: 0.1036
Epoch 3/10
42000/42000 [==============================] - 169s 4ms/step - loss: 14.4483 - acc: 0.1036
Epoch 4/10
42000/42000 [==============================] - 168s 4ms/step - loss: 14.4483 - acc: 0.1036
Epoch 5/10
42000/42000 [==============================] - 169s 4ms/step - loss: 14.4483 - acc: 0.1036
Epoch 6/10
42000/42000 [==============================] - 168s 4ms/step - loss: 14.4483 - acc: 0.1036
Epoch 7/10
42000/42000 [==============================] - 168s 4ms/step - loss: 14.4483 - acc: 0.1036
Epoch 8/10
42000/42000 [==============================] - 168s 4ms/step - loss: 14.4483 - acc: 0.1036
Epoch 9/10
42000/42000 [==============================] - 168s 4ms/step - loss: 14.4480 - acc: 0.1036
Epoch 10/10
5444/42000 [==>...........................] - ETA: 2:26 - loss: 14.3979 - acc: 0.1067
The trainingSet directory contains a folder for each digit (0-9) with the images inside the folders. I am training on an AWS EC2 p3.2xlarge instance with the Amazon Deep Learning Linux AMI.
Here is a list of some odd points that I see (a corrected generator setup is sketched after this list):
You are not rescaling your images -> ImageDataGenerator(rescale=1/255)
You use a batch size of 1 (you may want to increase that)
MNIST is grayscale pictures, therefore color_mode should be "grayscale".
(Also, you have several unused parts in your code that you may want to delete from the question.)
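A minimal sketch of the generator with those three points applied (my addition; the directory and target size are kept from the question, the batch size of 64 is only illustrative):

train_datagen = ImageDataGenerator(rescale=1/255)  # rescale pixel values to [0, 1]
train_generator = train_datagen.flow_from_directory(
    directory="./trainingSet",
    class_mode="categorical",
    target_size=(50, 50),      # or (28, 28) to match the native MNIST size
    color_mode="grayscale",    # MNIST digits are grayscale
    batch_size=64,             # larger batches give more stable gradient estimates
    shuffle=True,
    seed=42
)
# Note: with grayscale input, the model's input_shape should become (50, 50, 1).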
Adding two more points to #abcdaire's answer:
MNIST has an image size of (28, 28); you have assigned it wrongly.
Binarization is another method that can be used. It also makes the network learn faster. It can be done like this:
imges_dataset = imges_dataset/255.0
imges_dataset = np.where(imges_dataset > 0.5, 1, 0)
Hi, I am getting weird results from the following code for the problem posted here (https://www.kaggle.com/c/titanic):
from keras.models import Sequential
from keras.layers.core import Dense, Activation, Dropout
from keras.layers.advanced_activations import PReLU, LeakyReLU
from keras.layers.recurrent import SimpleRNN, SimpleDeepRNN
from keras.layers.embeddings import Embedding
from keras.layers.recurrent import LSTM, GRU
import pandas as pd
import numpy as np
from sklearn import preprocessing
np.random.seed(1919)
### Constants ###
data_folder = "/home/saj1919/Public/Data_Science_Mining_Study/submissions/titanic/data/"
out_folder = "/home/saj1919/Public/Data_Science_Mining_Study/submissions/titanic/output/"
batch_size = 4
nb_epoch = 10
### load train and test ###
train = pd.read_csv(data_folder+'train.csv', index_col=0)
test = pd.read_csv(data_folder+'test.csv', index_col=0)
print "Data Read complete"
Y = train.Survived
train.drop('Survived', axis=1, inplace=True)
columns = train.columns
test_ind = test.index
train['Age'] = train['Age'].fillna(train['Age'].mean())
test['Age'] = test['Age'].fillna(test['Age'].mean())
train['Fare'] = train['Fare'].fillna(train['Fare'].mean())
test['Fare'] = test['Fare'].fillna(test['Fare'].mean())
category_index = [0,1,2,4,5,6,8,9]
for i in category_index:
    print str(i)+" : "+columns[i]
    train[columns[i]] = train[columns[i]].fillna('missing')
    test[columns[i]] = test[columns[i]].fillna('missing')
train = np.array(train)
test = np.array(test)
### label encode the categorical variables ###
for i in category_index:
    print str(i)+" : "+str(columns[i])
    lbl = preprocessing.LabelEncoder()
    lbl.fit(list(train[:,i]) + list(test[:,i]))
    train[:,i] = lbl.transform(train[:,i])
    test[:,i] = lbl.transform(test[:,i])
### making data as numpy float ###
train = train.astype(np.float32)
test = test.astype(np.float32)
#Y = np.array(Y).astype(np.int32)
model = Sequential()
model.add(Dense(len(columns), 512))
model.add(Activation('relu'))
model.add(Dropout(0.5))
model.add(Dense(512, 1))
model.add(Activation('softmax'))
model.compile(loss='categorical_crossentropy', optimizer="adam")
model.fit(train, Y, nb_epoch=nb_epoch, batch_size=batch_size, validation_split=0.20)
preds = model.predict(test,batch_size=batch_size)
pred_arr = []
for pred in preds:
    pred_arr.append(pred[0])
### Output Results ###
preds = pd.DataFrame({"PassengerId": test_ind, "Survived": pred_arr})
preds = preds.set_index('PassengerId')
preds.to_csv(out_folder+'test.csv')
I am getting following results :
Train on 712 samples, validate on 179 samples
Epoch 0
712/712 [==============================] - 0s - loss: -0.0000 - val_loss: -0.0000
Epoch 1
712/712 [==============================] - 0s - loss: -0.0000 - val_loss: -0.0000
Epoch 2
712/712 [==============================] - 0s - loss: -0.0000 - val_loss: -0.0000
Epoch 3
712/712 [==============================] - 0s - loss: -0.0000 - val_loss: -0.0000
Epoch 4
712/712 [==============================] - 0s - loss: -0.0000 - val_loss: -0.0000
Epoch 5
712/712 [==============================] - 0s - loss: -0.0000 - val_loss: -0.0000
Epoch 6
712/712 [==============================] - 0s - loss: -0.0000 - val_loss: -0.0000
Epoch 7
712/712 [==============================] - 0s - loss: -0.0000 - val_loss: -0.0000
Epoch 8
712/712 [==============================] - 0s - loss: -0.0000 - val_loss: -0.0000
Epoch 9
712/712 [==============================] - 0s - loss: -0.0000 - val_loss: -0.0000
I am trying to create a simple 3-layer network; it's totally basic code.
I have tried this kind of classification problem before using Keras on Kaggle, but this time I am getting this result.
Is it overfitting due to having so little data?
What am I missing? Can someone help?
Old post, but answering anyway in case someone else attempts Titanic with Keras.
Your network may have too many parameters and too little regularization (e.g. dropout).
Call model.summary() right before model.compile and it will show you how many parameters your network has. Just between your two Dense layers you have 512 x 512 = 262,144 parameters. That's a lot for 712 training examples.
Also, you may want to use a sigmoid activation on the last layer and the binary_crossentropy loss, as you only have two output classes. A corrected sketch follows.
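A minimal corrected sketch in the newer Keras API (my addition, not the original answer's code; the layer sizes are illustrative, and train, Y and columns are the variables from the question):

from keras.models import Sequential
from keras.layers import Dense, Dropout

model = Sequential()
model.add(Dense(32, input_dim=len(columns), activation='relu'))  # far fewer units than 512
model.add(Dropout(0.5))
model.add(Dense(1, activation='sigmoid'))                        # sigmoid for a binary target
model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])
model.fit(train, Y, epochs=10, batch_size=4, validation_split=0.20)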