How to fit NLP into a CNN model? - python

I am doing research on using a CNN model for NLP (multi-label text classification).
I read some papers that reported good results applying CNNs to multi-label classification, and I am trying to reproduce such a model in Python.
I have read many articles about combining NLP and neural networks.
I have this code that is not working and keeps giving me errors (every time I fix one error, I get another).
I ended up hiring paid freelancers to fix the code; I hired five of them, but none was able to get it working!
You are my last hope. I hope someone can help me fix this code and get it running.
First, this is my dataset (a 100-record sample, just to verify that the code works; I know it is not enough for good accuracy, and I will tweak and enhance the model later):
http://shrinx.it/data100.zip
For now I just want this code to run, but tips on how to improve accuracy are very welcome.
Some of the errors I got
InvalidArgumentError: indices[1] = [0,13] is out of order. Many sparse ops require sorted indices.
Use `tf.sparse.reorder` to create a correctly ordered copy.
and
ValueError: Input 0 of layer sequential_8 is incompatible with the layer: expected ndim=3, found ndim=2. Full shape received: [None, 18644]
Here is my code:
import pandas as pd
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import MultiLabelBinarizer
from keras.layers import *
from keras.callbacks import ModelCheckpoint
from keras.optimizers import Adam
from keras.models import *
# Load Dataset
df_text = pd.read_csv("J:\\__DataSets\\__Samples\\Test\\data100\\text100.csv")
df_results = pd.read_csv("J:\\__DataSets\\__Samples\\Test\\data100\\results100.csv")
df = pd.merge(df_text,df_results, on="ID")
#Prepare multi-label
Labels = []
for i in df['Code']:
    Labels.append(i.split(","))
df['Labels'] = Labels
multilabel_binarizer = MultiLabelBinarizer()
multilabel_binarizer.fit(df['Labels'])
y = multilabel_binarizer.transform(df['Labels'])
X = df['Text'].values
#TF-IDF
tfidf_vectorizer = TfidfVectorizer(max_df=0.8, max_features=1000)
xtrain, xval, ytrain, yval = train_test_split(X, y, test_size=0.2, random_state=9)
tfidf_vectorizer = TfidfVectorizer(max_df=0.8, max_features=1000)
# create TF-IDF features
X_train_count = tfidf_vectorizer.fit_transform(xtrain)
X_test_count = tfidf_vectorizer.transform(xval)
#Prepare Model
input_dim = X_train_count.shape[1] # Number of features
output_dim=len(df['Labels'].explode().unique())
sequence_length = input_dim
vocabulary_size = X_train_count.shape[0]
embedding_dim = output_dim
filter_sizes = [3,4,5]
num_filters = 512
drop = 0.5
epochs = 100
batch_size = 30
#CNN Model
inputs = Input(shape=(sequence_length,), dtype='int32')
embedding = Embedding(input_dim=vocabulary_size, output_dim=embedding_dim, input_length=sequence_length)(inputs)
reshape = Reshape((sequence_length,embedding_dim,1))(embedding)
conv_0 = Conv2D(num_filters, kernel_size=(filter_sizes[0], embedding_dim), padding='valid', kernel_initializer='normal', activation='relu')(reshape)
conv_1 = Conv2D(num_filters, kernel_size=(filter_sizes[1], embedding_dim), padding='valid', kernel_initializer='normal', activation='relu')(reshape)
conv_2 = Conv2D(num_filters, kernel_size=(filter_sizes[2], embedding_dim), padding='valid', kernel_initializer='normal', activation='relu')(reshape)
maxpool_0 = MaxPool2D(pool_size=(sequence_length - filter_sizes[0] + 1, 1), strides=(1,1), padding='valid')(conv_0)
maxpool_1 = MaxPool2D(pool_size=(sequence_length - filter_sizes[1] + 1, 1), strides=(1,1), padding='valid')(conv_1)
maxpool_2 = MaxPool2D(pool_size=(sequence_length - filter_sizes[2] + 1, 1), strides=(1,1), padding='valid')(conv_2)
concatenated_tensor = Concatenate(axis=1)([maxpool_0, maxpool_1, maxpool_2])
flatten = Flatten()(concatenated_tensor)
dropout = Dropout(drop)(flatten)
output = Dense(units=2, activation='softmax')(dropout)
# this creates a model that includes
model = Model(inputs=inputs, outputs=output)
#Compile
checkpoint = ModelCheckpoint('weights.{epoch:03d}-{val_acc:.4f}.hdf5', monitor='val_acc', verbose=1, save_best_only=True, mode='auto')
adam = Adam(lr=1e-4, beta_1=0.9, beta_2=0.999, epsilon=1e-08, decay=0.0)
model.compile(optimizer=adam, loss='binary_crossentropy', metrics=['accuracy'])
print("Traning Model...")
model.summary()
#Fit
model.fit(X_train_count, ytrain, batch_size=batch_size, epochs=epochs, verbose=1, callbacks=[checkpoint], validation_data=(X_test_count, yval)) # starts training
#Accuracy
loss, accuracy = model.evaluate(X_train_count, ytrain, verbose=False)
print("Training Accuracy: {:.4f}".format(accuracy))
loss, accuracy = model.evaluate(X_test_count, yval, verbose=False)
print("Testing Accuracy: {:.4f}".format(accuracy))
A sample of my dataset:
text100.csv
ID Text
1 Allergies to Drugs Attending:[**First Name3 (LF) 1**] Chief Complaint: headache and neck stiffne
2 Complaint: fever, chills, rigors Major Surgical or Invasive Procedure: Arterial l
3 Complaint: Febrile, unresponsive--> GBS meningitis and bacteremia Major Surgi
4 Allergies to Drugs Attending:[**First Name3 (LF) 45**] Chief Complaint: PEA arrest . Major Sur
5 Admitted to an outside hospital with chest pain and ruled in for myocardial infarction. She was tr
6 Known Allergies to Drugs Attending:[**First Name3 (LF) 78**] Chief Complaint: Progressive lethargy
7 Complaint: hypernatremia, unresponsiveness Major Surgical or Invasive Procedure: PEG/tra
8 Chief Complaint: cough, SOB Major Surgical or Invasive Procedure: RIJ placed Hemod
Results100.csv
ID Code
1 A32,D50,G00,I50,I82,K51,M85,R09,R18,T82,Z51
2 418,475,905,921,A41,C50,D70,E86,F32,F41,J18,R11,R50,Z00,Z51,Z93,Z95
3 136,304,320,418,475,921,998,A40,B37,G00,G35,I10,J15,J38,J69,L27,L89,T81,T85
4 D64,D69,E87,I10,I44,N17
5 E11,I10,I21,I25,I47
6 905,C61,C91,E87,G91,I60,M47,M79,R50,S43
7 304,320,355,E11,E86,E87,F06,I10,I50,I63,I69,J15,J69,L89,L97,M81,N17,Z91

I don't have anything concrete to add at the moment, but I have found the following two debugging strategies useful:
Distill your bugs into different sections. For example, which errors are related to compiling the model and which to training? There could also be errors before the model is even built. For the errors you showed, where were they first raised? It is hard to tell without line numbers.
This step is useful because later errors are sometimes a manifestation of earlier ones, so 50 errors might really be just one or two at an early stage.
For a good library, the error messages are typically helpful. Have you tried what the error messages suggest, and how did that go?
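For instance, the first error message suggests calling tf.sparse.reorder. A minimal sketch of what acting on that suggestion could look like, assuming X_train_count is the scipy sparse matrix produced by the TfidfVectorizer in the question (only the variable names are taken from the question; this is illustrative, not a verified fix for the whole script):
import numpy as np
import tensorflow as tf

# Convert the scipy sparse matrix into a TensorFlow SparseTensor and reorder
# its indices, as the InvalidArgumentError recommends.
coo = X_train_count.tocoo()
X_train_sparse = tf.sparse.reorder(
    tf.sparse.SparseTensor(
        indices=np.column_stack((coo.row, coo.col)).astype("int64"),
        values=coo.data.astype("float32"),
        dense_shape=coo.shape,
    )
)

# Alternatively, simply densify the matrix; for a 100-record sample this is cheap.
X_train_dense = X_train_count.toarray()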

Related

Training 1660 NNs in a loop; on each iteration the model's training time increases slightly, making it infeasible

I am currently using the following code to set one column to zero and then retrain the model for all 10 NNs in NN1_List. However, as the loop progresses, the training time of each neural network slowly increases (very slowly, but it adds up when training 1660 NNs). I checked a variety of websites and implemented all the solutions I could find, such as tf.keras.backend.clear_session(), tf.compat.v1.reset_default_graph(), del model, and gc.collect().
r2_list = list()
for i in tf.range(0, len(training_x.columns), 1):
    column = training_x.columns[i]
    df = training_x.copy()
    df[column].values[:] = 0
    prediction_list = list()
    for j in tf.range(0, len(NN1_List), 1):
        np.random.seed(int(seed_list[j]))
        random.seed(int(seed_list[j]))
        tf.random.set_seed(int(seed_list[j]))
        model = keras.Sequential()
        model.add(keras.layers.Dense(
            units=64,
            kernel_regularizer=keras.regularizers.L1(l1=0.00001),
            input_shape=(training_x.shape[1],),
            activation='relu')
        )
        model.add(keras.layers.Dense(units=1))
        ## Compile Model.
        opt = keras.optimizers.Adam(learning_rate=0.01)
        model.compile(optimizer=opt, loss='mean_squared_error')
        ## Fit Model.
        callback = keras.callbacks.EarlyStopping(monitor='val_loss', mode='min', verbose=1, patience=5, restore_best_weights=True)
        model.fit(x=df,
                  y=training_y,
                  validation_data=(validation_x, validation_y),
                  batch_size=10000,
                  epochs=100,
                  callbacks=[callback])
        prediction_testing = model.predict(testing_x)
        del model
        tf.keras.backend.clear_session()
        tf.compat.v1.reset_default_graph()
        gc.collect()
        prediction_list.append(prediction_testing)
    prediction_array = np.mean(prediction_list, axis=0).ravel()
    r2 = kelly_gu_r_squared(testing_y, prediction_array)
    r2_list.append(r2)
I was wondering if you guys could point me in the right direction to fix this problem.
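One pattern that is often suggested for loops like this is to move model construction, fitting, and prediction into a plain Python function, call tf.keras.backend.clear_session() before building each model, and drive the loops with Python's range rather than tf.range so the loop itself never creates TensorFlow ops. A minimal sketch under those assumptions (variable names such as training_y, validation_x, and testing_x are taken from the question; this is a suggestion to try, not a guaranteed fix):
import gc
import numpy as np
import tensorflow as tf
from tensorflow import keras

def train_and_predict(df, training_y, validation_x, validation_y, testing_x, seed):
    # Clearing the session before building keeps Keras' global state from
    # growing across iterations.
    tf.keras.backend.clear_session()
    np.random.seed(seed)
    tf.random.set_seed(seed)

    model = keras.Sequential([
        keras.layers.Dense(64, activation='relu',
                           kernel_regularizer=keras.regularizers.L1(l1=0.00001),
                           input_shape=(df.shape[1],)),
        keras.layers.Dense(1),
    ])
    model.compile(optimizer=keras.optimizers.Adam(learning_rate=0.01),
                  loss='mean_squared_error')
    callback = keras.callbacks.EarlyStopping(monitor='val_loss', mode='min',
                                             patience=5, restore_best_weights=True)
    model.fit(df, training_y, validation_data=(validation_x, validation_y),
              batch_size=10000, epochs=100, callbacks=[callback], verbose=0)
    prediction = model.predict(testing_x)

    del model
    gc.collect()
    return prediction
The outer loops would then use for i in range(len(training_x.columns)) and for j in range(len(NN1_List)), passing int(seed_list[j]) as the seed.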

Python Bayesian Optimization - inhomogeneous shape after 1 dimensions

I'm attempting to perform Bayesian optimization on deep learning models to speed up hyperparameter tuning compared with grid search. I found code at [https://www.analyticsvidhya.com/blog/2021/05/tuning-the-hyperparameters-and-layers-of-neural-network-deep-learning/] which illustrates a working example, but I cannot seem to apply it to my data. My data contains 33 features; the hyperparameters I am trying to optimize are the number of neurons, the activation function, learning_rate (for the Adam optimizer), decay (for the Adam optimizer), batch_size, and the number of epochs. The error shown in my code is: ValueError: setting an array element with a sequence. The requested array has an inhomogeneous shape after 1 dimensions. The detected shape was (7,) + inhomogeneous part. Looking at Stack Overflow answers to similar problems, e.g. [https://stackoverflow.com/questions/67183501/setting-an-array-element-with-a-sequence-requested-array-has-an-inhomogeneous-sh], it appears there may be a problem with input shapes, or with values as floats? I am still unsure after reading those pages, which is why I have posted the question here.
Below is the code and the error.
from keras.layers import LeakyReLU
LeakyReLU = LeakyReLU(alpha=0.1)
from bayes_opt import BayesianOptimization
from sklearn.model_selection import StratifiedKFold

def nn_cl_bo(neurons, activation, optimizer, learning_rate, batch_size, epochs):
    optimizerL = ['Adam']
    optimizerD = {'Adam': tf.keras.optimizers.Adam(lr=learning_rate)}
    activationL = ['relu', 'sigmoid', 'softplus', 'softsign', 'tanh', 'selu',
                   'elu', 'exponential', LeakyReLU, 'relu']
    neurons = round(neurons)
    activation = activationL[round(activation)]
    batch_size = round(batch_size)
    epochs = round(epochs)
    def nn_cl_fun():
        opt = tf.keras.optimizers.Adam(lr=learning_rate)
        nn = Sequential()
        nn.add(LSTM(units=neurons, input_shape=(X_train.shape[1], X_train.shape[2]), activation=activation))
        nn.add(Dense(1, activation='sigmoid'))
        nn.compile(loss='mae', optimizer=opt, metrics=['mae'])
        return nn
    es = EarlyStopping(monitor='mae', mode='max', verbose=0, patience=20)
    nn = KerasClassifier(build_fn=nn_cl_fun, epochs=epochs, batch_size=batch_size, verbose=0)
    kfold = StratifiedKFold(n_splits=5, shuffle=True, random_state=123)
    score = cross_val_score(nn, X_train, y_train, scoring=score_acc, cv=kfold, fit_params={'callbacks': [es]}).mean()
    return score

params_nn = {
    'neurons': (10, 100),
    'activation': (0, 9),
    'optimizer': (0),
    'learning_rate': (0.01, 1),
    'decay': (0, 0.1),
    'batch_size': (7, 112),
    'epochs': (20, 100)
}
# Run Bayesian Optimization
nn_bo = BayesianOptimization(nn_cl_bo, params_nn, random_state=111)
nn_bo.maximize(init_points=25, n_iter=4)
ValueError                                Traceback (most recent call last)
<ipython-input-67-4f3ad5be1912> in <module>()
     10 }
     11 # Run Bayesian Optimization
---> 12 nn_bo = BayesianOptimization(nn_cl_bo, params_nn, random_state=111)
     13 nn_bo.maximize(init_points=25, n_iter=4)

1 frames
/usr/local/lib/python3.7/dist-packages/bayes_opt/target_space.py in __init__(self, target_func, pbounds, random_state)
     47         self._bounds = np.array(
     48             [item[1] for item in sorted(pbounds.items(), key=lambda x: x[0])],
---> 49             dtype=np.float
     50         )
     51

ValueError: setting an array element with a sequence. The requested array has an inhomogeneous shape after 1 dimensions. The detected shape was (7,) + inhomogeneous part.
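For what it's worth, the traceback shows bayes_opt trying to build a float array out of the pbounds values, so every value in params_nn probably needs to be a (low, high) pair of numbers; 'optimizer': (0) is a plain integer, which would make that array ragged. A minimal sketch of uniformly shaped bounds (this is an assumption about the cause, not a verified fix):
# Every entry is a (low, high) pair, so bayes_opt can stack them into a
# uniform (n_params, 2) float array. 'optimizer' is pinned to (0, 0) because
# only Adam is used, which keeps nn_cl_bo's signature unchanged.
params_nn = {
    'neurons': (10, 100),
    'activation': (0, 9),
    'optimizer': (0, 0),
    'learning_rate': (0.01, 1),
    'decay': (0, 0.1),
    'batch_size': (7, 112),
    'epochs': (20, 100),
}

nn_bo = BayesianOptimization(nn_cl_bo, params_nn, random_state=111)
nn_bo.maximize(init_points=25, n_iter=4)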

LSTM model has constant accuracy and doesn't vary

I'm stuck, as you can see, with my LSTM model. I'm trying to predict the number of tons to produce per month. When I train the model, the accuracy is almost constant, with only minimal variation, like:
0.34406
0.34407
0.34408
I tried different combinations of activations, initializers, and parameters, and the accuracy doesn't increase.
I don't know whether the problem is my data, my model, or whether this value is the maximum accuracy the model can reach.
Here is the code (if you notice some unused libraries, it's because I made some changes since the first version):
import numpy as np
import pandas as pd
from pandas.tseries.offsets import DateOffset
from sklearn.preprocessing import MinMaxScaler, StandardScaler, RobustScaler
from sklearn import preprocessing
import keras
%tensorflow_version 2.x
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Flatten
from tensorflow.keras.layers import LSTM
from tensorflow.keras.layers import Dropout
from keras.optimizers import Adam
import warnings
warnings.filterwarnings("ignore")
%matplotlib inline
from plotly.offline import iplot
import matplotlib.pyplot as plt
import chart_studio.plotly as py
import plotly.offline as pyoff
import plotly.graph_objs as go
df_ventas = pd.read_csv('/content/drive/My Drive/proyectoPanimex/DEOPE.csv', parse_dates=['Data Emissão'], index_col=0, squeeze=True)
#df_ventas = df_ventas.resample('M').sum().reset_index()
df_ventas = df_ventas.drop(columns= ['weekday', 'month'], axis=1)
df_ventas = df_ventas.reset_index()
df_ventas = df_ventas.rename(columns= {'Data Emissão':'Fecha','Un':'Cantidad'})
df_ventas['dia'] = [x.day for x in df_ventas.Fecha]
df_ventas['mes']=[x.month for x in df_ventas.Fecha]
df_ventas['anio']=[x.year for x in df_ventas.Fecha]
df_ventas = df_ventas[:-48]
df_ventas = df_ventas.drop(columns='Fecha')
df_diff = df_ventas.copy()
df_diff['cantidad_anterior'] = df_diff['Cantidad'].shift(1)
df_diff = df_diff.dropna()
df_diff['diferencia'] = (df_diff['Cantidad'] - df_diff['cantidad_anterior'])
df_supervised = df_diff.drop(['cantidad_anterior'],axis=1)
#adding lags
for inc in range(1,31):
    nombre_columna = 'retraso_' + str(inc)
    df_supervised[nombre_columna] = df_supervised['diferencia'].shift(inc)
df_supervised = df_supervised.dropna()
df_supervisedNumpy = df_supervised.to_numpy()
train = df_supervisedNumpy
scaler = MinMaxScaler(feature_range=(0, 1))
X_train = scaler.fit(train)
train = train.reshape(train.shape[0], train.shape[1])
train_scaled = scaler.transform(train)
X_train, y_train = train_scaled[:, 1:], train_scaled[:, 0:1]
X_train = X_train.reshape(X_train.shape[0], 1, X_train.shape[1])
#LSTM MODEL
model = Sequential()
act = 'tanh'
actF = 'relu'
model.add(LSTM(200, activation = act, input_dim=34, return_sequences=True ))
model.add(Dropout(0.15))
#model.add(Flatten())
model.add(LSTM(200, activation= act))
model.add(Dropout(0.2))
#model.add(Flatten())
model.add(Dense(200, activation= act))
model.add(Dropout(0.3))
model.add(Dense(1, activation= actF))
optimizer = keras.optimizers.Adam(lr=0.00001)
model.compile(optimizer=optimizer, loss=keras.losses.binary_crossentropy, metrics=['accuracy'])
history = model.fit(X_train, y_train, batch_size = 100,
epochs = 50, verbose = 1)
hist = pd.DataFrame(history.history)
hist['Epoch'] = history.epoch
hist
History plot:
loss acc Epoch
0 0.847146 0.344070 0
1 0.769400 0.344070 1
2 0.703548 0.344070 2
3 0.698137 0.344070 3
4 0.653952 0.344070 4
As you can see, the only value that changes is the loss, but what is going on with the accuracy? I'm starting out with machine learning and don't yet have the knowledge to see my errors. Thanks!
A Dense(1, activation='softmax') will always freeze and not learn anything
A Dense(1, activation='relu') will very probably freeze and not learn anything
A Dense(1, activation='sigmoid') is ideal for classification (binary) problems and somewhat good for regression with values between 0 and 1.
A Dense(1, activation='tanh') is somewhat good for regression with values between -1 and 1
A Dense(1, activation='softplus') is somewhat good for regression with values between 0 and +infinite
A Dense(1, activation='linear') is good for regression in general with no limits (but it's highly recommended that the data be normalized first)
For regression, you can't use accuracy, but the metrics 'mae' and 'mse' don't provide "relative" difference, they provide "absolute" mean difference, one linear, the other squared.
Your output activation should be linear for continuous prediction or softmax for classification. Also multiply your learning rate by 100. Your loss should be mean_absolute_error. You could also easily divide your LSTM neurons by a factor of 10. The tanh should be replaced by relu or the like.
For your accuracy problem, it makes no sense to use accuracy, since you're not trying to classify. For metrics, you can use mae. You're trying to know how far the prediction is from the actual target, on a continuous scale. Accuracy is for categories, not continuous data.
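A minimal sketch of how those suggestions could be applied to the model from the question (the 20-unit layers and the 0.001 learning rate follow the "divide by 10" and "multiply by 100" advice and are assumptions, not values from the original post):
model = Sequential()
model.add(LSTM(20, activation='relu', input_dim=34, return_sequences=True))
model.add(Dropout(0.15))
model.add(LSTM(20, activation='relu'))
model.add(Dropout(0.2))
model.add(Dense(20, activation='relu'))
model.add(Dropout(0.3))
model.add(Dense(1, activation='linear'))      # linear output for a continuous target

optimizer = keras.optimizers.Adam(lr=0.001)   # original learning rate multiplied by 100
model.compile(optimizer=optimizer,
              loss='mean_absolute_error',     # regression loss instead of binary_crossentropy
              metrics=['mae'])                # 'accuracy' is meaningless for regression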

expected dense_1 to have 2 dimensions, but got array with shape (308, 1, 6)

I'm trying to use Conv1D for the first time for multiclass classification of time series data and my model keeps throwing this error when I use it.
import numpy as np
import os
import keras
from keras.models import Sequential
from keras.layers import Conv1D, Dense, TimeDistributed, MaxPooling1D, Flatten
# fix random seed for reproducibility
np.random.seed(7)
dataset1 = np.genfromtxt(os.path.join('data', 'norm_cellcycle_384_17.txt'), delimiter=',', dtype=None)
data = dataset1[1:]
# extract columns
genes = data[:,0]
y_all = data[:,1].astype(int)
x_all = data[:,2:-1].astype(float)
# deleted this line when using sparse_categorical_crossentropy
# 384x6
y_all = keras.utils.to_categorical(y_all)
# 5
num_classes = np.unique(y_all).shape[0]
# split entire data into train set and test set
validation_split = 0.2
val_idx = np.random.choice(range(x_all.shape[0]), int(validation_split*x_all.shape[0]), replace=False)
train_idx = [x for x in range(x_all.shape[0]) if x not in val_idx]
x_train = x_all[train_idx]
y_train = y_all[train_idx]
# 308x17x1
x_train = x_train[:, :, np.newaxis]
# 308x1
y_train = y_train[:,np.newaxis]
x_test = x_all[val_idx]
y_test = y_all[val_idx]
# deleted this line when using sparse_categorical_crossentropy
y_test = keras.utils.to_categorical(y_test)
# 76x17x1
x_test = x_test[:, :, np.newaxis]
# 76x1
y_test = y_test[:,np.newaxis]
print(x_train.shape[0],'train samples')
print(x_test.shape[0],'test samples')
# Create Model
# number of filters for 1D conv
nb_filter = 4
filter_length = 5
window = x_train.shape[1]
model = Sequential()
model.add(Conv1D(filters=nb_filter,kernel_size=filter_length,activation="relu", input_shape=(window,1)))
model.add(MaxPooling1D())
model.add(Conv1D(nb_filter=nb_filter, filter_length=filter_length, activation='relu'))
model.add(MaxPooling1D())
model.add(Flatten())
model.add(Dense(num_classes, activation='softmax'))
model.summary()
model.compile(loss='sparse_categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
model.fit(x_train, y_train, epochs=25, batch_size=2, validation_data=(x_test, y_test))
I don't know why I get this error. When I use binary_crossentropy loss and no one-hot encoding for y_all, my model works; but it fails when I use one-hot encoding for y_all with categorical_crossentropy loss. When I don't use one-hot encoding, Keras throws an error telling me to change y_all to a binary matrix.
I don't even know where the (1, 6) in the array shape is coming from.
ValueError: Error when checking model target: expected dense_1 to have 2 dimensions, but got array with shape (308, 1, 6)
Please help! I've been stuck on this for many hours! I have already gone through all the related questions, but it still doesn't make sense.
Update: I now use sparse_categorical_crossentropy because it supports integer labels. I deleted the to_categorical lines from the code above, and now I get this new error:
InvalidArgumentError (see above for traceback): Received a label value of 5 which is outside the valid range of [0, 5). Label values: 2 5
    [[Node: SparseSoftmaxCrossEntropyWithLogits/SparseSoftmaxCrossEntropyWithLogits = SparseSoftmaxCrossEntropyWithLogits[T=DT_FLOAT, Tlabels=DT_INT64, _device="/job:localhost/replica:0/task:0/cpu:0"](Reshape_1, Cast)]]
Requested sample of data:
,Main,Gp,c1,c2,c3,c4,c5,c6,c7,c8,c9,c10,c11,c12,c13,c14,c15,c16,c17
YDL179w,1,-0.75808,-0.90319,-0.98935,-0.73995,-0.67193,-0.12777,-0.95307,-1.01656,0.79730,2.11688,1.98537,0.61591,0.56603,-0.13684,-0.52228,-0.05068,0.78823,
YLR079w,1,-0.48845,-0.70828,-0.47688,-0.65814,-0.45374,-0.47302,-0.71214,-1.02839,0.24048,3.11376,1.28952,0.44874,0.04379,-0.31104,-0.30332,-0.34575,0.82285,
YER111c,1,-0.42218,0.23887,1.84427,-0.02083,-0.61105,-0.65827,-0.79992,-0.39857,-0.09166,2.03314,1.58457,0.68744,0.14443,-0.72910,-1.46097,-0.82353,-0.51662,
YBR200w,1,0.09824,0.55258,-0.89641,-1.19111,-1.11744,-0.76133,0.09824,2.16120,1.46126,1.03148,0.67537,-0.33155,-0.60170,-1.39987,-0.42978,-0.15963,0.81045,
YPL209c,2,-0.65282,-0.32055,2.53702,2.00538,0.60982,0.51014,-0.55314,-1.01832,-0.78573,0.01173,0.07818,-0.05473,-0.22087,0.24432,-0.28732,-1.11801,-0.98510,
YJL074c,2,-0.81087,-0.19448,1.72941,0.59002,-0.53069,-0.25051,-0.92294,-0.92294,-0.53069,0.08570,1.87884,1.97223,0.45927,-0.36258,-0.34390,-1.07237,-0.77351,
YNL233w,2,-0.43997,0.66325,2.85098,0.74739,-0.42127,-0.47736,-0.79524,-0.80459,-0.48671,-0.21558,1.25226,1.01852,-0.10339,-0.56151,-0.96353,-0.46801,-0.79524,
YLR313c,2,-0.46611,0.42952,3.01689,1.13856,0.01902,-0.44123,-0.66514,-0.98856,-0.59050,-0.47855,0.84002,0.39220,0.50416,-0.50342,-0.82685,-0.64026,-0.73977,
YGR041w,2,-0.57187,-0.26687,1.10561,-0.38125,-0.68624,-0.26687,-0.87687,-1.18186,-0.80062,0.60999,2.09686,1.82998,1.14374,0.11437,-0.80062,-0.87687,-0.19062,
So I noticed that even though I know there are 5 classes in this dataset (as seen from the unique values of y_all), for some reason Keras' to_categorical thinks there are 6 classes.
# 384x6
y_all = keras.utils.to_categorical(y_all)
# 5
num_classes = np.unique(y_all).shape[0]
I don't know why that is. Keeping this in mind I changed this line of code and my model began to run:
model.add(Dense(num_classes, activation='softmax'))
to
model.add(Dense(num_classes+1, activation='softmax'))
I still don't know why to_categorical behaves this way. Anyone know?
to_categorical(x) in Keras encodes the given parameter into n classes, where n = max(x) + 1, i.e. generally speaking the classes 0 through max(x).
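A quick illustration of that behaviour (the labels below are made up; because the dataset's classes start at 1 rather than 0, to_categorical reserves a column for class 0 and ends up with six columns):
import keras

labels = [1, 2, 3, 4, 5]                      # five classes, but none of them is 0
one_hot = keras.utils.to_categorical(labels)  # n = max(labels) + 1 = 6 columns
print(one_hot.shape)                          # (5, 6) -- column 0 is never used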

Setup Keras options for LSTM modelling

I am trying to set up a process for forecasting a value. Currently, I can't understand what the issue is in the code below:
in_neurons = 1
out_neurons = 1
hidden_neurons = 20
nb_features = 9
# retrieve data
y_train = train.pop(target).values
X_train = pd.concat([train[['QTR_HR_START', 'QTR_HR_END', 'HOLIDAY_RANK_', 'SPECIAL_EVENT_RANK_',
'IS_AM', 'IS_TOP_RANKED', 'AWARDS_WINS_ANY', 'YEARS_SINCE_RELEASE']],
pd.DataFrame({'DATETIME': pd.DatetimeIndex(train['DATETIME']).astype(np.int64)})])
X_train = X_train.values
y_test = test.pop(target).values
X_test = pd.concat([test[['QTR_HR_START', 'QTR_HR_END', 'HOLIDAY_RANK_', 'SPECIAL_EVENT_RANK_',
'IS_AM', 'IS_TOP_RANKED', 'AWARDS_WINS_ANY', 'YEARS_SINCE_RELEASE']],
pd.DataFrame({'DATETIME': pd.DatetimeIndex(test['DATETIME']).astype(np.int64)})])
X_test = X_test.values
model = Sequential()
model.add(TimeDistributed(Dense(8, input_shape=(X_train.shape[0], 100, nb_features), activation='softmax')))
model.add(LSTM(4, dropout_W=0.2, dropout_U=0.2))
model.add(Dense(1))
model.add(Activation("sigmoid"))
model.compile(loss="mean_squared_error", optimizer="rmsprop", metrics=['accuracy'])
After running the code, I got an exception:
raise Exception('The first layer in a Sequential model must '
Exception: The first layer in a Sequential model must get an input_shape or batch_input_shape argument.
Please advise where I am wrong.
EDIT 1: I configured the model as described in the official documentation - http://keras.io/layers/recurrent/
model.add(LSTM(32, input_dim=nb_features, input_length=100))
model.compile(loss="mean_squared_error", optimizer="rmsprop", metrics=['accuracy'])
Exception: Error when checking model input: expected lstm_input_1 to have 3 dimensions, but got array with shape (48614, 9)
It's old, but I'm posting this for future use. As the error states, Keras requires 3D input: [samples, time steps, features]. Even though your data has shape (48614, 9), Keras treats it as 2D - [samples, features]. To fix it, do something like this:
def reshape_dataset(train):
    trainX = numpy.reshape(train, (train.shape[0], 1, train.shape[1]))
    return numpy.array(trainX)

x = reshape_dataset(your_dataset)  # your_dataset has shape (48614, 9)
Now x has shape (48614, 1, 9), which is [samples, time steps, features] - 3D.
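For reference, the same reshape can also be written with numpy.newaxis, which may read a little more directly (assuming your_dataset is the (48614, 9) array):
x = your_dataset[:, numpy.newaxis, :]   # shape (48614, 1, 9)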
