Combine CNN and LSTM for text multi-class classification - Python

I am building a model consisting of one CNN and one LSTM: the CNN extracts the features and passes them to the LSTM. I am working on a multi-class text classification problem, and the input to the CNN is TF-IDF vectors.
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Dropout, Flatten, Activation
from tensorflow.keras.metrics import CategoricalAccuracy, AUC
from tensorflow import keras
import tensorflow as tf
from tensorflow.keras.layers import TimeDistributed
from tensorflow.keras.layers import Conv2D, MaxPooling2D, Conv1D, MaxPooling1D
from tensorflow.keras.layers import LSTM
model = Sequential()
model.add(Conv1D(128, 5, activation='relu', input_shape=(1500, 1)))
model.add(MaxPooling1D(pool_size=2))
model.add(LSTM(10))
model.add(Dense(5, activation='softmax'))
model.compile(loss='sparse_categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
print(model.summary())
history = model.fit(X_train, y_train, epochs=20, validation_data=(X_test, y_test), batch_size=512)
The error is as follows:
ValueError: Input 0 of layer sequential_33 is incompatible with the layer: expected axis -1 of input shape to have value 1 but received input with shape (None, 1, 1500)

Your model expects an input shape of (1500, 1) per sample, but the data you are feeding it has shape (1, 1500). Change the input in your model from model.add(Conv1D(128, 5, activation='relu', input_shape=(1500, 1))) to model.add(Conv1D(128, 5, activation='relu', input_shape=(1, 1500))).
There might be more errors with the format after that, though; I'm not sure you can just add the LSTM to your model like this.
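Alternatively, you could keep input_shape=(1500, 1) and reshape the data instead; note that a Conv1D with kernel_size=5 cannot slide over a length-1 timesteps axis. A minimal sketch on my part, assuming X_train and X_test are dense TF-IDF arrays (e.g. TfidfVectorizer output converted with .toarray()):
import numpy as np
# Hypothetical reshape: make the 1500 TF-IDF values the timesteps axis, so the
# data matches the model's original input_shape=(1500, 1)
X_train = np.asarray(X_train).reshape(-1, 1500, 1)
X_test = np.asarray(X_test).reshape(-1, 1500, 1)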

Related

How to implement a many-to-many LSTM architecture for numerical data (not time series, not NLP) in Keras

I have read this and this.
I have numerical data in arrays with these shapes:
input_array = 14674 x 4
output_array = 13734 x 4
reshaping for LSTM (batch, timesteps, features) gives
input_array= (14574, 100, 4)
output_array = (13634, 100, 4)
Now I would like to build a many-to-many LSTM architecture for the given data.
Should I use an encoder-decoder architecture or a synced sequence-input-and-output architecture?
I am using the following model, but it only works when the input and output are the same.
import tensorflow
from tensorflow.keras.models import Sequential
from tensorflow.keras.metrics import Recall, Precision
from tensorflow.keras.layers import Conv1D, Dense, MaxPooling1D, Flatten, LSTM, RepeatVector, TimeDistributed
opt = tensorflow.keras.optimizers.Adam(learning_rate=0.001)
model_enc_dec_cnn = Sequential()
model_enc_dec_cnn.add(Conv1D(filters=64, kernel_size=9, activation='relu', input_shape=(100, 4)))
model_enc_dec_cnn.add(Conv1D(filters=64, kernel_size=11, activation='relu'))
model_enc_dec_cnn.add(MaxPooling1D(pool_size=2))
model_enc_dec_cnn.add(Flatten())
model_enc_dec_cnn.add(RepeatVector(100))
model_enc_dec_cnn.add(LSTM(100, activation='relu', return_sequences=True))
model_enc_dec_cnn.add(TimeDistributed(Dense(4)))
model_enc_dec_cnn.compile( optimizer=opt, loss='mse', metrics=['accuracy'])
history = model_enc_dec_cnn.fit(X, y, epochs=3, batch_size=64)

Matrix size-incompatible - Keras Tensorflow

I'm trying to train a simple model on some image data belonging to 10 classes.
The images are in B/W format (not grayscale). I'm using image_dataset_from_directory to import the data into Python and to split it into validation/training sets.
My code is as below:
My Imports
import matplotlib.pyplot as plt
import numpy as np
import os
import PIL
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers
from tensorflow.keras.models import Sequential
from tensorflow.keras.preprocessing.image import ImageDataGenerator
from tensorflow.keras.layers import Dense
Read Image Data
trainDT = tf.keras.preprocessing.image_dataset_from_directory(
    data_path,
    labels="inferred",
    label_mode="categorical",
    class_names=['0','1','2','3','4','5','6','7','8','9'],
    color_mode="grayscale",
    batch_size=4,
    image_size=(256, 256),
    shuffle=True,
    seed=44,
    validation_split=0.1,
    subset='validation',
    interpolation="bilinear",
    follow_links=False,
)
Model Creation/Compile/Fit
model = Sequential([
    Dense(units=128, activation='relu', input_shape=(256,256,1), name='h1'),
    Dense(units=64, activation='relu', name='h2'),
    Dense(units=16, activation='relu', name='h3'),
    layers.Flatten(name='flat'),
    Dense(units=10, activation='softmax', name='out')
], name='1st')
model.summary()
model.compile(optimizer='adam' , loss='categorical_crossentropy', metrics=['accuracy'])
model.fit(x=trainDT, validation_data=train_data, epochs=10, verbose=2)
The model training returns an error:
InvalidArgumentError Traceback (most recent call last)
....
/// anaconda paths and anaconda python code snippets in the error reporting \\\
....
InvalidArgumentError: Matrix size-incompatible: In[0]: [1310720,3], In[1]: [1,128]
[[node 1st/h1/Tensordot/MatMul (defined at <ipython-input-38-58d6507e2d35>:1) ]] [Op:__inference_test_function_11541]
Function call stack:
test_function
I don't understand where the size mismatch comes from, I've spent a few hours looking around for a solution and trying different things but nothing seems to work for me.
Appreciate any help, thank you in advance!
Dense layers expect flat input (not a 3D tensor), but you are sending a (256, 256, 1)-shaped tensor into the first dense layer. If you want to use dense layers from the beginning, you need to either make Flatten the first layer or properly reshape your data.
model = tf.keras.Sequential([
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(128, activation="relu"),
    tf.keras.layers.Dense(10, activation="softmax")
])
Also, the flatten between two dense layers makes no sense, because the output of a dense layer is already flat.
From the structure of your model (especially the flatten placement), I assume those dense layers were supposed to be convolutional layers instead.
model = tf.keras.Sequential([
    tf.keras.layers.Conv2D(32, (3, 3), activation='relu'),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(128, activation="relu"),
    tf.keras.layers.Dense(10, activation="softmax")
])
Convolutional layers can process 2D input, and they also produce higher-dimensional output, which you need to flatten before passing it to the dense top (note that you can add more convolutional layers).
Hi mhk777, hope you are doing well. Brother, I think you are confusing dense layers with convolution layers. You have to apply some convolution layers to the image before giving it to dense layers. If you don't want to apply convolutions, then you have to give a 2D array to the dense layer, i.e. (number of samples, data); see the sketch after the code below.
import tensorflow as tf
from tensorflow.keras import datasets, layers, models
model = models.Sequential()
# Here are the convolutional layers
model.add(layers.Conv2D(32, (3, 3), activation='relu', input_shape=(256,256,1)))
model.add(layers.MaxPooling2D((2, 2)))
model.add(layers.Conv2D(64, (3, 3), activation='relu'))
model.add(layers.MaxPooling2D((2, 2)))
model.add(layers.Conv2D(64, (3, 3), activation='relu'))
# Here are your dense layers
model.add(layers.Flatten())
model.add(layers.Dense(64, activation='relu'))
# softmax here because categorical_crossentropy expects probabilities, not logits
model.add(layers.Dense(10, activation='softmax'))
model.summary()
model.compile(optimizer='adam' , loss='categorical_crossentropy', metrics=['accuracy'])
model.fit(x=trainDT, validation_data=train_data, epochs=10, verbose=2)
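For that no-convolution route, a minimal sketch (my addition, not part of the original answer) that flattens each batch of 256x256x1 images so a dense-only model receives 2D input:
import tensorflow as tf
# Hypothetical preprocessing: map each (batch, 256, 256, 1) image tensor from
# the dataset to a (batch, 65536) matrix before feeding it to a dense-only model
flat_train = trainDT.map(
    lambda images, labels: (tf.reshape(images, (-1, 256 * 256)), labels))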

Keras CNN model accuracy not improving and decreasing over epoch?

Newbie to machine learning here.
I'm currently working on a diagnostic machine learning framework using 3D CNNs on fMRI imaging. My dataset consists of 636 images right now, and I'm trying to distinguish between control and affected (binary classification). However, when I try to train my model, the accuracy stays at 48.13% after every epoch, no matter what I do. Additionally, over the epochs, the accuracy decreases from 56% to 48.13%.
So far, I have tried:
changing my loss function (Poisson, categorical cross-entropy, binary cross-entropy, sparse categorical cross-entropy, mean squared error, mean absolute error, hinge, squared hinge)
changing my optimizer (I've tried Adam and SGD)
changing the number of layers
using weight regularization
changing from ReLU to leaky ReLU (I thought perhaps that could help if this was a case of overfitting)
Nothing has worked so far.
Any tips? Here's my code:
#importing important packages
import tensorflow as tf
import os
import keras
from keras.models import Sequential
from keras.layers import Dense, Flatten, Conv3D, MaxPooling3D, Dropout, BatchNormalization, LeakyReLU
import numpy as np
from keras.regularizers import l2
from sklearn.utils import compute_class_weight
from keras.optimizers import SGD
BATCH_SIZE = 64
input_shape=(64, 64, 40, 20)
# Create the model
model = Sequential()
model.add(Conv3D(64, kernel_size=(3,3,3), activation='relu', input_shape=input_shape, kernel_regularizer=l2(0.005), bias_regularizer=l2(0.005), data_format = 'channels_first', padding='same'))
model.add(MaxPooling3D(pool_size=(2, 2, 2)))
model.add(Conv3D(64, kernel_size=(3,3,3), activation='relu', input_shape=input_shape, kernel_regularizer=l2(0.005), bias_regularizer=l2(0.005), data_format = 'channels_first', padding='same'))
model.add(MaxPooling3D(pool_size=(2, 2, 2)))
model.add(BatchNormalization(center=True, scale=True))
model.add(Conv3D(64, kernel_size=(3,3,3), activation='relu', input_shape=input_shape, kernel_regularizer=l2(0.005), bias_regularizer=l2(0.005), data_format = 'channels_first', padding='same'))
model.add(MaxPooling3D(pool_size=(2, 2, 2)))
model.add(Conv3D(64, kernel_size=(3,3,3), activation='relu', input_shape=input_shape, kernel_regularizer=l2(0.005), bias_regularizer=l2(0.005), data_format = 'channels_first', padding='same'))
model.add(MaxPooling3D(pool_size=(2, 2, 2)))
model.add(BatchNormalization(center=True, scale=True))
model.add(Flatten())
model.add(BatchNormalization(center=True, scale=True))
model.add(Dense(128, activation='relu', kernel_regularizer=l2(0.01), bias_regularizer=l2(0.01)))
model.add(Dropout(0.5))
model.add(Dense(128, activation='sigmoid', kernel_regularizer=l2(0.01), bias_regularizer=l2(0.01)))
model.add(Dense(1, activation='softmax', kernel_regularizer=l2(0.01), bias_regularizer=l2(0.01)))
# Compile the model
model.compile(optimizer=SGD(lr=0.000001), loss='poisson', metrics=['accuracy', tf.keras.metrics.Precision(), tf.keras.metrics.Recall()])
# Model training
history = model.fit(X_train, y_train, batch_size=BATCH_SIZE, epochs=50, verbose=1, shuffle=True)
The main issue is that you are using a softmax activation with 1 neuron. Change it to sigmoid, with binary_crossentropy as the loss function.
At the same time, bear in mind that you are using the Poisson loss function, which is suitable for regression problems, not classification ones. Make sure you identify the exact scenario you are trying to solve.
Softmax with one neuron makes the model illogical; use either a single sigmoid neuron or a softmax over two neurons in the last layer.
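A minimal sketch of the suggested fix (my illustration of the answers above, assuming y_train holds integer 0/1 labels): replace the final layer and the loss, keeping the rest of the model unchanged.
# A single sigmoid unit outputs P(affected); pair it with binary_crossentropy
model.add(Dense(1, activation='sigmoid', kernel_regularizer=l2(0.01), bias_regularizer=l2(0.01)))
model.compile(optimizer=SGD(lr=0.000001), loss='binary_crossentropy',
              metrics=['accuracy', tf.keras.metrics.Precision(), tf.keras.metrics.Recall()])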

How to apply model.fit() function over an CNN-LSTM model?

I am trying to use this model to classify images into two categories. I applied the model.fit() function, but it's showing an error.
ValueError: A target array with shape (90, 1) was passed for an output of shape (None, 10) while using as loss binary_crossentropy. This loss expects targets to have the same shape as the output.
import tensorflow as tf
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Dropout, Activation, Flatten, Conv2D, MaxPooling2D, LSTM
import pickle
import numpy as np
X = np.array(pickle.load(open("X.pickle","rb")))
Y = np.array(pickle.load(open("Y.pickle","rb")))
#scaling our image data
X = X/255.0
model = Sequential()
model.add(Conv2D(64 ,(3,3), input_shape = (300,300,1)))
# model.add(MaxPooling2D(pool_size = (2,2)))
model.add(tf.keras.layers.Reshape((16, 16*512)))
model.add(LSTM(128, activation='relu', return_sequences=True))
model.add(Dropout(0.2))
model.add(LSTM(128, activation='relu'))
model.add(Dropout(0.2))
model.add(Dense(32, activation='relu'))
model.add(Dropout(0.2))
model.add(Dense(10, activation='softmax'))
opt = tf.keras.optimizers.Adam(lr=1e-3, decay=1e-5)
model.compile(loss='binary_crossentropy', optimizer=opt,
              metrics=['accuracy'])
# model.summary()
model.fit(X, Y, batch_size=32, epochs = 2, validation_split=0.1)
If your problem is categorical, your issue is that you are using binary_crossentropy instead of categorical_crossentropy; ensure that you actually have a categorical rather than a binary classification problem.
Also, please note that if your labels are in simple integer format like [1,2,3,4...] and not one-hot-encoded, your loss_function should be sparse_categorical_crossentropy, not categorical_crossentropy.
If you do have a binary classification problem, as the error above suggests, ensure that either:
Loss is binary_crossentropy + Dense(1, activation='sigmoid'), or
Loss is categorical_crossentropy + Dense(2, activation='softmax').
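A minimal sketch of both options (my illustration, assuming Y contains integer 0/1 labels of shape (samples, 1)):
# Option 1: one sigmoid unit + binary_crossentropy; targets stay shaped (samples, 1)
model.add(Dense(1, activation='sigmoid'))
model.compile(loss='binary_crossentropy', optimizer=opt, metrics=['accuracy'])
# Option 2: two softmax units + categorical_crossentropy; one-hot encode the targets first:
# Y = tf.keras.utils.to_categorical(Y, num_classes=2)
# model.add(Dense(2, activation='softmax'))
# model.compile(loss='categorical_crossentropy', optimizer=opt, metrics=['accuracy'])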

How to use a 1D-CNN model in Lime?

I have a numeric health record dataset. I used a 1D CNN Keras model for the classification step.
I am giving a reproducible example in Python:
import tensorflow as tf
import keras
from keras.models import Sequential
from keras.layers import Conv1D, Activation, Flatten, Dense
import numpy as np
a = np.array([[0,1,2,9,3], [0,5,1,33,6], [1, 12,1,8,9]])
train = np.reshape(a[:,1:],(a[:,1:].shape[0], a[:,1:].shape[1],1))
y_train = keras.utils.to_categorical(a[:,:1])
model = Sequential()
model.add(Conv1D(filters=2, kernel_size=2, strides=1, activation='relu', padding="same", input_shape=(train.shape[1], 1), kernel_initializer='he_normal'))
model.add(Flatten())
model.add(Dense(2, activation='sigmoid'))
model.compile(loss=keras.losses.binary_crossentropy,
              optimizer=keras.optimizers.Adam(lr=0.001, beta_1=0.9, beta_2=0.999, amsgrad=False),
              metrics=['accuracy'])
model.fit(train, y_train, epochs=3, verbose=1)
I am getting this error when I apply LIME to my 1D CNN model:
IndexError: boolean index did not match indexed array along dimension 1; dimension is 4 but corresponding boolean dimension is 1
import lime
import lime.lime_tabular
explainer = lime.lime_tabular.LimeTabularExplainer(train)
Is there a solution?
I made some minor changes to your initial code (changed from keras to tensorflow.keras):
import tensorflow as tf
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Conv1D, Activation, Flatten, Dense
import numpy as np
a = np.array([[0,1,2,9,3], [0,5,1,33,6], [1, 12,1,8,9]])
train = np.reshape(a[:,1:],(a[:,1:].shape[0], a[:,1:].shape[1],1))
y_train = tf.keras.utils.to_categorical(a[:,:1])
model = Sequential()
model.add(Conv1D(filters=2, kernel_size=2, strides=1, activation='relu',
                 padding="same", input_shape=(train.shape[1], 1),
                 kernel_initializer='he_normal'))
model.add(Flatten())
model.add(Dense(2, activation='sigmoid'))
model.compile(loss=tf.keras.losses.binary_crossentropy,
              optimizer=tf.keras.optimizers.Adam(lr=0.001, beta_1=0.9,
                                                 beta_2=0.999, amsgrad=False),
              metrics=['accuracy'])
model.fit(train, y_train, epochs=3, verbose=1)
Then I added some test data, because you don't want to train and test your LIME model on the same data:
b = np.array([[1,4,5,3,2], [1,4,2,55,1], [7, 3,22,3,10]])
test = np.reshape(b[:,1:],(b[:,1:].shape[0], b[:,1:].shape[1],1))
Here is how the RecurrentTabularExplainer can be set up:
import lime
from lime import lime_tabular
explainer = lime_tabular.RecurrentTabularExplainer(train, training_labels=y_train, feature_names=["random clf"],
                                                   discretize_continuous=False, feature_selection='auto',
                                                   class_names=['class 1', 'class 2'])
Then you can run your LIME model on one of the examples in your test data:
exp = explainer.explain_instance(np.expand_dims(test[0],axis=0), model.predict, num_features=10)
and finally display the explanation:
exp.show_in_notebook()
or just print it as a list:
print(exp.as_list())
You should try lime_tabular.RecurrentTabularExplainer instead of LimeTabularExplainer. It is an explainer for keras-style recurrent neural networks. Check out the examples in the LIME documentation for a better understanding. Good luck :)
