Error in Keras while doing Multi-class classification - python

I am trying to do multi-class classification in Keras, using the CrowdFlower dataset. Here is my code:
import pandas as pd
from keras.preprocessing.text import Tokenizer
from keras.preprocessing.sequence import pad_sequences
import numpy as np
from sklearn.preprocessing import LabelEncoder
from keras.models import Sequential
from keras.layers import Embedding, Flatten, Dense
df=pd.read_csv('text_emotion.csv')
df.drop(['tweet_id','author'],axis=1,inplace=True)
df=df[~df['sentiment'].isin(['empty','enthusiasm','boredom','anger'])]
df = df.sample(frac=1).reset_index(drop=True)
labels = []
texts = []
for i, row in df.iterrows():
    texts.append(row['content'])
    labels.append(row['sentiment'])
tokenizer = Tokenizer()
tokenizer.fit_on_texts(texts)
sequences = tokenizer.texts_to_sequences(texts)
word_index = tokenizer.word_index
print('Found %s unique tokens.' % len(word_index))
data = pad_sequences(sequences)
encoder = LabelEncoder()
encoder.fit(labels)
encoded_Y = encoder.transform(labels)
labels = np.asarray(encoded_Y)
print('Shape of data tensor:', data.shape)
print('Shape of label tensor:', labels.shape)
indices = np.arange(data.shape[0])
np.random.shuffle(indices)
data = data[indices]
labels = labels[indices]
print(labels.shape)
model = Sequential()
model.add(Embedding(40000, 8,input_length=37))
model.add(Flatten())
model.add(Dense(100,activation='relu'))
model.add(Dense(50, activation='relu'))
model.add(Dense(9, activation='softmax'))
model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
model.fit(data,labels, validation_split=0.2, epochs=150, batch_size=100)
I am getting this error:
ValueError: Error when checking target: expected dense_3 to have shape (9,) but got array with shape (1,)
Can someone please point out the fault in my logic? I understand my question is similar to Exception: Error when checking model target: expected dense_3 to have shape (None, 1000) but got array with shape (32, 2), but I have not managed to find the bug.

You are making several mistakes in that code; here are some improvements (a combined sketch follows after the suggestions):
Remove the for i, row in df.iterrows(): loop; you can directly use:
labels = df['sentiment']
texts = df['content']
When creating the tokenizer, set the maximum number of words (the vocabulary size): tokenizer = Tokenizer(5000).
When padding, provide the maximum length: data = pad_sequences(sequences, maxlen=37).
Don't just convert the output to an array of integer values with labels = np.asarray(encoded_Y); this is not regression. You have to one-hot encode it:
from keras.utils import np_utils
labels = np_utils.to_categorical(encoded_Y)
In the embedding layer model.add(Embedding(40000, 8, input_length=37)) your vocabulary size is 40K and the embedding dimension is 8. That doesn't make much sense, because the dataset you provided has close to 40K unique words, which cannot all be given a proper embedding. Change to a more sensible vocabulary size, e.g. model.add(Embedding(5000, 30, input_length=37)). NOTE: if you want to use 40000, update Tokenizer(5000) to the same number.
Use variables like embedding_dim = 8 and vocab_size = 40000 (whatever the values might be) instead of hard-coded literals.
Instead of model.add(Dense(9, activation='softmax')) as the final layer, use the following; it keeps the code clean:
model.add(Dense(labels.shape[1], activation='softmax'))
Final working code is attached at this Link
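In case the link goes stale, here is a minimal sketch of the combined pipeline. It uses the illustrative values from the suggestions above (vocab_size = 5000, embedding_dim = 30, maxlen = 37); they are not the only valid choices.
import pandas as pd
import numpy as np
from keras.preprocessing.text import Tokenizer
from keras.preprocessing.sequence import pad_sequences
from keras.models import Sequential
from keras.layers import Embedding, Flatten, Dense
from keras.utils import np_utils
from sklearn.preprocessing import LabelEncoder

vocab_size = 5000      # maximum words kept by the tokenizer
embedding_dim = 30     # size of each word embedding
maxlen = 37            # pad/truncate every tweet to this length

df = pd.read_csv('text_emotion.csv')
df.drop(['tweet_id', 'author'], axis=1, inplace=True)
df = df[~df['sentiment'].isin(['empty', 'enthusiasm', 'boredom', 'anger'])]
df = df.sample(frac=1).reset_index(drop=True)

texts = df['content']          # no need for iterrows()
labels = df['sentiment']

tokenizer = Tokenizer(num_words=vocab_size)
tokenizer.fit_on_texts(texts)
sequences = tokenizer.texts_to_sequences(texts)
data = pad_sequences(sequences, maxlen=maxlen)

encoder = LabelEncoder()
encoded_Y = encoder.fit_transform(labels)
labels = np_utils.to_categorical(encoded_Y)   # one-hot targets for softmax + categorical_crossentropy

model = Sequential()
model.add(Embedding(vocab_size, embedding_dim, input_length=maxlen))
model.add(Flatten())
model.add(Dense(100, activation='relu'))
model.add(Dense(50, activation='relu'))
model.add(Dense(labels.shape[1], activation='softmax'))
model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
model.fit(data, labels, validation_split=0.2, epochs=150, batch_size=100)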

Related

Cannot understand issue! Input tensor must be at least 2D

I have written this simple program to make a prediction (never mind that there is no train/test split). x is a 2D input array of shape (40k, 4).
import numpy as np
from numpy import load
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers
tf.get_logger().setLevel('INFO')
tf.autograph.set_verbosity(1)
x = load('dataset/metadata/x.npy')
y = load('dataset/metadata/y.npy')
meta_model = keras.Sequential(
    [
        layers.Dense(3, activation='relu'),
        layers.Dense(2, activation='relu'),
        layers.Dense(1)
    ]
)
meta_model.compile(
    loss=keras.losses.MeanSquaredError(),
    optimizer=keras.optimizers.Adam(lr=0.001),
    metrics=[tf.keras.metrics.MeanSquaredError()]
)
meta_model.fit(x, y, batch_size=25, epochs=10, verbose=2)
for i in range(10):
    print(y[i], " vs ", meta_model(x[i]))
In the final few lines I am attempting to make the model output a prediction (I am aware that the prediction is happening on the same data the model was trained on; I am simply trying to get the model to work). I cannot understand why I am getting the following error on the last line:
Input tensor must be at least 2D: [3]
Can anyone help explain what I am doing incorrectly?

Why am I getting a constant loss and accuracy?

This is my code:
# Importing the essential libraries
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
# Getting the dataset
data = pd.read_csv("sales_train.csv")
X = data.iloc[:, 1:-1].values
y = data.iloc[:, -1].values
# y = np.array(y).reshape(-1, 1)
# Getting the values for november 2013 and 2014 to predict 2015
list_of_november_values = []
list_of_november_values_y = []
for i in range(0, len(y)):
    if X[i, 0] == 10 or X[i, 0] == 22:
        list_of_november_values.append(X[i, 1:])
        list_of_november_values_y.append(y[i])
# Converting list to array
arr_of_november_values = np.array(list_of_november_values)
y_train = np.array(list_of_november_values_y).reshape(-1, 1)
# Scaling the independent values
from sklearn.preprocessing import StandardScaler
sc = StandardScaler()
X_train = sc.fit_transform(arr_of_november_values)
# Creating the neural network
from keras.models import Sequential
from keras.layers import Dense
nn = Sequential()
nn.add(Dense(units=120, activation='relu'))
nn.add(Dense(units=60, activation='relu'))
nn.add(Dense(units=30, activation='relu'))
nn.add(Dense(units=15, activation='relu'))
nn.add(Dense(units=1, activation='softmax'))
nn.compile(optimizer='adam', loss='mse')
nn.fit(X_train, y_train, batch_size=100, epochs=25)
# Saving the weights
nn.save_weights('weights.h5')
print("Weights Saved")
For my loss, I am getting the same value for every epoch. Is there a concept I am missing that is causing my loss to be constant?
Here is the dataset for the code.
The predominant reason is your odd choice of final-layer activation, paired with the loss function used. Reconsider this: you are using a softmax activation on a single-unit fully-connected layer. Softmax takes a vector and scales it so that its values sum to one, retaining their relative proportions according to the following function:
softmax(z)_i = exp(z_i) / sum_j exp(z_j)
The upshot is that your network will only ever output 1, so there are no useful gradients and no learning.
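A tiny numpy illustration of the point (not from the original answer): with a single unit, softmax normalizes a length-one vector, which is always 1 no matter what the unit outputs.
import numpy as np

z = np.array([3.7])                       # whatever the single output unit produces
softmax = np.exp(z) / np.exp(z).sum()     # array([1.0]) for any value of z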
To resolve this, first change your final-layer activation to either ReLU or linear, depending on the structure of your dataset (I'm not going to work with the provided data myself, but I'm sure you understand its structure).
I expect there may be further issues regarding the structure of your network, but I'll leave that up to you. For now, the big issue is your final-layer activation.
Change this line:
nn.add(Dense(units=1, activation='softmax'))
To this line:
nn.add(Dense(units=1))
For a regression problem, you don't need an activation function on the output layer (Dense defaults to a linear activation).
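For reference, a minimal sketch of the model with that one change applied (same layer sizes and training call as in the question; whether 'mse' is the best loss for this data is a separate question):
from keras.models import Sequential
from keras.layers import Dense

nn = Sequential()
nn.add(Dense(units=120, activation='relu'))
nn.add(Dense(units=60, activation='relu'))
nn.add(Dense(units=30, activation='relu'))
nn.add(Dense(units=15, activation='relu'))
nn.add(Dense(units=1))                      # linear output for regression
nn.compile(optimizer='adam', loss='mse')
nn.fit(X_train, y_train, batch_size=100, epochs=25)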

Dimension error with RNN and word classification

I'm pretty new to NLP and I want to classify words by their language (basically my model should tell me whether a word is French, English, Spanish, and so on).
When I fit the following model I get a dimension error. dataset contains the words; it's a padded tensor of shape (1550, 19). y contains the different languages; it's also a padded tensor, of shape (1550, 10).
import numpy as np
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras.layers import LSTM, GRU, Input, Embedding, Dense

np.random.seed(42)
tf.random.set_seed(42)
input = Input(shape=[None])
z = Embedding(max_id + 1, 128, input_shape=[None], mask_zero=True)(input)
z = GRU(128)(z)
output = Dense(18, activation='softmax')(z)
model = keras.models.Model(input, output)
model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
h = model.fit(dataset, y, epochs=5)
ValueError: Shapes (None, 10) and (None, 18) are incompatible
Do you see where the problem is?
Thanks!
The message tells you that the shapes are not compatible; they need to match. I would have put this as a comment, but I can't due to my reputation, so I'm answering directly. I'm not sure if it works, but have you tried:
output = Dense(10, activation='softmax')(z)
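More generally (a small sketch, not from the original answer, using the question's own setup where y is one-hot with 10 columns), you can tie the output width to the label tensor so the two always match:
output = Dense(y.shape[1], activation='softmax')(z)   # 10 units here, matching y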

How to get Keras model predicted text back into list of words?

I'm trying to build an autoencoder neural network for finding outliers in a single-column list of text. The text input is like the following:
about_header.png
amaze_header_2.png
amaze_header.png
circle_shape.xml
disableable_ic_edit_24dp.xml
fab_label_background.xml
fab_shadow_black.9.png
fab_shadow_dark.9.png
fab_shadow_light.9.png
fastscroller_handle_normal.xml
fastscroller_handle_pressed.xml
folder_fab.png
The problem is that I don't really know what I'm doing. I'm using Keras, and I've converted these lines of text into a matrix with the Keras Tokenizer so they can be fed into a Keras model for fitting and prediction.
The predict function returns what I believe is a matrix, and I can't really tell what happened, because I can't convert the matrix back into the list of text I originally had.
My entire code is as follows:
import sys
from keras import Input, Model
import matplotlib.pyplot as plt
from keras.layers import Dense
from keras.preprocessing.text import Tokenizer
with open('drawables.txt', 'r') as arquivo:
    dados = arquivo.read().splitlines()
tokenizer = Tokenizer(filters='', nb_words=None)
tokenizer.fit_on_texts(dados)
x_dados = tokenizer.texts_to_matrix(dados, mode="count")
tamanho = len(tokenizer.word_index) + 1
tamanho_comprimido = int(tamanho/1.25)
x = Input(shape=(tamanho,))
# Encoder
hidden_1 = Dense(tamanho_comprimido, activation='relu')(x)
h = Dense(tamanho_comprimido, activation='relu')(hidden_1)
# Decoder
hidden_2 = Dense(tamanho, activation='relu')(h)
r = Dense(tamanho, activation='sigmoid')(hidden_2)
autoencoder = Model(input=x, output=r)
autoencoder.compile(optimizer='adam', loss='mse')
history = autoencoder.fit(x_dados, x_dados, epochs=25, shuffle=False)
plt.plot(history.history["loss"])
plt.ylabel("Loss")
plt.xlabel("Epoch")
plt.show()
encoded = autoencoder.predict(x_dados)
result = ???????
You can decode the text with the original tokenizer using tokenizer.sequences_to_texts. It accepts a list of integer sequences; to get those sequences you can use np.argmax.
import numpy as np

encoded_argmax = np.argmax(encoded, axis=1)
text = tokenizer.sequences_to_texts([encoded_argmax])  # wrap in a list: sequences_to_texts expects a list of sequences

Feature and time steps in LSTM MNIST dataset

I've been working with LSTMs for a while and I think I have grasped the main concepts. I have been playing with the Keras environment to get a better idea of how LSTMs work, so I decided to train a neural network to classify the MNIST dataset.
I know that when I train an LSTM I should give it a tensor of shape (number of samples, time steps, features) as input. I reshaped each image from 28x28 to a single vector of 784 elements (1x784), making the input shape (60000, 1, 784). Then I tried to change the number of time steps, and my new input shape becomes (60000, 16, 49).
What I don't understand is why, when I change the number of time steps, the feature vector changes from 784 to 49. I think I don't really understand the concept of time steps in an LSTM. Could you please explain it better, possibly referring to this particular case?
Furthermore, when I increase the time steps the precision is lower; why is that? Shouldn't it be higher?
Thank you.
edit
from __future__ import print_function
import numpy as np
import struct
from keras.models import Sequential
from keras.layers import Dense, LSTM, Activation
from keras.utils import np_utils
train_im = open('train-images-idx3-ubyte','rb')
train_la = open('train-labels-idx1-ubyte','rb')
test_im = open('t10k-images-idx3-ubyte','rb')
test_la = open('t10k-labels-idx1-ubyte','rb')
##training images and labels
magic,num_ima = struct.unpack('>II', train_im.read(8))
rows,columns = struct.unpack('>II', train_im.read(8))
img = np.fromfile(train_im,dtype=np.uint8).reshape(rows*columns, num_ima) #784*60000
magic_l, num_l = struct.unpack('>II', train_la.read(8))
lab = np.fromfile(train_la, dtype=np.int8) #1*60000
## test images and labels
magic, num_test = struct.unpack('>II', test_im.read(8))
rows,columns = struct.unpack('>II', test_im.read(8))
img_test = np.fromfile(test_im,dtype=np.uint8).reshape(rows*columns, num_test) #784x10000
magic_l, num_l = struct.unpack('>II', test_la.read(8))
lab_test = np.fromfile(test_la, dtype=np.int8) #1*10000
batch = 50
epoch=15
hidden_units = 10
classes = 1
a, b = img.T.shape[0:]
img = img.reshape(img.T.shape[0],-1,784)
img_test = img_test.reshape(img_test.T.shape[0],-1,784)
lab = np_utils.to_categorical(lab, 10)
lab_test = np_utils.to_categorical(lab_test, 10)
print(img.shape[0:])
model = Sequential()
model.add(LSTM(40,input_shape =img.shape[1:], batch_size = batch))
model.add(Dense(10))
model.add(Activation('softmax'))
model.compile(optimizer = 'RMSprop', loss='mean_squared_error', metrics = ['accuracy'])
model.fit(img, lab, batch_size = batch,epochs=epoch,verbose=1)
scores = model.evaluate(img_test, lab_test, batch_size=batch)
predictions = model.predict(img_test, batch_size = batch)
print('LSTM test score:', scores[0])
print('LSTM test accuracy:', scores[1])
edit 2
Thank you very much. When I do so, I get the following error:
ValueError: Input arrays should have the same number of samples as target arrays. Found 3750 input samples and 60000 target samples.
I know that I should reshape the output as well but I don't know what shape it should have.
Timesteps represent states in time, like frames extracted from a video. The input passed to the LSTM should have shape (num_samples, timesteps, input_dim). If you want 16 timesteps you should reshape your data to (num_samples // timesteps, timesteps, input_dim):
img = img.reshape(3750, 16, 784)
So with your batch_size = 50, it will pass 50*16 images at a time.
Right now, because you keep num_samples constant, the reshape splits your input_dim instead.
edit:
The target array must have the same number of samples, i.e. 3750 in your case. All the timesteps in a sequence will share the same label. You have to decide what you are going to do with those MNIST sequences; your current model classifies the sequences (not individual digits) into 10 classes.
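As a purely illustrative sketch of that reshape (it assumes img already has shape (60000, 784) and lab is one-hot with shape (60000, 10); taking the label of the last image in each group is just one arbitrary choice, since the 16 images in a group do not genuinely share a label):
timesteps = 16
img_seq = img.reshape(-1, timesteps, 784)     # (3750, 16, 784): 16 images form one sequence sample
lab_seq = lab[timesteps - 1::timesteps]       # (3750, 10): one (hypothetically chosen) label per sequence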
