As a trial, I'm implementing Xception in TensorFlow to classify images without using pretrained weights.
However, the accuracy is far lower than in the original paper.
Could somebody share any advice on how to address this problem?
I prepared 500 out of the 1000 ImageNet classes and trained the ready-made Xception model on this data from scratch.
I tried the same learning rate and optimizer as used in the original paper.
– Optimizer: SGD
– Momentum: 0.9
– Initial learning rate: 0.045
– Learning rate decay: decay rate of 0.94 every 2 epochs
However, this did not work so well.
I know it would be better to use all 1000 classes rather than only 500; however, I couldn't prepare enough storage for that.
Did that affect the performance of my model?
Here is my code.
import tensorflow as tf
import tensorflow.keras as keras
from tensorflow.keras import layers, losses, models, optimizers, callbacks, applications, preprocessing

# learning rate scheduler
def scheduler(epoch, lr):
    return 0.045 * 0.94 ** (epoch / 2.0)

lr_decay = callbacks.LearningRateScheduler(scheduler)

# early stopping
EarlyStopping = callbacks.EarlyStopping(monitor='val_loss', min_delta=0, patience=500,
                                        verbose=0, mode='auto', restore_best_weights=True)

# build Xception
inputs = tf.keras.Input(shape=(224, 224, 3))
x = tf.cast(inputs, tf.float32)
x = tf.keras.applications.xception.preprocess_input(x)  # preprocess image
x = applications.xception.Xception(weights=None, include_top=False)(x, training=True)
x = layers.GlobalAveragePooling2D()(x)
x = layers.Dense(nb_class)(x)
outputs = layers.Softmax()(x)
model = tf.keras.Model(inputs, outputs)

model.compile(optimizer=optimizers.SGD(momentum=0.9, nesterov=True),
              loss='categorical_crossentropy',
              metrics=['accuracy'])

# fitting data
history = model.fit(image_gen(df_train_chunk, 224, 224),  # feed images with a generator
                    batch_size=32,
                    steps_per_epoch=64,
                    epochs=1000000000,
                    validation_data=image_gen(df_valid_chunk, 224, 224),  # feed images with a generator
                    validation_steps=64,
                    callbacks=[lr_decay, EarlyStopping])
My results are below. In the original paper, accuracy reached around 0.8. In contrast, the performance of my model is far worse.
P.S.
Some might wonder whether my generator is wrong, so I put my generator code and its output below.
import numpy as np
from PIL import Image, ImageEnhance, ImageOps

def image_gen(df_data, h, w, shuffle=True):
    nb_class = len(np.unique(df_data['Class']))
    while True:
        if shuffle:
            df_data = df_data.sample(frac=1)
        for i in range(len(df_data)):
            # load and resize the image
            X = Image.open((df_data.iloc[i]).loc['Path'])
            X = X.convert('RGB')
            X = X.resize((w, h))
            X = preprocessing.image.img_to_array(X)
            X = np.expand_dims(X, axis=0)

            # one-hot encode the label
            klass = (df_data.iloc[i]).loc['Class']
            y = np.zeros(nb_class)
            y[klass] = 1
            y = np.expand_dims(y, axis=0)

            yield X, y

train_gen = image_gen(df_train_chunk, 224, 224)
for i in range(5):
    X, y = next(train_gen)
    print('\n\n class: ', y.argmax(-1))
    display(Image.fromarray(X.squeeze(0).astype(np.uint8)))
The result is below.
When you chose only 500 labels, did you choose the first 500? The softmax output starts from index 0, so make sure your labels also run from 0 to 499.
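For example, here is a minimal sketch of remapping an arbitrary set of kept class labels to a contiguous 0-499 range before one-hot encoding (the remap_classes helper is hypothetical; it assumes the labels live in a df_data['Class'] column, as in the generator above):

import numpy as np

# hypothetical helper: map the 500 kept (possibly scattered) ImageNet class IDs
# onto a dense 0..499 range, which is what the 500-way softmax head expects
def remap_classes(df_data, class_col='Class'):
    classes = np.sort(df_data[class_col].unique())
    class_to_idx = {c: i for i, c in enumerate(classes)}  # e.g. {7: 0, 12: 1, ...}
    df_data = df_data.copy()
    df_data[class_col] = df_data[class_col].map(class_to_idx)
    return df_data, class_to_idx

# usage (df_train_chunk / df_valid_chunk as in the question):
# df_train_chunk, mapping = remap_classes(df_train_chunk)
# df_valid_chunk = df_valid_chunk.assign(Class=df_valid_chunk['Class'].map(mapping))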
I am doing research on using a CNN machine learning model for NLP (multi-label classification).
I read some papers that reported good results applying CNNs to multi-label classification, and I am trying to test this model in Python.
I read many articles about how to work with NLP and neural networks.
I have this code that is not working and gives me many errors (every time I fix one error I get another).
I ended up hiring paid freelancers to help me fix the code; I hired 5 people, but none of them was able to fix it!
You are my last hope.
I hope someone can help me fix this code and get it working.
First, here is my dataset (a 100-record sample, just to make sure the code works; I know it is not enough for good accuracy, and I will tweak and enhance the model later):
http://shrinx.it/data100.zip
For the time being I just want this code to work, but tips on how to improve accuracy are very welcome.
Some of the errors I got:
InvalidArgumentError: indices[1] = [0,13] is out of order. Many sparse ops require sorted indices.
Use `tf.sparse.reorder` to create a correctly ordered copy.
and
ValueError: Input 0 of layer sequential_8 is incompatible with the layer: expected ndim=3, found ndim=2. Full shape received: [None, 18644]
Here is my code:
import pandas as pd
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import MultiLabelBinarizer
from keras.layers import *
from keras.callbacks import ModelCheckpoint
from keras.optimizers import Adam
from keras.models import *
# Load Dataset
df_text = pd.read_csv("J:\\__DataSets\\__Samples\\Test\\data100\\text100.csv")
df_results = pd.read_csv("J:\\__DataSets\\__Samples\\Test\\data100\\results100.csv")
df = pd.merge(df_text,df_results, on="ID")
#Prepare multi-label
Labels = []
for i in df['Code']:
    Labels.append(i.split(","))
df['Labels'] = Labels
multilabel_binarizer = MultiLabelBinarizer()
multilabel_binarizer.fit(df['Labels'])
y = multilabel_binarizer.transform(df['Labels'])
X = df['Text'].values
#TF-IDF
tfidf_vectorizer = TfidfVectorizer(max_df=0.8, max_features=1000)
xtrain, xval, ytrain, yval = train_test_split(X, y, test_size=0.2, random_state=9)
tfidf_vectorizer = TfidfVectorizer(max_df=0.8, max_features=1000)
# create TF-IDF features
X_train_count = tfidf_vectorizer.fit_transform(xtrain)
X_test_count = tfidf_vectorizer.transform(xval)
#Prepare Model
input_dim = X_train_count.shape[1] # Number of features
output_dim=len(df['Labels'].explode().unique())
sequence_length = input_dim
vocabulary_size = X_train_count.shape[0]
embedding_dim = output_dim
filter_sizes = [3,4,5]
num_filters = 512
drop = 0.5
epochs = 100
batch_size = 30
#CNN Model
inputs = Input(shape=(sequence_length,), dtype='int32')
embedding = Embedding(input_dim=vocabulary_size, output_dim=embedding_dim, input_length=sequence_length)(inputs)
reshape = Reshape((sequence_length,embedding_dim,1))(embedding)
conv_0 = Conv2D(num_filters, kernel_size=(filter_sizes[0], embedding_dim), padding='valid', kernel_initializer='normal', activation='relu')(reshape)
conv_1 = Conv2D(num_filters, kernel_size=(filter_sizes[1], embedding_dim), padding='valid', kernel_initializer='normal', activation='relu')(reshape)
conv_2 = Conv2D(num_filters, kernel_size=(filter_sizes[2], embedding_dim), padding='valid', kernel_initializer='normal', activation='relu')(reshape)
maxpool_0 = MaxPool2D(pool_size=(sequence_length - filter_sizes[0] + 1, 1), strides=(1,1), padding='valid')(conv_0)
maxpool_1 = MaxPool2D(pool_size=(sequence_length - filter_sizes[1] + 1, 1), strides=(1,1), padding='valid')(conv_1)
maxpool_2 = MaxPool2D(pool_size=(sequence_length - filter_sizes[2] + 1, 1), strides=(1,1), padding='valid')(conv_2)
concatenated_tensor = Concatenate(axis=1)([maxpool_0, maxpool_1, maxpool_2])
flatten = Flatten()(concatenated_tensor)
dropout = Dropout(drop)(flatten)
output = Dense(units=2, activation='softmax')(dropout)
# this creates a model that includes
model = Model(inputs=inputs, outputs=output)
#Compile
checkpoint = ModelCheckpoint('weights.{epoch:03d}-{val_acc:.4f}.hdf5', monitor='val_acc', verbose=1, save_best_only=True, mode='auto')
adam = Adam(lr=1e-4, beta_1=0.9, beta_2=0.999, epsilon=1e-08, decay=0.0)
model.compile(optimizer=adam, loss='binary_crossentropy', metrics=['accuracy'])
print("Traning Model...")
model.summary()
#Fit
model.fit(X_train_count, ytrain, batch_size=batch_size, epochs=epochs, verbose=1, callbacks=[checkpoint], validation_data=(X_test_count, yval)) # starts training
#Accuracy
loss, accuracy = model.evaluate(X_train_count, ytrain, verbose=False)
print("Training Accuracy: {:.4f}".format(accuracy))
loss, accuracy = model.evaluate(X_test_count, yval, verbose=False)
print("Testing Accuracy: {:.4f}".format(accuracy))
A sample of my dataset:
text100.csv
ID Text
1 Allergies to Drugs Attending:[**First Name3 (LF) 1**] Chief Complaint: headache and neck stiffne
2 Complaint: fever, chills, rigors Major Surgical or Invasive Procedure: Arterial l
3 Complaint: Febrile, unresponsive--> GBS meningitis and bacteremia Major Surgi
4 Allergies to Drugs Attending:[**First Name3 (LF) 45**] Chief Complaint: PEA arrest . Major Sur
5 Admitted to an outside hospital with chest pain and ruled in for myocardial infarction. She was tr
6 Known Allergies to Drugs Attending:[**First Name3 (LF) 78**] Chief Complaint: Progressive lethargy
7 Complaint: hypernatremia, unresponsiveness Major Surgical or Invasive Procedure: PEG/tra
8 Chief Complaint: cough, SOB Major Surgical or Invasive Procedure: RIJ placed Hemod
Results100.csv
ID Code
1 A32,D50,G00,I50,I82,K51,M85,R09,R18,T82,Z51
2 418,475,905,921,A41,C50,D70,E86,F32,F41,J18,R11,R50,Z00,Z51,Z93,Z95
3 136,304,320,418,475,921,998,A40,B37,G00,G35,I10,J15,J38,J69,L27,L89,T81,T85
4 D64,D69,E87,I10,I44,N17
5 E11,I10,I21,I25,I47
6 905,C61,C91,E87,G91,I60,M47,M79,R50,S43
7 304,320,355,E11,E86,E87,F06,I10,I50,I63,I69,J15,J69,L89,L97,M81,N17,Z91
I don't have anything concrete to add at the moment, but I found the following two debugging strategies useful:
Distill your bugs into different sections. For example, which errors are related to compiling the model and which to training? There could also be errors raised before the model is even built. For the errors you showed, where were they first raised? It's hard to tell without line numbers and a traceback.
This step is useful because later errors are sometimes manifestations of earlier ones, so 50 errors might really be just 1-2 errors at an earlier stage.
For a good library, the error messages are typically helpful. Have you tried what the error messages suggest, and how did that go?
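For instance, the "indices ... is out of order" error often comes from feeding the scipy sparse matrices returned by TfidfVectorizer straight into model.fit. A minimal sketch of the usual workaround, assuming that is indeed the cause (fine for a 100-record sample):

# densify the TF-IDF features before training; for small data this is the
# simplest way to avoid the sparse-tensor "indices out of order" error
X_train_dense = X_train_count.toarray()   # scipy.sparse matrix -> numpy array
X_test_dense = X_test_count.toarray()

# then fit on the dense arrays instead:
# model.fit(X_train_dense, ytrain, ..., validation_data=(X_test_dense, yval))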
I want to get the activation values for a given input of a trained LSTM network, specifically the values for the cell, the input gate, the output gate and the forget gate. According to this Keras issue and this Stackoverflow question I'm able to get some activation values with the following code:
(basically I'm trying to classify 1-dimensional timeseries using one label per timeseries, but that doesn't really matter for this general question)
import random
from pprint import pprint

import keras.backend as K
import numpy as np
from keras.layers import Dense
from keras.layers.recurrent import LSTM
from keras.models import Sequential
from keras.utils import to_categorical

def getOutputLayer(layerNumber, model, X):
    return K.function([model.layers[0].input],
                      [model.layers[layerNumber].output])([X])

model = Sequential()
model.add(LSTM(10, batch_input_shape=(1, 1, 1), stateful=True))
model.add(Dense(2, activation='softmax'))
model.compile(
    loss='categorical_crossentropy', metrics=['accuracy'], optimizer='adam')

# generate some test data
for i in range(10):
    # generate a random timeseries of 100 numbers
    X = np.random.rand(10)
    X = X.reshape(10, 1, 1)

    # generate a random label for the whole timeseries between 0 and 1
    y = to_categorical([random.randint(0, 1)] * 10, num_classes=2)

    # train the lstm for this one timeseries
    model.fit(X, y, epochs=1, batch_size=1, verbose=0)
    model.reset_states()

# to keep the output simple use only 5 steps for the input of the timeseries
X_test = np.random.rand(5)
X_test = X_test.reshape(5, 1, 1)

# get the activations for the output lstm layer
pprint(getOutputLayer(0, model, X_test))
Using that I get the following activation values for the LSTM layer:
[array([[-0.04106992, -0.00327154, -0.01524276, 0.0055838 , 0.00969929,
-0.01438944, 0.00211149, -0.04286387, -0.01102304, 0.0113989 ],
[-0.05771339, -0.00425535, -0.02032563, 0.00751972, 0.01377549,
-0.02027745, 0.00268653, -0.06011265, -0.01602218, 0.01571197],
[-0.03069103, -0.00267129, -0.01183739, 0.00434298, 0.00710012,
-0.01082268, 0.00175544, -0.0318702 , -0.00820942, 0.00871707],
[-0.02062054, -0.00209525, -0.00834482, 0.00310852, 0.0045242 ,
-0.00741894, 0.00141046, -0.02104726, -0.0056723 , 0.00611038],
[-0.05246543, -0.0039417 , -0.01877101, 0.00691551, 0.01250046,
-0.01839472, 0.00250443, -0.05472757, -0.01437504, 0.01434854]],
dtype=float32)]
So I get 10 values for each input value, because I specified an LSTM with 10 units in the Keras model. But which one is the cell, which is the input gate, which is the output gate, and which is the forget gate?
Well, these are the output values; to get and inspect the value of each gate, look into this issue.
I'll paste the essential part here:
for i in range(epochs):
    print('Epoch', i, '/', epochs)
    model.fit(cos,
              expected_output,
              batch_size=batch_size,
              verbose=1,
              nb_epoch=1,
              shuffle=False)

    for layer in model.layers:
        if 'LSTM' in str(layer):
            print('states[0] = {}'.format(K.get_value(layer.states[0])))
            print('states[1] = {}'.format(K.get_value(layer.states[1])))

            print('Input')
            print('b_i = {}'.format(K.get_value(layer.b_i)))
            print('W_i = {}'.format(K.get_value(layer.W_i)))
            print('U_i = {}'.format(K.get_value(layer.U_i)))

            print('Forget')
            print('b_f = {}'.format(K.get_value(layer.b_f)))
            print('W_f = {}'.format(K.get_value(layer.W_f)))
            print('U_f = {}'.format(K.get_value(layer.U_f)))

            print('Cell')
            print('b_c = {}'.format(K.get_value(layer.b_c)))
            print('W_c = {}'.format(K.get_value(layer.W_c)))
            print('U_c = {}'.format(K.get_value(layer.U_c)))

            print('Output')
            print('b_o = {}'.format(K.get_value(layer.b_o)))
            print('W_o = {}'.format(K.get_value(layer.W_o)))
            print('U_o = {}'.format(K.get_value(layer.U_o)))

    # output of the first batch element after the first fit()
    first_batch_element = np.expand_dims(cos[0], axis=1)  # (1, 1) to (1, 1, 1)
    print('output = {}'.format(get_LSTM_output([first_batch_element])[0].flatten()))

    model.reset_states()

print('Predicting')
predicted_output = model.predict(cos, batch_size=batch_size)

print('Ploting Results')
plt.subplot(2, 1, 1)
plt.plot(expected_output)
plt.title('Expected')
plt.subplot(2, 1, 2)
plt.plot(predicted_output)
plt.title('Predicted')
plt.show()
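Note that this snippet uses the old Keras 1.x attribute names (W_i, U_i, b_i, and so on). In recent Keras versions the gate parameters are packed into single tensors instead; here is a rough sketch of pulling them apart, assuming the standard Keras gate ordering (input, forget, cell, output) along the last axis:

import numpy as np

lstm_layer = model.layers[0]   # the stateful LSTM(10) layer from the model above
units = lstm_layer.units       # 10 in this example

# get_weights() returns [kernel, recurrent_kernel, bias] for an LSTM layer
kernel, recurrent_kernel, bias = lstm_layer.get_weights()

# each tensor stacks the four gates along its last axis in the order i, f, c, o
W_i, W_f, W_c, W_o = np.split(kernel, 4, axis=1)
U_i, U_f, U_c, U_o = np.split(recurrent_kernel, 4, axis=1)
b_i, b_f, b_c, b_o = np.split(bias, 4)

print('input gate kernel shape:', W_i.shape)   # (input_dim, units)

Keep in mind these are the learned weights of each gate, not their per-timestep activations.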
I'm trying to use Conv1D for the first time for multiclass classification of time series data and my model keeps throwing this error when I use it.
import numpy as np
import os
import keras
from keras.models import Sequential
from keras.layers import Conv1D, Dense, TimeDistributed, MaxPooling1D, Flatten
# fix random seed for reproducibility
np.random.seed(7)
dataset1 = np.genfromtxt(os.path.join('data', 'norm_cellcycle_384_17.txt'), delimiter=',', dtype=None)
data = dataset1[1:]
# extract columns
genes = data[:,0]
y_all = data[:,1].astype(int)
x_all = data[:,2:-1].astype(float)
# deleted this line when using sparse_categorical_crossentropy
# 384x6
y_all = keras.utils.to_categorical(y_all)
# 5
num_classes = np.unique(y_all).shape[0]
# split entire data into train set and test set
validation_split = 0.2
val_idx = np.random.choice(range(x_all.shape[0]), int(validation_split*x_all.shape[0]), replace=False)
train_idx = [x for x in range(x_all.shape[0]) if x not in val_idx]
x_train = x_all[train_idx]
y_train = y_all[train_idx]
# 308x17x1
x_train = x_train[:, :, np.newaxis]
# 308x1
y_train = y_train[:,np.newaxis]
x_test = x_all[val_idx]
y_test = y_all[val_idx]
# deleted this line when using sparse_categorical_crossentropy
y_test = keras.utils.to_categorical(y_test)
# 76x17x1
x_test = x_test[:, :, np.newaxis]
# 76x1
y_test = y_test[:,np.newaxis]
print(x_train.shape[0],'train samples')
print(x_test.shape[0],'test samples')
# Create Model
# number of filters for 1D conv
nb_filter = 4
filter_length = 5
window = x_train.shape[1]
model = Sequential()
model.add(Conv1D(filters=nb_filter,kernel_size=filter_length,activation="relu", input_shape=(window,1)))
model.add(MaxPooling1D())
model.add(Conv1D(nb_filter=nb_filter, filter_length=filter_length, activation='relu'))
model.add(MaxPooling1D())
model.add(Flatten())
model.add(Dense(num_classes, activation='softmax'))
model.summary()
model.compile(loss='sparse_categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
model.fit(x_train, y_train, epochs=25, batch_size=2, validation_data=(x_test, y_test))
I don't know why I get this error. When I use binary_crossentropy loss and no one-hot encoding for y_all, my model works, but it fails when I use one-hot encoding for y_all with categorical_crossentropy loss. When I don't use one-hot encoding, Keras throws an error telling me to change y_all to a binary matrix.
I don't even know where the (1, 6) in the array shape is coming from.
ValueError: Error when checking model target: expected dense_1 to have 2 dimensions, but got array with shape (308, 1, 6)
Please help! I've been stuck on this for many hours! I've already gone through all the related questions, but it still doesn't make sense.
Update: I now use sparse_categorical_crossentropy because it has integer support. I deleted the to_categorical lines from the above code, and now I get this new error:
InvalidArgumentError (see above for traceback): Received a label value
of 5 which is outside the valid range of [0, 5). Label values: 2 5
[[Node:
SparseSoftmaxCrossEntropyWithLogits/SparseSoftmaxCrossEntropyWithLogits
= SparseSoftmaxCrossEntropyWithLogits[T=DT_FLOAT, Tlabels=DT_INT64, _device="/job:localhost/replica:0/task:0/cpu:0"](Reshape_1, Cast)]]
Requested sample of data:
,Main,Gp,c1,c2,c3,c4,c5,c6,c7,c8,c9,c10,c11,c12,c13,c14,c15,c16,c17
YDL179w,1,-0.75808,-0.90319,-0.98935,-0.73995,-0.67193,-0.12777,-0.95307,-1.01656,0.79730,2.11688,1.98537,0.61591,0.56603,-0.13684,-0.52228,-0.05068,0.78823,
YLR079w,1,-0.48845,-0.70828,-0.47688,-0.65814,-0.45374,-0.47302,-0.71214,-1.02839,0.24048,3.11376,1.28952,0.44874,0.04379,-0.31104,-0.30332,-0.34575,0.82285,
YER111c,1,-0.42218,0.23887,1.84427,-0.02083,-0.61105,-0.65827,-0.79992,-0.39857,-0.09166,2.03314,1.58457,0.68744,0.14443,-0.72910,-1.46097,-0.82353,-0.51662,
YBR200w,1,0.09824,0.55258,-0.89641,-1.19111,-1.11744,-0.76133,0.09824,2.16120,1.46126,1.03148,0.67537,-0.33155,-0.60170,-1.39987,-0.42978,-0.15963,0.81045,
YPL209c,2,-0.65282,-0.32055,2.53702,2.00538,0.60982,0.51014,-0.55314,-1.01832,-0.78573,0.01173,0.07818,-0.05473,-0.22087,0.24432,-0.28732,-1.11801,-0.98510,
YJL074c,2,-0.81087,-0.19448,1.72941,0.59002,-0.53069,-0.25051,-0.92294,-0.92294,-0.53069,0.08570,1.87884,1.97223,0.45927,-0.36258,-0.34390,-1.07237,-0.77351,
YNL233w,2,-0.43997,0.66325,2.85098,0.74739,-0.42127,-0.47736,-0.79524,-0.80459,-0.48671,-0.21558,1.25226,1.01852,-0.10339,-0.56151,-0.96353,-0.46801,-0.79524,
YLR313c,2,-0.46611,0.42952,3.01689,1.13856,0.01902,-0.44123,-0.66514,-0.98856,-0.59050,-0.47855,0.84002,0.39220,0.50416,-0.50342,-0.82685,-0.64026,-0.73977,
YGR041w,2,-0.57187,-0.26687,1.10561,-0.38125,-0.68624,-0.26687,-0.87687,-1.18186,-0.80062,0.60999,2.09686,1.82998,1.14374,0.11437,-0.80062,-0.87687,-0.19062,
So I noticed that even though I know there are 5 classes in this dataset (as seen from the unique values of y_all), for some reason Keras' to_categorical thinks there are 6 classes.
# 384x6
y_all = keras.utils.to_categorical(y_all)
# 5
num_classes = np.unique(y_all).shape[0]
I don't know why that is. Keeping this in mind, I changed this line of code, and my model began to run:
model.add(Dense(num_classes, activation='softmax'))
to
model.add(Dense(num_classes+1, activation='softmax'))
I still don't know why to_categorical behaves this way. Anyone know?
to_categorical(x) in Keras will encode the given labels into n classes, where n = max(x) + 1; i.e. it assumes the classes run from 0 up to max(x).
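For example, assuming your Gp labels really are 1-based (1..5, with no 0), here is a quick sketch of the off-by-one effect and the usual fix (shifting the labels to start at 0 before encoding):

import numpy as np
from keras.utils import to_categorical

y = np.array([1, 2, 3, 4, 5])      # labels as in the Gp column: 1-based, no 0
print(to_categorical(y).shape)     # (5, 6) - column 0 exists but is never used

# shift the labels to 0..4 so the encoding matches num_classes = 5
print(to_categorical(y - 1).shape) # (5, 5)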
I am trying to figure out how to set up a process for forecasting some value. Currently, I can't understand what the issue is in the code below:
in_neurons = 1
out_neurons = 1
hidden_neurons = 20
nb_features = 9
# retrieve data
y_train = train.pop(target).values
X_train = pd.concat([train[['QTR_HR_START', 'QTR_HR_END', 'HOLIDAY_RANK_', 'SPECIAL_EVENT_RANK_',
'IS_AM', 'IS_TOP_RANKED', 'AWARDS_WINS_ANY', 'YEARS_SINCE_RELEASE']],
pd.DataFrame({'DATETIME': pd.DatetimeIndex(train['DATETIME']).astype(np.int64)})])
X_train = X_train.values
y_test = test.pop(target).values
X_test = pd.concat([test[['QTR_HR_START', 'QTR_HR_END', 'HOLIDAY_RANK_', 'SPECIAL_EVENT_RANK_',
'IS_AM', 'IS_TOP_RANKED', 'AWARDS_WINS_ANY', 'YEARS_SINCE_RELEASE']],
pd.DataFrame({'DATETIME': pd.DatetimeIndex(test['DATETIME']).astype(np.int64)})])
X_test = X_test.values
model = Sequential()
model.add(TimeDistributed(Dense(8, input_shape=(X_train.shape[0], 100, nb_features), activation='softmax')))
model.add(LSTM(4, dropout_W=0.2, dropout_U=0.2))
model.add(Dense(1))
model.add(Activation("sigmoid"))
model.compile(loss="mean_squared_error", optimizer="rmsprop", metrics=['accuracy'])
After running the code, I got an exception:
raise Exception('The first layer in a Sequential model must '
Exception: The first layer in a Sequential model must get an input_shape or batch_input_shape argument.
Please advise where I am wrong.
EDIT 1: I just configured the model as mentioned in the official documentation - http://keras.io/layers/recurrent/
model.add(LSTM(32, input_dim=nb_features, input_length=100))
model.compile(loss="mean_squared_error", optimizer="rmsprop", metrics=['accuracy'])
Exception: Error when checking model input: expected lstm_input_1 to have 3 dimensions, but got array with shape (48614, 9)
It's old, but I'm posting this for future use. As the error states, Keras requires 3D input data: [samples, time steps, features]. Even though your array has shape (48614, 9), Keras sees it as 2D - [samples, features]. To fix it, do something like this:
import numpy

def reshape_dataset(train):
    # insert a time-step axis: (samples, features) -> (samples, 1, features)
    trainX = numpy.reshape(train, (train.shape[0], 1, train.shape[1]))
    return numpy.array(trainX)

x = reshape_dataset(your_dataset)  # your_dataset has shape (48614, 9)
Now x has shape (48614, 1, 9), which is [samples, time steps, features] - 3D.
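A quick sanity check on dummy data of the same shape (purely illustrative):

dummy = numpy.random.rand(48614, 9)
print(reshape_dataset(dummy).shape)   # (48614, 1, 9)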