Assume I have two features, x1 and x2, where x1 is a vector of word indices and x2 is a vector of numerical values. Both x1 and x2 have length 50, and there are 6000 rows for each. I combine the two into one array like so:
X = np.array([np.row_stack((x1[i], x2[i])) for i in range(x1.shape[0])])
My initial LSTM model is
X_input = Input(shape = (50, 2), name = "X_seq")
X_hidden1 = LSTM(units = 256, dropout = 0.25, return_sequences = True)(X_input)
X_hidden2 = LSTM(units = 256, dropout = 0.25, return_sequences = True)(X_hidden1)
X_hidden3 = LSTM(units = 128, dropout = 0.25)(X_hidden2)
X_dense = Dense(units = 128, activation = 'relu')(X_hidden3)
X_dense_dropout = Dropout(0.25)(X_dense)
concat = tf.keras.layers.concatenate(inputs = [X_dense_dropout])
output = Dense(units = num_category, activation = 'softmax', name = "output")(concat)
model = tf.keras.Model(inputs = [X_input], outputs = [output])
model.compile(optimizer = 'adam', loss = "sparse_categorical_crossentropy", metrics = ["accuracy"])
However, I know I need an embedding layer right below the Input layer to take care of X[0,:]. Thus, I modified the code above to:
X_input = Input(shape = (50, 2), name = "X_seq")
x1_embedding = Embedding(input_dim = max_pages, output_dim = embedding_dim, input_length = max_length)(X_input[0,:])
X_concat = tf.keras.layers.concatenate(inputs = [x1_embedding, X_input[1,:]])
X_hidden1 = LSTM(units = 256, dropout = 0.25, return_sequences = True)(X_concat)
X_hidden2 = LSTM(units = 256, dropout = 0.25, return_sequences = True)(X_hidden1)
X_hidden3 = LSTM(units = 128, dropout = 0.25)(X_hidden2)
X_dense = Dense(units = 128, activation = 'relu')(X_hidden3)
X_dense_dropout = Dropout(0.25)(X_dense)
concat = tf.keras.layers.concatenate(inputs = [X_dense_dropout])
output = Dense(units = num_category, activation = 'softmax', name = "output")(concat)
model = tf.keras.Model(inputs = [X_input], outputs = [output])
model.compile(optimizer = 'adam', loss = "sparse_categorical_crossentropy", metrics = ["accuracy"])
Python shows an error
ValueError: A `Concatenate` layer requires inputs with matching shapes except for the concat axis. Got inputs shapes: [(None, 2, 15), (None, 2)]
Any suggestions? Many thanks.
The problem is that the inputs to the concat layer have different ranks, and hence we can't concatenate them. To overcome this issue we can just reshape the input to the concat layer using tf.keras.layers.Reshape as below, and the rest stays the same.
reshaped_input = tf.keras.layers.Reshape((-1,1))(X_input[:, 1])
X_concat = tf.keras.layers.concatenate(inputs = [x1_embedding, reshaped_input])
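Putting the pieces together, a minimal sketch of the corrected model could look like the following. It assumes the input array is arranged as (samples, 50, 2), e.g. X = np.stack([x1, x2], axis=-1), so that channel 0 holds the word indices and channel 1 the numerical values, and that max_pages, embedding_dim, and num_category are defined as in the question; the single-input concatenate from the original, which is a no-op, is dropped here.
import tensorflow as tf
from tensorflow.keras.layers import Input, Embedding, LSTM, Dense, Dropout, Reshape

X_input = Input(shape = (50, 2), name = "X_seq")

# Embed the word-index channel; Embedding casts the float slice to integers internally.
x1_embedding = Embedding(input_dim = max_pages, output_dim = embedding_dim)(X_input[:, :, 0])
# Reshape the numerical channel from (None, 50) to (None, 50, 1) so the ranks match.
x2_reshaped = Reshape((50, 1))(X_input[:, :, 1])

X_concat = tf.keras.layers.concatenate(inputs = [x1_embedding, x2_reshaped])
X_hidden1 = LSTM(units = 256, dropout = 0.25, return_sequences = True)(X_concat)
X_hidden2 = LSTM(units = 256, dropout = 0.25, return_sequences = True)(X_hidden1)
X_hidden3 = LSTM(units = 128, dropout = 0.25)(X_hidden2)
X_dense = Dense(units = 128, activation = 'relu')(X_hidden3)
X_dense_dropout = Dropout(0.25)(X_dense)
output = Dense(units = num_category, activation = 'softmax', name = "output")(X_dense_dropout)

model = tf.keras.Model(inputs = [X_input], outputs = [output])
model.compile(optimizer = 'adam', loss = "sparse_categorical_crossentropy", metrics = ["accuracy"])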
Related
I am trying to build a normal classification model on a tabular dataset. I came across the Attention layer and would like to use it to improve my model's accuracy.
import tensorflow as tf
from tensorflow.keras import Sequential
from tensorflow.keras.losses import binary_crossentropy
from tensorflow.keras.optimizers import Adam

input_features_size = X_train.shape[1]
layers = [
    tf.keras.Input(shape = (input_features_size,)),
    tf.keras.layers.Dense(64, activation = 'relu', name = 'first_layer'),
    tf.keras.layers.Dense(128, activation = 'relu', name = 'second_layer'),
    tf.keras.layers.BatchNormalization(axis = 1),
    tf.keras.layers.Dense(1, activation = 'sigmoid', name = 'output_layer')
]
metrics = [
    tf.keras.metrics.BinaryAccuracy(name = 'accuracy'),
    tf.keras.metrics.Precision(name = 'precision'),
    tf.keras.metrics.Recall(name = 'recall')
]
NUM_EPOCHS = 20
deep_learning_model = Sequential(layers = layers, name = 'DL_Classifier')
deep_learning_model.compile(
    loss = binary_crossentropy,
    optimizer = Adam(learning_rate = 1e-4),
    metrics = metrics
)
I tried adding an Attention layer (tf.keras.layers.Attention()) to the layers list, but I am making some mistake. I get this error: Attention layer must be called on a list of inputs, namely [query, value]
How to add an Attention layer?
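One way to satisfy that call signature, sketched with the functional API rather than a Sequential list: tf.keras.layers.Attention must be called on [query, value], and both must be 3-D tensors of shape (batch, steps, dim), so the 2-D Dense output is reshaped first. The reshape to (8, 8) is an illustrative assumption, not a requirement.
import tensorflow as tf

inputs = tf.keras.Input(shape = (input_features_size,))
x = tf.keras.layers.Dense(64, activation = 'relu', name = 'first_layer')(inputs)
# Attention expects 3-D tensors, so turn (None, 64) into (None, 8, 8)
x = tf.keras.layers.Reshape((8, 8))(x)
x = tf.keras.layers.Attention()([x, x])  # self-attention: query = value
x = tf.keras.layers.Flatten()(x)
x = tf.keras.layers.Dense(128, activation = 'relu', name = 'second_layer')(x)
outputs = tf.keras.layers.Dense(1, activation = 'sigmoid', name = 'output_layer')(x)

model = tf.keras.Model(inputs, outputs)
model.compile(loss = tf.keras.losses.binary_crossentropy,
              optimizer = tf.keras.optimizers.Adam(learning_rate = 1e-4),
              metrics = ['accuracy'])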
I am trying to online-train a neural network. I want to use the TensorFlow Keras train_on_batch function on a convolutional neural network. Here it is:
look_back=1600
inputTensor = keras.layers.Input([look_back+3,2])
inputTensorReshaped = tf.reshape(inputTensor, [1, look_back + 3, 2, 1])
#split into 2 groups
inputgroup1 = keras.layers.Lambda(lambda x: x[:, :3], output_shape=((1, 3, 2, 1)))(inputTensorReshaped)
inputgroup2 = keras.layers.Lambda(lambda x: x[:, 3:look_back + 3], output_shape=((1, look_back,2, 1)))(inputTensorReshaped)
conv1 = keras.layers.Conv2D(filters=1024, kernel_size=(10, 2), activation='relu')(inputgroup2)#10
pool1 = keras.layers.MaxPooling2D(pool_size=(2, 1))(conv1)
dropout1 = keras.layers.Dropout(rate=0.1)(pool1)
norm1 = keras.layers.LayerNormalization()(dropout1)
conv2 = keras.layers.Conv2D(filters=512, kernel_size=(8, 1), activation='relu')(norm1)
pool2 = keras.layers.MaxPooling2D(pool_size=(2, 1))(conv2)
dropout2 = keras.layers.Dropout(rate=0.1)(pool2)
norm2 = keras.layers.LayerNormalization()(dropout2)
conv3 = keras.layers.Conv2D(filters=256, kernel_size=(6, 1), activation='relu')(norm2)
pool3 = keras.layers.MaxPooling2D(pool_size=(2, 1))(conv3)
dropout3 = keras.layers.Dropout(rate=0.1)(pool3)
norm3 = keras.layers.LayerNormalization()(dropout3)
conv4 = keras.layers.Conv2D(filters=128, kernel_size=(4, 1), activation='relu')(norm3)
pool4 = keras.layers.MaxPooling2D(pool_size=(2, 1))(conv4)
dropout4 = keras.layers.Dropout(rate=0.1)(pool4)
norm4 = keras.layers.LayerNormalization()(dropout4)
conv5 = keras.layers.Conv2D(filters=64, kernel_size=(2, 1), activation='relu')(norm4)
pool5 = keras.layers.MaxPooling2D(pool_size=(2, 1))(conv5)
dropout5 = keras.layers.Dropout(rate=0.1)(pool5)
norm5 = keras.layers.LayerNormalization()(dropout5)
flatten1 = keras.layers.Flatten()(norm5)
dense1 = keras.layers.Dense(32, activation='relu')(flatten1)
misclayer1 = keras.layers.Dense(32, activation='relu')(inputgroup1)
miscdropout1 = keras.layers.Dropout(rate=0.1)(misclayer1)
miscnorm1 = keras.layers.LayerNormalization()(miscdropout1)
misclayer2 = keras.layers.Dense(128, activation='relu')(miscnorm1)
miscdropout2 = keras.layers.Dropout(rate=0.1)(misclayer2)
miscnorm2 = keras.layers.LayerNormalization()(miscdropout2)
misclayer3 = keras.layers.Dense(32, activation='relu')(miscnorm2)
miscdropout3 = keras.layers.Dropout(rate=0.1)(misclayer3)
miscnorm3 = keras.layers.LayerNormalization()(miscdropout3)
miscflatten1 = keras.layers.Flatten()(miscnorm3)
misclayer4 = keras.layers.Dense(32, activation='relu')(miscflatten1)
rejoinlayer = keras.layers.Concatenate()([dense1, misclayer4])
processing1 = keras.layers.Dense(64, activation='relu')(rejoinlayer)
totalnorm1 = keras.layers.LayerNormalization()(processing1)
processing2 = keras.layers.Dense(32, activation='relu')(totalnorm1)
totaldropout1 = keras.layers.Dropout(rate=0.2)(processing2)
processing3 = keras.layers.Dense(16, activation='relu')(totaldropout1)
totalnorm2 = keras.layers.LayerNormalization()(processing3)
processing4 = keras.layers.Dense(8, activation='relu')(totalnorm2)
totaldropout2 = keras.layers.Dropout(rate=0.2)(processing4)
processing5 = keras.layers.Dense(4, activation='relu')(totaldropout2)
output = keras.layers.Dense(1, activation='linear')(processing5)
model = keras.Model(inputTensor,output)
model.compile(optimizer=keras.optimizers.SGD(learning_rate=0.00005, momentum=0.1, nesterov=True), loss="mean_squared_error")
#trains the model with the 1st state, action, and value
def train():
    global qtable
    x = []
    y = []
    for i in range(0, 8):
        state = qtable.loc[qtable.index[i], "state"]
        action = [qtable.loc[qtable.index[i], "action"], qtable.loc[qtable.index[0], "action"]]
        x.append([action])
        x[i].extend(state)
        y.append([qtable.loc[qtable.index[i], "value"]])
    print("training...loss:")
    with tf.device('/gpu:0'):
        print(model.train_on_batch(np.nan_to_num(np.array(x)), np.nan_to_num(np.array(y))))
In this case the variable "state" would be a 1202-by-2 list [[a,b],[c,d],[e,f],...] and the variable "action" would be a 1-by-2 list [a,b] before being appended/extended to x. In theory, the training I want is a batch size of 8 with a 1203-by-2 input shape. However, I get this error:
ValueError: Cannot reshape a tensor with 19248 elements to shape [1,1203,2,1] (2406 elements) for '{{node model/tf.reshape/Reshape}} = Reshape[T=DT_FLOAT, Tshape=DT_INT32](IteratorGetNext, model/tf.reshape/Reshape/shape)' with input shapes: [8,1203,2], [4] and with input tensors computed as partial shapes: input[1] = [1,1203,2,1].
This shows that all the inputs and outputs are being put into the CNN at once, which is not what I want. Instead, I want the data to be in a batch of 8. How can I do this? Am I even using train_on_batch correctly?
From the Keras fit documentation:
batch_size: Integer or None. Number of samples per batch of computation. If unspecified, batch_size will default to 32. Do not specify the batch_size if your data is in the form of a dataset, generators, or keras.utils.Sequence instances (since they generate batches).
Find below an example with batch_size:
num_classes = 5
model = Sequential([
    layers.experimental.preprocessing.Rescaling(1./255, input_shape=(img_height, img_width, 3)),
    layers.Conv2D(16, 3, padding='same', activation='relu'),
    layers.MaxPooling2D(),
    layers.Conv2D(32, 3, padding='same', activation='relu'),
    layers.MaxPooling2D(),
    layers.Conv2D(64, 3, padding='same', activation='relu'),
    layers.MaxPooling2D(),
    layers.Flatten(),
    layers.Dense(128, activation='relu'),
    layers.Dense(num_classes)
])
model.compile(optimizer='adam',
              loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
              metrics=['accuracy'])
history = model.fit(
    train_ds,
    validation_data=val_ds,
    batch_size=32,
    epochs=10
)
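As for train_on_batch itself: it performs a single gradient update on exactly the arrays you pass it, so the batch size is simply the first dimension of those arrays. A minimal hypothetical sketch (the zero arrays are placeholders, and it assumes the model accepts a variable batch dimension rather than the hard-coded reshape to batch size 1):
import numpy as np

x_batch = np.zeros((8, 1203, 2))  # hypothetical batch: 8 samples, each 1203-by-2
y_batch = np.zeros((8, 1))        # hypothetical targets, one value per sample

loss = model.train_on_batch(x_batch, y_batch)  # one update on exactly these 8 samples
print(loss)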
I want to replace the Dense_out layer with a convolutional one; can anybody tell me how to do it?
code:
model = Sequential()
conv_1 = Conv2D(filters = 32,kernel_size=(3,3),activation='relu')
model.add(conv_1)
conv_2 = Conv2D(filters=64,kernel_size=(3,3),activation='relu')
model.add(conv_2)
pool = MaxPool2D(pool_size = (2,2),strides = (2,2), padding = 'same')
model.add(pool)
drop = Dropout(0.5)
model.add(drop)
model.add(Flatten())
Dense_1 = Dense(128,activation = 'relu')
model.add(Dense_1)
Dense_out = Dense(57,activation = 'softmax')
model.add(Dense_out)
model.compile(optimizer='Adam',loss='categorical_crossentropy',metrics=['accuracy'])
model.fit(train_image,train_label,epochs=10,verbose = 1,validation_data=(test_image,test_label))
print(model.summary())
when I'm trying this code :
model = Sequential()
conv_01 = Conv2D(filters = 32,kernel_size=(3,3),activation='relu')
model.add(conv_01)
conv_02 = Conv2D(filters=64,kernel_size=(3,3),activation='relu')
model.add(conv_02)
pool = MaxPool2D(pool_size = (2,2),strides = (2,2), padding = 'same')
model.add(pool)
conv_11 = Conv2D(filters=64,kernel_size=(3,3),activation='relu')
model.add(conv_11)
pool_2 = MaxPool2D(pool_size=(2,2),strides=(2,2),padding='same')
model.add(pool_2)
drop = Dropout(0.3)
model.add(drop)
model.add(Flatten())
Dense_1 = Dense(128,activation = 'relu')
model.add(Dense_1)
Dense_2 = Dense(64,activation = 'relu')
model.add(Dense_2)
conv_out = Conv2D(filters= 64,kernel_size=(3,3),activation='relu')
model.add(conv_out)
model.compile(optimizer='Adam',loss='categorical_crossentropy',metrics=['accuracy'])
model.fit(train_image,train_label,epochs=10,verbose = 1,validation_data=(test_image,test_label))
I get the following error
ValueError: Input 0 of layer conv2d_3 is incompatible with the layer:
expected ndim=4, found ndim=2. Full shape received: [None, 64]
I am new at this, so an explanation would greatly help.
After the Flatten and Dense layers the tensor is 2-D, so you will need to reshape it back to 4-D before you can apply a Conv2D layer.
You can use:
out = keras.layers.Reshape(target_shape)  # target_shape, e.g. (8, 8, 1), must match the number of incoming units
model.add(out)
and then do the convolution:
conv_out = Conv2D(filters=3,kernel_size=(3,3),activation='softmax')
model.add(conv_out)
with filters being the number of channels you want in your output layer (3 for RGB).
More info about the layers and parameters can be found in the Keras documentation.
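As a concrete illustration (the shapes here are hypothetical, not taken from the question), a Dense layer producing 64 units can be reshaped to (8, 8, 1) so that a Conv2D layer can follow it:
from tensorflow.keras import Sequential
from tensorflow.keras.layers import Dense, Reshape, Conv2D, Flatten

model = Sequential([
    Dense(64, activation='relu', input_shape=(128,)),
    Reshape((8, 8, 1)),  # 64 units = 8 * 8 * 1; tensor is rank 4 again including the batch axis
    Conv2D(filters=3, kernel_size=(3, 3), activation='relu'),
    Flatten(),
    Dense(57, activation='softmax')
])
model.summary()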
I am currently trying to implement a CNN whose purpose is to perform classification, but for some reason I am not able to set my output dimension to 1.
Here is an example code:
import keras
from keras.layers.merge import Concatenate
from keras.models import Model
from keras.layers import Input, Dense
from keras.layers import Dropout
from keras.layers.core import Dense, Activation, Lambda, Reshape,Flatten
from keras.layers import Conv2D, MaxPooling2D, Reshape, ZeroPadding2D
import numpy as np
train_data_1 = np.random.randint(100,size=(100,3,6,3))
train_data_2 = np.random.randint(100,size=(100,3,6,3))
test_data_1 = np.random.randint(100,size=(10,3,6,3))
test_data_2 = np.random.randint(100,size=(10,3,6,3))
labels_train_data =np.random.randint(145,size=100)
labels_test_data =np.random.randint(145,size=10)
input_img_1 = Input(shape=(3, 6, 3))
input_img_2 = Input(shape=(3, 6, 3))
conv2d_1_1 = Conv2D(filters = 32, kernel_size = (3,3) , padding = "same" , activation = 'relu' , name = "conv2d_1_1" )(input_img_1)
conv2d_2_1 = Conv2D(filters = 64, kernel_size = (3,3) , padding = "same" , activation = 'relu' )(conv2d_1_1)
conv2d_3_1 = Conv2D(filters = 64, kernel_size = (3,3) , padding = "same" , activation = 'relu' )(conv2d_2_1)
conv2d_4_1 = Conv2D(filters = 32, kernel_size = (1,1) , padding = "same" , activation = 'relu' )(conv2d_3_1)
conv2d_4_1_flatten = Flatten()(conv2d_4_1)
conv2d_1_2 = Conv2D(filters = 32, kernel_size = (3,3) , padding = "same" , activation = 'relu' , name = "conv2d_1_2")(input_img_2)
conv2d_2_2 = Conv2D(filters = 64, kernel_size = (3,3) , padding = "same" , activation = 'relu' )(conv2d_1_2)
conv2d_3_2 = Conv2D(filters = 64, kernel_size = (3,3) , padding = "same" , activation = 'relu' )(conv2d_2_2)
conv2d_4_2 = Conv2D(filters = 32, kernel_size = (1,1) , padding = "same" , activation = 'relu' )(conv2d_3_2)
conv2d_4_2_flatten = Flatten()(conv2d_4_2)
merge = keras.layers.concatenate([conv2d_4_1_flatten, conv2d_4_2_flatten])
dense1 = Dense(100, activation = 'relu')(merge)
dense2 = Dense(50,activation = 'relu')(dense1)
dense3 = Dense(1 ,activation = 'softmax')(dense2)
model = Model(inputs = [input_img_1, input_img_2] , outputs = dense3)
model.compile(loss="sparse_categorical_crossentropy", optimizer="adam")
print model.summary()
labels_train = keras.utils.to_categorical(labels_train_data, num_classes=145)
labels_test = keras.utils.to_categorical(labels_test_data, num_classes=145)
hist_current = model.fit(x = [train_data_1, train_data_2],
                         y = labels_train,
                         shuffle=False,
                         validation_data=([test_data_1, test_data_2], labels_test),
                         validation_split=0.1,
                         epochs=150000,
                         batch_size = 15,
                         verbose=1)
And the error message being:
Traceback (most recent call last):
File "test_model.py", line 57, in <module>
verbose=1)
File "/usr/local/lib/python2.7/dist-packages/keras/engine/training.py", line 1405, in fit
batch_size=batch_size)
File "/usr/local/lib/python2.7/dist-packages/keras/engine/training.py", line 1299, in _standardize_user_data
exception_prefix='model target')
File "/usr/local/lib/python2.7/dist-packages/keras/engine/training.py", line 133, in _standardize_input_data
str(array.shape))
ValueError: Error when checking model target: expected dense_3 to have shape (None, 1) but got array with shape (100, 145)
There are several inconsistencies in your model:
dense3 = Dense(1, activation = 'softmax')(dense2): you cannot use a softmax on a single neuron. The softmax normalizes the output of the layer so that it sums to 1; if you normalize one value alone, it will always output 1. However, this is not why you get the error.
Just how many classes do you have? From your network, you output one value (the last layer is Dense(1)), so I would expect that you want to predict 2 classes (output 1 or 0). But here we see that your output is categorical with 145 possibilities: your labels_train array is 100 one-hot vectors of length 145, so I assume you want to classify the 100 samples into 145 different categories. This is why Keras is complaining: your network outputs shape (100, 1) while your targets (labels) have shape (100, 145). What do you really want to do?
Edit:
Following the comment: since you want to predict whether the image belongs to one of 145 classes, you will have to output 145 values. So you will have to change the top layers of your network so that your last layer is a Dense(145, activation='softmax'). I propose that you replace
dense1 = Dense(100, activation = 'relu')(merge)
dense2 = Dense(50,activation = 'relu')(dense1)
dense3 = Dense(1 ,activation = 'softmax')(dense2)
with
dense1 = Dense(200, activation = 'relu')(merge)
dense2 = Dense(150, activation = 'relu')(dense1)
dense3 = Dense(145, activation = 'softmax')(dense2)
That is, if you really want to have 3 dense layers; otherwise you can just remove the middle one. This will depend on your use case, so the architecture of the hidden layers is up to you. I'm just insisting that your last layer should be a Dense(145, activation='softmax').
Makes sense?
Edit 2:
On top of that, you shouldn't one-hot encode your targets (labels) when you use sparse_categorical_crossentropy; it is done automatically under the hood.
So either you use keras.utils.to_categorical on your targets with loss=categorical_crossentropy,
or you don't transform the targets with keras.utils.to_categorical and use loss=sparse_categorical_crossentropy.
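Concretely, a short sketch of the two consistent pairings, reusing the arrays from the question and assuming the last layer is Dense(145, activation='softmax') as proposed above:
# Option 1: integer labels + sparse_categorical_crossentropy (no to_categorical)
model.compile(loss='sparse_categorical_crossentropy', optimizer='adam')
model.fit(x=[train_data_1, train_data_2], y=labels_train_data, epochs=10, batch_size=15)

# Option 2: one-hot labels + categorical_crossentropy
labels_train = keras.utils.to_categorical(labels_train_data, num_classes=145)
model.compile(loss='categorical_crossentropy', optimizer='adam')
model.fit(x=[train_data_1, train_data_2], y=labels_train, epochs=10, batch_size=15)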
It's running on my machine.
I have been writing a simple autoencoder using tflearn.
net = tflearn.input_data(shape=[None, train.shape[1]])
net = tflearn.fully_connected(net, 500, activation='tanh', regularizer=None, name='fc_en_1')
# hidden state
net = tflearn.fully_connected(net, 100, activation='tanh', regularizer='L1', name='fc_en_2', weight_decay=0.0001)
net = tflearn.fully_connected(net, 500, activation='tanh', regularizer=None, name='fc_de_1')
net = tflearn.fully_connected(net, train.shape[1], activation='linear', name='fc_de_2')
net = tflearn.regression(net, optimizer='adam', learning_rate=0.01, loss='mean_square', metric='default')
model = tflearn.DNN(net)
The model trains well, but after training I want to use the encoder and decoder separately.
How can I do it? Right now I can reconstruct the input; I want to be able to convert an input to its hidden representation and reconstruct the input from an arbitrary hidden representation.
You can just save references to the encoder and decoder inputs/outputs.
Namely (I added INPUT, HIDDEN_STATE, OUTPUT):
net = tflearn.input_data(shape=[None, train.shape[1]])
INPUT = net
net = tflearn.fully_connected(net, 500, activation='tanh', regularizer=None, name='fc_en_1')
# hidden state
net = tflearn.fully_connected(net, 100, activation='tanh', regularizer='L1', name='fc_en_2', weight_decay=0.0001)
HIDDEN_STATE = net
net = tflearn.fully_connected(net, 500, activation='tanh', regularizer=None, name='fc_de_1')
net = tflearn.fully_connected(net, train.shape[1], activation='linear', name='fc_de_2')
OUTPUT = net
net = tflearn.regression(net, optimizer='adam', learning_rate=0.01, loss='mean_square', metric='default')
model = tflearn.DNN(net)
Then use functions like these to encode/decode:
def encode(X):
    if len(X.shape) < 2:
        X = X.reshape(1, -1)
    tflearn.is_training(False, model.session)
    res = model.session.run(HIDDEN_STATE, feed_dict={INPUT.name: X})
    return res

def decode(X):
    if len(X.shape) < 2:
        X = X.reshape(1, -1)
    # just to pass something to the input placeholder
    zeros = np.zeros((X.shape[0], train.shape[1]))
    tflearn.is_training(False, model.session)
    res = model.session.run(OUTPUT, feed_dict={INPUT.name: zeros, HIDDEN_STATE.name: X})
    return res
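For example, to use them (a short sketch, assuming train is the training matrix used above):
hidden = encode(train[0])    # shape (1, 100): the 100-dimensional hidden representation
restored = decode(hidden)    # shape (1, train.shape[1]): reconstruction from that representation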
Thanks for your answer @discharged-spider. I just encoded/decoded 2,000 vectors of size 1,000 and reduced their dimension using the autoencoder mentioned above. However, when I try to map the decoder's output back to the actual input, it succeeds on only 1 vector. I'm not sure how I can increase the accuracy here.
I use the Euclidean distance to find the closest vector to the output of the decoder.