I'm pretty new to Keras and TensorFlow in general, so perhaps this is a stupid question...
What I want to achieve is the following:
I have a set of words, let's say: cat, dog, cow, ...
Those words should be encoded against a given alphabet: the vector has a 1 at the position of each character that occurs in the word, and a 0 everywhere else.
For "cat", e.g., something like 1,0,1,0,0,0,0,0,....,1,0,0,...0.
I use the Keras Tokenizer for that:
from keras.preprocessing.text import Tokenizer

tk = Tokenizer(char_level=True, oov_token='UNK')
alphabet = "abcdefghijklmnopqrstuvwxyzöäü0123456789-,;.!?:'\"/\\|_##$%^&*~`+-=<>()[]{}"

# Build a custom character -> index mapping
char_dict = {}
for i, char in enumerate(alphabet):
    char_dict[char] = i + 1

# Use char_dict to replace the tk.word_index
tk.word_index = char_dict
# Add 'UNK' to the vocabulary
tk.word_index[tk.oov_token] = max(char_dict.values()) + 1

x_train = tk.texts_to_matrix(x_train)
Those vectors are passed into the Keras model for prediction. Now, I want the transformation to happen in the Keras model. So, the user should provide "cat" to the model and not a numeric vector like above. And the model should also return "cat".
How can I achieve that? I saw that there is a Lambda layer in Keras, is this the correct approach?
Thanks in advance.
Edit for clarification:
The model at the moment looks like this:
model = Sequential()
model.add(Dense(128, input_shape=(len(alphabet)+1,)))
model.add(Activation('relu'))
model.add(Dropout(0.5))
model.add(Dense(num_classes))
model.add(Activation('softmax'))
But what I want to achieve is to have an input layer that gets strings as inputs and converts the strings into the format the actual first layer can read. Something like this:
model = Sequential()
**model.add(transformation_layer)**
model.add(Dense(128, input_shape=(len(alphabet)+1,)))
model.add(Activation('relu'))
model.add(Dropout(0.5))
model.add(Dense(num_classes))
model.add(Activation('softmax'))
Edit 2
This is what I tried, but I get the following error when running the model.fit function:
tensorflow.python.framework.errors_impl.OperatorNotAllowedInGraphError: iterating over tf.Tensor is not allowed in Graph execution. Use Eager execution or decorate this function with @tf.function.
def transform_layer(x):
    return tk.texts_to_matrix(x)

print('Building model...')
transform_layer = Lambda(transform_layer)

model = Sequential()
model.add(transform_layer)
model.add(Dense(128, input_shape=(len(alphabet)+1,)))
model.add(Activation('relu'))
model.add(Dropout(0.5))
model.add(Dense(num_classes))
model.add(Activation('softmax'))

model.compile(loss='categorical_crossentropy',
              optimizer='adam',
              metrics=['accuracy'])

history = model.fit(np.array(['test','test2']), np.array(['blub','blub2']),
                    batch_size=batch_size,
                    epochs=epochs,
                    verbose=1,
                    validation_split=0.1)
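For comparison, here is a rough sketch (an assumption on my part, using TF 2.x string ops and preprocessing layers rather than the Tokenizer) of how the character-level encoding could be expressed with graph-compatible operations, since tk.texts_to_matrix is plain Python and cannot iterate over symbolic tensors:
import tensorflow as tf

# Assumed sketch, not the original code: StringLookup and unicode_split are
# TensorFlow ops, so they can run inside the graph, unlike texts_to_matrix.
vocab = list(dict.fromkeys(alphabet))          # StringLookup needs unique terms
lookup = tf.keras.layers.StringLookup(vocabulary=vocab,
                                      oov_token='UNK',
                                      output_mode='multi_hot')

string_in = tf.keras.Input(shape=(), dtype=tf.string)            # e.g. "cat"
chars = tf.keras.layers.Lambda(
    lambda s: tf.strings.unicode_split(s, 'UTF-8'))(string_in)   # ragged chars
encoded = lookup(chars)                                          # multi-hot vector
hidden = tf.keras.layers.Dense(128, activation='relu')(encoded)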
Related
I am trying to code a 5-class classifier ANN, and this code returns the following error:
classifier = Sequential()
classifier.add(Dense(units=10, input_dim=14, kernel_initializer='uniform', activation='relu'))
classifier.add(Dense(units=6, kernel_initializer='uniform', activation='relu'))
classifier.add(Dense(units=5, kernel_initializer='uniform', activation='softmax'))
classifier.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])
RD_Model = classifier.fit(X_train,y_train, batch_size=10 , epochs=10, verbose=1)
File "c:\Program Files\Python310\lib\site-packages\keras\backend.py", line 5119, in categorical_crossentropy
target.shape.assert_is_compatible_with(output.shape)
ValueError: Shapes (None, 1) and (None, 5) are incompatible
I figured this is caused by having a probability matrix instead of an actual output, so I have been trying to apply an argmax, but haven't figured out a way.
Can someone help me out?
Have you tried applying:
tf.keras.backend.argmax()
You can define a lambda layer using the following:
from keras.layers import Lambda
from keras import backend as K

def argmax_layer(input):
    return K.argmax(input, axis=-1)
Keras provides two paradigms for defining a model topology.
The code you are using uses the Sequential API. You might have to switch to the Functional API.
from keras.models import Model
from keras.layers import Input, Dense

input_layer = Input(shape=(14,))
layer_1 = Dense(10, activation="relu")(input_layer)
layer_2 = Dense(6, activation="relu")(layer_1)
layer_3 = Lambda(argmax_layer)(layer_2)
output_layer = Dense(5, activation="linear")(layer_3)

model = Model(inputs=input_layer, outputs=output_layer)
model.compile(optimizer='adam',
              loss='categorical_crossentropy', metrics=['accuracy'])
Another option would be to instantiate an inherited class of a Keras Layer.
https://www.tutorialspoint.com/keras/keras_customized_layer.htm
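For example, a minimal sketch of such a subclassed layer (illustrative only, the class name is mine) wrapping the same argmax:
from keras import backend as K
from keras.layers import Layer

class ArgmaxLayer(Layer):
    """Custom layer returning the index of the largest value along the last axis."""
    def call(self, inputs):
        return K.argmax(inputs, axis=-1)

    def compute_output_shape(self, input_shape):
        # argmax removes the last dimension
        return input_shape[:-1]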
As Dr. Snoopy mentioned, it was indeed a problem of one-hot encoding... I had missed that step, which is why my model wasn't working.
So I just one-hot encoded it:
from sklearn.preprocessing import LabelEncoder
from keras.utils import np_utils

encoder = LabelEncoder()
encoder.fit(y_train)
encoded_Y = encoder.transform(y_train)
# convert integers to dummy variables (i.e. one-hot encoded)
dummy_y = np_utils.to_categorical(encoded_Y)
And it worked after using dummy_y. Thank you for your help.
I've trained a simple GRU with an attention layer and now I'm trying to visualize the attention weights (I've already got them). The input is 2 one-hot encoded sequences (one is correct, the other is almost the same but has permutations of letters). The task is to determine which of the sequences is correct.
Here's my NN:
from tensorflow import keras
from tensorflow.keras.layers import GRU, Dropout, Flatten, Dense
from keras_self_attention import SeqSelfAttention  # keras-self-attention package

optimizer = keras.optimizers.RMSprop()
max_features = 4  # number of words in the dictionary
num_classes = 2

model = keras.Sequential()
model.add(GRU(128, input_shape=(70, max_features), return_sequences=True, activation='tanh'))
model.add(Dropout(0.5))
atn_layer = SeqSelfAttention()  # keep a handle to the attention layer
model.add(atn_layer)
model.add(Flatten())
model.add(Dense(num_classes, activation='sigmoid'))
model.compile(loss='binary_crossentropy',
              optimizer=optimizer,
              metrics=['accuracy'])
I've tried several things found on StackOverflow but didn't succeed. The thing particularly is that I don't understand how to couple my input and attention weights. I'd appreciate any help and suggestions.
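For concreteness, the kind of plot I have in mind would be something like this sketch, where weights is one sample's attention matrix (already extracted) and tokens is that sample's input decoded back to symbols (both are placeholder names):
import matplotlib.pyplot as plt

# Placeholder names: `weights` is one sample's (seq_len x seq_len) attention
# matrix taken from the trained model, `tokens` is that sample's input
# sequence decoded back to characters.
fig, ax = plt.subplots()
im = ax.imshow(weights, cmap='viridis')
ax.set_xticks(range(len(tokens)))
ax.set_xticklabels(tokens, rotation=90)
ax.set_yticks(range(len(tokens)))
ax.set_yticklabels(tokens)
fig.colorbar(im)
plt.show()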
I am trying to train a neural network (NN) implemented through Keras to learn the following function.
y(n) = y(n-1)*0.9 + x(n)*0.1
So the idea is to have a signal as the train_x data and pass it through the above function to get the train_y data, giving us (train_x, train_y) training data.
import numpy as np
from keras.models import Sequential
from keras.layers.core import Activation, Dense
from keras.callbacks import EarlyStopping
import matplotlib.pyplot as plt
train_x = np.concatenate((np.ones(100)*120,np.ones(150)*150,np.ones(150)*90,np.ones(100)*110), axis=None)
train_y = np.ones(train_x.size)*train_x[0]
alpha = 0.9
for i in range(train_x.size):
    train_y[i] = train_y[i-1]*alpha + train_x[i]*(1 - alpha)
train_x data vs train_y data plot
The function under question y(n) is a low pass function and makes the x(n) value to not change abruptly, as shown in the plot.
Then I make a NN, fit it with (train_x, train_y), and plot the training loss:
model = Sequential()
model.add(Dense(128, kernel_initializer='normal', input_dim=1, activation='relu'))
model.add(Dense(256, kernel_initializer='normal', activation='relu'))
model.add(Dense(256, kernel_initializer='normal', activation='relu'))
model.add(Dense(256, kernel_initializer='normal', activation='relu'))
model.add(Dense(1, kernel_initializer='normal', activation='linear'))
model.compile(loss='mean_absolute_error',
              optimizer='adam',
              metrics=['accuracy'])
history = model.fit(train_x, train_y, epochs=200, verbose=0)
print(history.history['loss'][-1])
plt.plot(history.history['loss'])
plt.show()
loss_plot_200_epoch
And the final loss value is approximately 2.9, which I thought was pretty good. But then the accuracy plot was like this
accuracy_plot_200_epochs
So when I check the prediction of the neural network over the data it was trained on
plt.plot(model.predict(train_x))
plt.plot(train_x)
plt.show()
train_x_vs_predict_x
The values are just offset by a little, and that's all. I tried changing the activation functions and the number of neurons and layers, but the result is still the same. What am I doing wrong?
---- Edit ----
I made the NN accept 2-dimensional input and it works as intended.
import numpy as np
from keras.models import Sequential
from keras.layers.core import Activation, Dense
from keras.callbacks import EarlyStopping
import matplotlib.pyplot as plt
train_x = np.concatenate((np.ones(100)*120,np.ones(150)*150,np.ones(150)*90,np.ones(100)*110), axis=None)
train_y = np.ones(train_x.size)*train_x[0]
alpha = 0.9
for i in range(train_x.size):
    train_y[i] = train_y[i-1]*alpha + train_x[i]*(1 - alpha)
train = np.empty((500,2))
for i in range(500):
    train[i][0] = train_x[i]
    train[i][1] = train_y[i]
model = Sequential()
model.add(Dense(128, kernel_initializer='normal', input_dim=2, activation='relu'))
model.add(Dense(256, kernel_initializer='normal', activation='relu'))
model.add(Dense(256, kernel_initializer='normal', activation='relu'))
model.add(Dense(256, kernel_initializer='normal', activation='relu'))
model.add(Dense(1, kernel_initializer='normal', activation='linear'))
model.compile(loss='mean_absolute_error',
              optimizer='adam',
              metrics=['accuracy'])
history = model.fit(train, train_y, epochs=100, verbose=0)
print(history.history['loss'][-1])
plt.plot(history.history['loss'])
plt.show()
If I execute your code, I get the following plot for the X-Y values:
If I didn't miss something important here and you really feed that to your neural net, you probably can't expect better results. The reason is that a neural net is just a function that can only calculate one output vector for one input. In your case the output vector would consist of only one element (your y value), but as you can see in the diagram above, for x=90 there is not just one single output. So what you feed to your neural net cannot really be calculated as a function, and the network most likely tries to fit the straight line between the points ~(90, 145) and ~(150, 150), i.e. the "upper line" in the diagram.
The neural network you're building is a simple multi-layer perceptron with one input node and one output node. This means that it is essentially a function that accepts one real number and returns one real number -- context is not passed in and can therefore not be considered. The expression
model.predict(train_x)
does not evaluate a vector-to-vector function for the vector train_x, but evaluates a number-to-number function for every number in train_x and then returns the list of results. This is why you get flat segments in the train_x_vs_predict_x plot: the same input numbers produce the same output numbers every time.
Given this constraint, the approximation is actually quite good. For example, the network has seen for x values of 150 many y values of 150 and a few lower ones but never anything above 150. So, given an x of 150, it predicts a y value of slightly lower than 150.
The function you wanted, on the other hand, refers to the previous function value and will need information about this in its input. If what you're trying to build is a function that accepts a sequence of real numbers and returns a sequence of real numbers, you could do that with a many-to-many recurrent network (and you're going to need a lot more training data than one example sequence), but since you can calculate the function directly, why bother with neural networks at all? There's no need to whip out the chainsaw where a butter knife will do.
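To illustrate that last point, here is a minimal sketch (plain NumPy, no Keras) of computing the filter directly:
import numpy as np

def low_pass(x, alpha=0.9):
    """First-order low-pass filter: y[n] = alpha*y[n-1] + (1-alpha)*x[n]."""
    y = np.empty_like(x, dtype=float)
    y[0] = x[0]  # same initial condition as the question's loop
    for n in range(1, len(x)):
        y[n] = alpha * y[n - 1] + (1 - alpha) * x[n]
    return y

train_x = np.concatenate((np.ones(100)*120, np.ones(150)*150,
                          np.ones(150)*90, np.ones(100)*110), axis=None)
train_y = low_pass(train_x)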
I'm having some trouble understanding why my Keras model has problems generating proper results (it now always returns 0). I have been able to find some others with this problem (ref 1, ref 2), but I haven't been able to understand the underlying cause.
Question: Why is my model only giving one, constant prediction?
Training Data Example
The last column is the prediction, 0 or 1.
32856500,1,1,200,6842314460,0
32800000,-1,0,0,0,0
32800000,-1,1,0,6845343222,0
32800000,-1,2,0,13692319489,0
32800000,-1,3,0,20539336035,0
32769900,-1,4,-30100,27389628085,0
32769900,-1,5,-30100,34239941481,0
32750000,-1,6,-50000,41091099905,0
32750000,-1,7,-50000,47945852379,1
Keras Code for Training
I'm using the sigmoid activation for the binary results, but I'm not sure whether the issue lies here or, for example, in the binary_crossentropy loss or the SGD optimizer.
def trainKerasModel(X, Y, path, dimensions):
    # Create model
    model = Sequential()
    model.add(Dense(120, input_dim=dimensions, activation='sigmoid'))
    model.add(Dense(100, activation='sigmoid'))
    model.add(Dense(80, activation='sigmoid'))
    model.add(Dense(60, activation='sigmoid'))
    model.add(Dense(40, activation='sigmoid'))
    model.add(Dense(20, activation='sigmoid'))
    model.add(Dense(12, activation='sigmoid'))
    model.add(Dense(10, activation='sigmoid'))
    model.add(Dense(8, activation='sigmoid'))
    model.add(Dense(6, activation='sigmoid'))
    model.add(Dense(4, activation='sigmoid'))
    model.add(Dense(2, activation='sigmoid'))
    model.add(Dense(1, activation='sigmoid'))

    # Compile model
    model.compile(loss='binary_crossentropy', optimizer=SGD(lr=0.01), metrics=['accuracy'])

    # Fit the model
    model.fit(X, Y, epochs=EPOCHS, batch_size=BATCHSIZE)

    # Evaluate
    scores = model.evaluate(X, Y)
    Helpers().Log(model.metrics_names[1], scores[1]*100)

    # Save model
    with open(path+".json", "w") as json_file:
        json_file.write(model.to_json())
    # serialize weights to HDF5
    model.save_weights(path+".h5")
    Helpers().Log("Saved model to disk")
someFilePath = "file.csv"
dataset = numpy.loadtxt(someFilePath, delimiter=",")
dimensions = len(dataset[0]) - 1
trainKerasModel(dataset[:,0:dimensions], dataset[:,dimensions], someFilePath, dimensions)
Keras Code for Predictions
model = model_from_json(loaded_model_json)
model.load_weights(someWeightsFile)
Xnew = preprocess_input(numpy.array([[32856500,1,1,200,6842314460,0], [32800000,-1,3,0,20539336035,0], [32750000,-1,7,-50000,47945852379,1]]))
Ynew = model.predict_classes(Xnew)
print(Ynew)
12 sigmoid fully-connected layers will never learn anything. Read the theory.
Maybe you should try just 3 layers with tanh, and no activation function if tanh is on the input; -1 for false, 1 for true.
Also apply tanh to the input data, since it is not normalized. Also, cross-entropy makes no sense if you have only one output.
Plus, expanding 5 inputs to 120 features and then 12 layers is a horrible overfit. You should have about 3 layers here, with roughly 20, 16 and 10 units, tanh activations, MSE loss, and a learning rate of around 1e-3 to 1e-4.
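A rough sketch of what that suggestion might look like in code (my interpretation, untested; it reuses dimensions, EPOCHS and BATCHSIZE from the question and assumes X is normalized and Y re-encoded as -1/1):
from keras.models import Sequential
from keras.layers import Dense
from keras.optimizers import SGD

# X is assumed to be normalized first (the raw columns span wildly different
# scales); Y is assumed to be re-encoded as -1 for false and 1 for true.
model = Sequential()
model.add(Dense(20, input_dim=dimensions, activation='tanh'))
model.add(Dense(16, activation='tanh'))
model.add(Dense(10, activation='tanh'))
model.add(Dense(1, activation='tanh'))

model.compile(loss='mse', optimizer=SGD(lr=1e-3))
model.fit(X, Y, epochs=EPOCHS, batch_size=BATCHSIZE)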
I want to specify 2 loss functions: one for the object class, which is cross-entropy, and one for the bounding box, which is mean squared error. How do I specify the corresponding loss function for each output in model.compile?
model = Sequential()
model.add(Dense(128, activation='relu'))
out_last_dense = model.add(Dense(128, activation='relu'))
object_type = model.add(Dense(1, activation='softmax'))(out_last_dense)
object_coordinates = model.add(Dense(4, activation='softmax'))(out_last_dense)
/// here is the problem i want to specify loss function for object type and coordinates
model.compile(loss= keras.losses.categorical_crossentropy,
              optimizer='sgd', metrics=['accuracy'])
First of all, you can't use the Sequential API here since your model has two output layers (i.e. what you have written is all wrong and would raise an error). Instead you must use the Keras Functional API:
inp = Input(shape=...)
x = Dense(128, activation='relu')(inp)
x = Dense(128, activation='relu')(x)
object_type = Dense(1, activation='sigmoid', name='type')(x)
object_coordinates = Dense(4, activation='linear', name='coord')(x)
Now, you can specify a loss function (as well as metric) for each output layer based on their names given above and using a dictionary:
model.compile(loss={'type': 'binary_crossentropy', 'coord': 'mse'},
              optimizer='sgd', metrics={'type': 'accuracy', 'coord': 'mae'})
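As a usage sketch (my addition; x_train, y_type and y_coord are assumed target arrays), the two-output model would then be built and fit with a matching dictionary of targets:
from keras.models import Model

model = Model(inputs=inp, outputs=[object_type, object_coordinates])
model.compile(loss={'type': 'binary_crossentropy', 'coord': 'mse'},
              optimizer='sgd', metrics={'type': 'accuracy', 'coord': 'mae'})

# x_train, y_type (0/1 class labels) and y_coord (4 box values per sample)
# are assumed to exist.
model.fit(x_train, {'type': y_type, 'coord': y_coord},
          epochs=10, batch_size=32)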
Further, note that you are using softmax as the activation function and I have changed it to sigomid and linear above. That's because: 1) using softmax on a layer with one unit does not make sense (if there are more than 2 classes then you should use softmax), and 2) the other layer predicts coordinates and therefore using softmax is not suitable at all (unless the problem formulation let you do so).