I have encountered something strange while building a dummy model in Keras. For reasons that are not important now, I decided to try to train a set of weights to become the identity matrix. My code was the following:
import tensorflow as tf
from tensorflow import keras
import numpy as np
tfe = tf.contrib.eager
tf.enable_eager_execution()
i4 = np.eye(4)
inds = np.random.randint(0,4,size=2000)
data = i4[inds]
model = keras.Sequential([keras.layers.Dense(4, kernel_regularizer=
keras.regularizers.l2(.001), kernel_initializer='zeros')])
model.compile(optimizer=tf.train.AdamOptimizer(.001), loss= 'mse', metrics = ['accuracy'])
model.fit(data,inds, epochs=50)
This did horribly on what should be a very simple task. I changed the last line to
model.fit(data, data, epochs =50)
which I think essentially means I am feeding the labels as one hot vectors. With this line, the training did exactly what I wanted it to on this very simple task. So, my questions are:
Why would this not work with the first line and work with the second?
What do I need to do to be able to feed the output to Keras not as one-hot vectors? Not that I mind converting. It's just that some of the examples I've seen (even MNIST) don't seem to convert their labels to one-hots before feeding them in. What's the issue here? Is Keras trying to convert the numerical/other labels I've given it in a way I don't expect? If so, how does it convert such labels so I can predict the response correctly?
The model you used is trying to minimize the mean square error. Thus, it is obvious that the second line is the way to go:
model.fit(data, data, epochs=50)
because to learn the identity matrix we need x = y, and thus data is both the input and the target.
Why this does not work:
model.fit(data, inds, epochs=50)
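In this case your network output has size 4 (the dense layer), but you give it targets of size 1 (inds). You would normally expect a shape error here. Since training ran anyway, the targets were most likely broadcast against the four outputs, so every output unit was pushed toward the integer class index rather than toward a one-hot pattern.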
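To make the shape mismatch concrete, here is a minimal NumPy sketch of a mean squared error with (N, 1) targets against (N, 4) outputs. It is plain NumPy, not Keras's own loss code, and whether a given Keras version raises an error or broadcasts is an assumption on my part. The names mse, loss_onehot and loss_index are made up for the illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
inds = rng.integers(0, 4, size=8)      # integer class labels, shape (8,)
data = np.eye(4)[inds]                 # one-hot rows, shape (8, 4)

# Pretend the model is already a perfect identity map: output == input.
y_pred = data.copy()

def mse(y_true, y_pred):
    # Mean of squared differences; NumPy broadcasts mismatched shapes silently.
    return np.mean((y_pred - y_true) ** 2)

loss_onehot = mse(data, y_pred)            # targets of shape (8, 4)
loss_index = mse(inds[:, None], y_pred)    # targets of shape (8, 1), broadcast to (8, 4)

print(loss_onehot)   # zero: the identity map is the optimum for one-hot targets
print(loss_index)    # positive: the identity map is penalised for index targets
```

With index targets the perfect identity map is no longer the minimum of the loss, which would explain why the first fit call trains toward something else.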
How to do it without using one hot vectors for the output vectors:
One way is to use the sparse categorical crossentropy loss instead as such:
i4 = np.eye(4)
inds = np.random.randint(0,4,size=32)
data = i4[inds]
model = keras.Sequential([keras.layers.Dense(4, kernel_initializer='zeros', activation='softmax')])
model.compile(optimizer=tf.train.AdamOptimizer(.001), loss= 'sparse_categorical_crossentropy', metrics = ['accuracy'])
model.fit(data, inds, epochs=50)
and then you will see that the model will fit the inds very accurately:
In [4]: np.argmax(model.predict(data), axis=1)
Out[4]:
array([3, 1, 1, 3, 0, 3, 2, 0, 2, 1, 0, 2, 0, 0, 1, 2, 3, 2, 3, 0, 3, 2,
1, 2, 3, 3, 3, 1, 0, 1, 2, 0])
In [5]: inds
Out[5]:
array([3, 1, 1, 3, 0, 3, 2, 0, 2, 1, 0, 2, 0, 0, 1, 2, 3, 2, 3, 0, 3, 2,
1, 2, 3, 3, 3, 1, 0, 1, 2, 0])
and the train accuracy :
In [6]: np.mean(np.argmax(model.predict(data), axis=1) == inds)
Out[6]: 1.0
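On the question of how integer labels are handled: the sparse loss does not need to build one-hot vectors at all, it only looks up the predicted probability at the label index. The following NumPy sketch (not the Keras implementation; softmax, sparse_cce and onehot_cce are illustrative names) shows that this gives the same per-sample loss as ordinary crossentropy on one-hot labels.

```python
import numpy as np

def softmax(logits):
    # Subtract the row max for numerical stability before exponentiating.
    shifted = logits - logits.max(axis=1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=1, keepdims=True)

def sparse_cce(labels, probs):
    # Integer labels: pick the predicted probability of the true class.
    return -np.log(probs[np.arange(len(labels)), labels])

def onehot_cce(onehot, probs):
    # One-hot labels: the sum keeps only the true-class term.
    return -(onehot * np.log(probs)).sum(axis=1)

rng = np.random.default_rng(0)
labels = rng.integers(0, 4, size=6)
probs = softmax(rng.normal(size=(6, 4)))

loss_sparse = sparse_cce(labels, probs)
loss_onehot = onehot_cce(np.eye(4)[labels], probs)
print(np.allclose(loss_sparse, loss_onehot))
```

So the examples that feed raw integer labels (MNIST included) typically work because they pair those labels with a sparse loss, not because the labels are converted behind the scenes for mse.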
I am pretty new to Keras and am trying to build a model that takes a list as input and returns a number between 1 and 16 (or 0 and 15; I have 16 classes).
This is my code so far:
import numpy as np
from tensorflow.keras.layers.experimental.preprocessing import TextVectorization
from tensorflow.keras import layers
from tensorflow import keras
import tensorflow as tf
vectorizer = TextVectorization(output_mode = "int")
'''
vocab_data = np.array(["In animus fert nova"])
vectorizer.adapt(vocab_data)  # important
print(vectorizer(vocab_data))
'''
# 2 possibilities for each of the first 4 metrical feet --> 2^4 = 16 different hexameters
num_Hexameter = 16
inputs = keras.Input(shape = (None,1), dtype=tf.int64)
x = layers.Dense(30)(inputs)
x = layers.Dense(20)(x)
x = layers.Dense(15)(x)
x = layers.Dense(10)(x)
x = layers.Dense(5)(x)
outputs = layers.Dense(num_Hexameter, activation = "softmax")(x)
model = keras.Model(inputs = inputs, outputs = outputs)
model.summary()
model.compile(optimizer = keras.optimizers.RMSprop(learning_rate=1e-3),
loss=keras.losses.SparseCategoricalCrossentropy(), metrics=["accuracy"])
test_train_data_raw = np.array([["In nova fert animus mutatas dicere formas"],
["corpora di coeptis nam vos mutastis et illas"],
["adspirate meis primaque ab origine mundi"],
["ad mea perpetuum deducite tempora carmen"]])
vectorizer.adapt(test_train_data_raw)
test_train_data = vectorizer(test_train_data_raw)
test_train_data_lbl = np.array([[0, 0, 0, 0, 0, 0, 0, 3],[0, 0, 0, 0, 0, 0, 0, 7], [0, 0, 0, 0, 0, 0, 0, 10], [0, 0, 0, 0, 0, 0, 0, 2]])
test_train_data_lbl = np.array([3, 7, 10, 2])
print(test_train_data_lbl.shape)
history = model.fit(test_train_data, test_train_data_lbl, batch_size = 2, epochs = 10)
print(history.history)
test = vectorizer(np.array([["In nova fert animus mutatas dicere formas"]]))
print(test)
print(model.predict(test).shape)
The problem is that if I leave the line test_train_data_lbl = np.array([3, 7, 10, 2]) uncommented, I get the error: ValueError: Shape mismatch: The shape of labels (received (2, 1)) should equal the shape of logits except for the last dimension (received (2, 8, 16))
Commenting the line out avoids the error, but then the result of model.predict(test) is an array of shape (1, 7, 16). I understand that the 7 comes from the 7 words I have and the 16 from the number of classes, but I need the shape to be (1, 16) so that I can predict which class the line belongs to.
I also know that I have too little training data, but I first wanted to get the model working without errors before generating the training data.
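The (1, 7, 16) shape follows from the fact that a dense layer only acts on the last axis, so every word gets its own 16-way score vector. Below is a NumPy sketch of that behaviour. The pooling step at the end is my own assumption about one possible fix (in Keras a layer such as GlobalAveragePooling1D plays a similar role); it is not something the post itself proposes, and the names kernel, per_token and pooled are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
batch, seq_len, num_classes = 1, 7, 16

# One line of 7 token ids, shaped (batch, words, 1) like the model input.
tokens = rng.integers(1, 30, size=(batch, seq_len, 1)).astype(np.float32)

# A dense layer is a matrix product over the LAST axis only.
kernel = rng.normal(size=(1, num_classes)).astype(np.float32)
bias = np.zeros(num_classes, dtype=np.float32)
per_token = tokens @ kernel + bias
print(per_token.shape)      # (1, 7, 16): one score vector per word

# Hypothetical fix: collapse the word axis before classifying.
pooled = per_token.mean(axis=1)
print(pooled.shape)         # (1, 16): one score vector per line
```

Once the word axis is collapsed, labels of shape (batch,) such as np.array([3, 7, 10, 2]) line up with the output, which is what the sparse loss expects.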
When I run this simple neural net, I get an error. The code is supposed to output the first number of the test array.
There have been other errors as well (one had to do with the data's dtype).
import tensorflow as tf
import numpy as np
from tensorflow import keras
data = np.array([[0, 1, 1], [0, 0, 1], [1, 1, 1]])
labels = np.array([0, 0, 1])
data.dtype = float
print(data.dtype)
model = keras.Sequential([
keras.layers.Dense(3, activation=tf.nn.relu),
keras.layers.Dense(2, activation=tf.nn.softmax)])
model.compile(optimizer='adam', loss='sparse_categorical_crossentropy',
metrics=['accuracy'])
model.fit(data, labels)
prediction = model.predict([0, 1, 0])
print(prediction)
I get this error:
tensorflow.python.framework.errors_impl.InvalidArgumentError: Matrix size-incompatible: In[0]: [3,1], In[1]: [3,3]
[[{{node sequential/dense/Relu}}]]
You are getting the above error because of this line:
prediction = model.predict([0, 1, 0])
You are passing a plain list, which (judging by the [3,1] in the error message) is read as three samples with one feature each. The input should be a NumPy array of shape N x 3, where N is the batch size (1, 2, and so on); here N is 1.
To fix it, change the call to
prediction = model.predict(np.expand_dims(np.array([0, 1, 0], dtype=np.float32), 0))
or
prediction = model.predict(np.array([[0, 1, 0]], dtype=np.float32))
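Also, replace data.dtype = float with data = data.astype(np.float32). Assigning to .dtype reinterprets the array's raw bytes instead of converting the values.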
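Both points can be checked in plain NumPy without a model. This sketch assumes the training data is stored as 64-bit integers (the default on most Linux and macOS builds) and uses .view() to reproduce what assigning to .dtype does.

```python
import numpy as np

# Shape: a flat vector versus a batch containing one sample.
sample = np.array([0, 1, 0], dtype=np.float32)
print(sample.shape)                 # (3,)
batch = np.expand_dims(sample, 0)
print(batch.shape)                  # (1, 3): one sample with three features

# Dtype: reinterpreting bytes versus converting values.
data = np.array([[0, 1, 1], [0, 0, 1], [1, 1, 1]], dtype=np.int64)

# Same bits read as floats: the integer 1 turns into a tiny denormal number.
reinterpreted = data.view(np.float64)

# astype converts the values themselves, so 1 becomes 1.0.
converted = data.astype(np.float32)

print(reinterpreted.max())
print(converted)
```

The reinterpreted array still prints a float dtype, which is why the original print(data.dtype) check looks fine even though the values are no longer 0 and 1.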
When I pass one-hot encoded labels as train and validation data into tensorflow keras' model.fit() function, the metric tf.keras.metrics.TruePositives() returns wrong values.
I'm running Tensorflow 2.0.
For example, if this is my code:
model.compile(optimizer, 'binary_crossentropy',
['accuracy', tf.keras.metrics.TruePositives()])
history = model.fit(train_data, train_labels_binary, batch_size=32, epochs=30,
validation_data=(val_data, val_labels_binary),
callbacks=[early_stopping])
train_labels_binary is this: array([[1, 0], [1, 0], [0, 1]])
and the resulting y_pred's are array([[1, 0], [1, 0], [0, 1]])
then tf.keras.metrics.TruePositives() should return 1, but it returns 3.
Any help would be greatly appreciated!!
OK, I did some more experimenting, and the problem goes away when the labels are not one-hot encoded and there is only one output neuron. All metrics report correctly if we change the following two lines:
This: train_labels = np.eye(2)[np.random.randint(0, 2, size=(10, 1)).reshape(-1)]
To: train_labels = np.random.randint(0, 2, size=(10, 1))
and
This: model.add(layers.Dense(units=2, activation='sigmoid'))
To: model.add(layers.Dense(units=1, activation='sigmoid'))
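A likely explanation for the 3 is that the metric counts element-wise over every output position, so with two one-hot columns the "class 0" hits are counted as positives too. The NumPy sketch below reproduces the numbers from the question under that assumption, with a 0.5 threshold; true_positives is an illustrative helper, not the Keras metric itself.

```python
import numpy as np

y_true = np.array([[1, 0], [1, 0], [0, 1]])
y_pred = np.array([[1, 0], [1, 0], [0, 1]], dtype=np.float32)

def true_positives(y_true, y_pred, threshold=0.5):
    # Count every position where the label is 1 and the prediction exceeds the threshold.
    return int(np.sum((y_true == 1) & (y_pred > threshold)))

tp_all = true_positives(y_true, y_pred)                  # both columns counted
tp_class1 = true_positives(y_true[:, 1], y_pred[:, 1])   # only the "class 1" column

print(tp_all, tp_class1)
```

Counting only the second column gives the expected 1, which matches what happens after switching to a single sigmoid output with 0/1 labels.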
I am trying to create an autoencoder from scratch for my dataset. It is a variational autoencoder for feature extraction. I am pretty new to machine learning and I would like to know how to feed my input data to the autoencoder.
My data is time series data. It looks like this:
array([[[ 10, 0, 10, ..., 10, 0, 0],
...,
[ 0, 12, 32, ..., 2, 2, 2]],
[[ 0, 3, 7, ..., 7, 3, 0],
.....
[ 0, 2, 3, ..., 3, 4, 6]],
[[1, 3, 1, ..., 0, 10, 2],
...,
[2, 11, 12, ..., 1, 1, 8]]], dtype=int64)
It is a stack of arrays and the shape is (3, 1212, 700).
And where do I pass the label?
The examples online are simple and there is no detailed description as to how to feed the data in reality. Any examples or explanations will be highly helpful.
This can be solved using a generator. The generator takes your time series data of 700 data points each with 3 channels and 1212 time steps and it outputs a batch.
In the example I've written, each batch covers the same time window for all samples: batch 0 holds the first 10 time steps for each of your 700 samples, batch 1 holds time steps 1:11 for each of your 700 samples, and so on. If you want to mix this up in some way, edit the generator. The epoch ends when every batch has been trained on. For the neural network, a very simple encoder-decoder model is enough to prove the concept, but you will probably want to replace it with your own model. The variable n sets how many time steps each window fed to the autoencoder contains.
import numpy as np
from keras import Input, Model
from keras.layers import Dense, Flatten, Reshape
from tensorflow.python.client import device_lib
# check for my gpu
print(device_lib.list_local_devices())
# make some fake data
# your data
data = np.random.random((3, 1212, 700))
# this is a generator
def image_generator(data, n):
    start = 0
    end = n
    while end < data.shape[1] - 1:
        # (3, n, 700) transposed to (700, n, 3): samples, time steps, channels
        last_n_steps = data[:, start:end].T
        # an autoencoder reconstructs its input, so the input is also the target
        yield (last_n_steps, last_n_steps)
        start += 1
        end += 1
        # the generator MUST loop
        if end == data.shape[1] - 1:
            start = 0
            end = n
n = 10
# basic model - replace with your own
encoder_input = Input(shape=(n, 3), name="encoder_input")
fc = Flatten()(encoder_input)
fc = Dense(100, activation='relu', name="fc1")(fc)
encoder_output = Dense(5, activation='sigmoid', name="encoder_output")(fc)
encoder = Model(encoder_input, encoder_output)
decoder_input = Input(shape=encoder.layers[-1].output_shape[1:], name="decoder_input")
fc = Dense(100, activation='relu', name="fc2")(decoder_input)
# the decoder has to reproduce the (n, 3) input window, not the 5-dimensional code
fc = Dense(n * 3, activation='sigmoid', name="output_flat")(fc)
output = Reshape((n, 3), name="output")(fc)
decoder = Model(decoder_input, output)
combined_model_input = Input(shape=(n, 3), name="combined_model_input")
autoencoder = Model(combined_model_input, decoder(encoder(combined_model_input)))
model = autoencoder
model.compile(optimizer="adam", loss='mean_squared_error')
model.summary()
# and training (recent Keras versions take the generator in model.fit instead)
training_history = model.fit_generator(image_generator(data, n),
                                       epochs=5,
                                       initial_epoch=0,
                                       # one step per window along the time axis
                                       steps_per_epoch=data.shape[1] - 1 - n,
                                       verbose=1
                                       )
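If you want to check what the generator feeds the network before wiring up a model, the windowing can be tested on its own. This is a NumPy-only sketch with a hypothetical helper, window_generator, that is written slightly differently from the generator above (it also yields the final window). It shows the batch shape and that the "label" is simply the input itself.

```python
import itertools
import numpy as np

data = np.random.random((3, 1212, 700))   # (channels, time steps, samples), as in the question
n = 10

def window_generator(data, n):
    # Hypothetical helper: loops forever over n-step windows along the time axis.
    num_windows = data.shape[1] - n + 1
    while True:
        for start in range(num_windows):
            window = data[:, start:start + n].T   # (samples, n, channels)
            yield window, window                   # input and target are the same array

# Take the first three batches without running the infinite loop.
batches = list(itertools.islice(window_generator(data, n), 3))
x0, y0 = batches[0]
x1, _ = batches[1]

print(x0.shape)    # (700, 10, 3)
print(y0 is x0)    # True
```

Here x1 starts one time step later than x0, so consecutive batches overlap by n - 1 steps.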
I want to modify my input by adding several different suffixes to the input vectors. For example, if the (single) input is [1, 5, 9, 3] I want to create three vectors (stored as matrix) like this:
[[1, 5, 9, 3, 1, 0, 0],
[1, 5, 9, 3, 0, 1, 0],
[1, 5, 9, 3, 0, 0, 1]]
Of course, this is just one observation so the input to the model is (None, 4) in this case. The simple way is to prepare the input data somewhere else (numpy most probably) and adjust the shape of input accordingly. That I can do but I would prefer doing it inside TensorFlow/Keras.
I have isolated the problem into this code:
import keras.backend as K
from keras import Input, Model
from keras.layers import Lambda
def build_model(dim_input: int, dim_eye: int):
    input = Input((dim_input,))
    concat = Lambda(lambda x: concat_eye(x, dim_input, dim_eye))(input)
    return Model(inputs=[input], outputs=[concat])
def concat_eye(x, dim_input, dim_eye):
    x = K.reshape(x, (-1, 1, dim_input))
    x = K.repeat_elements(x, dim_eye, axis=1)
    eye = K.expand_dims(K.eye(dim_eye), axis=0)
    eye = K.tile(eye, (-1, 1, 1))
    out = K.concatenate([x, eye], axis=2)
    return out
def main():
    import numpy as np
    n = 100
    dim_input = 20
    dim_eye = 3
    model = build_model(dim_input, dim_eye)
    model.compile(optimizer='sgd', loss='mean_squared_error')
    x_train = np.zeros((n, dim_input))
    y_train = np.zeros((n, dim_eye, dim_eye + dim_input))
    model.fit(x_train, y_train)
if __name__ == '__main__':
    main()
The problem seems to be in the -1 in shape argument in tile function. I tried to replace it with 1 and None. Each has its own error:
-1: error during model.fit
tensorflow.python.framework.errors_impl.InvalidArgumentError: Expected multiples[0] >= 0, but got -1
1: error during model.fit
tensorflow.python.framework.errors_impl.InvalidArgumentError: ConcatOp : Dimensions of inputs should match: shape[0] = [32,3,20] vs. shape[1] = [1,3,3]
None: error during build_model:
Failed to convert object of type <class 'tuple'> to Tensor. Contents: (None, 1, 1). Consider casting elements to a supported type.
You need to use K.shape() instead to get the symbolic shape of input tensor. That's because the batch size is None and therefore passing K.int_shape(x)[0] or None or -1 as a part of the second argument of K.tile() would not work:
eye = K.tile(eye, (K.shape(x)[0], 1, 1))
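For checking the expected result, here is a NumPy analogue of concat_eye in which x.shape[0] plays the role of K.shape(x)[0], that is, the batch size is read from the actual data at run time. It is only a reference sketch for the shapes, not a replacement for the Lambda layer.

```python
import numpy as np

def concat_eye(x, dim_eye):
    # x: (batch, dim_input) -> (batch, dim_eye, dim_input + dim_eye)
    batch = x.shape[0]                                     # known only once real data arrives
    repeated = np.repeat(x[:, None, :], dim_eye, axis=1)   # copy each row dim_eye times
    eye = np.tile(np.eye(dim_eye)[None], (batch, 1, 1))    # one identity block per sample
    return np.concatenate([repeated, eye], axis=2)

out = concat_eye(np.array([[1, 5, 9, 3]]), dim_eye=3)
print(out[0])

# A larger batch works the same way because the tile count follows the input.
out_batch = concat_eye(np.zeros((32, 20)), dim_eye=3)
print(out_batch.shape)   # (32, 3, 23)
```

The first output matches the three suffixed rows from the question, and the second shows the (32, 3, 23) shape that the failing ConcatOp was trying to build.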