Predicting with NaN in input - Python

I trained a (0,1) classification model with TensorFlow on data without any NaNs in it. Is there any way to predict values for inputs that contain NaN? I use 'adam' as the optimizer.
Making model:
input_size = 16
output_size = 2
hidden_layer_size = 50
model = tf.keras.Sequential([
    tf.keras.layers.Dense(hidden_layer_size, activation='relu'),  # 1st hidden layer
    tf.keras.layers.Dense(hidden_layer_size, activation='relu'),  # 2nd hidden layer
    tf.keras.layers.Dense(output_size, activation='softmax')      # output layer
])
model.compile(optimizer='adam', loss='sparse_categorical_crossentropy', metrics=['accuracy'])
batch_size = 100
max_epochs = 20
early_stopping=tf.keras.callbacks.EarlyStopping()
model.fit(train_inputs,           # train inputs
          train_targets,          # train targets
          batch_size=batch_size,
          epochs=max_epochs,      # epochs we will train for (assuming early stopping doesn't kick in)
          callbacks=[early_stopping],
          validation_data=(validation_inputs, validation_targets),
          verbose=1)              # making sure we get enough information about the training process
Potential input I'd like to add:
x = np.array([[ 0.8048038 ,  2.22810658,  0.7184345 , -0.59266753,  1.73062328,
                0.69392477, -1.35764524, -0.55833263,  0.10620523,  1.31206921,
               -1.07966389,  1.04462389, -0.99787875,  0.797905  , -0.35954954,
                np.nan]])
The output I get:
array([[nan, nan]], dtype=float32)
So is there any way to achieve this?

The network has to do numeric computations with the input, and NaN propagates through every arithmetic operation, so any NaN in the input yields NaN outputs; this is not specific to the 'adam' optimizer. You therefore have to either replace these NaNs with meaningful numbers (imputation), or drop that data point. Note that x[np.isfinite(x)] would flatten the array and remove individual values; to drop whole samples (rows) that contain a NaN, filter like so:
x = x[~np.isnan(x).any(axis=1)]
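If you would rather keep the sample, impute instead. A minimal sketch, assuming the NaN-free training matrix train_inputs is still available and choosing column means as the fill value (both assumptions, not part of the original code):

import numpy as np

# Hypothetical imputation: fill each NaN with the mean of its feature column,
# computed from the (NaN-free) training inputs.
col_means = np.nanmean(train_inputs, axis=0)   # shape (16,)
nan_rows, nan_cols = np.nonzero(np.isnan(x))
x[nan_rows, nan_cols] = col_means[nan_cols]
model.predict(x)                               # no longer returns [[nan, nan]]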

Related

Multi-output losses keras/tensorflow do not behave as expected. Target #2 MAE loss is huge and >1 despite sigmoid activation and 0-1 scaled labels

SOLVED: see the comment by Marco Cerliani. tl;dr: each output must be passed to the model as a separate array.
I have a neural network with two different targets to estimate:
Target #1 ranges from 0 to infinity; its final single-node dense layer uses a 'linear' activation function.
Target #2 is scaled to 0-1; its final single-node dense layer uses a 'sigmoid' activation function.
Both outputs use MAE loss; however, the MAE for Target #2 is almost as high as for Target #1. As Target #2 is 0-1 and the sigmoid can only give a 0-1 output, I would expect that the loss for Target #2 cannot be >1.
Indeed, when I estimate only Target #2 in a single-output model, I always get a loss <1. The problem arises only when using multiple outputs.
Is this a bug, or am I doing something wrong?
optimizer = tf.keras.optimizers.Adam(learning_rate=1e-5)
mae_loss = tf.keras.losses.MeanAbsoluteError()
rmse_metric = tf.keras.metrics.RootMeanSquaredError()
inputs = tf.keras.layers.Input(shape=(IMG_SIZE, IMG_SIZE, CHANNELS))
model = tf.keras.applications.vgg16.VGG16(include_top=False, input_tensor=inputs,
                                          weights='imagenet')
# Freeze the pretrained weights if needed
model.trainable = True
# Rebuild top
x = tf.keras.layers.GlobalAveragePooling2D(name='avg_pool')(model.output)
x = tf.keras.layers.BatchNormalization()(x)
top_dropout_rate = 0.0 # adjustable dropout
x = tf.keras.layers.Dropout(top_dropout_rate, name='top_dropout')(x)
x = tf.keras.layers.Dense(512, activation='relu', name='dense_top_1')(x)
output_1 = tf.keras.layers.Dense(1, activation='linear', name='output_1')(x)
output_2 = tf.keras.layers.Dense(1, activation='sigmoid', name='output_2')(x)
model = tf.keras.Model(inputs, [output_1, output_2],
                       name='VGG16_modified')
model.compile(optimizer=optimizer, loss=mae_loss, metrics=rmse_metric)
model.fit(X_train, y_train, batch_size=16, epochs=epochs, validation_data=[X_val, y_val], verbose=1)
I have also tried to compile explicitly with two separate losses:
model.compile(optimizer=optimizer, loss=[mae_loss, mae_loss], metrics=[rmse_metric, rmse_metric])
Example targets:
[[2.05e+02 7.45e-01]
[1.33e+02 1.46e-01]
[8.00e+01 2.77e-01]
[8.30e+01 4.29e-01]
[9.80e+01 1.50e-01]
[6.10e+01 3.10e-01]
[1.00e+02 4.09e-01]
[2.20e+02 9.17e-01]
[1.20e+02 1.52e-01]]
Terminal outputs (partly cropped, but you get the idea; it shouldn't be possible for loss #2 to be >76 given the sigmoid activation and the targets above):
TensorFlow v.2.8.0
Your model expects 2 targets (output_1 and output_2), while you are passing only y_train as the target during model.fit.
You should fit your model by passing the two targets separately, in this way:
model.fit(X_train, [y_train[:,[0]],y_train[:,[1]]], ...)
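Since the outputs are named, an equivalent sketch uses dicts keyed by the output-layer names; splitting y_val the same way for the validation data is an assumption based on the shapes in the question:

model.fit(X_train,
          {'output_1': y_train[:, [0]], 'output_2': y_train[:, [1]]},
          batch_size=16,
          epochs=epochs,
          validation_data=(X_val, {'output_1': y_val[:, [0]], 'output_2': y_val[:, [1]]}),
          verbose=1)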

Passing a Custom mask to the LSTM data for training and validation

I have an LSTM architecture ready:
input1 = Input(shape=(1500, 3))
lstm = LSTM(units=100, return_sequences=False, activation='relu')(input1)
outputs = Dense(150, activation="sigmoid")(lstm)
model = Model(inputs=input1, outputs=outputs)
model.compile(loss="binary_crossentropy", optimizer="adam",
metrics=["accuracy"])
The LSTM layer supports a call argument called mask.
The way I'm reading the data is by using two generators, one iterates through training files and the other through the validation files (so on the .fit method I pass the training and validation generators).
model.fit(
    x=training_generator,
    epochs=10,
    steps_per_epoch=5,       # there are 5 training files
    validation_data=validation_generator,
    validation_steps=5,      # there are 5 validation files
    verbose=1
)
Each file therefore has its own mask (one per training file, another per validation file). So my question is: how can I specify which mask to use?
The way I found to make this work was to transform the data during the preprocessing stage. If you replace the values in your data, according to the mask, with a number you know does not occur in your data (for instance 0 or -999), you can then add a Masking layer to the architecture. This layer has a mask_value parameter, which should be the same number you used to transform your data:
input1 = Input(shape=(n_timesteps, n_channels))
masking = Masking(mask_value=-999)(input1)
lstm1 = LSTM(units=100, return_sequences=False,
             activation="tanh")(masking)
outputs = Dense(n_timesteps, activation="sigmoid")(lstm1)
model = Model(inputs=input1, outputs=outputs)
model.compile(loss=tf.keras.losses.BinaryCrossentropy(),
              optimizer=tf.keras.optimizers.Adam(learning_rate=0.01))
The Masking layer computes a mask from these placeholder values and propagates it downstream, so the LSTM receives it automatically (LSTMs support masking; some other layer types do not).
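For illustration, the replacement step in each generator's preprocessing could look like the sketch below; the per-file boolean mask and the array shapes are assumptions, not part of the original code:

import numpy as np

MASK_VALUE = -999.0

def apply_mask(batch, mask):
    # batch: (n_samples, n_timesteps, n_channels) float array
    # mask:  (n_samples, n_timesteps) boolean array, True where data is valid
    batch = batch.astype('float32').copy()
    batch[~mask] = MASK_VALUE  # every channel of a masked timestep gets the mask value,
                               # so the Masking layer skips that timestep entirely
    return batch

Each generator can then yield apply_mask(batch, file_mask), so the training and validation files each use their own mask.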

ValueError: Data cardinality is ambiguous with tf.keras

I have a dataframe with two columns: the first contains a sentence and the second the target label (9 labels in total; a sentence can be classified into more than one label).
I used word2vec to vectorise the text, which resulted in an array of length 64.
The initial problem I had
Tensorflow - ValueError: Failed to convert a NumPy array to a Tensor (Unsupported object type float)
To overcome this, I converted the np.array with:
train_inputs = tf.convert_to_tensor([df_train_title_train])
But now I am getting a new problem - see below.
I have been researching Stack Overflow and other sources for days and am struggling to get my simple neural network to work.
print(train_inputs.shape)        # (1, 63586, 64)
print(train_targets.shape)       # (63586, 9)
print(validation_inputs.shape)   # (1, 7066, 64)
print(validation_targets.shape)  # (7066, 9)
print(train_inputs[0].shape)     # (63586, 64)
print(train_targets[0].shape)    # (9,)
# Set the input and output sizes
input_size = 64
output_size = 9
# Use same hidden layer size for both hidden layers. Not a necessity.
hidden_layer_size = 64
# define how the model will look
model = tf.keras.Sequential([
    tf.keras.layers.Dense(hidden_layer_size, activation='relu'),  # 1st hidden layer
    tf.keras.layers.Dense(hidden_layer_size, activation='relu'),  # 2nd hidden layer
    tf.keras.layers.Dense(hidden_layer_size, activation='relu'),  # 3rd hidden layer
    tf.keras.layers.Dense(output_size, activation='softmax')      # output layer
])
# model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])
model.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])
### Training
# That's where we train the model we have built.
# set the batch size
batch_size = 10
# set a maximum number of training epochs
max_epochs = 10
# fit the model
# note that this time the train, validation and test data are not iterable
model.fit(train_inputs,           # train inputs
          train_targets,          # train targets
          batch_size=batch_size,
          epochs=max_epochs,      # epochs we will train for (assuming early stopping doesn't kick in)
          validation_data=(validation_inputs, validation_targets),
          verbose=2)              # making sure we get enough information about the training process
Error Message
/usr/local/lib/python3.7/dist-packages/tensorflow/python/keras/engine/data_adapter.py in _check_data_cardinality(data)
1527 label, ", ".join(str(i.shape[0]) for i in nest.flatten(single_data)))
1528 msg += "Make sure all arrays contain the same number of samples."
-> 1529 raise ValueError(msg)
1530
1531
ValueError: Data cardinality is ambiguous:
x sizes: 1
y sizes: 63586
Make sure all arrays contain the same number of samples.
You do not set the shape of your input anywhere; you should do this either with an explicit Input layer at the beginning of your model (see the example in the docs):
# before the first Dense layer:
tf.keras.Input(shape=(64,))
or by including an input_shape argument in your first layer:
tf.keras.layers.Dense(hidden_layer_size, activation='relu', input_shape=(64,)), # 1st hidden layer
Most probably, you will not even need convert_to_tensor (not quite sure though).
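Concretely, the extra leading dimension of size 1 comes from wrapping the array in a list inside convert_to_tensor; a sketch of the fix, using the variable names from the question:

import tensorflow as tf

# Drop the enclosing brackets so no extra axis of size 1 is added:
train_inputs = tf.convert_to_tensor(df_train_title_train)   # shape (63586, 64)
# or, if you already have the (1, 63586, 64) tensor, squeeze the first axis instead:
# train_inputs = tf.squeeze(train_inputs, axis=0)           # shape (63586, 64)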
Also, irrelevant to your issue, but since you are in a multi-class setting, you should use loss='categorical_crossentropy', and not binary_crossentropy; see Why binary_crossentropy and categorical_crossentropy give different performances for the same problem?

My neural network isn't learning (negative R_squared, always the same loss, categorical input data, regression)

I am trying to get my neural network to work, but unfortunately it looks like I am missing something.
I have input data from different categories.
For example, the type of a machine ('abc', 'bcd', 'dca').
So one line of my input contains different words from different distinct word categories. At the moment I have ~70,000 samples with 12 features.
First I use sklearn's LabelEncoder to transform every word into a number.
The vocabulary size goes up to 17903.
My simple network looks like this:
# Start with the NN
model = tf.keras.Sequential([
    tf.keras.layers.Embedding(np.amax(ml_input) + 1, 300, input_length=x_train.shape[1]),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(500, activation=tf.keras.activations.softmax),
    tf.keras.layers.Dense(1, activation=tf.keras.activations.linear)
])
model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=0.01),
              loss=tf.keras.losses.mean_absolute_error,
              metrics=[R_squared])
model.summary()
# Train the model
callback = [tf.keras.callbacks.EarlyStopping(monitor='loss', min_delta=5.0, patience=15),
            tf.keras.callbacks.ReduceLROnPlateau(monitor='loss', factor=0.1, patience=5,
                                                 min_delta=5.00, min_lr=0)]
history = model.fit(x_train, y_train, epochs=50, batch_size=64, verbose =2, callbacks = callback)
The loss in the first epoch is about 120, and after two epochs about 70, but then it doesn't change anymore. So after two epochs my net isn't learning anymore.
I have already tried other loss functions, standardizing my labels (they range from 3 to 500 minutes), more neurons, another dense layer, and another activation function. But after two epochs the loss is always 70. My R_squared is around -0.02; it fluctuates, but always stays negative and near 0.
It seems like my network isn't learning at all.
Does anyone have an idea of what I am doing wrong?
Thanks for your help!

LSTM with Keras

I have some training data x_train and some corresponding labels for this x_train called y_train. Here is how x_train and y_train are constructed:
train_x = np.array([np.random.rand(1, 1000)[0] for i in range(10000)])
train_y = (np.random.randint(1,150,10000))
train_x has 10000 rows, each with 1000 columns.
train_y has a label between 1 and 150 for each sample in train_x, representing a code for that sample.
I also have a variable called sample, which is 1 row with 1000 columns, that I want to use for prediction with this LSTM model. It is defined as
sample = np.random.rand(1,1000)[0]
I am trying to train an LSTM on this data with Keras and use it for prediction. I want to take in this feature vector and use the LSTM to predict one of the codes in the range 1 to 150. I know these are random arrays, but I cannot post the data I actually have. I have tried the following approach, which I believe should work, but I am facing some issues:
model = Sequential()
model.add(LSTM(output_dim=32, input_length=10000, input_dim=1000, return_sequences=True))
model.add(Dense(150, activation='relu'))
model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
history = model.fit(train_x, train_y,
                    batch_size=128, nb_epoch=1,
                    verbose=1)
model.predict(sample)
Any help or adjustments to this pipeline would be great. I am not sure whether the output_dim is correct. I want to train the LSTM on each sample of the 1000-dimensional data and then predict a specific code in the range 1 to 150. Thank you.
I see at least three things you need to change:
Change this line:
model.add(Dense(150, activation='relu'))
to:
model.add(Dense(150, activation='softmax'))
as leaving 'relu' as activation makes your output unbounded whereas it needs to have a probabilistic interpretation (as you use categorical_crossentropy).
Change loss or target:
As you are using categorical_crossentropy, you need to change your target to a one-hot encoded vector of length 150. Alternatively, leave your target as it is and change the loss to sparse_categorical_crossentropy.
Change your target range:
Keras uses 0-based indexing (as in Python, C, and C++), so your values should be in the range [0, 150) instead of [1, 150]. A sketch combining all three changes follows below.
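Putting the three changes together, here is a minimal sketch; reshaping each 1000-column row into a 1000-step sequence of scalars, and using sparse_categorical_crossentropy instead of one-hot targets, are assumptions about how you want the model set up:

import numpy as np
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import LSTM, Dense

train_x = np.random.rand(10000, 1000).reshape(10000, 1000, 1)  # (samples, timesteps, features)
train_y = np.random.randint(1, 150, 10000) - 1                 # shift the 1-based codes to 0-based

model = Sequential()
model.add(LSTM(32, input_shape=(1000, 1)))    # return_sequences=False: one vector per sample
model.add(Dense(150, activation='softmax'))   # probability distribution over the 150 codes
model.compile(loss='sparse_categorical_crossentropy', optimizer='adam', metrics=['accuracy'])

history = model.fit(train_x, train_y, batch_size=128, epochs=1, verbose=1)

sample = np.random.rand(1, 1000, 1)
predicted_code = model.predict(sample).argmax(axis=-1) + 1     # map back to the 1-based codes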
