I'm new to machine learning and trying to fit a sample data set with neural networks in python using tensorflow. After having implemented the neural network in Dymola I want to compare the outputs of the function with those from the neural network.
The sample data set is:
import tensorflow as tf
from keras import metrics
import numpy as np
from keras.models import *
from keras.layers import Dense, Dropout
from keras import optimizers
from keras.callbacks import *
import scipy.io as sio
import mat4py as m4p
inputs = np.linspace(0, 15, num=3000)
outputs = 1/7 * ((inputs/5)^3 - (inputs/3)^2 + 5)
Inputs and outputs are then scaled into the interval [0; 0.9]:
inputs_max = np.max(inputs)
inputs_min = np.min(inputs)
outputs_max = np.max(outputs)
outputs_min = np.min(outputs)
upper_bound = 0.9
lower_bound = 0
m_in = (upper_bound - lower_bound) / (inputs_max - inputs_min)
c_in = upper_bound - (m_in * inputs_max)
scaled_in = m_in * inputs + c_in
m_out = (upper_bound - lower_bound) / (outputs_max - outputs_min)
c_out = upper_bound - (m_out * outputs_max)
scaled_out = m_in * inputs + c_in
and after that the neural network is trained with:
# shuffle values
def shuffle_in_unison(a, b):
assert len(a) == len(b)
shuffled_a = np.empty(a.shape, dtype=a.dtype)
shuffled_b = np.empty(b.shape, dtype=b.dtype)
permutation = np.random.permutation(len(a))
for old_index, new_index in enumerate(permutation):
shuffled_a[new_index] = a[old_index]
shuffled_b[new_index] = b[old_index]
return shuffled_a, shuffled_b
tf_features_64 = scaled_in
tf_labels_64 = scaled_out
tf_features_32 = tf_features_64.astype(np.float32)
tf_labels_32 = tf_labels_64.astype(np.float32)
X = tf_features_32
Y = tf_labels_32
shuffle_in_unison(X, Y)
# define callbacks
filepath = "weights-improvement-{epoch:02d}-{val_loss:.2f}.hdf5"
savebestCallBack = ModelCheckpoint(filepath, monitor='val_loss', verbose=1,
save_best_only=True, save_weights_only=False, mode='auto', period=1)
tbCallBack = TensorBoard(log_dir='./Graph',
histogram_freq=5,
write_graph=True,
write_images=True)
esCallback = EarlyStopping(monitor='val_loss',
min_delta=0,
patience=500,
verbose=0,
mode='min')
# neural network architecture
visible = Input(shape=(1,))
x = Dense(40, activation='tanh')(visible)
x = Dense(39, activation='tanh')(x)
x = Dense(38, activation='tanh')(x)
x = Dense(30, activation='tanh')(x)
output = Dense(1)(x)
# setup optimizer
Optimizer = optimizers.adam(lr=0.0007, amsgrad=True)
model = Model(inputs=visible, outputs=output)
model.compile(optimizer=Optimizer,
loss=['mse'],
metrics=['mae', 'mse']
)
model.fit(X, Y, epochs=1000, batch_size=1, verbose=1,
shuffle=True, validation_split=0.05, callbacks=[tbCallBack, esCallback])
# return weights
weights1 = model.layers[1].get_weights()[0]
biases1 = model.layers[1].get_weights()[1]
print('Layer1---------------------------------------------------------------------------------------------------------')
print('weights1:')
print(repr(weights1.transpose()))
print('biases1:')
print(repr(biases1))
w1 = weights1.transpose()
b1 = biases1.transpose()
we1 = {'w1' : w1.tolist()}
bi1 = {'b1' : b1.tolist()}
.........
......
Later on, I implemented the trained neural network in the program "Dymola" by loading the weights and biases in pre-configured "neural network base classes" (which have been used several times and are working).
// Modelica code for Dymola:
Real inputs;
Real outputs;
Real scaled_outputs;
Real scaled_inputs(start=0);
Real scaled_outputsfunc;
der(scaled_inputs) = 0.9;
//part of the neural network implementation in Dymola
NeuralNetwork.BaseClasses.NeuralNetworkLayer neuralNetworkLayer1(
NeuronActivationFunction=NeuralNetwork.Types.ActivationFunction.TanSig,
numInputs=1,
numNeurons=40,
weightTable=[-0.367953330278397; ......])
annotation (Placement(transformation(extent={{-76,22},{-56,42}})));
//scaled inputs
neuralNetworkLayer1.u[1] = scaled_inputs;
//scaled outputs
neuralNetworkLayer5.y[1]= scaled_outputs;
//scaled_inputs = 0.06 * inputs
inputs = 1/0.06 * (scaled_inputs);
outputs = 1/875 * inputs^3 - 1/63 * inputs^2 + 5/7;
scaled_outputsfunc = 1.2173139581825052 * outputs - 0.3173139581825052;
When plotting and comparing the scaled outputs of the function and the returned (scaled) values of the neural network I noticed that the approximation is very good in the interval from [0.5; 0.8], but the closer the inputs reach the boundaries the worse the approximation becomes.
Unfortunately, I have no clue why this is happening and how to fix this issue. I'd be very glad if someone could help me.
I want to answer my own question: I forgot to specify the activation function in the output layer in my python code, which Keras then set to a linear function by default, see also:
https://keras.io/layers/core/
In Dymola, where my ANN was implemented, 'tanh' was the activation function in the last layer, which lead to a divergence near the boundaries.
The correct python code for this application must be:
visible = Input(shape=(1,))
x = Dense(40, activation='tanh')(visible)
x = Dense(39, activation='tanh')(x)
x = Dense(38, activation='tanh')(x)
x = Dense(30, activation='tanh')(x)
output = Dense(1, activation='tanh')(x)
Related
My question is in reference to the paper Learning Confidence for Out-of-Distribution Detection in Neural Networks.
I need help in creating a custom loss function in tensorflow 2.0+ as per the paper to get confident prediction from the CNN on a in distribution (if the image belongs to train categories) image while a low prediction for an out of distribution (any random image) image. The paper suggests adding a confidence estimation branch to any conventional feedforward architecture in parallel with the original class prediction branch (refer to image below)
In order to define the loss function, the softmax prediction probabilities are adjusted by interpolating between the original predictions(pi) and the target probability distribution y, where the degree of interpolation is indicated by the network’s confidence(c):
pi'= c · pi + (1 − c)yi and the final loss is :
I need help in implementing this along with the loss function in Tensorflow 2.0+, below is what I could think of, from my knowledge:
import tensorflow.keras.backend as k
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers
from tensorflow.keras.applications import ResNet50
#Defining custom loss function
def custom_loss(c):
def loss(y_true, y_pred):
interpolated_p = c*y_pred+ (1-c)*y_true
return -k.reduce_sum((k.log(interpolated_p) * y_true), axis=-1) - k.log(c)
return loss
#Defining model strcuture using resnet50
basemodel = ResNet50(weights = "imagenet",include_top = False)
headmodel = basemodel.output
headmodel = layers.AveragePooling2D(pool_size = (7,7))(headmodel)
#Add a sigmoid layer to the pooling output
conf_branch = layers.Dense(1,activation = "sigmoid",name = "confidence_branch")(headmodel)
# Add a softmax layer after the pooling output
softmax_branch = layers.Dense(10,activation = "softmax",name = "softmax_branch")(headmodel)
# Instantiate an end-to-end model predicting both confidence and class prediction
model = keras.Model(
inputs=basemodel.input,
outputs=[softmax_branch, conf_branch],
)
model.compile(loss=custom_loss(c=conf_branch.output), optimizer='rmsprop')
Appreciate any help on this ! Thanks !
The following is the code I wrote for the keras implementation:
num_classes = 10
basemodel = ResNet50(weights = "imagenet",include_top = False)
headmodel = basemodel.output
headmodel = layers.AveragePooling2D(pool_size = (7,7))(headmodel)
conf_branch = layers.Dense(1,activation = "sigmoid",name="confidence_branch")(headmodel)
softmax_branch = layers.Dense(num_classes,activation = "softmax",name = "softmax_branch")(headmodel)
output = Concatenate(axis=-1)([softmax_branch , conf_branch])
def custom_loss(y_true, y_pred, budget=0.3):
with tf.compat.v1.variable_scope("LAMBDA", reuse=tf.compat.v1.AUTO_REUSE):
LAMBDA = tf.compat.v1.get_variable("LAMBDA", dtype=tf.float32, initializer=tf.constant(0.1))
pred_original = y_pred[:, 0:num_classes]
confidence = y_pred[:, num_classes]
eps = 1e-12
pred_original = tf.clip_by_value(pred_original, 0. + eps, 1. - eps)
confidence = tf.clip_by_value(confidence, 0. + eps, 1. - eps)
b = np.random.uniform(size=y_true.shape[0], low=0.0, high=1.0)
conf = confidence * b + (1 - b)
conf = tf.expand_dims(conf, axis=-1)
pred_new = pred_original * conf + y_true * (1 - conf)
xentropy_loss = tf.reduce_mean(-tf.reduce_sum(y_true * tf.math.log(pred_new), axis=-1))
confidence_loss = tf.reduce_mean(-tf.math.log(confidence))
total_loss = xentropy_loss + LAMBDA * confidence_loss
def true_func():
return LAMBDA / 1.01
def false_func():
return LAMBDA / 0.99
LAMBDA_NEW = tf.cond(budget > confidence_loss, true_func, false_func)
LAMBDA.assign(LAMBDA_NEW)
# tf.print(LAMBDA)
return total_loss
def accuracy(y_true, y_pred):
y_pred = y_pred[:, :num_classes]
correct_pred = tf.equal(tf.argmax(y_pred, 1), tf.argmax(y_true, 1))
accuracy = tf.reduce_mean(tf.cast(correct_pred, tf.float32))
return accuracy
model = Model(inputs=basemodel.input, outputs=output)
optimizer = keras.optimizers.Adam(learning_rate=0.001)
model.compile(loss=custom_loss, optimizer=optimizer, metrics=[accuracy])
I am using TF2 (2.3.0) NN to approximate the function y which solves the ODE: y'+3y=0
I have defined cutsom loss class and function in which I am trying to differentiate the single output with respect to the single input so the equation holds, provided that y_true is zero:
from tensorflow.keras.losses import Loss
import tensorflow as tf
class CustomLossOde(Loss):
def __init__(self, x, model, name='ode_loss'):
super().__init__(name=name)
self.x = x
self.model = model
def call(self, y_true, y_pred):
with tf.GradientTape() as tape:
tape.watch(self.x)
y_p = self.model(self.x)
dy_dx = tape.gradient(y_p, self.x)
loss = tf.math.reduce_mean(tf.square(dy_dx + 3 * y_pred - y_true))
return loss
but running the following NN:
import tensorflow as tf
from tensorflow.keras.optimizers import Adam
from tensorflow.keras.models import Model
from tensorflow.keras.layers import Dense
from tensorflow.keras import Input
from custom_loss_ode import CustomLossOde
num_samples = 1024
x_train = 4 * (tf.random.uniform((num_samples, )) - 0.5)
y_train = tf.zeros((num_samples, ))
inputs = Input(shape=(1,))
x = Dense(16, 'tanh')(inputs)
x = Dense(8, 'tanh')(x)
x = Dense(4)(x)
y = Dense(1)(x)
model = Model(inputs=inputs, outputs=y)
loss = CustomLossOde(model.input, model)
model.compile(optimizer=Adam(learning_rate=0.01, beta_1=0.9, beta_2=0.99),loss=loss)
model.run_eagerly = True
model.fit(x_train, y_train, batch_size=16, epochs=30)
for now I am getting 0 loss from the fisrt epoch, which doesn't make any sense.
I have printed both y_true and y_test from within the function and they seem OK so I suspect that the problem is in the gradien which I didn't succeed to print.
Apprecitate any help
Defining a custom loss with the high level Keras API is a bit difficult in that case. I would instead write the training loop from scracth, as it allows a finer grained control over what you can do.
I took inspiration from those two guides :
Advanced Automatic Differentiation
Writing a training loop from scratch
Basically, I used the fact that multiple tape can interact seamlessly. I use one to compute the loss function, the other to calculate the gradients to be propagated by the optimizer.
import tensorflow as tf
from tensorflow.keras.optimizers import Adam
from tensorflow.keras.models import Model
from tensorflow.keras.layers import Dense
from tensorflow.keras import Input
num_samples = 1024
x_train = 4 * (tf.random.uniform((num_samples, )) - 0.5)
y_train = tf.zeros((num_samples, ))
inputs = Input(shape=(1,))
x = Dense(16, 'tanh')(inputs)
x = Dense(8, 'tanh')(x)
x = Dense(4)(x)
y = Dense(1)(x)
model = Model(inputs=inputs, outputs=y)
# using the high level tf.data API for data handling
x_train = tf.reshape(x_train,(-1,1))
dataset = tf.data.Dataset.from_tensor_slices((x_train,y_train)).batch(1)
opt = Adam(learning_rate=0.01, beta_1=0.9, beta_2=0.99)
for step, (x,y_true) in enumerate(dataset):
# we need to convert x to a variable if we want the tape to be
# able to compute the gradient according to x
x_variable = tf.Variable(x)
with tf.GradientTape() as model_tape:
with tf.GradientTape() as loss_tape:
loss_tape.watch(x_variable)
y_pred = model(x_variable)
dy_dx = loss_tape.gradient(y_pred, x_variable)
loss = tf.math.reduce_mean(tf.square(dy_dx + 3 * y_pred - y_true))
grad = model_tape.gradient(loss, model.trainable_variables)
opt.apply_gradients(zip(grad, model.trainable_variables))
if step%20==0:
print(f"Step {step}: loss={loss.numpy()}")
Let's say I want to train three models simultaneously (model1, model2, and model3) and while training have the models one and two share losses jointly with the main network (model1). So the main model can learn representations from the two other models in between layers.
Loss total = (weight1)loss m1 + (weight2)(loss m1 - loss m2) + (weight3)(loss m1 - loss m3)
So far I have the following:
def threemodel(num_nodes, num_class, w1, w2, w3):
#w1; w2; w3 are loss weights
in1 = Input((6373,))
enc1 = Dense(num_nodes)(in1)
enc1 = Dropout(0.3)(enc1)
enc1 = Dense(num_nodes, activation='relu')(enc1)
enc1 = Dropout(0.3)(enc1)
enc1 = Dense(num_nodes, activation='relu')(enc1)
out1 = Dense(units=num_class, activation='softmax')(enc1)
in2 = Input((512,))
enc2 = Dense(num_nodes, activation='relu')(in2)
enc2 = Dense(num_nodes, activation='relu')(enc2)
out2 = Dense(units=num_class, activation='softmax')(enc2)
in3 = Input((768,))
enc3 = Dense(num_nodes, activation='relu')(in3)
enc3 = Dense(num_nodes, activation='relu')(enc3)
out3 = Dense(units=num_class, activation='softmax')(enc3)
adam = Adam(lr=0.0001)
model = Model(inputs=[in1, in2, in3], outputs=[out1, out2, out3])
model.compile(loss='categorical_crossentropy', #continu together
optimizer='adam',
metrics=['accuracy'] not sure know what changes need to be made here)
## I am confused on how to formulate the shared losses equation here to share the losses of out2 and out3 with out1.
After searching a little it seems that can maybe do the following:
loss_1 = tf.keras.losses.categorical_crossentropy(y_true_1, out1)
loss_2 = tf.keras.losses.categorical_crossentropy(y_true_2, out2)
loss_3 = tf.keras.losses.categorical_crossentropy(y_true_3, out3)
model.add_loss((w1)*loss_1 + (w2)*(loss_1 - loss_2) + (w3)*(loss_1 - loss_3))
Can this work? I feel like by doing what I suggested above is not really doing what I want which is to have the main model (mod1) learn representations from the two other models (mod2 and mod3) in between layers.
Any suggestions?
Since you are not interested in using trainable weights (I label them coefficients to distinguish them from trainable weights) you can concatenate the outputs and pass them as single output to a custom loss function. This means that those coefficients will be available when the training will start.
You should provide a custom loss function as mentioned. A loss function is expected to take 2 arguments only so you should such a function aka categorical_crossentropy which should also be familiar with the parameters you are interested also like coeffs and num_class. So I instantiate a wrapper function with the arguments I want and then pass the inside actual loss function as the main loss function.
from tensorflow.keras.layers import Dense, Dropout, Input, Concatenate
from tensorflow.keras.optimizers import Adam
from tensorflow.keras.models import Model
from tensorflow.python.framework import ops
from tensorflow.python.framework import smart_cond
from tensorflow.python.ops import math_ops
from tensorflow.python.ops import array_ops
from tensorflow.python.keras import backend as K
def categorical_crossentropy_base(coeffs, num_class):
def categorical_crossentropy(y_true, y_pred, from_logits=False, label_smoothing=0):
"""Computes the categorical crossentropy loss.
Args:
y_true: tensor of true targets.
y_pred: tensor of predicted targets.
from_logits: Whether `y_pred` is expected to be a logits tensor. By default,
we assume that `y_pred` encodes a probability distribution.
label_smoothing: Float in [0, 1]. If > `0` then smooth the labels.
Returns:
Categorical crossentropy loss value.
https://github.com/tensorflow/tensorflow/blob/v1.15.0/tensorflow/python/keras/losses.py#L938-L966
"""
y_pred1 = y_pred[:, :num_class] # the 1st prediction
y_pred2 = y_pred[:, num_class:2*num_class] # the 2nd prediction
y_pred3 = y_pred[:, 2*num_class:] # the 3rd prediction
# you should adapt the ground truth to contain all 3 ground truth of course
y_true1 = y_true[:, :num_class] # the 1st gt
y_true2 = y_true[:, num_class:2*num_class] # the 2nd gt
y_true3 = y_true[:, 2*num_class:] # the 3rd gt
loss1 = K.categorical_crossentropy(y_true1, y_pred1, from_logits=from_logits)
loss2 = K.categorical_crossentropy(y_true2, y_pred2, from_logits=from_logits)
loss3 = K.categorical_crossentropy(y_true3, y_pred3, from_logits=from_logits)
# combine the losses the way you like it
total_loss = coeffs[0]*loss1 + coeffs[1]*(loss1 - loss2) + coeffs[2]*(loss2 - loss3)
return total_loss
return categorical_crossentropy
in1 = Input((6373,))
enc1 = Dense(num_nodes)(in1)
enc1 = Dropout(0.3)(enc1)
enc1 = Dense(num_nodes, activation='relu')(enc1)
enc1 = Dropout(0.3)(enc1)
enc1 = Dense(num_nodes, activation='relu')(enc1)
out1 = Dense(units=num_class, activation='softmax')(enc1)
in2 = Input((512,))
enc2 = Dense(num_nodes, activation='relu')(in2)
enc2 = Dense(num_nodes, activation='relu')(enc2)
out2 = Dense(units=num_class, activation='softmax')(enc2)
in3 = Input((768,))
enc3 = Dense(num_nodes, activation='relu')(in3)
enc3 = Dense(num_nodes, activation='relu')(enc3)
out3 = Dense(units=num_class, activation='softmax')(enc3)
adam = Adam(lr=0.0001)
total_out = Concatenate(axis=1)([out1, out2, out3])
model = Model(inputs=[in1, in2, in3], outputs=[total_out])
coeffs = [1, 1, 1]
model.compile(loss=categorical_crossentropy_base(coeffs=coeffs, num_class=num_class), optimizer='adam', metrics=['accuracy'])
I am not sure about the metrics regarding accuracy though. But I think it will work without other changes. I am also using K.categorical_crossentropy but you can freely change it with another implementation as well of course.
I'm trying to execute a Bayesian Neural Network that I found on the paper "Uncertainty on Deep Learning", Yarin Gal. I found this code on GitHub:
import math
from scipy.misc import logsumexp
import numpy as np
from keras.regularizers import l2
from keras import Input
from keras.layers import Dropout
from keras.layers import Dense
from keras import Model
import time
class net:
def __init__(self, X_train, y_train, n_hidden, n_epochs = 40,
normalize = False, tau = 1.0, dropout = 0.05):
"""
Constructor for the class implementing a Bayesian neural network
trained with the probabilistic back propagation method.
#param X_train Matrix with the features for the training data.
#param y_train Vector with the target variables for the
training data.
#param n_hidden Vector with the number of neurons for each
hidden layer.
#param n_epochs Number of epochs for which to train the
network. The recommended value 40 should be
enough.
#param normalize Whether to normalize the input features. This
is recommended unless the input vector is for
example formed by binary features (a
fingerprint). In that case we do not recommend
to normalize the features.
#param tau Tau value used for regularization
#param dropout Dropout rate for all the dropout layers in the
network.
"""
# We normalize the training data to have zero mean and unit standard
# deviation in the training set if necessary
if normalize:
self.std_X_train = np.std(X_train, 0)
self.std_X_train[ self.std_X_train == 0 ] = 1
self.mean_X_train = np.mean(X_train, 0)
else:
self.std_X_train = np.ones(X_train.shape[ 1 ])
self.mean_X_train = np.zeros(X_train.shape[ 1 ])
X_train = (X_train - np.full(X_train.shape, self.mean_X_train)) / \
np.full(X_train.shape, self.std_X_train)
self.mean_y_train = np.mean(y_train)
self.std_y_train = np.std(y_train)
y_train_normalized = (y_train - self.mean_y_train) / self.std_y_train
y_train_normalized = np.array(y_train_normalized, ndmin = 2).T
# We construct the network
N = X_train.shape[0]
batch_size = 128
lengthscale = 1e-2
reg = lengthscale**2 * (1 - dropout) / (2. * N * tau)
inputs = Input(shape=(X_train.shape[1],))
inter = Dropout(dropout)(inputs, training=True)
inter = Dense(n_hidden[0], activation='relu', W_regularizer=l2(reg))(inter)
for i in range(len(n_hidden) - 1):
inter = Dropout(dropout)(inter, training=True)
inter = Dense(n_hidden[i+1], activation='relu', W_regularizer=l2(reg))(inter)
inter = Dropout(dropout)(inter, training=True)
outputs = Dense(y_train_normalized.shape[1], W_regularizer=l2(reg))(inter)
model = Model(inputs, outputs)
model.compile(loss='mean_squared_error', optimizer='adam')
# We iterate the learning process
start_time = time.time()
model.fit(X_train, y_train_normalized, batch_size=batch_size, nb_epoch=n_epochs, verbose=0)
self.model = model
self.tau = tau
self.running_time = time.time() - start_time
# We are done!
def predict(self, X_test, y_test):
"""
Function for making predictions with the Bayesian neural network.
#param X_test The matrix of features for the test data
#return m The predictive mean for the test target variables.
#return v The predictive variance for the test target
variables.
#return v_noise The estimated variance for the additive noise.
"""
X_test = np.array(X_test, ndmin = 2)
y_test = np.array(y_test, ndmin = 2).T
# We normalize the test set
X_test = (X_test - np.full(X_test.shape, self.mean_X_train)) / \
np.full(X_test.shape, self.std_X_train)
# We compute the predictive mean and variance for the target variables
# of the test data
model = self.model
standard_pred = model.predict(X_test, batch_size=500, verbose=1)
standard_pred = standard_pred * self.std_y_train + self.mean_y_train
rmse_standard_pred = np.mean((y_test.squeeze() - standard_pred.squeeze())**2.)**0.5
T = 10000
Yt_hat = np.array([model.predict(X_test, batch_size=500, verbose=0) for _ in range(T)])
Yt_hat = Yt_hat * self.std_y_train + self.mean_y_train
MC_pred = np.mean(Yt_hat, 0)
rmse = np.mean((y_test.squeeze() - MC_pred.squeeze())**2.)**0.5
# We compute the test log-likelihood
ll = (logsumexp(-0.5 * self.tau * (y_test[None] - Yt_hat)**2., 0) - np.log(T)
- 0.5*np.log(2*np.pi) + 0.5*np.log(self.tau))
test_ll = np.mean(ll)
# We are done!
return rmse_standard_pred, rmse, test_ll
I'm new at programming, so I have to study Classes on Python to understand the code. But my answer goes when I try to execute the code, but it ask a "vector with the numbers of neurons for each hidden layer", and I don't know how to create this vector, and which does it mean for the code. I've tried to create different vectors, like
vector = np.array([1, 2, 3]) but sincerely I don't know the correct answer. The only I have is the feature data and the target data. I hope you can help me.
That syntax is correct vector = np.array([1, 2, 3]). That is the way to define a vector in python's numpy.
A neural network can have any number o hidden (internal) layers. And each layer will have a certain number of neurons.
So in this code, a vector=np.array([100, 150, 100]), means that the network should have 3 hidden layers (because the vector has 3 values), and the hidden layers should have, from input to output 100, 150, 100 neurons respectively.
So, there is the universal approximation theorem which says that a neural network can approximate any continuous function, provided it has at least one hidden layer and uses non-linear activation there.
So my doubt is as follows: "How do I approximate a function using neural networks with my input being other functions?"
Let's say I want to approximate y = x + 1 and I have z_1 = 2x, z_2 = 3x + 3 and z_3 = 4x + 1, with x being time variant. What I want my model to learn is the relationship between z_1, z_2, z_3 and y, as I may write *y = -6 * z_1 - 1 * z_2 + 4 z_3* ( I want my network to learn this relationship).
From time 0 to T I have the value of all functions and can do a supervised learning, but from (T + 1) +, I will only have z_1, z_2 and z_3 and so, I would be using the network to approximate the future values of y based on these z functions (z_1, z_2, z_3).
How do I implement that on python using Keras? I used the following code but didn't get any decent results.
import numpy as np
import matplotlib.pyplot as plt
import keras
from keras.models import Sequential
from keras.layers import Dense, Dropout
from keras.optimizers import RMSprop
n = 10000
def z_1(x):
x_0 = []
for i in x:
x_0.append(2*i)
return x_0
def z_2(x):
x_0 = []
for i in x:
x_0.append(3*i + 3)
return x_0
def z_3(x):
x_0 = []
for i in x:
x_0.append(4* i + 1)
return x_0
def z_0(x):
x_0 = []
for i in x:
x_0.append(i + 1)
return x_0
model = Sequential()
model.add(Dense(500, activation='relu', input_dim=3))
model.add(Dense(500, activation='relu'))
model.add(Dense(1, activation='linear'))
model.compile(loss='mean_squared_error', optimizer='adam', metrics=['accuracy'])
np.random.seed(seed = 2000)
input = np.random.random(n) * 10
dataset = z_0(input)
input_1 = z_1(input)
input_2 = z_2(input)
input_3 = z_3(input)
x_train = np.array([input_1[0:int(0.8*n)], input_2[0:int(0.8*n)], input_3[0:int(0.8*n)]])
y_train = np.array([dataset[0:int(0.8*n)]])
x_train = x_train.reshape(int(0.8*n), 3)
y_train = y_train.reshape(int(0.8*n),1)
es = keras.callbacks.EarlyStopping(monitor='val_loss',
min_delta=0,
patience=0,
verbose=0, mode='auto')
model.fit(x_train, y_train, epochs=100, batch_size=128, callbacks = [es])
x_test = np.array([input_1[int(n-100):n], input_2[int(n-100):n], input_3[int(n-100):n]])
x_test = x_test.reshape(int(100), 3)
classes = model.predict(x_test, batch_size=128)
y_test = np.array([dataset[int(n-100):n]]).reshape(int(100),1)
plt.plot(y_test,c='b', label = 'test data')
plt.plot(classes,c='r', label = 'test result')
plt.legend()
plt.show()
You can't do this with a feedforward neural network. You need to do this with recurrent neural networks. Look up LSTM or GRU cells in Keras.
https://keras.io/layers/recurrent/