Optimizing a neural network with a genetic algorithm - optimize layers and weights - Python

I am trying to build a neural network model (a multi-task model) using the following code:
from tensorflow.keras.layers import Input, Dense, Dropout
from tensorflow.keras.models import Model

inp = Input((336,))
x = Dense(300, activation='relu')(inp)
x = Dense(256, activation='relu')(x)
x = Dense(128, activation='relu')(x)
x = Dropout(0.1)(x)
x = Dense(56, activation='relu')(x)
x = Dense(16, activation='relu')(x)
x = Dropout(0.1)(x)
out_reg = Dense(1, name='reg')(x)
out_class = Dense(1, activation='sigmoid', name='class')(x)  # I suppose it is a binary classification problem
model = Model(inp, [out_reg, out_class])
model.compile('adam', loss={'reg': 'mse', 'class': 'binary_crossentropy'},
              loss_weights={'reg': 0.5, 'class': 0.5})
Now I want to use a genetic algorithm to optimize the network's weights, the number of layers, and the number of neurons per layer, in Python.
I have gone through many tutorials, but I did not find any material discussing how to implement it.
Any help would be appreciated.

Initially, I think it is better to:
- Fix the architecture of the model,
- Work out how many trainable parameters there are and what format (shapes) they come in,
- Create a random population of trainable parameters,
- Define the objective function to optimize,
- Implement the GA operations (reproduction, crossover, mutation, etc.),
- Reshape each member of the population of weights and biases into the correct format,
- Run the ML model with those weights and biases,
- Get the loss and update the population, and
- Repeat the above process for a number of generations or until a stopping criterion is met (a rough sketch of this loop follows).
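A minimal sketch of that loop in plain NumPy, using the multi-task model from the question, could look like the code below. The helper names (vector_to_weights, fitness, evolve) and the simple selection/crossover/mutation operators are illustrative assumptions, not a standard recipe:

import numpy as np

def vector_to_weights(vector, model):
    # Reshape a flat parameter vector into the list-of-arrays format Keras expects.
    shapes = [w.shape for w in model.get_weights()]
    weights, idx = [], 0
    for shape in shapes:
        size = int(np.prod(shape))
        weights.append(vector[idx:idx + size].reshape(shape))
        idx += size
    return weights

def fitness(vector, model, x, y_reg, y_class):
    # Objective: the combined loss of the model with these weights (lower is better).
    model.set_weights(vector_to_weights(vector, model))
    return model.evaluate(x, [y_reg, y_class], verbose=0)[0]

def evolve(model, x, y_reg, y_class, pop_size=20, generations=50, mut_std=0.1):
    n_params = model.count_params()
    population = np.random.randn(pop_size, n_params) * 0.1
    for _ in range(generations):
        scores = np.array([fitness(ind, model, x, y_reg, y_class) for ind in population])
        parents = population[np.argsort(scores)[:pop_size // 2]]  # selection: keep the best half
        children = []
        for _ in range(pop_size - len(parents)):
            a, b = parents[np.random.randint(len(parents), size=2)]
            mask = np.random.rand(n_params) < 0.5                 # uniform crossover
            child = np.where(mask, a, b)
            child += np.random.randn(n_params) * mut_std          # Gaussian mutation
            children.append(child)
        population = np.vstack([parents, children])
    scores = np.array([fitness(ind, model, x, y_reg, y_class) for ind in population])
    model.set_weights(vector_to_weights(population[np.argmin(scores)], model))
    return model

Calling evolve(model, X, y_reg, y_class) with your training arrays then leaves the best weights found in the model; note that every fitness evaluation is a full pass over the data, which is exactly why this is much slower than gradient descent.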
Hope it helps.

If you are this new to machine learning, I would not recommend using genetic algorithms to optimize your weights. You have already compiled your model with "Adam", which is an excellent gradient-descent based optimizer that is going to do all of the work for you, and you should use that instead.
Check out the Tensorflow quickstart tutorial for more information https://www.tensorflow.org/tutorials/quickstart/beginner
Here's an example of how to implement genetic algorithms from a Google search... https://towardsdatascience.com/introduction-to-genetic-algorithms-including-example-code-e396e98d8bf3
If you want to do hyperparameter tuning with genetic algorithms, you can encode the hyperparameters of the network (number of layers, neurons per layer) as your genes. Evaluating the fitness will be very costly, because it involves training the network on the given task to obtain its final test loss.
If you want to do optimization with genetic algorithms, you can encode the model weights as genes, and the fitness would be directly related to the loss of the network.
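As a rough sketch of the hypertuning variant (the value ranges and helper names below are assumptions, and for brevity it builds a single-output classifier rather than the multi-task model above), the genome can simply be a list of layer widths:

import random
from tensorflow.keras.layers import Input, Dense
from tensorflow.keras.models import Model

def random_genome(max_layers=4):
    # A genome is just a list of layer widths, e.g. [256, 128, 16].
    return [random.choice([16, 32, 64, 128, 256]) for _ in range(random.randint(1, max_layers))]

def build_from_genome(genome, input_dim=336):
    # Decode a genome into the architecture it describes.
    inp = Input((input_dim,))
    x = inp
    for units in genome:
        x = Dense(units, activation='relu')(x)
    model = Model(inp, Dense(1, activation='sigmoid')(x))
    model.compile('adam', loss='binary_crossentropy')
    return model

def fitness(genome, x_train, y_train, x_val, y_val):
    # The costly part: train briefly and use the validation loss as the fitness (lower is better).
    model = build_from_genome(genome)
    model.fit(x_train, y_train, epochs=5, verbose=0)
    return model.evaluate(x_val, y_val, verbose=0)

Crossover and mutation then operate on these lists (e.g. swap a slice of widths, or randomly resize one layer), and the GA loop itself is the same as in the weight-encoding case.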

Related

which type of ANN should I use?

I am working on a project in which I have to predict methane production.
input: pH, temperature, solution concentration
output: methane production
I have used Keras with TensorFlow.
My questions are:
(As of now I have 60 experimental data points.) The accuracy is always 0.2-0.3. Why? Should I increase the number of data points?
I used the following code:
from keras.models import Sequential
from keras.layers import Dense

classifier = Sequential()
classifier.add(Dense(6, activation='relu', kernel_initializer='uniform', input_dim=9))
classifier.add(Dense(6, kernel_initializer='uniform', activation='relu'))
classifier.add(Dense(1, kernel_initializer='uniform', activation='sigmoid'))
classifier.compile(optimizer='adam', loss='mean_squared_error', metrics=['mean_squared_error'])
classifier.fit(X_train, y_train, batch_size=10, epochs=100)
It is possible to predict outputs other than binary ones, right? If not, which approach would be suitable for predicting non-binary values?
If you only have 60 data points, yes, definitely try to get more data. In general it is good to have hundreds (if not thousands) of data points to train a neural network effectively. Your network looks fine (assuming the relationship between those inputs and the output is fairly linear); if that is not the case, you could try making your hidden layer wider (more neurons).
It is definitely possible to predict other than binary outputs, in fact it looks like your network should be doing so. It really just depends on the activation function you put on your output layer. For example, softmax is good for classifying data when there are several possible labels. For binary classification, a sigmoid activation function is good. If you're just trying to predict an output quantity, you can probably just not have an activation function on your output.
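For instance, a minimal regression variant of such a network, with a linear (no activation) output layer, might look like this; the layer sizes are just illustrative:

from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense

regressor = Sequential([
    Dense(16, activation='relu', input_shape=(3,)),  # pH, temperature, solution concentration
    Dense(16, activation='relu'),
    Dense(1)                                         # linear output for a continuous quantity
])
regressor.compile(optimizer='adam', loss='mean_squared_error')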
Yes, you have to provide more data so the network can learn the pattern in the data points. And if the relationship is roughly linear, a plain linear regression may serve you better.
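For example, a quick linear baseline with scikit-learn (assuming X holds the three input columns and y the methane production) could be:

from sklearn.linear_model import LinearRegression

reg = LinearRegression().fit(X, y)
print(reg.score(X, y))  # R^2 as a quick sanity check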

solving ODE with neural network by fixed point theorem

I want to solve an ODE with a neural network and compare the result with the true solution $u(x) = x$.
Method
- Build a neural network, say $v(x; w)$, where $x$ is the 1-d input and $w$ are the weights.
- Set the loss function.
- Use either Adam or SGD to minimize the loss.
Question
In theory, this should converge to the true solution by the fixed point theorem.
But I do not know how to implement it, especially how to set up the loss function with the TensorFlow gradient.
Code
Here is some code up to the point where I got stuck.
import numpy as np
import tensorflow as tf

# create training data
x_train = np.arange(10) / 10.  # the derivatives will be taken at these points
y_train = np.zeros(x_train.shape)  # we do not actually need it

# model for the candidate function
model = tf.keras.models.Sequential([
    tf.keras.layers.Dense(10, input_shape=(1,), use_bias=True),
    tf.keras.layers.Dense(1)
])
Any opinion will be appreciated.
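Not an authoritative answer, but a minimal sketch of how the derivative term in the loss could be written with tf.GradientTape, assuming the ODE is $u'(x) = 1$ with boundary condition $u(0) = 0$ (consistent with the true solution $u(x) = x$):

import numpy as np
import tensorflow as tf

x_train = tf.constant(np.arange(10, dtype=np.float32).reshape(-1, 1) / 10.)

model = tf.keras.models.Sequential([
    tf.keras.layers.Dense(10, input_shape=(1,), use_bias=True),
    tf.keras.layers.Dense(1)
])
optimizer = tf.keras.optimizers.Adam()

def loss_fn():
    with tf.GradientTape() as inner_tape:
        inner_tape.watch(x_train)
        v = model(x_train)                        # candidate solution v(x; w)
    dv_dx = inner_tape.gradient(v, x_train)       # dv/dx at the training points
    residual = tf.reduce_mean((dv_dx - 1.0) ** 2)            # ODE residual: v'(x) - 1 = 0
    boundary = tf.square(model(tf.zeros((1, 1)))[0, 0])      # boundary condition: v(0) = 0
    return residual + boundary

for step in range(500):
    with tf.GradientTape() as tape:
        loss = loss_fn()
    grads = tape.gradient(loss, model.trainable_variables)
    optimizer.apply_gradients(zip(grads, model.trainable_variables))

The inner tape differentiates the network output with respect to its input (the ODE term), while the outer tape differentiates the resulting loss with respect to the weights for the optimizer step.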

Back-propagation with complex parameters in Tensorflow

Currently I'm trying to train on complex-valued data generated from telecom engineering models. The weights and biases are also complex. I have used the ReLU activation for the hidden layers as follows, at the l-th layer:
A_l = tf.complex(tf.nn.relu(tf.real(Z_l)), tf.nn.relu(tf.imag(Z_l)))
But how do I do this for the cost and the optimizer? I am really confused because I'm a beginner in machine learning. I have gone through some papers about non-analytic functions, but none of them helped with using the TensorFlow API. For example, how do I rewrite the functions below?
cost = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(logits = Z_out, labels = y))
optimizer = tf.train.AdamOptimizer(learning_rate = learning_rate).minimize(cost)
I have seen a recommendation to split the cost as real and imaginary parts as:
cost_R = .., cost_I = ...
but I didn't try it because I think the optimizer would then be split as well and the optimization would not work.
Any help is much appreciated.

Understanding Regularization in Keras

I am trying to understand why regularization syntax in Keras looks the way that it does.
Roughly speaking, regularization is a way to reduce overfitting by adding a penalty term to the loss function, proportional to some function of the model weights. Therefore, I would expect regularization to be defined as part of the specification of the model's loss function.
However, in Keras the regularization is defined on a per-layer basis. For instance, consider this regularized DNN model:
input = Input(name='the_input', shape=(None, input_shape))
x = Dense(units=250, activation='tanh', name='dense_1', kernel_regularizer=l2, bias_regularizer=l2, activity_regularizer=l2)(input)
x = Dense(units = 28, name='dense_2',kernel_regularizer=l2, bias_regularizer=l2, activity_regularizer=l2)(x)
y_pred = Activation('softmax', name='softmax')(x)
mymodel= Model(inputs=input, outputs=y_pred)
mymodel.compile(optimizer = 'adam', loss = 'categorical_crossentropy', metrics = ['accuracy'])
I would have expected that the regularization arguments in the Dense layer were not needed and I could just write the last line more like:
mymodel.compile(optimizer = 'adam', loss = 'categorical_crossentropy', metrics = ['accuracy'], regularization='l2')
This is obviously the wrong syntax, but I was hoping someone could elaborate a bit on why the regularizers are defined this way and what is actually happening when I use layer-level regularization.
The other thing I don't understand is under what circumstances I would use each, or all, of the three regularization options: kernel_regularizer, activity_regularizer, bias_regularizer.
Let's break down the components of your question:
Your expectation of regularisation is probably in line with a feed-forward network, where, yes, the penalty term is applied to the weights of the overall network. But this is not necessarily the case when you have RNNs mixed with CNNs etc., so Keras opts to give fine-grained control. Perhaps, for easy setup, a model-level regularisation applying to all weights could be added to the API.
When you use layer regularisation, the base Layer class actually adds the regularising term to the loss, which at training time penalises the corresponding layer's weights etc.
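You can see this mechanism directly: each regularised quantity contributes one scalar to the layer's, and hence the model's, losses collection, and Keras adds their sum to the compiled loss during training. A tiny illustration (the layer sizes are arbitrary):

from tensorflow.keras import Input, Model, regularizers
from tensorflow.keras.layers import Dense

inp = Input(shape=(10,))
out = Dense(4, kernel_regularizer=regularizers.l2(0.01))(inp)
model = Model(inp, out)
print(model.losses)  # a list with one scalar tensor: the L2 penalty on the Dense kernel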
Now, in Keras you can often apply regularisation to 3 different things, as in the Dense layer. Every layer has different kernels (recurrent ones, for example), so for this question let's look at the ones you are interested in, but roughly the same applies to all layers (a small example follows the list):
kernel: this applies to the actual weights of the layer; in Dense, it is the W of Wx + b.
bias: this is the layer's bias vector, so you can apply a different regulariser to it; the b in Wx + b.
activity: this is applied to the output vector, the y in y = f(Wx + b).
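For concreteness, a layer using all three might be written as follows; the input size and the 0.01 factors are purely illustrative:

from tensorflow.keras import regularizers
from tensorflow.keras.layers import Input, Dense
from tensorflow.keras.models import Model

inp = Input(shape=(100,))
x = Dense(
    250,
    activation='tanh',
    kernel_regularizer=regularizers.l2(0.01),    # penalty on W in y = f(Wx + b)
    bias_regularizer=regularizers.l2(0.01),      # penalty on b
    activity_regularizer=regularizers.l2(0.01),  # penalty on the output y itself
)(inp)
out = Dense(28, activation='softmax')(x)
model = Model(inp, out)
model.compile(optimizer='adam', loss='categorical_crossentropy')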

Predicting time series data with Neural Network in python

I'm a beginner in neural networks and I am trying to predict temperature values (the output) from 5 inputs in Python. I used the Keras package in Python to build the neural network.
Also, I used two approaches, a feedforward neural network (regression) and a recurrent neural network (LSTM), to predict the values. However, neither of them worked well for forecasting.
In the case of the feedforward neural network (regression), I used 3 hidden layers (with 100, 200, 300 neurons), as in the code below:
import numpy as np
import pandas as pd
from pandas import DataFrame
from keras.models import Sequential
from keras.layers import Dense
from keras.wrappers.scikit_learn import KerasRegressor
from sklearn.preprocessing import StandardScaler

def baseline_model():
    # create model
    model = Sequential()
    model.add(Dense(100, input_dim=5, kernel_initializer='normal', activation='sigmoid'))
    model.add(Dense(200, kernel_initializer='normal', activation='sigmoid'))
    model.add(Dense(300, kernel_initializer='normal', activation='sigmoid'))
    model.add(Dense(1, kernel_initializer='normal'))
    # Compile model
    model.compile(loss='mean_squared_error', optimizer='adam')
    return model

df = DataFrame({'Time': TIME_list, 'input1': input1_list, 'input2': input2_list, 'input3': input3_list, 'input4': input4_list, 'input5': input5_list, 'output': output_list})
df.index = pd.to_datetime(df.Time)
df = df.values
#Setting training data and test data
train_size_x = int(len(df)*0.8) #The user can change the range of training data
print(train_size_x)
X_train = df[0:train_size_x, 0:5]
t_train = df[0:train_size_x, 6]
X_test = df[train_size_x:int(len(df)), 0:5]
t_test = df[train_size_x:int(len(df)), 6]
# fix random seed for reproducibility
seed = 7
np.random.seed(seed)
scale = StandardScaler()
X_train = scale.fit_transform(X_train)
X_test = scale.transform(X_test)
#Regression in Keras package
clf = KerasRegressor(build_fn=baseline_model, nb_epoch=50, batch_size=5, verbose=0)
clf.fit(X_train,t_train)
res = clf.predict(X_test)
However, the error was quite big: the maximum absolute error was 78.4834. I tried to reduce it by changing the number of hidden layers or the number of neurons per hidden layer, but the error stayed about the same.
After the feedforward NN, I used a recurrent neural network (LSTM), which in my implementation predicts using only one input; in my case that input is temperature. It gives a much smaller error than the feedforward NN, but it feels a little ambiguous for my case, because it does not use the 5 inputs that affect the output (the temperature value) the way the feedforward regression above does.
And now I am lost as to what other kind of algorithm I should use.
Any suggestions or ideas for my case?
Thanks in advance.
I have to agree with the commenter on your question: you are jumping a little ahead of yourself. Neural networks can seem like black magic at times, and it is worth taking the time to understand what is actually going on under the hood. A good place to start learning and experimenting is scikit-learn (sklearn), because you can try different techniques easily; this will help you learn quickly how to structure your problems. There is also an abundance of info and tutorials.
From there, you will be better equipped to tackle your own NN from scratch. Additionally, sklearn has many useful functions to pre-process/normalize your training data, which is a whole art in itself.
There are tons of good networks already available for common situations. Most of the work is in choosing the right structure for your problem, getting good data to train on, and massaging that data so it can be utilized properly.
Check it out... http://scikit-learn.org/stable/
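For example, a first experiment could be as simple as the following (assuming X holds the 5 input columns and y the temperature; the hyperparameter values are just a starting point):

from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPRegressor
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# keep the time order by not shuffling the split
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, shuffle=False)

model = make_pipeline(
    StandardScaler(),                               # scaling matters a lot for neural networks
    MLPRegressor(hidden_layer_sizes=(100, 100), max_iter=1000, random_state=7),
)
model.fit(X_train, y_train)
print(model.score(X_test, y_test))                  # R^2 on the held-out data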
