batch normalization, yes or no?

I use Tensorflow 1.14.0 and Keras 2.2.4. The following code implements a simple neural network:
import numpy as np
np.random.seed(1)
import random
random.seed(2)
import tensorflow as tf
tf.set_random_seed(3)
from tensorflow.keras.models import Model, Sequential
from tensorflow.keras.layers import Input, Dense, Activation
x_train=np.random.normal(0,1,(100,12))
model = Sequential()
model.add(Dense(8, input_shape=(12,)))
# model.add(tf.keras.layers.BatchNormalization())
model.add(Activation('linear'))
model.add(Dense(12))
model.add(Activation('linear'))
model.compile(optimizer='adam', loss='mean_squared_error')
model.fit(x_train, x_train,epochs=20, validation_split=0.1, shuffle=False,verbose=2)
The final val_loss after 20 epochs is 0.7751. When I uncomment the only comment line to add the batch normalization layer, the val_loss changes to 1.1230.
My main problem is way more complicated, but the same thing occurs. Since my activation is linear, it does not matter if I put the batch normalization after or before the activation.
Questions: Why can't batch normalization help? Is there anything I can change so that batch normalization improves the result without changing the activation functions?
Update after getting a comment:
An NN with one hidden layer and linear activations is kind of like PCA. There are tons of papers on this. For me, this setting gives minimal MSE among all combinations of activation functions for the hidden layer and output.
Some resources that state linear activations mean PCA:
https://arxiv.org/pdf/1702.07800.pdf
https://link.springer.com/article/10.1007/BF00275687
https://www.quora.com/How-can-I-make-a-neural-network-to-work-as-a-PCA
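The PCA connection can be checked without any training. The sketch below is a NumPy-only illustration, not the asker's code. It assumes data of the same shape as x_train (100 samples, 12 features) and computes the best rank-8 linear reconstruction via SVD. By the Eckart-Young theorem, this is the lowest MSE that a linear network with an 8-unit bottleneck can reach on the data it is fitted to (a training MSE, not a validation loss).

```python
import numpy as np

rng = np.random.default_rng(1)
x = rng.normal(0, 1, (100, 12))        # same shape as x_train in the question

# Center the data; in the network, the Dense biases can absorb the mean.
mean = x.mean(axis=0)
xc = x - mean

# SVD of the centered data; the rows of vt are the principal directions.
u, s, vt = np.linalg.svd(xc, full_matrices=False)

k = 8                                  # bottleneck width
encoder = vt[:k].T                     # (12, 8): role of the first Dense kernel
decoder = vt[:k]                       # (8, 12): role of the second Dense kernel

x_hat = xc @ encoder @ decoder + mean
pca_mse = np.mean((x - x_hat) ** 2)

# Equivalent closed form: the discarded singular values give the residual.
closed_form_mse = np.sum(s[k:] ** 2) / x.size

print(f"rank-{k} PCA reconstruction MSE: {pca_mse:.4f}")
```

A trained linear autoencoder can approach this value on the training data but cannot go below it, which gives a baseline to compare the reported losses against.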

Yes.
The behavior you're observing is a bug - and you don't need BN to see it; plot to the left is for #V1, to the right is for #V2:
#V1
model = Sequential()
model.add(Dense(8, input_shape=(12,)))
#model.add(Activation('linear')) <-- uncomment == #V2
model.add(Dense(12))
model.compile(optimizer='adam', loss='mean_squared_error')
Clearly nonsensical, as Activation('linear') after a layer with activation=None (=='linear') is an identity: model.layers[1].output.name == 'activation/activation/Identity:0'. This can be confirmed further by fetching and plotting intermediate layer outputs, which are identical for 'dense' and 'activation' - will omit here.
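To make the "identity" point concrete, here is a small NumPy sketch (hypothetical random weights standing in for the two Dense layers, no Keras involved). It shows that, mathematically, #V1 and #V2 compute the same function, and that both collapse to a single affine map.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=(10, 12))

# Hypothetical stand-ins for the two Dense layers (kernel and bias each).
w1, b1 = rng.normal(size=(12, 8)), rng.normal(size=8)
w2, b2 = rng.normal(size=(8, 12)), rng.normal(size=12)

def linear(z):
    """The 'linear' activation: returns its input unchanged."""
    return z

out_v1 = (x @ w1 + b1) @ w2 + b2            # V1: no Activation layer
out_v2 = linear(x @ w1 + b1) @ w2 + b2      # V2: Activation('linear') in between

# Both variants collapse to one affine map x @ w + b.
w, b = w1 @ w2, b1 @ w2 + b2
out_collapsed = x @ w + b

print(np.array_equal(out_v1, out_v2), np.allclose(out_v1, out_collapsed))
```

Since the two variants are the same function, differing training curves under identical seeds point at the framework rather than at the model.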
So, the activation does literally nothing, except it doesn't - somewhere along the commit chain between 1.14.0 and 2.0.0, this was fixed, though I don't know where. Results w/ BN using TF 2.0.0 w/ Keras 2.3.1 below:
val_loss = 0.840 # without BN
val_loss = 0.819 # with BN
Solution: update to TensorFlow 2.0.0, Keras 2.3.1.
Tip: use Anaconda w/ virtual environment. If you don't have any virtual envs yet, run:
conda create --name tf2_env --clone base
conda activate tf2_env
conda uninstall tensorflow-gpu
conda uninstall keras
conda install -c anaconda tensorflow-gpu==2.0.0
conda install -c conda-forge keras==2.3.1
May be a bit more involved than this, but that's subject of another question.
UPDATE: importing from keras instead of tf.keras also solves the problem.
Disclaimer: BN remains a 'controversial' layer in Keras, yet to be fully fixed - see Relevant Git; I plan on investigating it myself eventually, but for your purposes, this answer's fix should suffice.
I also recommend familiarizing yourself with BN's underlying theory, in particular its train vs. inference operation; in a nutshell, batch sizes under 32 are a pretty bad idea, and the dataset should be large enough for BN's moving mean and variance to approximate the test-set statistics accurately (gamma and beta are learned parameters and are applied unchanged at inference).
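For the train vs. inference difference, a minimal NumPy sketch may help. This is an illustration only, not Keras' implementation; the class name is made up, the momentum and epsilon values mirror the Keras defaults (0.99 and 1e-3), and gamma and beta are kept fixed instead of being learned.

```python
import numpy as np

class BatchNorm1D:
    """Minimal batch normalization over the batch axis (illustrative only)."""

    def __init__(self, num_features, momentum=0.99, eps=1e-3):
        self.gamma = np.ones(num_features)       # scale (kept fixed here)
        self.beta = np.zeros(num_features)       # shift (kept fixed here)
        self.moving_mean = np.zeros(num_features)
        self.moving_var = np.ones(num_features)
        self.momentum = momentum
        self.eps = eps

    def __call__(self, x, training):
        if training:
            # Training: normalize with the statistics of the current batch...
            mean, var = x.mean(axis=0), x.var(axis=0)
            # ...and update the moving averages that inference will use.
            m = self.momentum
            self.moving_mean = m * self.moving_mean + (1 - m) * mean
            self.moving_var = m * self.moving_var + (1 - m) * var
        else:
            # Inference: use the accumulated moving averages instead.
            mean, var = self.moving_mean, self.moving_var
        return self.gamma * (x - mean) / np.sqrt(var + self.eps) + self.beta

rng = np.random.default_rng(0)
data = rng.normal(loc=5.0, scale=2.0, size=(100, 8))

bn = BatchNorm1D(8)
for epoch in range(20):
    for i in range(0, 90, 10):                   # 9 batches of 10 per epoch
        train_out = bn(data[i:i + 10], training=True)

test_out = bn(data[90:], training=False)         # held-out last 10 samples
print("train-mode batch mean:", train_out.mean(axis=0).round(3))
print("inference-mode mean:  ", test_out.mean(axis=0).round(3))
```

With only 180 updates at momentum 0.99, the moving mean has not yet caught up with the data mean, so the inference-mode output is not centered even though every training batch was. That is one way a small dataset can make BN look worse at validation time.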
Code used:
x_train=np.random.normal(0, 1, (100, 12))
model = Sequential()
model.add(Dense(8, input_shape=(12,)))
#model.add(Activation('linear'))
#model.add(tf.keras.layers.BatchNormalization())
model.add(Dense(12))
model.compile(optimizer='adam', loss='mean_squared_error')
W_sum_all = []  # fit rewritten to allow runtime weight collection
for _ in range(20):
    for i in range(9):
        x = x_train[i*10:(i+1)*10]
        model.train_on_batch(x, x)
        W_sum_all.append([])
        for layer in model.layers:
            if layer.trainable_weights != []:
                W_sum_all[-1] += [np.sum(layer.get_weights()[0])]
# evaluate on the held-out last 10 samples (what validation_split=0.1 would use)
model.evaluate(x_train[-10:], x_train[-10:])
plt.plot(W_sum_all)
plt.title("Sum of weights (#V1)", weight='bold', fontsize=14)
plt.legend(labels=["dense", "dense_1"], fontsize=14)
plt.gcf().set_size_inches(7, 4)
Imports/pre-executions:
import numpy as np
np.random.seed(1)
import random
random.seed(2)
import tensorflow as tf
if tf.__version__[0] == '2':
    tf.random.set_seed(3)
else:
    tf.set_random_seed(3)
import matplotlib.pyplot as plt
from tensorflow.keras.models import Model, Sequential
from tensorflow.keras.layers import Input, Dense, Activation

Related

Save Keras model for production mode without TensorFlow

I want to save a trained Keras model, so that it can be used in the Django REST backend of an application. I did a lot of research, but it seems there isn't any way to use these models without TensorFlow installed.
So, what is the use of this storage? I don't want to install a heavy library like TensorFlow on the server. I tested saving with pickle and joblib, as well as Keras' own model.save().
Is there a way to load this model without installing TensorFlow and only with Keras itself?
This is a part of my code,
from keras.models import Sequential
from keras.layers import Dense, LSTM, Dropout
xtrain, ytrain = np.array(xtrain), np.array(ytrain)
ytrain = np.reshape(ytrain, (ytrain.shape[0], 1, 1))
model = Sequential()
model.add(LSTM(150, return_sequences=True, input_shape=(xtrain.shape[1], 1)))
model.add(LSTM(150, return_sequences=False))
model.add(Dense(25))
model.add(Dense(1))
model.compile(optimizer='adam', loss='mean_squared_error')
model.fit(xtrain, ytrain, batch_size=1, epochs=7)
model.save('model.h5')
which normally works perfectly, but if I use the model elsewhere, I get this error:
ModuleNotFoundError: No module named 'tensorflow'

You do not need to use TensorFlow in production. You can take the trained coefficients and apply them with the equivalent functions of your own programming language.
Sample: input array, time coefficient matrices, and unboxed system inputs to output with a feedback system in the box containers.
temp = tf.random.normal([10], 1, 0.2, tf.float32)
temp = np.asarray(temp) * np.asarray([ coefficient_0, coefficient_1, coefficient_2, coefficient_3, coefficient_4, coefficient_5, coefficient_6, coefficient_7, coefficient_8, coefficient_9 ]) #action = actions['up']
temp = tf.nn.softmax(temp)
action = int(np.argmin(temp))
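One way to read the idea above: export the trained weights once on a machine that has TensorFlow (for example via model.get_weights()) and re-implement the forward pass with NumPy on the server. The sketch below is an assumption-laden illustration, not a drop-in solution. The weights are random placeholders, the key names are made up, and only the two Dense layers of the question's model are covered; the LSTM layers would need their gate equations re-implemented as well.

```python
import io
import numpy as np

# Placeholder weights with the shapes of Dense(25) and Dense(1) from the
# question's model (the last LSTM layer outputs 150 features).
rng = np.random.default_rng(0)
weights = {
    "dense_kernel": rng.normal(size=(150, 25)),
    "dense_bias": np.zeros(25),
    "dense_1_kernel": rng.normal(size=(25, 1)),
    "dense_1_bias": np.zeros(1),
}

# Round-trip through the .npz format, which needs only NumPy to read.
buffer = io.BytesIO()          # in-memory file; on a server this would be a path
np.savez(buffer, **weights)
buffer.seek(0)
loaded = np.load(buffer)

def dense(x, kernel, bias):
    """Forward pass of a Dense layer without activation: x @ kernel + bias."""
    return x @ kernel + bias

def predict(features):
    hidden = dense(features, loaded["dense_kernel"], loaded["dense_bias"])
    return dense(hidden, loaded["dense_1_kernel"], loaded["dense_1_bias"])

features = rng.normal(size=(4, 150))   # stand-in for the last LSTM layer's output
prediction = predict(features)
print(prediction.shape)
```

If re-implementing the LSTM by hand is too much work, a dedicated lightweight inference runtime may be the more practical route, but that is outside what this answer covers.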

Difference between Experimental Preprocessing layers and normal preprocessing layers in Tensorflow

import tensorflow as tf
import keras
import tensorflow.keras.layers as tfl
from tensorflow.keras.layers.experimental.preprocessing import RandomFlip, RandomRotation
I am trying to figure out which I should use for Data Augmentation. In the documentation, there is:
tf.keras.layers.RandomFlip and RandomRotation
Then tf.keras.layers.experimental.preprocessing contains the same things, RandomFlip and RandomRotation.
Which should I use? I've seen guides that use both.
This is my current code:
def data_augmenter():
    data_augmentation = tf.keras.Sequential([
        tfl.RandomFlip(),
        tfl.RandomRotation(0.2)
    ])
    return data_augmentation
and this is a part of my model:
def ResNet50(image_shape = IMG_SIZE, data_augmentation=data_augmenter()):
    input_shape = image_shape + (3,)
    # Remove top layer in order to put mine with the correct classification labels, get weights for imageNet
    base_model = tf.keras.applications.resnet_v2.ResNet50V2(input_shape=input_shape, include_top=False, weights='imagenet')
    # Freeze base model
    base_model.trainable = False
    # Define input layer
    inputs = tf.keras.Input(shape=input_shape)
    # Apply Data Augmentation
    x = data_augmentation(inputs)
I am a bit confused here..

If you find something in an experimental module and something in the same package by the same name, these will typically be aliases of one another. For the sake of backwards compatibility, they don't remove the experimental one (at least not for a few iterations.)
You should generally use the non-experimental one if it exists, since this is considered stable and should not be removed or changed later.
The following page shows the Keras experimental preprocessing module. If it redirects to the preprocessing module, it's an alias. https://www.tensorflow.org/api_docs/python/tf/keras/layers/experimental/preprocessing
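If the same code has to run on installations where only one of the two locations exists, a small fallback import can prefer the stable name. The helper below is a hypothetical sketch (first_available is not part of TensorFlow); it is demonstrated with the standard library so that it runs anywhere.

```python
import importlib

def first_available(*dotted_names):
    """Return the first attribute that can be imported from the given dotted paths."""
    for dotted in dotted_names:
        module_name, _, attr = dotted.rpartition(".")
        try:
            module = importlib.import_module(module_name)
            return getattr(module, attr)
        except (ImportError, AttributeError):
            continue  # this location does not exist here; try the next one
    raise ImportError(f"none of {dotted_names} could be imported")

# Stand-in demo: the first path does not exist, so the second one is used.
sqrt = first_available("math.no_such_function", "math.sqrt")
print(sqrt(16.0))
```

With TensorFlow installed, the same helper could be called as first_available("tensorflow.keras.layers.RandomFlip", "tensorflow.keras.layers.experimental.preprocessing.RandomFlip"), listing the stable path first.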

How to use Tensorflow addons' metrics correctly in functional API?

I have an LSTM model to perform binary classification of human activities using multivariate smartphone sensor data. The two classes are imbalanced (1:50). Therefore I would like to use F1-score as a metric, but I saw that it was deprecated as a metric.
Before it was best practice to use a callback function for the metric to ensure it was applied on the whole dataset, however, recently the TensorFlow addons reintroduced the F1-Score.
I now have trouble applying this score to my functional API model. Here is the code I am currently running:
import tensorflow as tf
import tensorflow_addons as tfa
from tensorflow import keras
def create_model(n_neurons=150, learning_rate=0.01, activation="relu", loss="binary_crossentropy"):
    #create input layer and assign to current output layer
    input_ = keras.layers.Input(shape=(X_train.shape[1],X_train.shape[2]))
    #add LSTM layer
    lstm = keras.layers.LSTM(n_neurons, activation=activation)(input_)
    #Output Layer
    output = keras.layers.Dense(1, activation="sigmoid")(lstm)
    #Create Model
    model = keras.models.Model(inputs=[input_], outputs=[output])
    #Add optimizer
    optimizer=keras.optimizers.SGD(lr=learning_rate, clipvalue=0.5)
    #Compile model
    model.compile(loss=loss, optimizer=optimizer, metrics=[tfa.metrics.F1Score(num_classes=2, average="micro")])
    print(model.summary())
    return model
#Create the model
model = create_model()
#fit the model
history = model.fit(X_train,y_train,
                    epochs=300,
                    validation_data=(X_val, y_val))
If I use another value for the metric argument average (e.g., average=None or average="macro") then I get an error message when fitting the model:
ValueError: Dimension 0 in both shapes must be equal, but are 2 and 1. Shapes are [ 2 ] and [ 1 ]. for 'AssignAddVariableOp' (op: 'AssignAddVariableOp') with input shapes: [ ], [ 1 ].
And if I use the value average="micro" I am not getting the error, but the F1-score is 0 throughout the learning process, while my loss decreases.
I believe I am still doing something wrong here. Can anybody provide an explanation for me?

Updated answer: The crucial bit is to import tf.keras, not keras. Then you can use e.g. tf.keras.metrics.Precision or tfa.metrics.F1Score without problems. See also here.
Old answer:
The problem with tensorflow-addons is that the implementation of the current release (0.6.0) only counts exact matches, such that a comparison e.g. of 1 and 0.99 yields 0. Of course, this is practically useless in a neural network. This has been fixed in 0.7.0 (not yet released). You can install it as follows:
pip3 install --upgrade pip
pip3 install tfa-nightly
and then use a threshold (everything below the threshold is counted as 0, otherwise as 1):
tfa.metrics.FBetaScore(num_classes=2,average="micro",threshold=0.9)
See also https://github.com/tensorflow/addons/issues/490.
The problem with other values for average is discussed here: https://github.com/tensorflow/addons/issues/746.
Beware that there are two other problems that probably lead to useless results, see also https://github.com/tensorflow/addons/issues/818:
the model uses binary classification, but f1-score in tfa assumes categorical classification with one-hot encoding
f1-score is called at each batch step at validation.
These problems should not appear when using the Keras metrics.
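To see why exact-match counting gives an F1 of 0 on sigmoid outputs, and what the threshold changes, here is a NumPy-only sketch. It is my own illustration of the idea, not the tensorflow-addons implementation; binary_f1 and the sample values are made up. It also computes the score over the whole array at once, which is what the older callback approach achieved.

```python
import numpy as np

def binary_f1(y_true, y_prob, threshold=0.5):
    """F1 for binary labels; scores below the threshold count as 0, otherwise 1."""
    y_true = np.asarray(y_true).astype(bool)
    y_pred = np.asarray(y_prob) >= threshold
    tp = np.sum(y_true & y_pred)       # true positives
    fp = np.sum(~y_true & y_pred)      # false positives
    fn = np.sum(y_true & ~y_pred)      # false negatives
    denom = 2 * tp + fp + fn
    return float(2 * tp / denom) if denom else 0.0

y_true = [1, 0, 1, 1, 0, 0]
y_prob = [0.99, 0.2, 0.6, 0.4, 0.7, 0.1]   # sigmoid outputs, never exactly 1

f1_thresholded = binary_f1(y_true, y_prob, threshold=0.5)

# Exact-match counting: a prediction only counts as positive if it equals 1.
f1_exact = binary_f1(y_true, np.asarray(y_prob) == 1.0)

print(f"thresholded F1: {f1_thresholded:.3f}, exact-match F1: {f1_exact:.3f}")
```

With exact matching there are no predicted positives at all, so the score stays at 0 no matter how far the loss decreases, which matches the behavior described in the question.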

In this example, I will show how to use the AdamW optimizer and the F1 score with TensorFlow.
lr_schedule = keras.optimizers.schedules.ExponentialDecay(
    initial_learning_rate, decay_steps=100000, decay_rate=0.96, staircase=True
)
import tensorflow_addons as tfa
f1 = tfa.metrics.F1Score(num_classes=2, average=None)
model=(..)
model.compile(
    loss="binary_crossentropy",
    optimizer=tfa.optimizers.AdamW(learning_rate=lr_schedule,weight_decay = 0.0001),
    metrics=["acc",'AUC',f1],
)

keras load model error trying to load a weight file containing 17 layers into a model with 0 layers

I am currently working on a VGG16 model with Keras.
I fine-tune the VGG model with some layers of my own.
After fitting (training) the model, I save it with model.save('name.h5').
It saves without problems.
However, when I try to reload the model with the load_model function, it shows this error:
You are trying to load a weight file containing 17 layers into a model
with 0 layers
Has anyone run into this problem before?
My Keras version is 2.2.
Here is part of my code ...
from keras.models import load_model
vgg_model = VGG16(weights='imagenet',include_top=False,input_shape=(224,224,3))
global model_2
model_2 = Sequential()
for layer in vgg_model.layers:
    model_2.add(layer)
for layer in model_2.layers:
    layer.trainable= False
model_2.add(Flatten())
model_2.add(Dense(128, activation='relu'))
model_2.add(Dropout(0.5))
model_2.add(Dense(2, activation='softmax'))
model_2.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])
model_2.fit(x=X_train,y=y_train,batch_size=32,epochs=30,verbose=2)
model_2.save('name.h5')
del model_2
model_2 = load_model('name.h5')
Actually I do not delete the model and then load_model immediately,
just for showing my problem.

It seems that this problem is related with the input_shape parameter of the first layer. I had this problem with a wrapper layer (Bidirectional) which did not have an input_shape parameter set. In code:
model.add(Bidirectional(LSTM(units=units, input_shape=(None, feature_size)), merge_mode='concat'))
did not work for loading my old model because the input_shape is only defined for the LSTM layer not the outer one. Instead
model.add(Bidirectional(LSTM(units=units), input_shape=(None, feature_size), merge_mode='concat'))
worked because the wrapper Bidirectional layer now has an input_shape parameter. You could check whether the VGG net's input_shape parameter is set, or add a single input layer with the correct input_shape parameter to your model.

I spent 6 hours looking for a solution to apply my trained model.
Finally I tried building VGG16 as the model and loading the .h5 weights I had trained myself, and it worked!
weights_model='C:/Anaconda/weightsnew2.h5' # my already trained weights .h5
vgg=applications.vgg16.VGG16()
cnn=Sequential()
for capa in vgg.layers:
    cnn.add(capa)
cnn.layers.pop()
for layer in cnn.layers:
    layer.trainable=False
cnn.add(Dense(2,activation='softmax'))
cnn.load_weights(weights_model)
def predict(file):
    x = load_img(file, target_size=(longitud, altura))
    x = img_to_array(x)
    x = np.expand_dims(x, axis=0)
    array = cnn.predict(x)
    result = array[0]
    respuesta = np.argmax(result)
    if respuesta == 0:
        print("Gato")
    elif respuesta == 1:
        print("Perro")

In case anyone is still wondering about this error:
I had the same problem and spent days figuring out what was causing it. I have a copy of my whole code and dataset on another system, where it worked. I noticed that it had something to do with training, because without training my model, saving and loading was no problem.
The only difference between my systems was that I was using tensorflow-gpu on my main system, and for this reason the TensorFlow base version was a little lower (1.14.0 instead of 2.2.0). So all I had to do was use
model.fit_generator()
instead of
model.fit()
before saving it. And it works

"Could not interpret optimizer identifier" error in Keras

I got this error when I tried to modify the learning rate parameter of SGD optimizer in Keras. Did I miss something in my code or my Keras was not installed properly?
Here is my code:
from tensorflow.python.keras.models import Sequential
from tensorflow.python.keras.layers import Dense, Flatten, GlobalAveragePooling2D, Activation
import keras
from keras.optimizers import SGD
model = Sequential()
model.add(Dense(64, kernel_initializer='uniform', input_shape=(10,)))
model.add(Activation('softmax'))
model.compile(loss='mean_squared_error', optimizer=SGD(lr=0.01), metrics=['accuracy'])
and here is the error message:
Traceback (most recent call last): File
"C:\TensorFlow\Keras\ResNet-50\test_sgd.py", line 10, in
model.compile(loss='mean_squared_error', optimizer=SGD(lr=0.01), metrics=['accuracy']) File
"C:\Users\nsugiant\AppData\Local\Programs\Python\Python35\lib\site-packages\tensorflow\python\keras_impl\keras\models.py",
line 787, in compile
**kwargs) File "C:\Users\nsugiant\AppData\Local\Programs\Python\Python35\lib\site-packages\tensorflow\python\keras_impl\keras\engine\training.py",
line 632, in compile
self.optimizer = optimizers.get(optimizer) File "C:\Users\nsugiant\AppData\Local\Programs\Python\Python35\lib\site-packages\tensorflow\python\keras_impl\keras\optimizers.py",
line 788, in get
raise ValueError('Could not interpret optimizer identifier:', identifier) ValueError: ('Could not interpret optimizer identifier:',
<keras.optimizers.SGD object at 0x000002039B152FD0>)

The reason is that you are using the tensorflow.python.keras API for the model and layers, but keras.optimizers for SGD. These are two different packages: the Keras bundled with TensorFlow and standalone Keras. Their objects cannot be mixed. Change everything to one of the two and it should work.
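The mechanism can be imitated in plain Python. The sketch below does not use Keras at all; every class and function in it is a hypothetical stand-in, and the lookup is a simplified imitation of what an optimizers.get() style function does. The assumption is that each package only recognizes instances of its own optimizer base class.

```python
class TfKerasOptimizer:
    """Hypothetical stand-in for the optimizer base class inside tensorflow.keras."""

class StandaloneKerasOptimizer:
    """Hypothetical stand-in for the optimizer base class inside standalone keras."""

class TfSGD(TfKerasOptimizer):
    pass

class StandaloneSGD(StandaloneKerasOptimizer):
    pass

def get_optimizer(identifier):
    """Simplified imitation of an optimizer lookup on the tensorflow.keras side."""
    if isinstance(identifier, TfKerasOptimizer):
        return identifier                      # instance from the same package
    if isinstance(identifier, str) and identifier.lower() == "sgd":
        return TfSGD()                         # string resolved inside the package
    raise ValueError("Could not interpret optimizer identifier:", identifier)

accepted = get_optimizer(TfSGD())              # same package: accepted
by_name = get_optimizer("sgd")                 # string: resolved internally

try:
    get_optimizer(StandaloneSGD())             # other package: rejected
    rejected = False
except ValueError:
    rejected = True

print(type(accepted).__name__, type(by_name).__name__, rejected)
```

Under that assumption it also becomes clear why passing the optimizer as a string sidesteps the error: the string is resolved by whichever package owns the model.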

I am a bit late here. Your issue is that you have mixed the TensorFlow Keras and standalone Keras APIs in your code. The optimizer and the model should come from the same package. Use the Keras API for everything, as below:
from keras.models import Sequential
from keras.layers import Dense, Dropout, LSTM, BatchNormalization
from keras.callbacks import TensorBoard
from keras.callbacks import ModelCheckpoint
from keras.optimizers import adam
# Set Model
model = Sequential()
model.add(LSTM(128, input_shape=(train_x.shape[1:]), return_sequences=True))
model.add(Dropout(0.2))
model.add(BatchNormalization())
# Set Optimizer
opt = adam(lr=0.001, decay=1e-6)
# Compile model
model.compile(
    loss='sparse_categorical_crossentropy',
    optimizer=opt,
    metrics=['accuracy']
)
I have used adam in this example. Please use your relevant optimizer as per above code.
Hope this helps.

This problem is mainly caused by mismatched versions: the tensorflow.keras version may not be the same as the standalone keras version,
which causes the error, as mentioned by @Priyanka.
For me, whenever this error arises, I pass in the name of the optimizer as a string, and the backend figures it out.
For example instead of
tf.keras.optimizers.Adam
or
keras.optimizers.Adam
I do
model.compile(optimizer= 'adam' , loss= keras.losses.binary_crossentropy, metrics=['accuracy'])

from tensorflow.keras.optimizers import SGD
This works well.
Since Tensorflow 2.0, there is a new API available directly via tensorflow:
https://www.pyimagesearch.com/2019/10/21/keras-vs-tf-keras-whats-the-difference-in-tensorflow-2-0/
Solution works for tensorflow==2.2.0rc2, Keras==2.2.4 (on Win10)
Please also note that the version above uses learning_rate as parameter and no longer lr.
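If a script has to work with both the old lr and the new learning_rate spelling, the argument name can be picked at run time. The sketch below uses two made-up optimizer classes as stand-ins and a hypothetical helper. Whether signature inspection works on a real optimizer class is an assumption, since a constructor that accepts **kwargs could hide the old name.

```python
import inspect

def learning_rate_kwarg(optimizer_cls):
    """Return 'learning_rate' if the constructor declares it, otherwise 'lr'."""
    params = inspect.signature(optimizer_cls.__init__).parameters
    return "learning_rate" if "learning_rate" in params else "lr"

class OldStyleSGD:
    """Hypothetical stand-in for an optimizer that still takes lr."""
    def __init__(self, lr=0.01):
        self.rate = lr

class NewStyleSGD:
    """Hypothetical stand-in for an optimizer that takes learning_rate."""
    def __init__(self, learning_rate=0.01):
        self.rate = learning_rate

def build(optimizer_cls, rate):
    """Instantiate the optimizer with whichever keyword its constructor declares."""
    return optimizer_cls(**{learning_rate_kwarg(optimizer_cls): rate})

old = build(OldStyleSGD, 0.1)
new = build(NewStyleSGD, 0.1)
print(learning_rate_kwarg(OldStyleSGD), learning_rate_kwarg(NewStyleSGD))
```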

For some libraries (e.g. keras_radam) you'll need to set up an environment variable before the import:
import os
os.environ['TF_KERAS'] = '1'
import tensorflow
import your_library

In my case it was because I missed the parentheses. I am using tensorflow_addons so my code was like
model.compile(optimizer=tfa.optimizers.LAMB, loss='binary_crossentropy',
metrics=['binary_accuracy'])
And it gives
ValueError: ('Could not interpret optimizer identifier:', <class 'tensorflow_addons.optimizers.lamb.LAMB'>)
Then I changed my code into:
model.compile(optimizer=tfa.optimizers.LAMB(), loss='binary_crossentropy',
metrics=['binary_accuracy'])
and it works.
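A small guard can catch this mistake before compile() is reached. This is a hypothetical helper, shown with a made-up LAMB stand-in class so that it runs without tensorflow_addons; it instantiates the optimizer with default arguments when a class was passed instead of an instance.

```python
import inspect

class LAMB:
    """Hypothetical stand-in for an optimizer class such as tfa.optimizers.LAMB."""
    def __init__(self, learning_rate=0.001):
        self.learning_rate = learning_rate

def ensure_instance(optimizer):
    """Instantiate with default arguments if a class was passed by mistake."""
    return optimizer() if inspect.isclass(optimizer) else optimizer

from_class = ensure_instance(LAMB)                         # class -> new instance
from_instance = ensure_instance(LAMB(learning_rate=0.01))  # instance -> unchanged

print(type(from_class).__name__, from_instance.learning_rate)
```

Usage would be along the lines of model.compile(optimizer=ensure_instance(tfa.optimizers.LAMB), ...), although simply adding the parentheses is the more direct fix.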

Recently, in Keras 2.5.0, importing the Adam optimizer fails with the following error:
from keras.optimizers import Adam
ImportError: cannot import name 'Adam' from 'keras.optimizers'
Instead, import the optimizer (here Adam) as follows:
from keras.optimizers import adam_v2
optimizer = adam_v2.Adam(learning_rate=lr, decay=lr/epochs)
model.compile(loss='--', optimizer=optimizer , metrics=['--'])

I was running the Keras documentation example https://keras.io/examples/cifar10_cnn/
after installing the latest Keras and TensorFlow versions
(at the time of this writing,
TensorFlow 2.0.0a0 and Keras 2.2.4).
I had to explicitly import the optimizer that the example uses. Specifically, this line at the top of the example:
opt = tensorflow.keras.optimizers.rmsprop(lr=0.0001, decay=1e-6)
was replaced by
from tensorflow.keras.optimizers import RMSprop
opt = RMSprop(lr=0.0001, decay=1e-6)
In recent versions the API "broke", and keras.stuff in a lot of cases became tensorflow.keras.stuff.

Use one style in one kernel, try not to mix
from keras.optimizers import sth
with
from tensorflow.keras.optimizers import sth

I tried the following and it worked for me:
from keras import optimizers
sgd = optimizers.SGD(lr=0.01)
model.compile(loss='mean_squared_error', optimizer=sgd)

use
from tensorflow.keras import optimizers
instead of
from keras import optimizers

Try changing your import lines to
from keras.models import Sequential
from keras.layers import Dense, ...
Your imports seem a little strange to me. Maybe you could elaborate more on that.

I had a misplaced parenthesis and got this error.
Initially it was
x=Conv2D(filters[0],(3,3),use_bias=False,padding="same",kernel_regularizer=l2(reg),x))
The corrected version was
x=Conv2D(filters[0],(3,3),use_bias=False,padding="same",kernel_regularizer=l2(reg))(x)

Just give
optimizer = 'sgd' / 'RMSprop'

I got the same error message and resolved this issue, in my case, by replacing the assignment of optimizer:
optimizer=keras.optimizers.Adam
with its instance instead of the class itself:
optimizer=keras.optimizers.Adam()

I tried everything in this thread to fix it, but nothing worked. What fixed it for me: passing the optimizer class itself, i.e. tensorflow.keras.optimizers.Adam, caused the error, while passing an instance, i.e. tensorflow.keras.optimizers.Adam(), worked. So my code looks like:
model.compile(
    loss=tensorflow.keras.losses.categorical_crossentropy,
    optimizer=tensorflow.keras.optimizers.Adam()
)
Note that the loss is passed as the function itself; calling categorical_crossentropy() without arguments would raise a TypeError.
Looking at the TensorFlow GitHub issues, I am not the only one for whom passing an instance rather than the class fixed the error.
