Keras - Default Axis for softmax function is set to Axis - python

I am learning how to create sequential models. I have a model:
*model = Sequential()*
I then went on to add pooling layers and convolution layers (which were fine). But when creating the dense layer:
*model.add(Dense(num_classes, activation = 'softmax'))*
the line returned:
*tf.nn.softmax(x, axis=axis)*
which caused an error as axis was not defined. Both Keras and TensorFlow documentation show that the default axis for softmax is None or -1.
Is this a bug in Keras? And is there an easy workaround? (If I were to set the axis myself, I am not sure what the input tensor would be.)
I can add the rest of the code if necessary, but it simply consists of the other layers and I don't think it will help much.

I believe your Keras and/or TensorFlow installation is out of date; you should update one or both.
This was a known problem in Keras in the summer of 2017 and was fixed in this commit. See more in this comment on the bug report.
Also, axis was introduced as an argument to TensorFlow's softmax() on November 22, 2017, so a TensorFlow version of 1.4.0 or lower would also cause this error.
Which of the two causes the error depends on the rank of the tensor being processed, as you can see in the Keras source at the linked commit.
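On the question of what the axis refers to: for a Dense output of shape (batch, num_classes), the default axis of -1 means the class dimension, so each row is normalised on its own. Below is a minimal framework-free sketch of that behaviour. The function name softmax_last_axis and the sample logits are made up for illustration and are not part of Keras or TensorFlow.

```python
import math

def softmax_last_axis(batch):
    """Apply softmax along the last axis of a 2-D list (batch, num_classes)."""
    result = []
    for row in batch:
        # Subtract the row maximum for numerical stability.
        row_max = max(row)
        exps = [math.exp(value - row_max) for value in row]
        total = sum(exps)
        result.append([e / total for e in exps])
    return result

# Illustrative logits: two samples, three classes each.
logits = [[1.0, 2.0, 3.0],
          [0.0, 0.0, 0.0]]
probs = softmax_last_axis(logits)
for row in probs:
    print([round(p, 3) for p in row], "sum =", round(sum(row), 3))
```

Each row sums to 1, which is what a softmax output layer is expected to produce per sample.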
This code is fine with the current versions (tested on https://colab.research.google.com):
import keras
from keras.models import Sequential
from keras.layers import Dense, Activation
from keras.optimizers import SGD
print( keras.__version__ )
model = Sequential()
model.add( Dense(6, input_shape=(6,), activation = 'softmax' ) )
sgd = SGD(lr=0.01, decay=1e-6, momentum=0.9, nesterov=True)
model.compile(loss='categorical_crossentropy',
optimizer=sgd,
metrics=['accuracy'])
outputs
2.1.6
but more importantly, compiles the model with no error.

Related

Fine tune Universal Sentence Encoder with Keras

I am trying to fine tune Universal Sentence Encoder and use the new encoder layer for something else.
import tensorflow as tf
from tensorflow.keras.models import Model, Sequential
from tensorflow.keras.layers import Dense, Dropout
import tensorflow_hub as hub
module_url = "universal-sentence-encoder"
model = Sequential([
hub.KerasLayer(module_url, input_shape=[], dtype=tf.string, trainable=True, name="use"),
Dropout(0.5, name="dropout"),
Dense(256, activation="relu", name="dense"),
Dense(len(y), activation="sigmoid", name="activation")
])
model.compile(optimizer="adam", loss="categorical_crossentropy", metrics=["accuracy"])
model.fit(X, y, batch_size=256, epochs=30, validation_split=0.25)
This worked. Loss went down and accuracy was decent. Now I want to extract just Universal Sentence Encoder layer. However, here is what I get.
Do you know how I can fix this nan issue? I expected to see encoding of numeric values.
Is it only possible to save tuned_use layer as a model as this post recommends? Ideally, I want to save tuned_use layer just like Universal Sentence Encoder so that I can open and use it exactly the same as hub.KerasLayer(tuned_use_location, input_shape=[], dtype=tf.string).
Hoping this will help someone: I ended up solving this by using universal-sentence-encoder-4 instead of universal-sentence-encoder-large-5. I spent quite a lot of time troubleshooting, but it was tough because there was no issue with the input data and the model trained successfully. This might be due to exploding gradients, but I could not add gradient clipping or Leaky ReLU to the original architecture.

Is this behavior normal or is it a bug? Training with incompatible shapes in tensorflow

I have found a weird behavior in tensorflow.keras that doesn't occur in the classic keras.
I have these shapes in my dataset.
x_train = np.random.rand(60,3,1)
y_train = np.random.rand(60,1)
And this LSTM network
from tensorflow.keras.layers import Dense, LSTM
from tensorflow.keras import Sequential
model = Sequential()
model.add(LSTM(120,input_shape=(3,1)))
model.add(Dense(2,activation="relu"))
model.compile(loss="MSE",optimizer="adam")
model.fit(x_train,y_train,epochs=1)
model.summary()
I would expect this not to work, because the output of the network is (,2) and y_train is (,1). But it starts training.
But using the classic keras it fails, as I expected.
from keras.layers import Dense, LSTM
from keras import Sequential
model = Sequential()
model.add(LSTM(120,input_shape=(3,1)))
model.add(Dense(2,activation="relu"))
model.compile(loss="MSE",optimizer="adam")
model.fit(x_train,y_train,epochs=1)
model.summary()
The versions are as follows (I'm using Google Colab):
Tensorflow: 2.2.0
Keras: 2.3.1
What could be causing this? Is this a bug or a new feature?
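One possible explanation, which is an assumption and not something confirmed in this thread, is that tensorflow.keras does not validate the target shape here and the MSE loss simply broadcasts the (,1) targets against the (,2) predictions. The plain-Python sketch below mimics that broadcasting; mse_with_broadcast and the sample values are hypothetical and only illustrate the arithmetic.

```python
def mse_with_broadcast(y_true, y_pred):
    """Per-sample MSE where a width-1 target is broadcast across all outputs."""
    losses = []
    for target_row, pred_row in zip(y_true, y_pred):
        if len(target_row) == 1:
            # Broadcasting: repeat the single target to match the prediction width.
            target_row = target_row * len(pred_row)
        squared = [(p - t) ** 2 for p, t in zip(pred_row, target_row)]
        losses.append(sum(squared) / len(squared))
    return losses

y_true = [[1.0], [0.0]]            # shape (2, 1), like y_train
y_pred = [[1.0, 3.0], [0.5, 0.5]]  # shape (2, 2), like the Dense(2) output
print(mse_with_broadcast(y_true, y_pred))
```

If that is what happens, training runs without complaint, but both outputs are pushed towards the same single target, which is probably not what was intended.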

Keras trained regression model predicts same output for all set of test features

I am trying to build a regression model that predicts the 'Ratings' for movies using the dataset https://www.kaggle.com/shubhammehta21/movie-lens-small-latest-dataset. However, after training, the model outputs the same value for all test features. I have read previous similar questions that suggested adjusting the learning rate and the number of features, and checking that the model used for prediction is the same as the trained model. None of these has worked for me.
I load the data and process it:
links= pd.read_csv('../input/movie-lens-small-latest-dataset/links.csv')
movies=pd.read_csv('../input/movie-lens-small-latest-dataset/movies.csv')
...
dataset=movies.merge(ratings,on='movieId').merge(tags,on='movieId').merge(links,on='movieId')
to_drop=['title','genres','timestamp_x','timestamp_y','userId_y','imdbId','tmdbId']
dataset.drop(columns=to_drop,inplace=True)
dataset=pd.get_dummies(dataset)
The code below shows how I build the regression model. I have tried adjusting the number of neurons and layers; however, that has not influenced the output.
from keras.models import Sequential
from keras.layers.core import Dense, Activation
from keras.optimizers import Adam
model = Sequential()
model.add(Dense(13, input_dim=1586, kernel_initializer='zero', activation='relu'))
model.add(Dense(6, kernel_initializer='normal', activation='relu'))
model.add(Dense(1, kernel_initializer='normal',activation='linear'))
# Compile model
adam = Adam(lr=0.001)
model.compile(loss='mean_squared_error', optimizer=adam,metrics=['mse','mae'])
model.summary()
history = model.fit(train_dataset,train_labels,batch_size=30, epochs=10,verbose=1, validation_split=0.3)
score = model.evaluate(validation_dataset,validation_labels)
print("Test score:", score)
Whenever I try to predict the test dataset:
model.predict(test_dataset)
It predicts the value of
3.97
on all values. I am expecting a range of values between 0 - 5.
You should never (I mean, never) use kernel_initializer='zero' - to be honest, I am surprised that the option even exists in Keras!
Also, kernel_initializer='normal' is not recommended.
As a first step, remove all kernel_initializer arguments, so as to revert to the default and recommended one, kernel_initializer='glorot_uniform'; keep in mind that defaults are there for a reason (usually they work well), and you should change them only if you really have a reason to do so (which I trust you don't have here) and you know what you are doing.
If you still don't get what you would expect, experiment with other parameters (no. of layers/neurons, more epochs etc); you should leave the learning rate (lr) of Adam optimizer as is for starters (it's also one of these default values that seem to work nicely across cases).
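To see why an all-zero kernel in the first layer can produce one constant prediction, here is a minimal framework-free sketch. With zero weights and zero biases, the hidden pre-activations are 0 for every input, so the rest of the network only ever sees the same hidden vector. The layer sizes, the output weights and the 3.97 bias are hypothetical values chosen to echo the question, not the trained model. If the framework also treats the ReLU gradient at exactly 0 as 0 (an assumption here), those zero weights would receive no updates either.

```python
def relu(values):
    return [max(0.0, v) for v in values]

def dense(inputs, weights, biases):
    """Fully connected layer: weights[i][j] connects input i to unit j."""
    return [
        sum(x * weights[i][j] for i, x in enumerate(inputs)) + biases[j]
        for j in range(len(biases))
    ]

# Hypothetical tiny network: 3 inputs -> 2 hidden units (zero kernel) -> 1 output.
w_hidden = [[0.0, 0.0] for _ in range(3)]  # mimics kernel_initializer='zero'
b_hidden = [0.0, 0.0]
w_out = [[0.7], [-0.4]]                    # illustrative output weights
b_out = [3.97]                             # illustrative output bias

def predict(features):
    hidden = relu(dense(features, w_hidden, b_hidden))  # always [0.0, 0.0]
    return dense(hidden, w_out, b_out)[0]

print(predict([1.0, 2.0, 3.0]))
print(predict([-5.0, 0.1, 42.0]))
```

Both calls print the same number, because the inputs never influence the hidden layer.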

concatenate layers to a fully connected layer Tensorflow

I am trying to implement a siamese network from Sergey Zagoruyko using Tensorflow
http://www.cv-foundation.org/openaccess/content_cvpr_2015/papers/Zagoruyko_Learning_to_Compare_2015_CVPR_paper.pdf
I don't know how to concatenate the 2 input layers to a top network (fully connected layer + relu + fully connected layer)
This may not be what you are looking for, but I recommend trying Keras. It is a flexible, high-level framework built on tensorflow that makes it extremely easy to accomplish what you are attempting. This would be how you could do it in Keras (with 32 inputs and 32 neurons in your FC layers).
from keras.models import Sequential
from keras.layers import Dense, Activation, Input
model = Sequential()
model.add(Input(shape=(32,)))
model.add(Dense(32))
model.add(Activation("relu"))
model.add(Dense(32))
Alternatively, using just tensorflow you could use this strategy.
import tensorflow as tf
x = tf.placeholder(dtype, shape=shape)  # dtype is the first positional argument; fill in your own dtype and shape
y = tf.layers.dense(x, 32)
y = tf.nn.relu(y)
y = tf.layers.dense(y, 32)
But I personally think keras is more elegant, plus it adds a whole lot more useful features, such as model.output, model.input, and much more. In fact, keras has recently been built into tensorflow's contrib module as tf.contrib.keras. Hope that helps!
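Neither snippet above shows the concatenation step itself, so here is a framework-free sketch of the data flow only: two branch outputs are joined along the feature axis and fed to fully connected + ReLU + fully connected. The sizes (32 features per branch, 32 hidden units, a single output score), the random_layer helper and the random stand-in branch outputs are all assumptions for illustration. In Keras, the concatenate layer of the functional API is the usual way to join two branches like this.

```python
import random

def dense(inputs, weights, biases):
    """Fully connected layer: weights[i][j] connects input i to unit j."""
    return [
        sum(x * weights[i][j] for i, x in enumerate(inputs)) + biases[j]
        for j in range(len(biases))
    ]

def relu(values):
    return [max(0.0, v) for v in values]

def random_layer(n_in, n_out, rng):
    """Hypothetical helper: small random weights and zero biases."""
    weights = [[rng.uniform(-0.1, 0.1) for _ in range(n_out)] for _ in range(n_in)]
    return weights, [0.0] * n_out

rng = random.Random(0)
fc1 = random_layer(64, 32, rng)  # takes the 64 concatenated features
fc2 = random_layer(32, 1, rng)   # assumed single output score

def top_network(features_a, features_b):
    merged = features_a + features_b  # concatenate along the feature axis
    hidden = relu(dense(merged, *fc1))
    return dense(hidden, *fc2)[0]

branch_a = [rng.random() for _ in range(32)]  # stand-in for the first branch output
branch_b = [rng.random() for _ in range(32)]  # stand-in for the second branch output
score = top_network(branch_a, branch_b)
print(len(branch_a + branch_b), score)
```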

train Keras model with BatchNorm layer with tensorflow

I'm using Keras to build a model, and writing the optimization code and everything else in TensorFlow. When I was using quite simple layers like Dense or Conv2D, everything was straightforward. But adding a BatchNormalization layer to my Keras model makes the problem complicated.
Since the BatchNormalization layer behaves differently in the training phase and the testing phase, I figured out that I need K.learning_phase():True in my feed_dict. But the following code is not working well. It runs with no error, but the model's performance isn't getting any better.
import keras.backend as K
...
x_train, y_train = get_data()
sess.run(train_op, feed_dict={x:x_train, y:y_train, K.learning_phase():True})
When I tried training keras model with keras fit function, it worked well.
What should I do to train a keras model with BatchNormalization layer in tensorflow?
Actually, I duplicated a question that I hadn't seen.
I found the answer here; it just consists of passing a special argument to the BatchNormalization layer call.
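For context on why the training and testing phases differ, here is a minimal framework-free sketch of batch normalisation. It is not the fix from the linked answer, and it is not how Keras implements the layer. TinyBatchNorm, its momentum and epsilon values, and the sample batch are hypothetical. In training mode it normalises with the current batch statistics and updates moving averages; in testing mode it uses only the stored moving averages.

```python
class TinyBatchNorm:
    """Minimal 1-D batch normalisation without the learnable scale and shift."""

    def __init__(self, momentum=0.9, epsilon=1e-5):
        self.momentum = momentum
        self.epsilon = epsilon
        self.moving_mean = 0.0
        self.moving_var = 1.0

    def __call__(self, batch, training):
        if training:
            # Training phase: normalise with the statistics of the current batch
            mean = sum(batch) / len(batch)
            var = sum((x - mean) ** 2 for x in batch) / len(batch)
            # and update the moving averages that are used later at test time.
            self.moving_mean = self.momentum * self.moving_mean + (1 - self.momentum) * mean
            self.moving_var = self.momentum * self.moving_var + (1 - self.momentum) * var
        else:
            # Testing phase: rely on the stored moving averages only.
            mean, var = self.moving_mean, self.moving_var
        return [(x - mean) / (var + self.epsilon) ** 0.5 for x in batch]

bn = TinyBatchNorm()
batch = [10.0, 12.0, 14.0]
train_out = bn(batch, training=True)
test_out = bn(batch, training=False)
print(train_out)
print(test_out)
print(bn.moving_mean, bn.moving_var)
```

The same batch gives different outputs in the two modes, and the test-time output depends entirely on how well the moving averages have been updated during training.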
