How to mix many distributions in one tensorflow probability layer?

How to mix many distributions in one tensorflow probability layer? - python

I have several DistributionLambda layers as the outputs of one model, and I would like to make a Concatenate-like operation into a new layer, in order to have only one output that is the mix of all the distributions, assuming they are independent. Then, I can apply a log-likelihood loss to the output of the model. Otherwise, I cannot apply the loss over a Concatenate layer, because it lost the log_prob method. I have been trying with the Blockwise distribution, but with no luck so far.
Here an example code:
from tensorflow.keras import layers
from tensorflow.keras import models
from tensorflow.keras import optimizers
from tensorflow_probability import distributions
from tensorflow_probability import layers as tfp_layers
def likelihood_loss(y_true, y_pred):
"""Adding negative log likelihood loss."""
return -y_pred.log_prob(y_true)
def distribution_fn(params):
"""Distribution function."""
return distributions.Normal(
params[:, 0], math.log(1.0 + math.exp(params[:, 1])))
output_steps = 3
...
lstm_layer = layers.LSTM(10, return_state=True)
last_layer, l_h, l_c = lstm_layer(last_layer)
lstm_state = [l_h, l_c]
dense_layer = layers.Dense(2)
last_layer = dense_layer(last_layer)
last_layer = tfp_layers.DistributionLambda(
make_distribution_fn=distribution_fn)(last_layer)
output_layers = [last_layer]
# Get output sequence, re-injecting the output of each step
for number in range(1, output_steps):
last_layer = layers.Reshape((1, 1))(last_layer)
last_layer, l_h, l_c = lstm_layer(last_layer, initial_state=lstm_states)
# Storing state for next time step
lstm_states = [l_h, l_c]
last_layer = tfp_layers.DistributionLambda(
make_distribution_fn=distribution_fn)(dense_layer(last_layer))
output_layers.append(last_layer)
# This does not work
# last_layer = distributions.Blockwise(output_layers)
# This works for the model but cannot compute loss
# last_layer = layers.Concatenate(axis=1)(output_layers)
the_model = models.Model(inputs=[input_layer], outputs=[last_layer])
the_model.compile(loss=likelihood_loss, optimizer=optimizers.Adam(lr=0.001))

The problem is your Input, not your output layer ;)
Input:0 is referenced in your error message.
Could you try to be more specific about your input?

Related

Keras : second derivative of a keras model is giving 0 all the time

To calculate the second derivative of a neural network output with respect to the input using a keras model, I implemented codes used in those 2 posts :
Keras: using the sum of the first and second derivatives of a model as final output
Second derivative in Keras
However, for both techniques, the second derivative is always equal to 0. Here is the problem reproduced on a simple regression of the quadratic function:
import tensorflow as tf
import tensorflow.keras.backend as K
from tensorflow.keras.backend import set_session
from tensorflow.keras import datasets, layers, models
from tensorflow.keras import optimizers
import numpy as np
x = np.linspace(0, 1, 1000)
y = x**2
def model_regression():
model = tf.keras.Sequential([
layers.Dense(1024, input_dim=1, activation="relu"),
layers.Dense(1, activation="linear")])
model.compile(optimizer=optimizers.Adam(lr=0.001),
loss='mean_squared_error')
return model
my_model = model_regression()
my_model.fit(x, y, epochs=10, batch_size=20)
y_pred = my_model.predict(x)
#### Technique 1
first_input = K.gradients(my_model.output, my_model.input)
second_input = K.gradients(first_input, my_model.input)
iterate_first = K.function([my_model.input], [first_input])
iterate_second = K.function([my_model.input], [second_input])
#### Technique 2
# def grad(y, x):
# print(y.shape)
# return layers.Lambda(lambda z: K.gradients(z[0], z[1]), output_shape=[1])([y, x])
# derivative1 = grad(my_model.output, my_model.input)
# derivative2 = grad(derivative1, my_model.input)
# iterate_first = K.function([my_model.input], [derivative1])
# iterate_second = K.function([my_model.input], [derivative2])
first_derivative, second_derivative = [], []
for i in range(x.shape[0]):
first_derivative.append(iterate_first(np.array([[x[i]]]))[0][0][0])
second_derivative.append(iterate_second(np.array([[x[i]]]))[0][0][0])
Here is the graphical result, as you can see the second derivative is always equal to 0 while it should approximatively be equal to 2.
How can I compute correctly the second derivative of my neural network in keras ?

What is the best activation function to use for time series prediction

I am using the Sequential model from Keras, with the DENSE layer type. I wrote a function that recursively calculates predictions, but the predictions are way off. I am wondering what is the best activation function to use for my data. Currently I am using hard_sigmoid function. The output data values range from 5 to 25. The input data has the shape (6,1) and the output data is a single value. When I plot the predictions they never decrease. Thank you for the help!!
# create and fit Multilayer Perceptron model
model = Sequential();
model.add(Dense(20, input_dim=look_back, activation='hard_sigmoid'))
model.add(Dense(16, activation='hard_sigmoid'))
model.add(Dense(1))
model.compile(loss='mean_squared_error', optimizer='adam')
model.fit(trainX, trainY, epochs=200, batch_size=2, verbose=0)
#function to predict using predicted values
numOfPredictions = 96;
for i in range(numOfPredictions):
temp = [[origAndPredictions[i,0],origAndPredictions[i,1],origAndPredictions[i,2],origAndPredictions[i,3],origAndPredictions[i,4],origAndPredictions[i,5]]]
temp = numpy.array(temp)
temp1 = model.predict(temp)
predictions = numpy.append(predictions, temp1, axis=0)
temp2 = []
temp2 = [[origAndPredictions[i,1],origAndPredictions[i,2],origAndPredictions[i,3],origAndPredictions[i,4],origAndPredictions[i,5],predictions[i,0]]]
temp2 = numpy.array(temp2)
origAndPredictions = numpy.vstack((origAndPredictions, temp2))
update:
I used this code to implement the swish.
from keras.backend import sigmoid
def swish1(x, beta = 1):
return (x * sigmoid(beta * x))
def swish2(x, beta = 1):
return (x * sigmoid(beta * x))
from keras.utils.generic_utils import get_custom_objects
from keras.layers import Activation
get_custom_objects().update({'swish': Activation(swish)})
model.add(Activation(custom_activation,name = "swish1"))
update:
Using this code:
from keras.backend import sigmoid
from keras import backend as K
def swish1(x):
return (K.sigmoid(x) * x)
def swish2(x):
return (K.sigmoid(x) * x)
Thanks for all the help!!

Although there is no best activation function as such, I find Swish to work particularly well for Time-Series problems. AFAIK keras doesn't provide Swish builtin, you can use:
from keras.utils.generic_utils import get_custom_objects
from keras import backend as K
from keras.layers import Activation
def custom_activation(x, beta = 1):
return (K.sigmoid(beta * x) * x)
get_custom_objects().update({'custom_activation': Activation(custom_activation)})
Then use it in model:
model.add(Activation(custom_activation,name = "Swish"))

Your output data ranges from 5 to 25 and your output ReLU activation will give you values from 0 to inf. So what you try is to "parameterize" your outputs or normalize your labels. This means, using sigmoid as activation (outputs in (0,1)) and transform your labels by subtracting 5 and dividing by 20, so they will be in (almost) the same interval as your outputs, [0,1]. Or you can use sigmoid and multiply your outputs by 20 and add 5 before calculating the loss.
Would be interesting to see the results.

Keras with activity_regularizer that is updated every iteration

I am building a simple neural network using Keras. It has activity regularization so that the output of the only hidden layer is forced to have small values. Here is the code:
import numpy as np
import math
import keras
from keras.models import Model, Sequential
from keras.layers import Input, Dense, Activation
from keras import regularizers
from keras import backend as K
a=1
def my_regularizer(inputs):
means=K.mean((inputs),axis=1)
return a*K.sum(means)**2
x_train=np.random.uniform(low=-1,high=1,size=(200,2))
model=Sequential([
Dense(20,input_shape=(2,),activity_regularizer=my_regularizer),
Activation('tanh'),
Dense(2,),
Activation('linear')
])
model.compile(optimizer='adam',loss='mean_squared_error')
model.fit(x_train,x_train,epochs=20,validation_split=0.1)
Questions:
1) Currently, parameter a is set at the beginning and it does not change. How can I change the code such that the parameter a is updated after each iteration such that
a_new=f(a_old,input)
where input is the values at the hidden layer and f(.) is an arbitrary function.
2) I want my activity regularizer to be applied after the first activation function tanh is applied. Have I written my code correctly? The term "activity_regularizer=my_regularizer" in
Dense(20,input_sahpe=(2,),activity_regularizer=my_regularizer)
makes me feel that the regularizer is being applied to values before the activation function tanh.

You can - but first, you need a valid Keras Regularizer object (your function won't work):
class MyActivityRegularizer(Regularizer):
def __init__(self, a=1):
self.a = K.variable(a, name='a')
# gets called at each train iteration
def __call__(self, x): # your custom function here
means = K.mean(x, axis=1)
return self.a * K.sum(means)**2
def get_config(self): # required class method
return {"a": float(K.get_value(self.a))}
Next, to work with .fit, you need a custom Keras Callback object (see alternative at bottom):
class ActivityRegularizerScheduler(Callback):
""" 'on_batch_end' gets automatically called by .fit when finishing
iterating over a batch. The model, and its attributes, are inherited by
'Callback' (except at __init__) and can be accessed via, e.g., self.model """
def __init__(self, model, update_fn):
self.update_fn=update_fn
self.activity_regularizers=_get_activity_regularizers(model)
def on_batch_end(self, batch, logs=None):
iteration = K.get_value(self.model.optimizer.iterations)
new_activity_reg = self.update_fn(iteration)
# 'activity_regularizer' references model layer's activity_regularizer (in this
# case 'MyActivityRegularizer'), so its attributes ('a') can be set directly
for activity_regularizer in self.activity_regularizers:
K.set_value(activity_regularizer.a, new_activity_reg)
def _get_activity_regularizers(model):
activity_regularizers = []
for layer in model.layers:
a_reg = getattr(layer,'activity_regularizer',None)
if a_reg is not None:
activity_regularizers.append(a_reg)
return activity_regularizers
Lastly, you'll need to create your model within the Keras CustomObjectScope - see in full ex. below.
Example usage:
from keras.layers import Dense
from keras.models import Sequential
from keras.regularizers import Regularizer
from keras.callbacks import Callback
from keras.utils import CustomObjectScope
from keras.optimizers import Adam
import keras.backend as K
import numpy as np
def make_model(my_reg):
return Sequential([
Dense(20, activation='tanh', input_shape=(2,), activity_regularizer=my_reg),
Dense(2, activation='linear'),
])
my_reg = MyActivityRegularizer(a=1)
with CustomObjectScope({'MyActivityRegularizer':my_reg}): # required for Keras to recognize
model = make_model(my_reg)
opt = Adam(lr=1e-4)
model.compile(optimizer=opt, loss='mse')
x = np.random.randn(320,2) # dummy data
y = np.random.randn(320,2) # dummy labels
update_fn = lambda x: .5 + .4*np.cos(x) #x = number of train updates (optimizer.iterations)
activity_regularizer_scheduler = ActivityRegularizerScheduler(model, update_fn)
model.fit(x,y,batch_size=32,callbacks=[activity_regularizer_scheduler],
epochs=4,verbose=1)
To TRACK your a and make sure it's changing, you can get its value at, e.g., each epoch end via:
for epoch in range(4):
model.fit(x,y,batch_size=32,callbacks=[activity_regularizer_scheduler],epochs=1)
print("Epoch {} activity_regularizer 'a': {}".format(epoch,
K.get_value(_get_activity_regularizers(model)[0].a)))
# My output:
# Epoch 0 activity_regularizer 'a': 0.7190816402435303
# Epoch 1 activity_regularizer 'a': 0.4982417821884155
# Epoch 2 activity_regularizer 'a': 0.2838689386844635
# Epoch 3 activity_regularizer 'a': 0.8644570708274841
Regarding (2), I'm afraid you're right - the 'tanh' outputs won't be used; you'll need to pass activation='tanh' instead.
Lastly, you can do it without a callback, via train_on_batch - but a drawback is, you'll need to feed data to the model yourself (and shuffle it, etc):
activity_regularizers = _get_activity_regularizers(model)
for iteration in range(100):
x, y = get_data()
model.train_on_batch(x,y)
iteration = K.get_value(model.optimizer.iterations)
for activity_regularizer in activity_regularizers:
K.set_value(activity_regularizer, update_fn(iteration))

How to get logits from a sequential model in keras/tensorflow? [duplicate]

I have trained a binary classification model with CNN, and here is my code
model = Sequential()
model.add(Convolution2D(nb_filters, kernel_size[0], kernel_size[1],
border_mode='valid',
input_shape=input_shape))
model.add(Activation('relu'))
model.add(Convolution2D(nb_filters, kernel_size[0], kernel_size[1]))
model.add(Activation('relu'))
model.add(MaxPooling2D(pool_size=pool_size))
# (16, 16, 32)
model.add(Convolution2D(nb_filters*2, kernel_size[0], kernel_size[1]))
model.add(Activation('relu'))
model.add(Convolution2D(nb_filters*2, kernel_size[0], kernel_size[1]))
model.add(Activation('relu'))
model.add(MaxPooling2D(pool_size=pool_size))
# (8, 8, 64) = (2048)
model.add(Flatten())
model.add(Dense(1024))
model.add(Activation('relu'))
model.add(Dropout(0.5))
model.add(Dense(2)) # define a binary classification problem
model.add(Activation('softmax'))
model.compile(loss='categorical_crossentropy',
optimizer='adadelta',
metrics=['accuracy'])
model.fit(x_train, y_train,
batch_size=batch_size,
nb_epoch=nb_epoch,
verbose=1,
validation_data=(x_test, y_test))
And here, I wanna get the output of each layer just like TensorFlow, how can I do that?

You can easily get the outputs of any layer by using: model.layers[index].output
For all layers use this:
from keras import backend as K
inp = model.input # input placeholder
outputs = [layer.output for layer in model.layers] # all layer outputs
functors = [K.function([inp, K.learning_phase()], [out]) for out in outputs] # evaluation functions
# Testing
test = np.random.random(input_shape)[np.newaxis,...]
layer_outs = [func([test, 1.]) for func in functors]
print layer_outs
Note: To simulate Dropout use learning_phase as 1. in layer_outs otherwise use 0.
Edit: (based on comments)
K.function creates theano/tensorflow tensor functions which is later used to get the output from the symbolic graph given the input.
Now K.learning_phase() is required as an input as many Keras layers like Dropout/Batchnomalization depend on it to change behavior during training and test time.
So if you remove the dropout layer in your code you can simply use:
from keras import backend as K
inp = model.input # input placeholder
outputs = [layer.output for layer in model.layers] # all layer outputs
functors = [K.function([inp], [out]) for out in outputs] # evaluation functions
# Testing
test = np.random.random(input_shape)[np.newaxis,...]
layer_outs = [func([test]) for func in functors]
print layer_outs
Edit 2: More optimized
I just realized that the previous answer is not that optimized as for each function evaluation the data will be transferred CPU->GPU memory and also the tensor calculations needs to be done for the lower layers over-n-over.
Instead this is a much better way as you don't need multiple functions but a single function giving you the list of all outputs:
from keras import backend as K
inp = model.input # input placeholder
outputs = [layer.output for layer in model.layers] # all layer outputs
functor = K.function([inp, K.learning_phase()], outputs ) # evaluation function
# Testing
test = np.random.random(input_shape)[np.newaxis,...]
layer_outs = functor([test, 1.])
print layer_outs

From https://keras.io/getting-started/faq/#how-can-i-obtain-the-output-of-an-intermediate-layer
One simple way is to create a new Model that will output the layers that you are interested in:
from keras.models import Model
model = ... # include here your original model
layer_name = 'my_layer'
intermediate_layer_model = Model(inputs=model.input,
outputs=model.get_layer(layer_name).output)
intermediate_output = intermediate_layer_model.predict(data)
Alternatively, you can build a Keras function that will return the output of a certain layer given a certain input, for example:
from keras import backend as K
# with a Sequential model
get_3rd_layer_output = K.function([model.layers[0].input],
[model.layers[3].output])
layer_output = get_3rd_layer_output([x])[0]

Based on all the good answers of this thread, I wrote a library to fetch the output of each layer. It abstracts all the complexity and has been designed to be as user-friendly as possible:
https://github.com/philipperemy/keract
It handles almost all the edge cases.
Hope it helps!

Following looks very simple to me:
model.layers[idx].output
Above is a tensor object, so you can modify it using operations that can be applied to a tensor object.
For example, to get the shape model.layers[idx].output.get_shape()
idx is the index of the layer and you can find it from model.summary()

This answer is based on: https://stackoverflow.com/a/59557567/2585501
To print the output of a single layer:
from tensorflow.keras import backend as K
layerIndex = 1
func = K.function([model.get_layer(index=0).input], model.get_layer(index=layerIndex).output)
layerOutput = func([input_data]) # input_data is a numpy array
print(layerOutput)
To print output of every layer:
from tensorflow.keras import backend as K
for layerIndex, layer in enumerate(model.layers):
func = K.function([model.get_layer(index=0).input], layer.output)
layerOutput = func([input_data]) # input_data is a numpy array
print(layerOutput)

I wrote this function for myself (in Jupyter) and it was inspired by indraforyou's answer. It will plot all the layer outputs automatically. Your images must have a (x, y, 1) shape where 1 stands for 1 channel. You just call plot_layer_outputs(...) to plot.
%matplotlib inline
import matplotlib.pyplot as plt
from keras import backend as K
def get_layer_outputs():
test_image = YOUR IMAGE GOES HERE!!!
outputs = [layer.output for layer in model.layers] # all layer outputs
comp_graph = [K.function([model.input]+ [K.learning_phase()], [output]) for output in outputs] # evaluation functions
# Testing
layer_outputs_list = [op([test_image, 1.]) for op in comp_graph]
layer_outputs = []
for layer_output in layer_outputs_list:
print(layer_output[0][0].shape, end='\n-------------------\n')
layer_outputs.append(layer_output[0][0])
return layer_outputs
def plot_layer_outputs(layer_number):
layer_outputs = get_layer_outputs()
x_max = layer_outputs[layer_number].shape[0]
y_max = layer_outputs[layer_number].shape[1]
n = layer_outputs[layer_number].shape[2]
L = []
for i in range(n):
L.append(np.zeros((x_max, y_max)))
for i in range(n):
for x in range(x_max):
for y in range(y_max):
L[i][x][y] = layer_outputs[layer_number][x][y][i]
for img in L:
plt.figure()
plt.imshow(img, interpolation='nearest')

From: https://github.com/philipperemy/keras-visualize-activations/blob/master/read_activations.py
import keras.backend as K
def get_activations(model, model_inputs, print_shape_only=False, layer_name=None):
print('----- activations -----')
activations = []
inp = model.input
model_multi_inputs_cond = True
if not isinstance(inp, list):
# only one input! let's wrap it in a list.
inp = [inp]
model_multi_inputs_cond = False
outputs = [layer.output for layer in model.layers if
layer.name == layer_name or layer_name is None] # all layer outputs
funcs = [K.function(inp + [K.learning_phase()], [out]) for out in outputs] # evaluation functions
if model_multi_inputs_cond:
list_inputs = []
list_inputs.extend(model_inputs)
list_inputs.append(0.)
else:
list_inputs = [model_inputs, 0.]
# Learning phase. 0 = Test mode (no dropout or batch normalization)
# layer_outputs = [func([model_inputs, 0.])[0] for func in funcs]
layer_outputs = [func(list_inputs)[0] for func in funcs]
for layer_activations in layer_outputs:
activations.append(layer_activations)
if print_shape_only:
print(layer_activations.shape)
else:
print(layer_activations)
return activations

Previous solutions were not working for me. I handled this issue as shown below.
layer_outputs = []
for i in range(1, len(model.layers)):
tmp_model = Model(model.layers[0].input, model.layers[i].output)
tmp_output = tmp_model.predict(img)[0]
layer_outputs.append(tmp_output)

Wanted to add this as a comment (but don't have high enough rep.) to #indraforyou's answer to correct for the issue mentioned in #mathtick's comment. To avoid the InvalidArgumentError: input_X:Y is both fed and fetched. exception, simply replace the line outputs = [layer.output for layer in model.layers] with outputs = [layer.output for layer in model.layers][1:], i.e.
adapting indraforyou's minimal working example:
from keras import backend as K
inp = model.input # input placeholder
outputs = [layer.output for layer in model.layers][1:] # all layer outputs except first (input) layer
functor = K.function([inp, K.learning_phase()], outputs ) # evaluation function
# Testing
test = np.random.random(input_shape)[np.newaxis,...]
layer_outs = functor([test, 1.])
print layer_outs
p.s. my attempts trying things such as outputs = [layer.output for layer in model.layers[1:]] did not work.

Assuming you have:
1- Keras pre-trained model.
2- Input x as image or set of images. The resolution of image should be compatible with dimension of the input layer. For example 80*80*3 for 3-channels (RGB) image.
3- The name of the output layer to get the activation. For example, "flatten_2" layer. This should be include in the layer_names variable, represents name of layers of the given model.
4- batch_size is an optional argument.
Then you can easily use get_activation function to get the activation of the output layer for a given input x and pre-trained model:
import six
import numpy as np
import keras.backend as k
from numpy import float32
def get_activations(x, model, layer, batch_size=128):
"""
Return the output of the specified layer for input `x`. `layer` is specified by layer index (between 0 and
`nb_layers - 1`) or by name. The number of layers can be determined by counting the results returned by
calling `layer_names`.
:param x: Input for computing the activations.
:type x: `np.ndarray`. Example: x.shape = (80, 80, 3)
:param model: pre-trained Keras model. Including weights.
:type model: keras.engine.sequential.Sequential. Example: model.input_shape = (None, 80, 80, 3)
:param layer: Layer for computing the activations
:type layer: `int` or `str`. Example: layer = 'flatten_2'
:param batch_size: Size of batches.
:type batch_size: `int`
:return: The output of `layer`, where the first dimension is the batch size corresponding to `x`.
:rtype: `np.ndarray`. Example: activations.shape = (1, 2000)
"""
layer_names = [layer.name for layer in model.layers]
if isinstance(layer, six.string_types):
if layer not in layer_names:
raise ValueError('Layer name %s is not part of the graph.' % layer)
layer_name = layer
elif isinstance(layer, int):
if layer < 0 or layer >= len(layer_names):
raise ValueError('Layer index %d is outside of range (0 to %d included).'
% (layer, len(layer_names) - 1))
layer_name = layer_names[layer]
else:
raise TypeError('Layer must be of type `str` or `int`.')
layer_output = model.get_layer(layer_name).output
layer_input = model.input
output_func = k.function([layer_input], [layer_output])
# Apply preprocessing
if x.shape == k.int_shape(model.input)[1:]:
x_preproc = np.expand_dims(x, 0)
else:
x_preproc = x
assert len(x_preproc.shape) == 4
# Determine shape of expected output and prepare array
output_shape = output_func([x_preproc[0][None, ...]])[0].shape
activations = np.zeros((x_preproc.shape[0],) + output_shape[1:], dtype=float32)
# Get activations with batching
for batch_index in range(int(np.ceil(x_preproc.shape[0] / float(batch_size)))):
begin, end = batch_index * batch_size, min((batch_index + 1) * batch_size, x_preproc.shape[0])
activations[begin:end] = output_func([x_preproc[begin:end]])[0]
return activations

In case you have one of the following cases:
error: InvalidArgumentError: input_X:Y is both fed and fetched
case of multiple inputs
You need to do the following changes:
add filter out for input layers in outputs variable
minnor change on functors loop
Minimum example:
from keras.engine.input_layer import InputLayer
inp = model.input
outputs = [layer.output for layer in model.layers if not isinstance(layer, InputLayer)]
functors = [K.function(inp + [K.learning_phase()], [x]) for x in outputs]
layer_outputs = [fun([x1, x2, xn, 1]) for fun in functors]

Well, other answers are very complete, but there is a very basic way to "see", not to "get" the shapes.
Just do a model.summary(). It will print all layers and their output shapes. "None" values will indicate variable dimensions, and the first dimension will be the batch size.

Generally, output size can be calculated as
[(W−K+2P)/S]+1
where
W is the input volume - in your case you have not given us this
K is the Kernel size - in your case 2 == "filter"
P is the padding - in your case 2
S is the stride - in your case 3
Another, prettier formulation:

Deep Network Produce zero Accuracy

I am trying to build a deep network using theano. However the accuracy is zero. I can not figure out my mistake. I am trying to create a deep learning network with 3 hidden layers and one output. I am tyring to do a classification task and I have 5 classes. Therefore, the output layer have 5 nodes.
Any suggestion?
#!/usr/bin/env python
from __future__ import print_function
import theano
import theano.tensor as T
import lasagne
import numpy as np
import sklearn.datasets
import os
import csv
import pandas as pd
# Lasagne is pre-release, so it's interface is changing.
# Whenever there's a backwards-incompatible change, a warning is raised.
# Let's ignore these for the course of the tutorial
import warnings
warnings.filterwarnings('ignore', module='lasagne')
from lasagne.objectives import categorical_crossentropy, aggregate
#load the data and prepare it
df = pd.read_excel('risk_sample_data_9.20.16_anon.xls',skiprows=0)
rawdata = df.values
# remove empty rows (odd rows)
mask = np.ones(len(rawdata), dtype=bool)
mask[::2] = False
data = rawdata[mask]
idx = np.array([1,5,6,7])
m = np.zeros_like(data)
m[:,idx] = 1
X = np.ma.masked_array(data,m)
X = np.ma.filled(X, fill_value=0)
X = X.astype(theano.config.floatX)
y = data[:,7] # extract financial rating labels
# convert char lables into int , A=1 , B=2, C=3, D=4, F=5
y[y == 'A'] = 1
y[y == 'B'] = 2
y[y == 'C'] = 3
y[y == 'D'] = 4
y[y == 'F'] = 5
y = pd.to_numeric(y)
y = y.astype('int32')
#y = y.astype(theano.config.floatX)
N_CLASSES = 5
# First, construct an input layer.
# The shape parameter defines the expected input shape,
# which is just the shape of our data matrix data.
l_in = lasagne.layers.InputLayer(shape=X.shape)
# We'll create a network with two dense layers:
# A tanh hidden layer and a softmax output layer.
l_hidden1 = lasagne.layers.DenseLayer(
# The first argument is the input layer
l_in,
# This defines the layer's output dimensionality
num_units=250,
# Various nonlinearities are available
nonlinearity=lasagne.nonlinearities.rectify)
l_hidden2 = lasagne.layers.DenseLayer(
# The first argument is the input layer
l_hidden1,
# This defines the layer's output dimensionality
num_units=100,
# Various nonlinearities are available
nonlinearity=lasagne.nonlinearities.rectify)
l_hidden3 = lasagne.layers.DenseLayer(
# The first argument is the input layer
l_hidden2,
# This defines the layer's output dimensionality
num_units=50,
# Various nonlinearities are available
nonlinearity=lasagne.nonlinearities.rectify)
l_hidden4 = lasagne.layers.DenseLayer(
# The first argument is the input layer
l_hidden3,
# This defines the layer's output dimensionality
num_units=10,
# Various nonlinearities are available
nonlinearity=lasagne.nonlinearities.sigmoid)
# For our output layer, we'll use a dense layer with a softmax nonlinearity.
l_output = lasagne.layers.DenseLayer(
l_hidden4, num_units=N_CLASSES, nonlinearity=lasagne.nonlinearities.softmax)
net_output = lasagne.layers.get_output(l_output)
# As a loss function, we'll use Theano's categorical_crossentropy function.
# This allows for the network output to be class probabilities,
# but the target output to be class labels.
true_output = T.ivector('true_output')
# get_loss computes a Theano expression for the objective,
# given a target variable
# By default, it will use the network's InputLayer input_var,
# which is what we want.
#loss = objective.get_loss(target=true_output)
loss = lasagne.objectives.categorical_crossentropy(net_output, true_output)
loss = aggregate(loss, mode='mean')
# Retrieving all parameters of the network is done using get_all_params,
# which recursively collects the parameters of all layers
# connected to the provided layer.
all_params = lasagne.layers.get_all_params(l_output)
# Now, we'll generate updates using Lasagne's SGD function
updates = lasagne.updates.sgd(loss, all_params, learning_rate=1)
# Finally, we can compile Theano functions for training and
# computing the output.
# Note that because loss depends on the input variable of our input layer,
# we need to retrieve it and tell Theano to use it.
train = theano.function([l_in.input_var, true_output], loss, updates=updates)
get_output = theano.function([l_in.input_var], net_output)
def eq(x, y):
if x==y:
return 1
return 0
print("Training ...")
# Train for 100 epochs
for n in xrange(10):
train(X, y)
y_predicted = np.argmax(get_output(X), axis=1)
correct = reduce(lambda a, b: a+b, map(eq, y_predicted, y))
print("Iteration {} correct prediction {}".format(n, correct))
# Compute the predicted label of the training data.
# The argmax converts the class probability output to class label
y_predicted = np.argmax(get_output(X), axis=1)
print(y_predicted)

The learning rate seems way too high. Try a lower learning rate first. It might be that your model diverges on the task. Hard to tell without being able to try it on your data.

We Keep Coding

Python is a programming language that lets you work quickly and integrate systems more effectively.

How to mix many distributions in one tensorflow probability layer? - python

The problem is your Input, not your output layer ;) Input:0 is referenced in your error message. Could you try to be more specific about your input?

Related

Keras : second derivative of a keras model is giving 0 all the time

What is the best activation function to use for time series prediction

Keras with activity_regularizer that is updated every iteration

How to get logits from a sequential model in keras/tensorflow? [duplicate]

Deep Network Produce zero Accuracy

Categories

Resources