While learning TensorFlow (version 1) from a text, I ran into the following problem:
# Generate data
xy, labels = make_circles(n_samples=200, noise=0.1, random_state=717)
features = xy
num_hidden1 = 10
num_hidden2 = 5
x = tf.placeholder(tf.float32, shape=(None, 2))
y = tf.placeholder(tf.float32, shape=(None, 1))
rand_init = tf.random_normal_initializer(seed=624)
# Hidden Layer 1
hidden1 = tf.contrib.layers.fully_connected(x, num_hidden1,
                                            activation_fn=tf.nn.sigmoid,
                                            weights_initializer=rand_init,
                                            biases_initializer=rand_init)
# Hidden Layer 2
hidden2 = tf.contrib.layers.fully_connected(hidden1, num_hidden2,
                                            activation_fn=tf.nn.sigmoid,
                                            weights_initializer=rand_init,
                                            biases_initializer=rand_init)
# Output Layer
yhat = tf.contrib.layers.fully_connected(hidden2, 1,
                                         activation_fn=tf.nn.sigmoid,
                                         weights_initializer=rand_init,
                                         biases_initializer=rand_init)
loss = tf.reduce_mean(-y * tf.log(yhat) - (1-y) * tf.log(1-yhat))
# Prepare algorithm
MaxEpochs = 2500
lr = 0.1
optimizer = tf.train.AdamOptimizer(lr)
train = optimizer.minimize(loss)
# Shuffle data
np.random.seed(7382)
idx = np.arange(0, len(features))
np.random.shuffle(idx)
shuffled_features = features[idx]
shuffled_labels = labels[idx]
# Stochastic method
batch_size = 25
init = tf.global_variables_initializer()
sess = tf.Session()
sess.run(init)
for epoch in range(MaxEpochs):
    if epoch % 500 == 0:
        loss_val = sess.run(loss, feed_dict={x: features, y: labels.reshape(-1,1)})
        plot_model(sess, yhat, xy, labels, f_fn, 'Epoch {}\n (loss={:1.2f})'.format(epoch, loss_val))
    for x_batch, y_batch in generate_batches(batch_size, shuffled_features, shuffled_labels):
        sess.run(train, feed_dict={x: x_batch, y: y_batch.reshape(-1,1)})
loss_val = sess.run(loss, feed_dict={x: features, y: labels.reshape(-1,1)})
print(loss_val)
plot_model(sess, yhat, xy, labels, f_fn, 'Epoch {}\n (loss={:1.2f})'.format(epoch+1, loss_val))
/usr/local/lib/python3.6/dist-packages/matplotlib/contour.py:1483: UserWarning: Warning: converting a masked element to nan.
  self.zmax = float(z.max())
/usr/local/lib/python3.6/dist-packages/matplotlib/contour.py:1134: RuntimeWarning: invalid value encountered in greater
  over = np.nonzero(lev > self.zmax)[0]
The example worked fine when I tested it with MaxEpochs=20. It was then increased to 2500 in order to show a case of overfitting with loss=0.18; however, when I run it, the loss starts outputting NaN after ~400 epochs.
Is the example in the text outdated, or is there a mistake somewhere?
You should decay your learning rate when you train for more iterations. Most likely your learning rate of 0.1 is too high for 2500 epochs. Try a lower lr and run 2500 epochs to demonstrate overfitting on the training data.
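To make the suggestion concrete, here is a minimal NumPy sketch that is not taken from the text being followed. The `exponential_decay` helper and its `decay_steps`/`decay_rate` values are illustrative assumptions; the formula is the one documented for `tf.train.exponential_decay`. The second half shows one likely (but unconfirmed) source of the NaN in this particular loss: once the sigmoid output saturates to exactly 0 or 1, a term of the form `0 * log(0)` evaluates to NaN, and clipping the prediction avoids it.

```python
import numpy as np

def exponential_decay(initial_lr, step, decay_steps, decay_rate):
    """Return initial_lr * decay_rate ** (step / decay_steps)."""
    return initial_lr * decay_rate ** (step / decay_steps)

def log_loss(y, yhat, eps=None):
    """Binary cross-entropy; if eps is given, clip yhat into [eps, 1 - eps] first."""
    if eps is not None:
        yhat = np.clip(yhat, eps, 1.0 - eps)
    # Silence the divide/invalid warnings so the NaN can be inspected directly.
    with np.errstate(divide="ignore", invalid="ignore"):
        return np.mean(-y * np.log(yhat) - (1 - y) * np.log(1 - yhat))

# Hypothetical schedule: start at 0.1 and halve the rate every 500 epochs.
schedule = [exponential_decay(0.1, epoch, 500, 0.5) for epoch in (0, 500, 1000)]

# A saturated (and correct) prediction: y = 1 and yhat = 1 exactly.
y = np.array([1.0], dtype=np.float32)
yhat = np.array([1.0], dtype=np.float32)

raw = log_loss(y, yhat)                # (1 - y) * log(1 - yhat) = 0 * -inf -> nan
clipped = log_loss(y, yhat, eps=1e-7)  # finite once yhat is kept away from 1
```

In the TensorFlow 1 graph, the same ideas would correspond to passing a decayed rate to the optimizer and wrapping `yhat` in `tf.clip_by_value` before taking the logarithm.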
I'm working through an example problem with TensorFlow (working with placeholders specifically) and don't understand why I'm receiving what appears to be a shape/type error when I'm fairly confident the shapes and types are what they should be.
I've tried playing around with the various float types in X_batch and y_batch, and tried changing the size from "None" (unspecified) to what I will be passing in (100), none of which has worked.
import tensorflow as tf
import numpy as np
from sklearn.datasets import fetch_california_housing
def fetch_batch(epoch, batch_index, batch_size, X, y):
    np.random.seed(epoch * batch_index)
    indices = np.random.randint(m, size=batch_size)
    X_batch = X[indices]
    y_batch = y[indices]
    return X_batch.astype('float32'), y_batch.astype('float32')
if __name__ == "__main__":
    housing = fetch_california_housing()
    m, n = housing.data.shape
    # standardizing input data
    standardized_housing = (housing.data - np.mean(housing.data)) / np.std(housing.data)
    std_housing_bias = np.c_[np.ones((m, 1)), standardized_housing]
    # using the size "n+1" to account for the bias term
    X = tf.placeholder(tf.float32, shape=(None, n+1), name='X')
    y = tf.placeholder(tf.float32, shape=(None, 1), name='y')
    theta = tf.Variable(tf.random_uniform([n + 1, 1], -1, 1), dtype=tf.float32, name='theta')
    y_pred = tf.matmul(X, theta, name='predictions')
    error = y_pred - y
    mse = tf.reduce_mean(tf.square(error), name='mse')
    n_epochs = 1000
    learning_rate = 0.01
    batch_size = 100
    n_batches = int(np.ceil(m / batch_size))
    # using the Gradient Descent Optimizer class from tensorflow's optimizer selection
    optimizer = tf.train.GradientDescentOptimizer(learning_rate=learning_rate)
    training_op = optimizer.minimize(mse)
    # creates a node in the computational graph that initializes all variables when it is run
    init = tf.global_variables_initializer()
    with tf.Session() as sess:
        sess.run(init)
        for epoch in range(n_epochs):
            for batch_index in range(n_batches):
                X_batch, y_batch = fetch_batch(epoch, batch_index, batch_size, std_housing_bias, \
                                               housing.target.reshape(-1, 1))
                print(X_batch.shape, X_batch.dtype, y_batch.shape, y_batch.dtype)
                sess.run(training_op, feed_dict={X: X_batch, y: y_batch})
            if epoch % 100 == 0:
                print(f"Epoch {epoch} MSE = {mse.eval()}")
        best_theta = theta.eval()
    print("Mini Batch Gradient Descent Beta Estimates")
    print(best_theta)
The error I'm getting is:
InvalidArgumentError: You must feed a value for placeholder tensor 'X' with dtype float and shape [?,9]
[[node X (defined at /Users/marshallmcquillen/Scripts/lab.py:25) ]]
I've added a print statement showing the shapes and dtypes of X_batch and y_batch, and they are what I expect them to be, but it still doesn't work.
The mse you want to evaluate also depends on the placeholders X and y, so you need to provide a feed_dict for it as well. You can fix it by changing the line to
if epoch % 100 == 0:
    print(f"Epoch {epoch} MSE = {mse.eval(feed_dict={X: X_batch, y: y_batch})}")
But since you are trying to evaluate the model, it is reasonable to use a test dataset. So ideally it would be
if epoch % 100 == 0:
    print(f"Epoch {epoch} MSE = {mse.eval(feed_dict={X: X_test, y: y_test})}")
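`X_test` and `y_test` are not defined in the question's code, so the last snippet assumes a held-out split already exists. One way to create it is sketched below in NumPy. `train_test_split_np` is a hypothetical helper, the 80/20 ratio is an arbitrary choice, and the random arrays merely stand in for the housing data. Note that the sketch standardizes each column separately (`axis=0`) using training statistics, whereas the question's code uses a single global mean and standard deviation; that is a design choice of this sketch, not part of the answer.

```python
import numpy as np

def train_test_split_np(X, y, test_ratio=0.2, seed=0):
    """Shuffle the row indices once and split them into train and test parts."""
    rng = np.random.RandomState(seed)
    idx = rng.permutation(len(X))
    n_test = int(len(X) * test_ratio)
    test_idx, train_idx = idx[:n_test], idx[n_test:]
    return X[train_idx], X[test_idx], y[train_idx], y[test_idx]

# Stand-in data: 1000 samples with 8 features and one target column.
rng = np.random.RandomState(42)
data = rng.rand(1000, 8)
target = rng.rand(1000, 1)

X_train, X_test, y_train, y_test = train_test_split_np(data, target)

# Standardize every feature with statistics computed on the training rows only.
mean = X_train.mean(axis=0)
std = X_train.std(axis=0)
X_train = (X_train - mean) / std
X_test = (X_test - mean) / std

# Prepend the bias column, as the question does with np.c_.
X_train = np.c_[np.ones((len(X_train), 1)), X_train]
X_test = np.c_[np.ones((len(X_test), 1)), X_test]
```

The training loop would then draw its mini-batches from `X_train`/`y_train` only, and feed `X_test`/`y_test` when reporting the MSE.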
I am running vanilla RNN code with TensorFlow in Google Colab. I want to plot the training error, validation error, and prediction accuracy over the course of training without using TensorBoard. I am new to TensorFlow. Can anyone please guide me? Here is part of my code:
# Model predictions
cls_prediction = tf.argmax(output_logits, axis=1, name='predictions')
# Define the loss function, optimizer, and accuracy
loss = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(labels=y, logits=output_logits), name='loss')
optimizer = tf.train.AdamOptimizer(learning_rate=learning_rate, name='Adam-op').minimize(loss)
correct_prediction = tf.equal(tf.argmax(output_logits, 1), tf.argmax(y, 1), name='correct_pred')
accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32), name='accuracy')
init = tf.global_variables_initializer()
sess = tf.InteractiveSession()
sess.run(init)
global_step = 0
# Number of training iterations in each epoch
num_tr_iter = int(len(y_train) / batch_size)
for epoch in range(epochs):
    print('Training epoch: {}'.format(epoch + 1))
    x_train, y_train = randomize(x_train, y_train)
    for iteration in range(num_tr_iter):
        global_step += 1
        start = iteration * batch_size
        end = (iteration + 1) * batch_size
        x_batch, y_batch = get_next_batch(x_train, y_train, start, end)
        x_batch = x_batch.reshape((batch_size, timesteps, num_input))
        # Run optimization op (backprop)
        feed_dict_batch = {x: x_batch, y: y_batch}
        sess.run(optimizer, feed_dict=feed_dict_batch)
        if iteration % display_freq == 0:
            # Calculate and display the batch loss and accuracy
            loss_batch, acc_batch = sess.run([loss, accuracy],
                                             feed_dict=feed_dict_batch)
            print("iter {0:3d}:\t Loss={1:.2f},\tTraining Accuracy={2:.01%}".
                  format(iteration, loss_batch, acc_batch))
    # Run validation after every epoch
    feed_dict_valid = {x: x_valid[:1000].reshape((-1, timesteps, num_input)), y: y_valid[:1000]}
    loss_valid, acc_valid = sess.run([loss, accuracy], feed_dict=feed_dict_valid)
    print('---------------------------------------------------------')
    print("Epoch: {0}, validation loss: {1:.2f}, validation accuracy: {2:.01%}".
          format(epoch + 1, loss_valid, acc_valid))
    print('---------------------------------------------------------')
What changes should I make in the code to get the plots?
One way is to store the values in lists during training, then plot them with a library such as matplotlib.
Example code:
import matplotlib.pyplot as plt
plt.plot([1, 2, 3, 4])
plt.ylabel('some numbers')
plt.show()
This plots a straight line. In your case, you'd call plt.plot(list_of_prediction_accuracy), or pass whichever list you want to visualize.
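A slightly fuller sketch of the same idea, under stated assumptions: `record` and `plot_history` are hypothetical helper names, and the numbers are made-up stand-ins for the values that `sess.run([loss, accuracy], ...)` returns once per epoch in the question's loop.

```python
from collections import defaultdict

def record(history, **metrics):
    """Append one value per named metric to the history dict."""
    for name, value in metrics.items():
        history[name].append(float(value))

def plot_history(history):
    """Draw one curve per recorded metric against the epoch number."""
    import matplotlib.pyplot as plt  # imported lazily; only needed when plotting
    fig, ax = plt.subplots()
    for name, values in history.items():
        ax.plot(range(1, len(values) + 1), values, label=name)
    ax.set_xlabel("epoch")
    ax.legend()
    return fig

history = defaultdict(list)

# Made-up stand-in values: (training loss, validation loss, validation accuracy).
fake_epochs = [(0.90, 1.00, 0.55), (0.60, 0.80, 0.70), (0.40, 0.75, 0.78)]
for train_loss, valid_loss, valid_acc in fake_epochs:
    record(history, train_loss=train_loss, valid_loss=valid_loss, valid_acc=valid_acc)
```

In the real loop, `record(...)` would be called right after the validation `sess.run`, and `plot_history(history)` once training has finished; the returned figure can be displayed in the notebook or saved with `fig.savefig(...)`.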
I have created a very simple TensorFlow neural network, but clearly I must have skipped a step somewhere or mixed up sample code from different tutorials, because the results are nonsensical, and the training error only increases with each epoch.
Here's a fully self-contained example (MVCE), trying to train the network to calculate the square function:
import tensorflow as tf
import numpy as np
# hard-coded input and labels for demonstration
training_x = np.array([[1.], [2.],[3.],[4.],[5.]]).T
labels_training = np.array([[1.],[4.],[9.],[16.],[25.]]).T
# Hyperparameters
num_epochs = 1000
learning_rate = 0.001
LAYERS = 3
# setup the Neural Network
INPUT = len(training_x)
OUTPUT = len(labels_training)
X = tf.placeholder(tf.float32, shape=[INPUT,None])
Y = tf.placeholder(tf.float32, shape=[OUTPUT, None])
parameters = {
'W1': tf.Variable(np.random.randn(LAYERS,INPUT), dtype=tf.float32),
'b1': tf.Variable(np.zeros([LAYERS,1]), dtype=tf.float32),
'W2': tf.Variable(np.random.randn(OUTPUT,LAYERS), dtype=tf.float32),
'b2': tf.Variable(np.zeros([OUTPUT,1]), dtype=tf.float32)
}
Z1 = tf.add(tf.matmul(parameters['W1'], X), parameters['b1']) # W1*X + b
A2 = tf.nn.relu(Z1)
Z2 = tf.add(tf.matmul(parameters['W2'], A2), parameters['b2'])
cost = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(logits=Z2, labels=Y))
optimizer = tf.train.AdamOptimizer(learning_rate=learning_rate).minimize(cost)
init = tf.global_variables_initializer()
with tf.Session() as sess:
    sess.run(init)
    for epoch in range(num_epochs):
        _ , c = sess.run([optimizer, cost], feed_dict={X: training_x, Y: labels_training})
        if epoch % 200 == 0:
            print ("Cost after epoch %i: %f" % (epoch, c))
    # Test predictions by computing the output using training set as input
    output = sess.run(Z2, feed_dict={X: training_x})
    print(np.array2string(output, precision=3))
Example output (YMMV due to the random initialization vector):
Cost after epoch 0: 158.512558
Cost after epoch 200: 227.178513
Cost after epoch 400: 319.617218
Cost after epoch 600: 436.471069
Cost after epoch 800: 577.651733
[[23.437 38.291 53.145 67.999 82.852]]
I tried your code, and I think you should change the cost function. If I change it to cost = tf.reduce_mean(tf.losses.mean_squared_error(labels = Y, predictions = Z2)) then it works better.
EDIT:
And when I didn't transpose your input and output data, it reduced the cost to 0 in under 200 epochs.
I think it's because of the following line:
Z1 = tf.add(tf.matmul(parameters['W1'], X), parameters['b1'])
It should be
Z1 = tf.add(tf.matmul( X,parameters['W1']), parameters['b1'])
The same applies to Z2.
Found an explanation on This SO Post
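A quick NumPy sketch of the two points in this answer, using made-up weights. With one sample per row, `X @ W1` has shape `(samples, units)`, and mean squared error compares each prediction with its target directly. The last lines illustrate why a softmax-based loss is a poor fit for this layout: softmax normalizes along the last axis, so with the transposed `(1, 5)` layout the five samples are treated as five competing classes. The layer sizes mirror the question; the weight shapes and values are assumptions for illustration.

```python
import numpy as np

rng = np.random.RandomState(0)

# One sample per row: 5 samples, 1 feature each.
X = np.array([[1.], [2.], [3.], [4.], [5.]])
Y = X ** 2

n_hidden = 3
W1 = rng.randn(1, n_hidden)   # (inputs, hidden units)
b1 = np.zeros((1, n_hidden))
W2 = rng.randn(n_hidden, 1)   # (hidden units, outputs)
b2 = np.zeros((1, 1))

A1 = np.maximum(X @ W1 + b1, 0.0)  # ReLU hidden layer, shape (5, 3)
Z2 = A1 @ W2 + b2                  # predictions, shape (5, 1)

# Regression loss: average squared difference between prediction and target.
mse = np.mean((Z2 - Y) ** 2)

def softmax(z):
    """Softmax along the last axis, shifted for numerical stability."""
    e = np.exp(z - z.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

# Transposed layout (1, 5): the probabilities sum to 1 across the samples.
row_layout = softmax(Z2.T)
```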
I'm trying to create a neural network that takes 13 features as input from multiple csv files, one file at a time, and measures accuracy after each iteration. Here is my code snippet:
import tensorflow as tf
import numpy as np
from tensorflow.contrib.layers import fully_connected
import os
import pandas as pd
n_inputs = 13
n_hidden1 = 30
n_hidden2 = 10
n_outputs = 2
learning_rate = 0.01
n_epochs = 40
batch_size = 1
patient_id = os.listdir('./subset_numerical')
output = pd.read_csv('output.csv')
sepsis_pat = output['output'].tolist()
X = tf.placeholder(tf.float32, shape=[None, n_inputs], name="X")
y = tf.placeholder(tf.int64, shape=[None], name="y")
def data_processor(n):
    id = pd.read_csv('./subset_numerical/'+patient_id[n])
    id_input = np.array([id['VALUE'].tolist()])
    for s in sepsis_pat:
        if str(s) == str(patient_id[n].split('.')[0]):
            a = 1
    try:
        if a == 1:
            a = 0
            return [id_input, np.array([1])]
    except:
        return [id_input, np.array([0])]
def test_set():
    id_combined = []
    out = []
    for p in range(300, len(patient_id)):
        try:
            id1 = pd.read_csv('./subset_numerical/' + patient_id[p])
            id_input1 = np.array(id1['VALUE'].tolist())
            id_combined.append(id_input1)
            for s in sepsis_pat:
                if str(s) == str(patient_id[p].split('.')[0]):
                    a = 1
            try:
                if a == 1:
                    a = 0
                    out.append([1, 0])
            except:
                out.append([0, 1])
        except:
            pass
    return [np.array(id_combined), np.array(out)]
# Declaration of hidden layers and calculation of loss goes here
# Construction phase begins
with tf.name_scope("dnn"):
    hidden1 = fully_connected(X, n_hidden1, scope="hidden1")
    hidden2 = fully_connected(hidden1, n_hidden2, scope="hidden2")
    logits = fully_connected(hidden2, n_outputs, scope="outputs", activation_fn=None) # We will apply softmax here later
# Calculating loss
with tf.name_scope("loss"):
    xentropy = tf.nn.softmax_cross_entropy_with_logits(labels=y, logits=logits)
    loss = tf.reduce_mean(xentropy, name="loss")
# Training with gradient descent optimizer
with tf.name_scope("train"):
    optimizer = tf.train.GradientDescentOptimizer(learning_rate)
    training_op = optimizer.minimize(loss)
# Measuring accuracy
with tf.name_scope("eval"):
    correct = tf.nn.in_top_k(logits, y, 1)
    accuracy = tf.reduce_mean(tf.cast(correct, tf.float32))
    accuracy_summary = tf.summary.scalar('accuracy', accuracy)
# Variable initialization and saving model goes here
# Construction is finished. Let's get this to work.
with tf.Session() as sess:
    init.run()
    for epoch in range(n_epochs):
        a = 0
        for iteration in range(300 // batch_size):
            X_batch, y_batch = data_processor(iteration)
            sess.run(training_op, feed_dict={X: X_batch, y: y_batch})
        acc_train = accuracy.eval(feed_dict={X: X_batch, y: y_batch})
        X_test, y_test = test_set()
        acc_test = accuracy.eval(feed_dict={X: X_test, y: y_test})
        print(epoch, "Train accuracy:", acc_train, "Test accuracy:", acc_test)
    save_path = saver.save(sess, "./my_model_final.ckpt")
But I'm stuck with this error:
logits and labels must be same size: logits_size=[1,2] labels_size=[1,1]
The error seems to occur at this line:
correct = tf.nn.in_top_k(logits, y, 1)
What am I doing wrong?
Based on the error log you provided, the problem is in this line of your code:
xentropy = tf.nn.softmax_cross_entropy_with_logits(labels=y, logits=logits)
Ensure that labels and logits have the same shape and dtype.
The shape should be [batch_size, num_classes] and the dtype should be float16, float32 or float64. Check the documentation of softmax_cross_entropy_with_logits for more details.
Since you've defined n_outputs = 2, the shape of logits is [?, 2] (? means batch size), while the shape of y is just [?]. In order to apply the softmax loss function, the last FC layer should return a flat tensor, which can be compared with y.
Solution: set n_outputs = 1.
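As a hedged alternative to changing `n_outputs`: if `y` holds integer class ids of shape `[batch]`, as the placeholder in the question suggests, TensorFlow 1 also provides `tf.nn.sparse_softmax_cross_entropy_with_logits`, which accepts such labels together with logits of shape `[batch, num_classes]`. The NumPy sketch below (the helper names are made up for illustration) shows that the sparse form and the one-hot form compute the same quantity.

```python
import numpy as np

def log_softmax(logits):
    """Row-wise log-softmax, shifted by the row maximum for stability."""
    shifted = logits - logits.max(axis=1, keepdims=True)
    return shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))

def dense_xent(one_hot, logits):
    """Cross-entropy with one-hot labels of shape [batch, num_classes]."""
    return -(one_hot * log_softmax(logits)).sum(axis=1)

def sparse_xent(class_ids, logits):
    """Cross-entropy with integer labels of shape [batch]."""
    rows = np.arange(len(class_ids))
    return -log_softmax(logits)[rows, class_ids]

logits = np.array([[2.0, -1.0], [0.5, 0.5], [-3.0, 1.0]])  # [batch=3, classes=2]
class_ids = np.array([0, 1, 1])                            # shape [3]
one_hot = np.eye(2)[class_ids]                             # shape [3, 2]

dense = dense_xent(one_hot, logits)
sparse = sparse_xent(class_ids, logits)
```

This route keeps `n_outputs = 2`. It is worth noting that a softmax over a single logit always returns 1, so a one-unit softmax output cannot distinguish the two classes.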
I'm working on an RBF network using Tensorflow, but there's this error that comes up at line 112 that says this: ValueError: Cannot feed value of shape (40, 13) for Tensor 'Placeholder:0', which has shape '(?, 12)'
Here's my code below. I created my own activation function for my RBF network by following this tutorial. Also, if there is anything else you notice that needs to be fixed, please point it out to me, because I am very new to Tensorflow so it would be helpful to get any feedback I can get.
import tensorflow as tf
import numpy as np
import math
from sklearn import datasets
from sklearn.model_selection import train_test_split
from tensorflow.python.framework import ops
ops.reset_default_graph()
RANDOM_SEED = 42
tf.set_random_seed(RANDOM_SEED)
boston = datasets.load_boston()
data = boston["data"]
target = boston["target"]
N_INSTANCES = data.shape[0]
N_INPUT = data.shape[1] - 1
N_CLASSES = 3
TEST_SIZE = 0.1
TRAIN_SIZE = int(N_INSTANCES * (1 - TEST_SIZE))
batch_size = 40
training_epochs = 400
learning_rate = 0.001
display_step = 20
hidden_size = 200
target_ = np.zeros((N_INSTANCES, N_CLASSES))
data_train, data_test, target_train, target_test = train_test_split(data, target_, test_size=0.1, random_state=100)
x_data = tf.placeholder(shape=[None, N_INPUT], dtype=tf.float32)
y_target = tf.placeholder(shape=[None, N_CLASSES], dtype=tf.float32)
# creates activation function
def gaussian_function(input_layer):
    initial = math.exp(-2*math.pow(input_layer, 2))
    return initial
np_gaussian_function = np.vectorize(gaussian_function)
def d_gaussian_function(input_layer):
    initial = -4 * input_layer * math.exp(-2*math.pow(input_layer, 2))
    return initial
np_d_gaussian_function = np.vectorize(d_gaussian_function)
np_d_gaussian_function_32 = lambda input_layer: np_d_gaussian_function(input_layer).astype(np.float32)
def tf_d_gaussian_function(input_layer, name=None):
    with ops.name_scope(name, "d_gaussian_function", [input_layer]) as name:
        y = tf.py_func(np_d_gaussian_function_32, [input_layer],[tf.float32], name=name, stateful=False)
        return y[0]
def py_func(func, inp, Tout, stateful=True, name=None, grad=None):
    rnd_name = 'PyFunGrad' + str(np.random.randint(0, 1E+8))
    tf.RegisterGradient(rnd_name)(grad)
    g = tf.get_default_graph()
    with g.gradient_override_map({"PyFunc": rnd_name}):
        return tf.py_func(func, inp, Tout, stateful=stateful, name=name)
def gaussian_function_grad(op, grad):
    input_variable = op.inputs[0]
    n_gr = tf_d_gaussian_function(input_variable)
    return grad * n_gr
np_gaussian_function_32 = lambda input_layer: np_gaussian_function(input_layer).astype(np.float32)
def tf_gaussian_function(input_layer, name=None):
    with ops.name_scope(name, "gaussian_function", [input_layer]) as name:
        y = py_func(np_gaussian_function_32, [input_layer], [tf.float32], name=name, grad=gaussian_function_grad)
        return y[0]
# end of defining activation function
def rbf_network(input_layer, weights):
    layer1 = tf.matmul(tf_gaussian_function(input_layer), weights['h1'])
    layer2 = tf.matmul(tf_gaussian_function(layer1), weights['h2'])
    output = tf.matmul(tf_gaussian_function(layer2), weights['output'])
    return output
weights = {
'h1': tf.Variable(tf.random_normal([N_INPUT, hidden_size], stddev=0.1)),
'h2': tf.Variable(tf.random_normal([hidden_size, hidden_size], stddev=0.1)),
'output': tf.Variable(tf.random_normal([hidden_size, N_CLASSES], stddev=0.1))
}
pred = rbf_network(x_data, weights)
cost = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(logits=pred, labels=y_target))
my_opt = tf.train.GradientDescentOptimizer(learning_rate).minimize(cost)
correct_prediction = tf.equal(tf.argmax(pred, 1), tf.argmax(y_target, 1))
accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))
init = tf.global_variables_initializer()
sess = tf.InteractiveSession()
sess.run(init)
# Training loop
for epoch in range(training_epochs):
    avg_cost = 0.
    total_batch = int(data_train.shape[0] / batch_size)
    for i in range(total_batch):
        randidx = np.random.randint(int(TRAIN_SIZE), size=batch_size)
        batch_xs = data_train[randidx, :]
        batch_ys = target_train[randidx, :]
        sess.run(my_opt, feed_dict={x_data: batch_xs, y_target: batch_ys})
        avg_cost += sess.run(cost, feed_dict={x_data: batch_xs, y_target: batch_ys})/total_batch
    if epoch % display_step == 0:
        print("Epoch: %03d/%03d cost: %.9f" % (epoch, training_epochs, avg_cost))
        train_accuracy = sess.run(accuracy, feed_dict={x_data: batch_xs, y_target: batch_ys})
        print("Training accuracy: %.3f" % train_accuracy)
test_acc = sess.run(accuracy, feed_dict={x_data: data_test, y_target: target_test})
print("Test accuracy: %.3f" % (test_acc))
sess.close()
As has been said, you should have N_INPUT = data.shape[1].
data.shape[0] is the number of samples in your dataset, and data.shape[1] is the number of features the network should consider.
The number of features is by definition the size of the input layer, regardless of how many samples you feed to your network (via feed_dict).
Also, the Boston dataset is a regression problem, while softmax_cross_entropy is a cost function for classification problems. You can try tf.square to evaluate the squared distance between what you are predicting and what you want:
cost = tf.reduce_mean(tf.square(pred - y_target))
You will see that your network is learning, even though the accuracy is not very high.
Edit:
Your code is actually learning well, but you used the wrong tool to measure it.
Mainly, your errors come from the fact that you are dealing with a regression problem, not a classification problem.
In a classification problem, you can evaluate the accuracy of your ongoing learning process using
correct_prediction = tf.equal(tf.argmax(pred, 1), tf.argmax(y_target, 1))
accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))
It consists of checking whether the predicted class is the same as the expected class for each input in x_test.
In a regression problem, doing so makes no sense, since you are looking for a real number, i.e. an infinite number of possibilities from the classification point of view.
In a regression problem, you can instead estimate the error (mean or otherwise) between the predicted values and the expected values. We can use what I suggested above:
cost = tf.reduce_mean(tf.square(pred - y_target))
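For illustration, a small NumPy sketch with made-up numbers of why the squared error is the more informative measure here: the argmax-based accuracy can report a perfect score even when the predicted values are far from the targets.

```python
import numpy as np

# Made-up predictions and targets with three output columns each.
pred = np.array([[2.0, 1.0, 0.0],
                 [9.0, 3.0, 1.0]])
target = np.array([[24.0, 1.0, 0.0],
                   [21.6, 3.0, 1.0]])

# Classification-style accuracy only checks which column is the largest.
accuracy = np.mean(np.argmax(pred, 1) == np.argmax(target, 1))

# Regression-style error measures how far the values actually are.
mse = np.mean(np.square(pred - target))
```

Here the largest column agrees in both rows, so `accuracy` is 1.0, while `mse` is large because the first column is off by 22 and 12.6.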
I modified your code accordingly; here it is:
import matplotlib.pyplot as plt  # needed for the plotting calls below
pred = rbf_network(x_data, weights)
cost = tf.reduce_mean(tf.square(pred - y_target))
my_opt = tf.train.GradientDescentOptimizer(learning_rate).minimize(cost)
#correct_prediction = tf.equal(tf.argmax(pred, 1), tf.argmax(y_target, 1))
#accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))
init = tf.global_variables_initializer()
sess = tf.InteractiveSession()
sess.run(init)
plt.figure("Error evolution")
plt.xlabel("N_epoch")
plt.ylabel("Error evolution")
tol = 5e-4
epoch, err=0, 1
# Training loop: stop at the epoch limit or once the error drops below tol
while epoch <= training_epochs and err >= tol:
    avg_cost = 0.
    total_batch = int(data_train.shape[0] / batch_size)
    for i in range(total_batch):
        randidx = np.random.randint(int(TRAIN_SIZE), size=batch_size)
        batch_xs = data_train[randidx, :]
        batch_ys = target_train[randidx, :]
        sess.run(my_opt, feed_dict={x_data: batch_xs, y_target: batch_ys})
        avg_cost += sess.run(cost, feed_dict={x_data: batch_xs, y_target: batch_ys})/total_batch
    plt.plot(epoch, avg_cost, marker='o', linestyle="none", c='k')
    plt.pause(0.05)
    err = avg_cost
    if epoch % 10 == 0:
        print("Epoch: {}/{} err = {}".format(epoch, training_epochs, avg_cost))
    epoch +=1
print ("End of learning process")
print ("Final epoch = {}/{} ".format(epoch, training_epochs))
print ("Final error = {}".format(err) )
sess.close()
The output is
Epoch: 0/400 err = 0.107879924503
Epoch: 10/400 err = 0.00520248359747
Epoch: 20/400 err = 0.000651647908274
End of learning process
Final epoch = 26/400
Final error = 0.000474644409471
We plot the evolution of the error in the training through the different epochs
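The stopping rule used in the loop above (stop at the epoch limit or once the error falls below a tolerance) can be isolated as a small plain-Python sketch. `train_until` and `fake_epoch` are hypothetical names, and `fake_epoch` returns a made-up error that halves every epoch in place of a real training pass.

```python
def train_until(step_fn, max_epochs, tol):
    """Call step_fn(epoch) until the error drops below tol or max_epochs is exceeded."""
    epoch, err = 0, float("inf")
    history = []
    while epoch <= max_epochs and err >= tol:
        err = step_fn(epoch)   # one full pass over the data, returning the average cost
        history.append(err)
        epoch += 1
    return epoch, err, history

def fake_epoch(epoch):
    """Stand-in for a training epoch: the error halves each time."""
    return 0.1 * 0.5 ** epoch

final_epoch, final_err, history = train_until(fake_epoch, max_epochs=400, tol=5e-4)
```

Keeping the per-epoch errors in `history` also makes it possible to plot the whole curve once at the end instead of calling `plt.plot` inside the loop.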
I'm also new to TensorFlow and this is my first answer on Stack Overflow. I tried your code and I got the same error.
You can see from the error message, ValueError: Cannot feed value of shape (40, 13) for Tensor 'Placeholder:0', which has shape '(?, 12)', that there is a mismatch in the shape of the first placeholder:
x_data = tf.placeholder(shape=[None, N_INPUT], dtype=tf.float32)
so I'm not sure why N_INPUT has a -1 in this line:
N_INPUT = data.shape[1] - 1
I tried removing it and the code runs, though it looks like the network isn't learning.
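The mismatch can be reproduced without TensorFlow. In the sketch below, `fits_placeholder` is a hypothetical helper that mimics the shape check a placeholder performs (`None` matches any size), and the zero array only stands in for the Boston data, which has 506 samples and 13 features.

```python
import numpy as np

def fits_placeholder(value_shape, placeholder_shape):
    """Return True if a value of value_shape can be fed; None matches any size."""
    if len(value_shape) != len(placeholder_shape):
        return False
    return all(want is None or got == want
               for got, want in zip(value_shape, placeholder_shape))

data = np.zeros((506, 13))   # stand-in with the shape of the Boston data
batch = data[:40]            # a batch of 40 rows, shape (40, 13)

# Placeholder sized from the data itself: the batch fits.
ok = fits_placeholder(batch.shape, (None, data.shape[1]))

# Placeholder sized with the extra "- 1": (40, 13) does not fit (?, 12).
bad = fits_placeholder(batch.shape, (None, data.shape[1] - 1))
```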
While this implementation will do the job, I don't think it's the best RBF implementation. You are using a fixed number of 200 centroids (hidden units) in your RBF network. As a result, the centroids are not optimally placed and the width of your Gaussian basis functions is not optimally sized. Typically the centroids should be learned in an unsupervised pre-stage using K-means or another clustering algorithm.
So your first training stage would involve finding the centroids (centers) of the RBFs, and the second stage would be the actual classification or regression using the RBF network.
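A minimal NumPy sketch of this two-stage idea on a toy problem, under several assumptions: the `kmeans` and `rbf_features` helpers are written here for illustration, the width parameter `gamma` is fixed by hand rather than tuned, and the second stage solves for the output weights by linear least squares instead of gradient descent. The data is a made-up 1-D sine curve, not the Boston set.

```python
import numpy as np

def kmeans(X, k, n_iter=50, seed=0):
    """Plain Lloyd's algorithm: returns k cluster centers for the rows of X."""
    rng = np.random.RandomState(seed)
    centers = X[rng.choice(len(X), size=k, replace=False)].copy()
    for _ in range(n_iter):
        # Squared distance from every sample to every center, shape (n, k).
        dists = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = dists.argmin(axis=1)
        for j in range(k):
            members = X[labels == j]
            if len(members):          # keep the old center if a cluster is empty
                centers[j] = members.mean(axis=0)
    return centers

def rbf_features(X, centers, gamma):
    """Gaussian activations exp(-gamma * ||x - c||^2), shape (n, k)."""
    sq = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-gamma * sq)

# Toy regression data: y = sin(x) on [-3, 3].
rng = np.random.RandomState(1)
X = rng.uniform(-3, 3, size=(200, 1))
y = np.sin(X[:, 0])

# Stage 1 (unsupervised): place the RBF centers with k-means.
centers = kmeans(X, k=10)

# Stage 2 (supervised): fit linear output weights on the RBF features plus a bias.
Phi = np.c_[rbf_features(X, centers, gamma=1.0), np.ones(len(X))]
weights, *_ = np.linalg.lstsq(Phi, y, rcond=None)
train_mse = np.mean((Phi @ weights - y) ** 2)
```

In a TensorFlow version, the centers found in stage 1 could be stored as constants and only the output weights trained with the optimizer.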