Hello, I am working with a binary time series of expression data, encoded as follows:
0: decrease expression
1: increase expression
I am training a Bidirectional LSTM network to predict the next value, but instead of giving me values of 0 or 1, it returns values like:
0.564
0.456
0.423
0.58
How can I get it to return 0 or 1?
This is my code:
ventana = 10
n_features = 1
neurons = 256 #155
activacion = 'softmax'
perdida = 0.25
batch_size = 32 # 32
epochs = 100 # 200
X, y = split_sequence(cierres, ventana)
X = X.reshape((X.shape[0], X.shape[1], n_features))
# define model
model = Sequential()
model.add(Bidirectional(LSTM(neurons, activation=activacion), input_shape=(ventana, n_features)))
model.add(Dropout(perdida))
model.add(Dense(1))
model.compile(optimizer='adam', loss='binary_crossentropy')
# fit model
model.fit(X, y, epochs=epochs, batch_size=batch_size, verbose=1, shuffle=False)
The network is effectively performing a regression on the data, so it does not give an exact 0 or 1. A value in between acts as a degree of confidence: the closer the number is to 1, the more confident the network is that the answer is 1. To get hard labels, apply thresholding, i.e. round the output to 0 or 1.
import numpy as np
y_out = model.predict(X)  # predict() returns the scores; fit() only returns a training History
y_pred = np.round(y_out)
That said, rounding does not minimize every kind of loss function. If you are being scored on a function like MSE, it is better to keep the numbers as they are.
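A minimal sketch of the thresholding step, using NumPy only. The `probs` array stands in for whatever `model.predict` returns (the values are the ones quoted in the question), and the helper name `to_labels` and the 0.5 cutoff are assumptions for illustration. It also assumes the output layer uses a sigmoid so that the scores lie in [0, 1]. Note that `np.round` rounds exactly 0.5 down to 0 (round-half-to-even), so an explicit comparison makes the cutoff unambiguous.

```python
import numpy as np

def to_labels(probs, threshold=0.5):
    """Map scores in [0, 1] to hard 0/1 labels (hypothetical helper)."""
    probs = np.asarray(probs, dtype=float)
    return (probs >= threshold).astype(int)

# Stand-in for model.predict(X): the example outputs quoted in the question.
probs = np.array([0.564, 0.456, 0.423, 0.58])
labels = to_labels(probs)
print(labels)
```

Raising or lowering `threshold` trades false positives against false negatives, which may matter if the two kinds of error are not equally costly.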
For example, I have time series data like this:
[[1,0,0,0],[1,0,0,1],[1,0,1,0],[1,1,0,0], ...]
and the model should predict the next item from the past two.
I want to input [[1,0,0,0],[1,0,0,1]] and get [1,0,1,0].
So I built the model below.
input_len = 2
n_in = 4
n_hidden = 512
model = Sequential()
model.add(LSTM(n_hidden, input_shape=(input_len,n_in), return_sequences=True))
model.add(Dropout(0.1))
model.add(LSTM(n_hidden, return_sequences=False))
model.add(Dense(n_hidden, activation="linear"))
model.add(Dense(n_in, activation="linear"))
opt = Adam(lr=0.001)
model.compile(loss='mse', optimizer=opt)
model.summary()
# training and validation data
X #X.shape (800, 2, 4) [ [[1,0,0,1],[1,0,0,1]],[[1,0,0,1],[1,0,0,0]],,,
Y #Y.shape (200, 2, 4)
val_x #val_x.shape (800,1,4) [[1,0,1,0]][1,1,1,0],,,,
val_y #val_y.shape (200,1,4)
history = model.fit(X, Y, epochs=50, validation_data=(val_x, val_y))
# then predict
in_ = np.array([[[1,0,0,1],[1,1,1,1]]])  # one sample, shape (1, 2, 4)
out_ = model.predict(in_)
print(out_)
I expect each value in the result to be 1 or 0.
However, I get numbers like this: [[4.9627638e-01 1.4797167e-01 3.3314908e-01 1.3892795e-04]]
I guess this is related to the activation or the optimizer.
Am I correct? How should I handle 1 and 0 data?
After changing linear to relu, the result becomes [0.41842282 0.1275532 0. 0.4288069].
However, it is still not 0 or 1.
The model output cannot be discrete because it has to be differentiable. Try adding something like this:
out_ = tf.cast(tf.math.greater(out_, 0.5), tf.int32)
The prediction may still be wrong; the accuracy depends on your data (e.g. if your data is random and has no pattern, you get about 6% accuracy). Try training on only [[1,0,0,0],[1,0,0,1],[1,0,1,0]] to make sure that your model works.
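To make the 6% figure concrete: with four independent binary outputs, a guess that is right half the time on each bit matches the whole vector with probability 0.5^4 = 0.0625. The sketch below uses NumPy rather than TensorFlow to threshold the example output from the question and compute that chance level; the variable names `bits`, `n_bits` and `chance` are illustrative assumptions.

```python
import numpy as np

# Raw network output quoted in the question (one sample, four outputs).
out_ = np.array([[4.9627638e-01, 1.4797167e-01, 3.3314908e-01, 1.3892795e-04]])

# NumPy equivalent of tf.cast(tf.math.greater(out_, 0.5), tf.int32).
bits = (out_ > 0.5).astype(np.int32)
print(bits)

# Chance of matching all four bits when each bit is a fair coin flip.
n_bits = out_.shape[1]
chance = 0.5 ** n_bits
print(f"{chance:.2%}")
```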
When training my binary neural network I'm observing something curious. Despite the test and training data and labels being balanced and symmetric, the network's predictions are not.
After 100 epochs this is what I get:
1 prediction: 0.89635 0 prediction: 0.4742
I was expecting an even 0.5, 0.5 split.
Why does the network skew towards one side?
My network is trying to predict the winner of a basketball game given an input vector of the scores of all 10 players. The output is a sigmoid indicating whether team 1 is winning. The network should be symmetric, i.e. if [team1_scores,team2_scores] = 1 then [team2_scores,team1_scores] = 0. To ensure this, I flip the training data and labels so that the winning and the losing team each appear in both positions of the input vector.
Here is my code:
from tflearn.layers.core import fully_connected, input_data
from tflearn.layers.estimator import regression
import tflearn
import numpy as np
# flip data so that [team1_scores, team2_scores] becomes [team2_scores, team1_scores]
def flip(x):
    return np.concatenate([x[:,5:], x[:,:5]], axis=1)
# this function interweaves 2 vectors so that [0,0,0] and [1,1,1] becomes [0,1,0,1,0,1]
def interweave(a,b):
    c = np.empty((a.shape[0] + b.shape[0],a.shape[1]), dtype=a.dtype)
    c[0::2] = a
    c[1::2] = b
    return c
net = input_data(shape=[None, 10])
net = fully_connected(net, 32, activation='relu')
net = fully_connected(net, 16, activation='relu')
net = fully_connected(net, 1, activation='sigmoid')
net = regression(net, shuffle_batches=True, loss='binary_crossentropy')
model = tflearn.DNN(net)
x = np.load("scores.npy")
x_flipped = flip(x)
#x is sorted such that the winning team always comes first in the input vector, so the labels are all 1
y = np.ones((x.shape[0], 1))
y_flipped = np.zeros((x.shape[0], 1))
x_symmetric = interweave(x, x_flipped)
y_symmetric = interweave(y, y_flipped)
for epoch in range(100):
    model.fit(x_symmetric, y_symmetric, n_epoch=1, shuffle=True, validation_set=None, show_metric=True, batch_size=128)
    acc_reg = model.evaluate(x, y)[0]
    acc_flip = model.evaluate(x_flipped, y_flipped)[0]
    print(f"1 prediction: {acc_reg} 0 prediction: {acc_flip}")
And here is the training data: scores.npy
The training data is standardized and sorted so that the winning team comes before the losing team. Thus all labels are 1
I have a dataset that looks like this:
df.head(5)
data labels
0 [0.0009808844009380855, 0.0008974465127279559] 1
1 [0.0007158940267629654, 0.0008202958833774329] 3
2 [0.00040971929722210984, 0.000393972522972382] 3
3 [7.916243163372941e-05, 7.401835468434177e243] 3
4 [8.447556379936086e-05, 8.600626393842705e-05] 3
The 'data' column is my X and the labels is y. The df has 34890 rows. Each row contains 2 floats. The data represents a bunch of sequential text and each observation is a representation of a sentence. There are 5 classes.
I am training it on this LSTM code:
data = df.data.values
labels = pd.get_dummies(df['labels']).values
X_train, X_test, y_train, y_test = train_test_split(data,labels, test_size = 0.10, random_state = 42)
X_train = X_train.reshape((X_train.shape[0],1,X_train.shape[1])) # shape = (31401, 1, 5)
X_test = X_test.reshape((X_test.shape[0],1,X_test.shape[1])) # shape = (3489, 1, 5)
### y_train shape = (31401, 5)
### y_test shape = (3489, 5)
### Bi_LSTM
Bi_LSTM = Sequential()
Bi_LSTM.add(layers.Bidirectional(layers.LSTM(32)))
Bi_LSTM.add(layers.Dropout(.5))
# Bi_LSTM.add(layers.Flatten())
Bi_LSTM.add(Dense(11, activation='softmax'))
def compile_and_fit(history):
    history.compile(optimizer='rmsprop',
                    loss='categorical_crossentropy',
                    metrics=['accuracy'])
    history = history.fit(X_train,
                          y_train,
                          epochs=30,
                          batch_size=32,
                          validation_data=(X_test, y_test))
    return history
LSTM_history = compile_and_fit(Bi_LSTM)
The model trains, but the val accuracy is fixed at 53% for every epoch. I am assuming this is because of my class imbalance problem (1 class takes up ~53% of the data, the other 4 are somewhat evenly distributed throughout the remaining 47%).
How do I balance my data? I am aware of typical over/under sampling techniques on non-time series data, but I can't over/under sample because that would mess with the sequential time-series nature of the data. Any advice?
EDIT
I am attempting to use the class_weight argument in Keras to address this. I am passing this dict into the class_weight argument:
class_weights = {
0: 1/len(df[df.label == 1]),
1: 1/len(df[df.label == 2]),
2: 1/len(df[df.label == 3]),
3: 1/len(df[df.label == 4]),
4: 1/len(df[df.label == 5]),
}
Which I am basing off of this recommendation:
https://stats.stackexchange.com/questions/342170/how-to-train-an-lstm-when-the-sequence-has-imbalanced-classes
However, the acc/loss is now really awful. I get ~30% accuracy with a dense net, so I expected the LSTM to be an improvement. See acc/loss curves below:
Keras/TensorFlow lets you pass class_weight or sample_weight to the model.fit method.
class_weight affects the relative weight of each class in the calculation of the objective function. sample_weight, as the name suggests, allows finer control over the relative weight of individual samples, including samples that belong to the same class.
class_weight accepts a dictionary that maps each class index to its weight, while sample_weight takes a one-dimensional array with len(y_train) entries, one weight per sample.
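As a rough sketch of the difference, the snippet below builds both objects with NumPy for a toy label vector. The inverse-frequency formula `n_samples / (n_classes * count)` is one common choice and an assumption here, not something the answer prescribes, and `y_int` is a hypothetical array of integer class indices. In Keras the results would then be passed as `model.fit(..., class_weight=class_weight)` or `model.fit(..., sample_weight=sample_weight)`.

```python
import numpy as np

# Hypothetical integer class indices (0..4), imbalanced on purpose.
y_int = np.array([0, 0, 0, 0, 0, 0, 1, 2, 3, 4])

classes, counts = np.unique(y_int, return_counts=True)

# One weight per class: rarer classes get larger weights.
weights = len(y_int) / (len(classes) * counts)
class_weight = {int(c): float(w) for c, w in zip(classes, weights)}

# One weight per sample, same length as the labels.
sample_weight = np.array([class_weight[int(c)] for c in y_int])

print(class_weight)
print(sample_weight.shape)
```

With this weighting the sample weights sum to the number of samples, so the overall scale of the loss stays roughly comparable to the unweighted case.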
Hello, I'm trying to complete an assignment on training a perceptron (without any hidden layer) to perform binary classification using a sigmoid activation function, but for some reason my code is not working correctly. Although the error decreases after each epoch, the accuracy does not increase. My target labels are 1 and 0, but my predicted labels are almost all close to 1. None of my predictions represents the 0 class.
Below is my code. Can anyone please tell me what I have done wrong?
# Create a Neural_Network class
class Neural_Network(object):
    def __init__(self,inputSize = 2,outputSize = 1 ):
        # size of layers
        self.inputSize = inputSize
        self.outputSize = outputSize
        #weights
        self.W1 = 0.01*np.random.randn(inputSize+1, outputSize) # randomly initialize W1 using random function of numpy
        # size of the weight will be (inputSize +1, outputSize) that +1 is for bias
    def feedforward(self, X): #forward propagation through our network
        n,m=X.shape
        Xbias = np.ones((n,1)) #bias term in input
        Xnew = np.hstack((Xbias,X)) #adding bias term in input to match the dimension with the weight
        self.product=np.dot(Xnew,self.W1) # dot product of X (input) and set of weights
        output=self.sigmoid(self.product) # apply activation function (i.e. sigmoid)
        return output # return your answer with as a final output of the network
    def sigmoid(self, s):# apply sigmoid function on s and return its value
        return (1./(1. + np.exp(-s))) #activation sigmoid function
    def sigmoid_derivative(self, s):#derivative of sigmoid
        #derivative of sigmoid = sigmoid(x)*(1-sigmoid(x))
        return s*(1-s) # here s will be sigmoid(x)
    def backwardpropagate(self,X, Y, y_pred, lr):
        # backward propagate through the network
        # compute error in output which is loss, compute cross entropy loss function
        self.output_error=self.crossentropy(Y,y_pred) #output error
        # applying derivative of sigmoid to the error
        self.error_deriv=self.output_error*self.sigmoid_derivative(y_pred)
        # adjust set of weights
        n,m=X.shape
        Xbias = np.ones((n,1)) #bias term in input
        Xnew = np.hstack((Xbias,X)) #adding bias term in input to match the dimension with the weight
        self.W1 += lr*(Xnew.T.dot(self.error_deriv)) # W1=W1+ learningrate*errorderiv*input
        #self.W1 += X.T.dot(self.z2_delta)
    def crossentropy(self, Y, Y_pred):
        # compute error based on crossentropy loss
        #Cross entropy= sum(Y_actual*log(y_predicted))/N. here 1e-6 is used to avoid log 0
        N = Y_pred.shape[0]
        #cr_entropy=-np.sum(((Y*np.log(Y_pred+1e-6))+((1-Y)*np.log(1-Y_pred+1e-6))))/N
        cr_entropy=-np.sum(Y*np.log(Y_pred+1e-6))/N
        return cr_entropy #error
    Null=None
    def train(self, trainX, trainY,epochs = 100, learningRate = 0.001, plot_err = True ,validationX = Null, validationY = Null):
        tr_error=[]
        for i in range(epochs):
            # feed forward trainX and trainY and receive predicted value
            y_predicted=self.feedforward(trainX)
            print(i,y_predicted)
            # backpropagation with trainX, trainY, predicted value and learning rate.
            self.backwardpropagate(trainX,trainY,y_predicted,learningRate)
            tr_error.append(self.output_error)
            print(i,self.output_error)
            print(i,self.W1)
        # """if validationX and validationY are not null than show validation accuracy and error of the model."""
        # plot error of the model if plot_err is true
        epocharray=range(0,epochs)
        plt.plot(epocharray,tr_error,'r',linewidth=3.0) #plotting error vs. no. of epochs
        plt.xlabel('No. of Epochs')
        plt.ylabel('Cross Entropy Error')
        plt.title('Error Vs. Epoch')
    def predict(self, testX):
        # predict the value of testX
        self.ytest_pred=self.feedforward(testX)
    def accuracy(self, testX, testY):
        import math
        # predict the value of trainX
        self.ytest_pred1=self.feedforward(testX)
        acc=0
        # compare it with testY
        for j in range(len(testY)):
            q=math.ceil(self.ytest_pred1[j])
            #p=round(q)
            if testY[j] == q:
                acc +=1
        accuracy=acc/float(len(testX))*100
        print("Percentage Accuracy is", accuracy,"%")
        # compute accuracy, print it and """show in the form of picture"""
        return accuracy # return accuracy
# generating dataset point
np.random.seed(1)
no_of_samples = 2000
dims = 2
#Generating random points of values between 0 to 1
class1=np.random.rand(no_of_samples,dims)
#To add separability we will add a bias of 1.1
class2=np.random.rand(no_of_samples,dims)+1.1
class_1_label=np.array([1 for n in range(no_of_samples)])
class_2_label=np.array([0 for n in range(no_of_samples)])
#Lets visualize the dataset
plt.scatter(class1[:,0],class1[:,1], marker='^', label="class 1")
plt.scatter(class2[:,0],class2[:,1], marker='o', label="class 2")
plt.xlabel('Dimension 1')
plt.ylabel('Dimension 2')
plt.legend(loc='best')
plt.show()
from sklearn.utils import shuffle
from sklearn.model_selection import train_test_split
# Data concatenation
data = np.concatenate((class1,class2),axis=0)
label = np.concatenate((class_1_label,class_2_label),axis=0)
#Note: shuffle this dataset before dividing it into three parts
data,label=shuffle(data,label)
#print(data)
# now using train_test_split command to split data into 60% training data, 20% testing data and 20% validation data
trainX, testX, trainY, testY = train_test_split(data, label, test_size=0.2, random_state=1)
trainX, validX, trainY, validY = train_test_split(trainX, trainY, test_size=0.25, random_state=1)
model = Neural_Network(2,1)
# try different combinations of epochs and learning rate
model.train(trainX, trainY, epochs = 100, learningRate = 0.000001, validationX = validX, validationY = validY)
model.accuracy( testX,testY)
The results come out like this (no prediction goes near 0):
0 [[0.49670809]
[0.4958389 ]
[0.4966064 ]
...
[0.49537492]
[0.49566927]
[0.4961255 ]]
0 828.1069658303942
0 [[0.48311074]
[0.51907406]
[0.52764299]]
1 [[0.69813116]
[0.91746189]
[0.80408611]
...
[0.74821077]
[0.87150079]
[0.75187736]]
1 250.96538025031356
1 [[0.56983781]
[0.59205773]
[0.60057486]]
2 [[0.72602796]
[0.94067579]
[0.83591236]
...
[0.77916283]
[0.90032058]
[0.78291184]]
2 210.645081151866
2 [[0.63353102]
[0.64265939]
[0.65118627]]
3 [[0.74507968]
[0.95318096]
[0.85588864]
...
[0.79953834]
[0.91705918]
[0.80329027]]
3 186.2933734713245
3 [[0.6846678 ]
[0.68164316]
[0.69020355]]
4 [[0.75952936]
[0.96114086]
[0.87010085]
...
[0.81456476]
[0.92830628]
[0.81829009]]
4 169.32091332021724
4 [[0.72771826]
[0.71342293]
[0.72202744]]
5 [[0.77112943]
[0.96669774]
[0.88093323]
...
[0.82635507]
[0.93649788]
[0.83004119]]
5 156.53923256347372
Please help me to solve this problem
I see you have set the learning rate too small. Set it to 0.001 and increase the number of epochs to 20k, and you will see your model learn well.
Plotting error vs. epochs should give you a better idea of where to stop.
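For comparison, here is a compact sketch of a single sigmoid unit trained by batch gradient descent. It is not the asker's code: it assumes the standard gradient of the mean cross-entropy for a sigmoid output, `X.T @ (y_pred - y) / N`, and converts scores to labels with a 0.5 threshold rather than `math.ceil`, which maps every sigmoid output in (0, 1] to 1. The function names, the learning rate of 0.5 and the 2000 epochs are illustrative values chosen for this toy data, not the values recommended above.

```python
import numpy as np

def sigmoid(s):
    return 1.0 / (1.0 + np.exp(-s))

def train_perceptron(X, y, epochs=2000, lr=0.5):
    """Single sigmoid unit trained by batch gradient descent (sketch)."""
    Xb = np.hstack((np.ones((X.shape[0], 1)), X))  # prepend bias column
    W = np.zeros((Xb.shape[1], 1))
    for _ in range(epochs):
        y_pred = sigmoid(Xb @ W)
        # Gradient of the mean cross-entropy w.r.t. W for a sigmoid output.
        grad = Xb.T @ (y_pred - y) / Xb.shape[0]
        W -= lr * grad
    return W

def predict_labels(X, W, threshold=0.5):
    Xb = np.hstack((np.ones((X.shape[0], 1)), X))
    return (sigmoid(Xb @ W) >= threshold).astype(int)

# Toy data shaped like the question's: class 1 in [0, 1), class 0 shifted by 1.1.
rng = np.random.default_rng(1)
class1 = rng.random((200, 2))
class0 = rng.random((200, 2)) + 1.1
X = np.vstack((class1, class0))
y = np.vstack((np.ones((200, 1)), np.zeros((200, 1))))

W = train_perceptron(X, y)
accuracy = float((predict_labels(X, W) == y).mean())
print(f"training accuracy: {accuracy:.3f}")
```

Labels are kept as a column vector of shape (N, 1) so that `y_pred - y` is an element-wise difference rather than a broadcast to an (N, N) matrix.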
I train a convolutional neural network (CNN) with TensorFlow. When the training is finished I calculate the accuracy with the following code:
...
correct = tf.equal(tf.argmax(prediction, 1), tf.argmax(y, 1))
accuracy = tf.reduce_mean(tf.cast(correct, tf.float32))
eval_batch_size = 1
good = 0
total = 0
for i in range(int(mnist.test.num_examples/eval_batch_size)):
    testSet = mnist.test.next_batch(eval_batch_size, shuffle=False)
    good += accuracy.eval(feed_dict={ x: testSet[0], y: testSet[1]})
    total += testSet[0].shape[0]
accuracy_eval = good/total
For “good” I get the value 1.0 when the test image is correctly classified and 0.0 if not.
I want to get the values of all ten output nodes. For example, if I evaluate a test image of a handwritten “8”, the output node for “8” might be 0.6, the node for “3” 0.3, the node for “5” 0.05, and the remaining 0.05 spread over the seven other output nodes.
So how do I get all ten values for each test image in TensorFlow?
You can do that by adding the following line:
pred=prediction.eval(feed_dict={ x: testSet[0], y: testSet[1]})
right after
testSet = mnist.test.next_batch(eval_batch_size, shuffle=False)
Then pred will be an array that contains 1 probability vector, and this is the vector you are interested in.
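One caveat: the question does not show how `prediction` is defined. If it holds raw logits rather than softmax outputs, the ten values will not sum to 1 and need a softmax first. A minimal NumPy sketch of that step follows; the logits are made up for illustration and the `softmax` helper is not part of the original code.

```python
import numpy as np

def softmax(logits):
    """Numerically stable softmax over the last axis."""
    z = logits - np.max(logits, axis=-1, keepdims=True)
    e = np.exp(z)
    return e / np.sum(e, axis=-1, keepdims=True)

# Made-up logits for one test image (batch of 1, ten classes).
pred = np.array([[0.1, -1.2, 0.3, 2.0, -0.5, 0.9, -2.0, 0.0, 3.1, -0.7]])
probs = softmax(pred)

print(probs.round(3))          # ten values, one per digit
print(int(np.argmax(probs)))   # index of the most likely digit
```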