I'm trying to write a multi-class perceptron algorithm for the MNIST dataset.
Right now I have the following code, which works, but because it iterates over the 60k samples one at a time it is slow.
weights has shape (785, 10):
def multiClassPLA(train_data, train_labels, weights):
    epoch_err = []  # will hold the misclassified ratio for each epoch
    best_weights = weights
    best_error = 1

    for epoch in range(EPOCH):
        err = 0
        # randomize the data before each epoch
        train_data, train_labels = randomizeData(train_data, train_labels)
        for x, y in zip(train_data, train_labels):
            h = oneVsAllLabeling_(np.dot(weights, x))
            diff = (y - h) / 2
            x = x.reshape(1, x.shape[0])
            diff = diff.reshape(CLASSES, 1)
            update_step = ETA * np.dot(diff, x)
            weights += update_step
    return weights
The oneVsAllLabeling_(X) function returns a vector that contains 1 at the argmax and -1 elsewhere; the truth labels have the same form, of course.
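For reference, here is a minimal per-sample sketch of that helper (my own illustrative version, not the exact code I use):

import numpy as np

def oneVsAllLabeling_(scores):
    # scores: the (CLASSES,) activation vector for one sample;
    # return +1 at the argmax and -1 everywhere else
    out = -np.ones_like(scores)
    out[np.argmax(scores)] = 1
    return out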
With this algorithm I get ~90% accuracy, which is acceptable, but it is slow.
After exploring the problem further, I found that I can improve the code using array/matrix multiplication.
So I started with the following:
def oneVsAllLabeling(X):
    idx = np.argmax(X, axis=1)
    mask = np.zeros(X.shape, dtype=bool)
    mask[np.arange(len(idx)), idx] = 1
    out = 2 * mask - 1
    return out.astype(int)
def zeroOneError(prediction):
    tester = np.zeros((1, CLASSES))
    good_prediction = len(np.where(prediction == tester))
    return len(prediction) - good_prediction
def preceptronModelFitting(data, weights, labels, to_print, epoch=None):
    prediction = np.matmul(data, weights)
    prediction = oneVsAllLabeling(prediction)
    diff = (prediction - labels) / 2
    error = zeroOneError(diff)
    accuracy = error / len(data)
    if to_print:
        print("Epoch: {}. Loss: {}. Accuracy: {}".format(epoch, error, accuracy))
    return prediction, error, accuracy
def multiClassPLA2(train_data, train_labels, test_data, test_labels, weights):
    predicted_output = np.zeros((1, CLASSES))
    train_loss_vec = np.array([])
    train_accuracy_vec = np.array([])
    test_loss_vec = np.array([])
    test_accuracy_vec = np.array([])

    for epoch in range(EPOCH):
        # randomize the data before each epoch
        train_data, train_labels = randomizeData(train_data, train_labels)
        train_prediction, train_error, train_accuracy = preceptronModelFitting(train_data, weights, train_labels, to_print=False)
    return weights
After calling preceptronModelFitting() I get a matrix of shape (60k, 10), in which every row looks like:
train_prediction[0] = [0, 0, 1, 0, 0, -1, 0, 0, 0, 0]
and the data has shape (60k, 785).
Now what I need to do, if possible, is multiply each of these rows with the corresponding data entry and sum everything up, so that in total I get a matrix of shape (785, 10) with which I can update the old set of weights.
This is almost equivalent to what I do in the inefficient algorithm; the only difference is that there I update the weights after every data entry instead of after seeing all the data.
Thanks!
OK, you have done most of the work already, and you even had part of the answer in your title:
np.matmul(X.T, truth - prediction)
This gets you what you want in one line.
Note that this relies on truth, prediction, and X having the shapes you mentioned.
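To make it concrete, here is a minimal sketch of the full batch update, assuming X is (N, 785) and truth and prediction are both (N, 10) as you described (ETA and preceptronModelFitting come from your post; the rest is illustrative):

import numpy as np

def batch_update(weights, X, truth, prediction, eta):
    # diff is (N, 10); X.T is (785, N), so the product is (785, 10),
    # the same shape as weights, and it sums over all N samples at once
    diff = (truth - prediction) / 2
    return weights + eta * np.matmul(X.T, diff)

# usage inside the epoch loop, replacing the per-sample inner loop:
# prediction, _, _ = preceptronModelFitting(train_data, weights, train_labels, to_print=False)
# weights = batch_update(weights, train_data, train_labels, prediction, ETA)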
I am using a simple GAN model from an example found on GitHub: https://github.com/paperd/deep-learning-models/blob/main/chapter10/ch10.ipynb.
In the example below, a custom function is created to train the GAN. The codings_size is the dimension of the random noise generated, and the function generates fresh random noise on every iteration. However, I want to use my own set of vectors of shape [6000, 208]. In the example the random noise has shape (32, 30) each time (32 is the batch size and 30 is the codings_size, i.e. the number of columns); in my case there are 208 columns and 6000 rows. My problem is how to loop through my vectors for each epoch; I am a bit confused.
Do I have to replace the following with a for loop that runs through my vectors and then adds batch_size each time in order to step to the next batch?
`
noise = tf.random.normal(
    shape=[batch_size, codings_size])
generated_images = generator(noise)
`
def train_gan(gan, dataset, batch_size,
              codings_size, n_epochs=50):
    generator, discriminator = gan.layers
    for epoch in range(n_epochs):
        print('Epoch {}/{}'.format(epoch + 1, n_epochs))
        for X_batch in dataset:
            # phase 1 - training the discriminator
            noise = tf.random.normal(
                shape=[batch_size, codings_size])
            generated_images = generator(noise)
            X_fake_and_real = tf.concat(
                [generated_images, X_batch], axis=0)
            y1 = tf.constant([[0.]] * batch_size + [[1.]] * batch_size)
            discriminator.trainable = True
            discriminator.train_on_batch(X_fake_and_real, y1)
            # phase 2 - training the generator
            noise = tf.random.normal(
                shape=[batch_size, codings_size])
            y2 = tf.constant([[1.]] * batch_size)
            discriminator.trainable = False
            gan.train_on_batch(noise, y2)
        plot_multiple_images(generated_images, 8)
        plt.show()
I added comments in the following function to mark where I want to use my own vectors:
`
# Creating a custom loop for training.
# Since the training loop is unusual, we can't use the regular
# fit method. Instead, we create a custom loop that needs a
# Dataset to iterate through the images.

# import my dataset
from numpy import genfromtxt

diffvoltages = genfromtxt('DiffVoltages.csv', delimiter=',')
diffvoltages.shape
# the output is (6000, 208)

def train_gan(gan, dataset, batch_size, codings_size, n_epochs=50):
    generator, discriminator = gan.layers
    for epoch in range(n_epochs):
        print('Epoch {}/{}'.format(epoch + 1, n_epochs))
        for X_batch in dataset:
            # phase 1 - training the discriminator
            # noise = tf.random.normal(shape=[batch_size, codings_size])
            # instead of using random.normal I want to use diffvoltages, as shown below
            noise = diffvoltages[batch_size]
            generated_images = generator(noise)
            X_fake_and_real = tf.concat(
                [generated_images, X_batch], axis=0)
            y1 = tf.constant([[0.]] * batch_size + [[1.]] * batch_size)
            discriminator.trainable = True
            discriminator.train_on_batch(X_fake_and_real, y1)
            # phase 2 - training the generator
            # noise = tf.random.normal(shape=[batch_size, codings_size])
            # This is the vector I want to use, but for the next iteration I suppose
            # I have to skip ahead by batch_size. This is my problem; I am not sure
            # how to proceed. Should I use a for loop in both cases, or is there a
            # simpler way to do it?
            noise = diffvoltages[batch_size]
            y2 = tf.constant([[1.]] * batch_size)
            discriminator.trainable = False
            gan.train_on_batch(noise, y2)
            # noise + batch_size?
        plot_multiple_images(generated_images, 8)
        plt.show()
`
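To make the question concrete, this is the kind of batch stepping I have in mind (only a rough sketch, not working GAN code; the slicing is exactly the part I am unsure about):

import numpy as np

diffvoltages = np.zeros((6000, 208))  # stand-in for the array loaded from the CSV
batch_size = 32
n_epochs = 50

for epoch in range(n_epochs):
    for start in range(0, len(diffvoltages) - batch_size + 1, batch_size):
        noise = diffvoltages[start:start + batch_size]  # shape (32, 208)
        # generated_images = generator(noise) would go here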
Thank you for reading my post.
I'm currently developing a peak detection algorithm using a CNN, with the goal of determining the ideal convolution kernel (interpretable as an ideal mother wavelet function) that maximizes peak detection accuracy.
To begin with, I created my own IoU loss function and a simple model and tried to run the training. The execution itself worked without any errors, but somehow the training failed:
the parameters of the model with the custom loss function are not updated at all over the training epochs.
My loss function is shown below.
def IoU(inputs: torch.Tensor, labels: torch.Tensor,
        smooth: float = 0.1, threshold: float = 0.5, alpha: float = 1.0):
    '''
    - alpha: a parameter that sharpens the thresholding.
      if alpha = 1 -> the thresholded input is the same as the raw input.
    '''
    thresholded_inputs = inputs**alpha / (inputs**alpha + (1 - inputs)**alpha)
    inputs = torch.where(thresholded_inputs < threshold, 0, 1)
    batch_size = inputs.shape[0]

    intersect_tensor = (inputs * labels).view(batch_size, -1)
    intersect = intersect_tensor.sum(-1)

    union_tensor = torch.max(inputs, labels).view(batch_size, -1)
    union = union_tensor.sum(-1)

    iou = (intersect + smooth) / (union + smooth)  # we smooth the division to avoid 0/0
    iou_score = iou.mean()

    return 1 - iou_score
and my training model is,
class MLP(nn.Module):
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv1d(1, 1, kernel_size=32, stride=1, padding=16),
            nn.Linear(257, 256),
            nn.LogSoftmax(1)
        )

    def forward(self, x):
        return self.net(x)

model = MLP()
opt = optim.Adadelta(model.parameters())

# initialization of the Conv1d kernel
def init_kernel(m):
    if type(m) == nn.Conv1d:
        nn.init.kaiming_normal_(m.weight)
        print(m.weight)
        plt.plot(m.weight[0][0].detach().numpy())

model.apply(init_kernel)

def step(x, y, is_train=True):
    opt.zero_grad()
    y_pred = model(x)
    y_pred = y_pred.reshape(-1, 256)

    loss = IoU(y_pred, y)
    loss.requires_grad = True
    loss.retain_grad = True

    if is_train:
        loss.backward()
        opt.step()
    return loss, y_pred
and lastly, the execution code is,
from torch.autograd.grad_mode import F

train_loss_arr, val_loss_arr = [], []
valbose = 10
epochs = 200

for e in range(epochs):
    train_loss, val_loss, acc = 0., 0., 0.
    for x, y in train_set.as_numpy_iterator():
        x = torch.from_numpy(x)
        y = torch.from_numpy(y)
        model.train()
        loss, y_pred = step(x, y, is_train=True)
        train_loss += loss.item()
    train_loss /= len(train_set)

    for x, y in val_set.as_numpy_iterator():
        x = torch.from_numpy(x)
        y = torch.from_numpy(y)
        model.eval()
        with torch.no_grad():
            loss, y_pred = step(x, y, is_train=False)
            val_loss += loss.item()
    val_loss /= len(val_set)

    train_loss_arr.append(train_loss)
    val_loss_arr.append(val_loss)

    # visualize the current kernel to check whether learning is progressing safely
    if e % valbose == 0:
        print(f"Epoch[{e}]({(e*100/epochs):0.2f}%): train_loss: {train_loss:0.4f}, val_loss: {val_loss:0.4f}")
        fig, axs = plt.subplots(1, 4, figsize=(12, 4))
        print(y_pred[0], y_pred[0].shape)
        axs[0].plot(x[0][0])
        axs[0].set_title("spectra")
        axs[1].plot(y_pred[0])
        axs[1].set_title("y pred")
        axs[2].plot(y[0])
        axs[2].set_title("y true")
        axs[3].plot(model.state_dict()["net.0.weight"][0][0].numpy())
        axs[3].set_title("kernel1")
        plt.show()
With this code I tried to evaluate the simple model; however, the model parameters did not change at all over the epochs.
Visualization of the results at epochs 0 and 30:
[figure: prediction and kernel at epoch 0]
[figure: prediction and kernel at epoch 30]
As you can see, the kernel is not modified by the training over the epochs.
I spent hours trying to figure out what causes this problem, but I'm still not sure how to make my loss function and model trainable.
Thank you.
Try printing the gradient after loss.backward() with:
print(y_pred.grad)
I suspect what you'll find is that, after a backward pass, the gradient of y_pred is zero. This means either that (a) gradient tracking is not enabled for one or more of the variables that are nodes of the computation graph, or (b), more likely, you are using an operation that is not differentiable.
In your case, at a minimum torch.where is non-differentiable, so you'll need to replace that. Thresholding operations are non-differentiable in general and are usually replaced with "soft" thresholding operations (see Softmax instead of max function for classification) so that gradient computation still works. Try replacing this with a soft threshold, or with no threshold at all.
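For illustration, here is a sketch of a soft version of your IoU loss with the hard torch.where thresholding removed; this assumes the inputs are probabilities in [0, 1] and is a possible replacement, not your original code:

import torch

def soft_iou_loss(inputs: torch.Tensor, labels: torch.Tensor,
                  smooth: float = 0.1, alpha: float = 1.0):
    # keep the differentiable "sharpening" but skip the hard 0/1 thresholding
    soft_inputs = inputs**alpha / (inputs**alpha + (1 - inputs)**alpha)
    batch_size = soft_inputs.shape[0]
    soft_inputs = soft_inputs.reshape(batch_size, -1)
    labels = labels.reshape(batch_size, -1)
    intersect = (soft_inputs * labels).sum(-1)
    # soft union: a + b - a*b is a differentiable stand-in for element-wise max
    union = (soft_inputs + labels - soft_inputs * labels).sum(-1)
    iou = (intersect + smooth) / (union + smooth)
    return 1 - iou.mean()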
I am new to TensorFlow. I am trying to use linear regression to train my model, but the function returns a tensor of NaNs! Here is the code.
This is how I read the dataset:
train_x = np.asanyarray(df[['Fat']]).astype(np.float32)
train_y = np.asanyarray(df[['Calories']]).astype(np.float32)
The weights initialization:
a = tf.Variable(20.0)
b = tf.Variable(10.0)
The linear regression function:
@tf.function
def h(x):
    y = a*x + b
    return y
The cost function:
@tf.function
def costFunc(y_predicted, train_y):
    return tf.reduce_mean(tf.square(y_predicted - train_y))
The model training:
learning_rate = 0.01
train_data = []
loss_values = []
a_values = []
b_values = []
# steps of looping through all your data to update the parameters
training_epochs = 200

# train the model
for epoch in range(training_epochs):
    with tf.GradientTape() as tape:
        y_predicted = h(train_x)
        loss_value = loss_object(train_y, y_predicted)
        loss_values.append(loss_value)

    # get gradients
    gradients = tape.gradient(loss_value, [b, a])

    # compute and adjust weights
    a_values.append(a.numpy())
    b_values.append(b.numpy())
    b.assign_sub(gradients[0] * learning_rate)
    a.assign_sub(gradients[1] * learning_rate)
    if epoch % 5 == 0:
        train_data.append([a.numpy(), b.numpy()])
But when I print (a*train_x), the result is a tensor of NaNs.
UPDATE
I found that the problem is in the dataset: when I change the dataset it gives a tensor of numbers, but I still don't know what the problem with the first dataset is.
I am sorry, the mistake was silly: I had to re-initialize the variables by re-running those cells each time I ran the script, because the variables already held values that led to infinite results.
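In case it helps anyone, here is a minimal sketch of how I avoid the stale values now (re-creating the variables before each run; the helper name is just illustrative):

import tensorflow as tf

def fresh_variables():
    # re-create a and b for every run instead of reusing variables
    # that may already hold diverged (inf/NaN) values
    return tf.Variable(20.0), tf.Variable(10.0)

a, b = fresh_variables()  # call this before every training run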
I estimate ratings in a user-item matrix by decomposing the matrix into two matrices, P and Q, using PyTorch matrix factorization. My loss function is L(X - PQ).
Let's say the rows of X correspond to users, and x is a new user's row, so that the new matrix X' is X concatenated with x.
Now I want to minimize L(X' - P'Q) = L(X - PQ) + L(x - x_p Q), since I have already trained P and Q.
That is, I want to train x_p, the new user's embedding row, while leaving Q fixed.
So my question is: is there a way in PyTorch to train a MatrixFactorization model for P with Q fixed?
Code I'm working with:
class MatrixFactorizationWithBiasXavier(nn.Module):
    def __init__(self, num_people, num_partners, bias=(-0.01, 0.01), emb_size=100):
        super(MatrixFactorizationWithBiasXavier, self).__init__()
        self.person_emb = nn.Embedding(num_people, emb_size)
        self.person_bias = nn.Embedding(num_people, 1)
        self.partner_emb = nn.Embedding(num_partners, emb_size)
        self.parnter_bias = nn.Embedding(num_partners, 1)
        torch.nn.init.xavier_uniform_(self.person_emb.weight)
        torch.nn.init.xavier_uniform_(self.partner_emb.weight)
        self.person_bias.weight.data.uniform_(bias[0], bias[1])
        self.parnter_bias.weight.data.uniform_(bias[0], bias[1])

    def forward(self, u, v):
        # look up the biases while u and v are still index tensors
        bias_u = self.person_bias(u).squeeze()
        bias_v = self.parnter_bias(v).squeeze()
        u = self.person_emb(u)
        v = self.partner_emb(v)
        # calculate dot product
        # u*v is an element-wise vector multiplication
        return torch.sigmoid((u * v).sum(1) + bias_u + bias_v)
def test(model, df_test, verbose=False):
    model.eval()
    # .to(dev) puts code on either gpu or cpu.
    people = torch.LongTensor(df_test.id.values).to(dev)
    partners = torch.LongTensor(df_test.pid.values).to(dev)
    decision = torch.FloatTensor(df_test.decision.values).to(dev)
    y_hat = model(people, partners)
    loss = F.mse_loss(y_hat, decision)
    if verbose:
        print('test loss %.3f ' % loss.item())
    return loss.item()
def train(model, df_train, epochs=100, learning_rate=0.01, weight_decay=1e-5, verbose=False):
    optimizer = torch.optim.Adam(model.parameters(), lr=learning_rate, weight_decay=weight_decay)
    model.train()
    for epoch in range(epochs):
        # From numpy to PyTorch tensors.
        # .to(dev) puts code on either gpu or cpu.
        people = torch.LongTensor(df_train.id.values).to(dev)
        partners = torch.LongTensor(df_train.pid.values).to(dev)
        decision = torch.FloatTensor(df_train.decision.values).to(dev)
        # calls the forward method of the model
        y_hat = model(people, partners)
        # using the mean squared error loss function
        loss = F.mse_loss(y_hat, decision)
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
        if verbose and epoch % 100 == 0:
            print(loss.item())
Found a solution.
It turns out I can register a hook on the embedding I want to keep fixed (partner_emb, i.e. my Q) so that its gradient is zeroed out:
mask = torch.zeros_like(mf_model.partner_emb.weight)
mf_model.partner_emb.weight.register_hook(lambda grad: grad*mask)
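An alternative sketch that should be equivalent (freezing Q outright instead of masking its gradient; the parameter names follow the model above):

import torch

# freeze Q (and its bias) so autograd never updates them
mf_model.partner_emb.weight.requires_grad_(False)
mf_model.parnter_bias.weight.requires_grad_(False)

# hand only the still-trainable parameters (the P side) to the optimizer
optimizer = torch.optim.Adam(
    (p for p in mf_model.parameters() if p.requires_grad), lr=0.01)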
I'm trying to use TensorFlow in Python to make predictions with cryptocurrency data. The problem is that the output of the prediction is a number like 0.1-0.9, whereas the cryptocurrency data is in a 10000-10100 range, and I can't find a way to convert the 0.* number back to the real one.
I've tried to build a ratio by taking max - min of the predicted values and max - min of the test data and dividing one by the other, but when I multiply the prediction by this ratio there is a large error (I get a number around 14000 instead of around 10000).
Here is some code:
train_start = 0
train_end = int(np.floor(0.7*n))
test_start = train_end
test_end = n
data_train = data[np.arange(train_start, train_end), :]
data_test = data[np.arange(test_start, test_end), :]
Scale data:
from sklearn.preprocessing import MinMaxScaler
scaler = MinMaxScaler()
data_train = scaler.fit_transform(data_train)
data_test = scaler.transform(data_test)
Build X and y:
X_train = data_train[:, 1:]
y_train = data_train[:, 0]
X_test = data_test[:, 1:]
y_test = data_test[:, 0]
.
.
.
n_data = 10
n_neurons_1 = 1024
n_neurons_2 = 512
n_neurons_3 = 256
n_neurons_4 = 128
n_target = 1
X = tf.compat.v1.placeholder(dtype=tf.compat.v1.float32, shape=[None, n_data])
Y = tf.compat.v1.placeholder(dtype=tf.compat.v1.float32, shape=[None])
Hidden layer
..
Output layer (must be transposed)
..
Cost function
..
Optimizer
..
Make Session:
sess = tf.compat.v1.Session()
Run initializer:
sess.run(tf.compat.v1.global_variables_initializer())
Setup interactive plot:
plt.ion()
fig = plt.figure()
ax1 = fig.add_subplot(111)
line1, = ax1.plot(y_test)
line2, = ax1.plot(y_test*0.5)
plt.show()
epochs = 10
batch_size = 256

for e in range(epochs):
    # Shuffle training data
    shuffle_indices = np.random.permutation(np.arange(len(y_train)))
    X_train = X_train[shuffle_indices]
    y_train = y_train[shuffle_indices]

    # Minibatch training
    for i in range(0, len(y_train) // batch_size):
        start = i * batch_size
        batch_x = X_train[start:start + batch_size]
        batch_y = y_train[start:start + batch_size]
        # Run optimizer with batch
        sess.run(opt, feed_dict={X: batch_x, Y: batch_y})

        # Show progress
        if np.mod(i, 5) == 0:
            # Prediction
            pred = sess.run(out, feed_dict={X: X_test})
            # This pred var is the output of the prediction
I persist my results to a file, and this is what they look like:
2019-08-21 06-AM;15310.444858356934;0.50021994;
2019-08-21 12-PM;14287.717187390663;0.46680558;
2019-08-21 06-PM;14104.63871795706;0.46082407;
For example, the last prediction is 0.46, but when I try to convert it I get 14104, whereas it should be closer to a 10000 value.
Does anyone have an idea how to convert those predictions?
Thanks!
You will have to use the inverse_transform method of MinMaxScaler to convert back the output you are getting in the 0-1 range.
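A minimal sketch of that conversion, assuming (as in your code above) that the scaler was fit on the full matrix with the target in column 0, and that pred and X_test are the arrays from your snippet:

import numpy as np

pred_scaled = np.asarray(pred).reshape(-1, 1)  # predictions in the 0-1 range

# rebuild a matrix with the same columns the scaler was fit on:
# target in column 0, the test features after it
full_scaled = np.concatenate([pred_scaled, X_test], axis=1)

# inverse_transform undoes the MinMax scaling; column 0 is the price in original units
pred_prices = scaler.inverse_transform(full_scaled)[:, 0]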
You have not shown your model, but I believe you are doing a regression task with a few dense layers. You will have to keep minimizing your loss. If you are using mean squared error, the larger the loss, the more likely your output is to be far away from the desired results.
Even if your loss is small and the results look good on the training samples, when the prediction is bad on the test dataset you may have to consider increasing your training dataset so that more possibilities are covered. If that is not possible, consider reducing the number of neurons in your neural network so that it stops over-fitting.
You can do some postprocessing to restrict the output to some desired range.
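For example, a simple clip (the bounds here are placeholders, not values derived from your data):

import numpy as np

# keep the converted predictions inside a plausible price band;
# pick the bounds from your own data
pred_prices = np.clip(pred_prices, 10000.0, 10100.0)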