Updating weights in basic neural network - python

Problem:
I'm new to neural networks and today I wanted to learn how to make my neural network learn.
I'm trying to do an exercise I found on the internet.
The sum of errors over all series should be 1.501535, but I am getting 7.394650000000001, so I suspected that my weights are not updating. That's exactly the issue, but I have no idea how to update the weights correctly.
Code:
import numpy as np

def calculate(input_numbers: list[float], weights: np.array, iterations: int, alpha: float, goal: list[float]):
    num_rows, num_cols = weights.shape
    if not len(input_numbers) == num_cols:
        print("Wrong matrix")
        return 0
    error = 0
    prediction = np.dot(input_numbers, np.transpose(weights))
    delta = prediction - goal
    weights_delta = np.outer(delta, input_numbers)
    weights = weights - (weights_delta * alpha)
    error = error + (pow(prediction - goal, 2))
    print("\nXXXXXXXXXXXXXXXX SUMMARY XXXXXXXXXXXXXXXXX")
    print("delta :" + str(delta))
    print("weights_delta :" + str(weights_delta))
    print("Weights : " + str(weights))
    print("error : " + str(error))
    return np.sum(error)
input_1 = [8.5, 0.65, 1.2]
goal_1 = [0.1, 1, 0.1]
input_2 = [9.5, 0.8, 1.3]
goal_2 = [0, 1, 0]
input_3 = [9.9, 0.8, 0.5]
goal_3 = [0, 0, 0.1]
input_4 = [9.0, 0.9, 1.0]
goal_4 = [0.1, 1, 0.2]
weights_matrix = np.array([[0.1, 0.1, -0.3], [0.1, 0.2, 0.0], [0.0, 1.3, 0.1]])
for x in range(50):
    print("\nITERATION: ", x)
    error_sum = calculate(input_1, weights_matrix, 1, 0.01, goal_1)
    error_sum = error_sum + calculate(input_2, weights_matrix, 1, 0.01, goal_2)
    error_sum = error_sum + calculate(input_3, weights_matrix, 1, 0.01, goal_3)
    error_sum = error_sum + calculate(input_4, weights_matrix, 1, 0.01, goal_4)
    print("TOTAL ERROR: ", error_sum)
Would someone be so kind as to guide me on where and how I should update the weights? I have tried returning the weights from calculate(), but the results were totally wrong, so I guess it should be done in a different way.

The culprit is assignment semantics. In the first form, weights = weights - (weights_delta * alpha) creates a brand-new array and rebinds the local name weights to it, so the weights_matrix you pass in is never modified. The augmented assignment weights -= ... operates in place on NumPy arrays, so it updates the very array you've passed.
So change
weights = weights - (weights_delta * alpha)
to
weights -= (weights_delta * alpha)
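If you would rather not rely on in-place mutation, a minimal sketch (my own illustration, not part of the answer above) is to return the updated weights alongside the error and reassign them in the caller:

import numpy as np

# Sketch: same update rule as in the question, but the new weights are
# returned explicitly so the caller can keep them between iterations.
def calculate_returning(input_numbers, weights, alpha, goal):
    prediction = np.dot(input_numbers, np.transpose(weights))
    delta = prediction - goal
    weights_delta = np.outer(delta, input_numbers)
    new_weights = weights - weights_delta * alpha
    error = np.sum((prediction - np.asarray(goal)) ** 2)
    return new_weights, error

# Usage: the caller reassigns the returned matrix so the update sticks.
# weights_matrix, err = calculate_returning(input_1, weights_matrix, 0.01, goal_1)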

Related

Python: Optimize weights in portfolio

I have the following dataframe with weights:
import pandas as pd

df = pd.DataFrame({'a': [0.1, 0.5, 0.1, 0.3], 'b': [0.2, 0.4, 0.2, 0.2], 'c': [0.3, 0.2, 0.4, 0.1],
                   'd': [0.1, 0.1, 0.1, 0.7], 'e': [0.2, 0.1, 0.3, 0.4], 'f': [0.7, 0.1, 0.1, 0.1]})
and then I normalize each row using:
df = df.div(df.sum(axis=1), axis=0)
I want to optimize the normalized weights of each row such that no weight is less than 0 or greater than 0.4.
If the weight is greater than 0.4, it will be clipped to 0.4 and the additional weight will be distributed to the other entries in a pro-rata fashion (meaning the second largest weight will receive more weight so it gets close to 0.4, and if there is any remaining weight, it will be distributed to the third and so on).
Can this be done using the "optimize" function?
Thank you.
UPDATE: I would also like to set a minimum bound for the weights. In my original question, the minimum weight bound was implicitly zero; however, I would like to set a constraint such that the minimum weight is at least equal to 0.05, for example.
Unfortunately, I can only find a loop-based solution to this problem. When you trim off the excess weight and redistribute it proportionally, entries that were within the limit before may now go over it, so they have to be trimmed in turn, and the cycle keeps repeating until no value is overweight. The same goes for underweight entries.
import numpy as np
import pandas as pd

# The original data frame. No normalization yet
df = pd.DataFrame(
    {
        "a": [0.1, 0.5, 0.1, 0.3],
        "b": [0.2, 0.4, 0.2, 0.2],
        "c": [0.3, 0.2, 0.4, 0.1],
        "d": [0.1, 0.1, 0.1, 0.7],
        "e": [0.2, 0.1, 0.3, 0.4],
        "f": [0.7, 0.1, 0.1, 0.1],
    }
)

def ensure_min_weight(row: np.ndarray, min_weight: float):
    # Raise underweight entries to the minimum, taking the missing weight
    # proportionally from the entries above the minimum.
    while True:
        underweight = row < min_weight
        if not underweight.any():
            break
        missing_weight = min_weight * underweight.sum() - row[underweight].sum()
        row[~underweight] -= missing_weight / row[~underweight].sum() * row[~underweight]
        row[underweight] = min_weight

def ensure_max_weight(row: np.ndarray, max_weight: float):
    # Clip overweight entries to the maximum, handing the excess weight
    # proportionally to the entries below the maximum.
    while True:
        overweight = row > max_weight
        if not overweight.any():
            break
        excess_weight = row[overweight].sum() - (max_weight * overweight.sum())
        row[~overweight] += excess_weight / row[~overweight].sum() * row[~overweight]
        row[overweight] = max_weight

values = df.to_numpy()
normalized = values / values.sum(axis=1)[:, None]

min_weight = 0.15  # just for fun
max_weight = 0.4

for i in range(len(values)):
    row = normalized[i]
    ensure_min_weight(row, min_weight)
    ensure_max_weight(row, max_weight)

# Normalized weights
assert np.isclose(normalized.sum(axis=1), 1).all(), "Normalized weight must sum up to 1"
assert ((min_weight <= normalized) & (normalized <= max_weight)).all(), f"Normalized weight must be between {min_weight} and {max_weight}"
print(pd.DataFrame(normalized, columns=df.columns))

# Raw values
# values = normalized * values.sum(axis=1)[:, None]
# print(pd.DataFrame(values, columns=df.columns))
Note that this algorithm will run into an infinite loop if your min_weight and max_weight are infeasible: try min_weight = 0.4 and max_weight = 0.5 (with six columns the minimums alone already sum to more than 1). You should handle these errors in the two ensure functions; a possible guard is sketched below.
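A minimal feasibility check (my own sketch, not part of the original answer) could run once before the row loop: a row of n weights summing to 1 can only satisfy the bounds if n * min_weight <= 1 <= n * max_weight.

def check_bounds(n_entries: int, min_weight: float, max_weight: float) -> None:
    # Reject bound combinations that no row summing to 1 can satisfy,
    # which would otherwise make the ensure_* loops spin forever.
    if min_weight > max_weight:
        raise ValueError("min_weight must not exceed max_weight")
    if n_entries * min_weight > 1 or n_entries * max_weight < 1:
        raise ValueError(
            f"no row of {n_entries} weights summing to 1 fits "
            f"[{min_weight}, {max_weight}]"
        )

# Example: check_bounds(df.shape[1], min_weight, max_weight)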

How to get the probability of given predictions for Sequential model Sklearn

I have a Sequential convolutional neural network model which is being trained on sklearn's handwritten digit dataset. I'm now evaluating the model and attempting to create a ROC curve from scratch. This requires me to get the TPR and FPR for my test data so I can plot them for a given threshold (from what I understand). I'm trying to get the prediction probability for each prediction but I cannot seem to figure out how; I have tried predict_proba(), but I think I have the wrong kind of model. Is there a way to do this? If not, how can I go about plotting my ROC curve? Do I have to make a confusion matrix for each prediction? That seems long-winded.
I found a ROC curve function in another post, which I'll display below; I'm trying to obtain 'score'.
import numpy as np
import matplotlib.pyplot as plt

score = np.array([0.9, 0.8, 0.7, 0.6, 0.55, 0.54, 0.53, 0.52, 0.51, 0.505, 0.4, 0.39, 0.38, 0.37, 0.36, 0.35, 0.34, 0.33, 0.30, 0.1])
y = np.array([1, 1, 0, 1, 1, 1, 0, 0, 1, 0, 1, 0, 1, 0, 0, 0, 1, 0, 1, 0])

# false positive rate
fpr = []
# true positive rate
tpr = []
# Iterate thresholds from 0.0, 0.01, ... 1.0
thresholds = np.arange(0.0, 1.01, .01)

# get number of positive and negative examples in the dataset
P = sum(y)
N = len(y) - P

# iterate through all thresholds and determine fraction of true positives
# and false positives found at this threshold
for thresh in thresholds:
    FP = 0
    TP = 0
    for i in range(len(score)):
        if score[i] > thresh:
            if y[i] == 1:
                TP = TP + 1
            if y[i] == 0:
                FP = FP + 1
    fpr.append(FP / float(N))
    tpr.append(TP / float(P))

plt.scatter(fpr, tpr)
plt.show()
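No answer to this question is reproduced here, but as a hedged sketch (assuming a Keras Sequential model rather than a scikit-learn estimator, which would explain the missing predict_proba()): with a softmax output layer, model.predict() already returns per-class probabilities, and one column of it can serve as 'score' above in a one-vs-rest ROC.

import numpy as np
from sklearn.datasets import load_digits
from sklearn.model_selection import train_test_split
from tensorflow import keras

# Small stand-in model on sklearn's digits (the model in the question is
# convolutional; any Sequential model with a softmax output behaves the same).
digits = load_digits()
x_train, x_test, y_train, y_test = train_test_split(
    digits.data / 16.0, digits.target, random_state=0)

model = keras.Sequential([
    keras.layers.Dense(32, activation="relu", input_shape=(64,)),
    keras.layers.Dense(10, activation="softmax"),
])
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy")
model.fit(x_train, y_train, epochs=5, verbose=0)

probs = model.predict(x_test)       # shape (n_samples, 10), rows sum to 1
score = probs[:, 3]                 # probability of digit "3" (one-vs-rest)
y = (y_test == 3).astype(int)       # binary labels to feed the ROC code above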

Why does my neural network have extremely low weights after a few epochs?

I just started to learn about neural networks and this is my first one. The problem is that the more data I have, the lower the weights become after 2-3 epochs, which is unusual and causes my NN to learn nothing.
To reproduce
In the DataSet class, look at the function CreateData and change nbofexample to something like 20: if you print the weights, you'll see they are in a normal range (evenly spread between -1 and 1). But if you set nbofexample to something like 200, then after only 2 or 3 epochs most of the weights of the last layer will be extremely close to 0, and they will stay in that zone for the rest of the training. Obviously, this causes the NN to fail.
By the way, my NN basically analyzes arrays of numbers between 0 and 9, divided by 10 as a normalization, to check whether the array is sorted. I put a lot of comments in the code below so it can be understood easily.
There is probably an easy fix but I just don't get it :(
Here is the complete code if you want to try it (it's in Python, by the way):
import numpy as np
import time
import random

#This class is only used for creating the data if needed
class DataSet():
    #check if sorted
    def checkPossibility(A):
        return sorted(A) == A

    #will be used later for more complex problems (taken from the fastest answer of a coding challenge on LeetCode)
    #def checkPossibility(A):
    #    p = None
    #    for i in range(len(A) - 1):
    #        if A[i] > A[i+1]:
    #            if p is not None:
    #                return False
    #            p = i
    #    return (p is None or p == 0 or p == len(A)-2 or
    #            A[p-1] <= A[p+1] or A[p] <= A[p+2])

    #returns inputs and outputs using my poorly written algorithm
    def CreateData():
        #settings
        nbofchar = 4
        nbofexample = 200

        #initialize arrays
        inputs = [0]*nbofchar
        output = [1]

        #handling dumbness
        if nbofexample > pow(10, nbofchar):
            print("Too much data... resizing to max data")
            nbofexample = pow(10, nbofchar)
        elif nbofexample == 0:
            print("You need examples to train! (Error nbofexample==0)")

        #if more than half of the maximum possible examples are requested, create all possible examples and delete randomly until it's the requested size
        if nbofexample > pow(10, nbofchar)/2:
            #creating all possible examples
            for i in range(1, pow(10, nbofchar)):
                new_ex = [int(a) for a in str(i)]
                while len(new_ex) < nbofchar:
                    new_ex = [0] + new_ex
                inputs = np.vstack((inputs, np.dot(new_ex, 1/10)))  #normalization /10 so the value is between 0 and 1 ¯\_(ツ)_/¯
                output = np.vstack((output, [int(DataSet.checkPossibility(new_ex))]))
            #deleting
            while len(inputs) > nbofexample:
                index = random.randint(0, len(inputs)-1)
                inputs = np.delete(inputs, index)
                output = np.delete(output, index)
            return inputs, output
        #if less than half (or exactly half) is requested, create examples randomly until it's the requested size
        else:
            i = 1
            while i < nbofexample:
                new_ex = [random.randint(0, 9) for a in range(nbofchar)]
                if sum(np.any(inputs) == new_ex) == 0:
                    i += 1
                    inputs = np.vstack((inputs, np.dot(new_ex, 1/10)))  #normalization /10 so the value is between 0 and 1 ¯\_(ツ)_/¯
                    output = np.vstack((output, [int(DataSet.checkPossibility(new_ex))]))
            return inputs, output
#assigning weights to each layer
class NeuLayer():
    def __init__(self, nbofneuron, inputsperneuron):
        self.weight = 2 * np.random.random((inputsperneuron, nbofneuron)) - 1

#the actual neural network
class NeuNet():
    def __init__(self, layers):
        self.layers = layers

    def _sigmoid(self, x):
        k = 1
        return 1 / (1 + np.exp(-x/k))

    def _sigmoid_derivative(self, x):
        return x * (1-x)

    def train(self, training_set_inputs, training_set_outputs, nboftime):
        #debug
        timer1 = 0
        if len(self.layers) < 2: return
        for iteration in range(nboftime):
            delta = [0] * len(self.layers)
            error = [0] * len(self.layers)
            outputlayers = self.think(training_set_inputs)
            #find deltas for each layer "i" (to be able to properly change weights)
            for i in range(len(self.layers)-1, -1, -1):
                if i == len(self.layers)-1:
                    error[i] = training_set_outputs - outputlayers[i]
                else:
                    error[i] = np.dot(delta[i+1], self.layers[i+1].weight.T)
                delta[i] = error[i] * self._sigmoid_derivative(outputlayers[i])
            #assign weights for each layer "i"
            for i in range(len(self.layers)):
                if i == 0:
                    self.layers[0].weight += np.dot(training_set_inputs.T, delta[0])
                else:
                    self.layers[i].weight += np.dot(outputlayers[i-1].T, delta[i])
            #display progression and the test result
            if Display_progression:
                if timer1 < time.time():
                    timer1 = time.time() + delay
                    value = ((iteration+1)/nboftime)*100
                    test_input = np.array([.1, .2, .1, .1])
                    print('%.2f' % value + "% test_input = " + str(test_input) + " test_output = " + str(self.think(test_input)[-1]))

    #return output of each layer from an input
    def think(self, input):
        outforlayers = [None]*len(self.layers)
        outforlayer = input
        for i in range(len(self.layers)):
            outforlayer = self._sigmoid(np.dot(outforlayer, self.layers[i].weight))
            outforlayers[i] = outforlayer
        return outforlayers
#datamaker
creating_data = True
train = True

if creating_data:
    #creates files with inputs and their expected output
    print("Start creating data...")
    input, output = DataSet.CreateData()
    print("Data created!")
    file = open("data_input", "wb")
    np.save(file, input)
    file.close()
    file = open("data_output", "wb")
    np.save(file, output)
    file.close()

if train:
    default_data_set = False
    if default_data_set:
        #default training set
        inp_training = np.array([[0, 0, 0, 0, 0], [0.1, 0, 0, 0, 0], [0, 0.1, 0, 0, 0], [0.1, 0.1, 0, 0, 0], [0, 0, 0.1, 0, 0], [0.1, 0, 0.1, 0, 0], [0, 0.1, 0.1, 0, 0], [0.1, 0.1, 0.1, 0, 0],
                                 [0, 0, 0, 0.1, 0], [0.1, 0, 0, 0.1, 0], [0, 0.1, 0, 0.1, 0], [0.1, 0.1, 0, 0.1, 0], [0, 0, 0.1, 0.1, 0], [0.1, 0, 0.1, 0.1, 0], [0, 0.1, 0.1, 0.1, 0], [0.1, 0.1, 0.1, 0.1, 0],
                                 [0, 0, 0, 0, 0.1], [0.1, 0, 0, 0, 0.1], [0, 0.1, 0, 0, 0.1], [0.1, 0.1, 0, 0, 0.1], [0, 0, 0.1, 0, 0.1], [0.1, 0, 0.1, 0, 0.1], [0, 0.1, 0.1, 0, 0.1], [0.1, 0.1, 0.1, 0, 0.1],
                                 [0, 0, 0, 0.1, 0.1], [0.1, 0, 0, 0.1, 0.1], [0, 0.1, 0, 0.1, 0.1], [0.1, 0.1, 0, 0.1, 0.1], [0, 0, 0.1, 0.1, 0.1], [0.1, 0, 0.1, 0.1, 0.1], [0, 0.1, 0.1, 0.1, 0.1], [0.1, 0.1, 0.1, 0.1, 0.1]])
        out_training = np.array([[0,0,0,0,0,0,0,1,
                                  0,0,0,1,0,1,1,1,
                                  0,0,0,1,0,1,1,1,
                                  0,1,1,1,1,1,1,1]]).T
    else:
        print("Loading data files...")
        file = open("data_input", "rb")
        inp_training = np.load(file)
        file.close()
        file = open("data_output", "rb")
        out_training = np.load(file)
        file.close()
        print("Done reading from data files!")

    #debug
    Display_progression = True
    delay = 1  #seconds

    #initialize
    np.random.seed(5)
    netlayer_input = NeuLayer(10, len(inp_training[0]))
    netlayer2 = NeuLayer(10, 10)
    netlayer3 = NeuLayer(10, 10)
    netlayer4 = NeuLayer(10, 10)
    netlayer_out = NeuLayer(len(out_training[0]), 10)
    All_layers = [netlayer_input, netlayer2, netlayer3, netlayer4, netlayer_out]
    brain = NeuNet(All_layers)

    #train
    print("Start training...")
    brain.train(inp_training, out_training, 100000)
    print("Done!")

    #final test
    outputfinal = brain.think(np.array([0, .1, .3, .7]))

    #output
    a = outputfinal[-1]  #[-1] so we get the last layer's output(s)
    print(a)
Note
This is my first time asking a question on Stack Overflow, so tell me if I'm missing crucial information for this question.
Neural Networks can suffer from something known as the Vanishing Gradient Problem, caused by the more classical activations like Sigmoid or Tanh.
In layman's terms: activations like Sigmoid and Tanh really squeeze their inputs. For example, sigmoid(10) and sigmoid(100) are roughly 0.9999 and 1 respectively. Even though the inputs changed a lot, the outputs barely changed - the function is effectively constant in that region. And where a function is almost constant, its derivative tends to zero (or a very small value). During backpropagation these very small derivatives/gradients multiply with each other and become effectively zero, preventing your model from learning anything at all - your weights get stuck and stop updating.
I suggest you do some further reading on this topic in your own time. Among several solutions, one way to address it is to use a different activation, like ReLU - a sketch is shown below.
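As a minimal sketch (my own illustration, not part of the answer above): a ReLU pair that could replace _sigmoid/_sigmoid_derivative on the hidden layers of the NeuNet class, while keeping sigmoid on the output layer so predictions stay in (0, 1).

import numpy as np

def relu(x):
    #does not saturate for positive inputs, so the gradient does not vanish there
    return np.maximum(0.0, x)

def relu_derivative(activation):
    #mirrors _sigmoid_derivative, which receives the layer's activation:
    #the slope of ReLU is 1 where the activation is positive and 0 elsewhere
    return (activation > 0).astype(float)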

Python - matrix multiplication code problem

I have this exercise where I get to build a simple neural network with one input layer and one hidden layer... I made the code below to perform a simple matrix multiplication, but it's not producing the same result as when I do the multiplication by hand. What am I doing wrong in my code?
          #toes %win #fans
ih_wgt = ([0.1, 0.2, -0.1],  #hid[0]
          [-0.1, 0.1, 0.9],  #hid[1]
          [0.1, 0.4, 0.1])   #hid[2]

          #hid[0] #hid[1] #hid[2]
ho_wgt = ([0.3, 1.1, -0.3],  #hurt?
          [0.1, 0.2, 0.0],   #win?
          [0.0, 1.3, 0.1])   #sad?

weights = [ih_wgt, ho_wgt]

def w_sum(a, b):
    assert(len(a) == len(b))
    output = 0
    for i in range(len(a)):
        output += (a[i] * b[i])
    return output

def vect_mat_mul(vec, mat):
    assert(len(vec) == len(mat))
    output = [0, 0, 0]
    for i in range(len(vec)):
        output[i] = w_sum(vec, mat[i])
        return output

def neural_network(input, weights):
    hid = vect_mat_mul(input, weights[0])
    pred = vect_mat_mul(hid, weights[1])
    return pred

toes = [8.5, 9.5, 9.9, 9.0]
wlrec = [0.65, 0.8, 0.8, 0.9]
nfans = [1.2, 1.3, 0.5, 1.0]

input = [toes[0], wlrec[0], nfans[0]]
pred = neural_network(input, weights)
print(pred)
the output of my code is:
[0.258, 0, 0]
The way I attempted to solve it by hand is as follows:
I multiplied the input vector [8.5, 0.65, 1.2] with the input weight matrix
ih_wgt = ([0.1, 0.2, -0.1],  #hid[0]
          [-0.1, 0.1, 0.9],  #hid[1]
          [0.1, 0.4, 0.1])   #hid[2]
which gives the hidden-layer vector [0.86, 0.295, 1.23].
This hidden vector is then fed forward as the input to the next layer and multiplied by the hidden-to-output weight matrix
ho_wgt = ([0.3, 1.1, -0.3],  #hurt?
          [0.1, 0.2, 0.0],   #win?
          [0.0, 1.3, 0.1])   #sad?
which gives the correct output prediction:
[0.2135, 0.145, 0.5065]
Your help would be much appreciated!
You're almost there! Only a simple indentation thing is the reason:
def vect_mat_mul(vec, mat):
    assert(len(vec) == len(mat))
    output = [0, 0, 0]
    for i in range(len(vec)):
        output[i] = w_sum(vec, mat[i])
    return output  # <-- This one was inside the for loop
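With the return moved out of the loop (and assuming the rest of the definitions from the question above), rerunning the forward pass reproduces the hand calculation, up to floating-point rounding:

hid = vect_mat_mul([8.5, 0.65, 1.2], ih_wgt)
print(hid)                        # ~[0.86, 0.295, 1.23]
print(vect_mat_mul(hid, ho_wgt))  # ~[0.2135, 0.145, 0.5065]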

Weighted masking in TensorFlow

I have the following task: given two vectors
[v_1, ..., v_n] and [w_1, ..., w_n], build the new vector [v_1] * w_1 + ... + [v_n] * w_n, i.e. each v_i repeated w_i times.
For example, for v = [0.5, 0.1, 0.7] and w = [2, 3, 0] the result will be
[0.5, 0.5, 0.1, 0.1, 0.1].
In case of using vanilla python, the solution would be
v, w = [...], [...]
res = []
for i in range(len(v)):
    res += [v[i]] * w[i]
Is it possible to express this as a TensorFlow operation? It seems to be an extension of tf.boolean_mask with an additional argument like weights or repeats.
Here is a simple solution using tf.sequence_mask:
import tensorflow as tf
v = tf.constant([0.5, 0.1, 0.7])
w = tf.constant([2, 3, 0])
m = tf.sequence_mask(w)
v2 = tf.tile(v[:, None], [1, tf.shape(m)[1]])
res = tf.boolean_mask(v2, m)
sess = tf.InteractiveSession()
print(res.eval())
# array([0.5, 0.5, 0.1, 0.1, 0.1], dtype=float32)
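In more recent TensorFlow versions (2.x - this note is mine, not part of the original answer), tf.repeat expresses the same operation directly and runs eagerly without a session:

import tensorflow as tf

v = tf.constant([0.5, 0.1, 0.7])
w = tf.constant([2, 3, 0])

# repeat each element of v according to the matching entry of w
res = tf.repeat(v, w)
print(res.numpy())  # [0.5 0.5 0.1 0.1 0.1]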
