I am new to ML and tried to build a linear regression model by myself. The objective is to predict Fahrenheit values from Celsius values.
This is my code:
celsius_q = np.array([-40, -10, 0, 8, 15, 22, 38], dtype=float)
fahrenheit_a = np.array([-40, 14, 32, 46, 59, 72, 100], dtype=float)
inputs = celsius_q
output_expected = fahrenheit_a
# model: y = b * x + m  (here b is the slope and m the intercept)
m = 100
b = 0
m_gradient = 0
b_gradient = 0
learning_rate = 0.00001
# Training loop
for i in range(10000):
    for i in range(len(inputs)):
        m_gradient += (m + (b * inputs[i] - output_expected[i]))
        b_gradient += inputs[i] * (m + (b * inputs[i]) - output_expected[i])
    m_new = m - learning_rate * (2 / len(inputs)) * m_gradient
    b_new = b - learning_rate * (2 / len(inputs)) * b_gradient
The code generates wrong weights for m and b, no matter how much I change the learning_rate and the number of epochs. The weights that minimize the loss function have to be:
b = 1.8
m = 32
What am I doing wrong?
The update of m and b needs to happen at every step, but that alone is not enough. You also need a considerably larger learning rate; 0.0002 (twenty times yours) works:
import numpy as np
celsius_q = np.array([-40, -10, 0, 8, 15, 22, 38], dtype=float)
fahrenheit_a = np.array([-40, 14, 32, 46, 59, 72, 100], dtype=float)
inputs = celsius_q
output_expected = fahrenheit_a
# model: y = b * x + m  (here b is the slope and m the intercept)
m_new = m = 100.0
b_new = b = 0.0
m_gradient = 0.0
b_gradient = 0.0
learning_rate = 0.0002
# Training loop
for i in range(10000):
    m_gradient, b_gradient = 0, 0
    for i in range(len(inputs)):
        m_gradient += (m_new + (b_new * inputs[i] - output_expected[i]))
        b_gradient += inputs[i] * (m_new + (b_new * inputs[i]) - output_expected[i])
    m_new -= learning_rate * m_gradient
    b_new -= learning_rate * b_gradient
print(m_new, b_new)
Getting:
31.952623523538897 1.7979482813813066
which is close to the expected 32 and 1.8.
You should update your parameters continually, at every step. Something like:
for i in range(10000):
    m_gradient, b_gradient = 0, 0
    for i in range(len(inputs)):
        m_gradient += (m + (b * inputs[i] - output_expected[i]))
        b_gradient += inputs[i] * (m + (b * inputs[i]) - output_expected[i])
    m -= learning_rate * m_gradient
    b -= learning_rate * b_gradient
(But I didn't check your math.)
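For reference, the gradients both loops accumulate come from the mean-squared-error loss of the model above (with m as the intercept and b as the slope):

$$L(m,b)=\frac{1}{n}\sum_{i=1}^{n}\left(m + b\,x_i - y_i\right)^2,\qquad
\frac{\partial L}{\partial m}=\frac{2}{n}\sum_{i=1}^{n}\left(m + b\,x_i - y_i\right),\qquad
\frac{\partial L}{\partial b}=\frac{2}{n}\sum_{i=1}^{n}x_i\left(m + b\,x_i - y_i\right).$$

Both corrected versions drop the 2/n factor from the update and let the learning rate absorb it.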
I want to implement the gradient descent algorithm on this simple data, but I am facing problems. It would be great if someone could point me in the right direction. The answer should be 7 for x = 6, but I'm not getting there.
X = [1, 2, 3, 4]
Y = [2, 3, 4, 5]
m_gradient = 0
b_gradient = 0
m, b = 0, 0
learning_rate = 0.1
N = len(Y)
for p in range(100):
    for idx in range(len(Y)):
        x = X[idx]
        y = Y[idx]
        hyp = (m * x) + b
        m_gradient += -(2 / N) * x * (y - hyp)
        b_gradient += -(2 / N) * (y - hyp)
    m = m - (m_gradient * learning_rate)
    b = b - (b_gradient * learning_rate)
print(b + m * 6)
You are calculating the gradients incorrectly for all but the first iteration. You need to set both gradients to 0 in the outer for loop.
X = [1, 2, 3, 4]
Y = [2, 3, 4, 5]
m_gradient = 0
b_gradient = 0
m, b = 0, 0
learning_rate = 0.1
N = len(Y)
for p in range(100):
    for idx in range(len(Y)):
        x = X[idx]
        y = Y[idx]
        hyp = (m * x) + b
        m_gradient += -(2 / N) * x * (y - hyp)
        b_gradient += -(2 / N) * (y - hyp)
    m = m - (m_gradient * learning_rate)
    b = b - (b_gradient * learning_rate)
    m_gradient, b_gradient = 0, 0
print(b + m * 6)
For example, consider b_gradient. Before the first iteration b_gradient = 0, and it is calculated as 0 - 0.5*(y0 - (m*x0 + b)) - 0.5*(y1 - (m*x1 + b)) - 0.5*(y2 - (m*x2 + b)) - 0.5*(y3 - (m*x3 + b)), where x0 and y0 are X[0] and Y[0], respectively.
After the first iteration the value of b_gradient is -7, which is correct.
The problem starts with the second iteration. Instead of calculating b_gradient as the sum of -0.5*(yn - (m*xn + b)) for 0 <= n <= 3, you calculate it as the previous value of b_gradient plus that sum.
After the second iteration the value of b_gradient is -2.6, which is incorrect; the correct value is 4.4 (note that 4.4 - 7 = -2.6).
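Those two numbers can be verified with a quick standalone check (a sketch using the same four points; m = 2.0 and b = 0.7 are the values after the first update with learning_rate = 0.1):

X, Y, N = [1, 2, 3, 4], [2, 3, 4, 5], 4
b_gradient = sum(-(2 / N) * (y - (0 * x + 0)) for x, y in zip(X, Y))
print(b_gradient)  # -7.0: the correct first-iteration gradient
m, b = 2.0, 0.7    # parameters after the first update
b_gradient = sum(-(2 / N) * (y - (m * x + b)) for x, y in zip(X, Y))
print(b_gradient)  # 4.4: the correct second-iteration gradient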
It seems you want the coefficients for linear regression using gradient descent. More data points, a slightly smaller learning rate, and training for more epochs while watching the loss will all help reduce the error.
As the input size gets larger, the code below will give slightly off results; the methods mentioned above, such as training for more epochs, will give correct results for a larger range of numbers.
Vectorized Version
import numpy as np
X = np.array([1, 2, 3, 4, 5, 6, 7])
Y = np.array([2, 3, 4, 5, 6, 7, 8])
w_gradient = 0
b_gradient = 0
w, b = 0.5, 0.5
learning_rate = .01
loss = 0
EPOCHS = 2000
N = len(Y)
for i in range(EPOCHS):
    # Predict
    Y_pred = (w * X) + b
    # Loss
    loss = np.square(Y_pred - Y).sum() / (2.0 * N)
    if i % 100 == 0:
        print(loss)
    # Backprop
    grad_y_pred = (2 / N) * (Y_pred - Y)
    w_gradient = (grad_y_pred * X).sum()
    b_gradient = (grad_y_pred).sum()
    # Optimize
    w -= (w_gradient * learning_rate)
    b -= (b_gradient * learning_rate)
print("\n\n")
print("LEARNED:")
print(w, b)
print("\n")
print("TEST:")
print(np.round(b + w * (-2)))
print(np.round(b + w * 0))
print(np.round(b + w * 1))
print(np.round(b + w * 6))
print(np.round(b + w * 3000))
# Expected: 30001, but gives 30002.
# Training for 3000 epochs will give the expected result.
# For a simple demo with little training data and a small input range, 2000 is enough.
print(np.round(b + w * 30000))
Output
LEARNED:
1.0000349103409163 0.9998271260509328
TEST:
-1.0
1.0
2.0
7.0
3001.0
30002.0
Loop Version
import numpy as np
X = np.array([1, 2, 3, 4, 5, 6, 7])
Y = np.array([2, 3, 4, 5, 6, 7, 8])
w_gradient = 0
b_gradient = 0
w, b = 0.5, 0.5
learning_rate = .01
loss = 0
EPOCHS = 2000
N = len(Y)
for i in range(EPOCHS):
    w_gradient = 0
    b_gradient = 0
    loss = 0
    for j in range(N):
        # Predict
        Y_pred = (w * X[j]) + b
        # Loss
        loss += np.square(Y_pred - Y[j]) / (2.0 * N)
        # Backprop
        grad_y_pred = (2 / N) * (Y_pred - Y[j])
        w_gradient += (grad_y_pred * X[j])
        b_gradient += (grad_y_pred)
    # Optimize
    w -= (w_gradient * learning_rate)
    b -= (b_gradient * learning_rate)
    # Print loss
    if i % 100 == 0:
        print(loss)
print("\n\n")
print("LEARNED:")
print(w, b)
print("\n")
print("TEST:")
print(np.round(b + w * (-2)))
print(np.round(b + w * 0))
print(np.round(b + w * 1))
print(np.round(b + w * 6))
print(np.round(b + w * 3000))
# Expected: 30001, but gives 30002.
# Training for 3000 epochs will give the expected result.
# For a simple demo with little training data and a small input range, 2000 is enough.
print(np.round(b + w * 30000))
Output
LEARNED:
1.0000349103409163 0.9998271260509328
TEST:
-1.0
1.0
2.0
7.0
3001.0
30002.0
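As a sanity check, a straight-line least-squares fit also has a closed form, so np.polyfit recovers the exact line y = x + 1 for this data:

import numpy as np
X = np.array([1, 2, 3, 4, 5, 6, 7], dtype=float)
Y = np.array([2, 3, 4, 5, 6, 7, 8], dtype=float)
w, b = np.polyfit(X, Y, 1)  # degree-1 least-squares fit: slope, intercept
print(w, b)                 # ~1.0 ~1.0, so b + w * 30000 rounds to 30001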
I have written this code, but it is not converging and runs more iterations than I expect: it should take 17 iterations, yet it takes 24. I am not able to figure out the reason!
import numpy as np

A = [[10, -1, 2, 0],
     [-1, 11, -1, 3],
     [2, -1, 10, -1],
     [0, 3, -1, 8]]
b = [6, 25, -11, 15]

def GaussSiedelAccelerated(A, b, e, x, w):
    e = float(e)
    iterations = 0
    Epsilon = float()
    n = len(A)
    condition = True
    while condition:
        for i in range(n):
            s1 = 0
            s2 = 0
            tempx = x.copy()  # Record the answer of the previous iteration
            for j in range(1, i, 1):
                s1 = s1 + x[j] * A[i][j]
            for k in range(i + 1, n, 1):
                s2 = s2 + tempx[k] * A[i][k]
            x[i] = x[i] * (1 - w) + w * (b[i] - s1 - s2) / A[i][i]
        iterations = iterations + 1
        Epsilon = max(abs(x - tempx)) / max(abs(x))
        print("Output vector for the run no.", iterations, "is:", x)
        print("Error for the run no.", iterations, "is: \t", Epsilon)
        condition = Epsilon > e
    return x, Epsilon, iterations

x0 = np.zeros(len(A))
x, Epsilon, iterations = GaussSiedelAccelerated(A, b, 0.0001, x0, 1.1)
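For comparison, a textbook SOR (accelerated Gauss-Seidel) sweep looks like the sketch below; it is a reconstruction, not your exact code. Note that the lower-triangular sum runs over j = 0 .. i-1, whereas range(1, i, 1) above skips j = 0, and the previous iterate is usually snapshotted once per sweep rather than inside the i loop:

import numpy as np

def sor(A, b, tol, x, w):
    A, b, x = np.asarray(A, float), np.asarray(b, float), np.asarray(x, float)
    n = len(b)
    iterations = 0
    while True:
        x_old = x.copy()  # snapshot once per sweep
        for i in range(n):
            s1 = sum(A[i][j] * x[j] for j in range(i))           # j = 0 .. i-1
            s2 = sum(A[i][k] * x_old[k] for k in range(i + 1, n))
            x[i] = (1 - w) * x[i] + w * (b[i] - s1 - s2) / A[i][i]
        iterations += 1
        eps = np.max(np.abs(x - x_old)) / np.max(np.abs(x))
        if eps <= tol:
            return x, eps, iterations

x, eps, iterations = sor(A, b, 0.0001, np.zeros(4), 1.1)
print(x, iterations)  # x converges to [1, 2, -1, 1] for this system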
I am trying to implement cubic spline interpolation in 3 dimensions, but I am unsure how to modify my current code to incorporate the z-axis. The purpose of this code is to calculate a trajectory between a starting point and an end point that passes through several intermediate points. Any assistance would be greatly appreciated!
import sys
import numpy as np
import matplotlib.pyplot as plt
X = np.array([1, 5, 8, 12, 16, 20, 25, 30, 38], dtype=float)
Y = np.array([20, 14, 10, 7, 3, 8, 17, 5, 3], dtype=float)
N = len(X)  # number of data points
num_points = 1000
H_x = np.diff(X)
H_y = np.diff(Y)
H_n = N - 1
Alfa = 1 / H_x[1 : H_n - 1]
Gamma = 1 / H_x[1 : H_n - 1]
Beta = 2 * (1 / H_x[:H_n - 1] + 1 / H_x[1:])
dF = H_y / H_x
Delta = 3 * (dF[1:] / H_x[1:] + dF[:H_n-1] / H_x[:H_n-1])
TDM = np.diag(Alfa, k=-1) + np.diag(Beta, 0) + np.diag(Gamma, +1)
B = np.linalg.solve(TDM, Delta)
B = np.hstack([0, B, 0])
C = (3*dF - B[1:] - 2 * B[:H_n]) / H_x
D = (B[:H_n] + B[1:] - 2 * dF) / (H_x ** 2)
x_step = (X[N-1] - X[0]) / num_points
x_points = []
x_base = X[0]
for i in range(num_points):
    x_points.append(x_base + x_step * i)
y_points = []
for x_point in x_points:
    for i in range(N - 1):
        if (x_point >= X[i]) and (x_point <= X[i + 1]):
            y_point = Y[i] + B[i] * (x_point - X[i]) + C[i] * ((x_point - X[i]) ** 2) + D[i] * ((x_point - X[i]) ** 3)
            y_points.append(y_point)
spline, nodes = plt.plot(x_points, y_points, "-g", X, Y, "o")
plt.axis([X[0]-3, X[N-1]+3, np.min(y_points)-3, np.max(y_points)+3])
plt.title(u'P(x)')
plt.xlabel(u'X')
plt.ylabel(u'Y')
plt.grid()
plt.savefig('cubic_spline.png', format='png')
plt.show()
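One common way to extend this to three dimensions is to spline each coordinate separately against a shared parameter, such as the point index or the cumulative chord length. Here is a sketch using scipy.interpolate.CubicSpline instead of the hand-rolled solver above; the Z values are made up purely for illustration:

import numpy as np
from scipy.interpolate import CubicSpline

X = np.array([1, 5, 8, 12, 16, 20, 25, 30, 38], dtype=float)
Y = np.array([20, 14, 10, 7, 3, 8, 17, 5, 3], dtype=float)
Z = np.array([0, 2, 5, 7, 8, 9, 11, 12, 14], dtype=float)  # hypothetical z-data

t = np.arange(len(X))  # shared parameter along the path
sx, sy, sz = CubicSpline(t, X), CubicSpline(t, Y), CubicSpline(t, Z)
tt = np.linspace(0, len(X) - 1, 1000)
trajectory = np.column_stack([sx(tt), sy(tt), sz(tt)])  # (1000, 3) points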
Here is my code:
import numpy as np
cx = np.array([0, 0, 3, 3])
cy = np.array([0, 3, 4, 0])
M = len(cx)
for j in range(M):
    wx = 0
    wy = 0
    for i in range(M):
        if i == j:
            continue
        x = cx[i] - cx[j]
        y = cy[i] - cy[j]
        wx += -x / np.sqrt(x ** 2 + y ** 2)
        wy += -y / np.sqrt(x ** 2 + y ** 2)
    Move = (
        wx / np.sqrt(wx ** 2 + wy ** 2),
        wy / np.sqrt(wx ** 2 + wy ** 2),
    )
What is wrong with my code? Your help will be highly appreciated.
Try this:
import numpy as np

cx = np.array([0, 0, 3, 3])
cy = np.array([0, 3, 4, 0])
M = len(cx)
for j in range(M):
    wx = 0
    wy = 0
    for i in range(M):
        if i == j:
            continue
        x = cx[i] - cx[j]
        y = cy[i] - cy[j]
        wx += -x / np.sqrt(x ** 2 + y ** 2)
        wy += -y / np.sqrt(x ** 2 + y ** 2)
    Move = (
        wx / np.sqrt(wx ** 2 + wy ** 2),
        wy / np.sqrt(wx ** 2 + wy ** 2),
    )
    print(Move)
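The same computation can also be vectorized; a sketch, assuming the goal is a unit vector per point pointing away from all the other points:

import numpy as np

cx = np.array([0, 0, 3, 3], dtype=float)
cy = np.array([0, 3, 4, 0], dtype=float)
P = np.column_stack([cx, cy])        # (M, 2) points
d = P[:, None, :] - P[None, :, :]    # d[j, i] = P[j] - P[i]
r = np.linalg.norm(d, axis=2)        # pairwise distances
np.fill_diagonal(r, np.inf)          # zero out the i == j terms
w = (d / r[:, :, None]).sum(axis=1)  # summed repulsion per point
moves = w / np.linalg.norm(w, axis=1, keepdims=True)
print(moves)  # one row per point, matching Move above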
I have the following code written in R to estimate three coefficients (a, b and c):
y <- c(120, 125, 158, 300, 350, 390, 2800, 5900, 7790)
t <- 1:9
fit <- nls(y ~ a * (((b + c)^2/b) * exp(-(b + c) * t))/(1 + (c/b) *
exp(-(b + c) * t))^2, start = list(a = 17933, b = 0.01, c = 0.31))
and I get this result:
> summary(fit )
Formula: y ~ a * (((b + c)^2/b) * exp(-(b + c) * t))/(1 + (c/b) * exp(-(b +
c) * t))^2
Parameters:
Estimate Std. Error t value Pr(>|t|)
a 2.501e+04 2.031e+03 12.312 1.75e-05 ***
b 1.891e-05 1.383e-05 1.367 0.221
c 1.254e+00 1.052e-01 11.924 2.11e-05 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Residual standard error: 248.8 on 6 degrees of freedom
Number of iterations to convergence: 33
Achieved convergence tolerance: 6.836e-06
How can I do the same thing in Python?
You can use curve_fit, which gives you the same result:
import scipy.optimize as optimization
import numpy as np
y = np.array([120, 125, 158, 300, 350, 390, 2800, 5900, 7790])
t = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9])
start = np.array([17933, 0.01, 0.31])
def f(t, a, b, c):
    num = a * (np.exp(-t * (b + c)) * np.power(b + c, 2) / b)
    denom = np.power(1 + (c / b) * np.exp(-t * (b + c)), 2)
    return num / denom
print(optimization.curve_fit(f, t, y, start))
#(array([ 2.50111448e+04, 1.89129922e-05, 1.25426156e+00]), array([[ 4.12657233e+06, 2.58151776e-02, -2.00881091e+02],
# [ 2.58151776e-02, 1.91318685e-10, -1.44733425e-06],
# [ -2.00881091e+02, -1.44733425e-06, 1.10654268e-02]]))
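The second array returned is the covariance of the estimates; the square roots of its diagonal entries give standard errors comparable to R's summary:

popt, pcov = optimization.curve_fit(f, t, y, start)
perr = np.sqrt(np.diag(pcov))  # ~[2.03e+03, 1.38e-05, 1.05e-01], matching R
print(perr)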