I want to implement the Gradient Descent Algorithm on this simple data, but I am facing problems. It would be great if someone could point me in the right direction. The prediction should be 7 for x = 6, but I'm not getting there.
X = [1, 2, 3, 4]
Y = [2, 3, 4, 5]
m_gradient = 0
b_gradient = 0
m, b = 0, 0
learning_rate = 0.1
N = len(Y)
for p in range(100):
    for idx in range(len(Y)):
        x = X[idx]
        y = Y[idx]
        hyp = (m * x) + b
        m_gradient += -(2/N) * x * (y - hyp)
        b_gradient += -(2/N) * (y - hyp)
    m = m - (m_gradient * learning_rate)
    b = b - (b_gradient * learning_rate)

print(b + m*6)
You are calculating the gradients incorrectly for all but the first iteration. You need to set both gradients to 0 in the outer for loop.
X = [1, 2, 3, 4]
Y = [2, 3, 4, 5]
m_gradient = 0
b_gradient = 0
m, b = 0, 0
learning_rate = 0.1
N = len(Y)
for p in range(100):
    for idx in range(len(Y)):
        x = X[idx]
        y = Y[idx]
        hyp = (m * x) + b
        m_gradient += -(2/N) * x * (y - hyp)
        b_gradient += -(2/N) * (y - hyp)
    m = m - (m_gradient * learning_rate)
    b = b - (b_gradient * learning_rate)
    m_gradient, b_gradient = 0, 0

print(b + m*6)
For example, consider b_gradient. Before the first iteration b_gradient = 0, and during that iteration it is computed as 0 + -0.5*(y0 - (m*x0 + b)) + -0.5*(y1 - (m*x1 + b)) + -0.5*(y2 - (m*x2 + b)) + -0.5*(y3 - (m*x3 + b)), where xn and yn denote X[n] and Y[n], respectively.
After the first iteration the value of b_gradient is -7, which is correct.
The problem starts with the second iteration. Instead of calculating b_gradient as the sum of -0.5*(yn - (m*xn + b)) for 0 <= n <= 3, you calculated it as the previous value of b_gradient plus that sum.
After the second iteration the value of b_gradient is -2.6, which is incorrect. The correct value is 4.4; note that 4.4 - 7 = -2.6, i.e. the stale -7 from the first iteration is still included.
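To see this concretely, here is a quick check (an addition, not part of the original answer) of the fresh b_gradient at the start of the second iteration; after the first update the parameters are m = 2.0 and b = 0.7:

X = [1, 2, 3, 4]
Y = [2, 3, 4, 5]
m, b = 2.0, 0.7  # parameter values after the first (correct) update
fresh_b_gradient = sum(-(2/4) * (y - (m*x + b)) for x, y in zip(X, Y))
print(fresh_b_gradient)  # ~4.4, the correct value; the buggy version yields -7 + 4.4 = -2.6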
It seems you want the coefficients for Linear Regression using Gradient Descent. More data points, a slightly smaller learning rate, and training for more epochs while monitoring the loss will all help reduce the error.
As the input size gets larger, the code below will give slightly off results. The methods mentioned above, such as training for more epochs, will give correct results over a larger range of inputs.
Vectorized Version
import numpy as np
X = np.array([1, 2, 3, 4, 5, 6, 7])
Y = np.array([2, 3, 4, 5, 6, 7, 8])
w_gradient = 0
b_gradient = 0
w, b = 0.5, 0.5
learning_rate = .01
loss = 0
EPOCHS = 2000
N = len(Y)
for i in range(EPOCHS):
    # Predict
    Y_pred = (w * X) + b
    # Loss
    loss = np.square(Y_pred - Y).sum() / (2.0 * N)
    if i % 100 == 0:
        print(loss)
    # Backprop
    grad_y_pred = (2 / N) * (Y_pred - Y)
    w_gradient = (grad_y_pred * X).sum()
    b_gradient = (grad_y_pred).sum()
    # Optimize
    w -= (w_gradient * learning_rate)
    b -= (b_gradient * learning_rate)

print("\n\n")
print("LEARNED:")
print(w, b)
print("\n")
print("TEST:")
print(np.round(b + w * (-2)))
print(np.round(b + w * 0))
print(np.round(b + w * 1))
print(np.round(b + w * 6))
print(np.round(b + w * 3000))
# Expected: 30001, but gives 30002.
# Training for 3000 epochs will give the expected result.
# For a simple demo with little training data and a small input range, 2000 is enough.
print(np.round(b + w * 30000))
Output
LEARNED:
1.0000349103409163 0.9998271260509328
TEST:
-1.0
1.0
2.0
7.0
3001.0
30002.0
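As a sanity check (an addition, not part of the original answer), the least-squares line for this data can also be computed in closed form with np.polyfit; the learned w and b should approach these values:

import numpy as np
X = np.array([1, 2, 3, 4, 5, 6, 7])
Y = np.array([2, 3, 4, 5, 6, 7, 8])
slope, intercept = np.polyfit(X, Y, 1)  # degree-1 least-squares fit
print(slope, intercept)  # ~1.0 ~1.0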
Loop Version
import numpy as np
X = np.array([1, 2, 3, 4, 5, 6, 7])
Y = np.array([2, 3, 4, 5, 6, 7, 8])
w_gradient = 0
b_gradient = 0
w, b = 0.5, 0.5
learning_rate = .01
loss = 0
EPOCHS = 2000
N = len(Y)
for i in range(EPOCHS):
    w_gradient = 0
    b_gradient = 0
    loss = 0
    for j in range(N):
        # Predict
        Y_pred = (w * X[j]) + b
        # Loss
        loss += np.square(Y_pred - Y[j]) / (2.0 * N)
        # Backprop
        grad_y_pred = (2 / N) * (Y_pred - Y[j])
        w_gradient += (grad_y_pred * X[j])
        b_gradient += (grad_y_pred)
    # Optimize
    w -= (w_gradient * learning_rate)
    b -= (b_gradient * learning_rate)
    # Print loss
    if i % 100 == 0:
        print(loss)

print("\n\n")
print("LEARNED:")
print(w, b)
print("\n")
print("TEST:")
print(np.round(b + w * (-2)))
print(np.round(b + w * 0))
print(np.round(b + w * 1))
print(np.round(b + w * 6))
print(np.round(b + w * 3000))
# Expected: 30001, but gives 30002.
# Training for 3000 epochs will give the expected result.
# For a simple demo with little training data and a small input range, 2000 is enough.
print(np.round(b + w * 30000))
Output
LEARNED:
1.0000349103409163 0.9998271260509328
TEST:
-1.0
1.0
2.0
7.0
3001.0
30002.0
I am new to ML and tried to build a Linear Regression model by myself. The objective is to predict Fahrenheit values from Celsius values.
This is my code:
import numpy as np

celsius_q = np.array([-40, -10, 0, 8, 15, 22, 38], dtype=float)
fahrenheit_a = np.array([-40, 14, 32, 46, 59, 72, 100], dtype=float)
inputs = celsius_q
output_expected = fahrenheit_a

# y = m * x + b
m = 100
b = 0
m_gradient = 0
b_gradient = 0
learning_rate = 0.00001

# Forward propagation
for i in range(10000):
    for i in range(len(inputs)):
        m_gradient += (m + (b * inputs[i] - output_expected[i]))
        b_gradient += inputs[i] * (m + (b * inputs[i]) - output_expected[i])

m_new = m - learning_rate * (2/len(inputs)) * m_gradient
b_new = b - learning_rate * (2/len(inputs)) * b_gradient
The code generates wrong weights for m and b, no matter how much I change the learning_rate and the number of epochs. The weights that minimize the loss function should be:
b = 1.8
m = 32
What am I doing wrong?
The update of m and b needs to happen at every step, but that alone is not going to be enough. You also need to increase your learning rate, for example to 0.0002:
import numpy as np

celsius_q = np.array([-40, -10, 0, 8, 15, 22, 38], dtype=float)
fahrenheit_a = np.array([-40, 14, 32, 46, 59, 72, 100], dtype=float)
inputs = celsius_q
output_expected = fahrenheit_a

# y = m * x + b
m_new = m = 100.0
b_new = b = 0.0
m_gradient = 0.0
b_gradient = 0.0
learning_rate = 0.0002

# Forward propagation
for i in range(10000):
    m_gradient, b_gradient = 0, 0
    for i in range(len(inputs)):
        m_gradient += (m_new + (b_new * inputs[i] - output_expected[i]))
        b_gradient += inputs[i] * (m_new + (b_new * inputs[i]) - output_expected[i])
    m_new -= learning_rate * m_gradient
    b_new -= learning_rate * b_gradient

print(m_new, b_new)
Getting:
31.952623523538897 1.7979482813813066
which is close to the expected 32 and 1.8.
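As an aside (an addition, not part of the original answer), the target values can be sanity-checked with a closed-form least-squares fit; note that the slope returned first by np.polyfit corresponds to b in the question's naming, and the intercept to m:

import numpy as np
celsius_q = np.array([-40, -10, 0, 8, 15, 22, 38], dtype=float)
fahrenheit_a = np.array([-40, 14, 32, 46, 59, 72, 100], dtype=float)
slope, intercept = np.polyfit(celsius_q, fahrenheit_a, 1)
print(slope, intercept)  # ~1.8 ~32.0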
You should continually update your parameters, in every step. Something like:
for i in range(10000):
    m_gradient, b_gradient = 0, 0
    for i in range(len(inputs)):
        m_gradient += (m + (b * inputs[i] - output_expected[i]))
        b_gradient += inputs[i] * (m + (b * inputs[i]) - output_expected[i])
    m -= learning_rate * m_gradient
    b -= learning_rate * b_gradient
(But I didn't check your math.)
I have written this code, but it is not converging as expected and runs for more iterations than I anticipate: I expect it to take 17 iterations, but it takes 24. I cannot figure out the reason!
import numpy as np

A = [[10, -1, 2, 0],
     [-1, 11, -1, 3],
     [2, -1, 10, -1],
     [0, 3, -1, 8]]
b = [6, 25, -11, 15]

def GaussSiedelAccelerated(A, b, e, x, w):
    e = float(e)
    iterations = 0
    Epsilon = float()
    n = len(A)
    condition = True
    while condition:
        for i in range(n):
            s1 = 0
            s2 = 0
            tempx = x.copy()  # Record the answer of the previous iteration
            for j in range(1, i, 1):
                s1 = s1 + x[j] * A[i][j]
            for k in range(i + 1, n, 1):
                s2 = s2 + tempx[k] * A[i][k]
            x[i] = x[i] * (1 - w) + w * (b[i] - s1 - s2) / A[i][i]
        iterations = iterations + 1
        Epsilon = max(abs(x - tempx)) / max(abs(x))
        print("Output vector for the run no.", iterations, "is:", x)
        print("Error for the run no.", iterations, "is: \t", Epsilon)
        condition = Epsilon > e
    return x, Epsilon, iterations

x0 = np.zeros(len(A))
x, Epsilon, iterations = GaussSiedelAccelerated(A, b, 0.0001, x0, 1.1)
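As a debugging aid (an addition, not part of the original post), you can compare each sweep's output against the exact solution of this system, which np.linalg.solve gives directly:

import numpy as np
A = np.array([[10, -1, 2, 0],
              [-1, 11, -1, 3],
              [2, -1, 10, -1],
              [0, 3, -1, 8]], dtype=float)
b = np.array([6, 25, -11, 15], dtype=float)
print(np.linalg.solve(A, b))  # [ 1.  2. -1.  1.]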
I have this code and I want to edit it to do something else:
def pol(poly, n, x):
    result = poly[0]
    # Using Horner's method
    for i in range(1, n):
        result = result * x + poly[i]
    return result

# Let us evaluate the value of
# 5x^3 + 9x^2 - x - 10 for x = 1
poly = [5, 9, -1, -10]
x = 1
n = len(poly)
print("Value of polynomial is: ", pol(poly, n, x))
I wonder how I can change the coefficients of the polynomial. This code only calculates:
x^3 and x^2
How can I make this code calculate, for example, this polynomial:
p(x) = 5x^10 + 9x - 7x - 10
or any polynomial in Python?
Your code should work; you just need to present the correct input. For
p(x) = 5x^10 + 9x - 7x - 10
you should provide:
poly2 = [5, 0, 0, 0, 0, 0, 0, 0, 0, 9-7, -10]
An alternate pol implementation:
def pol(poly, x):
    n = len(poly)  # no need to provide it at the call site
    rp = poly[::-1]  # [-10, -1, 9, 5], so the index from range(n) is the power
    print("Poly:", poly, "for x =", x)
    result = 0
    for i in range(n):
        val = rp[i] * x**i
        print(rp[i], ' * x^', i, ' = ', val, sep='')  # debug output
        result += val
    return result

x = 2  # 1 is a bad test candidate - no difference between 1**10 and 1**2

# 5x^3 + 9x^2 - x - 10
poly = [5, 9, -1, -10]
print("Value of polynomial is: ", pol(poly, x))

# p(x) = 5x^10 + 9x - 7x - 10
poly2 = [5, 0, 0, 0, 0, 0, 0, 0, 0, 9-7, -10]
print("Value of polynomial is: ", pol(poly2, x))
Output:
Poly: [5, 9, -1, -10] for x = 2
-10 * x^0 = -10
-1 * x^1 = -2
9 * x^2 = 36
5 * x^3 = 40
Value of polynomial is: 64
Poly: [5, 0, 0, 0, 0, 0, 0, 0, 0, 2, -10] for x = 2
-10 * x^0 = -10
2 * x^1 = 4
0 * x^2 = 0
0 * x^3 = 0
0 * x^4 = 0
0 * x^5 = 0
0 * x^6 = 0
0 * x^7 = 0
0 * x^8 = 0
0 * x^9 = 0
5 * x^10 = 5120
Value of polynomial is: 5114
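For what it's worth (an addition, not part of the original answer), the Horner-style pol from the question handles the same padded coefficient list without any changes, since Horner's method works for any degree:

def pol(poly, n, x):
    result = poly[0]
    # Horner's method: evaluates from the highest coefficient down
    for i in range(1, n):
        result = result * x + poly[i]
    return result

poly2 = [5, 0, 0, 0, 0, 0, 0, 0, 0, 9-7, -10]
print(pol(poly2, len(poly2), 2))  # 5114, matching the output above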
I am trying to implement cubic spline interpolation in 3 dimensions; however, I am unsure how to modify the code I have currently written to incorporate the z-axis. The purpose of this code is to calculate a trajectory between a starting point and an end point that passes through several intermediate points. Any assistance would be greatly appreciated!
import sys
import numpy as np
import matplotlib.pyplot as plt

X = np.array([1, 5, 8, 12, 16, 20, 25, 30, 38], float)
Y = np.array([20, 14, 10, 7, 3, 8, 17, 5, 3], float)

N = len(X)  # number of knots; N is used below but was missing from the snippet
num_points = 1000

H_x = np.diff(X)
H_y = np.diff(Y)
H_n = N - 1

Alfa = 1 / H_x[1 : H_n - 1]
Gamma = 1 / H_x[1 : H_n - 1]
Beta = 2 * (1 / H_x[:H_n - 1] + 1 / H_x[1:])
dF = H_y / H_x
Delta = 3 * (dF[1:] / H_x[1:] + dF[:H_n - 1] / H_x[:H_n - 1])

TDM = np.diag(Alfa, k=-1) + np.diag(Beta, 0) + np.diag(Gamma, +1)
B = np.linalg.solve(TDM, Delta)
B = np.hstack([0, B, 0])

C = (3 * dF - B[1:] - 2 * B[:H_n]) / H_x
D = (B[:H_n] + B[1:] - 2 * dF) / (H_x ** 2)

x_step = (X[N - 1] - X[0]) / num_points
x_points = []
x_base = X[0]
for i in range(num_points):
    x_points.append(x_base + x_step * i)

y_points = []
for x_point in x_points:
    for i in range(N - 1):
        if (x_point >= X[i]) and (x_point <= X[i + 1]):
            y_point = Y[i] + B[i] * (x_point - X[i]) + C[i] * ((x_point - X[i]) ** 2) + D[i] * ((x_point - X[i]) ** 3)
            y_points.append(y_point)

spline, nodes = plt.plot(x_points, y_points, "-g", X, Y, "o")
plt.axis([X[0] - 3, X[N - 1] + 3, np.min(y_points) - 3, np.max(y_points) + 3])
plt.title(u'P(x)')
plt.xlabel(u'X')
plt.ylabel(u'Y')
plt.grid()
plt.savefig('cubic_spline.png', format='png')
plt.show()
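For the 3-D extension, one common approach (a hedged sketch, not from the original post; the Z waypoints below are made up for illustration) is to make the trajectory parametric: instead of fitting y as a function of x, fit x, y and z each as a 1-D cubic spline in a shared parameter t, such as cumulative chord length, which also allows trajectories that double back in x. Using scipy for brevity:

import numpy as np
from scipy.interpolate import CubicSpline

X = np.array([1, 5, 8, 12, 16, 20, 25, 30, 38], float)
Y = np.array([20, 14, 10, 7, 3, 8, 17, 5, 3], float)
Z = np.array([0, 2, 4, 3, 1, 2, 5, 4, 2], float)  # hypothetical z waypoints

# Parametrize by cumulative chord length between consecutive waypoints
t = np.zeros(len(X))
t[1:] = np.cumsum(np.sqrt(np.diff(X)**2 + np.diff(Y)**2 + np.diff(Z)**2))

# One 1-D cubic spline per coordinate
sx, sy, sz = CubicSpline(t, X), CubicSpline(t, Y), CubicSpline(t, Z)

ts = np.linspace(t[0], t[-1], 1000)
trajectory = np.column_stack([sx(ts), sy(ts), sz(ts)])  # 1000 x 3 trajectory points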
So this is my code to plot a Bézier curve:
import numpy as np
import matplotlib.pyplot as pl

def bezier(a):
    n = np.shape(a)[0] - 1
    # initialise arrays
    B = np.zeros([101, 2])
    terms = np.zeros([n + 1, 2])
    # create an array of values for t from 0 to 1 in 101 steps
    t = np.linspace(0, 1, 101)
    # loop through all t values
    for i in range(0, 101):
        # calculate terms inside sum in equation 13
        for j in range(0, n + 1):
            # YOUR CODE HERE
            terms[j, :] = ((1 - t[i]) ** 3 * a[0, :]
                           + 3 * t[i] * (1 - t[i]) ** 2 * a[1, :]
                           + 3 * t[i] ** 2 * (1 - t[i]) * a[2, :]
                           + t[i] ** 3 * a[3, :])
        # sum terms to find the Bezier curve
        B[i, :] = sum(terms, 0)
    # plot the Bezier curve
    pl.plot(B[:, 0], B[:, 1])
    # plot the control points
    pl.plot(a[:, 0], a[:, 1], 'ko')
    # plot the control polygon
    pl.plot(a[:, 0], a[:, 1], 'k')
    return B
And when I try to pass it some control points:
a = np.array([[0, 0], [0.5, 1], [1, 0]])
B = bezier(a)
I receive this IndexError:
---------------------------------------------------------------------------
IndexError Traceback (most recent call last)
<ipython-input-16-fce87c9f1c04> in <module>()
1 a = np.array([[0, 0], [0.5, 1], [1, 0]])
----> 2 B = bezier(a)
<ipython-input-13-3bb3bb02cc87> in bezier(a)
11 for j in range(0, n + 1):
12 # YOUR CODE HERE
---> 13 terms[j,:] = ((1 - t[i]) ** 3 * a[0,:] + 3 * t[i] * (1-t[i]) ** 2 * a[1,:] + 3 * t[i] ** 2 * (1-t[i]) * a[2,:] + t[i] ** 3 * a[3,:])
14 #sum terms to find Bezier curve
15 B[i, :] = sum(terms, 0)
IndexError: index 3 is out of bounds for axis 0 with size 3
So I figure it is trying to access something outside the container, but I can't see where I need to change the code.
Your array a = np.array([[0, 0], [0.5, 1], [1, 0]]) does not have an element with index 3. Add another point to the array; you need four control points for a cubic Bézier curve anyway.
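If it helps, here is a minimal sketch (an addition, not part of the original answer) of a Bézier routine that works for any number of control points by using the general Bernstein form, so the cubic terms no longer need to be hard-coded:

import numpy as np
import matplotlib.pyplot as pl
from math import comb  # binomial coefficient, Python 3.8+

def bezier_general(a, steps=101):
    n = np.shape(a)[0] - 1  # curve degree = number of control points - 1
    t = np.linspace(0, 1, steps)
    B = np.zeros([steps, 2])
    for j in range(n + 1):
        # Bernstein basis polynomial B_{j,n}(t), weighting control point a[j]
        basis = comb(n, j) * t**j * (1 - t)**(n - j)
        B += basis[:, None] * a[j, :]
    return B

a = np.array([[0, 0], [0.5, 1], [1, 0]])  # three points -> a quadratic Bezier
B = bezier_general(a)
pl.plot(B[:, 0], B[:, 1])        # the curve
pl.plot(a[:, 0], a[:, 1], 'ko')  # the control points
pl.show()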