Coupled set of equations - Wrong answer from scipy's fsolve - python

I'm trying to solve the following coupled equations:
x = 1;
y - 0.5*y - 0.7*v = 0;
w - 0.7*x - 0.5*w = 0;
v = 1.
(I know that the equations x = 1 and v = 1 seem unnecessary, but I need them for a later generalization of the code.)
My code is the following:
import numpy as np
from scipy.optimize import fsolve

def myFunction(z):
    x = z[0]
    y = z[1]
    w = z[2]
    v = z[3]

    F = np.empty((4))
    F[0] = 1
    F[1] = y - 0.5*y - 0.7*v
    F[2] = w - 0.7*x - 0.5*w
    F[3] = 1
    return F

zGuess = np.array([1, 2.5, 2.5, 1])
z = fsolve(myFunction, zGuess)
print(z)
The answer I get is [-224.57569869, -314.40597772, -314.40597817, -224.57569837], but I would expect [1, 1.4, 1.4, 1]. Why is fsolve unable to find the answer to this simple set of equations? What's more: why are the values of x and v, which are not modified at any point, not the same as those in the initial guess?

To convert the requirement x = 1 to code, rewrite it as x - 1 = 0. That says that the line F[0] = 1 should be changed to F[0] = x - 1. Similarly, the line F[3] = 1 should be F[3] = v - 1.
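For completeness, here is a sketch of the residual function with those two lines changed; with the same initial guess, fsolve should now converge to the expected solution:

import numpy as np
from scipy.optimize import fsolve

def myFunction(z):
    x, y, w, v = z
    F = np.empty(4)
    F[0] = x - 1                  # encodes x = 1
    F[1] = y - 0.5*y - 0.7*v
    F[2] = w - 0.7*x - 0.5*w
    F[3] = v - 1                  # encodes v = 1
    return F

zGuess = np.array([1, 2.5, 2.5, 1])
print(fsolve(myFunction, zGuess))  # should print approximately [1.  1.4  1.4  1.]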

Related

Reimplement Eigen rotation matrix conversion to quaternions in Python

For consistency, I want to reimplement the conversion from a rotation matrix to a quaternion from the C++ Eigen library in Python.
The C++ implementation can be found here, and below you can find my Python implementation:
def rotationMatrixToQuaternion3(m):
    # q0 = qw
    t = np.matrix.trace(m)
    q = np.asarray([0.0, 0.0, 0.0, 0.0], dtype=np.float64)

    if t > 0:
        t = np.sqrt(t + 1)
        q[0] = 0.5 * t
        t = 0.5 / t
        q[1] = (m[2,1] - m[1,2]) * t
        q[2] = (m[0,2] - m[2,0]) * t
        q[3] = (m[1,0] - m[0,1]) * t
    else:
        i = 0
        if m[1,1] > m[0,0]:
            i = 1
        if m[2,2] > m[i,i]:
            i = 2
        j = (i+1) % 3
        k = (j+1) % 3

        t = np.sqrt(m[i,i] - m[j,j] - m[k,k] + 1)
        q[i] = 0.5 * t
        t = 0.5 / t
        q[0] = (m[k,j] - m[j,k]) * t
        q[j] = (m[j,i] + m[i,j]) * t
        q[k] = (m[k,i] + m[i,k]) * t

    return q
Here are some example results:
rotation matrix:
[[-0.00998882 0.01194957 -0.99987871]
[ 0.49223613 -0.87032691 -0.01531875]
[-0.8704044 -0.49232944 0.00281153]]
python implementation - qw, qx, qy, qz:
[-0.68145553 -0.18496647 0.68613542 0. ]
eigen:
-0.686135 -0.174997 -0.681456 0.184966
rotation matrix:
[[ 0.01541426 0.02293597 -0.9996181 ]
[ 0.49081359 -0.87117607 -0.01242048]
[-0.87112825 -0.49043469 -0.02468582]]
python implementation - qw, qx, qy, qz:
[-0.17288173 0.18580601 -0.67658628 0. ]
eigen:
-0.686135 -0.174997 -0.681456 0.184966
rotation matrix:
[[ 0.03744363 -0.01068005 -0.99924167]
[ 0.48694091 -0.87299945 0.02757743]
[-0.87263195 -0.48760425 -0.02748771]]
python implementation - qw, qx, qy, qz:
[-0.18503815 0.17105894 -0.67232212 0. ]
eigen:
-0.672322 -0.185038 -0.696048 0.171059
Help would be highly appreciated!
Thanks,
Johannes
It seems that the order used by Eigen is [x, y, z, w]; see the same source file that you base your implementation on.
So the indices that you use should be changed in the following way:
def rotationMatrixToQuaternion3(m):
    # q3 = qw
    t = np.matrix.trace(m)
    q = np.asarray([0.0, 0.0, 0.0, 0.0], dtype=np.float64)

    if t > 0:
        t = np.sqrt(t + 1)
        q[3] = 0.5 * t
        t = 0.5 / t
        q[0] = (m[2,1] - m[1,2]) * t
        q[1] = (m[0,2] - m[2,0]) * t
        q[2] = (m[1,0] - m[0,1]) * t
    else:
        i = 0
        if m[1,1] > m[0,0]:
            i = 1
        if m[2,2] > m[i,i]:
            i = 2
        j = (i+1) % 3
        k = (j+1) % 3

        t = np.sqrt(m[i,i] - m[j,j] - m[k,k] + 1)
        q[i] = 0.5 * t
        t = 0.5 / t
        q[3] = (m[k,j] - m[j,k]) * t
        q[j] = (m[j,i] + m[i,j]) * t
        q[k] = (m[k,i] + m[i,k]) * t

    return q
And the returned quaternion is in [x, y, z, w] order.
Running the modified code did not produce the same results that you report for Eigen; unfortunately, I do not know how to reproduce the Eigen results.
However, there is a scipy implementation of the rotation-matrix-to-quaternion conversion, which gives the same results as the above implementation (up to multiplication of the vector by -1, which is an inherent ambiguity of the quaternion and is thus implementation-dependent):
from scipy.spatial import transform

mat = np.array([[-0.00998882,  0.01194957, -0.99987871],
                [ 0.49223613, -0.87032691, -0.01531875],
                [-0.8704044,  -0.49232944,  0.00281153]])
quat_sp = transform.Rotation.from_matrix(mat).as_quat()
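As a quick sanity check (a sketch reusing mat and quat_sp from the snippet above), the modified function can be compared against scipy while allowing for the overall sign ambiguity:

q_own = rotationMatrixToQuaternion3(mat)  # also returns [x, y, z, w] after the fix
match = np.allclose(q_own, quat_sp, atol=1e-6) or np.allclose(q_own, -quat_sp, atol=1e-6)
print(match)  # expected: True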
So I think the above implementation is correct, and the problem was in the invocation of Eigen.

Using maximum likelihood to derive regression coefficients in Python

As a learning exercise for myself, I am trying to estimate the regression parameters using the MLE method in Python.
From what I have gathered, for a linear model with Gaussian errors, maximizing the log-likelihood boils down to minimizing the sum of squared residuals, sum_i (y_i - a - b*x_i)^2.
So I need to take partial derivatives with respect to the intercept and the slope, set each to zero, and this should give me the coefficients.
I have been trying to approach this using sympy as follows:
from sympy import *
b = Symbol('b')# beta1
a = Symbol('a')# intercept
x = Symbol('x', integer=True)
y = Symbol('y', integer=True)
i = symbols('i', cls=Idx)
x_values = [2,3,2]
y_values = [1,2,3]
n = len(x_values)-1
function = summation((Indexed('y',i) - a+b*Indexed('x',i))**2, (i, 0, n))
partial_intercept = function.diff(a)
print(partial_intercept)
# 6*a - 2*b*x[0] - 2*b*x[1] - 2*b*x[2] - 2*y[0] - 2*y[1] - 2*y[2]
intercept_f = lambdify([x, y], partial_intercept)
inter = solve(intercept_f(x_values, y_values), a)
print(inter)
# [7*b/3 + 2]
I would have expected a single value for the slope, such that the 'b' variable is gone. However, I see that this wouldn't be possible given that the variable b is still there in my derivative equation.
Does anyone have any advice on where I am going wrong?
Thanks!
Edit : Fixed a typo in the codeblock
The expression 7*b/3 + 2 at the end tells you that a has to satisfy a = 7*b/3 + 2, so it still depends on b.
You should solve for both a and b as a system simultaneously.
In the following code, I find the relationship that a and b have to satisfy and solve the two equations simultaneously.
from sympy import *
b = Symbol('b')# beta1
a = Symbol('a')# intercept
x = Symbol('x', integer=True)
y = Symbol('y', integer=True)
i = symbols('i', cls=Idx)
x_values = [2,3,2]
y_values = [1,2,3]
n = len(x_values)-1
function = summation((Indexed('y',i) - a+b*Indexed('x',i))**2, (i, 0, n))
partial_intercept = function.diff(a)
print(partial_intercept)
# 6*a - 2*b*x[0] - 2*b*x[1] - 2*b*x[2] - 2*y[0] - 2*y[1] - 2*y[2]
intercept_f = lambdify([x, y], partial_intercept)
inter = solve(intercept_f(x_values, y_values), a)
print(inter)
#[7*b/3 + 2]
partial_gradient = function.diff(b)
print(partial_gradient)
# 2*(-a + b*x[0] + y[0])*x[0] + 2*(-a + b*x[1] + y[1])*x[1] + 2*(-a + b*x[2] + y[2])*x[2]
intercept_f = lambdify([x, y], partial_gradient)
inter2 = solve(intercept_f(x_values, y_values), b)
print(inter2)
ans = solve([a-inter[0], b-inter2[0]])
print(ans)
Here are the outputs:
6*a - 2*b*x[0] - 2*b*x[1] - 2*b*x[2] - 2*y[0] - 2*y[1] - 2*y[2]
[7*b/3 + 2]
2*(-a + b*x[0] + y[0])*x[0] + 2*(-a + b*x[1] + y[1])*x[1] + 2*(-a + b*x[2] + y[2])*x[2]
[7*a/17 - 14/17]
{a: 2, b: 0}
So a should be 2 and b should be 0.
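As an alternative, here is a sketch that reuses partial_intercept and partial_gradient from the code above: substitute the data into both stationarity conditions and hand them to solve in a single call (the names subs_map, eqs and sols are only for illustration):

subs_map = {Indexed('x', k): x_values[k] for k in range(len(x_values))}
subs_map.update({Indexed('y', k): y_values[k] for k in range(len(y_values))})

eqs = [partial_intercept.subs(subs_map), partial_gradient.subs(subs_map)]
sols = solve(eqs, [a, b])
print(sols)  # expected: {a: 2, b: 0}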

Why isn't my gradient descent algorithm working?

I made a gradient descent algorithm in Python and it doesn't work. My m and b values keep increasing and never stop, until I get -inf or an "overflow encountered in square" error.
import numpy as np

x = np.array([2,3,4,5])
y = np.array([5,7,9,5])
m = np.random.randn()
b = np.random.randn()
error = 0
lr = 0.0001
for q in range(1000):
    for i in range(len(x)):
        ypred = m*x[i] + b
        error += (ypred - y[i]) ** 2
        m = m - (x * error) * lr
        b = b - (lr * error)
print(b, m)
I expected my algorithm to return the best m and b values for my data (x and y) but it didn't work. What is going wrong?
import numpy as np

x = np.array([2,3,4,5])
y = 0.3*x + 0.6
m = np.random.randn()
b = np.random.randn()
lr = 0.001
for q in range(100000):
    ypred = m*x + b
    error = (1./(2*len(x))) * np.sum(np.square(ypred - y))  # eq 1
    m = m - lr * np.sum((ypred - y)*x)/len(x)               # eq 2 and eq 4
    b = b - lr * np.sum(ypred - y)/len(x)                   # eq 3 and eq 5
print(m, b)
Output:
0.30007724168011807 0.5997039817571881
Math behind it: the cost is error = (1/(2n)) * sum((ypred - y)^2) (eq 1), its gradients are d(error)/dm = (1/n) * sum((ypred - y) * x) (eq 2) and d(error)/db = (1/n) * sum(ypred - y) (eq 3), and gradient descent updates m := m - lr * d(error)/dm (eq 4) and b := b - lr * d(error)/db (eq 5), which is what the equation comments in the code refer to.
Use numpy vectorized operations to avoid loops.
I think you implemented the formula incorrectly: take the sum of x * (ypred - y) over the data points and divide by the length of x.
See below code:
import numpy as np

x = np.array([2,3,4,5])
y = np.array([5,7,9,11])
m = np.random.randn()
b = np.random.randn()
error = 0
lr = 0.1
print(b, m)
for q in range(1000):
    ypred = []
    for i in range(len(x)):
        temp = m*x[i] + b
        ypred.append(temp)
        error += temp - y[i]
    m = m - np.sum(x * (ypred - y)) * lr / len(x)
    b = b - np.sum(lr * (ypred - y)) / len(x)
print(b, m)
Output:
-1.198074371762264 0.058595039571115955 # initial weights
0.9997389097653074 2.0000681277214487 # Final weights

Pure-Python inverse error function

Are there any pure-Python implementations of the inverse error function?
I know that SciPy has scipy.special.erfinv(), but that relies on C extensions; I'd like a pure-Python implementation.
I've tried writing my own using the Wikipedia and Wolfram references, but it always seems to diverge from the true value when the argument is > 0.9.
I've also attempted to port the underlying C code that Scipy uses (ndtri.c and the cephes polevl.c functions) but that's also not passing my unit tests.
Edit: As requested, I've added the ported code.
Docstrings (and doctests) have been removed because they're longer than the functions. I haven't yet put much effort into making the port more pythonic - I'll worry about that once I get something that passes unit tests.
Supporting functions from cephes polevl.c
def polevl(x, coefs, N):
    ans = 0
    power = len(coefs) - 1
    for coef in coefs[:N]:
        ans += coef * x**power
        power -= 1
    return ans

def p1evl(x, coefs, N):
    return polevl(x, [1] + coefs, N)
Main Inverse Error Function
import math

def inv_erf(z):
    if z < -1 or z > 1:
        raise ValueError("`z` must be between -1 and 1 inclusive")
    if z == 0:
        return 0
    if z == 1:
        return math.inf
    if z == -1:
        return -math.inf

    # From scipy special/cephes/ndtri.c
    def ndtri(y):
        # approximation for 0 <= abs(z - 0.5) <= 3/8
        P0 = [
            -5.99633501014107895267E1,
            9.80010754185999661536E1,
            -5.66762857469070293439E1,
            1.39312609387279679503E1,
            -1.23916583867381258016E0,
        ]
        Q0 = [
            1.95448858338141759834E0,
            4.67627912898881538453E0,
            8.63602421390890590575E1,
            -2.25462687854119370527E2,
            2.00260212380060660359E2,
            -8.20372256168333339912E1,
            1.59056225126211695515E1,
            -1.18331621121330003142E0,
        ]

        # Approximation for interval z = sqrt(-2 log y) between 2 and 8,
        # i.e., y between exp(-2) = .135 and exp(-32) = 1.27e-14.
        P1 = [
            4.05544892305962419923E0,
            3.15251094599893866154E1,
            5.71628192246421288162E1,
            4.40805073893200834700E1,
            1.46849561928858024014E1,
            2.18663306850790267539E0,
            -1.40256079171354495875E-1,
            -3.50424626827848203418E-2,
            -8.57456785154685413611E-4,
        ]
        Q1 = [
            1.57799883256466749731E1,
            4.53907635128879210584E1,
            4.13172038254672030440E1,
            1.50425385692907503408E1,
            2.50464946208309415979E0,
            -1.42182922854787788574E-1,
            -3.80806407691578277194E-2,
            -9.33259480895457427372E-4,
        ]

        # Approximation for interval z = sqrt(-2 log y) between 8 and 64,
        # i.e., y between exp(-32) = 1.27e-14 and exp(-2048) = 3.67e-890.
        P2 = [
            3.23774891776946035970E0,
            6.91522889068984211695E0,
            3.93881025292474443415E0,
            1.33303460815807542389E0,
            2.01485389549179081538E-1,
            1.23716634817820021358E-2,
            3.01581553508235416007E-4,
            2.65806974686737550832E-6,
            6.23974539184983293730E-9,
        ]
        Q2 = [
            6.02427039364742014255E0,
            3.67983563856160859403E0,
            1.37702099489081330271E0,
            2.16236993594496635890E-1,
            1.34204006088543189037E-2,
            3.28014464682127739104E-4,
            2.89247864745380683936E-6,
            6.79019408009981274425E-9,
        ]

        s2pi = 2.50662827463100050242
        code = 1

        if y > (1.0 - 0.13533528323661269189):  # 0.135... = exp(-2)
            y = 1.0 - y
            code = 0

        if y > 0.13533528323661269189:
            y = y - 0.5
            y2 = y * y
            x = y + y * (y2 * polevl(y2, P0, 4) / p1evl(y2, Q0, 8))
            x = x * s2pi
            return x

        x = math.sqrt(-2.0 * math.log(y))
        x0 = x - math.log(x) / x
        z = 1.0 / x

        if x < 8.0:  # y > exp(-32) = 1.2664165549e-14
            x1 = z * polevl(z, P1, 8) / p1evl(z, Q1, 8)
        else:
            x1 = z * polevl(z, P2, 8) / p1evl(z, Q2, 8)

        x = x0 - x1
        if code != 0:
            x = -x
        return x

    result = ndtri((z + 1) / 2.0) / math.sqrt(2)
    return result
I think the error in your code is in the for loop over coefficients in the polevl function. If you replace what you have with the function below everything seems to work.
def polevl(x, coefs, N):
    ans = 0
    power = len(coefs) - 1
    for coef in coefs:
        ans += coef * x**power
        power -= 1
    return ans
I have tested it against scipy's implementation with the following code:
import numpy as np
from scipy.special import erfinv

N = 100000
x = np.random.rand(N) - 1.

# Calculate the inverse of the error function
y = np.zeros(N)
for i in range(N):
    y[i] = inv_erf(x[i])

assert np.allclose(y, erfinv(x))
sympy? Some digging may be needed to see how it's implemented internally: http://docs.sympy.org/latest/modules/functions/special.html#sympy.functions.special.error_functions.erfinv
from sympy import erfinv
erfinv(0.9).evalf(30)
1.16308715367667425688580351562

Scipy - Non-linear Equations System with linear constraints (beginner)

I have seen this amazing example.
But I need to solve system with boundaries on X and F, for example:
f1 = x + y^2 = 0
f2 = e^x + x*y = 0
-5.5 < x < 0.18
2.1 < y < 10.6
# 0.15 < f1 < 20.5 - not useful for this example
# -10.5 < f2 < -0.16 - not useful for this example
How can I set these boundary constraints for scipy's fsolve()? Or maybe there is some other method?
Could you give me a simple code example?
I hope this will serve you as a starter. It was all there.
import numpy as np
from scipy.optimize import minimize

def my_fun(z):
    x = z[0]
    y = z[1]
    f = np.zeros(2)
    f[0] = x + y ** 2
    f[1] = np.exp(x) + x * y
    return np.dot(f, f)

def my_cons(z):
    x = z[0]
    y = z[1]
    f = np.zeros(4)
    f[0] = x + 5.5
    f[1] = 0.18 - x
    f[2] = y - 2.1
    f[3] = 10.6 - y
    return f

cons = {'type': 'ineq', 'fun': my_cons}
res = minimize(my_fun, (2, 0), method='SLSQP', constraints=cons)
res
res
status: 0
success: True
njev: 7
nfev: 29
fun: 14.514193585986144
x: array([-0.86901099, 2.1 ])
message: 'Optimization terminated successfully.'
jac: array([ -2.47001648e-04, 3.21871972e+01, 0.00000000e+00])
nit: 7
EDIT: In response to the comments: if your function values f1 and f2 are not zero, you just have to rewrite the equations accordingly. For example, with f1 = -6 and f2 = 3, your function to minimize becomes:
def my_fun(z):
    x = z[0]
    y = z[1]
    f = np.zeros(2)
    f[0] = x + y ** 2 + 6
    f[1] = np.exp(x) + x * y - 3
    return np.dot(f, f)
It depends on the system, but here you can simply check the constraints afterwards:
first solve your nonlinear system to get one/none/several solutions of the form (x, y), then check which of these solutions, if any, satisfy the constraints.
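A minimal sketch of that idea, assuming the same two equations and bounds as above (the starting guess is arbitrary):

import numpy as np
from scipy.optimize import fsolve

def equations(z):
    x, y = z
    return [x + y ** 2, np.exp(x) + x * y]

sol, info, ier, msg = fsolve(equations, (-1.0, 1.0), full_output=True)
x, y = sol
in_bounds = (-5.5 < x < 0.18) and (2.1 < y < 10.6)
print(sol, "converged:", ier == 1, "within bounds:", in_bounds)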
