numpy polynomial.Polynomial.fit() gives different coefficients than polynomial.polyfit()

I do not understand why polynomial.Polynomial.fit() gives coefficients very different from the expected ones:
import numpy as np
x = np.linspace(0, 10, 50)
y = x**2 + 5 * x + 10
print(np.polyfit(x, y, 2))
print(np.polynomial.polynomial.polyfit(x, y, 2))
print(np.polynomial.polynomial.Polynomial.fit(x, y, 2))
Gives :
[ 1. 5. 10.]
[10. 5. 1.]
poly([60. 75. 25.])
The first two results are OK, and thanks to this answer I understand why the two arrays are in reversed order.
However, I do not understand the meaning of the third result. The coefficients look wrong, though the polynomial that I got this way seems to give correct predicted values.

The answer is slightly hidden in the docs, of course. Looking at the class numpy.polynomial.polynomial.Polynomial(coef, domain=None, window=None)
It is clear that in general the coefficients [a, b, c, ...] are for the polynomial a + b * x + c * x**2 + .... However, there are the keyword parameters domain and window, both with default [-1, 1]. I am not into that class, so I am not sure about the purpose, but it is clear that a remapping takes place. In the case of polynomial.Polynomial.fit() one has a class method that automatically takes the range of the x data as domain, but still maps it onto the window. Hence, in the OP [0, 10] is mapped onto [-1, 1]. This is done by x' = x / 5 - 1, or equivalently x = 5 * x' + 5. Putting the latter into the OP polynomial we get
(5 * x' + 5)**2 + 5 * (5 * x' + 5) + 10 = 25 * x'**2 + 75 * x' + 60
Voila.
To get the expected result one has to put
print(np.polynomial.polynomial.Polynomial.fit(x, y, 2, window=[0, 10] ) )
which gives
poly([10. 5. 1.])
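The remapping described above can be checked numerically. The following is a minimal sketch using the OP's data; it relies on the domain and window attributes and the mapparms() method of the fitted Polynomial, and the names x_mapped and y_by_hand are only illustrative.

```python
import numpy as np
from numpy.polynomial import Polynomial

x = np.linspace(0, 10, 50)
y = x**2 + 5 * x + 10

p = Polynomial.fit(x, y, 2)

# fit() stores the range of the x data as the domain and keeps the default window.
print(p.domain)
print(p.window)

# mapparms() returns (offset, scale) of the linear map x' = offset + scale * x
# that is applied to the input before the series is evaluated.
offset, scale = p.mapparms()

# Evaluate by hand: map x into the window, then apply the stored coefficients.
x_mapped = offset + scale * x
c0, c1, c2 = p.coef
y_by_hand = c0 + c1 * x_mapped + c2 * x_mapped**2

print(np.allclose(y_by_hand, y))  # manual evaluation reproduces the data
print(np.allclose(p(x), y))       # calling p does the same mapping internally
```

This is why the fitted object predicts correctly even though its stored coefficients look unfamiliar: they belong to the mapped variable x', not to x.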

Buried in the docs:
Note that the coefficients are given in the scaled domain defined by the linear mapping between the window and domain. convert can be used to get the coefficients in the unscaled data domain.
So use:
poly.convert()
This will rescale your coefficients to what you are probably expecting.
Example for data generated from 1 + 2x + 3x^2:
from numpy.polynomial import Polynomial
test_poly = Polynomial.fit([0, 1, 2, 3, 4, 5],
[1, 6, 17, 34, 57, 86],
2)
print(test_poly)
print(test_poly.convert())
Output:
poly([24.75 42.5 18.75])
poly([1. 2. 3.])

Related

I don't receive the arrays I want, can someone help me?

I have to make a program with 4 functions. Three of them have to be the coordinates x, y, z, and the fourth has to be the function that returns the np.array with all the coordinates together. But I only receive one giant array in which all the values of each coordinate appear as a single element of the array. I don't know if I'm explaining myself well, but here is my code:
import numpy as np
def x(t):
    x = (5.25 * t)
    return x
def y(t):
    y = (-0.365 * (t**2)) + (7.15 * t) + 34
    return y
def z(t):
    z = (-0.49 * (t**2)) + (9.9 * t)
    return z
def f(t):
    f = np.array([x(t), y(t), z(t)])
    return f
f(t) [0:49]
t = np.arange(0, 21, 0.4275996114)
M = f(t)
print(M)
I have to print the first 50 coordinates of the ball up to time 20 seconds, but I receive the 50 values of x as a single element of the array.

The question was not very clear but I think I understood your problem and what you are asking. Basically you want to get a matrix M in which each row is a numpy array containing the three x,y,z coordinates. You want to have as many rows as there are measurements in 20 seconds.
M = np.empty((0, 3), float)
t = np.arange(0, 21, 0.4275996114)
for time in t:
    M = np.append(M, [f(time)], axis=0)
print(M)
Explanation
First, we create what will become our matrix by specifying that each row consists of 3 columns. We want it to contain decimal numbers, so we specify float as type:
M = np.empty((0,3), float)
Then a problem you have in your code is that you only call the f function once, passing the entire t array of times as an argument. You actually have to call the f function once for each instant of time contained in t.
To solve it you have to make a loop on each element in t. The result of each call to f() must be added as a row in the M matrix.
t = np.arange(0, 21, 0.4275996114)
for time in t:
    M = np.append(M, [f(time)], axis=0)
Partial Output
This output shows only the rows for times from 0 to 3; it is just an example to show you the format of the output obtained.
[[ 0. 34. 0. ]
[ 2.24489796 36.9906001 4.14364385]
[ 4.48979592 39.84772596 8.10810311]
[ 6.73469388 42.57137757 11.89337776]
[ 8.97959184 45.16155495 15.49946782]
[11.2244898 47.61825808 18.92637328]
[13.46938776 49.94148697 22.17409413]
[15.71428572 52.13124162 25.24263039]]
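Because x, y and z only use array-friendly arithmetic, the same matrix can also be built without a Python loop. The following is a sketch of that alternative, not the answerer's method: it evaluates each coordinate on the whole t array and stacks the results as columns with np.column_stack.

```python
import numpy as np

def x(t):
    return 5.25 * t

def y(t):
    return -0.365 * t**2 + 7.15 * t + 34

def z(t):
    return -0.49 * t**2 + 9.9 * t

t = np.arange(0, 21, 0.4275996114)

# Each call returns one value per time step; column_stack turns the three
# 1-D arrays into a matrix with one (x, y, z) row per time step.
M = np.column_stack((x(t), y(t), z(t)))

print(M.shape)
print(M[:3])
```

Calling np.array([x(t), y(t), z(t)]) as in the question gives the same numbers with shape (3, N), so transposing that result with .T would work as well.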

Why does my optimization (scipy.optimize.minimize) not work and return the initial values instead?

I have a set of data; each column corresponds to a spectrum at a certain time. I want to fit the spectrum at a generic time (t_i) as a linear combination of the spectrum at time 0 (in the first column), at time 5 (in column 30) and time 35 (in column 210). So the equation I want to fit is:
S(t_i) = a * S(t_0) + b * S(t_5) + c * S(t_35)
where:
0 <= a, b, c <= 1
a + b + c = 1
I found the solution at this question (Minimizing Least Squares with Algebraic Constraints and Bounds) super useful. But when I try it with my set of data the results are obviously wrong. I tried changing the method to 'Nelder-Mead', but it doesn't respect my bounds, so I get negative values.
This is my script:
t0= df.iloc[:,0] #Spectrum at time 0
t5 = df.iloc[:,30] # Spectrum at time 5
t35 = df.iloc[:,120] # Spectrum at time 35
ti= df.iloc[:,20]
# Bounds that make every coefficient be between 0 and 1
bnds = [(0, 1), (0, 1), (0, 1)]
# Constrain the sum of the coefficient to 1
cons = [{"type": "eq", "fun": lambda x: x[0] + x[1] + x[2] - 1}]
xinit = np.array([1, 0, 0])
fun = lambda x: np.sum((ti -(x[0] * t0 + x[1] * t5 + x[2] * t35))**2)
res = minimize(fun, xinit,method='Nelder-Mead', bounds=bnds, constraints=cons)
print(res.x)
If I use the Nelder-Mead method I get: Out: [ 0.02732053 1.01961422 -0.04504698] , if I don't specify the method I get: [1. 0. 0.] (I believe that in this case the SLSQP method is being used).
The data I'm referring to is similar to the following:
0 3.333 5 35.001
0.001045089 0.001109701 0.001169798 0.000725486
0.001083051 0.001138815 0.001176665 0.000713021
0.001090994 0.001142676 0.001186642 0.000716149
0.001096258 0.001156476 0.001190218 0.00071286
Can you identify the problem? Can you suggest other ways to solve this problem? I have also tried using least_squares, but it failed.

The result of a local optimization strongly depends on the initial values.
It might return [1, 0, 0] for the case you stated above because there simply was no possibility for the optimizer to find a "downhill-only" way to [0. 1. 0.].
In fact, you might have started in a local minimum where all ways out of the dip went uphill. So the optimizer chose to stay. That's how these optimizers work.
Try
xinit = np.array([0.0, 1.0, 0.0])
for t_i = t5 and I am quite sure the optimizer will return the initial value.
For your case do what I stated here: Run the optimizer several times, each time pick random initial values inside your boundaries. You can pick the code posted there and just add your constraints, use SLSQP or trust-constr.
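The multi-start idea could be sketched as follows. This is an illustration under stated assumptions, not the code from the linked answer: the three spectra are random synthetic stand-ins, true_coeffs is a made-up target, and the tight ftol is an arbitrary choice.

```python
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(0)

# Synthetic stand-ins for the three reference spectra (hypothetical data).
t0, t5, t35 = rng.random((3, 50))
true_coeffs = np.array([0.2, 0.5, 0.3])
ti = true_coeffs[0] * t0 + true_coeffs[1] * t5 + true_coeffs[2] * t35

def objective(c):
    # Sum of squared residuals of the linear combination.
    return np.sum((ti - (c[0] * t0 + c[1] * t5 + c[2] * t35)) ** 2)

bnds = [(0, 1), (0, 1), (0, 1)]
cons = [{"type": "eq", "fun": lambda c: c[0] + c[1] + c[2] - 1}]

best = None
for _ in range(10):
    # Random start inside the bounds that already satisfies a + b + c = 1.
    xinit = rng.dirichlet(np.ones(3))
    res = minimize(objective, xinit, method="SLSQP", bounds=bnds,
                   constraints=cons, options={"ftol": 1e-12})
    # Keep the successful run with the smallest objective value.
    if res.success and (best is None or res.fun < best.fun):
        best = res

print(best.x)
```

Nelder-Mead is left out on purpose: it does not handle the equality constraint, which matches the out-of-range values reported in the question.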

python: work out intersection of two functions

I am trying to use scipy.optimize.fsolve to work out the x-intercept(s):
from scipy.optimize import fsolve
from numpy import array, empty
counter = 0
def f(x_):
    global counter
    counter += 1
    return pow(x_, 3) * 3 - 9.5 * pow(x_, 2) + 10 * x_
x0_ = empty(2)
x0_[0] = 1
x0_[1] = 6
res = fsolve(f, x0=x0_)
print(counter)
print(res)
the function f(x): https://www.desmos.com/calculator/8j8djr01da
the result of this code is:
74
[0. 0.]
I expect the result to be
[0, 1.575, 3.175]
Can someone please offer some help.
Plus:
I can't understand the documentation of fsolve(x0), is that just a guess? I would really appreciate it if you could explain.
Plus Plus:
I will be working with lots of equations containing unknown expressions and exponentials, so I am really looking for a way to work out the x-intercepts, in other words the roots, from the expression of f(x). I would be so glad if you can help.

You get the set of all roots for a polynomial by
numpy.roots([3, -9.5, +10, 0])
array([1.58333333+0.90905934j, 1.58333333-0.90905934j,
0. +0.j ])
It is not clear what your other expected real roots are, fsolve will only find the real root 0.
Of course, if you take the coefficients that you used in the Desmos graphing tool
numpy.roots([2, -9.5, +10, 0])
you will actually get the expected
array([3.17539053, 1.57460947, 0. ])
For scalar non-polynomial functions the interface scipy.optimize.root_scalar is perhaps more suitable, especially if you can provide a bracketing interval.
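A minimal sketch of the bracketing approach with scipy.optimize.root_scalar, using the coefficients from the Desmos graph. The brackets (1, 2) and (3, 4) are assumptions read off the plot; each must contain a sign change of f.

```python
from scipy.optimize import root_scalar

def f(x):
    return 2 * x**3 - 9.5 * x**2 + 10 * x

# Each bracket must satisfy f(a) * f(b) < 0 so that a root lies in between.
brackets = [(1.0, 2.0), (3.0, 4.0)]

roots = []
for a, b in brackets:
    sol = root_scalar(f, bracket=(a, b), method="brentq")
    roots.append(sol.root)

print(roots)
```

Unlike fsolve, a bracketing method cannot wander off to a different root, which helps when several roots lie close together.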

I just want to say that at the first step you define your function wrong:
it should be
def f(x_):
    # global counter
    # counter += 1
    return pow(x_, 3) * 2 - 9.5 * pow(x_, 2) + 10 * x_
but not pow(x_, 3) * 3 - 9.5 * pow(x_, 2) + 10 * x_
If you then set x0_ precisely:
x0_=[0,1,3] # according to intersection on graph
res=fsolve(f, x0=x0_)
this gives you the anticipated output:
[0. 1.57460947 3.17539053]
Sometimes you just have to be more careful :)

How to resolve function approximation task in Python?

Consider the complex mathematical function on the line [1, 15]:
f(x) = sin(x / 5) * exp(x / 10) + 5 * exp(-x / 2)
A polynomial of degree n (w_0 + w_1 x + w_2 x^2 + ... + w_n x^n) is uniquely defined by any n + 1 different points through which it passes.
This means that its coefficients w_0, ... w_n can be determined from the following system of linear equations:
w_0 + w_1 x_i + w_2 x_i^2 + ... + w_n x_i^n = f(x_i), for i = 1, ..., n + 1
where x_1, ..., x_n, x_{n + 1} are the points through which the polynomial passes, and f(x_1), ..., f(x_n), f(x_{n + 1}) are the values that it must take at these points.
I'm trying to form a system of linear equations (that is, specify the coefficient matrix A and the free vector b) for the polynomial of the third degree, which must coincide with the function f at points 1, 4, 10, and 15. Solve this system using the scipy.linalg.solve function.
A = numpy.array([[1., 1., 1., 1.], [1., 4., 8., 64.], [1., 10., 100., 1000.], [1., 15., 225., 3375.]])
V = numpy.array([3.25, 1.74, 2.50, 0.63])
numpy.linalg.solve(A, V)
I got the wrong answer.
So the question is: is the matrix correct?

No, your matrix is not correct.
The biggest mistake is in the second row of A. The third entry should be 4**2, which is 16, but you have 8. Less important, you have only two decimal places for your constants array V, but you really should have more precision than that. Systems of linear equations are sometimes very sensitive to the provided values, so make them as precise as possible. Also, the rounding in your final three entries is bad: you rounded down but you should have rounded up. If you really want two decimal places (which I do not recommend) the values should be
V = numpy.array([3.25, 1.75, 2.51, 0.64])
But better would be
V = numpy.array([3.252216865271419, 1.7468459495903677,
2.5054164070002463, 0.6352214195786656])
With those changes to A and V I get the result
array([ 4.36264154, -1.29552587, 0.19333685, -0.00823565])
I get these two sympy plots, the first showing your original function and the second using the approximated cubic polynomial.
They look close to me! When I calculate the function values at 1, 4, 10, and 15, the largest absolute error is for 15, namely -4.57042132584462e-6. That is somewhat larger than I would have expected but probably is good enough.
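To avoid typing the matrix by hand, the system can also be assembled programmatically. This is a sketch rather than the answerer's code: it uses np.vander with increasing=True so that each row is [1, x_i, x_i^2, x_i^3], and evaluates f at full precision instead of rounding.

```python
import numpy as np

def f(x):
    return np.sin(x / 5) * np.exp(x / 10) + 5 * np.exp(-x / 2)

points = np.array([1.0, 4.0, 10.0, 15.0])

# Row i of A is [1, x_i, x_i**2, x_i**3].
A = np.vander(points, N=len(points), increasing=True)
b = f(points)

# Coefficients w_0 ... w_3 of the interpolating cubic.
w = np.linalg.solve(A, b)
print(w)

# Largest mismatch between the cubic and f at the four nodes.
print(np.max(np.abs(A @ w - b)))
```

Generating the rows this way rules out slips such as writing 8 instead of 16 in the second row.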

Is it from a data science course? :)
Here is an almost generic solution I did:
# %matplotlib inline  (uncomment when running in a Jupyter notebook)
import numpy as np
import matplotlib.pyplot as plt
def f(x):
    return np.sin(x / 5) * np.exp(x / 10) + 5 * np.exp(-x / 2)
# approximate at the given points (feel free to experiment: change/add/remove)
points = np.array([1, 4, 10, 15])
n = points.size
# fill A-matrix, each row is 1 or xi^0, xi^1, xi^2, xi^3 .. xi^(n-1)
A = np.zeros((n, n))
for index in range(0, n):
    A[index] = np.power(np.full(n, points[index]), np.arange(0, n, 1))
# fill b-matrix, i.e. function value at the given points
b = f(points)
# solve to get the approximation polynomial coefficients
solve = np.linalg.solve(A, b)
# define the polynomial approximation of the function
def polinom(x):
    # Yi = solve * Xi where Xi = x^i
    tiles = np.tile(x, (n, 1))
    tiles[0] = np.ones(x.size)
    for index in range(1, n):
        tiles[index] = tiles[index]**index
    return solve.dot(tiles)
# plot the graphs of the original function and its approximation
x = np.linspace(1, 15, 100)
plt.plot(x, f(x))
plt.plot(x, polinom(x))
# print out the coefficients of the polynomial approximating our function
print(solve)

How does this interpolating function work?

I am trying to write a function which interpolates some data so that you can then choose any value on the x axis to find the corresponding point on the y axis.
For example:
f = f_from_data([3, 4, 6], [0, 1, 2])
print(f(3.5))
produces the answer
0.5
I came across an answer which looks like this:
def f_from_data(xs, ys):
    return scipy.interpolate.interp1d(xs, ys)
Can someone please explain how this works? I understand interp1d but I'm not sure how this simple line of code can get the answer when, for example
print(f(5))
is input into it.

A simple example may help. interp1d is a class that acts like a function. It returns not a number, but another function-like object. Once you call it again, it returns the interpolated value of y at the input value of x. You can also feed this function single points, or whole arrays:
import numpy as np
from scipy.interpolate import interp1d
X=[3,4,6]
Y=[0,1,2]
f = interp1d(X,Y, bounds_error=False)
print(f(3.5))
X2 = np.linspace(3, 6, 5)
print(X2)
print(f(X2))
0.5
[ 3. 3.75 4.5 5.25 6. ]
[ 0. 0.75 1.25 1.625 2. ]
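The example above passes bounds_error=False without comment. As a small sketch of what that flag changes: with it, a query outside the data range returns the fill value (NaN by default) instead of raising ValueError. The variable names inside, outside and strict are only illustrative.

```python
import numpy as np
from scipy.interpolate import interp1d

# With bounds_error=False, out-of-range queries return NaN by default.
f = interp1d([3, 4, 6], [0, 1, 2], bounds_error=False)
inside = f(5.0)
outside = f(10.0)
print(inside, outside)

# Without the flag, the same out-of-range query raises ValueError.
strict = interp1d([3, 4, 6], [0, 1, 2])
try:
    strict(10.0)
except ValueError as err:
    print("ValueError:", err)
```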

Your example uses linear interpolation - straight connecting lines between data points.
So, for your given data (xs = [3, 4, 6] and ys = [0, 1, 2]) the function looks like this (plot not reproduced here):
the blue points are the input data, the green line is the interpolated function, and the red dot is the test point f(3.5) == 0.5
To calculate f(5.0):
First, you have to find out which line segment you are on.
x == 5 is in the second segment, between 4 and 6, so we are looking for point C (5, y) between points A (4, 1) and B (6, 2).
C is on the line, so AC = k * AB where 0. <= k <= 1.; this gives us two equations in two unknowns (k and y). Solving, we get
y = Ay + (By - Ay) * (Cx - Ax) / (Bx - Ax)
and subbing in,
y = 1. + (2. - 1.) * (5. - 4.) / (6. - 4.)
= 1.5
so the interpolated point is C (5, 1.5) and the function returns f(5.0) = 1.5
From the above, you should be able to write your own f() function given xs and ys; and this is exactly what scipy.interpolate.interp1d(xs, ys) does - takes xs and ys and returns an interpolative function, ie
f = scipy.interpolate.interp1d([3, 4, 6], [0, 1, 2])
# f is now a function that you can call, like
f(5.0) # => 1.5
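The steps above can be turned into a hand-written f_from_data. This is a sketch of the same idea, not how interp1d is implemented internally; it assumes xs is sorted in increasing order and rejects values outside the data range.

```python
from bisect import bisect_right

def f_from_data(xs, ys):
    """Return a piecewise-linear interpolating function (xs must be sorted)."""
    def f(x):
        if not xs[0] <= x <= xs[-1]:
            raise ValueError("x is outside the data range")
        # Index of the segment [xs[i], xs[i + 1]] that contains x.
        i = min(bisect_right(xs, x) - 1, len(xs) - 2)
        ax, ay = xs[i], ys[i]
        bx, by = xs[i + 1], ys[i + 1]
        # Same formula as above: y = Ay + (By - Ay) * (Cx - Ax) / (Bx - Ax)
        return ay + (by - ay) * (x - ax) / (bx - ax)
    return f

f = f_from_data([3, 4, 6], [0, 1, 2])
print(f(3.5))  # 0.5
print(f(5.0))  # 1.5
```

The inner function keeps access to xs and ys after f_from_data returns, which is the same "return something callable" pattern that interp1d follows.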

To quote the documentation:
This class returns a function whose call method uses interpolation
to find the value of new points.
Thus, calling the returned function with an x value gives the corresponding interpolated y value.

We Keep Coding
