I have some points and I am trying to fit a curve to them. I know that the scipy.optimize.curve_fit function exists, but I do not understand its documentation, i.e. how to use this function.
My points: np.array([(1, 1), (2, 4), (3, 1), (9, 3)])
Can anybody explain how to do that?
I suggest you start with a simple polynomial fit. scipy.optimize.curve_fit fits a function f, whose form you must already know, to a set of points.
This is a simple degree-3 polynomial fit using numpy.polyfit and poly1d: the first performs a least-squares polynomial fit and the second evaluates the resulting polynomial at new points:
import numpy as np
import matplotlib.pyplot as plt
points = np.array([(1, 1), (2, 4), (3, 1), (9, 3)])
# get x and y vectors
x = points[:,0]
y = points[:,1]
# calculate polynomial
z = np.polyfit(x, y, 3)
f = np.poly1d(z)
# calculate new x's and y's
x_new = np.linspace(x[0], x[-1], 50)
y_new = f(x_new)
plt.plot(x,y,'o', x_new, y_new)
plt.xlim([x[0]-1, x[-1] + 1 ])
plt.show()
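One point worth checking with this data: four points determine a cubic uniquely, so the degree-3 fit passes through every point and its residual says nothing about fit quality. The sketch below, which reuses the same points and introduces the names cubic and line purely for illustration, compares the residuals of a degree-3 and a degree-1 fit:

```python
import numpy as np

points = np.array([(1, 1), (2, 4), (3, 1), (9, 3)])
x, y = points[:, 0], points[:, 1]

# Four points and four coefficients: the cubic interpolates the data.
cubic = np.poly1d(np.polyfit(x, y, 3))
residual_cubic = y - cubic(x)

# A straight line has only two coefficients, so a real residual remains.
line = np.poly1d(np.polyfit(x, y, 1))
residual_line = y - line(x)

print("max |residual|, degree 3:", np.abs(residual_cubic).max())
print("max |residual|, degree 1:", np.abs(residual_line).max())
```

If the goal is smoothing rather than interpolation, a degree lower than the number of points minus one is probably the safer choice.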
You'll first need to separate your numpy array into two separate arrays containing x and y values.
x = [1, 2, 3, 9]
y = [1, 4, 1, 3]
curve_fit also requires a function that provides the type of fit you would like. For instance, a linear fit would use a function like
def func(x, a, b):
    return a*x + b
scipy.optimize.curve_fit(func, x, y) will return a tuple of two numpy arrays: the first contains the values of a and b that best fit your data, and the second is the estimated covariance matrix of those parameters.
Here's an example for a linear fit with the data you provided.
import numpy as np
from scipy.optimize import curve_fit
x = np.array([1, 2, 3, 9])
y = np.array([1, 4, 1, 3])
def fit_func(x, a, b):
    return a*x + b
params = curve_fit(fit_func, x, y)
[a, b] = params[0]
This code will return a = 0.135483870968 and b = 1.74193548387
Here's a plot with your points and the linear fit... which is clearly a bad one, but you can change the fitting function to obtain whatever type of fit you would like.
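As a sketch of how both return values can be used, the snippet below unpacks the tuple directly and derives one-standard-deviation uncertainties from the diagonal of the covariance matrix. The names popt, pcov and perr follow the SciPy documentation's convention and are not part of the original answer:

```python
import numpy as np
from scipy.optimize import curve_fit

x = np.array([1, 2, 3, 9], dtype=float)
y = np.array([1, 4, 1, 3], dtype=float)

def fit_func(x, a, b):
    return a * x + b

# curve_fit returns (optimal parameters, covariance matrix of the parameters)
popt, pcov = curve_fit(fit_func, x, y)
a, b = popt

# Square roots of the diagonal give one-standard-deviation errors on a and b
perr = np.sqrt(np.diag(pcov))

print("a = %.6f +/- %.6f" % (a, perr[0]))
print("b = %.6f +/- %.6f" % (b, perr[1]))
```

For a linear model this should agree with np.polyfit(x, y, 1); curve_fit becomes the more useful tool once fit_func is nonlinear in its parameters.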
I'm working through a quiz where you have to implement some machine learning concepts from scratch. One question asks to implement linear interpolation in python without any external libraries besides numpy. The question states:
Q: Given the input data of points [(1, 2), (3, 4), (5, 6), (8, 8), (7, -1)], fit a line of the form Y = A * X + B using gradient descent. Provide the implementation of your algorithm in the function provided using no external libraries, except for numpy. Then evaluate the linear regression fit on the provided point.
def linear_interpolate_point(data, x):
    """
    Fit a line to the provided data and then evaluate on the provided test point.
    :param data: Collection of points to fit provided as a list of tuples
    :param x: Point to interpolate using your fit line
    :return: The output of your point on the interpolated line
    """
    # fill in function below
I've tried a few different concepts but seem to be stumped on what the question is asking.
Any help is much appreciated.
A linear regression can be formalized as follows: f(x) = x*m + b, where b is a bias weight and m is a weight that is multiplied with the input. Gradient descent optimizes m and b based on your data. So let's set up an initial model based on your data.
import numpy as np
import matplotlib.pyplot as plt
data = np.array([(1, 2), (3, 4), (5, 6), (8, 8), (7, -1)])
x = data[:, 0]
y = data[:, 1]
m = 0
b = 0
With the initial model we can run several optimization steps with gradient descent to optimize the weights. Gradient descent computes the direction in which to change the weights b and m. We further specify how many iterations we allow for optimization (max_iterations) and how far we move in the direction of the negative gradient at each step (eta).
def gradient_descent(x, y, m, b, eta, max_iterations):
    n = x.shape[0]
    for iteration in range(max_iterations):
        y_hat = x * m + b
        error = y - y_hat
        # Compute gradient for m and b
        gradient_m = (-2/n) * x.dot(error)
        gradient_b = (-2/n) * sum(error)
        # Update weights
        m = m - eta*gradient_m
        b = b - eta*gradient_b
    return m, b
m, b = gradient_descent(x, y, m, b, 1e-2, 1000)
Now plotting the data and the model with optimized weights.
plt.scatter(x, y)
plt.plot(np.linspace(1, 8, 100), np.linspace(1, 8, 100) * m + b )
plt.show()
For a complete explanation of how gradient descent works, a dedicated blog post is probably better than Stack Overflow, e.g. Gradient Descent.
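Putting the pieces together, one possible way to fill in the quiz's function stub is sketched below. The keyword arguments eta and max_iterations, their default values, and the test point 4.0 are assumptions added for illustration; the quiz only specifies the data and x parameters.

```python
import numpy as np

def linear_interpolate_point(data, x, eta=1e-2, max_iterations=20000):
    """Fit Y = m*X + b by gradient descent on the mean squared error,
    then evaluate the fitted line at x.

    eta and max_iterations are assumed extras with illustrative defaults.
    """
    pts = np.asarray(data, dtype=float)
    xs, ys = pts[:, 0], pts[:, 1]
    n = xs.shape[0]
    m, b = 0.0, 0.0
    for _ in range(max_iterations):
        error = ys - (m * xs + b)
        # Gradients of mean((ys - (m*xs + b))**2) with respect to m and b
        gradient_m = (-2.0 / n) * xs.dot(error)
        gradient_b = (-2.0 / n) * error.sum()
        m -= eta * gradient_m
        b -= eta * gradient_b
    return m * x + b

data = [(1, 2), (3, 4), (5, 6), (8, 8), (7, -1)]
prediction = linear_interpolate_point(data, 4.0)
print(prediction)
```

With this step size the iteration is stable for the given data, but a larger eta or data on a much larger scale may diverge, so rescaling the inputs or lowering eta might be needed in other settings.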
How do I evaluate a function in n variables in numpy? For simplicity, let n = 3. Consider the following example:
x, y, z = numpy.linspace(0, 1, 100), numpy.linspace(0, 1, 100), numpy.linspace(0, 1, 100)
def F(a, b, c): # Test function in 3 variables
    return a + b + c
F_over_xyz = ... # How to get an array that contains F evaluated at all points in [0;1]³?
I am also having a hard time wrapping my head around which shape the generated array would have.
A general way to get the Cartesian product of any number of arrays is:
np.stack(np.meshgrid(*arrays), axis=-1).reshape(-1, len(arrays))
So you could list all the points in [0;1]³:
import numpy as np
arrays = np.linspace(0, 1, 100), np.linspace(0, 1, 100), np.linspace(0, 1, 100)
list_of_points = np.stack(np.meshgrid(*arrays), axis=-1).reshape(-1, len(arrays))
Shape of list_of_points is (1000000, 3): 1M points, 3 coordinates each.
Then you can calculate sum of coordinates like so:
np.sum(list_of_points, axis=1)
You could also try:
import numpy as np
nn = 4
x,y,z=np.linspace(0,1,nn),np.linspace(0,1,nn),np.linspace(0,1,nn)
def F(a, b, c): # Test function in 3 variables
    return a + b + c
# this creates your grid
xgrid,ygrid,zgrid = np.meshgrid(x,y,z)
# output[i,j,k] will be F(xgrid[i,j,k],ygrid[i,j,k],zgrid[i,j,k])
output = F(xgrid,ygrid,zgrid)
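On the question of which shape the result has: it depends on meshgrid's indexing argument. The sketch below uses axes of deliberately different lengths (2, 3 and 4, chosen only so the axes can be told apart) to show that the default "xy" indexing swaps the first two axes, while indexing="ij" keeps them in argument order:

```python
import numpy as np

def F(a, b, c):  # Test function in 3 variables
    return a + b + c

# Different lengths per axis so that the axis order is visible in the shape
x = np.linspace(0, 1, 2)
y = np.linspace(0, 1, 3)
z = np.linspace(0, 1, 4)

out_xy = F(*np.meshgrid(x, y, z))                 # default indexing="xy"
out_ij = F(*np.meshgrid(x, y, z, indexing="ij"))  # matrix-style indexing

print(out_xy.shape)  # (3, 2, 4): out_xy[j, i, k] == F(x[i], y[j], z[k])
print(out_ij.shape)  # (2, 3, 4): out_ij[i, j, k] == F(x[i], y[j], z[k])
```

When all axes have the same length, as in the question, both variants give a (100, 100, 100) array and the difference only shows up in which index belongs to which variable.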
This is a trivial problem, but I run into it again and again and I am sure there is an elegant solution, which I would like to use.
I do math with numpy and would like to plot lines that are the results of linear algebra calculations. These lines come in the form g = P + z * v, with a point P, a direction vector v and a scalar parameter z.
So I would like to "outsource" the job of finding the start and end point of my line to a clever snippet of Python code, so that the resulting line gets drawn into my 3D plot, honoring the existing dimensions of the plot. E.g. if I plotted a 3D parabola from x = -2 to 2 and z = -3 to 3, and I wanted to draw the line g = (1, 1, 1) + z * (1, 0, 1), it would figure out that it would need to start at (-2,1,-2) and end at (2,1,2).
How could that work?
First, set the projection parameter so that the axes are 3D. Second, give P, v and z compatible shapes so that broadcasting yields the X, Y, Z arrays that the plot method expects as coordinates:
import matplotlib.pyplot as plt
import numpy as np
fig = plt.figure()
ax = fig.add_subplot(111, projection='3d')
P = np.array([1,1,1]).reshape(-1,1)
v = np.array([1,0,1]).reshape(-1,1)
z = np.linspace(-3,3,100)
X, Y, Z = P + v*z
ax.plot(X, Y, Z)
plt.show()
Per comments
reshape(-1, 1) adds an extra dimension, which is required for broadcasting (you can also read a nice tutorial on this topic). Here it is equivalent to reshape(3, 1). A simple case (arr1 = v; arr2 = np.linspace(-3,3,11)) can be visualized like so:
Ending points of a curve g = (1, 1, 1) + z * (1, 0, 1) are at the bounds of interval of z, namely:
g1 = (1, 1, 1) + (-3) * (1, 0, 1) = (-2, 1, -2)
g2 = (1, 1, 1) + 3 * (1, 0, 1) = (4, 1, 4)
Note that z = 1 is needed to get ending point = (2,1,2)
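To find the parameter interval automatically, one option is to intersect the line with the plot's bounding box, one axis at a time. The helper below is a hypothetical sketch (line_param_range is not a matplotlib or numpy function), and the y bounds of -5 to 5 are an assumption, since the question only gives the x and z ranges:

```python
import numpy as np

def line_param_range(P, v, lower, upper):
    """Return (z_lo, z_hi) such that P + z*v stays inside the box
    [lower, upper] for z_lo <= z <= z_hi, or None if the line misses it."""
    P, v = np.asarray(P, dtype=float), np.asarray(v, dtype=float)
    lower, upper = np.asarray(lower, dtype=float), np.asarray(upper, dtype=float)
    z_lo, z_hi = -np.inf, np.inf
    for p, d, lo, hi in zip(P, v, lower, upper):
        if d == 0:
            # The line is parallel to this pair of faces:
            # it is either always inside them or never.
            if not lo <= p <= hi:
                return None
            continue
        z1, z2 = (lo - p) / d, (hi - p) / d
        z_lo = max(z_lo, min(z1, z2))
        z_hi = min(z_hi, max(z1, z2))
    return (z_lo, z_hi) if z_lo <= z_hi else None

P, v = [1, 1, 1], [1, 0, 1]
# x from -2 to 2 and z from -3 to 3 as in the question; y bounds are assumed
z_lo, z_hi = line_param_range(P, v, lower=[-2, -5, -3], upper=[2, 5, 3])
start = np.array(P) + z_lo * np.array(v)
end = np.array(P) + z_hi * np.array(v)
print(start, end)
```

In an actual figure the bounds could be read from the axes with ax.get_xlim(), ax.get_ylim() and ax.get_zlim(), and np.linspace(z_lo, z_hi, 100) would then replace the hardcoded parameter interval.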
import numpy as np
import matplotlib.pyplot as plt
points = np.array([(333, 195.3267), (500, 223.0235), (1000, 264.5914), (2000, 294.8728), (5000, 328.3523), (10000, 345.4688)])
# get x and y vectors
x = points[:,0]
y = points[:,1]
# calculate polynomial
z = np.polyfit(x, y, 3)
f = np.poly1d(z)
# calculate new x's and y's
x_new = np.linspace(x[0], x[-1], 50)
y_new = f(x_new)
plt.plot(x,y,'o', x_new, y_new)
plt.xlim([x[0]-1, x[-1] + 1 ])
plt.show()
So this script creates a polynomial fit for the given data. I want to print the formula of the fitted curve, using the poly text feature or something similar. I am pretty new to Python.
from numpy.polynomial import polynomial as P
c, stats = P.polyfit(x,y,3,full=True)
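Now c is an array of coefficients, ordered from the lowest degree to the highest, and stats is a list whose first entry is the sum of squared residuals (SSR). Print c and stats to see them.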
The examples in the documentation are quite understandable! https://docs.scipy.org/doc/numpy/reference/generated/numpy.polynomial.polynomial.polyfit.html
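To turn the coefficients into a printable formula, one simple approach is to build the string by hand. The sketch below assumes the same data as in the question; the formatting with six significant digits and the name formula are arbitrary choices made for illustration:

```python
import numpy as np
from numpy.polynomial import polynomial as P

points = np.array([(333, 195.3267), (500, 223.0235), (1000, 264.5914),
                   (2000, 294.8728), (5000, 328.3523), (10000, 345.4688)])
x, y = points[:, 0], points[:, 1]

# Coefficients are ordered from the constant term up to x^3
c = P.polyfit(x, y, 3)

# One signed term per coefficient, e.g. "+1.5*x^0 -2e-05*x^1 ..."
formula = " ".join(f"{coef:+.6g}*x^{power}" for power, coef in enumerate(c))
print("y =", formula)
```

With the older numpy.polyfit interface used in the question, print(np.poly1d(z)) gives a similar textual representation, with the coefficients in the opposite order (highest degree first).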
What is a good way to produce a numpy array containing the values of a function evaluated on an n-dimensional grid of points?
For example, suppose I want to evaluate the function defined by
def func(x, y):
    return <some function of x and y>
Suppose I want to evaluate it on a two dimensional array of points with the x values going from 0 to 4 in ten steps, and the y values going from -1 to 1 in twenty steps. What's a good way to do this in numpy?
P.S. This has been asked in various forms on StackOverflow many times, but I couldn't find a concisely stated question and answer. I posted this to provide a concise simple solution (below).
A shorter, faster and clearer answer that avoids meshgrid:
import numpy as np
def func(x, y):
    return np.sin(y * x)
xaxis = np.linspace(0, 4, 10)
yaxis = np.linspace(-1, 1, 20)
result = func(xaxis[:,None], yaxis[None,:])
This is faster and uses less memory for a function such as x^2+y, since then x^2 is computed on an array with only as many elements as the 1D axis (instead of a full 2D grid), and the increase in dimension only happens when the "+" is applied. With meshgrid, x^2 is computed on a 2D array in which essentially every row is the same, which costs considerably more time.
Edit: "x[:,None]" turns x into a 2D array whose second dimension has length 1. This "None" is the same as numpy.newaxis, i.e. "x[:,numpy.newaxis]". The same is done with y, but there the first dimension has length 1.
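As a quick check that the broadcasting approach produces the same values as meshgrid, the following sketch compares the two. The names by_broadcast and by_meshgrid are introduced only for this comparison:

```python
import numpy as np

def func(x, y):
    return np.sin(y * x)

xaxis = np.linspace(0, 4, 10)
yaxis = np.linspace(-1, 1, 20)

# Broadcasting: shapes (10, 1) and (1, 20) combine into (10, 20)
by_broadcast = func(xaxis[:, None], yaxis[None, :])

# meshgrid with indexing="ij" materializes full (10, 20) grids first
X, Y = np.meshgrid(xaxis, yaxis, indexing="ij")
by_meshgrid = func(X, Y)

print(by_broadcast.shape, by_meshgrid.shape)
print(np.allclose(by_broadcast, by_meshgrid))
```

Note that meshgrid's default indexing="xy" would instead give a (20, 10) result, i.e. the transpose of the broadcast version.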
Edit: in 3 dimensions:
def func2(x, y, z):
    return np.sin(y * x)+z
xaxis = np.linspace(0, 4, 10)
yaxis = np.linspace(-1, 1, 20)
zaxis = np.linspace(0, 1, 20)
result2 = func2(xaxis[:,None,None], yaxis[None,:,None],zaxis[None,None,:])
This way you can easily extend to n dimensions if you wish, using as many None or : as you have dimensions. Each : keeps a full dimension, and each None adds an "empty" dimension of length 1. The next example shows how these empty dimensions work: the shape changes when you use None, so the result is a 3D object, but the empty dimensions only get filled when you combine it with an object that actually has entries along those dimensions.
In [1]: import numpy
In [2]: a = numpy.linspace(-1,1,20)
In [3]: a.shape
Out[3]: (20,)
In [4]: a[None,:,None].shape
Out[4]: (1, 20, 1)
In [5]: b = a[None,:,None] # this is a 3D array, but with the first and third dimension being "empty"
In [6]: c = a[:,None,None] # same, but last two dimensions are "empty" here
In [7]: d=b*c
In [8]: d.shape # only the last dimension is "empty" here
Out[8]: (20, 20, 1)
edit: without needing to type the None yourself
def ndm(*args):
    return [x[(None,)*i+(slice(None),)+(None,)*(len(args)-i-1)] for i, x in enumerate(args)]
x2,y2,z2 = ndm(xaxis,yaxis,zaxis)
result3 = func2(x2,y2,z2)
This way, the None-slicing that creates the extra empty dimensions is generated for you: the first argument you give to ndm becomes the first full dimension, the second becomes the second full dimension, and so on. It does the same as the 'hardcoded' None syntax used before.
Short explanation: doing x2, y2, z2 = ndm(xaxis, yaxis, zaxis) is the same as doing
x2 = xaxis[:,None,None]
y2 = yaxis[None,:,None]
z2 = zaxis[None,None,:]
but the ndm method also works for more dimensions, without hardcoding the None-slices over multiple lines as just shown. It also works in numpy versions before 1.8, while numpy.meshgrid only works for more than 2 dimensions if you have numpy 1.8 or higher.
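NumPy also ships a built-in that appears to produce the same open grid as ndm: np.ix_. The sketch below compares the two on the axes used above; treating np.ix_ as a drop-in replacement is a suggestion of this example, not something the original answer claims:

```python
import numpy as np

def func2(x, y, z):
    return np.sin(y * x) + z

def ndm(*args):
    # Place each 1D array along its own axis; all other axes get length 1
    return [x[(None,) * i + (slice(None),) + (None,) * (len(args) - i - 1)]
            for i, x in enumerate(args)]

xaxis = np.linspace(0, 4, 10)
yaxis = np.linspace(-1, 1, 20)
zaxis = np.linspace(0, 1, 20)

via_ndm = func2(*ndm(xaxis, yaxis, zaxis))
via_ix = func2(*np.ix_(xaxis, yaxis, zaxis))

print([a.shape for a in np.ix_(xaxis, yaxis, zaxis)])
print(via_ndm.shape, np.allclose(via_ndm, via_ix))
```

np.ix_ is documented mainly as an indexing helper, but for 1D float inputs it only reshapes them, which is what makes it usable here.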
import numpy as np
def func(x, y):
    return np.sin(y * x)
xaxis = np.linspace(0, 4, 10)
yaxis = np.linspace(-1, 1, 20)
x, y = np.meshgrid(xaxis, yaxis)
result = func(x, y)
I use this function to get X, Y, Z values ready for plotting:
def npmap2d(fun, xs, ys, doPrint=False):
    Z = np.empty(len(xs) * len(ys))
    i = 0
    for y in ys:
        for x in xs:
            Z[i] = fun(x, y)
            if doPrint: print([i, x, y, Z[i]])
            i += 1
    X, Y = np.meshgrid(xs, ys)
    Z.shape = X.shape
    return X, Y, Z
Usage:
def f(x, y):
    # ...some function that can't handle numpy arrays
X, Y, Z = npmap2d(f, np.linspace(0, 0.5, 21), np.linspace(0.6, 0.4, 41))
fig = plt.figure()
ax = fig.add_subplot(111, projection='3d')
ax.plot_wireframe(X, Y, Z)
The same result can be achieved using map:
xs = np.linspace(0, 4, 10)
ys = np.linspace(-1, 1, 20)
X, Y = np.meshgrid(xs, ys)
Z = np.fromiter(map(f, X.ravel(), Y.ravel()), X.dtype).reshape(X.shape)
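Another option for scalar-only functions is np.vectorize, which wraps the function so that it is called once per element and broadcasts like a normal NumPy function. In the sketch below, f is a made-up stand-in for a function that cannot handle arrays (math.sin rejects array input):

```python
import math
import numpy as np

def f(x, y):
    # Hypothetical scalar-only function: math.sin does not accept arrays
    return math.sin(x * y)

xs = np.linspace(0, 4, 10)
ys = np.linspace(-1, 1, 20)
X, Y = np.meshgrid(xs, ys)

# np.vectorize loops over the elements in Python, so this is a
# convenience, not a speed-up over the explicit loop or map
Z = np.vectorize(f)(X, Y)
print(Z.shape)
```

The result has the same shape as X and Y, so it can be passed to plotting functions such as plot_wireframe directly.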
In the case your function actually takes a tuple of d elements, i.e. f((x1,x2,x3,...xd)) (for example the scipy.stats.multivariate_normal function), and you want to evaluate f on N^d combinations/grid of N variables, you could also do the following (2D case):
x=np.arange(-1,1,0.2) # each variable is instantiated N=10 times
y=np.arange(-1,1,0.2)
Z=f(np.dstack(np.meshgrid(x,y))) # result is an NxN (10x10) matrix, whose entries are f((xi,yj))
Here np.dstack(np.meshgrid(x,y)) creates a 10x10 "matrix" (technically a 10x10x2 numpy array) whose entries are the 2-dimensional tuples to be evaluated by f.
My two cents:
import numpy as np
x = np.linspace(0, 4, 10)
y = np.linspace(-1, 1, 20)
X, Y = np.meshgrid(x, y, indexing='ij', sparse=True)
def func(x, y):
    return x*y/(x**2 + y**2 + 4)
# I have defined a function of x and y.
func(X, Y)
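To see what sparse=True changes, the shapes can be inspected directly. This sketch repeats the setup above and adds only the shape checks and the name result:

```python
import numpy as np

x = np.linspace(0, 4, 10)
y = np.linspace(-1, 1, 20)

# sparse=True returns arrays with length-1 axes instead of full grids
X, Y = np.meshgrid(x, y, indexing="ij", sparse=True)

def func(x, y):
    return x * y / (x**2 + y**2 + 4)

result = func(X, Y)
print(X.shape, Y.shape)  # (10, 1) (1, 20)
print(result.shape)      # (10, 20), filled in by broadcasting
```

The full 10x20 grid is only created when func combines X and Y, which is the same effect as indexing the axes with None by hand.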