I have polynomials whose coefficients are computed using numerical integration methods. Mathematically, I use the Gram-Schmidt algorithm to produce orthogonal polynomials from a given probability distribution function, which involves integrals in the associated Hilbert space. As a result, some coefficients are approximated as floating-point numbers very close to zero, although I know that their mathematical value is zero. I would like to customize the printing so that these values are not printed at all.
For example, the script:
import numpy as np
p = np.polynomial.Polynomial([1.23456789e-15, 1.0, 1.23456789e-13, 2.0])
print("p = ", p)
produces:
p = 1.23456789e-15 + 1.0·x¹ + 1.23456789e-13·x² + 2.0·x³
but I would like to print:
p = 1.0·x¹ + 2.0·x³
How can I do this?
The best I managed to come up with is formatting the output myself, removing coefficients that are too small and reformatting:
coeffs = p.coef
formatted_coeffs = []
for i, coeff in enumerate(coeffs):
    if abs(coeff) >= 1e-5:
        formatted_coeffs.append("{:.1f}·x^{}".format(coeff, i))
print("p = ", " + ".join(formatted_coeffs))
This code prints what you wanted:
p = 1.0·x^1 + 2.0·x^3
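If you want something slightly more reusable, here is a small helper following the same idea (the name format_poly, the default tolerance, and the format string are my own choices, not part of numpy):

import numpy as np

def format_poly(p, tol=1e-10, fmt="{:.1f}"):
    # keep only coefficients whose magnitude is at least `tol`
    # (note: negative coefficients will still print as "+ -...")
    terms = [fmt.format(c) + ("" if i == 0 else "·x^{}".format(i))
             for i, c in enumerate(p.coef) if abs(c) >= tol]
    return " + ".join(terms) if terms else "0"

p = np.polynomial.Polynomial([1.23456789e-15, 1.0, 1.23456789e-13, 2.0])
print("p =", format_poly(p))   # p = 1.0·x^1 + 2.0·x^3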
I have the following problem:
I want to integrate a 2D array, essentially reversing a gradient operator.
Assume I have a very simple array, as follows:
shape = (60, 60)
sampling = 1
k_mesh = np.meshgrid(np.fft.fftfreq(shape[0], sampling), np.fft.fftfreq(shape[1], sampling))
Then I construct my vector field as a complex-valued array (x component = real part, y component = imaginary part):
k = k_mesh[0] + 1j * k_mesh[1]
So the real part, for example, looks like this:
Now I take the gradient:
k_grad = np.gradient(k, sampling)
I then use Fourier transforms to reverse it, using the following function:
def freq_array(shape, sampling):
    f_freq_1d_y = np.fft.fftfreq(shape[0], sampling[0])
    f_freq_1d_x = np.fft.fftfreq(shape[1], sampling[1])
    f_freq_mesh = np.meshgrid(f_freq_1d_x, f_freq_1d_y)
    f_freq = np.hypot(f_freq_mesh[0], f_freq_mesh[1])
    return f_freq

def int_2d_fourier(arr, sampling):
    freqs = freq_array(arr.shape, sampling)
    k_sq = np.where(freqs != 0, freqs**2, 0.0001)
    k = np.meshgrid(np.fft.fftfreq(arr.shape[0], sampling), np.fft.fftfreq(arr.shape[1], sampling))
    v_int_x = np.real(np.fft.ifft2((np.fft.fft2(arr[1]) * k[0]) / (2*np.pi * 1j * k_sq)))
    v_int_y = np.real(np.fft.ifft2((np.fft.fft2(arr[0]) * k[0]) / (2*np.pi * 1j * k_sq)))
    v_int_fs = v_int_x + v_int_y
    return v_int_fs
k_int = int_2d_fourier(k, sampling)
Unfortunately, the result is not very accurate at the positions where k has an abrupt change, as can be seen in the plot below, which displays a horizontal line profile of k and k_int.
Any ideas how to improve the accuracy? Is there a way to make it exactly the same?
I actually found a solution. The integration itself yields very accurate results.
However, numpy's gradient function calculates second-order-accurate central differences, which means that the gradient itself is already an approximation.
If you replace the problem above with an analytical formula such as a 2D Gaussian, you can calculate the derivative analytically. When integrating this analytically derived gradient, the error is on the order of 10^-10 (depending on the width of the Gaussian, which can lead to aliasing effects).
So long story short: The integration function proposed above works as intended!
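For reference, here is a minimal, self-contained sketch of that check (my own construction, not the code from the question): differentiate a 2D Gaussian analytically, integrate the analytic gradient in Fourier space, and compare with the original function. Grid size, spacing, and sigma are assumed values.

import numpy as np

shape = (60, 60)
sampling = 1.0
sigma = 6.0   # assumed Gaussian width

# grid centred on the array
y, x = np.indices(shape) - np.array(shape).reshape(2, 1, 1) / 2
f = np.exp(-(x**2 + y**2) / (2 * sigma**2))

# analytic partial derivatives of the Gaussian
df_dx = -x / sigma**2 * f
df_dy = -y / sigma**2 * f

# frequency grids matching numpy's axis order (axis 0 = y, axis 1 = x)
ky, kx = np.meshgrid(np.fft.fftfreq(shape[0], sampling),
                     np.fft.fftfreq(shape[1], sampling), indexing='ij')
k_sq = np.where(kx**2 + ky**2 != 0, kx**2 + ky**2, 1.0)   # avoid division by zero at DC

# F[f] = (kx*F[df/dx] + ky*F[df/dy]) / (2*pi*i*(kx^2 + ky^2))
F = (np.fft.fft2(df_dx) * kx + np.fft.fft2(df_dy) * ky) / (2j * np.pi * k_sq)
f_rec = np.real(np.fft.ifft2(F))

# the reconstruction is only defined up to an additive constant
err = (f - f.mean()) - (f_rec - f_rec.mean())
print(np.abs(err).max())   # small; limited by how well the Gaussian decays at the grid edges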
I have data that I want to fit with polynomials. I have 200,000 data points, so I want an efficient algorithm. I want to use the numpy.polynomial package so that I can try different families and degrees of polynomials. Is there some way I can formulate this as a system of equations like Ax=b? Is there a better way to solve this than with scipy.minimize?
import numpy as np
from scipy.optimize import minimize as mini
x1 = np.random.random(2000)
x2 = np.random.random(2000)
y = 20 * np.sin(x1) + x2 - np.sin(30 * x1 - x2 / 10)
def fitness(x, degree=5):
    poly1 = np.polynomial.polynomial.polyval(x1, x[:degree])
    poly2 = np.polynomial.polynomial.polyval(x2, x[degree:])
    return np.sum((y - (poly1 + poly2)) ** 2)
# It seems like I should be able to solve this as a system of equations
# x = np.linalg.solve(np.concatenate([x1, x2]), y)
# minimize the sum of the squared residuals to find the optimal polynomial coefficients
x = mini(fitness, np.ones(10))
print(fitness(x.x))
Your intuition is right. You can solve this as a system of equations of the form Ax = b.
However:
The system is overdetermined and you want the least-squares solution, so you need np.linalg.lstsq instead of np.linalg.solve.
You can't use polyval directly, because you need to separate the coefficients from the powers of the independent variable.
This is how to construct the system of equations and solve it:
A = np.stack([x1**0, x1**1, x1**2, x1**3, x1**4, x2**0, x2**1, x2**2, x2**3, x2**4]).T
xx = np.linalg.lstsq(A, y)[0]
print(fitness(xx)) # test the result with original fitness function
Of course you can generalize over the degree:
A = np.stack([x1**p for p in range(degree)] + [x2**p for p in range(degree)]).T
With the example data, the least squares solution runs much faster than the minimize solution (800µs vs 35ms on my laptop). However, A can become quite large, so if memory is an issue minimize might still be an option.
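As a side note (my addition, not part of the original answer), numpy also ships pseudo-Vandermonde constructors such as polyvander, chebvander, and legvander that build these columns directly, and the same pattern works for the different polynomial families the question mentions:

import numpy as np
from numpy.polynomial import polynomial as P, chebyshev as C

# data as in the question
x1 = np.random.random(2000)
x2 = np.random.random(2000)
y = 20 * np.sin(x1) + x2 - np.sin(30 * x1 - x2 / 10)

degree = 5   # number of coefficients per variable, as in the fitness function

# pseudo-Vandermonde matrices: columns are x**0 .. x**(degree-1)
A_poly = np.hstack([P.polyvander(x1, degree - 1), P.polyvander(x2, degree - 1)])
coeffs_poly, *_ = np.linalg.lstsq(A_poly, y, rcond=None)

# the same pattern with Chebyshev basis columns
A_cheb = np.hstack([C.chebvander(x1, degree - 1), C.chebvander(x2, degree - 1)])
coeffs_cheb, *_ = np.linalg.lstsq(A_cheb, y, rcond=None)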
Update:
Without any knowledge about the internals of the polynomial function, things become tricky, but it is possible to separate terms and coefficients. Here is a somewhat ugly way to construct the system matrix A from a function like polyval:
def construct_A(valfunc, degree):
    columns1 = []
    columns2 = []
    for p in range(degree):
        c = np.zeros(degree)
        c[p] = 1
        columns1.append(valfunc(x1, c))
        columns2.append(valfunc(x2, c))
    return np.stack(columns1 + columns2).T
A = construct_A(np.polynomial.polynomial.polyval, 5)
xx = np.linalg.lstsq(A, y)[0]
print(fitness(xx)) # test the result with original fitness function
I am trying to approximate a function using the Discrete Fourier Transform, being given 2M+1 values of the function.
I've seen a few different expressions for the coefficients and the approximation, but the ones I was originally trying were (12) and (13) in http://www.chebfun.org/docs/guide/guide11.html
(I apologize for the link, but apparently Stack Overflow does not support LaTeX.)
I have one function that computes the approximation given the coefficients, and another that calculates the coefficients (it also returns the first function). I've tested them with some values, but the results weren't close at all. I compared both with numpy.fft.fft: the coefficients didn't match, and passing the FFT output to the first function did not give a good approximation either, so the coefficients aren't the only problem.
Here is my code:
import math
import cmath
import numpy as np

def model(cks, x):
    n = len(cks)
    assert(n % 2 == 1)
    M = (n - 1) // 2
    def soma(s):
        soma = 0
        for i in range(n):
            m = -M + i
            soma += cks[i] * cmath.exp(1j * m * s)
        return soma
    soma = np.vectorize(soma)
    return soma(x)

def fourier(y):
    n = len(y)
    assert(n % 2 == 1)
    M = (n - 1) // 2
    def soma(k):
        soma = 0
        for i in range(n):
            t = 2 * math.pi * i / n
            soma += y[i] * cmath.exp(-1j * k * t)
        return (1 / n) * soma
    cks = np.zeros(n, dtype='complex')
    for i in range(n):
        j = -M + i
        cks[i] = soma(j)
    return cks, (lambda x: model(cks, x))
I'm not sure I understand your code, but it looks to me like you have a forward and an inverse DFT there. One of those doesn't use pi, but it should.
If you're interested in obtaining interpolating samples, you can apply the DFT, pad it with zeros, then compute the inverse DFT (I'm using MATLAB code, because that is what I know, but I think it's fairly easy to read):
f = randn(1,21); % an input signal to be interpolated
F = fft(f); % forward DFT
F = fftshift(F); % shift zero frequency to middle of array
F = [zeros(1,60),F,zeros(1,60)]; % pad with equal number of zeros on both sides
F = ifftshift(F); % shift zero frequency back to first array element
fi = ifft(F) * length(F)/length(f); % inverse DFT, normalize
% `fi` is the interpolated `f`
% plotting
x = linspace(1,length(fi)+1,length(f)+1);
x = x(1:end-1);
plot(x,f,'x');
xi = 1:length(fi);
hold on
plot(xi,fi);
If you feel like you need to implement the DFT and inverse DFT from scratch, know that you can implement the latter using the former.
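For completeness, a tiny sketch of that trick in numpy, using the conjugation identity ifft(X) = conj(fft(conj(X))) / N (the helper name is my own):

import numpy as np

def idft_via_dft(X):
    # inverse DFT expressed through the forward DFT:
    # ifft(X) = conj(fft(conj(X))) / N
    return np.conj(np.fft.fft(np.conj(X))) / len(X)

X = np.fft.fft(np.random.rand(21))
print(np.allclose(idft_via_dft(X), np.fft.ifft(X)))   # True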
If you want to create a continuous function as a summation of shifted sine functions, follow the equation for the Fourier series, with A_n and Φ_n given by the amplitude and phase of the elements of the DFT.
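As an illustration of that last point, here is a rough sketch (my own, assuming a real-valued signal of odd length N) that evaluates the interpolant as a sum of shifted cosines, with the amplitudes A_n and phases Φ_n taken from the DFT coefficients:

import numpy as np

def fourier_series(y, t):
    """Trigonometric interpolant of real samples y, evaluated at (possibly non-integer) t."""
    N = len(y)
    X = np.fft.fft(y)
    result = np.full(np.shape(t), X[0].real / N)   # DC term
    for n in range(1, (N - 1) // 2 + 1):
        A_n = 2 * np.abs(X[n]) / N      # amplitude of the n-th harmonic
        phi_n = np.angle(X[n])          # phase of the n-th harmonic
        result += A_n * np.cos(2 * np.pi * n * t / N + phi_n)
    return result

y = np.random.rand(21)
print(np.allclose(fourier_series(y, np.arange(21)), y))   # reproduces the samples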
I have used numpy's polyfit and obtained a very good fit (using a 7th order polynomial) for two arrays, x and y. My relationship is thus:
y(x) = p[0]*x^7 + p[1]*x^6 + p[2]*x^5 + p[3]*x^4 + p[4]*x^3 + p[5]*x^2 + p[6]*x^1 + p[7]
where p is the polynomial coefficient array output by polyfit.
Is there a way to reverse this easily, so that I have a solution of the form
x(y) = p[0]*y^n + p[1]*y^(n-1) + ... + p[n]*y^0
No, there is no easy way in general. Closed-form solutions are not available for arbitrary polynomials of the seventh order.
Doing the fit in the reverse direction is possible, but only on monotonically varying regions of the original polynomial. If the original polynomial has minima or maxima on the domain you are interested in, then even though y is a function of x, x cannot be a function of y because there is no 1-to-1 relation between them.
If you are (i) OK with redoing the fitting procedure, and (ii) OK with working piecewise on single monotonic regions of your fit at a time, then you could do something like this:
import numpy as np

# generate a random coefficient vector a
degree = 1
a = 2 * np.random.random(degree + 1) - 1

# an assumed true polynomial y(x)
def y_of_x(x, coeff_vector):
    """
    Evaluate a polynomial with coeff_vector and degree len(coeff_vector)-1 using Horner's method.
    Coefficients are ordered by increasing degree, from the constant term at coeff_vector[0],
    to the linear term at coeff_vector[1], to the n-th degree term at coeff_vector[n].
    """
    coeff_rev = coeff_vector[::-1]
    b = 0
    for a in coeff_rev:
        b = b * x + a
    return b

# generate some data
my_x = np.arange(-1, 1, 0.01)
my_y = y_of_x(my_x, a)

# verify that polyfit in the "traditional" direction gives the correct result
# [::-1] b/c polyfit returns coeffs in backwards order rel. to y_of_x()
p_test = np.polyfit(my_x, my_y, deg=degree)[::-1]
print(p_test, a)

# fit the data using polyfit but with y as the independent var, x as the dependent var
p = np.polyfit(my_y, my_x, deg=degree)[::-1]

# define x as a function of y
def x_of_y(yy, a):
    return y_of_x(yy, a)

# compare results
import matplotlib.pyplot as plt
%matplotlib inline
plt.plot(my_x, my_y, '-b', x_of_y(my_y, p), my_y, '-r')
Note: this code does not check for monotonicity but simply assumes it.
By playing around with the value of degree, you should see that the code only works well for all random values of a when degree=1. It occasionally does OK for other degrees, but not when there are lots of minima/maxima. It never works perfectly for degree > 1, because approximating parabolas with square-root functions doesn't always work, and so on.
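If you do want to guard against non-monotonic data, here is a rough sketch of a check that splits the data into monotonic pieces (the helper monotonic_pieces is my own, hypothetical addition, not part of the answer above; it assumes strictly monotonic runs with no flat segments):

import numpy as np

def monotonic_pieces(x, y):
    """Yield index slices over which y is strictly monotonic (x assumed sorted)."""
    sign = np.sign(np.diff(y))
    # positions where the sign of dy changes mark the ends of monotonic runs
    breaks = np.where(np.diff(sign) != 0)[0] + 1
    edges = np.concatenate(([0], breaks, [len(y) - 1]))
    for lo, hi in zip(edges[:-1], edges[1:]):
        if hi - lo >= 1:
            yield slice(lo, hi + 1)

# example: fit x(y) separately on each monotonic piece
# for s in monotonic_pieces(my_x, my_y):
#     p_piece = np.polyfit(my_y[s], my_x[s], deg=degree)[::-1]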
Update: I have modified the Optimize, Eigen, and Solve methods to reflect changes. All now return the "same" vector, up to machine precision. I am still stumped on the Eigen method; specifically, how/why I select that slice of the eigenvector does not make sense. It was just trial and error until the normal matched the other solutions. If anyone can correct or explain what I really should do, or why what I have done works, I would appreciate it.
Thanks Alexander Kramer for explaining why I take a slice; I am only allowed to select one correct answer.
I have a depth image. I want to calculate a crude surface normal for a pixel in the depth image. I consider the surrounding pixels, in the simplest case a 3x3 matrix, fit a plane to these points, and calculate the unit normal vector to this plane.
Sounds easy, but I thought it best to verify the plane-fitting algorithms first. Searching SO and various other sites, I see methods using least squares, singular value decomposition, eigenvectors/eigenvalues, etc.
Although I don't fully understand the maths, I have been able to get the various fragments/examples to work. The problem I am having is that I am getting different answers for each method. I was expecting the various answers to be similar (not exact), but they seem significantly different. Perhaps some methods are not suited to my data, but I am not sure why I am getting different results. Any ideas why?
Here is the Updated output of the code:
LTSQ: [ -8.10792259e-17 7.07106781e-01 -7.07106781e-01]
SVD: [ 0. 0.70710678 -0.70710678]
Eigen: [ 0. 0.70710678 -0.70710678]
Solve: [ 0. 0.70710678 0.70710678]
Optim: [ -1.56069661e-09 7.07106781e-01 7.07106782e-01]
The following code implements five different methods to calculate the surface normal of a plane. The algorithms/code were sourced from various forums on the internet.
import numpy as np
import scipy.optimize

def fitPLaneLTSQ(XYZ):
    # Fits a plane to a point cloud,
    # where Z = aX + bY + c        ---- Eqn #1
    # Rearranging Eqn #1: aX + bY - Z + c = 0
    # gives normal (a, b, -1)
    rows, cols = XYZ.shape
    G = np.ones((rows, 3))
    G[:, 0] = XYZ[:, 0]  # X
    G[:, 1] = XYZ[:, 1]  # Y
    Z = XYZ[:, 2]
    (a, b, c), resid, rank, s = np.linalg.lstsq(G, Z)
    normal = (a, b, -1)
    nn = np.linalg.norm(normal)
    normal = normal / nn
    return normal

def fitPlaneSVD(XYZ):
    rows, cols = XYZ.shape
    # Set up constraint equations of the form AB = 0,
    # where B is a column vector of the plane coefficients
    # in the form b(1)*X + b(2)*Y + b(3)*Z + b(4) = 0.
    p = np.ones((rows, 1))
    AB = np.hstack([XYZ, p])
    u, d, v = np.linalg.svd(AB, 0)
    B = v[3, :]  # Solution is the last row of v (numpy returns V transposed).
    nn = np.linalg.norm(B[0:3])
    B = B / nn
    return B[0:3]

def fitPlaneEigen(XYZ):
    # Works, in this case, but don't understand!
    average = sum(XYZ) / XYZ.shape[0]
    covariant = np.cov(XYZ - average)
    eigenvalues, eigenvectors = np.linalg.eig(covariant)
    want_max = eigenvectors[:, eigenvalues.argmax()]
    (c, a, b) = want_max[3:6]  # Do not understand! Why 3:6? Why (c,a,b)?
    normal = np.array([a, b, c])
    nn = np.linalg.norm(normal)
    return normal / nn

def fitPlaneSolve(XYZ):
    X = XYZ[:, 0]
    Y = XYZ[:, 1]
    Z = XYZ[:, 2]
    npts = len(X)
    A = np.array([[sum(X*X), sum(X*Y), sum(X)],
                  [sum(X*Y), sum(Y*Y), sum(Y)],
                  [sum(X),   sum(Y),   npts]])
    B = np.array([[sum(X*Z), sum(Y*Z), sum(Z)]])
    normal = np.linalg.solve(A, B.T)
    nn = np.linalg.norm(normal)
    normal = normal / nn
    return normal.ravel()

def fitPlaneOptimize(XYZ):
    def residiuals(parameter, f, x, y):
        return [(f[i] - model(parameter, x[i], y[i])) for i in range(len(f))]

    def model(parameter, x, y):
        a, b, c = parameter
        return a*x + b*y + c

    X = XYZ[:, 0]
    Y = XYZ[:, 1]
    Z = XYZ[:, 2]
    p0 = [1., 1., 1.]  # initial guess
    result = scipy.optimize.leastsq(residiuals, p0, args=(Z, X, Y))[0]
    normal = result[0:3]
    nn = np.linalg.norm(normal)
    normal = normal / nn
    return normal

if __name__ == "__main__":
    XYZ = np.array([
        [0, 0, 1],
        [0, 1, 2],
        [0, 2, 3],
        [1, 0, 1],
        [1, 1, 2],
        [1, 2, 3],
        [2, 0, 1],
        [2, 1, 2],
        [2, 2, 3]
    ])
    print("Solve: ", fitPlaneSolve(XYZ))
    print("Optim: ", fitPlaneOptimize(XYZ))
    print("SVD:   ", fitPlaneSVD(XYZ))
    print("LTSQ:  ", fitPLaneLTSQ(XYZ))
    print("Eigen: ", fitPlaneEigen(XYZ))
Optimize
The normal vector of a plane a*x + b*y + c*z = 0 is (a, b, c).
The optimize method finds values for a and b such that a*x + b*y ~ z (where ~ denotes "approximately equals"). It does not use the value of c in the calculation at all. I don't have numpy installed on this machine, but I expect that changing the model to (a*x + b*y)/c should fix this method. It will not give the same result for all data sets: this method will always assume a plane that goes through the origin.
SVD and LTSQ
produce the same results (the difference is on the order of machine precision).
Eigen
The wrong eigenvector is chosen. The eigenvector corresponding to the greatest eigenvalue (lambda = 1.50) is x=[0, sqrt(2)/2, sqrt(2)/2] just as in the SVD and LTSQ.
Solve
I have no clue how this is supposed to work.
The normal vector of the plane in the Eigen solution is the eigenvector for the smallest eigenvalue. Some eigendecomposition implementations sort the eigenvalues and eigenvectors, and some others don't. So in some implementations it is sufficient to take the first (or last) eigenvector as the normal; in other implementations you have to sort them first. On the other hand, the majority of SVD implementations return sorted singular values, so there it is simply the first (or last) vector.
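For illustration, here is a minimal sketch of the approach described above (my own construction, not the poster's fitPlaneEigen): build the 3x3 covariance matrix of the centered points and take the eigenvector belonging to the smallest eigenvalue as the plane normal.

import numpy as np

def plane_normal_eigen(XYZ):
    centered = XYZ - XYZ.mean(axis=0)
    cov = np.cov(centered, rowvar=False)             # 3x3 covariance of x, y, z
    eigenvalues, eigenvectors = np.linalg.eigh(cov)  # eigh returns eigenvalues in ascending order
    normal = eigenvectors[:, 0]                      # column for the smallest eigenvalue
    return normal / np.linalg.norm(normal)

XYZ = np.array([[0, 0, 1], [0, 1, 2], [0, 2, 3],
                [1, 0, 1], [1, 1, 2], [1, 2, 3],
                [2, 0, 1], [2, 1, 2], [2, 2, 3]], dtype=float)
print(plane_normal_eigen(XYZ))   # ~ [0, 0.7071, -0.7071] up to sign, matching the SVD/LTSQ results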