How to best perform a surface integral over 2D point data? - python

I have a data set of 363 x- by 190 y-points with an associated functional value that I would like to integrate over multiple different subregions. I've tried to create a SciPy interp2d function to integrate; however, creating that function, even with linear interpolation, has taken over 2 hours (and is not yet done).
What is a better approach to perform this task?
Some snippets below...
In the convert_RT_to_XY function below, imb/jmb are the r,theta mesh boundaries that I convert to Cartesian boundaries.
Later in my code, I convert the mesh boundaries (imb, jmb) to mesh-center values (imm, jmm), convert those to vectors (iX, iY), convert my function to a vector (iZ), and then attempt to make my interpolation function.
# Convert R, T mesh vectors to X, Y mesh arrays.
def convert_RT_to_XY(imb, jmb):
    R, T = np.meshgrid(imb, jmb)
    X = R * np.cos(np.radians(T*360))
    Y = R * np.sin(np.radians(T*360))
    return X, Y
...
imm = imb[:-1]+np.divide(np.diff(imb),2)
jmm = jmb[:-1]+np.divide(np.diff(jmb),2)
iX, iY = convert_RT_to_XY(imm, jmm)
iX = np.ndarray.flatten(iX)
iY = np.ndarray.flatten(iY)
iZ = np.ndarray.flatten(plot_function)
f = interpolate.interp2d(iX, iY, iZ, kind='linear')
Ultimately, I want to perform:
result = dblquad(f, 10, 30,
                 lambda x: 10,
                 lambda x: 30)

Look into SciPy's RectBivariateSpline. If you are placing your data on a Cartesian grid anyway, it performs much faster than interp2d.
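A minimal sketch of that suggestion, assuming the values are already available on a regular Cartesian grid (the axes and the test surface z = x + y below are made up for illustration; the asker's polar-mesh data would first have to be resampled onto such a grid). RectBivariateSpline also has an integral method, so dblquad may not be needed for rectangular subregions:

```python
import numpy as np
from scipy.interpolate import RectBivariateSpline

# Hypothetical regular grid with the same point counts as in the question
x = np.linspace(0.0, 40.0, 363)
y = np.linspace(0.0, 40.0, 190)

# Made-up test surface; z[i, j] is the value at (x[i], y[j])
z = np.add.outer(x, y)

# kx=ky=1 gives piecewise-linear interpolation, like kind='linear'
spline = RectBivariateSpline(x, y, z, kx=1, ky=1)

# Integrate over the rectangle 10 <= x <= 30, 10 <= y <= 30
result = spline.integral(10, 30, 10, 30)
print(result)  # analytically 16000 for this test surface
```

Note that RectBivariateSpline expects z with shape (len(x), len(y)), which is the transpose of the layout interp2d uses.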

Related

Evaluating convolution of two continuous functions using fftconvolve

I am trying to evaluate the convolution of two continuous functions using scipy.signal.fftconvolve. The scenario of the code is as following:
I am trying to approximate the following double integral:
$$\iint_{C_1(x',y')} \rho(x,y)\,dx\,dy$$
i.e. the integral of rho over the region C_1(x',y'), a circle of radius 1 centered at (x', y'). This can be approximated by the following integral:
$$\iint \rho(x,y)\,K(x-x',\,y-y')\,dx\,dy$$
where the function K is chosen as a continuous integrable function, say, exp(-x^2-y^2), the shape of which is approximately that of a circle of radius 1. If I take a function K'(x,y)=K(-x,-y), then the integral is exactly a convolution of the two functions:
$$(\rho * K')(x',y') = \iint \rho(x,y)\,K'(x'-x,\,y'-y)\,dx\,dy$$
So I try to discretize these two functions into arrays and then carry out convolution.
The following code is written in Julia, and the fftconvolve function is imported using PyCall.jl.
using PyCall
using Interpolations
r = 1
xc = -10:0.05:10
yc = -10:0.05:10
K(x, y) = exp(-(x^2+y^2)/r^2)
rho(x, y) = x^2+y^3 # Try some arbitrary function
ss = pyimport("scipy.signal") # Import scipy.signal module from Python
a = [rho(x,y) for x in xc, y in yc]
b = [K(-x,-y) for x in xc, y in yc]
c = ss.fftconvolve(a,b,mode="same") # zero-paddings beyond boundary, unimportant since rho is near zero beyond the boundary anyway
c_unscaled = interpolate(c', BSpline(Cubic(Line(OnCell()))))
# Adjoint because the array comprehension switched x and y, then interpolate the array
c_scaled = Interpolations.scale(c_unscaled, xc, yc) # Scale the interpolated function w.r.t. xc, yc
print(c_scaled(0.0,0.0)) # The result of the integral for (x', y') = (0, 0)
The result is 628.3185307178969, while the result from numerical integration is 0.785398. What is the problem here?
You could try scipy.signal.convolve, which also convolves two N-dimensional arrays. With method='direct' it does not use the Fast Fourier Transform; the convolution is computed directly from the sums.
So you could try replacing the line where you calculate c with this one:
c = ss.convolve(a,b,mode="same", method='direct')
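Independent of which method is used, one thing may be worth checking: a discrete convolution only sums products of samples, so to approximate the continuous integral the result has to be multiplied by the cell area dx*dy. Below is a minimal Python sketch of the same setup (same grid, rho and K as in the question; the variable names are mine), with that scaling applied:

```python
import numpy as np
from scipy.signal import fftconvolve

r = 1.0
xc = np.linspace(-10.0, 10.0, 401)  # same grid as -10:0.05:10
yc = xc.copy()
step = xc[1] - xc[0]

# indexing="ij" so that axis 0 is x and axis 1 is y
X, Y = np.meshgrid(xc, yc, indexing="ij")
rho = X**2 + Y**3
kernel = np.exp(-(X**2 + Y**2) / r**2)  # K(-x, -y) equals K(x, y) here

# Multiply the raw sums by the cell area to approximate the integral
c = fftconvolve(rho, kernel, mode="same") * step * step

center = len(xc) // 2  # index of x' = 0, y' = 0
value_at_origin = c[center, center]
print(value_at_origin)
```

With the scaling, the value at the origin comes out close to pi/2, which is 628.3185 times 0.05^2. The remaining gap to 0.785398 (pi/4) would then not be a numerical problem: the Gaussian is only a rough stand-in for the indicator function of the unit disk, so the two integrals differ.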

Finding the correct x and y widths of 2D array for Gaussian fit

I'd like to fit my 2D numpy array (image) data to a Gaussian. I've read a lot of examples using scipy.optimize, and I've tried but the fit has never been good -- this is probably because my background is non-zero, and sometimes I have other peaks too. I think it might be easier for me to simply generate a Gaussian that has the parameters of the correct peak. I already have the subpixel centroid coordinates x and y of the peak I want, and can easily get the amplitude of the peak with data[y][x], although I guess I would have to round the coordinates. What I'm stuck on now is the x and y widths. My Gaussian function looks like this:
import numpy as np
def gaussian_func(xy, x0, y0, width_x, width_y, amp): #x0 and y0 are the centroid coordinates
    x = xy[0]
    y = xy[1]
    offset = np.min(data) #should this be a median value of the background instead?
    a = 1/(2*width_x**2)
    c = 1/(2*width_y**2)
    exp_term = a*(x-x0)**2 + c*(y-y0)**2
    return (offset + amp * np.exp(-exp_term)).ravel()
x, y = np.arange(0, np.shape(data)[1], 1), np.arange(0, np.shape(data)[0], 1)
xx, yy = np.meshgrid(x, y)
gaussian = gaussian_func((xx, yy), x0, y0, width_x, width_y, amp)
gaussian = np.reshape(gaussian, np.shape(data))
So I'm basically just confused about what to insert for width_x and width_y. I know these terms are supposed to be interchangeable with the standard deviations in x and y, but when I tried simply using np.std(data), I got bad results. Do the widths correspond to the actual physical widths of the peak? If so, how do I find those? Thanks!
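For what it's worth, np.std(data) measures the spread of the pixel values, not the spatial extent of the peak, which would explain the bad results. One possible way to get spatial widths is from intensity-weighted second moments in a window around the known centroid. The sketch below is only that: the helper name estimate_widths, the half_size window parameter, and the use of the median as the background level are all assumptions, and the demo image is synthetic.

```python
import numpy as np

def estimate_widths(data, x0, y0, half_size=10):
    """Estimate sigma_x, sigma_y of a peak from second moments near (x0, y0)."""
    data = np.asarray(data, dtype=float)
    offset = np.median(data)  # assumption: the median approximates the background
    xc, yc = int(round(x0)), int(round(y0))
    y_lo, y_hi = max(yc - half_size, 0), min(yc + half_size + 1, data.shape[0])
    x_lo, x_hi = max(xc - half_size, 0), min(xc + half_size + 1, data.shape[1])
    # Background-subtracted window; negative values are clipped to zero
    window = np.clip(data[y_lo:y_hi, x_lo:x_hi] - offset, 0.0, None)
    yy, xx = np.mgrid[y_lo:y_hi, x_lo:x_hi]
    total = window.sum()
    width_x = np.sqrt((window * (xx - x0) ** 2).sum() / total)
    width_y = np.sqrt((window * (yy - y0) ** 2).sum() / total)
    return width_x, width_y

# Synthetic, noise-free check: background 5, amplitude 100, sigma_x=3, sigma_y=2
yy, xx = np.mgrid[0:101, 0:101]
demo = 5.0 + 100.0 * np.exp(-((xx - 50.3) ** 2 / (2 * 3.0**2)
                              + (yy - 40.7) ** 2 / (2 * 2.0**2)))
wx, wy = estimate_widths(demo, 50.3, 40.7, half_size=15)
print(wx, wy)
```

The window should cover the whole peak but exclude neighbouring peaks, and noise in the wings tends to inflate moment-based widths, so these values are probably best used as starting guesses. For a Gaussian, the full width at half maximum is 2*sqrt(2*ln 2), about 2.355, times the standard deviation, which gives another way to relate a measured physical width to width_x and width_y.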

Joint CDF in numpy

Computing a CDF (Cumulative Distribution Function) in numpy is fairly straightforward, but now I want to move to multiple dimensions, using 3 dimensions of data, and then be able to easily compute the corresponding X, Y, Z for, say, the Nth percentile.
I'm finding that the documentation out there is not the easiest to navigate, so pointers would be useful. I'm trying to use what is already out there and not re-invent the wheel.
Here is how I do it in 1D:
h, x = np.histogram(np.array(full_data), bins=10, density=True)
dx = x[1] - x[0]
f1 = np.cumsum(h)*dx
Then plot:
plt.plot(x[1:], f1)
In 3D it will look like:
full_data = [[1,2,4], [2,3,4], ...]
Any suggestions for something more pythonic and elegant before I kludge something together?
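One possible approach, sketched here with made-up random data in place of full_data: build the 3D histogram with np.histogramdd, normalize it to a probability mass, and take a cumulative sum along each axis in turn.

```python
import numpy as np

# Made-up sample data standing in for full_data: 1000 points in 3D
rng = np.random.default_rng(0)
full_data = rng.random((1000, 3))

# h has shape (10, 10, 10); edges is a list of three bin-edge arrays
h, edges = np.histogramdd(full_data, bins=10)

# Probability mass per bin
pmf = h / h.sum()

# Joint CDF: cdf[i, j, k] = P(X <= edges[0][i+1], Y <= edges[1][j+1], Z <= edges[2][k+1])
cdf = pmf.cumsum(axis=0).cumsum(axis=1).cumsum(axis=2)

print(cdf[-1, -1, -1])  # total probability
```

Keep in mind that in more than one dimension a given percentile of the joint CDF corresponds to a surface of (X, Y, Z) combinations rather than a single point, so some extra convention (for example, percentiles of each marginal) is needed to pick one.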

Efficient way to map a function over numpy matrix?

I'm trying to reproduce an image-processing algorithm, but I have failed to match the performance of PIL in Python.
For simplicity, take interpolation as an example. Suppose we have a matrix Im of luminance values. For any point (x,y), we can compute the interpolated value as g(x,y) = (f(floor(x),floor(y)) + f(floor(x)+1,floor(y)) + f(floor(x),floor(y)+1) + f(floor(x)+1,floor(y)+1)) / 4
Here is part of the code. It takes tens of seconds to resize an image, which is inefficient. Also, it is not an element-wise mapping function: it involves the whole matrix, or more precisely, the neighbouring points of each point.
im = np.matrix(...) #A 512*512 image
axis = [x/(2047/511.) for x in xrange(2048)]
axis = [(x,y) for x in axis for y in axis] #resize to 2048*2048
im_temp = []
for (x, y) in axis:
    (l, k) = np.floor((x, y)).astype(int)
    a, b = x-l, y-k
    temp = (1-a)*(1-b)*im[l+1,k+1] + a*(1-b)*im[l+2,k+1] + (1-a)*b*im[l+1,k+2] + a*b*im[l+2,k+2]
    im_temp.append(temp)
np.asmatrix(im_temp).reshape((2048,2048)).astype(int)
How can we implement this algorithm more efficiently, without the two for loops?
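A vectorized version is one option: compute the integer indices and fractional weights once per axis, then let NumPy broadcasting and fancy indexing combine them for all output pixels at once. The sketch below uses the weighted (bilinear) form from the code above; the function name bilinear_resize is made up, and it assumes a plain unpadded 2D array with indices clamped at the last row and column, rather than the +1 offsets used in the question.

```python
import numpy as np

def bilinear_resize(im, new_size):
    """Resize a 2D array to (new_size, new_size) by bilinear interpolation."""
    im = np.asarray(im, dtype=float)
    h, w = im.shape
    # Sample positions in source coordinates along each axis
    xs = np.linspace(0, h - 1, new_size)
    ys = np.linspace(0, w - 1, new_size)
    # Lower neighbour index, clamped so that index + 1 stays in bounds
    l = np.minimum(np.floor(xs).astype(int), h - 2)
    k = np.minimum(np.floor(ys).astype(int), w - 2)
    # Fractional weights, shaped for broadcasting to (new_size, new_size)
    a = (xs - l)[:, None]
    b = (ys - k)[None, :]
    L, K = l[:, None], k[None, :]
    return ((1 - a) * (1 - b) * im[L, K] + a * (1 - b) * im[L + 1, K]
            + (1 - a) * b * im[L, K + 1] + a * b * im[L + 1, K + 1])

# Small check on a linear ramp, which bilinear interpolation reproduces exactly
demo = np.add.outer(2.0 * np.arange(8), 3.0 * np.arange(8))
out = bilinear_resize(demo, 29)
print(out.shape)
```

For a 512 by 512 input, bilinear_resize(im, 2048) does the same amount of arithmetic as the loop, but inside NumPy rather than the Python interpreter.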

SciPy interp2D for pairs of coordinates

I'm using scipy.interpolate.interp2d to create an interpolation function for a surface. I then have two arrays of real data that I want to calculate interpolated points for. If I pass the two arrays to the interp2d function I get an array of all the points, not just the pairs of points.
My solution to this is to zip the two arrays into a list of coordinate pairs and pass this to the interpolation function in a loop:
f_interp = interpolate.interp2d(X_table, Y_table, Z_table, kind='cubic')
co_ords = list(zip(X, Y))  # list() so that len() and indexing also work on Python 3
out = []
for i in range(len(co_ords)):
    X = co_ords[i][0]
    Y = co_ords[i][1]
    value = f_interp(X,Y)
    out.append(float(value))
My question is, is there a better (more elegant, Pythonic?) way of achieving the same result?
Passing all of your points at once will probably be quite a lot faster than looping over them in Python. You could use scipy.interpolate.griddata:
Z = interpolate.griddata((X_table, Y_table), Z_table, (X, Y), method='cubic')
or one of the scipy.interpolate.BivariateSpline classes, e.g. SmoothBivariateSpline:
itp = interpolate.SmoothBivariateSpline(X_table, Y_table, Z_table)
# NB: choose grid=False to get an (n,) rather than an (n, n) output
Z = itp(X, Y, grid=False)
CloughTocher2DInterpolator also works in a similar fashion, but without the grid=False parameter (it always returns a 1D output).
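A small self-contained sketch of the griddata route, using made-up scattered data on the plane z = 2x + 3y (the variable names mirror the ones above, but the data is synthetic). It uses method="linear" so the result can be checked exactly; method="cubic" is called the same way.

```python
import numpy as np
from scipy.interpolate import griddata

rng = np.random.default_rng(0)

# Synthetic scattered table; the corners are included so every query point
# lies inside the convex hull (griddata returns NaN outside it)
corners = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
pts = np.vstack([corners, rng.random((200, 2))])
X_table, Y_table = pts[:, 0], pts[:, 1]
Z_table = 2 * X_table + 3 * Y_table

# Paired query coordinates: one output value per (X[i], Y[i])
X = rng.uniform(0.2, 0.8, 50)
Y = rng.uniform(0.2, 0.8, 50)

Z = griddata((X_table, Y_table), Z_table, (X, Y), method="linear")
print(Z.shape)  # one value per coordinate pair
```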
Try *args and tuple packing/unpacking
points = zip(X, Y)
out = []
for p in points:
    value = f_interp(*p)
    out.append(float(value))
or just
points = zip(X, Y)
out = [float(f_interp(*p)) for p in points]
or just
out = [float(f_interp(*p)) for p in zip(X, Y)]
As a side note, the "magic star" allows zip to be its own inverse!
points = zip(x, y)
x, y = zip(*points)
For one, you can do
for Xtmp,Ytmp in zip(X,Y):
    ...
in your loop. Or even better, just
out = [float(f_interp(XX,YY)) for XX,YY in zip(X,Y)]
replacing the loop.
On a different note, I suggest using interpolate.griddata instead. It tends to behave much better than interp2d, and it accepts arbitrary-shaped points as input. As you've seen, interp2d interpolators will only return you values on a mesh.
Inspired by this thread, where someone recommends using the internal weights of the interp2d function, I've created the following wrapper. It has exactly the same interface as interp2d, but the interpolant evaluates pairs of inputs and returns a NumPy array with the same shape as its inputs. Performance should be better than a for loop or list comprehension, but when evaluated on a grid it will be slightly outperformed by SciPy's interp2d.
import numpy as np
import scipy.interpolate as si

def interp2d_pairs(*args, **kwargs):
    """Same interface as interp2d, but the returned interpolant evaluates its inputs as pairs of values."""
    # Build the interp2d object once, so the spline is not refit on every call
    f = si.interp2d(*args, **kwargs)
    # Internal function that evaluates pairs of values; the output has the same shape as the input
    def interpolant(x, y):
        x, y = np.asarray(x), np.asarray(y)
        return si.dfitpack.bispeu(f.tck[0], f.tck[1], f.tck[2], f.tck[3], f.tck[4], x.ravel(), y.ravel())[0].reshape(x.shape)
    return interpolant

# Create the interpolant (same interface as interp2d)
f = interp2d_pairs(X, Y, Z, kind='cubic')
# Evaluate the interpolant on each pair of x and y values
z = f(x, y)
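If the table is on a regular grid, a possibly simpler route that avoids SciPy internals is RectBivariateSpline, whose ev method evaluates the spline at coordinate pairs. This may also matter because interp2d has been deprecated and, as far as I know, removed in recent SciPy releases. A minimal sketch with a made-up table (all names and values below are illustrative):

```python
import numpy as np
from scipy.interpolate import RectBivariateSpline

# Made-up regular table; z_table[i, j] is the value at (x_table[i], y_table[j])
x_table = np.linspace(0.0, 4.0, 9)
y_table = np.linspace(0.0, 2.0, 5)
z_table = np.add.outer(2 * x_table, 3 * y_table)

# Bicubic interpolating spline, comparable to kind='cubic'
spline = RectBivariateSpline(x_table, y_table, z_table, kx=3, ky=3)

# Paired coordinates: one result per (x[i], y[i]), not a full grid
x = np.array([0.5, 1.5, 2.5, 3.5])
y = np.array([0.25, 0.75, 1.25, 1.75])
z = spline.ev(x, y)
print(z.shape)
```

Calling spline(x, y, grid=False) is equivalent to spline.ev(x, y). Note that the z array here has shape (len(x_table), len(y_table)), the transpose of what interp2d expects.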
