Working in Python, I am doing some physics calculations over an NxM grid of values, where N goes from 1 to 3108 and M goes from 1 to 2304 (this corresponds to a large image). I need to calculate a value at each and every point in this space, which totals ~7 million calculations. My current approach is painfully slow, and I am wondering if there is a way to complete this task without it taking hours...
My first approach was just to use nested for loops, but this seemed like the least efficient way to solve my problem. I have tried using NumPy's nditer and iterating over each axis individually, but I've read that it doesn't actually speed up my computations. Rather than looping through each axis individually, I also tried making a 3-D array and looping through the outer axis, as shown in Brian's answer to "How can I, in python, iterate over multiple 2d lists at once, cleanly?". Here is the current state of my code:
import numpy as np
x,y = np.linspace(1,3108,num=3108),np.linspace(1,2304,num=2304) # x&y dimensions of image
X,Y = np.meshgrid(x,y,indexing='ij')
all_coords = np.dstack((X,Y)) # moves to 3-D
all_coords = all_coords.astype(int) # sets coords to int
For reference, all_coords looks like this:
array([[[1.000e+00, 1.000e+00],
[1.000e+00, 2.000e+00],
[1.000e+00, 3.000e+00],
...,
[1.000e+00, 2.302e+03],
[1.000e+00, 2.303e+03],
[1.000e+00, 2.304e+03]],
[[2.000e+00, 1.000e+00],
[2.000e+00, 2.000e+00],
[2.000e+00, 3.000e+00],
...,
[2.000e+00, 2.302e+03],
[2.000e+00, 2.303e+03],
[2.000e+00, 2.304e+03]],
and so on. Back to my code...
'''
- below is a function that does a calculation on the full grid using the distance between x0,y0 and each point on the grid.
- the function takes x0,y0 and returns the calculated values across the grid
'''
def do_calc(x0,y0):
    del_x, del_y = X-x0, Y-y0
    np.seterr(divide='ignore', invalid='ignore')
    dmx_ij = (del_x/((del_x**2)+(del_y**2))) # x component
    dmy_ij = (del_y/((del_x**2)+(del_y**2))) # y component
    return dmx_ij,dmy_ij
# now the actual loop
def do_loop():
    dmx,dmy = 0,0
    for pair in all_coords:
        for xi,yi in pair:
            DM = do_calc(xi,yi)
            dmx,dmy = dmx+DM[0],dmy+DM[1]
    return dmx,dmy
As you might see, this code takes an incredibly long time to run... If there is any way to modify my code such that it doesn't take hours to complete, I would be extremely interested in knowing how to do that. Thanks in advance for the help.
Here is a method that gives a 10,000x speedup at N=310, M=230. As the method scales better than the original code I'd expect a factor of more than a million at the full problem size.
The method exploits the shift invariance of the problem. For example, del_x**2 is essentially the same up to shift at each call of do_calc, so we compute it only once.
If the output of do_calc is weighted before summation, the problem is no longer fully translation invariant and this method no longer works. The result can, however, then be expressed as a linear convolution. At N=310, M=230 this still gives a speedup of more than 1,000x, and again the gain will be larger at full problem size.
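To make the shift-invariance argument concrete, here is a minimal 1-D sketch (not the answer's actual code). It assumes a toy kernel 1/d with the singular d = 0 term set to zero, and the names `kernel`, `brute_force` and `prefix_sum` are made up for illustration. Summing the kernel over all source positions is the same as summing a sliding window of one precomputed kernel array, which a cumulative sum delivers in a single pass.

```python
import numpy as np

def kernel(d):
    # 1-D analogue of del_x / del_x**2, with the d == 0 term defined as 0
    d = np.asarray(d, dtype=float)
    out = np.zeros_like(d)
    nonzero = d != 0
    out[nonzero] = 1.0 / d[nonzero]
    return out

def brute_force(n):
    # one kernel evaluation over the whole grid per source point: O(n**2)
    idx = np.arange(n)
    total = np.zeros(n)
    for j in idx:
        total += kernel(idx - j)
    return total

def prefix_sum(n):
    # evaluate the kernel once on every possible offset -(n-1) .. (n-1)
    offsets = np.arange(-(n - 1), n)
    k = kernel(offsets)
    c = np.concatenate(([0.0], np.cumsum(k)))  # c[m] == k[:m].sum()
    # target i sees offsets i-(n-1) .. i, i.e. k[i : i+n]
    i = np.arange(n)
    return c[i + n] - c[i]

print(np.allclose(brute_force(200), prefix_sum(200)))
```

The 2-D version in the answer applies the same idea along both axes, which is why the kernel is built on a (2N-1) x (2M-1) grid.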
Code for original problem
import numpy as np
#N, M = 3108, 2304
N, M = 310, 230
### OP's code
x,y = np.linspace(1,N,num=N),np.linspace(1,M,num=M) # x&y dimensions of image
X,Y = np.meshgrid(x,y,indexing='ij')
all_coords = np.dstack((X,Y)) # moves to 3-D
all_coords = all_coords.astype(int) # sets coords to int
'''
- below is a function that does a calculation on the full grid using the distance between x0,y0 and each point on the grid.
- the function takes x0,y0 and returns the calculated values across the grid
'''
def do_calc(x0,y0):
    del_x, del_y = X-x0, Y-y0
    np.seterr(divide='ignore', invalid='ignore')
    dmx_ij = (del_x/((del_x**2)+(del_y**2))) # x component
    dmy_ij = (del_y/((del_x**2)+(del_y**2))) # y component
    return np.nan_to_num(dmx_ij), np.nan_to_num(dmy_ij)
# now the actual loop
def do_loop():
    dmx,dmy = 0,0
    for pair in all_coords:
        for xi,yi in pair:
            DM = do_calc(xi,yi)
            dmx,dmy = dmx+DM[0],dmy+DM[1]
    return dmx,dmy
from time import time
t = [time()]
### pp's code
x, y = np.ogrid[-N+1:N-1:2j*N - 1j, -M+1:M-1:2j*M - 1J]
den = x*x + y*y
den[N-1, M-1] = 1
xx = x / den
yy = y / den
for zz in xx, yy:
    zz[N:] -= zz[:N-1]
    zz[:, M:] -= zz[:, :M-1]
XX = xx.cumsum(0)[N-1:].cumsum(1)[:, M-1:]
YY = yy.cumsum(0)[N-1:].cumsum(1)[:, M-1:]
t.append(time())
### call OP's code for reference
X_OP, Y_OP = do_loop()
t.append(time())
# make sure results are equal
assert np.allclose(XX, X_OP)
assert np.allclose(YY, Y_OP)
print('pp {}\nOP {}'.format(*np.diff(t)))
Sample run:
pp 0.015251636505126953
OP 149.1642508506775
Code for weighted problem:
import numpy as np
#N, M = 3108, 2304
N, M = 310, 230
values = np.random.random((N, M))
x,y = np.linspace(1,N,num=N),np.linspace(1,M,num=M) # x&y dimensions of image
X,Y = np.meshgrid(x,y,indexing='ij')
all_coords = np.dstack((X,Y)) # moves to 3-D
all_coords = all_coords.astype(int) # sets coords to int
'''
- below is a function that does a calculation on the full grid using the distance between x0,y0 and each point on the grid.
- the function takes x0,y0 and returns the calculated values across the grid
'''
def do_calc(x0,y0, v):
    del_x, del_y = X-x0, Y-y0
    np.seterr(divide='ignore', invalid='ignore')
    dmx_ij = (del_x/((del_x**2)+(del_y**2))) # x component
    dmy_ij = (del_y/((del_x**2)+(del_y**2))) # y component
    return v*np.nan_to_num(dmx_ij), v*np.nan_to_num(dmy_ij)
# now the actual loop
def do_loop():
    dmx,dmy = 0,0
    for pair, vv in zip(all_coords, values):
        for (xi,yi), v in zip(pair, vv):
            DM = do_calc(xi,yi, v)
            dmx,dmy = dmx+DM[0],dmy+DM[1]
    return dmx,dmy
from time import time
from scipy import signal
t = [time()]
x, y = np.ogrid[-N+1:N-1:2j*N - 1j, -M+1:M-1:2j*M - 1J]
den = x*x + y*y
den[N-1, M-1] = 1
xx = x / den
yy = y / den
XX, YY = (signal.fftconvolve(zz, values, 'valid') for zz in (xx, yy))
t.append(time())
X_OP, Y_OP = do_loop()
t.append(time())
assert np.allclose(XX, X_OP)
assert np.allclose(YY, Y_OP)
print('pp {}\nOP {}'.format(*np.diff(t)))
Sample run:
pp 0.12683939933776855
OP 158.35225439071655
I have a grid containing some data in polar coordinates, simulating data obtained from a LIDAR for the SLAM problem. Each row in the grid represents the angle, and each column represents a distance. The values contained in the grid store a weighted probability of the occupancy map for a Cartesian world.
After converting to Cartesian Coordinates, I obtain something like this:
This mapping is intended to work in a FastSLAM application, with at least 10 particles. The performance I am obtaining isn't good enough for a reliable application.
I have tried nested loops, the scipy.ndimage.geometric_transform function, and accessing the grid directly with pre-computed coordinates.
In these examples, I am working with an 800x800 grid.
Nested loops: approx. 300 ms
i = 0
for scan in scans:
    hit = scan < laser.range_max
    if hit:
        d = np.linspace(scan + wall_size, 0, num=int((scan+ wall_size)/cell_size))
    else:
        d = np.linspace(scan, 0, num=int(scan/cell_size))
    # walk along the beam, one sampled distance at a time
    for distance in d:
        x = int(pos[0] + distance * math.cos(angle[i]+pos[2]))
        y = int(pos[1] + distance * math.sin(angle[i]+pos[2]))
        if distance > scan:
            grid_cart[y][x] = grid_cart[y][x] + hit_weight
        else:
            grid_cart[y][x] = grid_cart[y][x] + miss_weight
    i = i + 1
SciPy library (described here): approx. 2500 ms (gives a smoother result since it interpolates the empty cells)
grid_cart = S.ndimage.geometric_transform(weight_mat, polar2cartesian,
    order=0,
    output_shape = (weight_mat.shape[0] * 2, weight_mat.shape[0] * 2),
    extra_keywords = {'inputshape':weight_mat.shape,
        'origin':(weight_mat.shape[0], weight_mat.shape[0])})
def polar2cartesian(outcoords, inputshape, origin):
    """Coordinate transform for converting a polar array to Cartesian coordinates.
    inputshape is a tuple containing the shape of the polar array. origin is a
    tuple containing the x and y indices of where the origin should be in the
    output array."""
    xindex, yindex = outcoords
    x0, y0 = origin
    x = xindex - x0
    y = yindex - y0
    r = np.sqrt(x**2 + y**2)
    theta = np.arctan2(y, x)
    theta_index = np.round((theta + np.pi) * inputshape[1] / (2 * np.pi))
    return (r,theta_index)
Pre-computed indexes: 80 ms
for i in range(0, 144000):
    grid_cart[ys[i]][xs[i]] = grid_polar_1d[i]
I am not very familiar with Python and NumPy, and I feel I am missing an easy and fast way to solve this problem. Are there any other approaches?
Many thanks to you all!
I came across a piece of code that seems to run about 10 times faster (8 ms):
angle_resolution = 1
range_max = 400
a, r = np.mgrid[0:int(360/angle_resolution),0:range_max]
x = (range_max + r * np.cos(a*(2*math.pi)/360.0)).astype(int)
y = (range_max + r * np.sin(a*(2*math.pi)/360.0)).astype(int)
for i in range(0, int(360/angle_resolution)):
    cart_grid[y[i,:],x[i,:]] = polar_grid[i,:]
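The remaining per-angle loop can probably be dropped as well, since NumPy accepts 2-D index arrays directly. The sketch below is an illustration rather than a benchmarked replacement. The function names and the `size` parameter are assumptions, and the angle step is taken to be 360 degrees divided by the number of rows. When several polar cells land on the same Cartesian cell, plain fancy assignment keeps only one of them (NumPy does not promise which), so an accumulating variant with `np.add.at` is shown too.

```python
import numpy as np

def polar_index_maps(n_angles, range_max):
    # integer Cartesian (x, y) index for every (angle, range) cell
    a, r = np.mgrid[0:n_angles, 0:range_max]
    theta = a * (2 * np.pi / n_angles)
    x = (range_max + r * np.cos(theta)).astype(int)
    y = (range_max + r * np.sin(theta)).astype(int)
    return x, y

def polar_to_cart_overwrite(polar_grid, x, y, size):
    # one fancy-indexed assignment instead of a loop over angles
    cart = np.zeros((size, size), dtype=polar_grid.dtype)
    cart[y, x] = polar_grid
    return cart

def polar_to_cart_accumulate(polar_grid, x, y, size):
    # np.add.at sums every contribution, including repeated indices
    cart = np.zeros((size, size), dtype=polar_grid.dtype)
    np.add.at(cart, (y, x), polar_grid)
    return cart

range_max = 400
x, y = polar_index_maps(360, range_max)
polar_grid = np.random.random((360, range_max))
cart_grid = polar_to_cart_overwrite(polar_grid, x, y, 2 * range_max)
```

The index maps only depend on the sensor geometry, so they can be computed once and reused for every scan and every particle.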
I'm currently attempting to implement this algorithm for volume rendering in Python, and am conceptually confused about their method of generating the LH histogram (see section 3.1, page 4).
I have a 3D stack of DICOM images, and calculated its gradient magnitude and the 2 corresponding azimuth and elevation angles with it (which I found out about here), as well as finding the second derivative.
Now, the algorithm is asking me to iterate through a set of voxels, and "track a path by integrating the gradient field in both directions...using the second order Runge-Kutta method with an integration step of one voxel".
What I don't understand is how to use the 2 angles I calculated to integrate the gradient field in said direction. I understand that you can use trilinear interpolation to get intermediate voxel values, but I don't understand how to get the voxel coordinates I want using the angles I have.
In other words, I start at a given voxel position, and want to take a 1 voxel step in the direction of the 2 angles calculated for that voxel (one in the x-y direction, the other in the z-direction). How would I take this step at these 2 angles and retrieve the new (x, y, z) voxel coordinates?
Apologies in advance, as I have a very basic background in Calc II/III, so vector fields/visualization of 3D spaces is still a little rough for me.
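One way to picture the step described above: the two angles define a unit direction vector, and a second-order Runge-Kutta (midpoint) step samples the direction once at the start and once half a step along. The sketch below is only an illustration under stated assumptions. It takes the azimuth as measured in the x-y plane from the x axis and the elevation as measured up from that plane, both in degrees, and `direction_at` is a hypothetical callable that returns the unit direction at a possibly fractional position (for example by trilinear interpolation of the gradient components).

```python
import numpy as np

def direction_from_angles(azimuth_deg, elevation_deg):
    """Unit vector for an azimuth in the x-y plane and an elevation above it."""
    az = np.radians(azimuth_deg)
    el = np.radians(elevation_deg)
    return np.array([np.cos(el) * np.cos(az),
                     np.cos(el) * np.sin(az),
                     np.sin(el)])

def rk2_step(position, direction_at, h=1.0):
    """One midpoint (second-order Runge-Kutta) step of length h.

    Use a negative h to walk in the opposite direction.
    """
    p = np.asarray(position, dtype=float)
    k1 = direction_at(p)                 # direction at the starting point
    k2 = direction_at(p + 0.5 * h * k1)  # direction at the half-step point
    return p + h * k2

# toy field that points along +y everywhere
constant_field = lambda p: direction_from_angles(90.0, 0.0)
print(rk2_step([5, 5, 5], constant_field))
```

Normalizing the raw gradient components (gradx, grady, gradz) gives the same direction without going through the angles, which may be simpler if the components are still available.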
Creating 3D stack of DICOM images:
import os
import numpy as np
import dicom  # older pydicom module name, as implied by dicom.read_file

def collect_data(data_path):
    print("collecting data")
    files = [] # create an empty list
    for dirName, subdirList, fileList in os.walk(data_path):
        for filename in fileList:
            if ".dcm" in filename:
                files.append(os.path.join(dirName,filename))
    # Get reference file
    ref = dicom.read_file(files[0])
    # Load dimensions based on the number of rows, columns, and slices (along the Z axis)
    pixel_dims = (int(ref.Rows), int(ref.Columns), len(files))
    # Load spacing values (in mm)
    pixel_spacings = (float(ref.PixelSpacing[0]), float(ref.PixelSpacing[1]), float(ref.SliceThickness))
    x = np.arange(0.0, (pixel_dims[0]+1)*pixel_spacings[0], pixel_spacings[0])
    y = np.arange(0.0, (pixel_dims[1]+1)*pixel_spacings[1], pixel_spacings[1])
    z = np.arange(0.0, (pixel_dims[2]+1)*pixel_spacings[2], pixel_spacings[2])
    # Row and column directional cosines
    orientation = ref.ImageOrientationPatient
    # This will become the intensity values
    dcm = np.zeros(pixel_dims, dtype=ref.pixel_array.dtype)
    origins = []
    # loop through all the DICOM files
    for filename in files:
        # read the file
        ds = dicom.read_file(filename)
        #get pixel spacing and origin information
        origins.append(ds.ImagePositionPatient) #[0,0,0] coordinates in real 3D space (in mm)
        # store the raw image data
        dcm[:, :, files.index(filename)] = ds.pixel_array
    return dcm, origins, pixel_spacings, orientation
Calculating gradient magnitude:
from scipy.ndimage import sobel

def calculate_gradient_magnitude(dcm):
    print("calculating gradient magnitude")
    gradient_magnitude = []
    gradient_direction = []
    gradx = np.zeros(dcm.shape)
    sobel(dcm,0,gradx)
    grady = np.zeros(dcm.shape)
    sobel(dcm,1,grady)
    gradz = np.zeros(dcm.shape)
    sobel(dcm,2,gradz)
    gradient = np.sqrt(gradx**2 + grady**2 + gradz**2)
    azimuthal = np.arctan2(grady, gradx)
    # elevation above the x-y plane; np.arctan takes a single input, and a
    # second positional argument would be treated as the output array
    elevation = np.arctan2(gradz, np.hypot(gradx, grady))
    azimuthal = np.degrees(azimuthal)
    elevation = np.degrees(elevation)
    return gradient, azimuthal, elevation
Converting to patient coordinate system to get actual voxel position:
def get_patient_position(dcm, origins, pixel_spacing, orientation):
    """
    Image Space --> Anatomical (Patient) Space is an affine transformation
    using the Image Orientation (Patient), Image Position (Patient), and
    Pixel Spacing properties from the DICOM header
    """
    print("getting patient coordinates")
    world_coordinates = np.empty((dcm.shape[0], dcm.shape[1],dcm.shape[2], 3))
    affine_matrix = np.zeros((4,4), dtype=np.float32)
    rows = dcm.shape[0]
    cols = dcm.shape[1]
    num_slices = dcm.shape[2]
    image_orientation_x = np.array([ orientation[0], orientation[1], orientation[2] ]).reshape(3,1)
    image_orientation_y = np.array([ orientation[3], orientation[4], orientation[5] ]).reshape(3,1)
    pixel_spacing_x = pixel_spacing[0]
    # Construct affine matrix
    # Method from:
    # http://nipy.org/nibabel/dicom/dicom_orientation.html
    T_1 = origins[0]
    T_n = origins[num_slices-1]
    affine_matrix[0,0] = image_orientation_y[0] * pixel_spacing[0]
    affine_matrix[0,1] = image_orientation_x[0] * pixel_spacing[1]
    affine_matrix[0,3] = T_1[0]
    affine_matrix[1,0] = image_orientation_y[1] * pixel_spacing[0]
    affine_matrix[1,1] = image_orientation_x[1] * pixel_spacing[1]
    affine_matrix[1,3] = T_1[1]
    affine_matrix[2,0] = image_orientation_y[2] * pixel_spacing[0]
    affine_matrix[2,1] = image_orientation_x[2] * pixel_spacing[1]
    affine_matrix[2,3] = T_1[2]
    affine_matrix[3,3] = 1
    k1 = (T_1[0] - T_n[0])/ (1 - num_slices)
    k2 = (T_1[1] - T_n[1])/ (1 - num_slices)
    k3 = (T_1[2] - T_n[2])/ (1 - num_slices)
    affine_matrix[:3, 2] = np.array([k1,k2,k3])
    for z in range(num_slices):
        for r in range(rows):
            for c in range(cols):
                # homogeneous voxel index; the slice index z goes in the third slot
                vector = np.array([r, c, z, 1]).reshape((4,1))
                result = np.matmul(affine_matrix, vector)
                result = np.delete(result, 3, axis=0)
                result = np.transpose(result)
                world_coordinates[r,c,z] = result
        # print "Finished slice ", str(z)
    # np.save('./data/saved/world_coordinates_3d.npy', str(world_coordinates))
    return world_coordinates
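The triple loop above multiplies one 4-vector at a time, which gets slow for a full volume. As a sketch only, the same affine map can be applied to every voxel index in one matrix product. The function name `voxel_to_world` is made up for illustration, and `affine` is assumed to be a 4x4 matrix built as in get_patient_position.

```python
import numpy as np

def voxel_to_world(affine, shape):
    """Apply a 4x4 affine to every (row, col, slice) index of a volume."""
    r, c, z = np.indices(shape)
    # homogeneous coordinates, shape (rows, cols, slices, 4)
    homogeneous = np.stack([r, c, z, np.ones(shape)], axis=-1)
    # a row vector times affine.T equals affine times the column vector
    world = homogeneous @ affine.T
    return world[..., :3]

# toy affine: 0.5 mm spacing in-plane, 2 mm between slices, shifted origin
affine = np.array([[0.5, 0.0, 0.0, -10.0],
                   [0.0, 0.5, 0.0,  20.0],
                   [0.0, 0.0, 2.0,   5.0],
                   [0.0, 0.0, 0.0,   1.0]])
world = voxel_to_world(affine, (4, 4, 3))
print(world.shape)
```

The result has the same layout as world_coordinates in the loop version, so `world[r, c, z]` is the patient-space position of voxel (r, c, z).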
Now I'm at the point where I want to write this function:
def create_lh_histogram(patient_positions, dcm, magnitude, azimuthal, elevation):
    print("constructing LH histogram")
    # Get 2nd derivative
    second_derivative = gaussian_filter(magnitude, sigma=1, order=1)
    # Determine if voxels lie on boundary or not (thresholding)
    # Still have to code out: let's say the thresholded voxels are in
    # a numpy array called voxels
    #Iterate through all thresholded voxels and integrate gradient field in
    # both directions using 2nd-order Runge-Kutta
    vox_it = np.nditer(voxels, flags=['multi_index'])
    while not vox_it.finished:
        # ???
Say I have a given set of points that I import from a CSV file. My ultimate goal is to take this XY data, convert it to polar form, find a smoothing line, and finally obtain a residual plot (between the polar plot and the smoothed line).
(In the code you will see that I centered the XY data before converting to polar.)
Initially I used the data as-is and found that it seemed to be jumbled, so when I converted to polar and computed the residual, it looked terrible.
I figured that I needed to sort the data beforehand.
A rough comparison of what I need to see when I plot XY and what I actually see is shown in the link:
http://imgur.com/a/DYw85
My code for import and plotting the data is here:
import numpy as np
import matplotlib.pyplot as plt

def cart2pol(x, y):
    rho = np.sqrt(x**2 + y**2)
    phi = np.arctan2(y, x)
    return(rho, phi)

# module-level lists that extract() appends to
block, x, y = [], [], []

def extract():
    filename = "C:/Users/dsfsdf/Dropbox/rthrth/xy50.csv"
    data = np.genfromtxt(filename,delimiter=',',dtype= float)
    block.append(data)
    x.append(block[0][:,0])
    y.append(block[0][:,1])
extract()
x = np.asarray(x)
y = np.asarray(y)
y = y-min(y[0,:]) #To centre it on the y-axis
x = x - ((max(x[0,:])+min(x[0,:]))/2) #To centre it on the x-axis
xf = x[0,:]
yf = y[0,:]
[rf,th] = cart2pol(xf,yf)
thf = th-np.pi / 2
plt.subplot(121)
plt.plot(yf,xf)
plt.xlabel('y')
plt.ylabel('x')
plt.subplot(122)
plt.plot(thf,rf)
plt.xlabel(r'$\theta$')
plt.ylabel('r')
I've tried this code for clockwise sorting that I found on another thread, and it helps quite a lot, but it doesn't completely correct the issue:
def clockwise(x,y):
    cx = np.mean(x)
    cy = np.mean(y)
    a = np.arctan2(y - cy, x - cx)
    order = a.ravel().argsort()
    x = x[order]
    y = y[order]
    return np.vstack([x,y])
Here's the plot you get for that: http://imgur.com/pvxoZHo
What is wrong with this process? Is fixing the data sorting all I need to do? Should I sort the XY data and then go polar, or go polar and then sort?
How can I sort the data so that the plot connects correctly?
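For what it's worth, one possible ordering is to convert to polar first and then sort both arrays by angle, so the line plot of r against theta is drawn left to right. The sketch below is a guess at a workable pipeline, not a diagnosis of the data in the linked plots. It assumes 1-D arrays, centres on the mean of the points (the code above centres differently), and uses a low-degree polynomial fit as a stand-in for whatever smoothing line is wanted. `sort_by_angle` and `residual_from_polyfit` are illustrative names.

```python
import numpy as np

def cart2pol(x, y):
    rho = np.sqrt(x**2 + y**2)
    phi = np.arctan2(y, x)
    return rho, phi

def sort_by_angle(x, y):
    # centre on the mean, go polar, then order both arrays by angle
    rho, phi = cart2pol(x - x.mean(), y - y.mean())
    order = np.argsort(phi)
    return phi[order], rho[order]

def residual_from_polyfit(phi, rho, degree=3):
    # polynomial fit as a simple smoothing line; residual = data - fit
    coeffs = np.polyfit(phi, rho, degree)
    return rho - np.polyval(coeffs, phi)

# shuffled points on a circle of radius 2 centred at (3, -1)
rng = np.random.default_rng(0)
t = rng.permutation(np.linspace(0, 2 * np.pi, 100, endpoint=False))
x, y = 3 + 2 * np.cos(t), -1 + 2 * np.sin(t)
phi, rho = sort_by_angle(x, y)
residual = residual_from_polyfit(phi, rho)
```

Sorting by angle only gives a sensible curve if the chosen centre lies inside the shape, so that each angle corresponds to a single radius.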
Thanks for the help
I have 3D measurement data on a sphere that is very coarse and that I want to interpolate. With the great help from #M4rtini and #HYRY here at stackoverflow I have now been able to generate working code (based on the original example from the RectSphereBivariateSpline example from SciPy).
The test data can be found here: testdata
""" read csv input file, post process and plot 3D data """
import csv
import numpy as np
from mayavi import mlab
from scipy.interpolate import RectSphereBivariateSpline
# user input
nElevationPoints = 17 # needs to correspond with csv file
nAzimuthPoints = 40 # needs to correspond with csv file
threshold = - 40 # needs to correspond with how measurement data was captured
turnTableStepSize = 72 # needs to correspond with measurement settings
resolution = 0.125 # needs to correspond with measurement settings
# read data from file
patternData = np.empty([nElevationPoints, nAzimuthPoints]) # empty buffer
ifile = open('ttest.csv') # need the 'b' suffix to prevent blank rows being inserted
reader = csv.reader(ifile,delimiter=',')
reader.next() # skip first line in csv file as this is only text
for nElevation in range (0,nElevationPoints):
    # azimuth
    for nAzimuth in range(0,nAzimuthPoints):
        patternData[nElevation,nAzimuth] = reader.next()[2]
ifile.close()
# post process
def r(thetaIndex,phiIndex):
    """r(thetaIndex,phiIndex): function in 3D plotting to return positive vector length from patternData[theta,phi]"""
    radius = -threshold + patternData[thetaIndex,phiIndex]
    return radius
#phi,theta = np.mgrid[0:nAzimuthPoints,0:nElevationPoints]
theta = np.arange(0,nElevationPoints)
phi = np.arange(0,nAzimuthPoints)
thetaMesh, phiMesh = np.meshgrid(theta,phi)
stepSizeRad = turnTableStepSize * resolution * np.pi / 180
theta = theta * stepSizeRad
phi = phi * stepSizeRad
# create new grid to interpolate on
phiIndex = np.arange(1,361)
phiNew = phiIndex*np.pi/180
thetaIndex = np.arange(1,181)
thetaNew = thetaIndex*np.pi/180
thetaNew,phiNew = np.meshgrid(thetaNew,phiNew)
# create interpolator object and interpolate
data = r(thetaMesh,phiMesh)
theta[0] += 1e-6 # zero values for theta cause program to halt; phi makes no sense at theta=0
lut = RectSphereBivariateSpline(theta,phi,data.T)
data_interp = lut.ev(thetaNew.ravel(),phiNew.ravel()).reshape((360,180)).T
def rInterp(theta,phi):
    """rInterp(theta,phi): function in 3D plotting to return positive vector length from interpolated patternData[theta,phi]"""
    thetaIndex = theta/(np.pi/180)
    thetaIndex = thetaIndex.astype(int)
    phiIndex = phi/(np.pi/180)
    phiIndex = phiIndex.astype(int)
    radius = data_interp[thetaIndex,phiIndex]
    return radius
# recreate mesh minus one, needed otherwise the below gives index error, but why??
phiIndex = np.arange(0,360)
phiNew = phiIndex*np.pi/180
thetaIndex = np.arange(0,180)
thetaNew = thetaIndex*np.pi/180
thetaNew,phiNew = np.meshgrid(thetaNew,phiNew)
x = (rInterp(thetaNew,phiNew)*np.cos(phiNew)*np.sin(thetaNew))
y = (-rInterp(thetaNew,phiNew)*np.sin(phiNew)*np.sin(thetaNew))
z = (rInterp(thetaNew,phiNew)*np.cos(thetaNew))
# plot 3D data
obj = mlab.mesh(x, y, z, colormap='jet')
obj.enable_contours = True
obj.contour.filled_contours = True
obj.contour.number_of_contours = 20
mlab.show()
Although the code runs, the resulting plot is much different from the non-interpolated data; see the picture for reference.
Also, when running the interactive session, data_interp is much larger in value (>3e5) than the original data (this is around 20 max).
Does anyone have any idea what I may be doing wrong?
I seem to have solved it!
For one thing, I tried to extrapolate, whereas I can only interpolate this scattered data. So the new interpolation grid should only go up to theta = 140 degrees or so.
But the most important change is the addition of the parameter s=900 in the RectSphereBivariateSpline call.
So I now have the following code:
""" read csv input file, post process and plot 3D data """
import csv
import numpy as np
from mayavi import mlab
from scipy.interpolate import RectSphereBivariateSpline
# user input
nElevationPoints = 17 # needs to correspond with csv file
nAzimuthPoints = 40 # needs to correspond with csv file
threshold = - 40 # needs to correspond with how measurement data was captured
turnTableStepSize = 72 # needs to correspond with measurement settings
resolution = 0.125 # needs to correspond with measurement settings
# read data from file
patternData = np.empty([nElevationPoints, nAzimuthPoints]) # empty buffer
ifile = open('ttest.csv') # need the 'b' suffix to prevent blank rows being inserted
reader = csv.reader(ifile,delimiter=',')
reader.next() # skip first line in csv file as this is only text
for nElevation in range (0,nElevationPoints):
    # azimuth
    for nAzimuth in range(0,nAzimuthPoints):
        patternData[nElevation,nAzimuth] = reader.next()[2]
ifile.close()
# post process
def r(thetaIndex,phiIndex):
    """r(thetaIndex,phiIndex): function in 3D plotting to return positive vector length from patternData[theta,phi]"""
    radius = -threshold + patternData[thetaIndex,phiIndex]
    return radius
#phi,theta = np.mgrid[0:nAzimuthPoints,0:nElevationPoints]
theta = np.arange(0,nElevationPoints)
phi = np.arange(0,nAzimuthPoints)
thetaMesh, phiMesh = np.meshgrid(theta,phi)
stepSizeRad = turnTableStepSize * resolution * np.pi / 180
theta = theta * stepSizeRad
phi = phi * stepSizeRad
# create new grid to interpolate on
phiIndex = np.arange(1,361)
phiNew = phiIndex*np.pi/180
thetaIndex = np.arange(1,141)
thetaNew = thetaIndex*np.pi/180
thetaNew,phiNew = np.meshgrid(thetaNew,phiNew)
# create interpolator object and interpolate
data = r(thetaMesh,phiMesh)
theta[0] += 1e-6 # zero values for theta cause program to halt; phi makes no sense at theta=0
lut = RectSphereBivariateSpline(theta,phi,data.T,s=900)
data_interp = lut.ev(thetaNew.ravel(),phiNew.ravel()).reshape((360,140)).T
def rInterp(theta,phi):
    """rInterp(theta,phi): function in 3D plotting to return positive vector length from interpolated patternData[theta,phi]"""
    thetaIndex = theta/(np.pi/180)
    thetaIndex = thetaIndex.astype(int)
    phiIndex = phi/(np.pi/180)
    phiIndex = phiIndex.astype(int)
    radius = data_interp[thetaIndex,phiIndex]
    return radius
# recreate mesh minus one, needed otherwise the below gives index error, but why??
phiIndex = np.arange(0,360)
phiNew = phiIndex*np.pi/180
thetaIndex = np.arange(0,140)
thetaNew = thetaIndex*np.pi/180
thetaNew,phiNew = np.meshgrid(thetaNew,phiNew)
x = (rInterp(thetaNew,phiNew)*np.cos(phiNew)*np.sin(thetaNew))
y = (-rInterp(thetaNew,phiNew)*np.sin(phiNew)*np.sin(thetaNew))
z = (rInterp(thetaNew,phiNew)*np.cos(thetaNew))
# plot 3D data
intensity = rInterp(thetaNew,phiNew)
obj = mlab.mesh(x, y, z, scalars = intensity, colormap='jet')
obj.enable_contours = True
obj.contour.filled_contours = True
obj.contour.number_of_contours = 20
mlab.show()
The resulting plot compares nicely to the original non-interpolated data:
I don't fully understand why s should be set at 900, since the RectSphereBivariateSpline documentation says that s=0 for interpolation. However, when reading the documentation a little further I gain some insight:
Choosing the optimal value of s can be a delicate task. Recommended values for s depend on the accuracy of the data values. If the user has an idea of the statistical errors on the data, she can also find a proper estimate for s. By assuming that, if she specifies the right s, the interpolator will use a spline f(u,v) which exactly reproduces the function underlying the data, she can evaluate sum((r(i,j)-s(u(i),v(j)))**2) to find a good estimate for this s. For example, if she knows that the statistical errors on her r(i,j)-values are not greater than 0.1, she may expect that a good s should have a value not larger than u.size * v.size * (0.1)**2.
If nothing is known about the statistical error in r(i,j), s must be determined by trial and error. The best is then to start with a very large value of s (to determine the least-squares polynomial and the corresponding upper bound fp0 for s) and then to progressively decrease the value of s (say by a factor 10 in the beginning, i.e. s = fp0 / 10, fp0 / 100, ... and more carefully as the approximation shows more detail) to obtain closer fits.
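The rule of thumb quoted above is easy to turn into numbers for the 17 x 40 grid used here. The sketch below only does that arithmetic and does not call SciPy. The helper names are made up for illustration, and reading s=900 backwards as a per-point error is an interpretation of the quoted formula rather than something the documentation states.

```python
import math

def s_upper_bound(n_u, n_v, max_error):
    # quoted rule of thumb: s <= u.size * v.size * max_error**2
    return n_u * n_v * max_error ** 2

def implied_error(n_u, n_v, s):
    # the same relation solved for the per-point error
    return math.sqrt(s / (n_u * n_v))

def trial_values(fp0, steps=4):
    # trial-and-error schedule from the docs: fp0 / 10, fp0 / 100, ...
    return [fp0 / 10 ** k for k in range(1, steps + 1)]

n_elevation, n_azimuth = 17, 40
print(s_upper_bound(n_elevation, n_azimuth, 0.1))  # about 6.8
print(implied_error(n_elevation, n_azimuth, 900))  # about 1.15
```

Read this way, s=900 corresponds to tolerating an error of roughly 1.15 units per measurement point, which is small next to radii of around 20.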