I used fsolve to find the zeros of an example sine function, and it worked great. However, I want to do the same with a dataset: two lists of floats, later converted to arrays with numpy.asarray(), containing the (x,y) values, namely 't' and 'ys'.
Although I found some related questions, I failed to implement the code provided in them, as I try to show here. Our arrays of interest are stored in a 2D list, data[i][j], where 'i' corresponds to a variable (e.g. data[0]==t==time==x values) and 'j' indexes the values of said variable along the x axis (e.g. data[1]==Force). Keep in mind that each data[i] is an array of floats.
Could you offer example code that takes two inputs (the two mentioned arrays) and returns their intersection points with a defined function (e.g. 'y=0')?
I include some testing I did based on the other related question (#HYRY's answer).
I do not think it is relevant, but I'm using Spyder through Anaconda.
Thanks in advance!
"""
Following the answer provided by #HYRY in the 'related questions' (see link above).
At this point of the code, the variable 'data' has already been defined as stated before.
"""
from scipy.optimize import fsolve

def tfun(x):
    return data[0][x]

def yfun(x):
    return data[14][x]

def findIntersection(fun1, fun2, x0):
    return [fsolve(lambda x:fun1(x)-fun2(x, y), x0) for y in range(1, 10)]

print findIntersection(tfun, yfun, 0)
This raises the following error:
File "E:/Data/Anaconda/[...]/00-Latest/fsolvestacktest001.py", line 36, in tfun
return data[0][x]
IndexError: arrays used as indices must be of integer (or boolean) type
The full output is as follows:
Traceback (most recent call last):
File "<ipython-input-16-105803b235a9>", line 1, in <module>
runfile('E:/Data/Anaconda/[...]/00-Latest/fsolvestacktest001.py', wdir='E:/Data/Anaconda/[...]/00-Latest')
File "C:\Anaconda\lib\site-packages\spyderlib\widgets\externalshell\sitecustomize.py", line 580, in runfile
execfile(filename, namespace)
File "E:/Data/Anaconda/[...]/00-Latest/fsolvestacktest001.py", line 44, in <module>
print findIntersection(tfun, yfun, 0)
File "E:/Data/Anaconda/[...]/00-Latest/fsolvestacktest001.py", line 42, in findIntersection
return [fsolve(lambda x:fun1(x)-fun2(x, y), x0) for y in range(1, 10)]
File "C:\Anaconda\lib\site-packages\scipy\optimize\minpack.py", line 140, in fsolve
res = _root_hybr(func, x0, args, jac=fprime, **options)
File "C:\Anaconda\lib\site-packages\scipy\optimize\minpack.py", line 209, in _root_hybr
ml, mu, epsfcn, factor, diag)
File "E:/Data/Anaconda/[...]/00-Latest/fsolvestacktest001.py", line 42, in <lambda>
return [fsolve(lambda x:fun1(x)-fun2(x, y), x0) for y in range(1, 10)]
File "E:/Data/Anaconda/[...]/00-Latest/fsolvestacktest001.py", line 36, in tfun
return data[0][x]
IndexError: arrays used as indices must be of integer (or boolean) type
You can 'convert' a dataset (arrays) to a continuous function by means of interpolation. scipy.interpolate.interp1d is a factory that provides you with the resulting function, which you can then use with your root-finding algorithm.
--edit-- an example for computing an intersection of sin and cos from 20 samples (I've used cubic spline interpolation, as piecewise linear gives warnings about the smoothness):
>>> import numpy, scipy.optimize, scipy.interpolate
>>> x = numpy.linspace(0,2*numpy.pi, 20)
>>> x
array([ 0. , 0.33069396, 0.66138793, 0.99208189, 1.32277585,
1.65346982, 1.98416378, 2.31485774, 2.64555171, 2.97624567,
3.30693964, 3.6376336 , 3.96832756, 4.29902153, 4.62971549,
4.96040945, 5.29110342, 5.62179738, 5.95249134, 6.28318531])
>>> y1sampled = numpy.sin(x)
>>> y2sampled = numpy.cos(x)
>>> y1int = scipy.interpolate.interp1d(x,y1sampled,kind='cubic')
>>> y2int = scipy.interpolate.interp1d(x,y2sampled,kind='cubic')
>>> scipy.optimize.fsolve(lambda x: y1int(x) - y2int(x), numpy.pi)
array([ 3.9269884])
>>> scipy.optimize.fsolve(lambda x: numpy.sin(x) - numpy.cos(x), numpy.pi)
array([ 3.92699082])
Note that interpolation only gives you 'guesses' about what the data should be between the sampling points, and in general there is no way to tell how good these guesses are (in my example, though, you can see it's a pretty good estimate).
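To address the original request more directly, here is a minimal sketch of a helper that takes the two arrays and returns where the data cross a constant level such as y=0. It assumes 't' is strictly increasing. The name find_crossings and the sample data are made up for illustration, and brentq is swapped in for fsolve because every sign change between neighbouring samples already provides a bracket for the root.

```python
import numpy as np
from scipy.interpolate import interp1d
from scipy.optimize import brentq

def find_crossings(t, ys, level=0.0, kind="linear"):
    """Return the t values where the interpolated data cross `level`.

    Hypothetical helper: assumes t is strictly increasing.
    """
    t = np.asarray(t, dtype=float)
    shifted = np.asarray(ys, dtype=float) - level
    f = interp1d(t, shifted, kind=kind)
    # A sign change between consecutive samples brackets one crossing.
    sign = np.sign(shifted)
    brackets = np.where(sign[:-1] * sign[1:] < 0)[0]
    roots = [brentq(lambda x: float(f(x)), t[i], t[i + 1]) for i in brackets]
    # Samples lying exactly on the level are crossings too.
    roots.extend(t[shifted == 0.0])
    return np.sort(np.array(roots, dtype=float))

# Synthetic data standing in for data[0] (t) and data[14] (ys).
t = np.linspace(0.5, 6.0, 200)
ys = np.sin(t)
roots = find_crossings(t, ys)
print(roots)
```

Passing kind="cubic" gives a smoother interpolant. The brackets stay valid because the interpolant passes through the samples.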
I'm trying to code my own logistic regression, and compare different methods of maximizing the log-likelihood. Using the Newton-CG method, I get the error message "ValueError: setting an array element with a sequence". Reading around, it seems this error arises if the function to be minimized returns a non-scalar, but that is not the case here. I need the three methods given below to give (approximately) the same result, but when running on my real data, one does not converge, another gives a worse LL than the initial guess, and the third does not run at all.
Why do I get the ValueError message and how can I fix it?
My code (with dummy data, the real data is ~100 measurements) is as follows:
import numpy as np
from numpy import linalg
import scipy
from scipy.optimize import minimize

def CalcLL(beta,xinlist,yinlist):
    LL=0.0
    ncol=len(beta)
    pi=FindPi(xinlist,beta.reshape(ncol,1))
    for i in range(len(yinlist)):
        LL=LL+np.where(yinlist[i]==1,np.log(pi[i]),np.log(1-pi[i]))
    return -LL

def Jacobian(beta,xinlist,yinlist):
    ncol=len(beta)
    nrow=np.shape(xinlist)[0]
    pi=FindPi(xinlist,beta.reshape(ncol,1))
    Jac=np.transpose(np.matrix(yinlist-pi))*np.matrix(xinlist)
    return Jac

def Hessian(beta,xinlist,yinlist):
    ncol=len(beta)
    nrow=np.shape(xinlist)[0]
    pi=FindPi(xinlist,beta.reshape(ncol,1))
    W=FindW(pi)
    Hes=np.matrix(np.transpose(xinlist))*(np.matrix(W)*np.matrix(xinlist))
    return Hes

def FindPi(xinlist,beta):
    rows=np.shape(xinlist)[0]# Number of rows in x_new
    cols=np.shape(xinlist)[1]# Number of columns in x_new
    expon=np.dot(xinlist,beta)
    expon=np.array(expon).reshape(rows,1)
    pi=np.exp(expon)/(1+np.exp(expon))
    return pi

def FindW(pi):
    W=np.zeros(len(pi)*len(pi)).reshape(len(pi),len(pi))
    for i in range(len(pi)):
        W[i,i]=float(pi[i]*(1-pi[i]))
    return W

xinlist=np.matrix([[1,1],[0,1],[1,1],[1,1],[1,1],[0,1],[0,1],[1,1],[1,1],[0,1]])
yinlist=np.transpose(np.matrix([0,0,0,0,0,1,1,1,1,1]))
ncol=np.shape(xinlist)[1]
beta1=np.zeros(ncol).reshape(ncol,1) # Initial guess for parameter values
limit=0.000001 # selfwritten Newton-Raphson method
iter_i=limit+1
while iter_i>limit:
    Hes=Hessian(beta1,xinlist,yinlist)
    Jac=np.transpose(Jacobian(beta1,xinlist,yinlist))
    root_diff=np.array(linalg.inv(Hes)*Jac)
    beta1=beta1+root_diff
    iter_i=np.sum(root_diff*root_diff)
print "When running self-written algorithm, the log-likelihood is",-CalcLL(beta1,xinlist,yinlist)

beta2=np.zeros(ncol).reshape(ncol,1)
res=minimize(CalcLL,beta2,args=(xinlist,yinlist),method='Nelder-Mead',options={'xtol':1e-8,'disp':True,'maxiter':10000})
beta2=res.x
print "The log-likelihood using Nelder-Mead is", -CalcLL(beta2,xinlist,yinlist)

beta3=np.zeros(ncol).reshape(ncol,1)
res=minimize(CalcLL,beta3,args=(xinlist,yinlist),method='Newton-CG',jac=Jacobian,hess=Hes,options={'xtol':1e-8,'disp':True})
beta3=res.x
print "The log-likelihood using Newton-CG is", -CalcLL(beta3,xinlist,yinlist)
EDIT:
The errorstack is as follows:
Traceback (most recent call last):
File "MyLogisticRegression2.py", line 62, in <module>
res=minimize(CalcLL,beta3,args=(xinlist,yinlist),method='Newton-CG',jac=Jacobian,hess=Hes,options={'xtol':1e-8,'disp':True})
File "C:\Python27\lib\site-packages\scipy\optimize\_minimize.py", line 447, in minimize
**options)
File "C:\Python27\lib\site-packages\scipy\optimize\optimize.py", line 2393, in _minimize_newtoncg
eta=numpy.min([0.5, numpy.sqrt(maggrad)])
File "C:\Python27\lib\site-packages\numpy\core\fromnumeric.py", line 2393, in amin
out=out, **kwargs)
File "C:\Python27\lib\site-packages\numpy\core\_methods.py", line 29, in _amin
return umr_minimum(a,axis,None,out,keepdims)
ValueError: setting an array element with a sequence
I found out the problem arose from the beta arrays having shape (2,1) instead of (2,), and likewise for the Jacobian. Reshaping these two solved the problem.
Apparently the Newton-CG solver needs 1-D arrays for both the parameter vector and the Jacobian.
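A minimal sketch of what the reshaped version could look like, written with plain ndarrays instead of np.matrix. The function names here are illustrative and not taken from the question. Note that since the objective is the negative log-likelihood, its gradient is X^T (p - y), and the Hessian is passed as a callable.

```python
import numpy as np
from scipy.optimize import minimize

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def neg_log_likelihood(beta, X, y):
    # -LL = sum(log(1 + exp(z)) - y*z), with z = X @ beta; returns a scalar.
    z = X @ beta
    return np.sum(np.logaddexp(0.0, z) - y * z)

def gradient(beta, X, y):
    # Gradient of -LL; shape (ncol,), i.e. 1-D as Newton-CG expects.
    return X.T @ (sigmoid(X @ beta) - y)

def hessian(beta, X, y):
    # Hessian of -LL: X^T W X with W = diag(p * (1 - p)); shape (ncol, ncol).
    p = sigmoid(X @ beta)
    return X.T @ (X * (p * (1.0 - p))[:, None])

# Same dummy data as in the question, as plain float arrays.
X = np.array([[1, 1], [0, 1], [1, 1], [1, 1], [1, 1],
              [0, 1], [0, 1], [1, 1], [1, 1], [0, 1]], dtype=float)
y = np.array([0, 0, 0, 0, 0, 1, 1, 1, 1, 1], dtype=float)
beta0 = np.zeros(X.shape[1])  # shape (2,), not (2, 1)

res = minimize(neg_log_likelihood, beta0, args=(X, y), method="Newton-CG",
               jac=gradient, hess=hessian, options={"xtol": 1e-8})
print(res.x, -res.fun)
```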
I keep getting errors when I try to solve a system of three equations using the following code in python3:
import sympy
from sympy import Symbol, solve, nsolve
x = Symbol('x')
y = Symbol('y')
z = Symbol('z')
eq1 = x - y + 3
eq2 = x + y
eq3 = z - y
print(nsolve( (eq1, eq2, eq3), (x,y,z), (-50,50)))
Here is the error message:
Traceback (most recent call last):
File "/usr/lib/python3/dist-packages/mpmath/calculus/optimization.py", line 928, in findroot
fx = f(*x0)
TypeError: () missing 1 required positional argument: '_Dummy_15'
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "", line 1, in
File "", line 12, in
File "/usr/lib/python3/dist-packages/sympy/solvers/solvers.py", line 2498, in nsolve
x = findroot(f, x0, J=J, **kwargs)
File "/usr/lib/python3/dist-packages/mpmath/calculus/optimization.py", line 931, in findroot
fx = f(x0[0])
TypeError: () missing 2 required positional arguments: '_Dummy_14' and '_Dummy_15'
The strange thing is, the error message goes away if I only solve the first two equations, by changing the last line of the code to
print(nsolve( (eq1, eq2), (x,y), (-50,50)))
output:
exec(open('bug444.py').read())
[-1.5]
[ 1.5]
I'm baffled; your help is most appreciated!
A few pieces of additional info:
I'm using python3.4.0 + sympy 0.7.6-3 on ubuntu 14.04. I got the same error in python2
I could solve this system using
solve( [eq1,eq2,eq3], [x,y,z] )
but this system is just a toy example; in the actual applications the system is non-linear and I need higher precision, and I don't see how to adjust the precision for solve, whereas for nsolve I could use nsolve(... , prec=100)
THANKS!
In your print statement, you are missing the initial guess for z:
print(nsolve((eq1, eq2, eq3), (x, y, z), (-50, 50)))
Try this instead (in most cases, using 1 for all the guesses is fine):
print(nsolve((eq1, eq2, eq3), (x, y, z), (1, 1, 1)))
Output:
[-1.5]
[ 1.5]
[ 1.5]
You can discard the initial guesses/dummies if you use linsolve:
>>> from sympy import linsolve
>>> print(linsolve((eq1, eq2, eq3), x,y,z))
{(-3/2, 3/2, 3/2)}
And then you can use nonlinsolve for your non linear problem set.
The problem is that the number of initial guesses must equal the number of variables:
print(nsolve((eq1, eq2, eq3), (x,y,z), (-50,50,50)))
If you're using a numerical solver on a multidimensional problem, it wants to start from somewhere and follow a gradient to the solution.
The guess vector is where you start.
If there are multiple local minima / maxima in the space, different guess vectors can lead to different outputs.
Or an unfortunate guess vector may not converge at all.
For a one-dimensional problem the guess vector is just x0.
For most functions you can write down easily, almost any guess vector will converge to the one global solution,
so a guess vector of (1,1,1) is as good as (-50,50,50) here.
Just don't leave any entry of the guess vector empty, for the program's sake.
Your code should be:
nsolve([eq1, eq2, eq3], [x,y,z], [1,1,1])
Your code was equivalent to:
nsolve([eq1, eq2, eq3], [x,y,z], [1,1])
You were missing one guess value in the last argument.
The point is: if you are solving for n unknowns, you must provide a guess for each of them (n guesses in the last argument).
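Since the real application is non-linear and needs higher precision, here is a small sketch of the same pattern on a made-up non-linear system (the equations below are illustrative, not from the question). It passes one guess per unknown and uses the prec keyword mentioned in the question.

```python
from sympy import symbols, nsolve

x, y, z = symbols("x y z")

# Hypothetical non-linear system: a circle, a line and a product.
eq1 = x**2 + y**2 - 4
eq2 = x - y
eq3 = z - x*y

# Three unknowns, so three initial guesses; prec sets the working precision.
sol = nsolve((eq1, eq2, eq3), (x, y, z), (1, 1, 1), prec=50)
print(sol)
```

Starting from (1, 1, 1) the solver should land on the root with x = y = sqrt(2) and z = 2. A different starting point could converge to the mirrored root with negative x and y.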
I'm trying to fit two global parameters of a galactic model using Scipy curve_fit in python. I have an array of independent variables and an array of dependent variables. The first quarter of the data set needs to be fit to a function depending on the two global parameters and two local parameters, the next quarter to another function depending on the two global parameters and two other local parameters, etc.
Is there any way that I can write a function that will call the appropriate function with the right index and the global parameters through the entire array?
What I have so far is:
def galaxy_func_inner(time,a,b,c,d):
    telescope_inner = lt.station(rot_angle=c,pol_angle=d)
    power = telescope_inner.calculate_gpowervslstarray(time)[0]
    return a*np.array(power)+b

def galaxy_func_outer(time,a,b,c,d):
    telescope_outer = lt.station(rot_angle=c,pol_angle=d)
    power = telescope_outer.calculate_gpowervslstarray(time)[0]
    return a*np.array(power)+b

def galaxy_func_global(time,R,P,a,b,c,d,e,f,g,h):
    for t_index in range(len(time)):
        if t_index in range(0,50):
            return galaxy_func_outer(t_index,a,b,R,P)
        elif t_index in range(50,100):
            return galaxy_func_outer(t_index,c,d,R,P)
        elif t_index in range(100,150):
            return galaxy_func_inner(t_index,e,f,R,P)
        elif t_index in range(150,200):
            return galaxy_func_inner(t_index,g,h,R,P)
The problem is that this only fits the first time point rather than the whole time array, and that single point is only fitted to the corresponding model point and not to the whole array. Any help as to how to reformulate this? I've tried to reformulate it as:
def galaxy_func_global(xdata,R,P,a,b,c,d,e,f,g,h):
    return galaxy_func_outer(xdata[0:50],a,b,R,P),galaxy_func_outer(xdata[50:100],c,d,R,P),galaxy_func_inner(xdata[100:150],e,f,R,P),galaxy_func_inner(xdata[150:200],g,h,R,P)
but I get the error:
File "galaxy_calibration.py", line 117, in <module>
popt,pcov = curve_fit(galaxy_func_global,xdata,ydata)
File "/Library/Python/2.7/site-packages/scipy-0.14.0.dev_7cefb25-py2.7-macosx-10.9-intel.egg/scipy/optimize/minpack.py", line 555, in curve_fit
res = leastsq(func, p0, args=args, full_output=1, **kw)
File "/Library/Python/2.7/site-packages/scipy-0.14.0.dev_7cefb25-py2.7-macosx-10.9-intel.egg/scipy/optimize/minpack.py", line 369, in leastsq
shape, dtype = _check_func('leastsq', 'func', func, x0, args, n)
File "/Library/Python/2.7/site-packages/scipy-0.14.0.dev_7cefb25-py2.7-macosx-10.9-intel.egg/scipy/optimize/minpack.py", line 20, in _check_func
res = atleast_1d(thefunc(*((x0[:numinputs],) + args)))
File "/Library/Python/2.7/site-packages/scipy-0.14.0.dev_7cefb25-py2.7-macosx-10.9-intel.egg/scipy/optimize/minpack.py", line 445, in _general_function
return function(xdata, *params) - ydata
ValueError: operands could not be broadcast together with shapes (4,) (191,)
Any help would be much appreciated.
If you want to cut your input data into 4 batches (based on the index of the time points) and process the data depending on the batches, then return the results in a single array, then you can do this:
def galaxy_func_global(time,R,P,a,b,c,d,e,f,g,h):
    return np.concatenate([galaxy_func_outer(time[0:50],a,b,R,P),
                           galaxy_func_outer(time[50:100],c,d,R,P),
                           galaxy_func_inner(time[100:150],e,f,R,P),
                           galaxy_func_inner(time[150:200],g,h,R,P)])
This will slice into your time array to pick out each slice of interest, then call the appropriate function for each piece. It seems to me that these functions return simple np.arrays, which can be concatenated to get a single array as result.
(I just realized that I could've just said "what you tried was almost perfect, but you need to concatenate the resulting arrays into a single array":)
Note that there are at least two ways in which you can have dimensioning problems.
Firstly, you should make sure that the return value of both of your functions (galaxy...inner/outer()) is a 1d numpy array. Otherwise you'll run into problems with your global return value.
Secondly, every fitting method expects a function the return value of which has the same size (shape) as the input variable, for obvious reasons. So you can also run into problems with your current code if time is not exactly 200 elements long, since your output will be truncated to 200 elements even if time is longer. At least you should put
galaxy_func_inner(time[150:],g,h,R,P)
into your last function call to catch all the remaining points of time, but if you want to do it properly, call
def galaxy_func_global(time,R,P,a,b,c,d,e,f,g,h):
    # slice boundaries must be integers, so cast the result of np.floor
    inds=np.floor(np.linspace(0,len(time)-1,5)).astype(int)
    return np.concatenate([galaxy_func_outer(time[0:inds[1]],a,b,R,P),
                           galaxy_func_outer(time[inds[1]:inds[2]],c,d,R,P),
                           galaxy_func_inner(time[inds[2]:inds[3]],e,f,R,P),
                           galaxy_func_inner(time[inds[3]:],g,h,R,P)])
Also note that your original error is formally of this kind:
File "/Library/Python/2.7/site-packages/scipy-0.14.0.dev_7cefb25-py2.7-macosx-10.9-intel.egg/scipy/optimize/minpack.py", line 445, in _general_function
return function(xdata, *params) - ydata
ValueError: operands could not be broadcast together with shapes (4,) (191,)
This tells you that Python couldn't subtract ydata from function(xdata,*params) (i.e. your fitting model) because one is of length 4 while the other is of length 191. This is because if your function returns a,b,c,d, then it actually returns a tuple (a,b,c,d), so the return value has a length of 4. It's more interesting that your ydata has length 191; this might mean that you'll still run into an error.
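The concatenation approach can be tried end to end on synthetic data. The sketch below is an assumption-laden stand-in: segment_model is a hypothetical replacement for galaxy_func_inner/galaxy_func_outer (the lt.station model is not available here), only two segments are used to keep it short, and np.array_split is used so that any input length, including 191 points, is covered exactly.

```python
import numpy as np
from scipy.optimize import curve_fit

def segment_model(t, amp, offset, R, P):
    # Hypothetical stand-in for galaxy_func_inner / galaxy_func_outer:
    # amp and offset are local parameters, R and P are the global ones.
    return amp * np.exp(-R * t) + offset + P * t

def global_model(t, R, P, a, b, c, d):
    # Split t into two consecutive pieces that together cover every point,
    # evaluate each piece with its own local parameters, and return one
    # 1-D array with the same length as t.
    first, second = np.array_split(t, 2)
    return np.concatenate([segment_model(first, a, b, R, P),
                           segment_model(second, c, d, R, P)])

t = np.linspace(0.0, 4.0, 191)
true_params = (0.5, 0.3, 2.0, 1.0, 3.0, -1.0)  # R, P, a, b, c, d
ydata = global_model(t, *true_params)          # noise-free synthetic data

p0 = (0.4, 0.2, 1.5, 0.5, 2.5, -0.5)
popt, pcov = curve_fit(global_model, t, ydata, p0=p0)
print(popt)
```

The key property is that global_model(t, ...) always has the same shape as ydata, which is what the subtraction inside curve_fit requires.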
I use the function leastsq from scipy.optimize to fit sphere coordinates and radius from 3D coordinates.
So my code looks like this :
def distance(pc,point):
    xc,yc,zc,rd = pc
    x ,y ,z = point
    return np.sqrt((xc-x)**2+(yc-y)**2+(zc-z)**2)

def sphere_params(coords):
    from scipy import optimize
    err = lambda pc,point : distance(pc,point) - pc[3]
    pc = [0, 0, 0, 1]
    pc, success = optimize.leastsq(err, pc[:], args=(coords,))
    return pc
(Built thanks to : how do I fit 3D data.)
I started working with the variable coords as a list of tuples (each tuple being an x,y,z coordinate):
>> coords
>> [(0,0,0),(0,0,1),(-1,0,0),(0.57,0.57,0.57),...,(1,0,0),(0,1,0)]
Which led me to an error:
>> pc = sphere_params(coords)
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/home/michel/anaconda/lib/python2.7/site-packages/scipy/optimize/minpack.py", line 374, in leastsq
raise TypeError('Improper input: N=%s must not exceed M=%s' % (n, m))
TypeError: Improper input: N=4 must not exceed M=3
Here N is the number of parameters stored in pc, and M is the number of data points. This makes it look like I haven't given enough data points, while my list coords actually contains 351 tuples versus 4 parameters in pc!
From what I read in minpack, the actual culprit seems to be this line (from _check_func()):
res = atleast_1d(thefunc(*((x0[:numinputs],) + args)))
Unless I'm mistaken, in my case it translates into
res = atleast_1d(distance(*((pc[:len(pc)],) + args)))
But I'm having a terrible time trying to understand what this means, along with the rest of the _check_func() function.
I ended up changing coords into an array before giving it as an argument to sphere_params(): coords = np.asarray(coords).T, and it started working just fine. I would really like to understand why the data format was giving me trouble though!
In advance, many thanks for your answers!
EDIT: I noticed that my use of coords in the "distance" and "err" functions was really unwise and misleading; it wasn't so in my original code, so it was not the core of the problem. It should make more sense now.
Your err function must take the full list of coords and return a full list of distances. leastsq will then take the list of errors, square and sum them, and minimize that squared sum.
There are also distance functions in scipy.spatial.distance, so I would recommend that:
import numpy as np
from scipy.spatial.distance import cdist
from scipy.optimize import leastsq

def distance_cdist(pc, coords):
    # pc is the centre (x, y, z); coords is a sequence of N points
    return cdist([pc], coords).squeeze()

def distance_norm(pc, points):
    """ pc must be shape (D+1,) array
    points can be (N, D) or (D,) array """
    c = np.asarray(pc[:3])
    points = np.atleast_2d(points)
    return np.linalg.norm(points-c, axis=1)

def sphere_params(coords):
    # one residual per point: distance to the centre minus the radius
    err = lambda pc, coords: distance_cdist(pc[:3], coords) - pc[3]
    pc = [0, 0, 0, 1]
    pc, success = leastsq(err, pc, args=(coords,))
    return pc

coords = [(0,0,0),(0,0,1),(-1,0,0),(0.57,0.57,0.57),(1,0,0),(0,1,0)]
sphere_params(coords)
While I haven't used this function much, as best I can tell, coords is passed as is to your distance function. At least it would be if the error checking allowed it. In fact it is likely that the error checking tries to do that, and raises an error if distance raises an error. So let's try that.
In [91]: coords=[(0,0,0),(0,0,1),(-1,0,0),(0.57,0.57,0.57),(1,0,0),(0,1,0)]
In [92]: distance([0,0,0,0],coords)
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
<ipython-input-92-113da104affb> in <module>()
----> 1 distance([0,0,0,0],coords)
<ipython-input-89-64c557cd95e0> in distance(pc, coords)
2
3 xc,yx,zx,rd = pc
----> 4 x ,y ,z = coords
5 return np.sqrt((xc-x)**2+(yc-y)**2+(zc-z)**2)
6
ValueError: too many values to unpack (expected 3)
So that's where the 3 comes from - your x, y, z = coords.
distance([0,0,0,0],np.array(coords))
same error.
distance([0,0,0,0],np.array(coords).T)
gets past that issue (3 rows that can be split into 3 variables), raises another error: NameError: name 'yc' is not defined
That looks like a typo in the code you gave us, !naughty, naughty!.
Correcting that:
In [97]: def distance(pc,coords):
   ....:     xc,yc,zc,rd = pc
   ....:     x ,y ,z = coords
   ....:     return np.sqrt((xc-x)**2+(yc-y)**2+(zc-z)**2)
   ....:
In [98]: distance([0,0,0,0],np.array(coords).T)
Out[98]: array([ 0. , 1. , 1. , 0.98726896, 1. , 1. ])
# and wrapping the array in a tuple, as `leastsq` does
In [102]: distance([0,0,0,0],*(np.array(coords).T,))
Out[102]: array([ 0. , 1. , 1. , 0.98726896, 1. , 1. ])
I get a 6 element array, one value for each 'point' in coords. Is that what you want?
Where did you get the idea that leastsq feeds your coords one tuple at a time to your lambda?
args : tuple
Any extra arguments to func are placed in this tuple.
In general with these optimize functions, if you want to perform the operation on a set of conditions, then you need to iterate over those conditions, calling the optimizer on each one. Or if you want to optimize over the whole set at once, then you need to write your function (err, etc.) to work with the whole set at once.
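As an illustration of the second option, here is a minimal sketch of a residual function that works on the whole set at once and returns one residual per point. The name sphere_residuals and the synthetic points (an exact sphere of radius 3 centred at (1, -2, 0.5)) are assumptions made for the example.

```python
import numpy as np
from scipy.optimize import leastsq

def sphere_residuals(params, coords):
    # params = (xc, yc, zc, r); coords is an (N, 3) array or list of tuples.
    coords = np.asarray(coords, dtype=float)
    centre, radius = params[:3], params[3]
    # One residual per point, so M = N data points versus 4 parameters.
    return np.linalg.norm(coords - centre, axis=1) - radius

# Synthetic, noise-free points on a sphere.
rng = np.random.default_rng(0)
directions = rng.normal(size=(200, 3))
directions /= np.linalg.norm(directions, axis=1, keepdims=True)
coords = np.array([1.0, -2.0, 0.5]) + 3.0 * directions

params, ier = leastsq(sphere_residuals, [0.0, 0.0, 0.0, 1.0], args=(coords,))
print(params)
```

Because the function converts coords with np.asarray, a list of (x, y, z) tuples and an (N, 3) array behave the same way.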
So here is what I came up with from previous help :
import numpy as np
from scipy.optimize import leastsq

def a_dist(a,B):
    # works with - a : reference point - B : coordinates matrix
    return np.linalg.norm(a-B, axis=1)

def parametric(coords):
    err = lambda pc,point : a_dist(pc,point) - 18
    pc = [0, 0, 0] # Initial guess for the parameters
    pc, success = leastsq(err, pc[:], args=(coords,))
    return pc
It definitely works with both a list of tuples and an array of shape (N,3)
>> cluster #it's more than 6000 point you won't have the same result
>> [(4, 30, 19), (3, 30, 19), (5, 30, 19), ..., (4, 30, 3), (4, 30, 35)]
>> sphere_params(cluster)
>> array([ -5.25734467, 20.73419249, 9.73428766])
>> np.asarray(cluster).shape
>> (6017,3)
>> sphere_params(np.asarray(cluster))
>> array([ -5.25734467, 20.73419249, 9.73428766])
Combining this version with Askewchan's, ie having :
def sphere_params(coords):
    err = lambda pc, coords: distance(pc[:3], coords) - pc[3]
    pc = [0, 0, 0, 1]
    pc, success = leastsq(err, pc, args=(coords,))
    return pc
Also works fine, to be honest I didn't take the time to try your solution. I definitely stopped taking the radius as a fit parameter however. I found it not robust at all (even 6000 -noisy- data points were not enough to get the right curvature !).
When comparing to my first code I'm still not quite sure what was wrong though, I probably messed up with global/local variables although I really don't recall using any "global" statement in any of my functions.
I'm getting a ZeroDivisionError from the following code:
#stacking the array into a complex array allows np.unique to choose
#truly unique points. We also keep a handle on the unique indices
#to allow us to index `self` in the same order.
unique_points,index = np.unique(xdata[mask]+1j*ydata[mask],
                                return_index=True)
#Now we break it into the data structure we need.
points = np.column_stack((unique_points.real,unique_points.imag))
xx1,xx2 = self.meta['rcm_xx1'],self.meta['rcm_xx2']
yy1 = self.meta['rcm_yy2']
gx = np.arange(xx1,xx2+dx,dx)
gy = np.arange(-yy1,yy1+dy,dy)
GX,GY = np.meshgrid(gx,gy)
xi = np.column_stack((GX.ravel(),GY.ravel()))
gdata = griddata(points,self[mask][index],xi,method='linear',
                 fill_value=np.nan)
Here, xdata,ydata and self are all 2D numpy.ndarrays (or subclasses thereof) with the same shape and dtype=np.float32. mask is a 2d ndarray with the same shape and dtype=bool. Here's a link for those wanting to peruse the scipy.interpolate.griddata documentation.
Originally, xdata and ydata are derived from a non-uniform cylindrical grid that has a 4 point stencil -- I thought that the error might be coming from the fact that the same point was defined multiple times, so I made the set of input points unique as suggested in this question. Unfortunately, that hasn't seemed to help. The full traceback is:
Traceback (most recent call last):
File "/xxxxxxx/rcm.py", line 428, in <module>
x[...,1].to_pz0()
File "/xxxxxxx/rcm.py", line 285, in to_pz0
fill_value=fill_value)
File "/usr/local/lib/python2.7/site-packages/scipy/interpolate/ndgriddata.py", line 183, in griddata
ip = LinearNDInterpolator(points, values, fill_value=fill_value)
File "interpnd.pyx", line 192, in scipy.interpolate.interpnd.LinearNDInterpolator.__init__ (scipy/interpolate/interpnd.c:2935)
File "qhull.pyx", line 996, in scipy.spatial.qhull.Delaunay.__init__ (scipy/spatial/qhull.c:6607)
File "qhull.pyx", line 183, in scipy.spatial.qhull._construct_delaunay (scipy/spatial/qhull.c:1919)
ZeroDivisionError: float division
For what it's worth, the code "works" (No exception) if I use the "nearest" method.
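For anyone who wants to experiment, the de-duplication and gridding steps described above can be sketched on synthetic data as follows. This does not reproduce the ZeroDivisionError. It is only a self-contained baseline, with made-up points and values, against which the real xdata, ydata and mask could be compared.

```python
import numpy as np
from scipy.interpolate import griddata

# Synthetic scattered points, with the last 10 rows duplicating earlier ones.
rng = np.random.default_rng(0)
corners = np.array([[-1, -1], [-1, 1], [1, -1], [1, 1]], dtype=float)
scattered = rng.uniform(-1.0, 1.0, size=(40, 2))
pts = np.vstack([corners, scattered, scattered[:10]])
vals = 2.0 * pts[:, 0] + 3.0 * pts[:, 1]  # made-up linear field

# Same trick as in the question: encode (x, y) as x + 1j*y so that
# np.unique removes repeated points and reports which rows it kept.
unique_points, index = np.unique(pts[:, 0] + 1j * pts[:, 1], return_index=True)
points = np.column_stack((unique_points.real, unique_points.imag))

# Regular target grid, flattened to an (M, 2) array of query points.
gx = np.linspace(-0.5, 0.5, 5)
gy = np.linspace(-0.5, 0.5, 5)
GX, GY = np.meshgrid(gx, gy)
xi = np.column_stack((GX.ravel(), GY.ravel()))

gdata = griddata(points, vals[index], xi, method="linear", fill_value=np.nan)
print(gdata.reshape(GX.shape))
```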