I am using the LinearNDInterpolator on some (x, y, z) data, using the following script. However, I cannot figure out how to go from the interpolated data to plotting/showing the interpolation in heatmap form? Am I missing something like setting up a meshgrid based on the min and max of x and y? Any help or an example would be great!
import numpy as np
import scipy.interpolate
x = np.array([-4386795.73911443, -1239996.25110694, -3974316.43669208,
1560260.49911342, 4977361.53694849, -1996458.01768192,
5888021.46423068, 2969439.36068243, 562498.56468588,
4940040.00457585])
y = np.array([ -572081.11495993, -5663387.07621326, 3841976.34982795,
3761230.61316845, -942281.80271223, 5414546.28275767,
1320445.40098735, -4234503.89305636, 4621185.12249923,
1172328.8107458 ])
z = np.array([ 4579159.6898615 , 2649940.2481702 , 3171358.81564312,
4892740.54647532, 3862475.79651847, 2707177.605241 ,
2059175.83411223, 3720138.47529587, 4345385.04025412,
3847493.83999694])
# Create coordinate pairs
cartcoord = zip(x, y)
# Interpolate
interp = scipy.interpolate.LinearNDInterpolator(cartcoord, z)
Edit:
Based on @Spinor's solution, and using Python 2.7, the following code gives me what I'm looking for (approach 1). Is there a way to increase the density of the interpolated points?
The dataset yields the following plot:
Needless to say, I did not expect the results to be circular, since the (lat, lon) coordinates are taken from an equirectangular projection map. On further investigation, I think the data is simply mapped on a different projection.
I will assume that you are trying to interpolate values of z.
Now, what happens when you call the interpolation function? It builds an interpolant over the whole landscape of the inputs (x and y) and the outputs (z). In the code above, you never actually asked for its value at any point. To use the function, you pass it input coordinates and it returns the interpolated output.
You used scipy.interpolate.LinearNDInterpolator, which triangulates the input data and performs linear barycentric interpolation on each triangle. A query point outside the convex hull of the input points falls in no triangle, so you get NaN there. For instance, try this in your code:
print(interp(-4386790, 3720137))
This point is within the min-max limits of your x and y, yet it lies outside the convex hull of the data points. We could set the NaN to zero via the fill_value argument if that is acceptable to you.
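A small sketch of that behaviour, using made-up points rather than the data from the question: a query inside the convex hull is interpolated, while a query that sits inside the bounding box but outside the hull comes back as nan unless fill_value is given.

```python
import numpy as np
from scipy.interpolate import LinearNDInterpolator

# Hypothetical toy data (not the question's data): z = x + y at four points.
points = np.array([[0.0, 0.0], [2.0, 0.0], [0.0, 2.0], [0.5, 0.5]])
values = points[:, 0] + points[:, 1]

interp_nan = LinearNDInterpolator(points, values)                 # default fill_value is nan
interp_zero = LinearNDInterpolator(points, values, fill_value=0)  # fill with 0 instead

inside = interp_nan(0.5, 1.0)    # inside the triangle (0,0)-(2,0)-(0,2)
outside = interp_nan(1.8, 1.8)   # within min/max of x and y, but outside the hull
filled = interp_zero(1.8, 1.8)   # same query, now filled with 0

print(inside, outside, filled)
```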
Read the docs for the details. Many people also find scipy.interpolate.interp2d acceptable; it uses spline interpolation instead (note that interp2d was deprecated in SciPy 1.10 and removed in SciPy 1.14). In the code below, I've implemented both functions (the former with NaN values set to 0) and plotted each as a heatmap.
As for the heatmap, it is as you suspected: you have to create a grid of values. Below are the output graphs for LinearNDInterpolator with NaN set to zero and for interp2d, followed by the code.
Using LinearNDInterpolator(cartcoord, z, fill_value=0)
Using interp2d(x, y, z)
P.S. I am using Python 3. If you run into issues in Python 2, remove the list call from cartcoord = list(zip(x, y)).
import matplotlib.pyplot as plt
import numpy as np
import scipy.interpolate
x = np.array([-4386795.73911443, -1239996.25110694, -3974316.43669208,
1560260.49911342, 4977361.53694849, -1996458.01768192,
5888021.46423068, 2969439.36068243, 562498.56468588,
4940040.00457585])
y = np.array([ -572081.11495993, -5663387.07621326, 3841976.34982795,
3761230.61316845, -942281.80271223, 5414546.28275767,
1320445.40098735, -4234503.89305636, 4621185.12249923,
1172328.8107458 ])
z = np.array([ 4579159.6898615 , 2649940.2481702 , 3171358.81564312,
4892740.54647532, 3862475.79651847, 2707177.605241 ,
2059175.83411223, 3720138.47529587, 4345385.04025412,
3847493.83999694])
# Create coordinate pairs
cartcoord = list(zip(x, y))
X = np.linspace(min(x), max(x))
Y = np.linspace(min(y), max(y))
X, Y = np.meshgrid(X, Y)
# Approach 1
interp = scipy.interpolate.LinearNDInterpolator(cartcoord, z, fill_value=0)
Z0 = interp(X, Y)
plt.figure()
plt.pcolormesh(X, Y, Z0)
plt.colorbar() # Color Bar
plt.show()
# Approach 2
func = scipy.interpolate.interp2d(x, y, z)
Z = func(X[0, :], Y[:, 0])
plt.figure()
plt.pcolormesh(X, Y, Z)
plt.colorbar() # Color Bar
plt.show()
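On the follow-up about density: np.linspace produces 50 samples when num is omitted, so the grid above is 50 by 50. A sketch, assuming synthetic scattered data in place of the arrays from the question, that passes a larger num to get a finer grid:

```python
import numpy as np
from scipy.interpolate import LinearNDInterpolator

rng = np.random.default_rng(0)
# Hypothetical scattered data standing in for the (x, y, z) arrays above.
x = rng.uniform(-5e6, 5e6, 30)
y = rng.uniform(-5e6, 5e6, 30)
z = rng.uniform(2e6, 5e6, 30)

n = 400  # grid points per axis; np.linspace uses 50 when num is omitted
X, Y = np.meshgrid(np.linspace(x.min(), x.max(), n),
                   np.linspace(y.min(), y.max(), n))

interp = LinearNDInterpolator(list(zip(x, y)), z, fill_value=0)
Z = interp(X, Y)  # shape (n, n); pass X, Y, Z to plt.pcolormesh as before
print(Z.shape)
```

A finer grid only samples the same piecewise-linear surface more densely; it does not add information beyond the original data points.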
I am plotting a 1D array (x-axis) against a 2D array (y-axis) in matplotlib, so there are multiple y values for each x value. I want to plot a straight line of best fit (linear regression), not just a line joining the points. How can I do this?
All the other examples seem to have only one y value per x value. When I use 'from sklearn.linear_model import LinearRegression' I get as many best-fit lines as there are y values per x value.
EDIT: here is the code I have tried:
model = LinearRegression()
x_axis2 = np.arange(0,len(av_rsq3))
x_axis2 = x_axis2.reshape(-1,1)
model.fit(x_axis2, av_rsq3)
pt.figure()
pt.plot(x_axis2,av_rsq3, 'rx')
pt.plot(x_axis2, model.predict(x_axis2))
note: x_axis2 is a 1d array and av_rsq3 is a 2d array.
You just need to add these points with matching x-values as normal points, then you can add a line of best fit as follows:
import numpy as np
from numpy.polynomial.polynomial import polyfit
import matplotlib.pyplot as plt
x = np.array([1,2,3,4,5,6,6,6,7,7,8])
y = np.array([1,2,4,8,16,32,34,30,61,65,120])
# Fit with polyfit
b, m = polyfit(x, y, 1)
plt.plot(x, y, '.')
plt.plot(x, b + m * x, '-')
plt.show()
which produces a scatter plot of the points with the fitted line drawn through them.
Note, a straight line doesn't fit my example data, but I didn't think about that when writing it :) With polyfit you are also able to change the degree of the fit, as well as obtain error margins in gradients* and offsets.
* (or other polynomial coefficients)
What you need to do is provide a one to one mapping. The order the points appear in does not matter. So if you have something like this
X: [1,2,3,4]
Y1: [4,6,2,7]
Y2: [2,3,6,8]
you would get this
X: [1,2,3,4,1,2,3,4]
Y: [4,6,2,7,2,3,6,8]
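A sketch of that flattening with NumPy, assuming the y values are stored as the rows of a 2D array, followed by a single straight-line fit over all the pairs:

```python
import numpy as np
from numpy.polynomial.polynomial import polyfit

x = np.array([1, 2, 3, 4])
Y = np.array([[4, 6, 2, 7],    # Y1
              [2, 3, 6, 8]])   # Y2

# Repeat x once per row of Y and flatten Y row by row so the pairs line up.
x_all = np.tile(x, Y.shape[0])
y_all = Y.ravel()

# polyfit from numpy.polynomial returns coefficients lowest degree first.
b, m = polyfit(x_all, y_all, 1)
print(x_all, y_all, b, m)
```

If the 2D array instead has one row per x value (shape (len(x), k)), np.repeat(x, k) is the call that pairs correctly with Y.ravel().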
If you just want to plot the y values and a line averaging between them, this is possible. Borrowing the dummy data from another answer:
import matplotlib.pyplot as plt
x = [1,2,3,4]
y = [4,6,2,7]
y1 = [2,3,6,8]
plt.scatter(x,y)
plt.scatter(x,y1)
# line through the mean of the two y values at each x
plt.plot(x,[((y[i]+y1[i])/2.0) for i in range(len(y))])
plt.show()
I am trying to invert an interpolated function using scipy's interpolate function. Let's say I create an interpolated function,
import scipy.interpolate as interpolate
interpolatedfunction = interpolate.interp1d(xvariable,data,kind='cubic')
Is there some function that can find x when I specify a:
interpolatedfunction(x) == a
In other words, "I want my interpolated function to equal a; what is the value of xvariable such that my function is equal to a?"
I appreciate I can do this with some numerical scheme, but is there a more straightforward method? What if the interpolated function is multivalued in xvariable?
There are dedicated methods for finding roots of cubic splines. The simplest to use is the .roots() method of an InterpolatedUnivariateSpline object:
spl = InterpolatedUnivariateSpline(x, y)
roots = spl.roots()
This finds all of the roots instead of just one, as generic solvers (fsolve, brentq, newton, bisect, etc) do.
import numpy as np
from scipy.interpolate import InterpolatedUnivariateSpline
x = np.arange(20)
y = np.cos(np.arange(20))
spl = InterpolatedUnivariateSpline(x, y)
print(spl.roots())
outputs array([ 1.56669456, 4.71145244, 7.85321627, 10.99554642, 14.13792756, 17.28271674])
However, you want to equate the spline to some arbitrary number a, rather than 0. One option is to rebuild the spline (you can't just subtract a from it):
solutions = InterpolatedUnivariateSpline(x, y - a).roots()
Note that none of this will work with the function returned by interp1d; it does not have a roots method. For that function, using generic methods like fsolve is an option, but you will only get one root at a time from it. In any case, why use interp1d for cubic splines when there are more powerful ways to do the same kind of interpolation?
Non-object-oriented way
Instead of rebuilding the spline after subtracting a from data, one can directly subtract a from spline coefficients. This requires us to drop down to non-object-oriented interpolation methods. Specifically, sproot takes in a tck tuple prepared by splrep, as follows:
from scipy.interpolate import splrep, sproot
tck = splrep(x, y, k=3, s=0)
tck_mod = (tck[0], tck[1] - a, tck[2])
solutions = sproot(tck_mod)
I'm not sure if messing with tck is worth the gain here, as it's possible that the bulk of computation time will be in root-finding anyway. But it's good to have alternatives.
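For comparison, here is a sketch that runs both options on the cosine data from above and checks the solutions against the unshifted spline. The target value a = 0.5 is an arbitrary choice for illustration.

```python
import numpy as np
from scipy.interpolate import InterpolatedUnivariateSpline, splrep, sproot

x = np.arange(20)
y = np.cos(x)
a = 0.5  # hypothetical target value

# Option 1: rebuild the spline on shifted data.
sol_rebuild = InterpolatedUnivariateSpline(x, y - a).roots()

# Option 2: shift the B-spline coefficients of the original spline.
t, c, k = splrep(x, y, k=3, s=0)
sol_tck = sproot((t, c - a, k))

# Evaluate the unshifted spline at the solutions; the residuals should be ~0.
spl = InterpolatedUnivariateSpline(x, y)
print(sol_rebuild)
print(sol_tck)
print(spl(sol_rebuild) - a)
```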
After creating an interpolated function interp_fn, you can find the value of x where interp_fn(x) == a by the roots of the function
interp_fn2 = lambda x: interp_fn(x) - a
There are a number of options for finding roots in scipy.optimize. For instance, to use Newton's method with the initial value starting at 10:
from scipy import optimize
optimize.newton(interp_fn2, 10)
Actual example
Create an interpolated function and then find the roots where fn(x) == 5
import numpy as np
import matplotlib.pyplot as plt
from scipy import interpolate, optimize
x = np.arange(10)
y = 1 + 6*np.arange(10) - np.arange(10)**2
y2 = 5*np.ones_like(x)
plt.scatter(x,y)
plt.plot(x,y)
plt.plot(x,y2,'k-')
plt.show()
# create the interpolated function, and then the offset
# function used to find the roots
interp_fn = interpolate.interp1d(x, y, 'quadratic')
interp_fn2 = lambda x: interp_fn(x)-5
# to find the roots, we need to supply a starting value;
# because there is more than one root in our range, we need
# to supply multiple starting values. They should be
# fairly close to the actual roots
root1, root2 = optimize.newton(interp_fn2, 1), optimize.newton(interp_fn2, 5)
root1, root2
# returns:
(0.76393202250021064, 5.2360679774997898)
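If good starting values are not known in advance, one alternative (a different technique from the Newton calls above) is to scan a grid for sign changes and hand each bracket to optimize.brentq. This is only a sketch and it assumes the grid is fine enough that each bracket contains a single root:

```python
import numpy as np
from scipy import interpolate, optimize

x = np.arange(10)
y = 1 + 6 * x - x**2
a = 5

interp_fn = interpolate.interp1d(x, y, kind='quadratic')
shifted = lambda t: float(interp_fn(t)) - a  # scalar-valued function for brentq

# Sample on a fine grid and keep the intervals where the sign flips.
# Assumption: each such interval brackets exactly one root.
grid = np.linspace(x[0], x[-1], 500)
vals = interp_fn(grid) - a
idx = np.where(np.sign(vals[:-1]) * np.sign(vals[1:]) < 0)[0]

roots = [optimize.brentq(shifted, grid[i], grid[i + 1]) for i in idx]
print(roots)
```

Roots that touch the target value without crossing it produce no sign change and would be missed by this scan.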
If your data are monotonic you might also try the following:
inversefunction = interpolate.interp1d(data, xvariable, kind='cubic')
Mentioning another option because I found this page in a google search and the other option works for my simple use case. Hopefully it'll be of use to someone.
If the function you're interpolating is very simple and always has a 1:1 relationship between y and x, then you can simply take your data, swap x and y when you pass it into interp1d, and then call the interpolation function in that direction.
Adapting code from https://docs.scipy.org/doc/scipy/reference/generated/scipy.interpolate.interp1d.html
import numpy as np
import matplotlib.pyplot as plt
from scipy import interpolate
x = np.arange(0, 10)
y = np.exp(-x/3.0)
f = interpolate.interp1d(x, y)
xnew = np.arange(0, 9, 0.1)
ynew = f(xnew)
plt.plot(x, y, 'o', xnew, ynew, '-')
plt.show()
When x and y have been swapped, you can call the swapped interpolation function (f in the code below) as f(a) to get the x value at which y equals a.
f = interpolate.interp1d(y, x)
xnew = np.arange(np.exp(-9/3), np.exp(0), 0.01)
ynew = f(xnew)
plt.plot(y, x, 'o', xnew, ynew, '-')
plt.title("Inverted")
plt.show()
Of course, if the function ever has multiple x values for a given y value (like sine or a parabola) then this will not work because it will no longer be a 1:1 function from x to y, and the above answers are necessary. This is just a simplification in a limited use case.
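A quick way to guard against that case is to check monotonicity before swapping. The sketch below, using the same exponential data, also does a round trip through the forward and swapped interpolants; x0 is an arbitrary test point.

```python
import numpy as np
from scipy import interpolate

x = np.arange(0, 10)
y = np.exp(-x / 3.0)

# Swapping is only valid if y is strictly monotonic in x.
dy = np.diff(y)
is_monotonic = bool(np.all(dy > 0) or np.all(dy < 0))

f = interpolate.interp1d(x, y)      # x -> y
f_inv = interpolate.interp1d(y, x)  # y -> x (interp1d sorts its inputs by default)

x0 = 4.3                            # arbitrary test point inside the data range
roundtrip = f_inv(f(x0))            # should recover x0 for linear interpolation
print(is_monotonic, roundtrip)
```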
I have a function which is an interpolation of a relatively large data set. I use linear interpolation (interp1d), so there are a lot of non-smooth sharp points. The quad function from scipy gives warnings because of the sharp points. How can I do the integration without the warnings?
Thank you!
Thanks for all the answers. Here I summarize the solutions in case others run into the same problem:
Just as @Stelios did, use points to avoid the warnings and to get a more accurate result.
In practice the number of points is usually larger than the default limit (limit=50) of quad, so I use quad(f_interp, a, b, limit=2*p.shape[0], points=p) to avoid all those warnings.
If a and b are not the start and end points of the data set x, the points p can be chosen by p = x[(x >= a) & (x <= b)]
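Put together, the summary might look like the sketch below. The data set is made up for illustration, and the reference value relies on the integrand being piecewise linear, so the trapezoid rule on the breakpoints is exact.

```python
import numpy as np
from scipy.integrate import quad

# Hypothetical data: 201 samples, so far more kinks than quad's default limit of 50.
x = np.linspace(0.0, 10.0, 201)
y = np.abs(np.sin(3.0 * x))
f_interp = lambda xx: np.interp(xx, x, y)

a, b = 1.23, 8.76                # integration limits inside the data range
p = x[(x >= a) & (x <= b)]       # breakpoints that fall within [a, b]
value, abserr = quad(f_interp, a, b, points=p, limit=2 * p.shape[0])

# Reference: trapezoid rule on the nodes a, p..., b (exact for piecewise-linear f).
nodes = np.concatenate(([a], p, [b]))
f_nodes = f_interp(nodes)
exact = np.sum(0.5 * (f_nodes[1:] + f_nodes[:-1]) * np.diff(nodes))
print(value, exact, abserr)
```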
quad accepts an optional argument, called points. According to the documentation:
points : (sequence of floats,ints), optional
A sequence of break points in the bounded integration interval where
local difficulties of the integrand may occur (e.g., singularities,
discontinuities). The sequence does not have to be sorted.
In your case, the "difficult" points are exactly the x-coordinates of the data points. Here is an example:
import numpy as np
from scipy.integrate import quad
np.random.seed(123)
# generate random data set
x = np.arange(0,10)
y = np.random.rand(10)
# construct a linear interpolation function of the data set
f_interp = lambda xx: np.interp(xx, x, y)
Here is a plot of the data points and f_interp:
Now calling quad as
quad(f_interp,0,9)
returns a series of warnings along with
(4.89770017785734, 1.3762838395159349e-05)
If you provide the points argument, i.e.,
quad(f_interp,0,9, points = x)
it issues no warnings and the result is
(4.8977001778573435, 5.437539505167948e-14)
which also implies a much greater accuracy of the result compared to the previous call.
Instead of interp1d, you could use scipy.interpolate.InterpolatedUnivariateSpline. That interpolator has the method integral(a, b) that computes the definite integral.
Here's an example:
import numpy as np
from scipy.interpolate import InterpolatedUnivariateSpline
import matplotlib.pyplot as plt
# Create some test data.
x = np.linspace(0, np.pi, 21)
np.random.seed(12345)
y = np.sin(1.5*x) + np.random.laplace(scale=0.35, size=len(x))**3
# Create the interpolator. Use k=1 for linear interpolation.
finterp = InterpolatedUnivariateSpline(x, y, k=1)
# Create a finer mesh of points on which to compute the integral.
xx = np.linspace(x[0], x[-1], 5*len(x))
# Use the interpolator to compute the integral from 0 to t for each
# t in xx.
qq = [finterp.integral(0, t) for t in xx]
# Plot stuff
p = plt.plot(x, y, '.', label='data')
plt.plot(x, y, '-', color=p[0].get_color(), label='linear interpolation')
plt.plot(xx, qq, label='integral of linear interpolation')
plt.grid()
plt.legend(framealpha=1, shadow=True)
plt.show()
The plot:
I want to make a streamplot in Basemap module, but I get a blank sphere. Please help me resolve this problem. I use matplotlib 1.3 and ordinary streamplot is working fine.
import matplotlib.pyplot as plt
import numpy as np
from mpl_toolkits.basemap import Basemap
map = Basemap(projection='ortho',lat_0=45,lon_0=-100,resolution='l')
# draw lat/lon grid lines every 30 degrees.
map.drawmeridians(np.arange(0,360,30))
map.drawparallels(np.arange(-90,90,30))
# prepare grids
lons = np.linspace(0, 2*np.pi, 100)
lats = np.linspace(-np.pi/2, np.pi/2, 100)
lons, lats = np.meshgrid(lons, lats)
# parameters for vector field
beta = 0.0
alpha = 1.0
u = -np.cos(lats)*(beta - alpha*np.cos(2.0*lons))
v = alpha*(1.0 - np.cos(lats)**2)*np.sin(2.0*lons)
speed = np.sqrt(u*u + v*v)
# compute native map projection coordinates of lat/lon grid.
x, y = map(lons*180./np.pi, lats*180./np.pi)
# contour data over the map.
cs = map.streamplot(x, y, u, v, latlon = True, color = speed, cmap=plt.cm.autumn, linewidth=0.5)
plt.show()
I can't exactly tell you what's wrong, but from the matplotlib.streamplot manual:
matplotlib.pyplot.streamplot(x, y, u, v, density=1, linewidth=None,
color=None, cmap=None, norm=None, arrowsize=1, arrowstyle=u'-|>',
minlength=0.1, transform=None, zorder=1, hold=None)
Draws streamlines of a vector flow.
x, y : 1d arrays
an evenly spaced grid.
u, v : 2d arrays
x and y-velocities. Number of rows should match length of y, and the number of columns should match x.
Additionally from matplotlib.basemap.streamplot you can read that
If latlon keyword is set to True, x,y are intrepreted as longitude and latitude in degrees.
Which corresponds to the fact that x and y should be 1D arrays (lat, lon). However in your example x and y are
>>> np.shape(x)
(100, 100)
>>> np.shape(y)
(100, 100)
Then again, you call map() "to compute native map projection coordinates of lat/lon grid", which coincidentally has the same name as your Basemap instance, map. So it depends on which one you want, because both will return a value (or, better said, both will return an error).
Additionally, check the values in your u array. Some are of order e-17, while other values are easily of order e+30. IIRC, streamlines are obtained by solving a differential equation in which the values you pass are used as parameters at the coordinates you pass. It's not hard to imagine that floating-point round-off occurs while calculating with these numbers and you suddenly start getting NaN or 0 values.
Try to scale your example better or, if you want to pursue the solution to the end, use np.seterr to get a more detailed idea of where it fails.
Sorry I couldn't be of more help.
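To illustrate the shape convention quoted from the manual, here is a sketch with plain matplotlib (not Basemap): 1D evenly spaced axes for x and y, and 2D u, v whose rows follow y and whose columns follow x. The Agg backend is only there so the example runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; drop this line for on-screen plots
import matplotlib.pyplot as plt
import numpy as np

# 1D, evenly spaced axes in degrees
lons = np.linspace(0.0, 360.0, 100)
lats = np.linspace(-90.0, 90.0, 100)

# 2D field on the grid; rows follow lats, columns follow lons
lon2d, lat2d = np.meshgrid(np.radians(lons), np.radians(lats))
alpha, beta = 1.0, 0.0
u = -np.cos(lat2d) * (beta - alpha * np.cos(2.0 * lon2d))
v = alpha * (1.0 - np.cos(lat2d) ** 2) * np.sin(2.0 * lon2d)
speed = np.sqrt(u * u + v * v)

fig, ax = plt.subplots()
strm = ax.streamplot(lons, lats, u, v, color=speed,
                     cmap=plt.cm.autumn, linewidth=0.5)
print(u.shape, len(strm.lines.get_segments()))
```

This only covers the flat lon/lat plane; whether the Basemap wrapper behaves the same way on the orthographic projection is not verified here.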
For instance, for the unit sphere, one has
x = cos(phi)sin(theta)
y = sin(phi)sin(theta)
z = cos(theta)
I would like to simply plot the set of points where phi and theta are in the intervals [0, 2*pi] and [0, pi], respectively.
Is there a way to do this in the general case, meaning specifying
x,y,z as functions of some parameters, and
The ranges of those parameters
and then getting a 3D plot of that?
I think as far as Mayavi goes, you will always be stuck with creating some grids yourself, and plotting the resulting datapoints...
This does, however, not have to be too cumbersome when using numpy :
from numpy import pi, sin, cos, mgrid
[phi,theta] = mgrid[0:2*pi:100j,0:pi:100j] # 100 is the amount of steps in the respective dimension
x = cos(phi)*sin(theta)
y = sin(phi)*sin(theta)
z = cos(theta)
from mayavi import mlab
s = mlab.mesh(x, y, z)
mlab.show()
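For the general case, the grid creation can be wrapped in a small helper. The function name parametric_surface and its arguments are made up for this sketch; it uses only NumPy and returns arrays that can be handed to mlab.mesh.

```python
import numpy as np

def parametric_surface(fx, fy, fz, u_range, v_range, n=100):
    """Evaluate x(u, v), y(u, v), z(u, v) on an n-by-n grid of parameters."""
    u, v = np.mgrid[u_range[0]:u_range[1]:n * 1j,
                    v_range[0]:v_range[1]:n * 1j]
    return fx(u, v), fy(u, v), fz(u, v)

# Unit sphere: phi in [0, 2*pi], theta in [0, pi]
x, y, z = parametric_surface(
    lambda phi, theta: np.cos(phi) * np.sin(theta),
    lambda phi, theta: np.sin(phi) * np.sin(theta),
    lambda phi, theta: np.cos(theta),
    u_range=(0, 2 * np.pi),
    v_range=(0, np.pi),
)
print(x.shape)
# The arrays can then be passed to mlab.mesh(x, y, z) as in the snippet above.
```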