Get the best linear function which approximates some dots in 3D - python

I have 4 dots which are represented with these coordinates:
X = [0.1, 0.5, 0.9, 0.18]
Y = [0.7, 0.5, 0.7, 0.3]
Z = [4.2, 3.3, 4.2, 2.5]
and I have to get the best linear function (plane) which approximates these 4 dots.
I'm aware of numpy.polyfit, but polyfit works only with x and y (2D).
What can I do?

While not completely general, if the data points can be reasonably represented as a surface relative to a coordinate plane, say z = ax + by + c, then np.linalg.lstsq can be used:
from mpl_toolkits.mplot3d import Axes3D
import matplotlib.pyplot as plt
import numpy as np
X = np.array([0.1, 0.5, 0.9, 0.18])
Y = np.array([0.7, 0.5, 0.7, 0.3])
Z = np.array([4.2, 3.3, 4.2, 2.5])
# least squares fit
A = np.vstack([X, Y, np.ones(len(X))]).T
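# rcond=None selects the current default cutoff and avoids a FutureWarning on older NumPy
a, b, c = np.linalg.lstsq(A, Z, rcond=None)[0]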
# plots
fig = plt.figure()
ax = fig.add_subplot(111, projection='3d')
# plot data as big red crosses
ax.scatter(X, Y, Z, color='r', marker='+', linewidth=10)
# plot plane fit as grid of green dots
xs = np.linspace(min(X), max(X), 10)
ys = np.linspace(min(Y), max(Y), 10)
xv, yv = np.meshgrid(xs, ys)
zv = a*xv + b*yv + c
ax.scatter(xv, yv, zv, color = 'g')
# ax.plot_wireframe(xv, yv, zv, color = 'g') # alternative fit plane plot
plt.show()
By plotting the data first, you can select a different coordinate pair for the "independent variable" plane to avoid an ill-conditioned result if necessary: if the data points appear to lie in a plane containing the z axis, use xz or yz instead.
And of course the data could be degenerate, for example points on a line or the vertices of a regular tetrahedron.
For a better "geometric fit", the first fitted plane could be used as the base for a second least squares fit of the data rotated into that coordinate system (if the data is "reasonably" plane-like).
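As an alternative to the two-pass approach described above, a "geometric" fit can also be sketched in one step by minimizing perpendicular distances with an SVD (often called a total least squares or orthogonal fit). This is a different technique from the lstsq fit above, and the function name fit_plane_orthogonal is made up for this illustration:

```python
import numpy as np

def fit_plane_orthogonal(points):
    """Fit a plane by minimizing perpendicular distances (hypothetical helper).

    Returns the centroid of the points and a unit normal vector.
    """
    pts = np.asarray(points, dtype=float)
    centroid = pts.mean(axis=0)
    # The right singular vector with the smallest singular value is the
    # direction of least variance, i.e. the plane normal.
    _, _, vt = np.linalg.svd(pts - centroid)
    normal = vt[-1]
    return centroid, normal

X = np.array([0.1, 0.5, 0.9, 0.18])
Y = np.array([0.7, 0.5, 0.7, 0.3])
Z = np.array([4.2, 3.3, 4.2, 2.5])
centroid, normal = fit_plane_orthogonal(np.column_stack([X, Y, Z]))

# Convert to z = a*x + b*y + c; only valid when the normal's
# z component is not (close to) zero.
a = -normal[0] / normal[2]
b = -normal[1] / normal[2]
c = centroid @ normal / normal[2]
print(a, b, c)
```

Because the normal is found directly, this does not depend on which coordinate plane is treated as "independent"; the conversion to z = ax + by + c is the only step that assumes the plane is not vertical.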

Related

How to find out the value y for a specific x after fitting a polynomial line using polyfit?

I have fitted a polynomial line on a graph using poly1d. How can I determine the value of y of this polynomial line for a specific value of x?
draw_polynomial = np.poly1d(np.polyfit(x, y, 8))
polyline = np.linspace(min_x, max_x, 300)
plt.plot(polyline, draw_polynomial(polyline), color='purple')
plt.show()
Here, I want to find out the y if x = 6.
You can directly call the fitted result p (draw_polynomial in your case) to get the y value. For example, with x_val = 3.5, y_val_interp = round(p(x_val), 2) gives a y value of -0.36 in the code example below. I also added some annotations to visualize the result better.
import numpy as np
import numpy.polynomial.polynomial as npp
import matplotlib.pyplot as plt
# Since numpy version 1.4, the new polynomial API
# defined in numpy.polynomial is preferred.
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([0.0, 0.8, 0.9, 0.1, -0.8, -1.0])
z = npp.polyfit(x, y, 4)
p = np.poly1d(np.flip(z))
xp = np.linspace(-2, 6, 100)
plt.plot(x, y, '.', markersize=12, zorder=2.01)
plt.plot(xp, p(xp), '-')
plt.xlim(-1, 6)
plt.ylim(-1.5, 1)
# interpolating the y value at a given x value
x_val = 3.5
y_val_interp = round(p(x_val), 2)
# add dashed lines
plt.plot([x_val, xp[0]], [y_val_interp, y_val_interp], '--', color='k')
plt.plot([x_val, x_val], [p(xp[0]), y_val_interp], '--', color='k')
# add annotation and marker
plt.annotate(f'(x={x_val}, y={y_val_interp})', (x_val, y_val_interp), size=12, xytext=(x_val * 1.05, y_val_interp))
plt.plot(x_val, y_val_interp, 'o', color='r', zorder=2.01)
print(f'x = {x_val}, y = {y_val_interp}')
plt.tight_layout()
plt.show()
References:
https://numpy.org/doc/stable/reference/generated/numpy.polyfit.html
https://numpy.org/doc/stable/reference/generated/numpy.polynomial.polynomial.Polynomial.fit.html#numpy.polynomial.polynomial.Polynomial.fit
https://numpy.org/doc/stable/reference/generated/numpy.poly1d.html
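The Polynomial.fit method linked in the references can also be called directly, without going through poly1d. A minimal sketch, assuming the same sample data as in the answer above:

```python
import numpy as np
from numpy.polynomial import Polynomial

x = np.array([0.0, 1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([0.0, 0.8, 0.9, 0.1, -0.8, -1.0])

# Polynomial.fit returns a callable Polynomial object; it rescales the
# x domain internally, and calling p(...) takes that into account.
p = Polynomial.fit(x, y, deg=4)

x_val = 3.5
y_val = p(x_val)
print(f'x = {x_val}, y = {y_val:.2f}')
```

Note that p.coef holds coefficients for the rescaled domain; p.convert().coef gives them in terms of the original x, lowest degree first.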

How do I smooth out the edges of a closed line similar to d3's curveCardinal method implementation?

I have a few data points that I am connecting using a closed line plot, and I want the line to have smooth edges similar to how the curveCardinal methods in d3 do it.
Here's a minimal example of what I want to do:
import numpy as np
from matplotlib import pyplot as plt
x = np.array([0.5, 0.13, 0.4, 0.5, 0.6, 0.7, 0.5])
y = np.array([1.0, 0.7, 0.5, 0.2, 0.4, 0.6, 1.0])
fig, ax = plt.subplots()
ax.plot(x, y)
ax.scatter(x, y)
Now, I'd like to smooth out/interpolate the line similar to d3's curveCardinal methods. Here are a few things that I've tried.
from scipy import interpolate
tck, u = interpolate.splprep([x, y], s=0, per=True)
xi, yi = interpolate.splev(np.linspace(0, 1, 100), tck)
fig, ax = plt.subplots(1, 1)
ax.plot(xi, yi, '-b')
ax.plot(x, y, 'k')
ax.scatter(x[:2], y[:2], s=200)
ax.scatter(x, y)
The result of the above code is not bad, but I was hoping that the curve would stay closer to the line when the data points are far apart (I increased the size of two such data points above to highlight this). Essentially, have the curve stay close to the line.
Using interp1d (has the same problem as the code above):
from scipy.interpolate import interp1d
x = [0.5, 0.13, 0.4, 0.5, 0.6, 0.7, 0.5]
y = [1.0, 0.7, 0.5, 0.2, 0.4, 0.6, 1.0]
orig_len = len(x)
x = x[-3:-1] + x + x[1:3]
y = y[-3:-1] + y + y[1:3]
t = np.arange(len(x))
ti = np.linspace(2, orig_len + 1, 10 * orig_len)
kind='cubic'
xi = interp1d(t, x, kind=kind)(ti)
yi = interp1d(t, y, kind=kind)(ti)
fig, ax = plt.subplots()
ax.plot(xi, yi, 'g')
ax.plot(x, y, 'k')
ax.scatter(x, y)
I also looked at Chaikin's corner cutting algorithm, but I don't like the result.
def chaikins_corner_cutting(coords, refinements=5):
    coords = np.array(coords)
    for _ in range(refinements):
        L = coords.repeat(2, axis=0)
        R = np.empty_like(L)
        R[0] = L[0]
        R[2::2] = L[1:-1:2]
        R[1:-1:2] = L[2::2]
        R[-1] = L[-1]
        coords = L * 0.75 + R * 0.25
    return coords
fig, ax = plt.subplots()
ax.plot(x, y, 'k', linewidth=1)
ax.plot(chaikins_corner_cutting(x, 4), chaikins_corner_cutting(y, 4))
I also, superficially, looked at Bezier curves, matplotlib's PathPatch, and fancy box implementations, but I couldn't get any satisfactory results.
Suggestions are greatly appreciated.
So, here's how I ended up doing it. I decided to introduce new points between every two existing data points. The following image shows how I am adding these new points. Red points are the data that I have. Using a convex hull, I calculate the geometric center of the data points and draw lines to it from each point (shown with blue lines). Each of these lines is halved twice, and the resulting points are connected (green line). The midpoint of the green line is the new point added.
Here are the functions that accomplish this:
def midpoint(p1, p2, sf=1):
    """Calculate the midpoint, with an optional
    scaling-factor (sf)"""
    xm = ((p1[0]+p2[0])/2) * sf
    ym = ((p1[1]+p2[1])/2) * sf
    return (xm, ym)
def star_curv(old_x, old_y):
    """ Interpolates every point by a star-shaped curve. It does so by adding
    "fake" data points in-between every two data points, and pushes these "fake"
    points towards the center of the graph (roughly 1/4 of the way).
    """
    try:
        # one (x, y) row per point; reshape(7, 2) would mix x and y values
        # and only works for exactly 7 points
        points = np.column_stack([old_x, old_y])
        hull = ConvexHull(points)
        x_mid = np.mean(hull.points[hull.vertices,0])
        y_mid = np.mean(hull.points[hull.vertices,1])
    except Exception:
        # fall back to the center of the unit square if the hull fails
        x_mid = 0.5
        y_mid = 0.5
    c=1
    x, y = [], []
    for i, j in zip(old_x, old_y):
        x.append(i)
        y.append(j)
        try:
            xm_i, ym_i = midpoint((i, j),
                                  midpoint((i, j), (x_mid, y_mid)))
            xm_j, ym_j = midpoint((old_x[c], old_y[c]),
                                  midpoint((old_x[c], old_y[c]), (x_mid, y_mid)))
            xm, ym = midpoint((xm_i, ym_i), (xm_j, ym_j))
            x.append(xm)
            y.append(ym)
            c += 1
        except IndexError:
            break
    # pad both series so the interpolation "wraps around" the closed curve
    orig_len = len(x)
    x = x[-3:-1] + x + x[1:3]
    y = y[-3:-1] + y + y[1:3]
    t = np.arange(len(x))
    ti = np.linspace(2, orig_len + 1, 10 * orig_len)
    kind='quadratic'
    xi = interp1d(t, x, kind=kind)(ti)
    yi = interp1d(t, y, kind=kind)(ti)
    return xi, yi
Here's how it looks:
import numpy as np
import matplotlib.pyplot as plt
from scipy.interpolate import interp1d
from scipy.spatial import ConvexHull
x = [0.5, 0.13, 0.4, 0.5, 0.6, 0.7, 0.5]
y = [1.0, 0.7, 0.5, 0.2, 0.4, 0.6, 1.0]
xi, yi = star_curv(x, y)
fig, ax = plt.subplots()
ax.plot(xi, yi, 'g')
ax.plot(x, y, 'k', alpha=0.5)
ax.scatter(x, y, color='r')
The result is especially noticeable when the data points are more symmetric, for example the following x, y values give the results in the image below:
x = [0.5, 0.32, 0.34, 0.5, 0.66, 0.65, 0.5]
y = [0.71, 0.6, 0.41, 0.3, 0.41, 0.59, 0.71]
Comparison between the interpolation presented here, with the default interp1d interpolation.
I would create another array with the vertices extended in/out or up/down by about 5%. So if a point is lower than the average of the neighbouring points, make it a bit lower still.
Then do a linear interpolation between the new points, say 10 points/edge. Finally do a spline between the second last point per edge and the actual vertex. If you use Bezier curves, you can make the spline come in at the same angle on each side.
It's a bit of work, but of course you can use this anywhere.
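For comparison, a closed cardinal spline can be sketched directly with cubic Hermite segments. This assumes the usual cardinal-spline tangent definition, m_i = (1 - tension) * (p_{i+1} - p_{i-1}) / 2, and is not taken from d3's source; the function name and the tension and samples_per_segment parameters are made up for this illustration. Higher tension keeps the curve closer to the straight edges:

```python
import numpy as np

def closed_cardinal_spline(x, y, tension=0.5, samples_per_segment=20):
    """Sample a closed cardinal spline through the points (hypothetical helper)."""
    pts = np.column_stack([x, y]).astype(float)
    # Drop a repeated closing point, the curve is closed implicitly.
    if np.allclose(pts[0], pts[-1]):
        pts = pts[:-1]
    prev_pts = np.roll(pts, 1, axis=0)    # p_{i-1}, wrapping around
    next_pts = np.roll(pts, -1, axis=0)   # p_{i+1}, wrapping around
    tangents = (1.0 - tension) * (next_pts - prev_pts) / 2.0

    # Cubic Hermite basis functions on s in [0, 1).
    s = np.linspace(0.0, 1.0, samples_per_segment, endpoint=False)[:, None]
    h00 = 2 * s**3 - 3 * s**2 + 1
    h10 = s**3 - 2 * s**2 + s
    h01 = -2 * s**3 + 3 * s**2
    h11 = s**3 - s**2

    segments = []
    n = len(pts)
    for i in range(n):
        j = (i + 1) % n
        seg = h00 * pts[i] + h10 * tangents[i] + h01 * pts[j] + h11 * tangents[j]
        segments.append(seg)
    # Append the first point again so the sampled curve is closed.
    curve = np.vstack(segments + [pts[:1]])
    return curve[:, 0], curve[:, 1]

x = [0.5, 0.13, 0.4, 0.5, 0.6, 0.7, 0.5]
y = [1.0, 0.7, 0.5, 0.2, 0.4, 0.6, 1.0]
xi, yi = closed_cardinal_spline(x, y, tension=0.5)
```

The result can be drawn with ax.plot(xi, yi). With tension=1 the tangents vanish and every sample lies on the straight edges of the polygon; with tension=0 this reduces to a uniform Catmull-Rom spline.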

Drawing heat map in python

I have two lists x, y representing coordinates in 2D. For example x = [1,4,0.5,2,5,10,33,0.04] and y = [2,5,44,0.33,2,14,20,0.03]. x[i] and y[i] represent one point in 2D. Now I also have a list representing "heat" values for each (x,y) point, for example z = [0.77, 0.88, 0.65, 0.55, 0.89, 0.9, 0.8,0.95]. Of course, the real x, y and z are much longer than in this example.
Now I would like to plot a heat map in 2D where x and y represents the axis coordinates and z represents the color. How can this be done in python?
This code produces a heat map. With a few more data points, the plot starts looking pretty nice and I've found it to be very quick in general even for >100k points.
import matplotlib.pyplot as plt
import matplotlib.tri as tri
import numpy as np
import math
x = [1,4,0.5,2,5,10,33,0.04]
y = [2,5,44,0.33,2,14,20,0.03]
z = [0.77, 0.88, 0.65, 0.55, 0.89, 0.9, 0.8, 0.95]
levels = [0.7, 0.75, 0.8, 0.85, 0.9]
plt.figure()
ax = plt.gca()
ax.set_aspect('equal')
CS = ax.tricontourf(x, y, z, levels, cmap=plt.get_cmap('jet'))
cbar = plt.colorbar(CS, ticks=np.sort(np.array(levels)),ax=ax, orientation='horizontal', shrink=.75, pad=.09, aspect=40,fraction=0.05)
cbar.ax.set_xticklabels(list(map(str,np.sort(np.array(levels))))) # horizontal colorbar
cbar.ax.tick_params(labelsize=8)
plt.title('Heat Map')
plt.xlabel('X Label')
plt.ylabel('Y Label')
plt.show()
Produces this image:
or if you're looking for a more gradual color change, change the tricontourf line to this:
CS = ax.tricontourf(x, y, z, np.linspace(min(levels),max(levels),256), cmap=plt.get_cmap('jet'))
and then the plot will change to:
Based on this answer, you might want to do something like:
import numpy as np
# matplotlib.mlab.griddata was removed in Matplotlib 3.1;
# scipy.interpolate.griddata is used here instead.
from scipy.interpolate import griddata
import matplotlib.pyplot as plt
xs0 = [1,4,0.5,2,5,10,33,0.04]
ys0 = [2,5,44,0.33,2,14,20,0.03]
zs0 = [0.77, 0.88, 0.65, 0.55, 0.89, 0.9, 0.8,0.95]
N = 30j
extent = (np.min(xs0),np.max(xs0),np.min(ys0),np.max(ys0))
xs,ys = np.mgrid[extent[0]:extent[1]:N, extent[2]:extent[3]:N]
# linear interpolation of the scattered samples onto the regular grid
resampled = griddata((xs0, ys0), zs0, (xs, ys), method='linear')
plt.imshow(np.fliplr(resampled).T, extent=extent,interpolation='none')
plt.colorbar()
The example here might also help: http://matplotlib.org/examples/pylab_examples/griddata_demo.html

Smoothing a 2-D figure

I have a number of vaguely rectangular 2D figures that need to be smoothed. A simplified example:
import matplotlib.pyplot as plt
fig, ax1 = plt.subplots(1,1, figsize=(3,3))
xs1 = [-0.25, -0.625, -0.125, -1.25, -1.125, -1.25, 0.875, 1.0, 1.0, 0.5, 1.0, 0.625, -0.25]
ys1 = [1.25, 1.375, 1.5, 1.625, 1.75, 1.875, 1.875, 1.75, 1.625, 1.5, 1.375, 1.25, 1.25]
ax1.plot(xs1, ys1)
ax1.set_ylim(0.5,2.5)
ax1.set_xlim(-2,2)
I have tried scipy.interpolate.RectBivariateSpline but that apparently wants data at all the points (e.g. for a heat map), and scipy.interpolate.interp1d but that, reasonably enough, wants to generate a 1d smoothed version.
What is an appropriate method to smooth this?
Edit to revise/explain my goal a little better. I don't need the lines to go through all the points; in fact I'd prefer that they not go through all the points, because there are clear outlier points that "should" be averaged with neighbors, or some similar approach. I've included a crude manual sketch of the start of what I have in mind above.
Chaikin's corner cutting algorithm might be the ideal approach for you. For a given polygon with vertices as P0, P1, ...P(N-1), the corner cutting algorithm will generate 2 new vertices for each line segment defined by P(i) and P(i+1) as
Q(i) = (3/4)P(i) + (1/4)P(i+1)
R(i) = (1/4)P(i) + (3/4)P(i+1)
So, your new polygon will have 2N vertices. You can then apply the corner cutting on the new polygon again and repeatedly until the desired resolution is reached. The result will be a polygon with many vertices but they will look smooth when displayed. It can be proved that the resulting curve produced from this corner cutting approach will converge into a quadratic B-spline curve. The advantage of this approach is that the resulting curve will never over-shoot. The following pictures will give you a better idea about this algorithm (pictures taken from this link)
Original Polygon
Apply corner cutting once
Apply corner cutting one more time
See this link for more details for Chaikin's corner cutting algorithm.
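The Q(i) and R(i) rules above can be sketched for a closed polygon as follows. The function name chaikin_closed and its iterations parameter are made up for this illustration; the polygon is treated as closed, so the last vertex connects back to the first:

```python
import numpy as np

def chaikin_closed(points, iterations=3):
    """Chaikin corner cutting on a closed polygon (hypothetical helper).

    points: sequence of (x, y) vertices; a repeated closing vertex is dropped.
    Returns an array with 2**iterations times as many vertices.
    """
    pts = np.asarray(points, dtype=float)
    if np.allclose(pts[0], pts[-1]):
        pts = pts[:-1]
    for _ in range(iterations):
        nxt = np.roll(pts, -1, axis=0)   # P(i+1), wrapping around
        q = 0.75 * pts + 0.25 * nxt      # Q(i) = 3/4 P(i) + 1/4 P(i+1)
        r = 0.25 * pts + 0.75 * nxt      # R(i) = 1/4 P(i) + 3/4 P(i+1)
        new = np.empty((2 * len(pts), pts.shape[1]))
        new[0::2] = q
        new[1::2] = r
        pts = new
    return pts

xs1 = [-0.25, -0.625, -0.125, -1.25, -1.125, -1.25, 0.875, 1.0, 1.0, 0.5, 1.0, 0.625, -0.25]
ys1 = [1.25, 1.375, 1.5, 1.625, 1.75, 1.875, 1.875, 1.75, 1.625, 1.5, 1.375, 1.25, 1.25]
smooth = chaikin_closed(np.column_stack([xs1, ys1]), iterations=3)
```

To draw the result as a closed outline, append the first row again before plotting, for example with np.vstack([smooth, smooth[:1]]). Every new vertex is a convex combination of two neighbouring vertices, which is why the curve cannot overshoot the original polygon.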
Actually, you can use scipy.interpolate.interp1d for this. You need to treat both the x and y components of your polygon as separate series.
As a quick example with a square:
import numpy as np
import matplotlib.pyplot as plt
from scipy.interpolate import interp1d
x = [0, 1, 1, 0, 0]
y = [0, 0, 1, 1, 0]
t = np.arange(len(x))
ti = np.linspace(0, t.max(), 10 * t.size)
xi = interp1d(t, x, kind='cubic')(ti)
yi = interp1d(t, y, kind='cubic')(ti)
fig, ax = plt.subplots()
ax.plot(xi, yi)
ax.plot(x, y)
ax.margins(0.05)
plt.show()
However, as you can see, this results in some issues at 0,0.
This is happening because a spline segment depends on more than just two points. The first and last points aren't "connected" in the way we interpolated. We can fix this by "padding" the x and y sequences with the second-to-last and second points so that there are "wrap-around" boundary conditions for the spline at the endpoints.
import numpy as np
import matplotlib.pyplot as plt
from scipy.interpolate import interp1d
x = [0, 1, 1, 0, 0]
y = [0, 0, 1, 1, 0]
# Pad the x and y series so it "wraps around".
# Note that if x and y are numpy arrays, you'll need to
# use np.r_ or np.concatenate instead of addition!
orig_len = len(x)
x = x[-3:-1] + x + x[1:3]
y = y[-3:-1] + y + y[1:3]
t = np.arange(len(x))
ti = np.linspace(2, orig_len + 1, 10 * orig_len)
xi = interp1d(t, x, kind='cubic')(ti)
yi = interp1d(t, y, kind='cubic')(ti)
fig, ax = plt.subplots()
ax.plot(xi, yi)
ax.plot(x, y)
ax.margins(0.05)
plt.show()
And just to show what it looks like with your data:
import numpy as np
import matplotlib.pyplot as plt
from scipy.interpolate import interp1d
x = [-0.25, -0.625, -0.125, -1.25, -1.125, -1.25,
0.875, 1.0, 1.0, 0.5, 1.0, 0.625, -0.25]
y = [1.25, 1.375, 1.5, 1.625, 1.75, 1.875, 1.875,
1.75, 1.625, 1.5, 1.375, 1.25, 1.25]
# Pad the x and y series so it "wraps around".
# Note that if x and y are numpy arrays, you'll need to
# use np.r_ or np.concatenate instead of addition!
orig_len = len(x)
x = x[-3:-1] + x + x[1:3]
y = y[-3:-1] + y + y[1:3]
t = np.arange(len(x))
ti = np.linspace(2, orig_len + 1, 10 * orig_len)
xi = interp1d(t, x, kind='cubic')(ti)
yi = interp1d(t, y, kind='cubic')(ti)
fig, ax = plt.subplots()
ax.plot(xi, yi)
ax.plot(x, y)
ax.margins(0.05)
plt.show()
Note that you get quite a bit of "overshoot" with this method. That's due to the cubic spline interpolation. @pythonstarter's suggestion is another good way to handle it, but Bezier curves will suffer from the same problem (they're basically equivalent mathematically, it's just a matter of how the control points are defined). There are a number of other ways to handle the smoothing, including methods that are specialized for smoothing a polygon (e.g. Polynomial Approximation with Exponential Kernel (PAEK)). I've never tried to implement PAEK, so I'm not sure how involved it is. If you need to do this more robustly, you might try looking into PAEK or another similar method.
This is more of a comment than an answer, but maybe you can try to define this polygon as a Bezier curve. The code is rather simple, and I'm sure that you are familiar with how these curves work. In that case your polygon would be the control polygon of the curve. It's not perfect, though: firstly, the result is not really a "smoothed version" of the polygon but a curve, and secondly, the higher the degree of the curve, the less it looks like its control polygon. What I want to say is that maybe you should try solving this problem with math tools instead of trying to smooth the polygon with programming tricks.

axis limits for scatter plot not holding in matplotlib

I am trying to overlay a scatter plot onto a contour plot using matplotlib, which contains
plt.contourf(X, Y, XYprof.T, self.nLevels, extent=extentYPY,
             origin='lower')
if self.doScatter == True and len(xyScatter['y']) != 0:
    plt.scatter(xyScatter['x'], xyScatter['y'],
                s=dSize, c=myColor, marker='.', edgecolor='none')
plt.xlim(-xLimHist, xLimHist)
plt.ylim(-yLimHist, yLimHist)
plt.xlabel(r'$x$')
plt.ylabel(r'$y$')
What ends up happening is the resulting plots extend to include all of the scatter points, which can exceed the limits for the contour plot. Is there any way to get around this?
I used the following example to try and replicate your problem. If left to default, the range for x and y was -3 to 3. I input the xlim and ylim so the range for both was -2 to 2. It worked.
import numpy as np
import matplotlib.pyplot as plt
from pylab import *
# the random data
x = np.random.randn(1000)
y = np.random.randn(1000)
fig = plt.figure(1, figsize=(5.5,5.5))
X, Y = meshgrid(x, y)
Z1 = bivariate_normal(X, Y, 1.0, 1.0, 0.0, 0.0)
Z2 = bivariate_normal(X, Y, 1.5, 0.5, 1, 1)
Z = 10 * (Z1 - Z2)
origin = 'lower'
CS = contourf(x, y, Z, 10, # [-1, -0.1, 0, 0.1],
cmap=cm.bone,
origin=origin)
title('Nonsense')
xlabel('x-stuff')
ylabel('y-stuff')
# the scatter plot:
axScatter = plt.subplot(111)
axScatter.scatter(x, y)
# set axes range
plt.xlim(-2, 2)
plt.ylim(-2, 2)
show()
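Note that bivariate_normal was removed from matplotlib.mlab in Matplotlib 3.1, so the example above may not run on current versions. A minimal sketch of the same check using the object-oriented interface is given below; the data is random and made up for this illustration. Setting the limits explicitly turns off autoscaling for that axis, so a scatter drawn afterwards does not widen them:

```python
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(0)
x = rng.normal(size=1000)
y = rng.normal(size=1000)

fig, ax = plt.subplots(figsize=(5.5, 5.5))

# Explicit limits disable autoscaling on each axis ...
ax.set_xlim(-2, 2)
ax.set_ylim(-2, 2)

# ... so points outside [-2, 2] are clipped instead of stretching the axes.
ax.scatter(x, y, s=4)

print(ax.get_xlim(), ax.get_ylim())
```

Call plt.show() afterwards to display the figure. If the limits still change in your own code, it is worth checking whether something later re-enables autoscaling, for example a call to ax.autoscale() or ax.relim() followed by ax.autoscale_view().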
