I am trying to estimate/interpolate a curve from noisy data like the circle in the example. My data consists of more than circles but this should be a good starting point for solving the other structures as well.
I have a noisy binary image and I am trying to fit a continuous curve/skeleton to it (each pixel has 2 neighbours, except maybe start and end pixel, if the shape is not circular).
I had some success fitting the x,y coordinates separately, using the distance to a starting point as x values and the coordinates as y value and then interpolating distances in small steps. Then I checked if the coordinates were all connected. In some extreme cases the new interpolated points are not connected and I have to use smaller steps for the interpolation. This often also leads to pixels with more than 2 neighbours and other weird artifacts.
Is there an easier way to fit these values to a curve and to get a continuous curve as a result?
import numpy as np
from skimage import draw
from matplotlib import pyplot as plt
image = np.zeros((200,200), dtype=np.uint8)
coords = np.array(draw.circle_perimeter(100,100,50))
noise = np.random.normal(0,2,coords.shape).astype(np.int64)
coords += noise
image[coords[0], coords[1]] = 1
plt.imshow(image, cmap="gray")
plt.show()
To fit data, you need a model. There are any number of ways of fitting a circle. The one I've had the most success with is Ian Coope's linearized solution. The paper is available here: https://ir.canterbury.ac.nz/handle/10092/11104
I've made a Python implementation of it in a linearized fitting library called scikit-guess. The function is skg.nsphere_fit. Given your (2, n) array coords, you would use it like this:
from skg import nsphere_fit
radius, center = nsphere_fit(coords, axis=0)
To plot over your image, you can use matplotlib.patches.Circle:
from matplotlib.patches import Circle
fig, ax = plt.subplots()
ax.imshow(image, cmap='gray')
ax.add_patch(Circle(center[::-1], radius, edgecolor='red', facecolor='none'))
You need to reverse center because your input coordinates are (row, col), while Circle expects (x, y), which is (col, row).
To fit a different model, you would need a different method. For arbitrary models, you might want to look into scipy.optimize and lmfit.
Fitting a circle to noisy data is very simple:
This method comes from https://fr.scribd.com/doc/14819165/Regressions-coniques-quadriques-circulaire-spherique
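As a minimal sketch of how simple such a fit can be, here is one common linear least-squares formulation (the algebraic form x² + y² = 2·cx·x + 2·cy·y + c). It is not necessarily the exact method of the linked document, and the name fit_circle_lstsq is made up for illustration.

```python
import numpy as np

def fit_circle_lstsq(x, y):
    """Algebraic least-squares circle fit; returns (cx, cy, r)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    # Linear model: x^2 + y^2 = 2*cx*x + 2*cy*y + c, with c = r^2 - cx^2 - cy^2
    A = np.column_stack([2 * x, 2 * y, np.ones_like(x)])
    b = x**2 + y**2
    (cx, cy, c), *_ = np.linalg.lstsq(A, b, rcond=None)
    r = np.sqrt(c + cx**2 + cy**2)
    return cx, cy, r

# Synthetic noisy circle, similar to the one in the question
rng = np.random.default_rng(0)
t = rng.uniform(0, 2 * np.pi, 500)
x = 100 + 50 * np.cos(t) + rng.normal(0, 2, t.size)
y = 100 + 50 * np.sin(t) + rng.normal(0, 2, t.size)
cx, cy, r = fit_circle_lstsq(x, y)
print(cx, cy, r)
```

Because the problem is linear in cx, cy and c, no starting guess or iteration is needed.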
Using matplotlib (or anything else, if something better exists), I want to populate a scatterplot by using a greyscale image as its distribution. I have found many resources for creating heat maps from images, but not the other way around.
The input image will be like this one.
I think I understand what you're going for, but I'm not certain. I also don't really understand what this would be used for so I'm extra uncertain about this answer, but here goes:
So by loading the image we can evaluate each pixel position and its intensity. We can use that intensity as a "fitness" value and probabilistically add it to our plot so that we can get some of that "density" of points that you want to see. I picked a really simple equation as a decider (I just cubed the value), but feel free to replace that with whatever you want.
import cv2
import numpy as np
import matplotlib
import matplotlib.pyplot as plt
import random
# select func
def selection(value):
    # cast to int first: cubing a uint8 pixel value would overflow
    return int(value)**3 >= random.randint(0, 255**3)
# populate the sample
def populate(img):
    # get res
    h, w = img.shape
    # go through and populate
    sx = []
    sy = []
    for y in range(0, h):
        for x in range(0, w):
            val = img[y, x]
            # use intensity to decide if it gets in
            # replace with what you want this function to look like
            if selection(val):
                sx.append(x)
                sy.append(h - y)  # opencv is top-left origin
    return sx, sy
# I'm using opencv to pull the image into code, use whatever you like
# matplotlib can also do something similar, but I'm not familiar with its format
img = cv2.imread("circ.png")
img = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
# lets take a sample
sx, sy = populate(img)
# find the bigger square size
h, w = img.shape
side = max(h, w)
# make a square graph
fig, ax = plt.subplots()
ax.scatter(sx, sy, s=4)
ax.set_xlim((0, side))
ax.set_ylim((0, side))
x0, x1 = ax.get_xlim()
y0, y1 = ax.get_ylim()
ax.set_aspect(abs(x1 - x0) / abs(y1 - y0))
fig.savefig("out.png", dpi=600)
plt.show()
Feel free to replace opencv with whatever image library you're comfortable with. I'm pretty sure matplotlib can open images as well, but openCV is what I'm most familiar with so I used that.
As far as I can tell, you're trying to generate random coordinates that follow a distribution described by a grayscale image: the brighter each point, the more likely that point's coordinates will be generated. Your problem can thus be solved by a rejection sampler, as follows.
Assume you know the width and height of the image in pixels, call them w and h.
1. Generate two random numbers: one in the interval [0, w) and one in the interval [0, h). These are the x and y coordinates, respectively.
2. Get the pixel at the given coordinates x and y in the image. This can be done using interpolation, but describing interpolation techniques is beyond the scope of this answer. For this reason, we will use only the nearest pixel ("nearest neighbor") in the image: take the pixel at coordinate floor(x) and floor(y) (and step 1 devolves to generating random integers). Convert the pixel somehow to a number p in the interval [0, 1]; in this answer we will assume black is 0 and white is 1, to simplify matters.
3. With probability p, return the point (x, y). Otherwise, go to step 1.
Roughly speaking, the time complexity of this algorithm depends on the numbers of "bright points" the input image has, compared to the number of "dark points". In general, the "brighter" the image, the higher the acceptance rate (and the faster the algorithm runs).
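The steps above can be sketched in a few lines of NumPy. This is only an illustration under stated assumptions: the image is already a 2-D array scaled to [0, 1], and the name sample_points is hypothetical. An all-black image would never accept a point, so the loop would not terminate in that case.

```python
import numpy as np

def sample_points(img, n, rng=None):
    """Draw n (x, y) points with density proportional to pixel brightness.

    img is a 2-D array scaled to [0, 1] (0 = black, 1 = white).
    """
    rng = np.random.default_rng() if rng is None else rng
    h, w = img.shape
    points = []
    while len(points) < n:
        # Step 1: uniform candidate inside the image
        x = rng.uniform(0, w)
        y = rng.uniform(0, h)
        # Step 2: nearest-neighbour lookup (clamped to guard against rounding)
        p = img[min(int(y), h - 1), min(int(x), w - 1)]
        # Step 3: accept the candidate with probability p
        if rng.random() < p:
            points.append((x, y))
    return np.array(points)

# Demo: left half black, right half white
img = np.zeros((100, 100))
img[:, 50:] = 1.0
pts = sample_points(img, 200, np.random.default_rng(0))
```

In the demo every accepted point falls in the white half, since candidates over black pixels are accepted with probability 0.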
I'm trying to visualise a 2D plane cutting through a 3D graph with Numpy and Matplotlib to explain the intuition of partial derivatives.
Specifically, the function I'm using is J(θ1,θ2) = θ1^2 + θ2^2, and I want to plot a θ1-J(θ1,θ2) plane at θ2=0.
I have managed to plot a 2D plane with the code below, but the superposition of the 2D plane and the 3D graph isn't quite right: the 2D plane is slightly off, whereas I want it to look like it's cutting the 3D surface at θ2=0.
It would be great if I can borrow your expertise on this, thanks.
import numpy as np
import matplotlib.pyplot as plt
def f(theta1, theta2):
    return theta1**2 + theta2**2
fig, ax = plt.subplots(figsize=(6, 6),
                       subplot_kw={'projection': '3d'})
x,z = np.meshgrid(np.linspace(-1,1,100), np.linspace(0,2,100))
X = x.T
Z = z.T
Y = 0 * np.ones((100, 100))
ax.plot_surface(X, Y, Z)
r = np.linspace(-1,1,100)
theta1_grid, theta2_grid = np.meshgrid(r,r)
J_grid = f(theta1_grid, theta2_grid)
ax.contour3D(theta1_grid,theta2_grid,J_grid,500,cmap='binary')
ax.set_xlabel(r'$\theta_1$',fontsize='large')
ax.set_ylabel(r'$\theta_2$',fontsize='large')
ax.set_zlabel(r'$J(\theta_1,\theta_2)$',fontsize='large')
ax.set_title(r'Fig.2 $J(\theta_1,\theta_2)=(\theta_1^2+\theta_2^2)$',fontsize='x-large')
plt.tight_layout()
plt.show()
This is the image output by the code:
As @ImportanceOfBeingErnest noted in a comment, your code is fine but matplotlib has a 2d engine, so 3d plots easily show weird artifacts. In particular, objects are rendered one at a time, so two 3d objects are typically either fully in front of or fully behind one another, which makes the visualization of interlocking 3d objects near impossible using matplotlib.
My personal alternative suggestion would be mayavi (incredible flexibility and visualizations, pretty steep learning curve), however I would like to show a trick with which the problem can often be removed altogether. The idea is to turn your two independent objects into a single one using an invisible bridge between your surfaces. Possible downsides of the approach are that
you need to plot both surfaces as surfaces rather than a contour3D, and
the output relies heavily on transparency, so you need a backend that can handle that.
Disclaimer: I learned this trick from a contributor to the matplotlib topic of the now-defunct Stack Overflow Documentation project, but unfortunately I don't remember who that user was.
In order to use this trick for your use case, we essentially have to turn that contour3D call into another plot_surface one. I don't think this is overall that bad; you perhaps need to reconsider the density of your cutting plane if you see that the resulting figure has too many faces for interactive use. We also have to explicitly define a point-by-point colormap, the alpha channel of which contributes the transparent bridge between your two surfaces. Since we need to stitch the two surfaces together, at least one "in-plane" dimension of the surfaces has to match; in this case I made sure that the points along "y" are the same in the two cases.
import numpy as np
import matplotlib.pyplot as plt
from mpl_toolkits.mplot3d import Axes3D
def f(theta1, theta2):
    return theta1**2 + theta2**2
fig, ax = plt.subplots(figsize=(6, 6),
                       subplot_kw={'projection': '3d'})
# plane data: X, Y, Z, C (first three shaped (nx,ny), last one shaped (nx,ny,4))
x,z = np.meshgrid(np.linspace(-1,1,100), np.linspace(0,2,100)) # <-- you can probably reduce these sizes
X = x.T
Z = z.T
Y = 0 * np.ones((100, 100))
# colormap for the plane: need shape (nx,ny,4) for RGBA values
C = np.full(X.shape + (4,), [0,0,0.5,1]) # dark blue plane, fully opaque
# surface data: theta1_grid, theta2_grid, J_grid, CJ (shaped (nx',ny) or (nx',ny,4))
r = np.linspace(-1,1,X.shape[1]) # <-- we are going to stitch the surface along the y dimension, sizes have to match
theta1_grid, theta2_grid = np.meshgrid(r,r)
J_grid = f(theta1_grid, theta2_grid)
# colormap for the surface; normalize data to [0, 1] before applying the colormap
# (np.ptp is used because the ndarray.ptp method was removed in NumPy 2.0)
CJ = plt.get_cmap('binary')((J_grid - J_grid.min())/np.ptp(J_grid))
# construct a common dataset with an invisible bridge, shape (2,ny) or (2,ny,4)
X_bridge = np.vstack([X[-1,:],theta1_grid[0,:]])
Y_bridge = np.vstack([Y[-1,:],theta2_grid[0,:]])
Z_bridge = np.vstack([Z[-1,:],J_grid[0,:]])
C_bridge = np.full(Z_bridge.shape + (4,), [1,1,1,0]) # 0 opacity == transparent; probably needs a backend that supports transparency!
# join the datasets
X_surf = np.vstack([X,X_bridge,theta1_grid])
Y_surf = np.vstack([Y,Y_bridge,theta2_grid])
Z_surf = np.vstack([Z,Z_bridge,J_grid])
C_surf = np.vstack([C,C_bridge,CJ])
# plot the joint datasets as a single surface, pass colors explicitly, set strides to 1
ax.plot_surface(X_surf, Y_surf, Z_surf, facecolors=C_surf, rstride=1, cstride=1)
ax.set_xlabel(r'$\theta_1$',fontsize='large')
ax.set_ylabel(r'$\theta_2$',fontsize='large')
ax.set_zlabel(r'$J(\theta_1,\theta_2)$',fontsize='large')
ax.set_title(r'Fig.2 $J(\theta_1,\theta_2)=(\theta_1^2+\theta_2^2)$',fontsize='x-large')
plt.tight_layout()
plt.show()
The result from two angles:
As you can see, the result is pretty decent. You can start playing around with the individual transparencies of your surfaces to see if you can make that cross-section more visible. You can also switch the opacity of the bridge to 1 to see how your surfaces are actually stitched together. All in all what we had to do was take your existing data, make sure their sizes match, and define explicit colormaps and the auxiliary bridge between the surfaces.
I have several 2d sets of scattered data that I would like to find the edges of. Some edges may be open lines, others may be polygons.
For example, here is one plot that has an open edge that I would like to be able to keep. I would actually like to create a polygon from the open edges so I can use point_in_poly to check if another point lies inside. The points that would close the polygon are the boundaries of my plot area, btw.
Any ideas on where to get started?
EDIT:
Here is what I have already tried:
KernelDensity from sklearn. The point density at the edges varies too much for the edges to be clearly distinguishable from the bulk of the points.
kde = KernelDensity()
kde.fit(my_data)
dens = np.exp(kde.score_samples(ds))
dmax = dens.max()
dens_mask = (0.4 * dmax < dens) & (dens < 0.8 * dmax)
ax.scatter(ds[dens_mask, 0], ds[dens_mask, 1], ds[dens_mask, 2],
c=dens[dens_mask], depthshade=False, marker='o', edgecolors='none')
Incidentally, the 'gap' in the left side of the color plot is the same one that is in the black and white plot above. I also am pretty sure that I could be using KDE better. For example, I would like to get the density for a much smaller volume, more like using radius_neighbors from sklearn's NearestNeighbors()
ConvexHull from scipy. I tried removing points from semi-random data (for practice) while still keeping a point of interest (here, 0,0) inside the convex set. This wasn't terribly effective. I had no sophisticated way of excluding points from an iteration and only removed the ones that were used in the last convex hull. This code and the accompanying image show the first and last hull made while keeping the point of interest in the set.
hull = ConvexHull(pts)
contains = True
while contains:
    temp_pts = np.delete(pts, hull.vertices, 0)
    temp_hull = ConvexHull(temp_pts)
    tp = path.Path(np.hstack((temp_pts[temp_hull.vertices, 0][np.newaxis].T,
                              temp_pts[temp_hull.vertices, 1][np.newaxis].T)))
    if not tp.contains_point([0, 0]):
        contains = False
        hull = ConvexHull(pts)
        plt.plot(pts[hull.vertices, 0], pts[hull.vertices, 1])
    else:
        pts = temp_pts
plt.plot(pts[hull.vertices, 0], pts[hull.vertices, 1], 'r-')
plt.show()
Ideally the goal for convex hull would be to maximize the area inside the hull while keeping only the point of interest inside the set but I haven't been able to code this.
KMeans() from sklearn.cluster. Using n=3 clusters, I just ran the class with default settings and got three horizontal groups of points. I haven't learned how to train the model to recognize points that form edges.
Here is a piece of the model where the data points are coming from. The solid areas contain points while the voids do not.
Here, and here are some other questions I have asked that show some more of what I have been looking at.
So I was able to do this in a roundabout way.
I used images of slices of the model in the xy plane generated from SolidWorks to distinguish the areas of interest.
If you see them, there are points in the corners of the picture that I placed in the model for reference at known distances. These points allowed me to determine the number of pixels per millimeter. From there, I mapped the points in my analysis set to pixels and checked the color of the pixel. If the pixel is white it is masked.
def mask_z_level(xi, yi, msk_img, x0=-14.3887, y0=5.564):
    im = plt.imread(msk_img)
    msk = np.zeros(xi.shape, dtype='bool')
    # locate the red reference points placed in the corners of the picture
    pxmm = np.zeros((3, 2))
    p = 0
    for row in range(im.shape[0]):
        for col in range(im.shape[1]):
            if tuple(im[row, col]) == (1., 0., 0.):
                pxmm[p] = (row, col)
                p += 1
    # pixels per millimeter from the known reference distances
    pxx = pxmm[1, 1] / 5.5
    pxy = pxmm[2, 0] / 6.5
    print(pxx, pxy)
    # map each analysis point to a pixel and mask it if that pixel is white
    for j in range(xi.shape[1]):
        for i in range(xi.shape[0]):
            x, y = xi[i, j], yi[i, j]
            dx, dy = x - x0, y - y0
            dpx = np.round(dx * pxx).astype('int')
            dpy = -np.round(dy * pxy).astype('int')
            if tuple(im[dpy, dpx]) == (1., 1., 1.):
                msk[i, j] = True
    return msk
Here is a plot showing the effects of the masking:
I am still fine-tuning the borders, but I have a very manageable task now that the mask is largely complete. The fine-tuning is needed because some mask points are incorrect, resulting in banding.
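For larger grids, the per-point lookup can also be written without Python loops. The following is a rough sketch rather than the code I actually ran: the name mask_white_pixels and its parameters are made up for illustration, the pixels-per-millimeter scales are assumed to be known already, and points that fall outside the image are simply left unmasked (the loop version would instead wrap around on negative indices).

```python
import numpy as np

def mask_white_pixels(xi, yi, im, px_per_mm_x, px_per_mm_y, x0, y0):
    """Boolean mask that is True where (xi, yi) lands on a white pixel.

    xi, yi: coordinate arrays in mm; im: (rows, cols, 3 or 4) float image;
    (x0, y0): mm coordinates of the image's top-left pixel.
    """
    cols = np.round((xi - x0) * px_per_mm_x).astype(int)
    rows = np.round(-(yi - y0) * px_per_mm_y).astype(int)
    # Only look up points that fall inside the image
    inside = (rows >= 0) & (rows < im.shape[0]) & (cols >= 0) & (cols < im.shape[1])
    mask = np.zeros(xi.shape, dtype=bool)
    # A pixel counts as white when all of its RGB channels equal 1.0
    mask[inside] = np.all(im[rows[inside], cols[inside], :3] == 1.0, axis=-1)
    return mask

# Demo: 10x10 black image with a 3x3 white patch, 1 pixel per mm
im = np.zeros((10, 10, 3))
im[2:5, 3:6] = 1.0
xi, yi = np.meshgrid(np.arange(10.0), -np.arange(10.0))
mask = mask_white_pixels(xi, yi, im, 1.0, 1.0, x0=0.0, y0=0.0)
```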
I have a bunch of images like this one:
The corresponding data is not available. I need to automatically retrieve about 100 points (regularly x-spaced) on the blue curve. All curves are very similar, so I need at least 1 pixel precision, but sub-pixel would be preferred. The good news is all curves start from 0,0 and end at 1,1, so we may forget about the grid.
Any hints on Python libs that could help, or any other approach? Thanks!
I saved your image to a file 14154233_input.png. Then this program
import pylab as plt
import numpy as np
# Read image from disk and filter all grayscale
im = plt.imread("14154233_input.png")[:,:,:3]
im -= im.mean(axis=2).reshape(im.shape[0], im.shape[1], 1).repeat(3,axis=2)
im_maxnorm = im.max(axis=2)
# Find y-position of remaining line
ypos = np.ones((im.shape[1])) * np.nan
for i in range(im_maxnorm.shape[1]):
    # skip columns that contain no colored (non-gray) pixels
    if im_maxnorm[:,i].max()<0.01:
        continue
    ypos[i] = np.argmax(im_maxnorm[:,i])
# Pick only values that are set
ys = 1-ypos[np.isfinite(ypos)]
# Normalize to 0,1
ys -= ys.min()
ys /= ys.max()
# Create x values
xs = np.linspace(0,1,ys.shape[0])
# Create plot of both
# read and filtered image and
# data extracted
plt.figure(figsize=(4,8))
plt.subplot(211)
plt.imshow(im_maxnorm)
plt.subplot(212, aspect="equal")
plt.plot(xs,ys)
plt.show()
Produces this plot:
You can then do with xs and ys whatever you want. Maybe you should put this code in a function that returns xs and ys or so.
One could improve the precision by fitting a Gaussian to each column or so. If you really need it, tell me.
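A cheaper alternative to a full Gaussian fit is an intensity-weighted centroid per column. The sketch below assumes the background is (close to) zero after the grayscale filtering, as it is for im_maxnorm; the function name column_centroids and the synthetic test data are made up for illustration.

```python
import numpy as np

def column_centroids(weights, threshold=0.01):
    """Intensity-weighted mean row index for each column.

    weights: 2-D non-negative array (rows, cols), e.g. im_maxnorm.
    Columns whose maximum is below threshold are returned as NaN.
    """
    w = np.asarray(weights, dtype=float)
    rows = np.arange(w.shape[0])[:, None]
    total = w.sum(axis=0)
    valid = w.max(axis=0) >= threshold
    ypos = np.full(w.shape[1], np.nan)
    ypos[valid] = (rows * w).sum(axis=0)[valid] / total[valid]
    return ypos

# Synthetic check: a blurred line whose true row position is known
rows = np.arange(40)[:, None]
true_pos = 10 + 0.3 * np.arange(30)
demo = np.exp(-0.5 * ((rows - true_pos) / 1.5) ** 2)
demo[:, 5] = 0.0  # an empty column, which should come back as NaN
est = column_centroids(demo)
```

The result can replace the argmax-based ypos in the program above; any residual background in a column pulls the centroid toward the middle of the image, so this only works well on a clean filtered image.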
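First, read the image. Note that scipy.misc.imread has been removed from recent SciPy releases; matplotlib's imread reads PNG files as well (as floats between 0 and 1):
from matplotlib.pyplot import imread
im = imread("thefile.png")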
This gives a 3D numpy array with the third dimension being the color channels (RGB+alpha). The curve is in the blue channel, but the grid is there also. But in the red channel, you have the grid and not the curve. So we use
a = im[:,:,2].astype(float) - im[:,:,0]  # cast first so unsigned integer images cannot wrap around
Now, we want the position of the maximum along each column. With one pixel precision, it is given by
y0 = np.argmax(a, axis=0)
The result of this is zero when there is no blue curve in the column, i.e. outside the frame. One can get the limits of the frame by
xmin, xmax = np.where(y0>0)[0][[0,-1]]
With this, you may be able to rescale x axis.
Then, you want subpixel resolution. Let us focus on a single column
f=a[:,x]
We use a single iteration of Newton's method to refine the position of the extremum:
y1 = y0 - f'(y0)/f''(y0)
Note that we cannot iterate further because of the discrete sampling. Nonetheless, we want a good approximation of the derivatives, so we will use a 5-point scheme for both.
coefprime = np.array([1,-8, 0, 8, -1], float)
coefsec = np.array([-1, 16, -30, 16, -1], float)
y1 = y0 - np.dot(f[y0-2:y0+3], coefprime)/np.dot(f[y0-2:y0+3], coefsec)
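Applied to every column at once, the refinement might look like the sketch below. The function name refine_peaks is made up, and the checks for peaks too close to the border or with non-negative curvature are my additions rather than part of the snippet above. The common factor 1/12 of the two stencils cancels in the ratio, so it is omitted.

```python
import numpy as np

COEF_PRIME = np.array([1, -8, 0, 8, -1], float)     # 12 * f'  (unit spacing)
COEF_SEC = np.array([-1, 16, -30, 16, -1], float)   # 12 * f'' (unit spacing)

def refine_peaks(a):
    """One Newton step per column around the integer argmax of a (rows, cols)."""
    a = np.asarray(a, dtype=float)
    y0 = np.argmax(a, axis=0)
    y1 = np.full(a.shape[1], np.nan)
    for x, y in enumerate(y0):
        # The 5-point stencils need two samples on each side of the peak
        if y < 2 or y > a.shape[0] - 3:
            continue
        window = a[y - 2:y + 3, x]
        second = window @ COEF_SEC
        if second < 0:  # a genuine maximum has negative curvature
            y1[x] = y - (window @ COEF_PRIME) / second
    return y1

# Synthetic check: a smooth ridge whose true position is known
rows = np.arange(60)[:, None]
true_pos = 20 + 0.37 * np.arange(30)
demo = np.exp(-0.5 * ((rows - true_pos) / 3.0) ** 2)
est = refine_peaks(demo)
```

Columns without a usable peak come back as NaN, which also covers the columns outside the frame where argmax returns 0.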
P.S.: Thorsten Kranz was faster than me (at least here), but my answer has the subpixel precision and my way of extracting the blue curve is probably more understandable.