I have several 2d sets of scattered data that I would like to find the edges of. Some edges may be open lines, others may be polygons.
For example, here is one plot that has an open edge that I would like to be able to keep. I would actually like to create a polygon from the open edges so I can use point_in_poly to check if another point lies inside. The points that would close the polygon are the boundaries of my plot area, btw.
Any ideas on where to get started?
EDIT:
Here is what I have already tried:
KernelDensity from sklearn. The edges point density varies significantly enough to not be entirely distinguishable from the bulk of the points.
kde = KernelDensity()
kde.fit(my_data)
dens = np.exp(kde.score_samples(ds))
dmax = dens.max()
dens_mask = (0.4 * dmax < dens) & (dens < 0.8 * dmax)
ax.scatter(ds[dens_mask, 0], ds[dens_mask, 1], ds[dens_mask, 2],
c=dens[dens_mask], depthshade=False, marker='o', edgecolors='none')
Incidentally, the 'gap' in the left side of the color plot is the same one that is in the black and white plot above. I also am pretty sure that I could be using KDE better. For example, I would like to get the density for a much smaller volume, more like using radius_neighbors from sklearn's NearestNeighbors()
ConvexHull from scipy. I tried removing points from semi-random data (for practice) while still keeping a point of interest (here, 0,0) inside the convex set. This wasn't terribly effective. I had no sophisticated way of exlcuding points from an iteration and only removed the ones that were used in the last convex hull. This code and accompanying image shows the first and last hull made while keeping the point of interest in the set.
hull = ConvexHull(pts)
contains = True
while contains:
temp_pts = np.delete(pts, hull.vertices, 0)
temp_hull = ConvexHull(temp_pts)
tp = path.Path(np.hstack((temp_pts[temp_hull.vertices, 0][np.newaxis].T,
temp_pts[temp_hull.vertices, 1][np.newaxis].T)))
if not tp.contains_point([0, 0]):
contains = False
hull = ConvexHull(pts)
plt.plot(pts[hull.vertices, 0], pts[hull.vertices, 1])
else:
pts = temp_pts
plt.plot(pts[hull.vertices, 0], pts[hull.vertices, 1], 'r-')
plt.show()
Ideally the goal for convex hull would be to maximize the area inside the hull while keeping only the point of interest inside the set but I haven't been able to code this.
KMeans() from sklearn.cluster. Using n=3 clusters I tried just run the class with default settings and got three horizontal groups of points. I haven't learned how to train the data to recognize points that form edges.
Here is a piece of the model where the data points are coming from. The solid areas contain points while the voids do not.
Here, and here are some other questions I have asked that show some more of what I have been looking at.
So I was able to do this in a roundabout way.
I used images of slices of the model in the xy plane generated from SolidWorks to distinguish the areas of interest.
If you see them, there are points in the corners of the picture that I placed in the model for reference at known distances. These points allowed me to determine the number of pixels per millimeter. From there, I mapped the points in my analysis set to pixels and checked the color of the pixel. If the pixel is white it is masked.
def mask_z_level(xi, yi, msk_img, x0=-14.3887, y0=5.564):
im = plt.imread(msk_img)
msk = np.zeros(xi.shape, dtype='bool')
pxmm = np.zeros((3, 2))
p = 0
for row in range(im.shape[0]):
for col in range(im.shape[1]):
if tuple(im[row, col]) == (1., 0., 0.):
pxmm[p] = (row, col)
p += 1
pxx = pxmm[1, 1] / 5.5
pxy = pxmm[2, 0] / 6.5
print(pxx, pxy)
for j in range(xi.shape[1]):
for i in range(xi.shape[0]):
x, y = xi[i, j], yi[i, j]
dx, dy = x - x0, y - y0
dpx = np.round(dx * pxx).astype('int')
dpy = -np.round(dy * pxy).astype('int')
if tuple(im[dpy, dpx]) == (1., 1., 1.):
msk[i, j] = True
return msk
Here is a plot showing the effects of the masking:
I am still fine tuning the borders but I have a very manageable task now that the mask is in largely complete. The reason being is that some mask points are incorrect resulting in banding.
Related
I am trying to triangulate a number of polygons such that the triangles do not add extra points. For the sake of keeping the question short I will be using 2 circles in each other, in reality these will be opencv contours, however the translation between the two is quite complex and the circles also show the problem.
So I have the following code (based on the example) in order to first get the circles and then triangulate them with the triangle project
import matplotlib.pyplot as plt
import numpy as np
import triangle as tr
def circle(N, R):
i = np.arange(N)
theta = i * 2 * np.pi / N
pts = np.stack([np.cos(theta), np.sin(theta)], axis=1) * R
seg = np.stack([i, i + 1], axis=1) % N
return pts, seg
pts0, seg0 = circle(30, 1.4)
pts1, seg1 = circle(16, 0.6)
pts = np.vstack([pts0, pts1])
seg = np.vstack([seg0, seg1 + seg0.shape[0]])
print(pts)
print(seg)
A = dict(vertices=pts, segments=seg, holes=[[0, 0]])
print(seg)
B = tr.triangulate(A) #note that the origin uses 'qpa0.05' here
tr.compare(plt, A, B)
plt.show()
Now this causes both the outer and inner circles to get triangulated like show here , clearly ignoring the hole. However by setting the 'qpa0.05' flag we can cause the circle to use the hole as seen here . However doing this causes the triangles to be split, adding many different triangles, increasing the qpa to a higher value does cause the number of triangles to be somewhat decreased, however they remain there.
Note that I want to be able to handle multiple holes in the same shape, and that shapes might end up being concave.
Anybody know how to get the triangulation to use the holes without adding extra triangles?
You can connect the hole (or holes) to the exterior perimeter so that you get a single "degenerate polygon" defined by a single point sequence that connects all points without self-intersection.
You go in and out through the same segment. If you follow the outer perimeter clockwise, you need to follow the hole perimeter counterclockwise or vice-versa. Otherwise, it would self-intersect.
I figured it out
the "qpa0.05" should have been a 'p', the p makes the code factor in holes and the a sets the maximum area for the triangles, this causes the extra points to be added.
B = tr.triangulate(A,'p')
I am trying to draw a polygon (concave) edge on a K-Means cluster shown below (fig_1).
With #ypnos's help, This piece of code plot everything except the edge.
df = pd.read_csv('https://raw.githubusercontent.com/MachineIntellect/dataset.ml/master/watermelon/watermelon_4_0.csv')
X = df.iloc[:,1:].to_numpy()
m0 = X[5]
m1 = X[11]
m2 = X[23]
centroids = np.array([m0, m1, m2])
labels = pairwise_distances_argmin(X, centroids)
m0 = X[labels == 0].mean(0)
m1 = X[labels == 1].mean(0)
m2 = X[labels == 2].mean(0)
new_centroids = np.array([m0, m1, m2])
plt.xlim(0.1,0.9)
plt.ylim(0, 0.8)
plt.scatter(X[:,0], X[:,1])
plt.scatter(new_centroids[:,0], new_centroids[:,1], c='r', marker = '+')
for i in range(3):
points = X[labels == i]
hull = ConvexHull(points)
for simplex in hull.simplices:
plt.plot(points[simplex, 0], points[simplex, 1], 'r-')
(fig_2)
The scikit-learn doc seems to be inspiring
The question is that the edges pointed by the arrow in fig_1 are different from the correspondence in fig_2.
the edge of the polygon that was being pointed to by the arrow was bent inward (thanks to #dwilli).
Thanks to #ImportanceOfBeingErnest's reminder, scipy.spatial.ConvexHull may not be able to produce concave.
Is there any other module/package to do this (concave)?
any hint would be appreciated.
What your inspiration shows is a Voronoi diagram. The coloring shows for any coordinate in the graph, which cluster it would be associated to.
The polygons you show in your first figure are a rough approximation of the convex hull of your cluster members. You could use scipy.spatial.ConvexHull or cv2.convexHull() (from OpenCV) to compute it. The documentation of the former also gives an example on how to plot it.
To generate the polygon you can try the below steps
Generate polygons around each cluster treating each cluster as in individual part of the plot.
You can create a rough polygon using the convex hull method mentioned by #ypnos, but to get a better result, have a look at the Delaunay triangulation method.
You will generate triangular regions between the points based on a set threshold value. The threshold will ensure the best possible fit.
Using this data, you can plot a concave hull using the extreme points. As you don't want the extreme points to be included as the vertices of the polygon, you should add a buffer to go around the points by a set value.
Expected result on some sample data
There's quite a bit of code required to achieve the result, here is a link to a comprehensive guide to generate the sample plot.
I have a set 100k of of geo locations (lat/lon) and a hexogonal grid (4k polygons). My goal is to calculate the total number of points which are located within each polygon.
My current algorithm uses 2 for loops to loop over all geo points and all polygons, which is really slow if I increase the number of polygons... How would you speedup the algorithm? I have uploaded a minimal example which creates 100k random geo points and uses 561 cells in the grid...
I also saw that reading the geo json file (with 4k polygons) takes some time, maybe i should export the polygons into a csv?
hexagon_grid.geojson file:
https://gist.github.com/Arnold1/9e41454e6eea910a4f6cd68ff1901db1
minimal python example:
https://gist.github.com/Arnold1/ee37a2e4b2dfbfdca9bfae7c7c3a3755
You don't need to explicitly test each hexagon to see whether a given point is located inside it.
Let's assume, for the moment, that all of your points fall somewhere within the bounds of your hexagonal grid. Because your hexagons form a regular lattice, you only really need to know which of the hexagon centers is closest to each point.
This can be computed very efficiently using a scipy.spatial.cKDTree:
import numpy as np
from scipy.spatial import cKDTree
import json
with open('/tmp/grid.geojson', 'r') as f:
data = json.load(f)
verts = []
centroids = []
for hexagon in data['features']:
# a (7, 2) array of xy coordinates specifying the vertices of the hexagon.
# we ignore the last vertex since it's equal to the first
xy = np.array(hexagon['geometry']['coordinates'][0][:6])
verts.append(xy)
# compute the centroid by taking the average of the vertex coordinates
centroids.append(xy.mean(0))
verts = np.array(verts)
centroids = np.array(centroids)
# construct a k-D tree from the centroid coordinates of the hexagons
tree = cKDTree(centroids)
# generate 10000 normally distributed xy coordinates
sigma = 0.5 * centroids.std(0, keepdims=True)
mu = centroids.mean(0, keepdims=True)
gen = np.random.RandomState(0)
xy = (gen.randn(10000, 2) * sigma) + mu
# query the k-D tree to find which hexagon centroid is nearest to each point
distance, idx = tree.query(xy, 1)
# count the number of points that are closest to each hexagon centroid
counts = np.bincount(idx, minlength=centroids.shape[0])
Plotting the output:
from matplotlib import pyplot as plt
fig, ax = plt.subplots(1, 1, subplot_kw={'aspect': 'equal'})
ax.hold(True)
ax.scatter(xy[:, 0], xy[:, 1], 10, c='b', alpha=0.25, edgecolors='none')
ax.scatter(centroids[:, 0], centroids[:, 1], marker='h', s=(counts + 5),
c=counts, cmap='Reds')
ax.margins(0.01)
I can think of several different ways you could handle points that fall outside your grid depending on how much accuracy you need:
You could exclude points that fall outside the outer bounding rectangle of your hexagon vertices (i.e. x < xmin, x > xmax etc.). However, this will fail to exclude points that fall within the 'gaps' along the edges of your grid.
Another straightforward option would be to set a cut-off on distance according to the spacing of your hexagon centers, which is equivalent to using a circular approximation for your outer hexagons.
If accuracy is crucial then you could define a matplotlib.path.Path corresponding to the outer vertices of your hexagonal grid, then use its .contains_points() method to test whether your points are contained within it. Compared to the other two methods, this would probably be slower and more fiddly to code.
I have a historical time sequence of seafloor images scanned from film that need registration.
from pylab import *
import cv2
import urllib
urllib.urlretrieve('http://geoport.whoi.edu/images/frame014.png','frame014.png');
urllib.urlretrieve('http://geoport.whoi.edu/images/frame015.png','frame015.png');
gray1=cv2.imread('frame014.png',0)
gray2=cv2.imread('frame015.png',0)
figure(figsize=(14,6))
subplot(121);imshow(gray1,cmap=cm.gray);
subplot(122);imshow(gray2,cmap=cm.gray);
I want to use the black region on the left of each image to do the registration, since that region was inside the camera and should be fixed in time. So I just need to compute the affine transformation between the black regions.
I determined these regions by thresholding and finding the largest contour:
def find_biggest_contour(gray,threshold=40):
# threshold a grayscale image
ret,thresh = cv2.threshold(gray,threshold,255,1)
# find the contours
contours,h = cv2.findContours(thresh,mode=cv2.RETR_LIST,method=cv2.CHAIN_APPROX_NONE)
# measure the perimeter
perim = [cv2.arcLength(cnt,True) for cnt in contours]
# find contour with largest perimeter
i=perim.index(max(perim))
return contours[i]
c1=find_biggest_contour(gray1)
c2=find_biggest_contour(gray2)
x1=c1[:,0,0];y1=c1[:,0,1]
x2=c2[:,0,0];y2=c2[:,0,1]
figure(figsize=(8,8))
imshow(gray1,cmap=cm.gray, alpha=0.5);plot(x1,y1,'b-')
imshow(gray2,cmap=cm.gray, alpha=0.5);plot(x2,y2,'g-')
axis([0,1500,1000,0]);
The blue is the longest contour from the 1st frame, the green is the longest contour from the 2nd frame.
What is the best way to determine the rotation and offset between the blue and green contours?
I only want to use the right side of the contours in some region surrounding the step, something like the region between the arrows.
Of course, if there is a better way to register these images, I'd love to hear it. I already tried a standard feature matching approach on the raw images, and it didn't work well enough.
Following Shambool's suggested approach, here's what I've come up with. I used a Ramer-Douglas-Peucker algorithm to simplify the contour in the region of interest and identified the two turning points. I was going to use the two turning points to get my three unknowns (xoffset, yoffset and angle of rotation), but the 2nd turning point is a bit too far toward the right because RDP simplified away the smoother curve in this region. So instead I used the angle of the line segment leading up to the 1st turning point. Differencing this angle between image1 and image2 gives me the rotation angle. I'm still not completely happy with this solution. It worked well enough for these two images, but I'm not sure it will work well on the entire image sequence. We'll see.
It would really be better to fit the contour to the known shape of the black border.
# select region of interest from largest contour
ind1=where((x1>190.) & (y1>200.) & (y1<900.))[0]
ind2=where((x2>190.) & (y2>200.) & (y2<900.))[0]
figure(figsize=(10,10))
imshow(gray1,cmap=cm.gray, alpha=0.5);plot(x1[ind1],y1[ind1],'b-')
imshow(gray2,cmap=cm.gray, alpha=0.5);plot(x2[ind2],y2[ind2],'g-')
axis([0,1500,1000,0])
def angle(x1,y1):
# Returns angle of each segment along an (x,y) track
return array([math.atan2(y,x) for (y,x) in zip(diff(y1),diff(x1))])
def simplify(x,y, tolerance=40, min_angle = 60.*pi/180.):
"""
Use the Ramer-Douglas-Peucker algorithm to simplify the path
http://en.wikipedia.org/wiki/Ramer-Douglas-Peucker_algorithm
Python implementation: https://github.com/sebleier/RDP/
"""
from RDP import rdp
points=vstack((x,y)).T
simplified = array(rdp(points.tolist(), tolerance))
sx, sy = simplified.T
theta=abs(diff(angle(sx,sy)))
# Select the index of the points with the greatest theta
# Large theta is associated with greatest change in direction.
idx = where(theta>min_angle)[0]+1
return sx,sy,idx
sx1,sy1,i1 = simplify(x1[ind1],y1[ind1])
sx2,sy2,i2 = simplify(x2[ind2],y2[ind2])
fig = plt.figure(figsize=(10,6))
ax =fig.add_subplot(111)
ax.plot(x1, y1, 'b-', x2, y2, 'g-',label='original path')
ax.plot(sx1, sy1, 'ko-', sx2, sy2, 'ko-',lw=2, label='simplified path')
ax.plot(sx1[i1], sy1[i1], 'ro', sx2[i2], sy2[i2], 'ro',
markersize = 10, label='turning points')
ax.invert_yaxis()
plt.legend(loc='best')
# determine x,y offset between 1st turning points, and
# angle from difference in slopes of line segments approaching 1st turning point
xoff = sx2[i2[0]] - sx1[i1[0]]
yoff = sy2[i2[0]] - sy1[i1[0]]
iseg1 = [i1[0]-1, i1[0]]
iseg2 = [i2[0]-1, i2[0]]
ang1 = angle(sx1[iseg1], sy1[iseg1])
ang2 = angle(sx2[iseg2], sy2[iseg2])
ang = -(ang2[0] - ang1[0])
print xoff, yoff, ang*180.*pi
-28 14 5.07775871644
# 2x3 affine matrix M
M=array([cos(ang),sin(ang),xoff,-sin(ang),cos(ang),yoff]).reshape(2,3)
print M
[[ 9.99959685e-01 8.97932821e-03 -2.80000000e+01]
[ -8.97932821e-03 9.99959685e-01 1.40000000e+01]]
# warp 2nd image into coordinate frame of 1st
Minv = cv2.invertAffineTransform(M)
gray2b = cv2.warpAffine(gray2,Minv,shape(gray2.T))
figure(figsize=(10,10))
imshow(gray1,cmap=cm.gray, alpha=0.5);plot(x1[ind1],y1[ind1],'b-')
imshow(gray2b,cmap=cm.gray, alpha=0.5);
axis([0,1500,1000,0]);
title('image1 and transformed image2 overlain with 50% transparency');
Good question.
One approach is to represent contours as 2d point clouds and then do registration.
More simple and clear code in Matlab that can give you affine transform.
And more complex C++ code(using VXL lib) with python and matlab wrapper included.
Or you can use some modificated ICP(iterative closest point) algorithm that is robust to noise and can handle affine transform.
Also your contours seems to be not very accurate so it can be a problem.
Another approach is to use some kind of registration that use pixel values.
Matlab code (I think it's using some kind of minimizer+ crosscorrelation metric)
Also maybe there is some kind of optical flow registration(or some other kind) that is used in medical imaging.
Also you can use point features as SIFT(SURF).
You can try it quick in FIJI(ImageJ)
also this link.
Open 2 images
Plugins->feature extraction-> sift (or other)
Set expected transformation to affine
Look at estimated transformation model [3,3] homography matrix in ImageJ log.
If it works good then you can implement it in python using OpenCV or maybe using Jython with ImageJ.
And it will be better if you post original images and describe all conditions (it seems that image is changing between frames)
You can represent these contours with their respective ellipses. These ellipses are centered on the centroid of the contour and they are oriented towards the main density axis. You can compare the centroids and the orientation angle.
1) Fill the contours => drawContours with thickness=CV_FILLED
2) Find moments => cvMoments()
3) And use them.
Centroid: { x, y } = {M10/M00, M01/M00 }
Orientation (theta):
EDIT: I customized the sample code from legacy (enteringblobdetection.cpp) for your case.
/* Image moments */
double M00,X,Y,XX,YY,XY;
CvMoments m;
CvRect r = ((CvContour*)cnt)->rect;
CvMat mat;
cvMoments( cvGetSubRect(pImgFG,&mat,r), &m, 0 );
M00 = cvGetSpatialMoment( &m, 0, 0 );
X = cvGetSpatialMoment( &m, 1, 0 )/M00;
Y = cvGetSpatialMoment( &m, 0, 1 )/M00;
XX = (cvGetSpatialMoment( &m, 2, 0 )/M00) - X*X;
YY = (cvGetSpatialMoment( &m, 0, 2 )/M00) - Y*Y;
XY = (cvGetSpatialMoment( &m, 1, 1 )/M00) - X*Y;
/* Contour description */
CvPoint myCentroid(r.x+(float)X,r.y+(float)Y);
double myTheta = atan( 2*XY/(XX-YY) );
Also, check this with OpenCV 2.0 examples.
If you don't want to find the homography between the two images and want to find the affine transformation you have three unknowns, rotation angle (R), and the displacement in x and y (X,Y). Therefore minimum of two points (with two known values for each) are needed to find the unknowns. Two points should be matched between the two images or two lines, each has two known values, the intercept and slope. If you go with the point matching approach, the further the points are from each other the more robust is the found transform to noise (this is very simple if you remember error propagation rules).
In the two point matching method:
find two points (A and B) in the first image I1 and their corresponding points (A',B') in the second image I2
find the middle point between A and B: C, and the middle point between A' and B': C'
the difference C and C' (C-C') gives the translation between the images (X and Y)
using the dot product of C-A and C'-A' you can find the rotation angle (R)
To detect robust points, I would find the the points along the side of counter you have found with highest absolute value of the second derivative (Hessian) and then try to match them. Since you mentioned this is a video footage you can easily make the assumption the transformation between each two frames is small to reject the outliers.
I'm trying to come up with an algorithm that will determine turning points in a trajectory of x/y coordinates. The following figures illustrates what I mean: green indicates the starting point and red the final point of the trajectory (the entire trajectory consists of ~ 1500 points):
In the following figure, I added by hand the possible (global) turning points that an algorithm could return:
Obviously, the true turning point is always debatable and will depend on the angle that one specifies that has to lie between points. Furthermore a turning point can be defined on a global scale (what I tried to do with the black circles), but could also be defined on a high-resolution local scale. I'm interested in the global (overall) direction changes, but I'd love to see a discussion on the different approaches that one would use to tease apart global vs local solutions.
What I've tried so far:
calculate distance between subsequent points
calculate angle between subsequent points
look how distance / angle changes between subsequent points
Unfortunately this doesn't give me any robust results. I probably have too calculate the curvature along multiple points, but that's just an idea.
I'd really appreciate any algorithms / ideas that might help me here. The code can be in any programming language, matlab or python are preferred.
EDIT here's the raw data (in case somebody want's to play with it):
mat file
text file (x coordinate first, y coordinate in second line)
You could use the Ramer-Douglas-Peucker (RDP) algorithm to simplify the path. Then you could compute the change in directions along each segment of the simplified path. The points corresponding to the greatest change in direction could be called the turning points:
A Python implementation of the RDP algorithm can be found on github.
import matplotlib.pyplot as plt
import numpy as np
import os
import rdp
def angle(dir):
"""
Returns the angles between vectors.
Parameters:
dir is a 2D-array of shape (N,M) representing N vectors in M-dimensional space.
The return value is a 1D-array of values of shape (N-1,), with each value
between 0 and pi.
0 implies the vectors point in the same direction
pi/2 implies the vectors are orthogonal
pi implies the vectors point in opposite directions
"""
dir2 = dir[1:]
dir1 = dir[:-1]
return np.arccos((dir1*dir2).sum(axis=1)/(
np.sqrt((dir1**2).sum(axis=1)*(dir2**2).sum(axis=1))))
tolerance = 70
min_angle = np.pi*0.22
filename = os.path.expanduser('~/tmp/bla.data')
points = np.genfromtxt(filename).T
print(len(points))
x, y = points.T
# Use the Ramer-Douglas-Peucker algorithm to simplify the path
# http://en.wikipedia.org/wiki/Ramer-Douglas-Peucker_algorithm
# Python implementation: https://github.com/sebleier/RDP/
simplified = np.array(rdp.rdp(points.tolist(), tolerance))
print(len(simplified))
sx, sy = simplified.T
# compute the direction vectors on the simplified curve
directions = np.diff(simplified, axis=0)
theta = angle(directions)
# Select the index of the points with the greatest theta
# Large theta is associated with greatest change in direction.
idx = np.where(theta>min_angle)[0]+1
fig = plt.figure()
ax =fig.add_subplot(111)
ax.plot(x, y, 'b-', label='original path')
ax.plot(sx, sy, 'g--', label='simplified path')
ax.plot(sx[idx], sy[idx], 'ro', markersize = 10, label='turning points')
ax.invert_yaxis()
plt.legend(loc='best')
plt.show()
Two parameters were used above:
The RDP algorithm takes one parameter, the tolerance, which
represents the maximum distance the simplified path
can stray from the original path. The larger the tolerance, the cruder the simplified path.
The other parameter is the min_angle which defines what is considered a turning point. (I'm taking a turning point to be any point on the original path, whose angle between the entering and exiting vectors on the simplified path is greater than min_angle).
I will be giving numpy/scipy code below, as I have almost no Matlab experience.
If your curve is smooth enough, you could identify your turning points as those of highest curvature. Taking the point index number as the curve parameter, and a central differences scheme, you can compute the curvature with the following code
import numpy as np
import matplotlib.pyplot as plt
import scipy.ndimage
def first_derivative(x) :
return x[2:] - x[0:-2]
def second_derivative(x) :
return x[2:] - 2 * x[1:-1] + x[:-2]
def curvature(x, y) :
x_1 = first_derivative(x)
x_2 = second_derivative(x)
y_1 = first_derivative(y)
y_2 = second_derivative(y)
return np.abs(x_1 * y_2 - y_1 * x_2) / np.sqrt((x_1**2 + y_1**2)**3)
You will probably want to smooth your curve out first, then calculate the curvature, then identify the highest curvature points. The following function does just that:
def plot_turning_points(x, y, turning_points=10, smoothing_radius=3,
cluster_radius=10) :
if smoothing_radius :
weights = np.ones(2 * smoothing_radius + 1)
new_x = scipy.ndimage.convolve1d(x, weights, mode='constant', cval=0.0)
new_x = new_x[smoothing_radius:-smoothing_radius] / np.sum(weights)
new_y = scipy.ndimage.convolve1d(y, weights, mode='constant', cval=0.0)
new_y = new_y[smoothing_radius:-smoothing_radius] / np.sum(weights)
else :
new_x, new_y = x, y
k = curvature(new_x, new_y)
turn_point_idx = np.argsort(k)[::-1]
t_points = []
while len(t_points) < turning_points and len(turn_point_idx) > 0:
t_points += [turn_point_idx[0]]
idx = np.abs(turn_point_idx - turn_point_idx[0]) > cluster_radius
turn_point_idx = turn_point_idx[idx]
t_points = np.array(t_points)
t_points += smoothing_radius + 1
plt.plot(x,y, 'k-')
plt.plot(new_x, new_y, 'r-')
plt.plot(x[t_points], y[t_points], 'o')
plt.show()
Some explaining is in order:
turning_points is the number of points you want to identify
smoothing_radius is the radius of a smoothing convolution to be applied to your data before computing the curvature
cluster_radius is the distance from a point of high curvature selected as a turning point where no other point should be considered as a candidate.
You may have to play around with the parameters a little, but I got something like this:
>>> x, y = np.genfromtxt('bla.data')
>>> plot_turning_points(x, y, turning_points=20, smoothing_radius=15,
... cluster_radius=75)
Probably not good enough for a fully automated detection, but it's pretty close to what you wanted.
A very interesting question. Here is my solution, that allows for variable resolution. Although, fine-tuning it may not be simple, as it's mostly intended to narrow down
Every k points, calculate the convex hull and store it as a set. Go through the at most k points and remove any points that are not in the convex hull, in such a way that the points don't lose their original order.
The purpose here is that the convex hull will act as a filter, removing all of "unimportant points" leaving only the extreme points. Of course, if the k-value is too high, you'll end up with something too close to the actual convex hull, instead of what you actually want.
This should start with a small k, at least 4, then increase it until you get what you seek. You should also probably only include the middle point for every 3 points where the angle is below a certain amount, d. This would ensure that all of the turns are at least d degrees (not implemented in code below). However, this should probably be done incrementally to avoid loss of information, same as increasing the k-value. Another possible improvement would be to actually re-run with points that were removed, and and only remove points that were not in both convex hulls, though this requires a higher minimum k-value of at least 8.
The following code seems to work fairly well, but could still use improvements for efficiency and noise removal. It's also rather inelegant in determining when it should stop, thus the code really only works (as it stands) from around k=4 to k=14.
def convex_filter(points,k):
new_points = []
for pts in (points[i:i + k] for i in xrange(0, len(points), k)):
hull = set(convex_hull(pts))
for point in pts:
if point in hull:
new_points.append(point)
return new_points
# How the points are obtained is a minor point, but they need to be in the right order.
x_coords = [float(x) for x in x.split()]
y_coords = [float(y) for y in y.split()]
points = zip(x_coords,y_coords)
k = 10
prev_length = 0
new_points = points
# Filter using the convex hull until no more points are removed
while len(new_points) != prev_length:
prev_length = len(new_points)
new_points = convex_filter(new_points,k)
Here is a screen shot of the above code with k=14. The 61 red dots are the ones that remain after the filter.
The approach you took sounds promising but your data is heavily oversampled. You could filter the x and y coordinates first, for example with a wide Gaussian and then downsample.
In MATLAB, you could use x = conv(x, normpdf(-10 : 10, 0, 5)) and then x = x(1 : 5 : end). You will have to tweak those numbers depending on the intrinsic persistence of the objects you are tracking and the average distance between points.
Then, you will be able to detect changes in direction very reliably, using the same approach you tried before, based on the scalar product, I imagine.
Another idea is to examine the left and the right surroundings at every point. This may be done by creating a linear regression of N points before and after each point. If the intersecting angle between the points is below some threshold, then you have an corner.
This may be done efficiently by keeping a queue of the points currently in the linear regression and replacing old points with new points, similar to a running average.
You finally have to merge adjacent corners to a single corner. E.g. choosing the point with the strongest corner property.