I'm working in Python with 3D point cloud files in XYZ format. I need to calculate the distance from each point to the center, and then label (and colour, for better visualization) the points according to it. So far I got this cloud classification using this code:
xyz_coordinates = points[:, 0:3]
xyz_min = np.amin(xyz_coordinates, axis=0) # 3, gets minimum of each axis
xyz_max = np.amax(xyz_coordinates, axis=0) # 3, gets maximum of each axis
xyz_center = (xyz_min + xyz_max) / 2
xyz_max_euclidean=distance.euclidean(xyz_center,xyz_max) # gets euclidean distance, it gives me circles
xyz_cut=xyz_max_euclidean/N_CLASSES
# gets the euclidean distance for all points and assign normalized tagged classes
for i in range(xyz_coordinates.shape[0]):
    label = int(math.floor(distance.euclidean(xyz_center, xyz_coordinates[i]) / xyz_cut))
    point_bbox_list.append(np.concatenate([xyz_coordinates[i], g_distance2color[str(label)], np.array([label])], 0))
But as you can see, I am calculating the Euclidean distance from the center to each of the points, and this is not correct: for example, the limits of the walls or the table come out wrong. I imagined this kind of graph but with a square coloured shape. So far I have been successful in calculating the bounding boxes for each object, shown here, but I cannot get the results I expect. I've also tried the Mahalanobis distance, but the classification turns out as an ellipsoid. Is there any other distance metric that I can use?
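One metric that gives box-shaped rather than spherical bands is the Chebyshev (L-infinity) distance, i.e. the largest per-axis offset from the center. A minimal sketch, assuming an (N, 3+) array like points from the question; the name box_labels and the per-axis scaling by the bounding-box half-extents are illustrative choices, not part of the original code:

```python
import numpy as np

def box_labels(points, n_classes):
    """Label points by scaled Chebyshev distance from the bounding-box center."""
    xyz = np.asarray(points, dtype=float)[:, 0:3]
    xyz_min = xyz.min(axis=0)
    xyz_max = xyz.max(axis=0)
    xyz_center = (xyz_min + xyz_max) / 2
    half_extent = (xyz_max - xyz_min) / 2
    half_extent[half_extent == 0] = 1.0  # guard against a degenerate (flat) axis
    # Largest per-axis offset, scaled so the bounding-box surface maps to 1.0
    d = np.max(np.abs(xyz - xyz_center) / half_extent, axis=1)
    # Points on the outer surface (d == 1.0) are clipped into the last class
    return np.minimum((d * n_classes).astype(int), n_classes - 1)

demo = np.array([[0, 0, 0], [10, 10, 10], [5, 5, 5], [5, 5, 7], [5, 0, 5]])
print(box_labels(demo, 4))  # [3 3 0 1 3]
```

Each class is then a nested box shell around the center instead of a spherical shell, so walls aligned with the axes fall into a single class.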
Related
I have a 50 by 50 grid of evenly spaced (x,y) points. Each of these points has a third scalar value. This can be visualized using a contour plot, which I have added. I am interested in the regions indicated by the red circles. These regions of low "Z-values" are what I want to extract from this data.
2D contour plot of 50 x 50 evenly spaced grid points:
I want to do this by using clustering (machine learning), which can be lightning quick when applied correctly. The problem is, however, that the points are evenly spaced together and therefore the density of the entire dataset is equal everywhere.
I have tried using a DBSCAN algorithm with a custom distance metric which takes into account the Z values of each point. I have defined the distance between two points as follows:
def custom_distance(point1,point2):
    average_Z = (point1[2]+point2[2])/2
    distance = np.sqrt(np.square((point1[0]-point2[0])) + np.square((point1[1]-point2[1])))
    distance = distance * average_Z
    return distance
This essentially determines the Euclidean distance between two points and multiplies it by the average of the Z values of both points. In the picture below I have tested this distance function in a DBSCAN algorithm. Each point in this 50 by 50 grid has a Z value of 1, except for four clusters that I have randomly placed. These points each have a Z value of 10. The algorithm is able to find the clusters in the data based on their Z value, as can be seen below.
DBSCAN clustering result using scalar value distance determination:
Positive about the results I tried to apply it to my actual data, only to be disappointed by the results. Since the x and y values of my data are very large, I have simply scaled them to be 0 to 49. The z values I have left untouched. The results of the clustering can be seen in the image below:
Clustering result on original data:
This does not come close to what I want and what I was expecting. For some reason the clusters that are found are of rectangular shape and the light regions of low Z values that I am interested in are not extracted with this approach.
Is there any way I can make the DBSCAN algorithm work in this way? I suspect the reason that it is currently not working has something to do with the differences in scale of the x,y and z values. I am also open for tips or recommendations on other approaches on how to define and find the lighter regions in the data.
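To check the scale suspicion, the custom distance above can be computed for all pairs at once and inspected before clustering. A minimal sketch with NumPy only; custom_distance_matrix and the toy points are hypothetical. The resulting matrix could then be passed to scikit-learn's DBSCAN with metric="precomputed", assuming that is the implementation in use:

```python
import numpy as np

def custom_distance_matrix(points):
    """Pairwise (x, y) Euclidean distance multiplied by the pairwise mean of Z."""
    pts = np.asarray(points, dtype=float)
    diff = pts[:, None, :2] - pts[None, :, :2]
    xy_dist = np.sqrt((diff ** 2).sum(axis=2))
    z = pts[:, 2]
    average_z = (z[:, None] + z[None, :]) / 2
    return xy_dist * average_z

# Three neighbouring grid points: two with Z = 1, one with Z = 10
toy = np.array([[0, 0, 1], [1, 0, 1], [0, 1, 10]])
D = custom_distance_matrix(toy)
print(D)
```

Here two neighbours with Z = 1 are at distance 1, while the neighbour with Z = 10 is at distance 5.5, so a given eps only makes sense relative to the range of Z. Rescaling Z to a range comparable to the test case before choosing eps is one thing to try.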
I have a 2x2 matrix of distances from a depth sensor.
The matrix is cropped so that only the points we are interested in are in the frame (all the points in the cropped image belong to the object).
My question is how can we determine if this object is flat or not?
The depth image is acquired from Realsense d435. I read the depth image and then multiply it by depth_scale.
The object is recognized using AI for the rgb image that is aligned with the depth image.
And I have 4 points on the object. So all the distances in that rectangle are distances from the object to the sensor.
My first idea was standard deviation of all the points. But then this falls apart if the image is taken from an angle. (since the standard deviation won't be 0)
From an angle the distance of a flat object is changing uniformly on the y axis. Maybe somehow, we can use this information?
The 2x2 matrix is a numpy array in python. Maybe there are some libraries which do this already.
After reprojecting your four depth measurements to the 3D space, it becomes a problem of deciding if your set of points is coplanar. There are several ways you can go about it.
One way to do it is to reproject the points to 3D and fit a plane to all four of them there. Since you're fitting a plane to four points in three dimensions, you get an over-determined system, and it's very unlikely that all points would lie exactly on the estimated plane. At this stage, you could prescribe some tolerance to determine "goodness of fit". For instance, you could look at the R^2 coefficient.
To fit the plane you can use scipy.linalg.lstsq. Here's a good description of how it can be done: Fit plane to a set of points in 3D.
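A minimal sketch of the plane fit, using numpy.linalg.lstsq in place of scipy.linalg.lstsq (both solve the same least-squares problem). It assumes the plane can be written as z = a*x + b*y + c, i.e. it is not parallel to the z axis, and it reports the RMS residual rather than R^2 as the goodness-of-fit measure. The function name and the sample points are made up for illustration:

```python
import numpy as np

def fit_plane(points):
    """Least-squares fit of z = a*x + b*y + c; returns (a, b, c) and the RMS residual."""
    pts = np.asarray(points, dtype=float)
    A = np.column_stack([pts[:, 0], pts[:, 1], np.ones(len(pts))])
    coeffs, *_ = np.linalg.lstsq(A, pts[:, 2], rcond=None)
    residuals = pts[:, 2] - A @ coeffs
    rms = np.sqrt(np.mean(residuals ** 2))
    return coeffs, rms

flat = [(0, 0, 1), (1, 0, 3), (0, 1, 4), (1, 1, 6)]  # lies on z = 2x + 3y + 1
bent = [(0, 0, 1), (1, 0, 3), (0, 1, 4), (1, 1, 7)]  # last point lifted off the plane
flat_coeffs, flat_rms = fit_plane(flat)
bent_coeffs, bent_rms = fit_plane(bent)
print(flat_coeffs, flat_rms)
print(bent_coeffs, bent_rms)
```

The object would be called flat when the RMS residual is below a tolerance chosen for the sensor noise.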
Another way to approach the problem is by calculating the volume of the tetrahedron spanned by the four points in 3D. If they are coplanar (or close to coplanar), the volume of such a tetrahedron should be equal to (or close to) 0. Assuming your points reprojected to 3D can be described by (x_0, y_0, z_0), ..., (x_3, y_3, z_3), the volume of the tetrahedron is equal to:
volume = abs(numpy.linalg.det(tetrahedron)) / 6, where
tetrahedron = np.array([[x_0, y_0, z_0, 1], [x_1, y_1, z_1, 1], [x_2, y_2, z_2, 1], [x_3, y_3, z_3, 1]])
To check if your points are on the same plane, (equivalently - if the tetrahedron has a small enough volume), it is now sufficient to check if
volume < TOL
for some defined small tolerance value, which must be determined experimentally.
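The volume test can be sketched as follows. The value of TOL here is only a placeholder, and the helper names are hypothetical:

```python
import numpy as np

TOL = 1e-6  # placeholder; has to be tuned experimentally

def tetrahedron_volume(points):
    """Volume of the tetrahedron spanned by four 3D points."""
    pts = np.asarray(points, dtype=float)            # shape (4, 3)
    tetrahedron = np.hstack([pts, np.ones((4, 1))])  # append the column of ones
    return abs(np.linalg.det(tetrahedron)) / 6

def is_coplanar(points, tol=TOL):
    return tetrahedron_volume(points) < tol

corner = [(0, 0, 0), (1, 0, 0), (0, 1, 0), (0, 0, 1)]
flat = [(0, 0, 1), (1, 0, 3), (0, 1, 4), (1, 1, 6)]  # all on z = 2x + 3y + 1
print(tetrahedron_volume(corner), is_coplanar(corner))
print(tetrahedron_volume(flat), is_coplanar(flat))
```

Note that the volume scales with the cube of the point spacing, so a tolerance tuned at one distance or crop size may not carry over to another.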
You can define a surface by choosing three of the four 3D points.
Evaluate the distance from the remaining point to the surface.
As for how to choose the three points, it may be good to choose the combination that maximizes the area of the triangle.
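A minimal sketch of this idea, trying all four triangles and keeping the one with the largest area; the function name is hypothetical:

```python
from itertools import combinations

import numpy as np

def distance_to_largest_triangle_plane(points):
    """Distance from the fourth point to the plane of the largest-area triangle."""
    pts = np.asarray(points, dtype=float)  # shape (4, 3)
    best = None
    for tri in combinations(range(4), 3):
        a, b, c = pts[list(tri)]
        normal = np.cross(b - a, c - a)        # length equals twice the triangle area
        area = 0.5 * np.linalg.norm(normal)
        if best is None or area > best[0]:
            best = (area, tri, normal, a)
    area, tri, normal, a = best
    if area == 0:
        return 0.0  # every triangle is degenerate: the points are collinear
    (rest,) = set(range(4)) - set(tri)         # index of the remaining point
    return abs(np.dot(pts[rest] - a, normal)) / np.linalg.norm(normal)

corner = [(0, 0, 0), (1, 0, 0), (0, 1, 0), (0, 0, 1)]
flat = [(0, 0, 1), (1, 0, 3), (0, 1, 4), (1, 1, 6)]
print(distance_to_largest_triangle_plane(corner))
print(distance_to_largest_triangle_plane(flat))
```

Unlike the volume, this distance is in the same units as the depth values, which may make the tolerance easier to pick.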
Problem
Need to identify a way to find 2 mile clusters of points where each point has a value. Identify 2 mile areas which have a sum(value) > 50.
Data
I have data that looks like the following
ID COUNT LATITUDE LONGITUDE
187601546 20 025.56394 -080.03206
187601547 25 025.56394 -080.03206
187601548 4 025.56394 -080.03206
187601550 0 025.56298 -080.03285
Roughly 200K records. What I need to determine is whether there are any areas where the sum of the count exceeds 65 within a one-mile radius (two-mile diameter).
Using each point as a center for an area
Now, I have python code from another project that will draw a shapefile around a point of x diameter as follows:
def poly_based_on_distance(center_lat,center_long, distance, bearing):
    # bearing is in degrees
    # distance in miles
    # print ('center', center_lat, center_long)
    destination = (vincenty(miles=distance).destination(Point(center_lat,
                   center_long), bearing).format_decimal())
And a routine to return destination and then see which points are inside the radius.
## This is the evaluation for overlap between points and
## area polyshapes
area_list = []
store_geo_dict = {}
for stores in locationdict:
    location = Polygon(locationdict[stores])
    for areas in AREAdictionary:
        area = Polygon(AREAdictionary[areas])
        if location.intersects(area):
            area_list.append(areas)
    store_geo_dict[stores] = area_list
    area_list = []
At this point, I am simply drawing a circular shapefile around each of the 200K points, see which others were inside and doing the count.
Need Clustering Algorithm?
However, there might be an area with the required count density where one of the points is not in the center.
I'm familiar with clustering algos such as DBSCAN that use attributes for classification, but this is a matter of finding density clusters using a value for each point. Is there any clustering algorithm to find any cluster of a 2 mile diameter circle where the inside count is >= 50?
Any suggestions, python or R are preferred tools but this is wide-open and probably a one-off so computation efficiency is not a priority.
Not a complete solution, but maybe it will help simplify the problem depending on the distribution of your data. I will use planar coordinates and cKDTree in my example, this might work with geographic data if you can ignore curvature in a projection.
The main observation is the following: a point (x,y) does not contribute to a dense cluster if a ball of radius 2*r (e.g. 2 miles) around (x,y) contributes less than the cutoff value (e.g. 50 in your title). In fact, any point within r of (x,y) does not contribute to any dense cluster.
This allows you to repeatedly discard points from consideration. If you are left with no points, there are no dense clusters; if you are left with some points, clusters may exist.
import numpy as np
from scipy.spatial import cKDTree
# test data
N = 1000
data = np.random.rand(N, 2)
x, y = data.T
# test weights of each point
weights = np.random.rand(N)
def filter_noncontrib(pts, weights, radius=0.1, cutoff=60):
    tree = cKDTree(pts)
    contribs = np.array(
        [weights[tree.query_ball_point(pt, 2 * radius)].sum() for pt in pts]
    )
    return contribs >= cutoff

def possible_contributors(pts, weights, radius=0.1, cutoff=60):
    n_pts = len(pts)
    while len(pts):
        mask = filter_noncontrib(pts, weights, radius, cutoff)
        pts = pts[mask]
        weights = weights[mask]
        if len(pts) == n_pts:
            break
        n_pts = len(pts)
    return pts
Example with dummy data:
DBSCAN can be adapted (see Generalized DBSCAN; define core points as weight sum >= 50), but it will not ensure the maximum cluster size (it computes transitive closures).
You could also try complete linkage. Use it to find clusters with the desired maximum diameter, then check if these satisfy the desired density. But that does not guarantee to find all.
It's probably faster to (a) build an index for fast radius search. (b) for every point, find neighbors in radius r; keep if they have the desired minimum sum. But that does not guarantee to find everything because the center is not necessarily a data point. Consider a max radius of 1, minimum weight 100. Two points with weight 50 each, at (0,0) and (1,1). Neither a query at (0,0) nor one at (1,1) will discover the solution, but a cluster at (.5,.5) satisfies the conditions.
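The counterexample above can be checked numerically. This is a small sketch that uses brute-force distances instead of a spatial index, purely for brevity; weight_within_radius is a hypothetical helper:

```python
import numpy as np

def weight_within_radius(centers, pts, weights, radius):
    """Sum of point weights within `radius` of each query center (brute force)."""
    centers = np.asarray(centers, dtype=float)
    pts = np.asarray(pts, dtype=float)
    d = np.linalg.norm(centers[:, None, :] - pts[None, :, :], axis=2)
    return (d <= radius).astype(float) @ np.asarray(weights, dtype=float)

pts = np.array([[0.0, 0.0], [1.0, 1.0]])
weights = np.array([50.0, 50.0])

# Queries centered on the data points only see their own weight ...
at_data_points = weight_within_radius(pts, pts, weights, radius=1.0)
# ... while a query centered between them sees both.
at_midpoint = weight_within_radius([[0.5, 0.5]], pts, weights, radius=1.0)
print(at_data_points, at_midpoint)  # [50. 50.] [100.]
```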
Unfortunately, I believe your problem is at least NP-hard, so you won't be able to afford the ultimate solution.
I have a list of X and Y coordinates from geodata of a specific part of the world. I want to assign each coordinate, a weight, based upon where it lies in the graph.
For Example: If a point lies in a place where there are a lot of other nodes around it, it lies in a high density area, and therefore has a higher weight.
The most immediate method I can think of is drawing circles of unit radius around each point, checking which other points lie within it and then, using a function, assigning a weight to that point. But this seems primitive.
I've looked at pySAL and NetworkX but it looks like they work with graphs. I don't have any edges in the graph, just nodes.
A standard solution would be using KDE (Kernel Density Estimation).
Search the web for "KDE Estimation" and you will find an enormous number of links.
In Google, type: KDE Estimation ext:pdf
Also, SciPy has KDE; see http://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.gaussian_kde.html. There are working code examples there ;)
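A minimal sketch of using scipy.stats.gaussian_kde to turn point density into a per-point weight. The synthetic data (one tight cluster plus a few scattered points) is made up for illustration, and the default bandwidth is used, which may need tuning for real geodata:

```python
import numpy as np
from scipy.stats import gaussian_kde

rng = np.random.default_rng(0)
dense = rng.normal(loc=0.0, scale=0.1, size=(200, 2))   # tight cluster
sparse = rng.uniform(low=-5.0, high=5.0, size=(20, 2))  # scattered points
points = np.vstack([dense, sparse])

kde = gaussian_kde(points.T)  # gaussian_kde expects shape (n_dims, n_points)
weights = kde(points.T)       # estimated density at each input point

print(weights[:200].mean(), weights[200:].mean())
```

Points inside the cluster should get a much larger weight than the scattered ones.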
If you have a lot of points, you may compute nearest neighbors more efficiently using a KDTree:
import numpy as np
import scipy.spatial as spatial
points = np.array([(1, 2), (3, 4), (4, 5), (100,100)])
tree = spatial.KDTree(np.array(points))
radius = 3.0
neighbors = tree.query_ball_tree(tree, radius)
print(neighbors)
# [[0, 1], [0, 1, 2], [1, 2], [3]]
tree.query_ball_tree returns, for each point, the indices of the points that lie within radius of it. For example, [0,1] (at index 0) means points[0] and points[1] are within radius distance of points[0]. [0,1,2] (at index 1) means points[0], points[1] and points[2] are within radius distance of points[1].
frequency = np.array([len(n) for n in neighbors])
print(frequency)
# [2 3 2 1]
density = frequency/radius**2
print(density)
# [ 0.22222222 0.33333333 0.22222222 0.11111111]
Yes, you do have edges, and they are the distances between the nodes. In your case, you have a complete graph with weighted edges.
Simply derive the distance from each node to each other node -- which gives you O(N^2) in time complexity --, and use both nodes and edges as input to one of these approaches you found.
That said, your problem seems to be an analysis problem more than anything else; you should try running a clustering algorithm on your data, like K-means, which clusters nodes based on a distance function, for which you can simply use the Euclidean distance.
The result of this algorithm is exactly what you'll need, as you'll have clusters of close elements, you'll know what and how many elements are assigned to each group, and you'll be able to, according to these values, generate the coefficient you want to assign to each node.
The only concern worth pointing out here is that you'll have to determine how many clusters -- k-means, k-clusters -- you want to create.
Your initial inclination to draw a circle around each point and count the number of other points in that circle is a good one and, as mentioned by unutbu, a KDTree will be a fast way to solve this problem.
This can be done very easily with PySAL, which uses scipy's kdtree under the hood.
import pysal
import numpy
pts = numpy.random.random((100,2)) #generate some random points
radius = 0.2 #pick an arbitrary radius
#Build a Spatial Weights Matrix
W = pysal.threshold_continuousW_from_array(pts, threshold=radius)
# Note: if your points are in Latitude and Longitude you can increase the accuracy by
# passing the radius of earth to this function and it will use arc distances.
# W = pysal.threshold_continuousW_from_array(pts, threshold=radius, radius=pysal.cg.RADIUS_EARTH_KM)
print(W.cardinalities)
#{0: 10, 1: 15, ..... }
If your data is in a Shapefile, simply replace threshold_continuousW_from_array with threshold_continuousW_from_shapefile, see the docs for details.
Is there even such a thing as a 3D centroid? Let me be perfectly clear: I've been reading and reading about centroids for the last 2 days both on this site and across the web, so I'm perfectly aware of the existing posts on the topic, including Wikipedia.
That said, let me explain what I'm trying to do. Basically, I want to take a selection of edges and/or vertices, but NOT faces. Then, I want to place an object at the 3D centroid position.
I'll tell you what I don't want:
The vertices average, which would pull too far in any direction that has a more high-detailed mesh.
The bounding box center, because I already have something working for this scenario.
I'm open to suggestions about center of mass, but I don't see how this would work, because vertices or edges alone don't define any sort of mass, especially when I just have an edge loop selected.
For kicks, I'll show you some PyMEL that I worked up, using #Emile's code as reference, but I don't think it's working the way it should:
from pymel.core import ls, spaceLocator
from pymel.core.datatypes import Vector
from pymel.core.nodetypes import NurbsCurve

def get_centroid(node):
    if not isinstance(node, NurbsCurve):
        raise TypeError("Requires NurbsCurve.")
    centroid = Vector(0, 0, 0)
    signed_area = 0.0
    cvs = node.getCVs(space='world')
    v0 = cvs[len(cvs) - 1]
    for i, cv in enumerate(cvs[:-1]):
        v1 = cv
        a = v0.x * v1.y - v1.x * v0.y
        signed_area += a
        centroid += sum([v0, v1]) * a
        v0 = v1
    signed_area *= 0.5
    centroid /= 6 * signed_area
    return centroid

texas = ls(selection=True)[0]
centroid = get_centroid(texas)
print(centroid)
spaceLocator(position=centroid)
In theory centroid = SUM(pos*volume)/SUM(volume) when you split the part into finite volumes each with a location pos and volume value volume.
This is precisely the calculation done for finding the center of gravity of a composite part.
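As a small illustration of the formula, here is a sketch with NumPy; the function name and the two-box example are made up:

```python
import numpy as np

def composite_centroid(positions, volumes):
    """centroid = SUM(pos * volume) / SUM(volume) over the finite parts."""
    positions = np.asarray(positions, dtype=float)  # shape (n_parts, 3)
    volumes = np.asarray(volumes, dtype=float)      # shape (n_parts,)
    return (positions * volumes[:, None]).sum(axis=0) / volumes.sum()

# A unit cube centered at (0.5, 0.5, 0.5) joined to a box of volume 3
# centered at (2.5, 0.5, 0.5)
part_centers = [(0.5, 0.5, 0.5), (2.5, 0.5, 0.5)]
part_volumes = [1.0, 3.0]
print(composite_centroid(part_centers, part_volumes))  # [2.  0.5 0.5]
```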
There is not just a 3D centroid, there is an n-dimensional centroid, and the formula for it is given in the "By integral formula" section of the Wikipedia article you cite.
Perhaps you are having trouble setting up this integral? You have not defined your shape.
[Edit] I'll beef up this answer in response to your comment. Since you have described your shape in terms of edges and vertices, I'll assume it is a polyhedron. You can partition a polyhedron into pyramids, find the centroids of the pyramids, and then the centroid of your shape is the centroid of the centroids (this last calculation is done using ja72's formula).
I'll assume your shape is convex (no hollow parts---if this is not the case then break it into convex chunks). You can partition it into pyramids (triangulate it) by picking a point in the interior and drawing edges to the vertices. Then each face of your shape is the base of a pyramid. There are formulas for the centroid of a pyramid (you can look this up, it's 1/4 the way from the centroid of the face to your interior point). Then as was said, the centroid of your shape is the centroid of the centroids---ja72's finite calculation, not an integral---as given in the other answer.
This is the same algorithm as in Hugh Bothwell's answer, however I believe that 1/4 is correct instead of 1/3. Perhaps you can find some code for it lurking around somewhere using the search terms in this description.
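A rough sketch of the pyramid decomposition, assuming a convex shape whose faces have already been triangulated and an interior point that is supplied by the caller. The function name and the face-index format are illustrative. It places each pyramid's centroid 1/4 of the way from the face centroid to the interior point:

```python
import numpy as np

def convex_centroid(vertices, faces, interior_point):
    """Volume-weighted average of pyramid centroids for a convex, triangulated shape."""
    v = np.asarray(vertices, dtype=float)
    apex = np.asarray(interior_point, dtype=float)
    total_volume = 0.0
    weighted_sum = np.zeros(3)
    for i, j, k in faces:  # each face is a triple of vertex indices
        a, b, c = v[i], v[j], v[k]
        # Volume of the pyramid (tetrahedron) with this face as base
        volume = abs(np.linalg.det(np.array([a - apex, b - apex, c - apex]))) / 6
        face_centroid = (a + b + c) / 3
        # 1/4 of the way from the face centroid towards the apex
        pyramid_centroid = face_centroid + (apex - face_centroid) / 4
        total_volume += volume
        weighted_sum += volume * pyramid_centroid
    return weighted_sum / total_volume

tet_vertices = [(0, 0, 0), (1, 0, 0), (0, 1, 0), (0, 0, 1)]
tet_faces = [(0, 1, 2), (0, 1, 3), (0, 2, 3), (1, 2, 3)]
print(convex_centroid(tet_vertices, tet_faces, (0.1, 0.1, 0.1)))
```

For this tetrahedron the result should be (0.25, 0.25, 0.25), the mean of its four vertices, regardless of which interior point is picked.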
I like the question. Centre of mass sounds right, but the question then becomes, what mass for each vertex?
Why not use the average length of each edge that includes the vertex? This should compensate nicely for areas with a dense mesh.
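A minimal sketch of that weighting, assuming the selection is available as a vertex array plus a list of edges given as index pairs (a hypothetical format, not PyMEL's own):

```python
import numpy as np

def edge_weighted_center(vertices, edges):
    """Average of vertices, each weighted by the mean length of its incident edges."""
    v = np.asarray(vertices, dtype=float)
    length_sum = np.zeros(len(v))
    edge_count = np.zeros(len(v))
    for i, j in edges:
        length = np.linalg.norm(v[i] - v[j])
        for idx in (i, j):
            length_sum[idx] += length
            edge_count[idx] += 1
    # Vertices without any selected edge get weight 0
    weights = np.divide(length_sum, edge_count,
                        out=np.zeros(len(v)), where=edge_count > 0)
    return (v * weights[:, None]).sum(axis=0) / weights.sum()

# Three vertices packed near x = 0 and one far away at x = 1
line_vertices = [(0.0, 0, 0), (0.1, 0, 0), (0.2, 0, 0), (1.0, 0, 0)]
line_edges = [(0, 1), (1, 2), (2, 3)]
print(edge_weighted_center(line_vertices, line_edges))
print(np.mean(line_vertices, axis=0))  # plain vertex average, pulled towards x = 0
```

On this example the plain average sits at x = 0.325, while the weighted one moves away from the dense end. As a variation, weighting by the sum rather than the mean of the incident edge lengths gives the length-weighted average of the edge midpoints.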
You will have to recreate face information from the vertices (essentially a Delaunay triangulation).
If your vertices define a convex hull, you can pick any arbitrary point A inside the object. Treat your object as a collection of pyramidal prisms having apex A and each face as a base.
For each face, find the area Fa and the 2d centroid Fc; then the prism's mass is proportional to the volume (== 1/3 base * height (component of Fc-A perpendicular to the face)) and you can disregard the constant of proportionality so long as you do the same for all prisms; the center of mass is (2/3 A + 1/3 Fc), or a third of the way from the apex to the 2d centroid of the base.
You can then do a mass-weighted average of the center-of-mass points to find the 3d centroid of the object as a whole.
The same process should work for non-convex hulls - or even for A outside the hull - but the face-calculation may be a problem; you will need to be careful about the handedness of your faces.