I am solving a problem in which I need to find the maximum distance between two points on a 2D plane. There is an O(n^2) approach in which I calculate the distance between every pair of points. I also implemented a convex hull algorithm, so my current approach is to compute the convex hull in O(n log n) and then run the O(n^2) algorithm on the hull's vertices only. Is there a better way to compute the maximum distance between points of the convex hull?
Here are my algorithms:
O(n^2)
def d(l1, l2):
    # squared Euclidean distance (enough for comparing distances)
    return (l2[0]-l1[0])**2 + (l2[1]-l1[1])**2
def find_max_dist(L):
    max_dist = d(L[0], L[1])
    for i in range(0, len(L)-1):
        for j in range(i+1, len(L)):
            max_dist = max(d(L[i], L[j]), max_dist)
    return max_dist
convex hull
def convex_hull(points):
    """Computes the convex hull of a set of 2D points.
    Input: an iterable sequence of (x, y) pairs representing the points.
    Output: a list of vertices of the convex hull in counter-clockwise order,
      starting from the vertex with the lexicographically smallest coordinates.
    Implements Andrew's monotone chain algorithm. O(n log n) complexity.
    """
    # Sort the points lexicographically (tuples are compared lexicographically).
    # Remove duplicates to detect the case we have just one unique point.
    points = sorted(set(points))
    # Boring case: no points or a single point, possibly repeated multiple times.
    if len(points) <= 1:
        return points
    # 2D cross product of OA and OB vectors, i.e. z-component of their 3D cross product.
    # Returns a positive value, if OAB makes a counter-clockwise turn,
    # negative for clockwise turn, and zero if the points are collinear.
    def cross(o, a, b):
        return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])
    # Build lower hull
    lower = []
    for p in points:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    # Build upper hull
    upper = []
    for p in reversed(points):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    # Concatenation of the lower and upper hulls gives the convex hull.
    # Last point of each list is omitted because it is repeated at the beginning of the other list.
    return lower[:-1] + upper[:-1]
overall algorithm
l = []
for _ in range(int(input())):  # reads the number of points in the plane
    n = tuple(int(i) for i in input().split())  # parses each point into a tuple
    l.append(n)  # appends the point to l
if len(l) >= 10:
    print(find_max_dist(convex_hull(l)))
else:
    print(find_max_dist(l))
How do I improve the running time of my approach, and is there a better way to compute this?
Once you have a convex hull, you can find two furthest points in linear time.
The idea is to keep two pointers: one of them points to the current edge (and is always incremented by one) and the other one points to a vertex.
The answer is the maximum distance between end points of an edge and the vertex for all edges.
It is possible to show (the proof is neither short nor trivial, so I will not post it here) that if we keep incrementing the second pointer every time after moving the first one as long as it increases the distance between the line that goes through the edge and a vertex, we will find the optimal answer.
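A minimal sketch of the two-pointer walk described above, assuming the hull is strictly convex (no collinear vertices) and listed in traversal order, as the convex_hull function in the question returns it. The names cross, dist2 and hull_diameter_sq are illustrative only, and distances are kept squared to match d() in the question.

```python
def cross(o, a, b):
    """Twice the signed area of the triangle o, a, b."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def dist2(a, b):
    """Squared Euclidean distance between a and b."""
    return (a[0] - b[0]) ** 2 + (a[1] - b[1]) ** 2


def hull_diameter_sq(hull):
    """Largest squared distance between two vertices of a convex polygon."""
    n = len(hull)
    if n < 2:
        return 0
    if n == 2:
        return dist2(hull[0], hull[1])
    best = 0
    j = 1  # second pointer: vertex currently farthest from the edge's line
    for i in range(n):  # first pointer: edge hull[i] -> hull[i + 1]
        nxt = (i + 1) % n
        # Advance j while the next vertex is farther from the line through
        # the edge (the triangle area is proportional to that distance).
        while (abs(cross(hull[i], hull[nxt], hull[(j + 1) % n]))
               > abs(cross(hull[i], hull[nxt], hull[j]))):
            j = (j + 1) % n
        # Candidate answers: both end points of the edge against vertex j.
        best = max(best, dist2(hull[i], hull[j]), dist2(hull[nxt], hull[j]))
    return best


print(hull_diameter_sq([(0, 0), (1, 1), (-8, 5)]))  # 97
```

The second pointer never moves backwards, so the walk over the hull takes a linear number of steps in total.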
Related
I am given a rectangle whose lower-left and upper-right vertices are (x1,y1) and (x2,y2). A general point (x, y) outside the rectangle and an integer value R are also given.
My objective is to find the total number of integral points on and inside the rectangle whose distance from (x,y) is less than or equal to R.
What I did is:
n = 0
for i in range(x1, x2+1):
    for j in range(y1, y2+1):
        if (i-x)**2 + (j-y)**2 <= R**2:
            n += 1
print(n)
My code is very inefficient: its time complexity is high because of the nested for loops. Can you please provide an efficient method to solve the same problem in Python?
Example: (x1,y1)=(0,0) and (x2,y2)=(1,1)
let (x,y)=(-8,0) and R=9
then the output must be 3, as only (0,0), (1,0) and (0,1) satisfy the condition.
You can greatly reduce the complexity by approaching the circle's discrete points as a series of lines. With those lines, counting the number of intersecting points with the rectangle can be done mathematically without a nested loop. This will reduce the complexity from O(n^2) down to O(n).
For example:
x1,y1 = (0,0)
x2,y2 = (1,1)
x,y = (-8,0)
R = 9
n = 0
for cy in range(y-R,y+R+1):            # cy: vertical coordinate of circle's lines
    if cy not in range(y1,y2+1):       # no vertical intersection
        continue
    dx = int((R**2-(cy-y)**2)**0.5)    # width of half circle at cy
    cx1,cx2 = x-dx,x+dx                # edges of circle at line cy
    if cx2<x1 or cx1>x2: continue      # no horizontal intersection
    n += min(x2,cx2)-max(x1,cx1)+1     # intersection with cy line
print(n) # 3
Visually:
# (x1,y1) ----- y+3 ... no vertical intersection
# XXXXXXXXXXX--------- y+2 ... intersect line at y+2 with x1..x2
# XXXXXXXXXX----------- y+1 ... intersect line at y+1 with x1..x2
# XXXXXXXXXX-----o----- y ... intersect line at y+0 with x1..x2
# XXXXXXXXXX----------- y-1 ... intersect line at y-1 with x1..x2
# XXXXXXXXXXX--------- y-2 ... intersect line at y-2 with x1..x2
# XXXXXXXXXXXX ----- y-3 ... no horizontal intersection
# XXXXXXXXXXXX
# XXXXXXXXXXXX
# (x2,y2)
Find points of intersection of the circle with rectangle.
Classify the intersection type: cap (circle segment), sector-like, or segment without sector; other types may be possible.
Scan by Y-coordinate, and for every Y get the corresponding last X that is inside both the rectangle and the circle.
For a sector-like intersection, add the X - edgeX value to the result. edgeX might be x1 or x2, depending on which edge is intersected by the circle.
For a circle cap, take the two X-points and add their difference.
With this approach the complexity is linear in the rectangle size, but some effort is needed to classify the intersection.
You can use numpy for a vectorized bruteforce version:
import numpy as np
P_x,P_y=-8,0
R=9
x_min,x_max=0,1
y_min,y_max=0,1
xx,yy=np.meshgrid(np.arange(x_min,x_max+1),np.arange(y_min,y_max+1))
distances=((xx-P_x)**2+(yy-P_y)**2)
print(distances)
print(np.sum(distances<=R**2))
However, you will probably get an even faster result with a semi-analytic approach. The computational complexity of this formulation is still on the order of the area of the rectangle, only with a lower prefactor thanks to the execution speed of numpy routines. The problem can be reduced to looking only at the boundary of the circle and deciding which subset of points lies within it.
This is a question that I was asked in a job interview some time ago, and I still can't figure out a sensible answer.
Question is:
You are given a set of points (x,y). Find the 2 points that are most distant from each other.
For example, for points: (0,0), (1,1), (-8, 5) - the most distant are: (1,1) and (-8,5) because the distance between them is larger than both (0,0)-(1,1) and (0,0)-(-8,5).
The obvious approach is to calculate all distances between all points, and find the maximum. The problem is that it is O(n^2), which makes it prohibitively expensive for large datasets.
There is an approach that first finds the points on the boundary and then calculates distances only among them, on the premise that there will be fewer points on the boundary than "inside", but it's still expensive and will fail in the worst-case scenario.
I tried to search the web, but didn't find any sensible answer, although this might simply be my lack of search skills.
For this specific problem, with just a list of Euclidean points, one way is to find the convex hull of the set of points. The two distant points can then be found by traversing the hull once with the rotating calipers method.
Here is an O(N log N) implementation:
http://mukeshiiitm.wordpress.com/2008/05/27/find-the-farthest-pair-of-points/
If the list of points is already sorted, you can remove the sort to get the optimal O(N) complexity.
For a more general problem of finding most distant points in a graph:
Algorithm to find two points furthest away from each other
The accepted answer works in O(N^2).
Boundary point algorithms abound (look for convex hull algorithms). From there, it should take O(N) time to find the most-distant opposite points.
From the author's comment: first find any pair of opposite points on the hull, and then walk around it in semi-lock-step fashion. Depending on the angles between edges, you will have to advance either one walker or the other, but it will always take O(N) to circumnavigate the hull.
You are looking for an algorithm to compute the diameter of a set of points, Diam(S). It can be shown that this is the same as the diameter of the convex hull of S, Diam(S) = Diam(CH(S)). So first compute the convex hull of the set.
Now you have to find all the antipodal pairs on the convex hull and pick the pair with maximum distance. There are O(n) antipodal pairs on a convex polygon, so this gives an O(n lg n) algorithm for finding the farthest points.
This technique is known as Rotating Calipers. This is what Marcelo Cantos describes in his answer.
If you write the algorithm carefully, you can do without computing angles. For details, check this URL.
A stochastic algorithm to find the most distant pair would be
Choose a random point
Get the point most distant to it
Repeat a few times
Remove all visited points
Choose another random point and repeat a few times.
You are in O(n) as long as you predetermine "a few times", but are not guaranteed to actually find the most distant pair. But depending on your set of points the result should be pretty good. =)
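The steps above can be sketched as follows. This is only an illustration under assumptions: the function name approx_farthest_pair and the parameters restarts, hops and seed are hypothetical, points are hashable (x, y) tuples, and the result is a lower bound on the true maximum, not a guaranteed answer.

```python
import random


def dist2(a, b):
    """Squared Euclidean distance between a and b."""
    return (a[0] - b[0]) ** 2 + (a[1] - b[1]) ** 2


def approx_farthest_pair(points, restarts=3, hops=3, seed=0):
    """Heuristic farthest pair: returns (squared distance, point, point)."""
    rng = random.Random(seed)
    remaining = list(set(points))
    best = (0, None, None)
    for _ in range(restarts):
        if len(remaining) < 2:
            break
        current = rng.choice(remaining)        # choose a random point
        visited = {current}
        for _ in range(hops):                  # "repeat a few times"
            # Get the point most distant from the current one: O(n) per hop.
            far = max(remaining, key=lambda q: dist2(current, q))
            d = dist2(current, far)
            if d > best[0]:
                best = (d, current, far)
            visited.add(far)
            current = far
        # Remove all visited points before the next random restart.
        remaining = [p for p in remaining if p not in visited]
    return best


print(approx_farthest_pair([(0, 0), (1, 1), (-8, 5)])[0])  # 97
```

With restarts and hops fixed in advance, the total work is O(restarts * hops * n), which is linear in the number of points.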
This question is covered in Introduction to Algorithms. It describes two steps: 1) compute the convex hull in O(N lg N); 2) if there are M vertices on the convex hull, finding the farthest pair then takes O(M).
I found this helpful link. It includes an analysis of the algorithm details and a program.
http://www.seas.gwu.edu/~simhaweb/alg/lectures/module1/module1.html
I hope this is helpful.
Find the mean of all the points, measure the difference between all points and the mean, take the point the largest distance from the mean and find the point farthest from it. Those points will be the absolute corners of the convex hull and the two most distant points.
I recently did this for a project that needed convex hulls confined to randomly directed infinite planes. It worked great.
See the comments: this solution isn't guaranteed to produce the correct answer.
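A small counterexample makes the caveat concrete. The sketch below implements the mean-based procedure exactly as described and compares it with a brute-force check; the point set and the function names mean_heuristic and brute_force are made up for illustration, and distances are squared so the arithmetic stays exact.

```python
from itertools import combinations


def dist2(a, b):
    """Squared Euclidean distance between a and b."""
    return (a[0] - b[0]) ** 2 + (a[1] - b[1]) ** 2


def mean_heuristic(points):
    """Farthest point from the mean, then the farthest point from that one."""
    n = len(points)
    mean = (sum(p[0] for p in points) / n, sum(p[1] for p in points) / n)
    b = max(points, key=lambda p: dist2(mean, p))
    c = max(points, key=lambda p: dist2(b, p))
    return dist2(b, c)


def brute_force(points):
    """Exact answer: check every pair."""
    return max(dist2(a, b) for a, b in combinations(points, 2))


points = [(-10, 0), (10, 0), (0, -17), (0, 2)]
print(mean_heuristic(points))  # 389: pairs (0, -17) with (-10, 0)
print(brute_force(points))     # 400: the true pair is (-10, 0), (10, 0)
```

Here the mean is (0, -3.75), so the point farthest from it is (0, -17), and no pair involving that point reaches the true maximum.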
Just a few thoughts:
You might look at only the points that define the convex hull of your set of points to reduce the number,... but it still looks a bit "not optimal".
Otherwise there might be a recursive quad/oct-tree approach to rapidly bound some distances between sets of points and eliminate large parts of your data.
This seems easy if the points are given in Cartesian coordinates. So easy that I'm pretty sure that I'm overlooking something. Feel free to point out what I'm missing!
Find the points with the max and min values of their x, y, and z coordinates (6 points total). These should be the most "remote" of all the boundary points.
Compute all the distances between them (15 unique pairs)
Find the max distance
The two points that correspond to this max distance are the ones you're looking for.
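What this approach overlooks can be seen with a quick check. The following sketch implements the coordinate-extremes idea for any dimension and compares it with brute force on a 2D point set chosen for illustration; the names extremes_heuristic and brute_force are hypothetical, and distances are squared.

```python
from itertools import combinations


def dist2(a, b):
    """Squared Euclidean distance in any dimension."""
    return sum((u - v) ** 2 for u, v in zip(a, b))


def extremes_heuristic(points):
    """Best pair among the points with min/max value in each coordinate."""
    extremes = set()
    for k in range(len(points[0])):
        extremes.add(min(points, key=lambda p: p[k]))
        extremes.add(max(points, key=lambda p: p[k]))
    return max(dist2(a, b) for a, b in combinations(extremes, 2))


def brute_force(points):
    """Exact answer: check every pair."""
    return max(dist2(a, b) for a, b in combinations(points, 2))


points = [(10, 0), (-10, 0), (0, 10), (0, -10), (8, 8), (-8, -8)]
print(extremes_heuristic(points))  # 400
print(brute_force(points))         # 512, from (8, 8) and (-8, -8)
```

The diagonal points (8, 8) and (-8, -8) are extreme in no single coordinate, yet they form the farthest pair, so the axis-aligned extremes alone are not enough.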
Here's a good solution, which works in O(n log n). It's called Rotating Caliper’s Method.
https://www.geeksforgeeks.org/maximum-distance-between-two-points-in-coordinate-plane-using-rotating-calipers-method/
Firstly, you find the convex hull, which you can compute in O(n log n) with Graham's scan. Only points on the convex hull can give you the maximal distance. This algorithm arranges the points of the convex hull in clockwise order; this property will be used later.
Secondly, for every point on the convex hull you need to find the most distant point on the hull (called the antipodal point here). You don't have to search for each antipodal point separately, which would take quadratic time. Let's say the points of the convex hull are called p_1, ..., p_n, and their order corresponds to the clockwise traversal. There is a property of convex polygons that when you iterate through points p_j on the hull in clockwise order and calculate the distances d(p_i, p_j), these distances first don't decrease (and may increase) and then don't increase (and may decrease). So you can find the maximum distance easily in this case.
Moreover, once you've found the correct antipodal point p_j* for p_i, you can start the search for p_{i+1} from that p_j*. You don't need to check the previously seen points. In total, p_i iterates through the points p_1, ..., p_n once, and p_j iterates through these points at most twice, because p_j can never catch up with p_i (that would give zero distance), and we stop when the distance starts decreasing.
A solution that has runtime complexity O(N) is a combination of the above
answers. In detail:
(1) One can compute the convex hull with runtime complexity O(N) if you
use counting sort as an internal polar angle sort and are willing to
use angles rounded to the nearest integer [0, 359], inclusive.
(2) Note that the number of points on the convex hull is then N_H which is usually less than N.
We can estimate the size of the hull from Cormen et al., Introduction to Algorithms, Problem 33-5 (sparse-hulled distributions):
the expected number of hull vertices is about n^(1/3) for points drawn uniformly from a unit-radius disk, log_2(n) for points drawn uniformly from a convex polygon with k sides, and sqrt(log_2(n)) for a 2-D normal distribution.
The furthest pair problem is then between comparison of points on the hull.
This is N_H^2, but each leading point's search for distance point can be
truncated when the distances start to decrease if the points are traversed
in the order of the convex hull (those points are ordered CCW from first point).
The runtime complexity for this part is then O(N_H^2).
Because N_H^2 is usually less than N, the total runtime complexity
for furthest pair is O(N) with a caveat of using integer degree angles to reduce the sort in the convex hull to linear.
Given a set of points {(x1,y1), (x2,y2) ... (xn,yn)} find 2 most distant points.
My approach:
1). You need a reference point (xa,ya), and it will be:
xa = ( x1 + x2 +...+ xn )/n
ya = ( y1 + y2 +...+ yn )/n
2). Calculate all distance from point (xa,ya) to (x1,y1), (x2,y2),...(xn,yn)
The first "most distant point" (xb,yb) is the one with the maximum distance.
3). Calculate all distance from point (xb,yb) to (x1,y1), (x2,y2),...(xn,yn)
The other "most distant point" (xc,yc) is the one with the maximum distance.
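So you get a candidate pair (xb,yb), (xc,yc) in O(n). Note that this is a heuristic: it is not guaranteed to return the true most distant pair for every input.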
For example, for points: (0,0), (1,1), (-8, 5)
1). Reference point (xa,ya) = (-2.333, 2)
2). Calculate distances:
from (-2.333, 2) to (0,0) : 3.073
from (-2.333, 2) to (1,1) : 3.480
from (-2.333, 2) to (-8, 5) : 6.411
So the first most distant point is (-8, 5)
3). Calculate distances:
from (-8, 5) to (0,0) : 9.434
from (-8, 5) to (1,1) : 9.849
from (-8, 5) to (-8, 5) : 0
So the other most distant point is (1, 1)
Let's look at m points in n-d space- (A solution for 4 points in 3-d space is here: minimize distance from sets of points)
a= (x1, y1, z1, ..)
b= (x2, y2 ,z2, ..)
c= (x3, y3, z3, ..)
.
.
p= (x , y , z, ..)
Find point q = c1* a + c2* b + c3* c + ..
where c1 + c2 + c3 + .. = 1
and c1, c2, c3, .. >= 0
s.t.
euclidean distance pq is minimized.
What algorithms can be used ? Idea or pseudocode is enough.
(Optimizing performance is a big issue here. Monte Carlo method with all vertices and changing coefficients would also give a solution.)
We can assume p = 0 by subtracting p from all the other points. Then the question is one of minimizing the norm over a convex hull of a finite set of points, i.e., a polytope.
There are a few papers on this problem. It looks like "A recursive algorithm for finding the minimum norm point in a polytope and a pair of closest points in two polytopes" by Kazuyuki Sekitani and Yoshitsugu Yamamoto is a good one, with a short survey of prior solutions to the problem. It is behind a paywall but if you have access to a university library you may be able to download a copy.
The algorithm they give is fairly simple, once you get past the notation. P is the finite set of points. C(P) is its convex hull. Nr(C(P)) is the unique point of minimum norm, which is what you want to find.
Step 0: Choose a point x_0 from the convex hull C(P) of your finite set of points P. They recommend choosing x_0 to be the point in P with minimum norm. Let k=1.
Now loop:
Step 1: Let a_k = min {x^t_{k-1} p | p is in P}. Here x^t_{k-1} is the transpose of x_{k-1} (so the function being minimized is just a dot product as p ranges over your finite set P). If |x_{k-1}|^2 <= a_k, then the answer is x_{k-1}, stop.
Step 2: P_k = {p | p in P and x^t_{k-1} p = a_k}. P_k is the subset of P that minimizes the expression in Step 1. Call the algorithm recursively on this set P_k, and let the result be y_k = Nr(C(P_k)).
Step 3: b_k = min{y^t_k p | p in P\P_k}, the minimum of the dot product of y_k with points in the complement set P\P_k. If |y_k|^2 <= b_k then y_k is the answer, stop.
Step 4: s_k = max{s| [(1-s)x_{k-1} + sy_k]^t y_k <= [(1-s)x_{k-1} + sy_k]^t p for every p in P\P_k}. Let x_k = (1-s_k) x_{k-1} + s_k y_k, let k=k+1, and go back to Step 1.
There is an explicit formula for s_k in Step 4:
s_k = min{ [x^t_{k-1} (p-y_k)]/[(y_k-x_{k-1})^t (y_k-p)] | p in P\P_k and (y_k - x_{k-1})^t (y_k-p) > 0 }
There is a proof in the paper that s_k has the necessary properties, that the algorithm terminates after a finite number of operations, and that the result is indeed optimal.
Note that you should add some tolerance into your comparisons, otherwise rounding errors may cause the algorithm to fail. There is a lot of discussion about numerical stability, see the paper for details.
They do not give a complete analysis of the computational complexity of the algorithm, but they do prove it is at most O(m^2) in the two-dimensional case (m is the number of points in P), and they have done numerical experiments which give the impression that it is sublinear in time as a function of m, with dimension fixed. I'm skeptical of that claim. In the absence of a detailed analysis, I suggest you try some experiments with typical data to see how well the algorithm performs for you.
Stated a simpler way, you have a set of points {a}i, and you are considering all points which are some weighted average thereof. This set of points is exactly the convex hull of those points; it's a polytope (polygon, polyhedron, etc.) that just happens to be convex, where the corners are a subset of the {a}i points.
You are just asking which point on a polytope(~hedron) is closest to a point. (your query point p)
The closest point must be on the exterior of the polytope. One algorithm would be to brute-force searching all N-1 dimensional surfaces. Do this in the usual way you would find the closest point on a line or surface or N-dimensional surface to a query point.
(If the points are not all linearly independent, you will have multiple ways (multiple weight vectors) which can give you the same weighted-average point q. You can worry about reconstructing the answer q from the basis vectors after you find it geometrically.)
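For quick experiments with this minimum-distance problem, here is a minimal sketch of a different and simpler technique than the Sekitani-Yamamoto recursion described above: the Frank-Wolfe (conditional gradient) iteration. It is an approximation scheme, not an exact method; the function name closest_point_in_hull and the parameters iterations and tol are assumptions made for illustration.

```python
def closest_point_in_hull(points, p, iterations=1000, tol=1e-12):
    """Approximate the point of conv(points) nearest to p (Frank-Wolfe)."""
    dim = len(p)

    def dot(u, v):
        return sum(a * b for a, b in zip(u, v))

    # Start from the vertex nearest to p.
    q = list(min(points, key=lambda v: sum((a - b) ** 2 for a, b in zip(v, p))))
    for _ in range(iterations):
        g = [q[k] - p[k] for k in range(dim)]      # gradient of |q - p|^2 / 2
        v = min(points, key=lambda w: dot(g, w))   # vertex minimizing g . w
        d = [v[k] - q[k] for k in range(dim)]      # move from q toward v
        slope = dot(g, d)
        if slope >= -tol:                          # no descent direction left
            break
        step = min(1.0, -slope / dot(d, d))        # exact line search on [0, 1]
        q = [q[k] + step * d[k] for k in range(dim)]
    return q


triangle = [(0, 0), (2, 0), (0, 2)]
print(closest_point_in_hull(triangle, (2, 2)))  # [1.0, 1.0]
```

Every iterate is a convex combination of the input points, so q always satisfies the constraints c1 + c2 + ... = 1 and ci >= 0 from the question. Convergence can be slow, so the iteration cap and tolerance should be tuned on typical data.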
I have created a convex hull using scipy.spatial.ConvexHull. I need to compute the intersection point between the convex hull and a ray, starting at 0 and in the direction of some other defined point. The convex hull is known to contain 0 so the intersection should be guaranteed. The dimension of the problem can vary between 2 and 5. I have tried some google searching but haven't found an answer. I am hoping this is a common problem with known solutions in computational geometry. Thank you.
According to qhull.org, the points x of a facet of the convex hull satisfy V.x+b=0, where V and b are given by hull.equations. (The . stands for the dot product here. V is a normal vector of length one.)
If V is a normal, b is an offset, and x is a point inside the convex hull, then V.x+b < 0.
If U is a vector of the ray starting at O, the equation of the ray is x=αU, α>0, so the intersection of the ray and a facet's hyperplane is x = αU = -b/(V.U) U. The unique intersection point with the hull corresponds to the minimum of the positive values of α.
The following code computes it:
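import numpy as np
from scipy.spatial import ConvexHull
def hit(U, hull):
    eq = hull.equations.T
    V, b = eq[:-1], eq[-1]       # V: (dim, n_facets) normals, b: (n_facets,) offsets
    alpha = -b / np.dot(U, V)    # ray parameter at each facet's hyperplane
    return np.min(alpha[alpha > 0]) * U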
It is a pure numpy solution so it is fast. An example for 1 million points in the [-1,1]^3 cube :
In [13]: points=2*np.random.rand(10**6,3)-1;hull=ConvexHull(points)
In [14]: %timeit x=hit(np.ones(3),hull)
#array([ 0.98388702, 0.98388702, 0.98388702])
10000 loops, best of 3: 30 µs per loop
As mentioned by Ante in the comments, you need to find the closest intersection of all the lines/planes/hyper-planes in the hull.
To find the intersection of the ray with the hyperplane, do a dot product of the normalized ray with the hyperplane normal, which will tell you how far in the direction of the hyperplane normal you move for each unit distance along the ray.
If the dot product is negative it means that the hyperplane is in the opposite direction of the ray, if zero it means the ray is parallel to it and won't intersect.
Once you have a positive dot product, you can work out how far away the hyperplane is in the direction of the ray, by dividing the distance of the plane in the direction of the plane normal by the dot product. For example if the plane is 3 units away, and the dot product is 0.5, then you only get 0.5 units closer for every unit you move along the ray, so the hyperplane is 3 / 0.5 = 6 units away in the direction of the ray.
Once you have calculated this distance for all the hyperplanes and found the closest one, the intersection point is just the ray multiplied by the closest distance.
Here is a solution in Python (normalize function is from here):
import numpy as np

def normalize(v):
    norm = np.linalg.norm(v)
    if norm == 0:
        return v
    return v / norm

def find_hull_intersection(hull, ray_point):
    # normalise ray_point
    unit_ray = normalize(ray_point)
    # find the closest line/plane/hyperplane in the hull:
    closest_plane = None
    closest_plane_distance = 0
    for plane in hull.equations:
        normal = plane[:-1]
        # qhull stores each facet as normal.x + offset = 0 with an outward
        # normal, so the offset is negative when the origin is inside the
        # hull; its magnitude is the distance from the origin to the plane
        # (the normal itself must not be flipped)
        distance = abs(plane[-1])
        # if plane passes through the origin then return the origin
        if distance == 0:
            return np.multiply(ray_point, 0) # return n-dimensional zero vector
        # find out how much we move along the plane normal for
        # every unit distance along the ray normal:
        dot_product = np.dot(normal, unit_ray)
        # check the dot product is positive, if not then the
        # plane is in the opposite direction to the ray
        if dot_product > 0:
            # calculate the distance of the plane
            # along the ray normal:
            ray_distance = distance / dot_product
            # is this the closest so far:
            if closest_plane is None or ray_distance < closest_plane_distance:
                closest_plane = plane
                closest_plane_distance = ray_distance
    # was there no valid plane? (should never happen):
    if closest_plane is None:
        return None
    # return the point along the unit_ray of the closest plane,
    # which will be the intersection point
    return np.multiply(unit_ray, closest_plane_distance)
Test code in 2D (the solution generalizes to higher dimensions):
from scipy.spatial import ConvexHull
import numpy as np
points = np.array([[-2, -2], [2, 0], [-1, 2]])
h = ConvexHull(points)
closest_point = find_hull_intersection(h, [1, -1])
print(closest_point)
output:
[ 0.66666667 -0.66666667]
[Question has been rewritten for clarification]
I'm trying to come up with a sorting function. What is being sorted is a list of points.
The sorting function takes in 3 points. One from the list of points to be sorted, and two others that are used for comparison. The goal is to determine the relative euclidean distance the point to be sorted is from the other two points. The lowest value of the function should be given when the point lies directly between the two points. The function should make use of the euclidean distance between both points.
So far it seems like the formula should either be the sum of the squares of the distances, or create a point midway between the two given points and use the euclidean distance to that point. Below I've included the two candidate functions so far.
p is the point to be sorted
p1,p2 are the given points
def f(p,p1,p2): #Midpoint distance
    midPoint = midpoint(p1,p2)
    return distance(p,midPoint)
def f(p,p1,p2): #Sum of squares
    return distance(p,p1) ** 2 + distance(p,p2) ** 2
def distance(pointA,pointB): #Pseudocode
    dx = pointA.x - pointB.x
    dy = pointA.y - pointB.y
    return sqrt(dx ** 2 + dy ** 2)
Below is an example:
The two points being considered here are the ones with the line drawn between them. The circled points should be the three lowest points in the sorting algorithm. The close point to the left is penalized for being close to one of the two points, but far from the other.
Maybe the Least Squares method would help? So you sum the square of the distances. This way the left node would be penalized for being too far from the right node in the line.
Another option is to take the distance to the halfway point on the line made by the two base nodes. This would also prefer the three nodes to the one on the left.
Well using the average seems like the intuitive way to do it (by the way this will be the same as using the sum). One other thing you could do would be to use the 'weighted' average. For instance, if a is the shorter distance, you could give it a higher priority by using (2*a + b) / 3, for instance (or in general (m*a + b*n) / (m + n) where m > n).
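The candidate scoring functions discussed in this thread can be compared side by side with a short sketch. It assumes points are plain (x, y) tuples instead of objects with .x and .y, and the names sum_of_squares, midpoint_distance and weighted_average, as well as the sample points, are made up for illustration.

```python
from math import dist  # available in Python 3.8+


def sum_of_squares(p, p1, p2):
    return dist(p, p1) ** 2 + dist(p, p2) ** 2


def midpoint_distance(p, p1, p2):
    mid = ((p1[0] + p2[0]) / 2, (p1[1] + p2[1]) / 2)
    return dist(p, mid)


def weighted_average(p, p1, p2, m=2, n=1):
    # a is the shorter of the two distances and gets the larger weight m
    a, b = sorted((dist(p, p1), dist(p, p2)))
    return (m * a + n * b) / (m + n)


p1, p2 = (0, 0), (10, 0)
candidates = [(-1, 0), (9, 4), (5, 3), (5, 0)]

by_squares = sorted(candidates, key=lambda p: sum_of_squares(p, p1, p2))
by_midpoint = sorted(candidates, key=lambda p: midpoint_distance(p, p1, p2))
print(by_squares)   # [(5, 0), (5, 3), (9, 4), (-1, 0)]
print(by_midpoint)  # [(5, 0), (5, 3), (9, 4), (-1, 0)]
```

The two orderings always agree, because |p-p1|^2 + |p-p2|^2 = 2|p-mid|^2 + |p1-p2|^2/2, so the sum of squares is an increasing function of the midpoint distance. The weighted average can rank points differently: for example, it scores (-1, 0) lower than (5, 0) with the default weights.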