Snap a point to the closest part of a line? - python

I have a work order management system that has Python 2.7.
In the system, it is only possible to use the libraries that are included in the standard Python 2.7 library (it's not possible to import other libraries).
The system has lines and points:
Line Vertex 1: 676561.00, 4860927.00
Line Vertex 2: 676557.00, 4860939.00
Point 100: 676551.00, 4860935.00
Point 200: 676558.00, 4860922.00
I want to snap the points to the nearest part of the line.
Snapping can be defined as:
Moving a point so that it coincides exactly with the closest part of a line.
There appear to be two different mathematical scenarios at play:
A. The closest part of a line is the position along the line where the point is perpendicular to the line.
B. Or, the closest part of the line is simply the closest vertex.
Is it possible to snap a point to the closest part of a line using Python 2.7 (and the standard 2.7 library)?

This is an extremely helpful resource.
import collections
import math
Line = collections.namedtuple('Line', 'x1 y1 x2 y2')
Point = collections.namedtuple('Point', 'x y')
def lineLength(line):
dist = math.sqrt((line.x2 - line.x1)**2 + (line.y2 - line.y1)**2)
return dist
## See http://paulbourke.net/geometry/pointlineplane/
## for helpful formulas
## Inputs
line = Line(0.0, 0.0, 100.0, 0.0)
point = Point(50.0, 1500)
## Calculations
print('Inputs:')
print('Line defined by: ({}, {}) and ({}, {})'.format(line.x1, line.y1, line.x2, line.y2))
print('Point "P": ({}, {})'.format(point.x, point.y))
len = lineLength(line)
if (len == 0):
raise Exception('The points on input line must not be identical')
print('\nResults:')
print('Length of line (calculated): {}'.format(len))
u = ((point.x - line.x1) * (line.x2 - line.x1) + (point.y - line.y1) * (line.y2 - line.y1)) / (
len**2)
# restrict to line boundary
if u > 1:
u = 1
elif u < 0:
u = 0
nearestPointOnLine = Point(line.x1 + u * (line.x2 - line.x1), line.y1 + u * (line.y2 - line.y1))
shortestLine = Line(nearestPointOnLine.x, nearestPointOnLine.y, point.x, point.y)
print('Nearest point "N" on line: ({}, {})'.format(nearestPointOnLine.x, nearestPointOnLine.y))
print('Length from "P" to "N": {}'.format(lineLength(shortestLine)))

You can calculate the line equation and then the distance between each point and the line.
For instance:
import collections
import math
Point = collections.namedtuple('Point', "x, y")
def distance(pt, a, b, c):
# line eq: ax + by + c = 0
return math.fabs(a * pt.x + b * pt.y + c) / math.sqrt(a**2 + b**2)
l1 = Point(676561.00, 4860927.00)
l2 = Point(676557.00, 4860939.00)
# line equation
a = l2.y - l1.y
b = l1.x - l2.x
c = l2.x * l1.y - l2.y * l1.x
assert a * l1.x + b * l1.y + c == 0
assert a * l2.x + b * l2.y + c == 0
p100 = Point(676551.00, 4860935.00)
p200 = Point(676558.00, 4860922.00)
print(distance(p100, a, b, c))
print(distance(p200, a, b, c))
You get:
6.957010852370434
4.427188724235731
Edit1: calculating the orthographic projection
What you want is the coordinates of the orthographic projection of p100 and p200 on the line (l1, l2).
You can calculate that as follow:
import collections
import math
Point = collections.namedtuple('Point', "x, y")
def snap(pt, pt1, pt2):
# type: (Point, Point, Point) -> Point
v = Point(pt2.x - pt1.x, pt2.y - pt1.y)
dv = math.sqrt(v.x ** 2 + v.y ** 2)
bh = ((pt.x - pt1.x) * v.x + (pt.y - pt1.y) * pt2.y) / dv
h = Point(
pt1.x + bh * v.x / dv,
pt1.y + bh * v.y / dv
)
return h
l1 = Point(676561.00, 4860927.00)
l2 = Point(676557.00, 4860939.00)
p100 = Point(676551.00, 4860935.00)
p200 = Point(676558.00, 4860922.00)
s100 = snap(p100, l1, l2)
s200 = snap(p200, l1, l2)
print(s100)
print(p100)
You get:
Point(x=-295627.7999999998, y=7777493.4)
Point(x=676551.0, y=4860935.0)
You can check that the snapped points are on the line:
# line equation
a = l2.y - l1.y
b = l1.x - l2.x
c = l2.x * l1.y - l2.y * l1.x
assert math.fabs(a * s100.x + b * s100.y + c) < 1e-6
assert math.fabs(a * s200.x + b * s200.y + c) < 1e-6
Edit2: snap to the line segment
If you want to snap to a line segment, you need to check if the orthographic projection is inside the line segment or not.
If the orthographic projection is inside the line segment: it is the solution,
If it is near an extremity of the segment, this extremity is the solution.
You can do that as bellow:
def distance_pts(pt1, pt2):
v = Point(pt2.x - pt1.x, pt2.y - pt1.y)
dv = math.sqrt(v.x ** 2 + v.y ** 2)
return dv
def snap(pt, pt1, pt2):
# type: (Point, Point, Point) -> Point
v = Point(pt2.x - pt1.x, pt2.y - pt1.y)
dv = distance_pts(pt1, pt2)
bh = ((pt.x - pt1.x) * v.x + (pt.y - pt1.y) * pt2.y) / dv
h = Point(pt1.x + bh * v.x / dv, pt1.y + bh * v.y / dv)
if 0 <= (pt1.x - h.x) / (pt2.x - h.y) < 1:
# in the line segment
return h
elif distance_pts(h, pt1) < distance_pts(h, pt2):
# near pt1
return pt1
else:
# near pt2
return pt2
The solutions for p100 and p200 are:
Point(x=676557.0, y=4860939.0)
Point(x=676551.0, y=4860935.0)

Another option:
# snap a point to a 2d line
# parameters:
# A,B: the endpoints of the line
# C: the point we want to snap to the line AB
# all parameters must be a tuple/list of float numbers
def snap_to_line(A,B,C):
Ax,Ay = A
Bx,By = B
Cx,Cy = C
# special case: A,B are the same point: just return A
eps = 0.0000001
if abs(Ax-Bx) < eps and abs(Ay-By) < eps: return [Ax,Ay]
# any point X on the line can be represented by the equation
# X = A + t * (B-A)
# so for point C we compute its perpendicular D on the line
# and the parameter t for D
# if t is between 0..1 then D is on the line so the snap point is D
# if t < 0 then the snap point is A
# if t > 1 then the snap point is B
#
# explanation of the formula for distance from point to line:
# http://paulbourke.net/geometry/pointlineplane/
#
dx = Bx-Ax
dy = By-Ay
d2 = dx*dx + dy*dy
t = ((Cx-Ax)*dx + (Cy-Ay)*dy) / d2
if t <= 0: return A
if t >= 1: return B
return [dx*t + Ax, dy*t + Ay]
if __name__=="__main__":
A,B = [676561.00, 4860927.00],[676557.00, 4860939.00]
C = [676551.00, 4860935.00]
print(snap_to_line(A,B,C))
C = [676558.00, 4860922.00]
print(snap_to_line(A,B,C))

Related

Angles calculated by function are inconsistent

I have a function that is intended to rotate polygons by 5 degrees left or right and return their new points. This function is as follows, along with the function player_center that it requires.
# finds center of player shape
# finds slope and midpoint of each vertice-midpoint line on the longer sides,
# then the intercept of them all
def player_center(player):
left_mid = line_midpoint(player[0], player[1])
right_mid = line_midpoint(player[0], player[2])
left_mid_slope = line_slope(left_mid, player[2])
right_mid_slope = line_slope(right_mid, player[1])
left_mid_line = find_equation(player[2], left_mid_slope, True)
right_mid_line = find_equation(player[1], right_mid_slope, True)
standard_left_mid_line = slope_intercept_to_standard(left_mid_line[0], left_mid_line[1], left_mid_line[2])
standard_right_mid_line = slope_intercept_to_standard(right_mid_line[0], right_mid_line[1], right_mid_line[2])
lines = sym.Matrix([standard_left_mid_line, standard_right_mid_line])
return (float(lines.rref()[0].row(0).col(2)[0]), float(lines.rref()[0].row(1).col(2)[0]))
# rotates the player using SOHCAHTOA
# divides x coordinate by radius to find angle, then adds or subtracts increment of 5 to it depending on direction
# calculates the position of point at incremented angle, then appends to new set of points
# finally, new set is returned
# direction; 1 is left, 0 is right
def rotate_player(player, direction):
increment = math.pi/36 # radian equivalent of 5 degrees
full_circle = 2 * math.pi # radian equivalent of 360 degrees
center = player_center(player)
new_player = []
for point in player:
radius = line_distance(point, center)
point_sin = (center[1] - point[1])/radius
while (point_sin > 1):
point_sin -= 1
point_angle = math.asin(point_sin)
if (direction == 1):
if ((point_angle+increment) > math.pi * 2):
new_angle = (point_angle+increment) - math.pi * 2
else:
new_angle = point_angle + increment
else:
if ((point_angle-increment) < 0):
new_angle = 2 * math.pi + (point_angle-increment)
else:
new_angle = point_angle-increment
print("The angle was {}".format(math.degrees(point_angle)))
print("The angle is now {}".format(math.degrees(new_angle))) # print lines are for debug purposes
new_point = ((radius * math.cos(new_angle)) + center[0], -(radius * math.sin(new_angle)) + center[1])
new_player.append(new_point)
print(new_player)
return new_player
The geometric functions that it relies on are all defined in this file here:
import math
import sympy as sym
# shape is in form of list of tuples e.g [(1,1), (2,1), (1,0), (2,0)]
# angle is in degrees
# def rotate_shape(shape, angle):
def line_distance(first_point, second_point):
return math.sqrt( (second_point[0] - first_point[0]) ** 2 + (second_point[1] - first_point[1]) ** 2)
# undefined is represented by None in this program
def line_slope(first_point, second_point):
if (second_point[0] - first_point[0] == 0):
return None
elif (second_point[1] - first_point[1] == 0):
return 0
else:
return (second_point[1] - first_point[1])/(second_point[0] - first_point[0])
def line_midpoint(first_point, second_point):
return ( (first_point[0] + second_point[0])/2, (first_point[1] + second_point[1])/2 )
def find_equation(coord_pair, slope, array_form):
# Array form useful for conversion into standard form
if (array_form == True):
if (slope == 0):
intercept = coord_pair[1]
return [0, 1, intercept]
elif (slope == None):
intercept = coord_pair[0]
return [1, 0, intercept]
else:
intercept = coord_pair[1] - (coord_pair[0] * slope)
return [slope, 1, intercept]
else:
if (slope == 0):
intercept = coord_pair[1]
print("y = {0}".format(intercept))
return
elif (slope == None):
intercept = coord_pair[0]
print("x = {0}".format(intercept))
return
else:
intercept = coord_pair[1] - (coord_pair[0] * slope)
if (intercept >= 0):
print("y = {0}x + {1}".format(slope, intercept))
return
else:
print("y = {0}x - {1}".format(slope, intercept))
def find_perpendicular(slope):
if (slope == 0):
return None
elif (slope == None):
return 0
else:
return -(1/slope)
def find_perp_bisector(first_point, second_point):
# This function finds the perpendicular bisector between two points, using funcs made previously
midpoint = line_midpoint(first_point, second_point)
slope = line_slope(first_point, second_point)
return find_equation(midpoint, -(1/slope))
def find_perp_equation(x, y, m, array_form):
# returns the perpendicular equation of a given line
if (array_form == True):
return [find_perpendicular(x), y, m]
else:
if (m >= 0):
print("{0}y = {1}x + {2}".format(y, find_perpendicular(x), m))
else:
print("{0}y = {1}x - {2}".format(y, find_perpendicular(x), m))
def find_hyp(a, b):
return math.sqrt((a**2) + (b**2))
def find_tri_area(a, b, c):
# finds area of triangle using Heron's formula
semi = (a+b+c)/2
return math.sqrt(semi * (semi - a) * (semi - b) * (semi - c) )
def r_tri_check(a, b, c):
if (a**2) + (b**2) != (c**2):
print("This thing fake, bro.")
def find_point_section(first_point, second_point, ratio):
# I coded this half a year ago and can't remember what for, but kept it here anyway.
# separtions aren't necessary, but good for code readability
first_numerator = (ratio[0] * second_point[0]) + (ratio[1] * first_point[0])
second_numerator = (ratio[0] * second_point[1]) + (ratio[1] * first_point[1])
return ( first_numerator/(ratio[0]+ratio[1]), second_numerator/(ratio[0] + ratio[1]))
def slope_intercept_to_standard(x, y, b):
# x and y are the coeffients of the said variables, for example 5y = 3x + 8 would be inputted as (3, 5, 8)
if (x == 0):
return [0, 1, b]
elif (y == 0):
return [x, 0, b]
else:
return [x, -y, -b]
It mathematically seems sound, but when I try to apply it, all hell breaks loose.
For example when trying to apply it with the set polygon_points equal to [(400, 300), (385, 340), (415, 340)], All hell breaks loose.
An example of the output among repeated calls to the function upon polygon_points(outputs manually spaced for clarity):
The angle was 90.0
The angle is now 95.0
The angle was -41.633539336570394
The angle is now -36.633539336570394
The angle was -41.63353933657017
The angle is now -36.63353933657017
The angle was 64.4439547804165
The angle is now 69.4439547804165
The angle was -64.44395478041695
The angle is now -59.44395478041695
The angle was -64.44395478041623
The angle is now -59.44395478041624
The angle was 80.94458887142648
The angle is now 85.94458887142648
The angle was -80.9445888714264
The angle is now -75.9445888714264
The angle was -80.94458887142665
The angle is now -75.94458887142665
Can anyone explain this?
Too much irrelevant code, a lot of magic like this while (point_sin > 1): point_sin -= 1 - so hard to reproduce.
To rotate point around some center, you need just this (where cos(fi), sin(fi) are precalculated value in your case):
new_x = center_x + (old_x - center_x) * cos(fi) - (old_y - center_y) * sin(fi)
new_y = center_y + (old_x - center_x) * sin(fi) + (old_y - center_y) * cos(fi)
This is a built-in capability of RegularPolygon in SymPy:
>>> from sympy import RegularPolygon, rad
>>> p = RegularPolygon((0,0), 1, 5)
>>> p.vertices[0]
Point2D(1, 0)
>>> p.rotate(rad(30)) # or rad(-30)
>>> p.vertices[0]
Point2D(sqrt(3)/2, 1/2)

Location and time of impact for a moving point and a moving line segment (Continuous Collision Detection)

I am working on a 2D collider system that breaks up shapes into one possible primitive: impenetrable segments that are defined by two points. To provide collision detection for this system, I am using a static collision detection approach that calculates the distance between the edge of one segment and the currently handled segment (point/line distance) once every frame. If the distance is too small, a collision is triggered during that frame. This works fine but has the known problem of tunneling if one or more bodies exhibit high speeds. So I am tinkering with alternatives.
Now I want to introduce continuous collision detection (CCD) that operates on dynamic points / dynamic segments. My problem is: I don't exactly know how. I do know how to do continuous collision between two moving points, a moving point and a static segment but not how to do CCD between a moving point (defined by point P) and a moving segment (defined by points U and V, both can move completely freely).
illustration of problem
I have seen similar questions beeing asked on SO and other platforms, but not with these exact requirements:
both point and segment are moving
segment can be rotating and stretching (because U and V are moving freely)
collision time and collision point need to be found accurately between two frames (CCD, no static collision test)
I prefer a mathematically perfect solutiuon, if possible (no iterative approximation algorithms, swept volumes)
note: the swept line shape will not always be a convex polygon, because of the freedom of the U,V points (see image)
note: testing for a collision with the swept volume test is inaccurate because a collision point with the polygon does not mean a collision point in the actual movement (see image, the point will have left the polygon once the actual segment has crossed the trajectory of the point)
So far I came up with the following approach, given:
sP (P at start of frame),
eP (P at end of frame),
sU (U at start of frame),
eU (U at end of frame),
sV (V at start of frame),
eV (V at end of frame)
Question: Will they collide? If yes, when and where?
To answer the question of "if", I found this paper to be useful: https://www.cs.ubc.ca/~rbridson/docs/brochu-siggraph2012-ccd.pdf (section 3.1) but I could not derive the answers to "when" and "where". I also found an alternative explanation of the problem here: http://15462.courses.cs.cmu.edu/fall2018/article/13 (3rd Question)
Solution:
Model temporal trajectory of each point during a frame as linear movement (line trajectory for 0 <= t <= 1)
P(t) = sP * (1 - t) + eP * t
U(t) = sU * (1 - t) + eU * t
V(t) = sV * (1 - t) + eV * t
(0 <= a <= 1 represents a location on the segment defined by U and V):
UV(a, t) = U(t) * (1 - a) + V(t) * a
Model collision by equating point and segment equations:
P(t) = UV(a, t)
P(t) = U(t) * (1 - a) + V(t) * a
Derive a function for the vector from point P to a point on the segment (see picture of F):
F(a, t) = P(t) - (1 - a) * U(t) - a * V(t)
To now find a collision, one needs to find a and t, so that F(a, t) = (0, 0) and a,t in [0, 1]. This can be modeled as a root finding problem with 2 variables.
Insert the temporal trajectory equations into F(a, t):
F(a, t) = (sP * (1 - t) + eP * t) - (1 - a) * (sU * (1 - t) + eU * t) - a * (sV * (1 - t) + eV * t)
Separate the temporal trajectory equations by dimension (x and y):
Fx(a, t) = (sP.x * (1 - t) + eP.x * t) - (1 - a) * (sU.x * (1 - t) + eU.x * t) - a * (sV.x * (1 - t) + eV.x * t)
Fy(a, t) = (sP.y * (1 - t) + eP.y * t) - (1 - a) * (sU.y * (1 - t) + eU.y * t) - a * (sV.y * (1 - t) + eV.y * t)
Now we have two equations and two variables that we want to solve for (Fx, Fy and a, t respectively), so we should be able to use a solver to get a and t to only then check if they lie within [0, 1].. right?
When I plug this into Python sympy to solve:
from sympy import symbols, Eq, solve, nsolve
def main():
sxP = symbols("sxP")
syP = symbols("syP")
exP = symbols("exP")
eyP = symbols("eyP")
sxU = symbols("sxU")
syU = symbols("syU")
exU = symbols("exU")
eyU = symbols("eyU")
sxV = symbols("sxV")
syV = symbols("syV")
exV = symbols("exV")
eyV = symbols("eyV")
a = symbols("a")
t = symbols("t")
eq1 = Eq((sxP * (1 - t) + exP * t) - (1 - a) * (sxU * (1 - t) + exU * t) - a * (sxV * (1 - t) + exV * t))
eq2 = Eq((syP * (1 - t) + eyP * t) - (1 - a) * (syU * (1 - t) + eyU * t) - a * (syV * (1 - t) + eyV * t))
sol = solve((eq1, eq2), (a, t), dict=True)
print(sol)
if __name__ == "__main__":
main()
I get a solution that is HUGE in size and it takes sympy like 5 minutes to evaluate.
I cannot be using such a big expression in my actual engine code and this solutions just does not seem right to me.
What I want to know is:
Am I missing something here? I think this problem seems rather easy to understand but I cannot figure out a mathematically accurate way to find a time (t) and point (a) of impact solution for dynamic points / dynamic segments. Any help is greatly appreciated, even if someone tells me that
this is not possible to do like that.
TLDR
I did read "...like 5 minutes to evaluate..."
No way too long, this is a real-time solution for many lines and points.
Sorry this is not a complete answer (I did not rationalize and simplify the equation) that will find the point of intercept, that I leave to you.
Also I can see several approaches to the solution as it revolves around a triangle (see image) that when flat is the solution. The approach bellow finds the point in time when the long side of the triangle is equal to the sum of the shorter two.
Solving for u (time)
This can be done as a simple quadratic with the coefficients derived from the 3 starting points, the vector over unit time of each point. Solving for u
The image below give more details.
The point P is the start pos of point
The points L1, L2 are the start points of line ends.
The vector V1 is for the point, over unit time (along green line).
The vectors V2,V3 are for the line ends over unit time.
u is the unit time
A is the point (blue), and B and C are the line end points (red)
There is (may) a point in time u where A is on the line B,C. At this point in time the length of the lines AB (as a) and AC (as c) sum to equal the length of line BC (as b) (orange line).
That means that when b - (a + c) == 0 the point is on the line. In the image the points are squared as this simplifies it a little. b2 - (a2 + c2) == 0
At the bottom of image is the equation (quadratic) in terms of u, P, L1, L2, V1, V2, V3.
That equation needs to be rearranged such that you get (???)u2 + (???)u + (???) = 0
Sorry doing that manually is very tedious and very prone to mistakes. I don`t have the tools at hand to do that nor do I use python so the math lib you are using is unknown to me. However it should be able to help you find how to calculate the coefficients for (???)u2 + (???)u + (???) = 0
Update
Ignore most of the above as I made a mistake. b - (a + c) == 0 is not the same as b2 - (a2 + c2) == 0. The first one is the one needed and that is a problem when dealing with radicals (Note that there could still be a solution using a + bi == sqrt(a^2 + b^2) where i is the imaginary number).
Another solution
So I explored the other options.
The simplest has a slight flaw. It will return the time of intercept. However that must be validated as it will also return the time for intercepts when it intercepts the line, rather than the line segment BC
Thus when a result is found you then test it by dividing the dot product of the found point and line segment with the square of the line segments length. See function isPointOnLine in test snippet.
To solve I use the fact that the cross product of the line BC and the vector from B to A will be 0 when the point is on the line.
Some renaming
Using the image above I renamed the variables so that it is easier for me to do all the fiddly bits.
/*
point P is {a,b}
point L1 is {c,d}
point L2 is {e,f}
vector V1 is {g,h}
vector V2 is {i,j}
vector V3 is {k,l}
Thus for points A,B,C over time u */
Ax = (a+g*u)
Ay = (b+h*u)
Bx = (c+i*u)
By = (d+j*u)
Cx = (e+k*u)
Cy = (f+l*u)
/* Vectors BA and BC at u */
Vbax = ((a+g*u)-(c+i*u))
Vbay = ((b+h*u)-(d+j*u))
Vbcx = ((e+k*u)-(c+i*u))
Vbcy = ((f+l*u)-(d+j*u))
/*
thus Vbax * Vbcy - Vbay * Vbcx == 0 at intercept
*/
This gives the quadratic
0 = ((a+g*u)-(c+i*u)) * ((f+l*u)-(d+j*u)) - ((b+h*u)-(d+j*u)) * ((e+k*u)-(c+i*u))
Rearranging we get
0 = -((i*l)-(h*k)+g*l+i*h+(i+k)*j-(g+i)*j)*u* u -(d*g-c*l-k*b-h*e+l*a+g*f+i*b+c*h+(i+k)*d+(c+e)*j-((f+d)*i)-((a+c)*j))*u +(c+e)*d-((a+c)*d)+a*f-(c*f)-(b*e)+c*b
The coefficients are thus
A = -((i*l)-(h*k)+g*l+i*h+(i+k)*j-(g+i)*j)
B = -(d*g-c*l-k*b-h*e+l*a+g*f+i*b+c*h+(i+k)*d+(c+e)*j-((f+d)*i)-((a+c)*j))
C = (c+e)*d-((a+c)*d)+a*f-(c*f)-(b*e)+c*b
We can solve using the quadratic formula (see image top right).
Note that there could be two solutions. In the example I ignored the second solution. However as the first may not be on the line segment you need to keep the second solution if within the range 0 <= u <= 1 just in case the first fails. You also need to validate that result.
Testing
To avoid errors I had to test the solution
Below is a snippet that generates a random random pair of lines and then generate random lines until an intercept is found.
The functions of interest are
movingLineVPoint which return the unit time of first intercept if any.
isPointOnLine to validate the result.
const ctx = canvas.getContext("2d");
canvas.addEventListener("click",test);
const W = 256, H = W, D = (W ** 2 * 2) ** 0.5;
canvas.width = W; canvas.height = H;
const rand = (m, M) => Math.random() * (M - m) + m;
const Tests = 300;
var line1, line2, path, count = 0;
setTimeout(test, 0);
// creating P point L line
const P = (x,y) => ({x,y,get arr() {return [this.x, this.y]}});
const L = (l1, l2) => ({l1,l2,vec: P(l2.x - l1.x, l2.y - l1.y), get arr() {return [this.l1, this.l2]}});
const randLine = () => L(P(rand(0, W), rand(0, H)), P(rand(0, W), rand(0, H)));
const isPointOnLine = (p, l) => {
const x = p.x - l.l1.x;
const y = p.y - l.l1.y;
const u = (l.vec.x * x + l.vec.y * y) / (l.vec.x * l.vec.x + l.vec.y * l.vec.y);
return u >= 0 && u <= 1;
}
// See answer illustration for names
// arguments in order Px,Py,L1x,l1y,l2x,l2y,V1x,V1y,V2x,V2y,V3x,V3y
function movingLineVPoint(a,b, c,d, e,f, g,h, i,j, k,l) {
var A = -(i*l)-(h*k)+g*l+i*h+(i+k)*j-(g+i)*j;
var B = -d*g-c*l-k*b-h*e+l*a+g*f+i*b+c*h+(i+k)*d+(c+e)*j-((f+d)*i)-((a+c)*j)
var C = +(c+e)*d-((a+c)*d)+a*f-(c*f)-(b*e)+c*b
// Find roots if any. Could be up to 2
// Using the smallest root >= 0 and <= 1
var u, D, u1, u2;
// if A is tiny we can ignore
if (Math.abs(A) < 1e-6) {
if (B !== 0) {
u = -C / B;
if (u < 0 || u > 1) { return } // !!!! no solution !!!!
} else { return } // !!!! no solution !!!!
} else {
B /= A;
D = B * B - 4 * (C / A);
if (D > 0) {
D **= 0.5;
u1 = 0.5 * (-B + D);
u2 = 0.5 * (-B - D);
if ((u1 < 0 || u1 > 1) && (u2 < 0 || u2 > 1)) { return } // !!!! no solution !!!!
if (u1 < 0 || u1 > 1) { u = u2 } // is first out of range
else if (u2 < 0 || u2 > 1) { u = u1 } // is second out of range
else if (u1 < u2) { u = u1 } // first is smallest
else { u = u2 }
} else if (D === 0) {
u = 0.5 * -B;
if (u < 0 || u > 1) { return } // !!!! no solution !!!!
} else { return } // !!!! no solution !!!!
}
return u;
}
function test() {
if (count> 0) { return }
line1 = randLine();
line2 = randLine();
count = Tests
subTest();
}
function subTest() {
path = randLine()
ctx.clearRect(0,0,W,H);
drawLines();
const u = movingLineVPoint(
path.l1.x, path.l1.y,
line1.l1.x, line1.l1.y,
line2.l1.x, line2.l1.y,
path.vec.x, path.vec.y,
line1.vec.x, line1.vec.y,
line2.vec.x, line2.vec.y
);
if (u !== undefined) { // intercept found maybe
pointAt = P(path.l1.x + path.vec.x * u, path.l1.y + path.vec.y * u);
lineAt = L(
P(line1.l1.x + line1.vec.x * u, line1.l1.y + line1.vec.y * u),
P(line2.l1.x + line2.vec.x * u, line2.l1.y + line2.vec.y * u)
);
const isOn = isPointOnLine(pointAt, lineAt);
if (isOn) {
drawResult(pointAt, lineAt);
count = 0;
info.textContent = "Found at: u= " + u.toFixed(4) + ". Click for another";
return;
}
}
setTimeout((--count < 0 ? test : subTest), 18);
}
function drawLine(line, col = "#000", lw = 1) {
ctx.lineWidth = lw;
ctx.strokeStyle = col;
ctx.beginPath();
ctx.lineTo(...line.l1.arr);
ctx.lineTo(...line.l2.arr);
ctx.stroke();
}
function markPoint(p, size = 3, col = "#000", lw = 1) {
ctx.lineWidth = lw;
ctx.strokeStyle = col;
ctx.beginPath();
ctx.arc(...p.arr, size, 0, Math.PI * 2);
ctx.stroke();
}
function drawLines() {
drawLine(line1);
drawLine(line2);
markPoint(line1.l1);
markPoint(line2.l1);
drawLine(path, "#0B0", 1);
markPoint(path.l1, 2, "#0B0", 2);
}
function drawResult(pointAt, lineAt) {
ctx.clearRect(0,0,W,H);
drawLines();
markPoint(lineAt.l1, 2, "red", 1.5);
markPoint(lineAt.l2, 2, "red", 1.5);
markPoint(pointAt, 2, "blue", 3);
drawLine(lineAt, "#BA0", 2);
}
div {position: absolute; top: 10px; left: 12px}
canvas {border: 2px solid black}
<canvas id="canvas" width="1024" height="1024"></canvas>
<div><span id="info">Click to start</span></div>
There are two parts of #Blindman67's solution I don't understand:
Solving for b^2 - (a^2 + c^2) = 0 instead of sqrt(b^2)-(sqrt(a^2)+sqrt(b^2)) = 0
The returned timestamp being clamped in the range [0,1]
Maybe I'm missing something obvious, but in any case, I designed a solution that addresses these concerns:
All quadratic terms are solved for, not just one
The returned time stamp has no limits
sqrt(b^2)-(sqrt(a^2)+sqrt(b^2)) = 0 is solved for, instead of b^2 - (a^2 + c^2) = 0
Feel free to recommend ways this could be optimized:
# pnt, crt_1, and crt_2 are points, each with x,y and dx,dy attributes
# returns a list of timestamps for which pnt is on the segment
# whose endpoints are crt_1 and crt_2
def colinear_points_collision(pnt, crt_1, crt_2):
a, b, c, d = pnt.x, pnt.y, pnt.dx, pnt.dy
e, f, g, h = crt_1.x, crt_1.y, crt_1.dx, crt_1.dy
i, j, k, l = crt_2.x, crt_2.y, crt_2.dx, crt_2.dy
m = a - e
n = c - g
o = b - f
p = d - h
q = a - i
r = c - k
s = b - j
u = d - l
v = e - i
w = g - k
x = f - j
y = h - l
# Left-hand expansion
r1 = n * n + p * p
r2 = 2 * o * p + 2 * m * n
r3 = m * m + o * o
r4 = r * r + u * u
r5 = 2 * q * r + 2 * s * u
r6 = q * q + s * s
coef_a = 4 * r1 * r4 # t^4 coefficient
coef_b = 4 * (r1 * r5 + r2 * r4) # t^3 coefficient
coef_c = 4 * (r1 * r6 + r2 * r5 + r3 * r4) # t^2 coefficient
coef_d = 4 * (r2 * r6 + r3 * r5) # t coefficient
coef_e = 4 * r3 * r6 # constant
# Right-hand expansion
q1 = (w * w + y * y - n * n - p * p - r * r - u * u)
q2 = 2 * (v * w + x * y - m * n - o * p - q * r - s * u)
q3 = v * v + x * x - m * m - o * o - q * q - s * s
coef1 = q1 * q1 # t^4 coefficient
coef2 = 2 * q1 * q2 # t^3 coefficient
coef3 = 2 * q1 * q3 + q2 * q2 # t^2 coefficient
coef4 = 2 * q2 * q3 # t coefficient
coef5 = q3 * q3 # constant
# Moves all the coefficients onto one side of the equation to get
# at^4 + bt^3 + ct^2 + dt + e
# solve for possible values of t
p = np.array([coef1 - coef_a, coef2 - coef_b, coef3 - coef_c, coef4 - coef_d, coef5 - coef_e])
def fun(x):
return p[0] * x**4 + p[1] * x**3 + p[2] * x**2 + p[3] * x + p[4]
# could use np.root, but I found this to be more numerically stable
sol = optimize.root(fun, [0, 0], tol=0.002)
r = sol.x
uniques = np.unique(np.round(np.real(r[np.isreal(r)]), 4))
final = []
for r in uniques[uniques > 0]:
if point_between(e + g * r, f + h * r, i + k * r, j + l * r, a + c * r, b + d * r):
final.append(r)
return np.array(final)
# Returns true if the point (px,py) is between the endpoints
# of the line segment whose endpoints lay at (ax,ay) and (bx,by)
def point_between(ax, ay, bx, by, px, py):
# colinear already checked above, this checks between the other two.
return (min(ax, bx) <= px <= max(ax, bx) or abs(ax - bx) < 0.001) and (min(ay, by) <= py <= max(ay, by) or abs(ay - by) < 0.001)
An example (L1 and L2 are endpoints of line):
P = (0,0) with velocity (0, +1)
L1 = (-1,2) with velocity (0, -1)
L2 = (1,2) with velocity (0, -1)
The returned result would be t=1, because after 1 time step, P will be one unit higher, and both endpoints of the line will each be one unit lower, therefore, the point intersects the segment at t=1.

How to generate a set of co-ordinates for a circle in python using the circle formula?

Looking to generate a set of whole integer co-ordinates for a circle using a user specified point, working with the formula for a circle: (x-a)^2 + (y-b)^2 = r^2
How can i do this in a 3d space, finding the co-ordinates of x,y and z.
Parametrics
Don't use the Cartesian format of the equations, use the parametric one
Instead of having (x-a)^2 + (y-b)^2 = r^2, you have
x = r * cos(t) + a
y = r * sin(t) + b
t, or more commonly for trigonometric functions, θ, is an angle between 0 and 2π
Example code
import math
a = 2
b = 3
r = 3
#The lower this value the higher quality the circle is with more points generated
stepSize = 0.1
#Generated vertices
positions = []
t = 0
while t < 2 * math.pi:
positions.append((r * math.cos(t) + a, r * math.sin(t) + b))
t += stepSize
print(positions)
Spheres
As this is a 2-dimensional surface, a second parameter will be required as one is insufficient
u = [0, 2π]
v = [-π/2, π/2]
x = r * sin(u) * cos(v) + a
y = r * cos(u) * cos(v) + b
z = r * sin(v) + c
import math
a = 2
b = 3
c = 7
r = 3
#The lower this value the higher quality the circle is with more points generated
stepSize = 0.1
#Generated vertices
positions = []
u = 0
v = -math.pi/2
while u < 2 * math.pi:
while v < math.pi/2:
positions.append((r * math.sin(u) * math.cos(v) + a, r * math.cos(u) * math.cos(v) + b, r * math.sin(v) + c))
v += stepSize
u += stepSize
print(positions)

How do I compute the intersection point of two lines?

I have two lines that intersect at a point. I know the endpoints of the two lines. How do I compute the intersection point in Python?
# Given these endpoints
#line 1
A = [X, Y]
B = [X, Y]
#line 2
C = [X, Y]
D = [X, Y]
# Compute this:
point_of_intersection = [X, Y]
Unlike other suggestions, this is short and doesn't use external libraries like numpy. (Not that using other libraries is bad...it's nice not need to, especially for such a simple problem.)
def line_intersection(line1, line2):
xdiff = (line1[0][0] - line1[1][0], line2[0][0] - line2[1][0])
ydiff = (line1[0][1] - line1[1][1], line2[0][1] - line2[1][1])
def det(a, b):
return a[0] * b[1] - a[1] * b[0]
div = det(xdiff, ydiff)
if div == 0:
raise Exception('lines do not intersect')
d = (det(*line1), det(*line2))
x = det(d, xdiff) / div
y = det(d, ydiff) / div
return x, y
print line_intersection((A, B), (C, D))
And FYI, I would use tuples instead of lists for your points. E.g.
A = (X, Y)
EDIT: Initially there was a typo. That was fixed Sept 2014 thanks to #zidik.
This is simply the Python transliteration of the following formula, where the lines are (a1, a2) and (b1, b2) and the intersection is p. (If the denominator is zero, the lines have no unique intersection.)
Can't stand aside,
So we have linear system:
A1 * x + B1 * y = C1
A2 * x + B2 * y = C2
let's do it with Cramer's rule, so solution can be found in determinants:
x = Dx/D
y = Dy/D
where D is main determinant of the system:
A1 B1
A2 B2
and Dx and Dy can be found from matricies:
C1 B1
C2 B2
and
A1 C1
A2 C2
(notice, as C column consequently substitues the coef. columns of x and y)
So now the python, for clarity for us, to not mess things up let's do mapping between math and python. We will use array L for storing our coefs A, B, C of the line equations and intestead of pretty x, y we'll have [0], [1], but anyway. Thus, what I wrote above will have the following form further in the code:
for D
L1[0] L1[1]
L2[0] L2[1]
for Dx
L1[2] L1[1]
L2[2] L2[1]
for Dy
L1[0] L1[2]
L2[0] L2[2]
Now go for coding:
line - produces coefs A, B, C of line equation by two points provided,
intersection - finds intersection point (if any) of two lines provided by coefs.
from __future__ import division
def line(p1, p2):
A = (p1[1] - p2[1])
B = (p2[0] - p1[0])
C = (p1[0]*p2[1] - p2[0]*p1[1])
return A, B, -C
def intersection(L1, L2):
D = L1[0] * L2[1] - L1[1] * L2[0]
Dx = L1[2] * L2[1] - L1[1] * L2[2]
Dy = L1[0] * L2[2] - L1[2] * L2[0]
if D != 0:
x = Dx / D
y = Dy / D
return x,y
else:
return False
Usage example:
L1 = line([0,1], [2,3])
L2 = line([2,3], [0,4])
R = intersection(L1, L2)
if R:
print "Intersection detected:", R
else:
print "No single intersection point detected"
Here is a solution using the Shapely library. Shapely is often used for GIS work, but is built to be useful for computational geometry. I changed your inputs from lists to tuples.
Problem
# Given these endpoints
#line 1
A = (X, Y)
B = (X, Y)
#line 2
C = (X, Y)
D = (X, Y)
# Compute this:
point_of_intersection = (X, Y)
Solution
import shapely
from shapely.geometry import LineString, Point
line1 = LineString([A, B])
line2 = LineString([C, D])
int_pt = line1.intersection(line2)
point_of_intersection = int_pt.x, int_pt.y
print(point_of_intersection)
Using formula from:
https://en.wikipedia.org/wiki/Line%E2%80%93line_intersection
def findIntersection(x1,y1,x2,y2,x3,y3,x4,y4):
px= ( (x1*y2-y1*x2)*(x3-x4)-(x1-x2)*(x3*y4-y3*x4) ) / ( (x1-x2)*(y3-y4)-(y1-y2)*(x3-x4) )
py= ( (x1*y2-y1*x2)*(y3-y4)-(y1-y2)*(x3*y4-y3*x4) ) / ( (x1-x2)*(y3-y4)-(y1-y2)*(x3-x4) )
return [px, py]
If your lines are multiple points instead, you can use this version.
import numpy as np
import matplotlib.pyplot as plt
"""
Sukhbinder
5 April 2017
Based on:
"""
def _rect_inter_inner(x1,x2):
n1=x1.shape[0]-1
n2=x2.shape[0]-1
X1=np.c_[x1[:-1],x1[1:]]
X2=np.c_[x2[:-1],x2[1:]]
S1=np.tile(X1.min(axis=1),(n2,1)).T
S2=np.tile(X2.max(axis=1),(n1,1))
S3=np.tile(X1.max(axis=1),(n2,1)).T
S4=np.tile(X2.min(axis=1),(n1,1))
return S1,S2,S3,S4
def _rectangle_intersection_(x1,y1,x2,y2):
S1,S2,S3,S4=_rect_inter_inner(x1,x2)
S5,S6,S7,S8=_rect_inter_inner(y1,y2)
C1=np.less_equal(S1,S2)
C2=np.greater_equal(S3,S4)
C3=np.less_equal(S5,S6)
C4=np.greater_equal(S7,S8)
ii,jj=np.nonzero(C1 & C2 & C3 & C4)
return ii,jj
def intersection(x1,y1,x2,y2):
"""
INTERSECTIONS Intersections of curves.
Computes the (x,y) locations where two curves intersect. The curves
can be broken with NaNs or have vertical segments.
usage:
x,y=intersection(x1,y1,x2,y2)
Example:
a, b = 1, 2
phi = np.linspace(3, 10, 100)
x1 = a*phi - b*np.sin(phi)
y1 = a - b*np.cos(phi)
x2=phi
y2=np.sin(phi)+2
x,y=intersection(x1,y1,x2,y2)
plt.plot(x1,y1,c='r')
plt.plot(x2,y2,c='g')
plt.plot(x,y,'*k')
plt.show()
"""
ii,jj=_rectangle_intersection_(x1,y1,x2,y2)
n=len(ii)
dxy1=np.diff(np.c_[x1,y1],axis=0)
dxy2=np.diff(np.c_[x2,y2],axis=0)
T=np.zeros((4,n))
AA=np.zeros((4,4,n))
AA[0:2,2,:]=-1
AA[2:4,3,:]=-1
AA[0::2,0,:]=dxy1[ii,:].T
AA[1::2,1,:]=dxy2[jj,:].T
BB=np.zeros((4,n))
BB[0,:]=-x1[ii].ravel()
BB[1,:]=-x2[jj].ravel()
BB[2,:]=-y1[ii].ravel()
BB[3,:]=-y2[jj].ravel()
for i in range(n):
try:
T[:,i]=np.linalg.solve(AA[:,:,i],BB[:,i])
except:
T[:,i]=np.NaN
in_range= (T[0,:] >=0) & (T[1,:] >=0) & (T[0,:] <=1) & (T[1,:] <=1)
xy0=T[2:,in_range]
xy0=xy0.T
return xy0[:,0],xy0[:,1]
if __name__ == '__main__':
# a piece of a prolate cycloid, and am going to find
a, b = 1, 2
phi = np.linspace(3, 10, 100)
x1 = a*phi - b*np.sin(phi)
y1 = a - b*np.cos(phi)
x2=phi
y2=np.sin(phi)+2
x,y=intersection(x1,y1,x2,y2)
plt.plot(x1,y1,c='r')
plt.plot(x2,y2,c='g')
plt.plot(x,y,'*k')
plt.show()
I didn't find an intuitive explanation on the web, so now that I worked it out, here's my solution. This is for infinite lines (what I needed), not segments.
Some terms you might remember:
A line is defined as y = mx + b OR y = slope * x + y-intercept
Slope = rise over run = dy / dx = height / distance
Y-intercept is where the line crosses the Y axis, where X = 0
Given those definitions, here are some functions:
def slope(P1, P2):
# dy/dx
# (y2 - y1) / (x2 - x1)
return(P2[1] - P1[1]) / (P2[0] - P1[0])
def y_intercept(P1, slope):
# y = mx + b
# b = y - mx
# b = P1[1] - slope * P1[0]
return P1[1] - slope * P1[0]
def line_intersect(m1, b1, m2, b2):
if m1 == m2:
print ("These lines are parallel!!!")
return None
# y = mx + b
# Set both lines equal to find the intersection point in the x direction
# m1 * x + b1 = m2 * x + b2
# m1 * x - m2 * x = b2 - b1
# x * (m1 - m2) = b2 - b1
# x = (b2 - b1) / (m1 - m2)
x = (b2 - b1) / (m1 - m2)
# Now solve for y -- use either line, because they are equal here
# y = mx + b
y = m1 * x + b1
return x,y
Here's a simple test between two (infinite) lines:
A1 = [1,1]
A2 = [3,3]
B1 = [1,3]
B2 = [3,1]
slope_A = slope(A1, A2)
slope_B = slope(B1, B2)
y_int_A = y_intercept(A1, slope_A)
y_int_B = y_intercept(B1, slope_B)
print(line_intersect(slope_A, y_int_A, slope_B, y_int_B))
Output:
(2.0, 2.0)
The most concise solution I have found uses Sympy: https://www.geeksforgeeks.org/python-sympy-line-intersection-method/
# import sympy and Point, Line
from sympy import Point, Line
p1, p2, p3 = Point(0, 0), Point(1, 1), Point(7, 7)
l1 = Line(p1, p2)
# using intersection() method
showIntersection = l1.intersection(p3)
print(showIntersection)
With the scikit-spatial library you can easily do it in the following way:
import matplotlib.pyplot as plt
from skspatial.objects import Line
# Define the two lines.
line_1 = Line.from_points([3, -2], [5, 4])
line_2 = Line.from_points([-1, 0], [3, 2])
# Compute the intersection point
intersection_point = line_1.intersect_line(line_2)
# Plot
_, ax = plt.subplots()
line_1.plot_2d(ax, t_1=-2, t_2=3, c="k")
line_2.plot_2d(ax, t_1=-2, t_2=3, c="k")
intersection_point.plot_2d(ax, c="r", s=100)
grid = ax.grid()
there is already an answer that uses formula from Wikipedia but that doesn't have any check point to check if line segments actually intersect so here you go
def line_intersection(a, b, c, d):
t = ((a[0] - c[0]) * (c[1] - d[1]) - (a[1] - c[1]) * (c[0] - d[0])) / ((a[0] - b[0]) * (c[1] - d[1]) - (a[1] - b[1]) * (c[0] - d[0]))
u = ((a[0] - c[0]) * (a[1] - b[1]) - (a[1] - c[1]) * (a[0] - b[0])) / ((a[0] - b[0]) * (c[1] - d[1]) - (a[1] - b[1]) * (c[0] - d[0]))
# check if line actually intersect
if (0 <= t and t <= 1 and 0 <= u and u <= 1):
return [a[0] + t * (b[0] - a[0]), a[1] + t * (b[1] - a[1])]
else:
return False
#usage
print(line_intersection([0,0], [10, 10], [0, 10], [10,0]))
#result [5.0, 5.0]
img And You can use this kode
class Nokta:
def __init__(self,x,y):
self.x=x
self.y=y
class Dogru:
def __init__(self,a,b):
self.a=a
self.b=b
def Kesisim(self,Dogru_b):
x1= self.a.x
x2=self.b.x
x3=Dogru_b.a.x
x4=Dogru_b.b.x
y1= self.a.y
y2=self.b.y
y3=Dogru_b.a.y
y4=Dogru_b.b.y
#Notlardaki denklemleri kullandım
pay1=((x4 - x3) * (y1 - y3) - (y4 - y3) * (x1 - x3))
pay2=((x2-x1) * (y1 - y3) - (y2 - y1) * (x1 - x3))
payda=((y4 - y3) *(x2-x1)-(x4 - x3)*(y2 - y1))
if pay1==0 and pay2==0 and payda==0:
print("DOĞRULAR BİRBİRİNE ÇAKIŞIKTIR")
elif payda==0:
print("DOĞRULAR BİRBİRNE PARALELDİR")
else:
ua=pay1/payda if payda else 0
ub=pay2/payda if payda else 0
#x ve y buldum
x=x1+ua*(x2-x1)
y=y1+ua*(y2-y1)
print("DOĞRULAR {},{} NOKTASINDA KESİŞTİ".format(x,y))

Python implementation of the Wilson Score Interval?

After reading How Not to Sort by Average Rating, I was curious if anyone has a Python implementation of a Lower bound of Wilson score confidence interval for a Bernoulli parameter?
Reddit uses the Wilson score interval for comment ranking, an explanation and python implementation can be found here
#Rewritten code from /r2/r2/lib/db/_sorts.pyx
from math import sqrt
def confidence(ups, downs):
n = ups + downs
if n == 0:
return 0
z = 1.0 #1.44 = 85%, 1.96 = 95%
phat = float(ups) / n
return ((phat + z*z/(2*n) - z * sqrt((phat*(1-phat)+z*z/(4*n))/n))/(1+z*z/n))
I think this one has a wrong wilson call, because if you have 1 up 0 down you get NaN because you can't do a sqrt on the negative value.
The correct one can be found when looking at the ruby example from the article How not to sort by average page:
return ((phat + z*z/(2*n) - z * sqrt((phat*(1-phat)+z*z/(4*n))/n))/(1+z*z/n))
To get the Wilson CI without continuity correction, you can use proportion_confint in statsmodels.stats.proportion. To get the Wilson CI with continuity correction, you can use the code below.
# cf.
# [1] R. G. Newcombe. Two-sided confidence intervals for the single proportion, 1998
# [2] R. G. Newcombe. Interval Estimation for the difference between independent proportions: comparison of eleven methods, 1998
import numpy as np
from statsmodels.stats.proportion import proportion_confint
# # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # #
def propci_wilson_cc(count, nobs, alpha=0.05):
# get confidence limits for proportion
# using wilson score method w/ cont correction
# i.e. Method 4 in Newcombe [1];
# verified via Table 1
from scipy import stats
n = nobs
p = count/n
q = 1.-p
z = stats.norm.isf(alpha / 2.)
z2 = z**2
denom = 2*(n+z2)
num = 2.*n*p+z2-1.-z*np.sqrt(z2-2-1./n+4*p*(n*q+1))
ci_l = num/denom
num = 2.*n*p+z2+1.+z*np.sqrt(z2+2-1./n+4*p*(n*q-1))
ci_u = num/denom
if p == 0:
ci_l = 0.
elif p == 1:
ci_u = 1.
return ci_l, ci_u
def dpropci_wilson_nocc(a,m,b,n,alpha=0.05):
# get confidence limits for difference in proportions
# a/m - b/n
# using wilson score method WITHOUT cont correction
# i.e. Method 10 in Newcombe [2]
# verified via Table II
theta = a/m - b/n
l1, u1 = proportion_confint(count=a, nobs=m, alpha=0.05, method='wilson')
l2, u2 = proportion_confint(count=b, nobs=n, alpha=0.05, method='wilson')
ci_u = theta + np.sqrt((a/m-u1)**2+(b/n-l2)**2)
ci_l = theta - np.sqrt((a/m-l1)**2+(b/n-u2)**2)
return ci_l, ci_u
def dpropci_wilson_cc(a,m,b,n,alpha=0.05):
# get confidence limits for difference in proportions
# a/m - b/n
# using wilson score method w/ cont correction
# i.e. Method 11 in Newcombe [2]
# verified via Table II
theta = a/m - b/n
l1, u1 = propci_wilson_cc(count=a, nobs=m, alpha=alpha)
l2, u2 = propci_wilson_cc(count=b, nobs=n, alpha=alpha)
ci_u = theta + np.sqrt((a/m-u1)**2+(b/n-l2)**2)
ci_l = theta - np.sqrt((a/m-l1)**2+(b/n-u2)**2)
return ci_l, ci_u
# # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # #
# single proportion testing
# these come from Newcombe [1] (Table 1)
a_vec = np.array([81, 15, 0, 1])
m_vec = np.array([263, 148, 20, 29])
for (a,m) in zip(a_vec,m_vec):
l1, u1 = proportion_confint(count=a, nobs=m, alpha=0.05, method='wilson')
l2, u2 = propci_wilson_cc(count=a, nobs=m, alpha=0.05)
print(a,m,l1,u1,' ',l2,u2)
# # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # #
# difference in proportions testing
# these come from Newcombe [2] (Table II)
a_vec = np.array([56,9,6,5,0,0,10,10],dtype=float)
m_vec = np.array([70,10,7,56,10,10,10,10],dtype=float)
b_vec = np.array([48,3,2,0,0,0,0,0],dtype=float)
n_vec = np.array([80,10,7,29,20,10,20,10],dtype=float)
print('\nWilson without CC')
for (a,m,b,n) in zip(a_vec,m_vec,b_vec,n_vec):
l, u = dpropci_wilson_nocc(a,m,b,n,alpha=0.05)
print('{:2.0f}/{:2.0f}-{:2.0f}/{:2.0f} ; {:6.4f} ; {:8.4f}, {:8.4f}'.format(a,m,b,n,a/m-b/n,l,u))
print('\nWilson with CC')
for (a,m,b,n) in zip(a_vec,m_vec,b_vec,n_vec):
l, u = dpropci_wilson_cc(a,m,b,n,alpha=0.05)
print('{:2.0f}/{:2.0f}-{:2.0f}/{:2.0f} ; {:6.4f} ; {:8.4f}, {:8.4f}'.format(a,m,b,n,a/m-b/n,l,u))
HTH
The accepted solution seems to use a hard-coded z-value (best for performance).
In the event that you wanted a direct python equivalent of the ruby formula from the blogpost with a dynamic z-value (based on the confidence interval):
import math
import scipy.stats as st
def ci_lower_bound(pos, n, confidence):
if n == 0:
return 0
z = st.norm.ppf(1 - (1 - confidence) / 2)
phat = 1.0 * pos / n
return (phat + z * z / (2 * n) - z * math.sqrt((phat * (1 - phat) + z * z / (4 * n)) / n)) / (1 + z * z / n)
If you'd like to actually calculate z directly from a confidence bound and want to avoid installing numpy/scipy, you can use the following snippet of code,
import math
def binconf(p, n, c=0.95):
'''
Calculate binomial confidence interval based on the number of positive and
negative events observed. Uses Wilson score and approximations to inverse
of normal cumulative density function.
Parameters
----------
p: int
number of positive events observed
n: int
number of negative events observed
c : optional, [0,1]
confidence percentage. e.g. 0.95 means 95% confident the probability of
success lies between the 2 returned values
Returns
-------
theta_low : float
lower bound on confidence interval
theta_high : float
upper bound on confidence interval
'''
p, n = float(p), float(n)
N = p + n
if N == 0.0: return (0.0, 1.0)
p = p / N
z = normcdfi(1 - 0.5 * (1-c))
a1 = 1.0 / (1.0 + z * z / N)
a2 = p + z * z / (2 * N)
a3 = z * math.sqrt(p * (1-p) / N + z * z / (4 * N * N))
return (a1 * (a2 - a3), a1 * (a2 + a3))
def erfi(x):
"""Approximation to inverse error function"""
a = 0.147 # MAGIC!!!
a1 = math.log(1 - x * x)
a2 = (
2.0 / (math.pi * a)
+ a1 / 2.0
)
return (
sign(x) *
math.sqrt( math.sqrt(a2 * a2 - a1 / a) - a2 )
)
def sign(x):
if x < 0: return -1
if x == 0: return 0
if x > 0: return 1
def normcdfi(p, mu=0.0, sigma2=1.0):
"""Inverse CDF of normal distribution"""
if mu == 0.0 and sigma2 == 1.0:
return math.sqrt(2) * erfi(2 * p - 1)
else:
return mu + math.sqrt(sigma2) * normcdfi(p)
Here is a simplified (no need for numpy) and slightly improved (0 and n values for k do not cause a math domain error) version of the Wilson score confidence interval with continuity correction, from the original sourcecode written by batesbatesbates in another answer, and also a pure python no-numpy non-continuity correction version, with 2 equivalent ways to calculate (can be switched with eqmode argument, but both ways give the exact same non-continuity correction results):
import math
def propci_wilson_nocc(k, n, z=1.96, eqmode=0):
# Calculates the Binomial Proportion Confidence Interval using the Wilson Score method without continuation correction
# Equations eqmode == 1 from: https://en.wikipedia.org/w/index.php?title=Binomial_proportion_confidence_interval&oldid=1101942017#Wilson_score_interval
# Equations eqmode == 0 from: https://www.evanmiller.org/how-not-to-sort-by-average-rating.html
# The results should be close to:
# from statsmodels.stats.proportion import proportion_confint
# proportion_confint(k, n, alpha=0.05, method='wilson')
#z=1.44 = 85%, 1.96 = 95%
if n == 0:
return 0
p_hat = float(k) / n
z2 = z**2
if eqmode == 0:
ci_l = (p_hat + z2/(2*n) - z*math.sqrt(max(0.0, (p_hat*(1 - p_hat) + z2/(4*n))/n))) / (1 + z2 / n)
else:
ci_l = (1.0 / (1.0 + z2/n)) * (p_hat + z2/(2*n)) - (z / (1 + z2/n)) * math.sqrt(max(0.0, (p_hat*(1 - p_hat)/n + z2/(4*(n**2)))))
if eqmode == 0:
ci_u = (p_hat + z2/(2*n) + z*math.sqrt(max(0.0, (p_hat*(1 - p_hat) + z2/(4*n))/n))) / (1 + z2 / n)
else:
ci_u = (1.0 / (1.0 + z2/n)) * (p_hat + z2/(2*n)) + (z / (1 + z2/n)) * math.sqrt(max(0.0, (p_hat*(1 - p_hat)/n + z2/(4*(n**2)))))
return [ci_l, ci_u]
def propci_wilson_cc(n, k, z=1.96):
# Calculates the Binomial Proportion Confidence Interval using the Wilson Score method with continuation correction
# i.e. Method 4 in Newcombe [1]: R. G. Newcombe. Two-sided confidence intervals for the single proportion, 1998;
# verified via Table 1
# originally written by batesbatesbates https://stackoverflow.com/questions/10029588/python-implementation-of-the-wilson-score-interval/74021634#74021634
p_hat = k/n
q = 1.0-p
z2 = z**2
denom = 2*(n+z2)
num = 2.0*n*p_hat + z2 - 1.0 - z*math.sqrt(max(0.0, z2 - 2 - 1.0/n + 4*p_hat*(n*q + 1)))
ci_l = num/denom
num2 = 2.0*n*p_hat + z2 + 1.0 + z*math.sqrt(max(0.0, z2 + 2 - 1.0/n + 4*p_hat*(n*q - 1)))
ci_u = num2/denom
if p_hat == 0:
ci_l = 0.0
elif p_hat == 1:
ci_u = 1.0
return [ci_l, ci_u]
Note that the returned value will always be bounded between [0.0, 1.0] (due to how p_hat is a ratio of k/n), this is why it's a score and not really a confidence interval, but it's easy to project back to a confidence interval by multiplying ci_l * n and ci_u * n, these values will be in the same domain as k and can be plotted alongside.
Here is a much more readable version for how to compute the Wilson Score interval without continuity correction, by Bartosz Mikulski:
from math import sqrt
def wilson(p, n, z = 1.96):
denominator = 1 + z**2/n
centre_adjusted_probability = p + z*z / (2*n)
adjusted_standard_deviation = sqrt((p*(1 - p) + z*z / (4*n)) / n)
lower_bound = (centre_adjusted_probability - z*adjusted_standard_deviation) / denominator
upper_bound = (centre_adjusted_probability + z*adjusted_standard_deviation) / denominator
return (lower_bound, upper_bound)

Categories