I'm trying to create a Voronoi diagram given a set of scatterplot points. However, several "extra unintended lines" appear to get calculated in the process. Some of these "extra" lines appear to be the infinite edges getting incorrectly calculated. But others are appearing randomly in the middle of the plot as well. How can I only create an extra edge when it's needed/required to connect a polygon to the edge of the plot (e.g. plot boundaries)?
My graph outer boundaries are:
boundaries = np.array([[0, -2], [0, 69], [105, 69], [105, -2], [0, -2]])
Here's the section dealing with the voronoi diagram creation:
def voronoi_polygons(voronoi, diameter):
centroid = voronoi.points.mean(axis=0)
ridge_direction = defaultdict(list)
for (p, q), rv in zip(voronoi.ridge_points, voronoi.ridge_vertices):
u, v = sorted(rv)
if u == -1:
t = voronoi.points[q] - voronoi.points[p] # tangent
n = np.array([-t[1], t[0]]) / np.linalg.norm(t) # normal
midpoint = voronoi.points[[p, q]].mean(axis=0)
direction = np.sign(np.dot(midpoint - centroid, n)) * n
ridge_direction[p, v].append(direction)
ridge_direction[q, v].append(direction)
for i, r in enumerate(voronoi.point_region):
region = voronoi.regions[r]
if -1 not in region:
# Finite region.
yield Polygon(voronoi.vertices[region])
continue
# Infinite region.
inf = region.index(-1) # Index of vertex at infinity.
j = region[(inf - 1) % len(region)] # Index of previous vertex.
k = region[(inf + 1) % len(region)] # Index of next vertex.
if j == k:
# Region has one Voronoi vertex with two ridges.
dir_j, dir_k = ridge_direction[i, j]
else:
# Region has two Voronoi vertices, each with one ridge.
dir_j, = ridge_direction[i, j]
dir_k, = ridge_direction[i, k]
# Length of ridges needed for the extra edge to lie at least
# 'diameter' away from all Voronoi vertices.
length = 2 * diameter / np.linalg.norm(dir_j + dir_k)
# Polygon consists of finite part plus an extra edge.
finite_part = voronoi.vertices[region[inf + 1:] + region[:inf]]
extra_edge = [voronoi.vertices[j] + dir_j * length,
voronoi.vertices[k] + dir_k * length]
combined_finite_edge = np.concatenate((finite_part, extra_edge))
poly = Polygon(combined_finite_edge)
yield poly
Here are the points being used:
['52.629' '24.28099822998047']
['68.425' '46.077999114990234']
['60.409' '36.7140007019043']
['72.442' '28.762001037597656']
['52.993' '43.51799964904785']
['59.924' '16.972000122070312']
['61.101' '55.74899959564209']
['68.9' '13.248001098632812']
['61.323' '29.0260009765625']
['45.283' '36.97500038146973']
['52.425' '19.132999420166016']
['37.739' '28.042999267578125']
['48.972' '2.3539962768554688']
['33.865' '30.240001678466797']
['52.34' '64.94799995422363']
['52.394' '45.391000747680664']
['52.458' '34.79800033569336']
['31.353' '43.14500045776367']
['38.194' '39.24399948120117']
['98.745' '32.15999984741211']
['6.197' '32.606998443603516']
Most likely this is due to the errors associated with floating point arithmetic while computing the voronoi traingulation from your data (esp. the second column).
Assuming that, there is no single solution for such kinds of problems. I urge you to go through this page* of the Qhull manual and try iterating through those parameters in qhull_options before generating the voronoi object that you are inputting in the function. An example would be qhull_options='Qbb Qc Qz QJ'.
Other than that I doubt there is anything that could be modified in the function to avoid such a problem.
*This will take some time though. Just be patient.
Figured out what was wrong: after each polygon I needed to add a null x and y value or else it would attempt to 'stitch' one polygon to another, drawing an additional unintended line in order to do so. So the data should really look more like this:
GameTime,Half,ObjectType,JerseyNumber,X,Y,PlayerIDEvent,PlayerIDTracking,MatchIDEvent,Position,teamId,i_order,v_vor_x,v_vor_y
0.0,1,1,22,None,None,578478,794888,2257663,3,35179.0,0,22.79645297,6.20866756
0.0,1,1,22,None,None,578478,794888,2257663,3,35179.0,1,17.63464264,3.41230187
0.0,1,1,22,None,None,578478,794888,2257663,3,35179.0,2,20.27639318,34.29191902
0.0,1,1,22,None,None,578478,794888,2257663,3,35179.0,3,32.15600546,36.60432421
0.0,1,1,22,None,None,578478,794888,2257663,3,35179.0,4,38.34639812,33.62806739
0.0,1,1,22,None,None,578478,794888,2257663,3,35179.0,5,22.79645297,6.20866756
0.0,1,1,22,None,None,578478,794888,2257663,3,35179.0,5,nan,nan
0.0,1,1,22,33.865,30.240001678466797,578478,794888,2257663,3,35179.0,,,
0.0,1,0,92,None,None,369351,561593,2257663,1,32446.0,0,46.91696938,29.44801535
0.0,1,0,92,None,None,369351,561593,2257663,1,32446.0,1,55.37574848,29.5855499
0.0,1,0,92,None,None,369351,561593,2257663,1,32446.0,2,58.85876401,23.20381766
0.0,1,0,92,None,None,369351,561593,2257663,1,32446.0,3,57.17455086,21.5228301
0.0,1,0,92,None,None,369351,561593,2257663,1,32446.0,4,44.14237744,22.03925667
0.0,1,0,92,None,None,369351,561593,2257663,1,32446.0,5,45.85962774,28.83613332
0.0,1,0,92,None,None,369351,561593,2257663,1,32446.0,5,nan,nan
0.0,1,0,92,52.629,24.28099822998047,369351,561593,2257663,1,32446.0,,,
0.0,1,0,27,None,None,704169,704169,2257663,2,32446.0,0,65.56965667,33.4292025
0.0,1,0,27,None,None,704169,704169,2257663,2,32446.0,1,57.23303682,32.43809027
0.0,1,0,27,None,None,704169,704169,2257663,2,32446.0,2,55.65704152,38.97814049
0.0,1,0,27,None,None,704169,704169,2257663,2,32446.0,3,60.75304149,44.53251169
0.0,1,0,27,None,None,704169,704169,2257663,2,32446.0,4,65.14170295,40.77562188
0.0,1,0,27,None,None,704169,704169,2257663,2,32446.0,5,65.56965667,33.4292025
0.0,1,0,27,None,None,704169,704169,2257663,2,32446.0,5,nan,nan
Is it possible to calculate the area of the overlap of two curves?
I found two answers here but they are written in R which I am not familiar with. Or struggling to convert them to python.
Area between the two curves and Find area of overlap between two curves
For example, for a given dataset with defined x, y points. (x1,y1,x2,y2)
I am able to get the area of each curve using :
np.trapz
However, to get the overlap only is challenging and I haven't found a solution to show. Any guidance or maths formulas will be appreciated.
So this can be done using the shapely module within Python.
Firstly, Join the two curves together to create one self-intersecting polygon (shown in code below).
Then using the unary_union() function from shapely, you will:
Split the complex polygon into seperate simple polygons.
Find the area of each simple polygon.
Sum it to find the overall area of the two curves.
Full code shown below:
import numpy as np
from shapely.geometry import LineString
from shapely.ops import unary_union, polygonize
avg_coords = [(0.0, 0.0), (4.872117, 2.29658), (5.268545, 2.4639225), (5.664686, 2.6485724), (6.059776, 2.8966842), (6.695151, 3.0986626), (7.728006, 3.4045217), (8.522297, 3.652668), (9.157002, 3.895031), (10.191483, 4.1028132), (10.827622, 4.258638), (11.38593, 4.2933016), (11.86478, 4.3048816), (12.344586, 4.258769), (12.984073, 4.2126703), (13.942729, 4.1781383), (14.58212, 4.137809), (15.542498, 3.99943), (16.502588, 3.878359), (17.182951, 3.7745714), (18.262657, 3.6621647), (19.102558, 3.567045), (20.061789, 3.497897), (21.139917, 3.4806826), (22.097425, 3.5153809), (23.65388, 3.5414772), (24.851482, 3.541581), (26.04966, 3.507069), (27.72702, 3.463945), (28.925198, 3.429433), (29.883854, 3.3949006), (31.08246, 3.3344274), (31.92107, 3.317192), (33.716183, 3.3952322), (35.63192, 3.4213595), (37.427895, 3.4474766), (39.343628, 3.473604), (41.49874, 3.508406), (43.773468, 3.5518723), (46.287716, 3.595359), (49.28115, 3.6302335), (52.633293, 3.6997545), (54.30922, 3.7431688), (55.8651, 3.8038807), (58.738773, 3.8387446), (60.893887, 3.8735466), (63.647655, 3.9170544), (66.760704, 3.960593), (68.79663, 3.9607692), (70.23332, 3.986855), (72.867905, 3.995737), (75.38245, 4.0219164), (77.778656, 3.9615464), (79.337975, 3.8145657), (80.41826, 3.6675436), (80.899734, 3.5204697), (81.62059, 3.38207), (82.34045, 3.3042476), (83.30039, 3.1918304), (84.38039, 3.062116), (84.50359, 2.854434), (83.906364, 2.7591898), (83.669716, 2.586092), (83.43435, 2.3351095), (83.19727, 2.1879735), (82.84229, 1.9283267), (82.48516, 1.7984879), (81.65014, 1.5993768), (80.454544, 1.4781193), (79.13962, 1.3308897), (77.944595, 1.1750168), (76.39001, 1.0364205), (74.59633, 0.87184185), (71.60447, 0.741775), (70.04903, 0.6551017), (58.3, 0.0)]
model_coords = [(0.0, 0.0), (0.6699889, 0.18807), (1.339894, 0.37499), (2.009583, 0.55966), (2.67915, 0.74106), (3.348189, 0.91826), (4.016881, 1.0904), (4.685107, 1.2567), (5.359344, 1.418), (6.026172, 1.5706), (6.685472, 1.714), (7.350604, 1.8508), (8.021434, 1.9803), (8.684451, 2.0996), (9.346408, 2.2099), (10.0066, 2.311), (10.66665, 2.4028), (11.32436, 2.4853), (11.98068, 2.5585), (12.6356, 2.6225), (13.29005, 2.6775), (13.93507, 2.7232), (14.58554, 2.7609), (15.23346, 2.7903), (15.87982, 2.8116), (16.52556, 2.8254), (17.16867, 2.832), (17.80914, 2.8317), (18.44891, 2.825), (19.08598, 2.8124), (19.72132, 2.7944), (20.35491, 2.7713), (20.98673, 2.7438), (21.61675, 2.7121), (22.24398, 2.677), (22.86939, 2.6387), (23.49297, 2.5978), (24.1147, 2.5548), (24.73458, 2.51), (25.3526, 2.464), (25.96874, 2.4171), (26.58301, 2.3697), (27.1954, 2.3223), (27.80491, 2.2751), (28.41354, 2.2285), (29.02028, 2.1829), (29.62512, 2.1384), (30.22809, 2.0954), (30.82917, 2.0541), (31.42837, 2.0147), (32.02669, 1.9775), (32.62215, 1.9425), (33.21674, 1.9099), (33.80945, 1.8799), (34.40032, 1.8525), (34.98933, 1.8277), (35.5765, 1.8058), (36.16283, 1.7865), (36.74733, 1.7701), (37.33002, 1.7564), (37.91187, 1.7455), (38.49092, 1.7372), (39.06917, 1.7316), (39.64661, 1.7285), (40.22127, 1.7279), (40.79514, 1.7297), (41.36723, 1.7337), (41.93759, 1.7399), (42.50707, 1.748), (43.07386, 1.7581), (43.63995, 1.7699), (44.20512, 1.7832), (44.76772, 1.7981), (45.3295, 1.8143), (45.88948, 1.8318), (46.44767, 1.8504), (47.00525, 1.8703), (47.55994, 1.8911), (48.11392, 1.9129), (48.6661, 1.9356), (49.21658, 1.959), (49.76518, 1.9832), (50.31305, 2.0079), (50.85824, 2.033), (51.40252, 2.0586), (51.94501, 2.0845), (52.48579, 2.1107), (53.02467, 2.1369), (53.56185, 2.1632), (54.09715, 2.1895), (54.63171, 2.2156), (55.1634, 2.2416), (55.69329, 2.2674), (56.22236, 2.2928), (56.74855, 2.3179), (57.27392, 2.3426), (57.7964, 2.3668), (58.31709, 2.3905), (58.83687, 2.4136), (59.35905, 2.4365), (59.87414, 2.4585), (60.38831, 2.4798), (60.8996, 2.5006), (61.40888, 2.5207), (61.91636, 2.5401), (62.42194, 2.5589), (62.92551, 2.577), (63.42729, 2.5945), (63.92607, 2.6113), (64.42384, 2.6275), (64.91873, 2.643), (65.4127, 2.658), (65.90369, 2.6724), (66.39266, 2.6862), (66.87964, 2.6995), (67.36373, 2.7123), (67.84679, 2.7246), (68.32689, 2.7364), (68.80595, 2.7478), (69.28194, 2.7588), (69.756, 2.7695), (70.22709, 2.7798), (70.69707, 2.7898), (71.16405, 2.7995), (71.62902, 2.809), (72.0919, 2.8183), (72.55277, 2.8273), (73.01067, 2.8362), (73.46734, 2.845), (73.92112, 2.8536), (74.37269, 2.8622), (74.82127, 2.8706), (75.26884, 2.8791), (75.71322, 2.8875), (76.15559, 2.8958), (76.59488, 2.9042), (77.03304, 2.9126), (77.46812, 2.921), (77.90111, 2.9294), (78.33199, 2.9379), (78.75986, 2.9464), (79.18652, 2.955), (79.60912, 2.9637), (80.03049, 2.9724), (80.44985, 2.9811), (80.86613, 2.99), (81.2802, 2.9989), (81.69118, 3.0078), (82.10006, 3.0168), (82.50674, 3.0259), (82.91132, 3.035), (83.31379, 3.0441), (83.71307, 3.0533), (84.10925, 3.0625), (84.50421, 3.0717), (84.8961, 3.0809), (85.28577, 3.0901), (85.67334, 3.0993), (86.05771, 3.1085), (86.43989, 3.1176), (86.81896, 3.1267), (87.19585, 3.1358), (87.57063, 3.1448), (87.94319, 3.1537), (88.31257, 3.1626), (88.67973, 3.1713), (89.04372, 3.18), (89.40659, 3.1886), (89.7652, 3.197), (90.12457, 3.2053), (90.47256, 3.2135), (90.82946, 3.2216), (91.17545, 3.2295), (91.52045, 3.2373), (91.86441, 3.2449), (92.20641, 3.2524), (92.54739, 3.2597), (92.88728, 3.2669), (93.21538, 3.2739), (93.55325, 3.2807), (93.87924, 3.2874), (94.20424, 3.2939), (94.52822, 3.3002), (94.85012, 3.3064), (95.16219, 3.3123), (95.48208, 3.3182), (95.79107, 3.3238), (96.09807, 3.3293), (96.40505, 3.3346), (96.71003, 3.3397), (97.01401, 3.3447), (97.31592, 3.3496), (97.60799, 3.3542), (97.90789, 3.3587), (98.19686, 3.3631), (98.48386, 3.3673), (98.77085, 3.3714), (99.05574, 3.3753), (99.32983, 3.3791), (99.6127, 3.3828), (99.8837, 3.3863), (100.1538, 3.3897), (100.4326, 3.393), (100.6897, 3.3961), (100.9566, 3.3991), (101.2215, 3.402), (101.4756, 3.4048), (101.7375, 3.4075), (101.9885, 3.4101), (102.2385, 3.4126), (102.4875, 3.4149), (102.7354, 3.4172), (102.9714, 3.4194), (103.2163, 3.4214), (103.4493, 3.4234), (103.6823, 3.4253), (103.9133, 3.4271), (104.1433, 3.4288), (104.3712, 3.4304), (104.5882, 3.4319), (104.8141, 3.4333), (105.0291, 3.4346), (105.2421, 3.4358), (105.4541, 3.437), (105.6651, 3.438), (105.8751, 3.439), (106.083, 3.4399), (106.28, 3.4407), (106.4759, 3.4414), (106.6699, 3.442), (106.8629, 3.4425), (107.0549, 3.443), (107.2458, 3.4433), (107.4249, 3.4435), (107.6128, 3.4437), (107.7897, 3.4438), (107.9647, 3.4437), (108.1387, 3.4436), (108.3116, 3.4433), (108.4737, 3.443), (108.6436, 3.4426), (108.8027, 3.4421), (108.9706, 3.4414), (109.1265, 3.4407), (109.2814, 3.4399), (109.4255, 3.439), (109.5784, 3.4379), (109.7195, 3.4368), (109.8694, 3.4356), (110.0084, 3.4342), (110.1454, 3.4328), (110.2813, 3.4313), (110.4162, 3.4296), (110.5403, 3.4279), (110.6722, 3.426), (110.7932, 3.424), (110.9132, 3.422), (111.0322, 3.4198), (111.1492, 3.4175), (111.2651, 3.4151), (111.3701, 3.4127), (111.483, 3.4101), (111.585, 3.4074), (111.686, 3.4046), (111.786, 3.4017), (111.884, 3.3987), (111.9809, 3.3956), (112.0669, 3.3924), (112.1608, 3.3891), (112.2448, 3.3857), (112.3268, 3.3822), (112.4078, 3.3786), (112.4867, 3.3749), (112.5548, 3.3711), (112.6317, 3.3672), (112.6978, 3.3632), (112.7726, 3.3591), (112.8356, 3.3549), (112.8975, 3.3506), (112.9476, 3.3462), (113.0076, 3.3417), (113.0655, 3.3372), (113.1125, 3.3325), (113.1584, 3.3278), (113.2024, 3.3229), (113.2464, 3.318), (113.2884, 3.313), (113.3283, 3.3079), (113.3584, 3.3027), (113.3963, 3.2974), (113.4233, 3.292), (113.4492, 3.2865), (113.4742, 3.281), (113.4972, 3.2753), (113.5201, 3.2696), (113.5312, 3.2638), (113.5501, 3.2579), (113.5591, 3.2519), (113.5661, 3.2459), (113.5721, 3.2397), (113.577, 3.2335), (113.5809, 3.2272), (113.573, 3.2208), (113.5749, 3.2143), (113.5649, 3.2077), (113.5539, 3.2011), (113.5409, 3.1944), (113.5278, 3.1876), (113.5128, 3.1807), (113.4967, 3.1737), (113.4697, 3.1667), (113.4418, 3.1596), (113.4227, 3.1524), (113.3917, 3.145), (113.3597, 3.1375), (113.3266, 3.1298), (113.2827, 3.1218), (113.2475, 3.1136), (113.2016, 3.1051), (113.1635, 3.0964), (113.1155, 3.0873), (113.0655, 3.0779), (113.0144, 3.0683), (112.9525, 3.0583), (112.8994, 3.048), (112.8345, 3.0373), (112.7793, 3.0264), (112.7123, 3.0152), (112.6453, 3.0037), (112.5763, 2.9919), (112.5063, 2.9798), (112.4352, 2.9674), (112.3533, 2.9548), (112.2801, 2.9419), (112.1952, 2.9287), (112.1102, 2.9153), (112.034, 2.9017), (111.9361, 2.8879), (111.8481, 2.8739), (111.7581, 2.8597), (111.667, 2.8453), (111.5661, 2.8307), (111.473, 2.816), (111.3689, 2.801), (111.2639, 2.786), (111.1579, 2.7708), (111.0509, 2.7555), (110.9428, 2.74), (110.8239, 2.7245), (110.7138, 2.7088), (110.5928, 2.6931), (110.4709, 2.6772), (110.3578, 2.6613), (110.2338, 2.6453), (110.1087, 2.6292), (109.9826, 2.613), (109.8457, 2.5968), (109.7176, 2.5805), (109.5787, 2.5642), (109.4496, 2.5478), (109.3086, 2.5314), (109.1666, 2.5149), (109.0236, 2.4984), (108.8806, 2.4819), (108.7355, 2.4653), (108.5905, 2.4488), (108.4434, 2.4322), (108.2865, 2.4155), (108.1384, 2.3989), (107.9794, 2.3822), (107.8195, 2.3655), (107.6684, 2.3488), (107.5063, 2.3321), (107.3374, 2.3156), (107.1744, 2.2989), (107.0104, 2.2822), (106.8442, 2.2654), (106.6683, 2.2487), (106.5012, 2.232), (106.3242, 2.2152), (106.1452, 2.1985), (105.9662, 2.1818), (105.7862, 2.165), (105.6052, 2.1483), (105.4232, 2.1316), (105.2402, 2.1149), (105.0572, 2.0981), (104.8721, 2.0814), (104.6772, 2.0647), (104.492, 2.048), (104.295, 2.0313), (104.098, 2.0147), (103.9, 1.998), (103.701, 1.9813), (103.502, 1.9647), (103.301, 1.948), (103.1, 1.9314), (102.899, 1.9148), (102.6959, 1.8982), (102.483, 1.8816), (102.2789, 1.865), (102.0649, 1.8484), (101.8588, 1.8318), (101.6428, 1.8153), (101.4268, 1.7988), (101.2098, 1.7822), (100.9918, 1.7657), (100.7728, 1.7492), (100.5538, 1.7328), (100.3338, 1.7163), (100.1128, 1.6999), (99.89169, 1.6834), (99.65978, 1.667), (99.43769, 1.6506), (99.20477, 1.6343), (98.98066, 1.6179), (98.74665, 1.6016), (98.51164, 1.5852), (98.27574, 1.5689), (98.04964, 1.5527), (97.81264, 1.5364), (97.57562, 1.5202), (97.33752, 1.5039), (97.08962, 1.4877), (96.8506, 1.4716), (96.61061, 1.4554), (96.37051, 1.4393), (96.12058, 1.4232), (95.87949, 1.4071), (95.62759, 1.391), (95.38547, 1.375), (95.13258, 1.359), (94.88946, 1.343), (94.63548, 1.3271), (94.38145, 1.3111), (94.12645, 1.2952), (93.87144, 1.2793), (93.61545, 1.2635), (93.35946, 1.2477), (93.10343, 1.2319), (92.84642, 1.2161), (92.58843, 1.2004), (92.33042, 1.1846), (92.07232, 1.169), (91.8034, 1.1533), (91.54331, 1.1377), (91.2744, 1.1221), (91.0133, 1.1065), (90.7434, 1.091), (90.48229, 1.0755), (90.21139, 1.0601), (89.9493, 1.0446), (89.67728, 1.0292), (89.40428, 1.0139), (89.13137, 0.99855), (88.86826, 0.98325), (88.59427, 0.96799), (88.32026, 0.95277), (88.04527, 0.93758), (87.77126, 0.92242), (87.4972, 0.90731), (87.21732, 0.89222), (86.94719, 0.87718), (86.66711, 0.86217), (86.3773, 0.8472), (86.10719, 0.83227), (85.82721, 0.81738), (85.5472, 0.80252), (85.26721, 0.7877), (84.9872, 0.77292), (84.7071, 0.75819), (84.41721, 0.74349), (84.1371, 0.72883), (83.84721, 0.71421), (83.5671, 0.69963), (83.27721, 0.68509), (82.99711, 0.6706), (82.70711, 0.65615), (82.41721, 0.64173), (82.1371, 0.62736), (81.8471, 0.61304), (81.55722, 0.59875), (81.27709, 0.58451), (80.98712, 0.57031), (80.697, 0.55616), (80.39711, 0.54205), (80.10722, 0.52798), (79.8271, 0.51396), (79.53701, 0.49999), (79.23711, 0.48605), (78.9471, 0.47217), (78.65701, 0.45833), (78.3571, 0.44453), (78.06712, 0.43078), (77.77701, 0.41708), (77.4771, 0.40343), (77.18701, 0.38982), (76.8871, 0.37626), (76.59711, 0.36274), (76.30701, 0.34928), (76.0071, 0.33586), (75.7169, 0.32249), (75.4071, 0.30917), (75.11701, 0.29589), (74.8171, 0.28267), (74.52701, 0.26949), (74.22711, 0.25636), (73.937, 0.24329), (73.63691, 0.23026), (73.3271, 0.21728), (73.03699, 0.20436), (72.73712, 0.19148), (72.4469, 0.17865), (72.13712, 0.16588), (71.84701, 0.15315), (71.547, 0.14048), (71.24701, 0.12786), (70.947, 0.11528), (70.64701, 0.10277), (70.3471, 0.090298), (70.05691, 0.077883), (69.74712, 0.06552), (69.457, 0.05321), (69.1569, 0.040952), (68.84709, 0.028747), (68.557, 0.016595), (68.25701, 0.0)]
polygon_points = [] #creates a empty list where we will append the points to create the polygon
for xyvalue in avg_coords:
polygon_points.append([xyvalue[0],xyvalue[1]]) #append all xy points for curve 1
for xyvalue in model_coords[::-1]:
polygon_points.append([xyvalue[0],xyvalue[1]]) #append all xy points for curve 2 in the reverse order (from last point to first point)
for xyvalue in avg_coords[0:1]:
polygon_points.append([xyvalue[0],xyvalue[1]]) #append the first point in curve 1 again, to it "closes" the polygon
avg_poly = []
model_poly = []
for xyvalue in avg_coords:
avg_poly.append([xyvalue[0],xyvalue[1]])
for xyvalue in model_coords:
model_poly.append([xyvalue[0],xyvalue[1]])
line_non_simple = LineString(polygon_points)
mls = unary_union(line_non_simple)
Area_cal =[]
for polygon in polygonize(mls):
Area_cal.append(polygon.area)
print(polygon.area)# print area of each section
Area_poly = (np.asarray(Area_cal).sum())
print(Area_poly)#print combined area
If possible, represent your overlap regions as polygons. From there the polygon area is computable by a remarkably concise formula as explained on Paul Bourke's site.
Suppose (x[i], y[i]), i = 0, ..., N, are the polygon vertices, with (x[0], y[0]) = (x[N], y[N]) so that the polygon is closed, and consistently all in clockwise order or all in counter-clockwise order. Then the area is
area = |0.5 * sum_i (x[i] * y[i+1] - x[i+1] * y[i])|
where the sum goes over i = 0, ..., N-1. This is valid even for nonconvex polygons. This formula is essentially the same principle behind how a planimeter works to measure area of an arbitrary two-dimensional shape, a special case of Green's theorem.
If your functions are actually "function" meaning that no vertical lines intersect the functions more than once, then finding the overlaps is the matter of finding zeros.
import numpy as np
import matplotlib.pyplot as plt
dx = 0.01
x = np.arange(-2, 2, dx)
f1 = np.sin(4*x)
f2 = np.cos(4*x)
plt.plot(x, f1)
plt.plot(x, f2)
eps = 1e-1; # threshold of intersection points.
df = f1 - f2
idx_zeros = np.where(abs(df) <= eps)[0]
area = 0
for i in range(len(idx_zeros) - 1):
idx_left = idx_zeros[i]
idx_rite = idx_zeros[i+1]
area += abs(np.trapz(df[idx_left:idx_rite])) * dx
I have assumed areas to be considered positive.
The analytical value for the example I used is
sufficiently close to the computed value (area=2.819). Of course, you can improve this if your grids are finer, and threshold eps smaller.
I have data points of time and voltage that create the curve shown below.
The time data is
array([ 0.10810811, 0.75675676, 1.62162162, 2.59459459,
3.56756757, 4.21621622, 4.97297297, 4.97297297,
4.97297297, 4.97297297, 4.97297297, 4.97297297,
4.97297297, 4.97297297, 5.08108108, 5.18918919,
5.2972973 , 5.51351351, 5.72972973, 5.94594595,
6.27027027, 6.59459459, 7.13513514, 7.67567568,
8.32432432, 9.18918919, 10.05405405, 10.91891892,
11.78378378, 12.64864865, 13.51351351, 14.37837838,
15.35135135, 16.32432432, 17.08108108, 18.16216216,
19.02702703, 20. , 20. , 20. ,
20. , 20. , 20. , 20. ,
20.10810811, 20.21621622, 20.43243243, 20.64864865,
20.97297297, 21.40540541, 22.05405405, 22.91891892,
23.78378378, 24.86486486, 25.83783784, 26.7027027 ,
27.56756757, 28.54054054, 29.51351351, 30.48648649,
31.56756757, 32.64864865, 33.62162162, 34.59459459,
35.67567568, 36.64864865, 37.62162162, 38.59459459,
39.67567568, 40.75675676, 41.83783784, 42.81081081,
43.89189189, 44.97297297, 46.05405405, 47.02702703,
48.10810811, 49.18918919, 50.27027027, 51.35135135,
52.43243243, 53.51351351, 54.48648649, 55.56756757,
56.75675676, 57.72972973, 58.81081081, 59.89189189])
and the volts data is
array([ 4.11041056, 4.11041056, 4.11041056, 4.11041056, 4.11041056,
4.11041056, 4.11041056, 4.10454545, 4.09794721, 4.09208211,
4.08621701, 4.07961877, 4.07228739, 4.06568915, 4.05909091,
4.05175953, 4.04516129, 4.03782991, 4.03123167, 4.02463343,
4.01803519, 4.01217009, 4.00557185, 3.99970674, 3.99384164,
3.98797654, 3.98284457, 3.97771261, 3.97331378, 3.96891496,
3.96451613, 3.96085044, 3.95645161, 3.95205279, 3.9483871 ,
3.94398827, 3.94032258, 3.93665689, 3.94325513, 3.94985337,
3.95645161, 3.96378299, 3.97038123, 3.97624633, 3.98284457,
3.98944282, 3.99604106, 4.0026393 , 4.00923754, 4.01510264,
4.02096774, 4.02609971, 4.02903226, 4.03196481, 4.03416422,
4.0356305 , 4.03709677, 4.03856305, 4.03929619, 4.04002933,
4.04076246, 4.04222874, 4.04296188, 4.04296188, 4.04369501,
4.04442815, 4.04516129, 4.04516129, 4.04589443, 4.04589443,
4.04662757, 4.04662757, 4.0473607 , 4.0473607 , 4.04809384,
4.04809384, 4.04809384, 4.04882698, 4.04882698, 4.04882698,
4.04956012, 4.04956012, 4.04956012, 4.04956012, 4.05029326,
4.05029326, 4.05029326, 4.05029326])
I would like to determine the location of the points labeled A, B, C, D, and E. Point A is the first location where the slope goes from zero to undefined. Point B is the location where the line is no longer vertical. Point C is the minimum of the curve. Point D is where the curve is no longer vertical. Point E is where the slope is close to zero again. The Python code below determines the locations for points A and C.
tdiff = np.diff(time)
vdiff = np.diff(volts)
# point A
idxA = np.where(vdiff < 0)[0][0]
timeA = time[idxA]
voltA = volts[idxA]
# point C
idxC = volts.idxmin()
timeC = time[idxC]
voltC = volts[idxC]
How can I determine the other locations on the curve represented by points B, D, and E?
You are looking for the points that mark any location where the slope changes to or from zero or infinity. We do not not actually need to compute slopes anywhere: either yn - yn-1 == 0 and yn+1 - yn != 0, or vice versa, or the same for x.
We can take the diff of x. If one of two successive elements is zero, then the diff of the diff will be the diff or the negative diff at that point. So we just want to find and label all points where diff(x) == diff(diff(x)) and diff(x) != 0, properly adjusted for differences in size between the arrays of course. We also want all the points where the same is true for y.
In numpy terms, this is can be written as follows
def masks(vec):
d = np.diff(vec)
dd = np.diff(d)
# Mask of locations where graph goes to vertical or horizontal, depending on vec
to_mask = ((d[:-1] != 0) & (d[:-1] == -dd))
# Mask of locations where graph comes from vertical or horizontal, depending on vec
from_mask = ((d[1:] != 0) & (d[1:] == dd))
return to_mask, from_mask
to_vert_mask, from_vert_mask = masks(time)
to_horiz_mask, from_horiz_mask = masks(volts)
Keep in mind that the masks are computed on second order differences, so they are two elements shorter than the inputs. Elements in the masks correspond to elements in the input arrays with a one-element border on the leading and trailing edge (hence the index [1:-1] below). You can convert the mask to indices using np.nonzero or you can get the x- and y-values directly using the masks as indices:
def apply_mask(mask, x, y):
return x[1:-1][mask], y[1:-1][mask]
to_vert_t, to_vert_v = apply_mask(to_vert_mask, time, volts)
from_vert_t, from_vert_v = apply_mask(from_vert_mask, time, volts)
to_horiz_t, to_horiz_v = apply_mask(to_horiz_mask, time, volts)
from_horiz_t, from_horiz_v = apply_mask(from_horiz_mask, time, volts)
plt.plot(time, volts, 'b-')
plt.plot(to_vert_t, to_vert_v, 'r^', label='Plot goes vertical')
plt.plot(from_vert_t, from_vert_v, 'kv', label='Plot stops being vertical')
plt.plot(to_horiz_t, to_horiz_v, 'r>', label='Plot goes horizontal')
plt.plot(from_horiz_t, from_horiz_v, 'k<', label='Plot stops being horizontal')
plt.legend()
plt.show()
Here is the resulting plot:
Notice that because the classification is done separately, "Point A" is correctly identified as being both a spot where verticalness starts and horizontalness ends. The problem is that "Point E" does not appear to be resolvable as such according to these criteria. Zooming in shows that all of the proliferated points correctly identify horizontal line segments:
You could choose a "correct" version of "Point E" by discarding from_horiz completely, and only the last value from to_horiz:
to_horiz_t, to_horiz_v = apply_mask(to_horiz_mask, time, volts)
to_horiz_t, to_horiz_v = to_horiz_t[-1], to_horiz_v[-1]
plt.plot(time, volts, 'b-')
plt.plot(*apply_mask(to_vert_mask, time, volts), 'r^', label='Plot goes vertical')
plt.plot(*apply_mask(from_vert_mask, time, volts), 'kv', label='Plot stops being vertical')
plt.plot(to_horiz_t, to_horiz_v, 'r>', label='Plot goes horizontal')
plt.legend()
plt.show()
I am using this as a showcase for the star expansion of the results of apply_mask. The resulting plot is:
This is pretty much exactly the plot you were looking for. Discarding from_horiz also makes "Point A" be identified only as a drop to vertical, which is nice.
As multiple values in to_horiz show, this method is very sensitive to noise within the data. Your data is quite smooth, but this approach is unlikely to ever work with raw unfiltered measurements.