I have a collection of 1*1 polygons, each defined by its boundary (a set of four points); I use the function area() in the code example below to create them. I wish to combine adjacent squares into a single polygon, with that polygon also defined in terms of its boundary points.
I wish to do this in a brute-force fashion: I begin by joining two adjacent 1*1 squares into a larger polygon using the function join() below, and continue in this fashion to grow the polygon. So the first argument of join() is the polygon so far and the second argument is an adjacent 1*1 square I wish to add to it. The return value is the boundary of the new polygon: current joined with the new 1*1 square.
Here's what I've come up with so far:
def join(current, new):
    """current is the polygon, new the 1*1 square being added to it"""
    return (get_non_touching_part_of_boundary(current, new)
            + get_non_touching_part_of_boundary(new, current))

def get_non_touching_part_of_boundary(this, other):
    for i, point in enumerate(this):
        if point not in other and this[i-1] in other:
            break  # start of the non-touching boundary, walking clockwise
    non_touching_part_of_boundary = []
    for point in this[i:] + this[:i]:
        if point not in other:
            non_touching_part_of_boundary.append(point)
    return non_touching_part_of_boundary

def area(point):
    """boundary defined in a clockwise fashion"""
    return [point, (point[0], point[1]+1), (point[0]+1, point[1]+1), (point[0]+1, point[1])]
a = area((0,1)) # a assigned a 1*1 polygon
a = join(a, area((0,2))) # a assigned a 2*1 polygon
a = join(a, area((1,2)))
a = join(a, area((2,2)))
a = join(a, area((2,1)))
a = join(a, area((2,0)))
print(a)
This gives me the following polygon shape (with the numbers representing the order in which its composing squares are added):
234
1 5
6
The printed output of the code above gives:
[(2, 2), (1, 2), (1, 1), (0, 1), (0, 3), (3, 3), (3, 0), (2, 0)]
This is the minimum number of points required to define the boundary of the polygon.
But if I add one more square to this shape via a = join(a, area((1,0))), thereby creating a hole, my algorithm falls to pieces:
234
1 5
76
Here's another polygon that my algorithm can't handle:
123
64
5
Can anyone help me? I'd like the holes in the polygon to be listed in a separate list.
Thanks!
I think that your algorithm is hard to fix. Consider that for example adding a single square to a polygon could create several holes:
xxx
x x
xxx xxx
x y x
xxx xxx
x x
xxx
Imagine, for example, that all the x squares are the "current polygon" and you then add y...
In general a closed area is defined by a collection of closed loops, and you can use just a single list only with a much more complex approach that creates zero-area bridges between the loops. A simple approach to what you seem to be looking for is radically different:
loop on every square and for each one:
if the square is in and the square to its left is out (or vice versa), then you know you need a wall on the left
if the square is in and the square above it is out (or vice versa), then you know you need a wall above
collect all the walls in a dictionary where, for every vertex (coordinate pair), you keep a list of all the walls starting or arriving at that vertex
once the scan is finished you can rebuild the resulting loops by starting with any wall and following the chain until you get back to the initial point... if, when you arrive at a point, there are multiple choices of which wall to use to continue the walk, pick any of them.
if you collected the data properly, and assuming you can add a border of one "out" cell around your data, then it's guaranteed that the result is a list of zero or more closed loops, because each point will list an even number of walls.
Those loops (interpreted with an odd-even filling rule) will define your initial area. Note that you may get self-intersecting loops... if you want to avoid that, the algorithm is slightly more complex.
This approach is also much faster than processing the boundaries one at a time and doing all those merge operations, and the result is fully general (it handles non-connected areas and holes).
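Here is a minimal sketch of that scan under my own conventions (trace_boundaries is a made-up name, the input is a set of (x, y) cells, and the wall orientation is an arbitrary but consistent choice). It picks any wall when several meet at a vertex, so it omits the right-turn logic mentioned in the EDIT below, and it does not merge collinear vertices:

def trace_boundaries(squares):
    """squares: set of (x, y) cells that are 'in'.
    Returns a list of closed vertex loops; holes come out as extra loops."""
    walls = {}  # vertex -> list of end vertices of walls leaving that vertex
    for x, y in squares:
        if (x - 1, y) not in squares:  # wall needed on the left side
            walls.setdefault((x, y), []).append((x, y + 1))
        if (x + 1, y) not in squares:  # wall needed on the right side
            walls.setdefault((x + 1, y + 1), []).append((x + 1, y))
        if (x, y - 1) not in squares:  # wall needed below
            walls.setdefault((x + 1, y), []).append((x, y))
        if (x, y + 1) not in squares:  # wall needed above
            walls.setdefault((x, y + 1), []).append((x + 1, y + 1))
    loops = []
    while walls:  # follow chains of walls until every wall is consumed
        start = next(iter(walls))
        loop = [start]
        vertex = start
        while True:
            ends = walls[vertex]
            nxt = ends.pop()  # several walls may meet here; any choice works
            if not ends:
                del walls[vertex]
            if nxt == start:
                break
            loop.append(nxt)
            vertex = nxt
        loops.append(loop)
    return loops

# The C-shaped example from the question yields one outer loop; adding
# the (1, 0) square to close the ring yields an extra loop for the hole.
print(trace_boundaries({(0, 1), (0, 2), (1, 2), (2, 2), (2, 1), (2, 0)}))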
EDIT
The following image is the result of a complete implementation of this algorithm including a right-turn logic during cycle collection to avoid self-intersecting cycles. Different colors have been assigned to output polygons and corners have been cut to make the turns evident.
I am building a model and I need to get the positions of some points inside a box (known volume). I am thinking of using
a) numpy.linspace(start, stop, 30)
b) numpy.linspace(start, stop, 3000)
over the same box; I think I need a tool to exclude the points generated by a) from the points generated by b).
Example (in 2D):
Say we have a line of length 20, and we need to distribute two types of pieces along it:
1) 10 pieces of length 1, 2) 4 pieces of length 2.
- The space between a type-1 piece (small line) and any neighbour is the same, whether the neighbour is type 1 or type 2.
- The small pieces are distributed equally around each type-2 piece.
This solution is the only one that worked for me:
get the xyz file from some other software such as jmole;
that gives you the orientations of the model;
I wrote the orientations into my program to avoid overlapping.
Does
filtered_b = np.setdiff1d(np.linspace(start, stop, 3000), np.linspace(start, stop, 30))
do what you want? This chooses the points that are in b but not in a.
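For example, a minimal runnable sketch (the start/stop values are made up; note that setdiff1d compares floats exactly, so only points where the two grids coincide exactly are removed):

import numpy as np

start, stop = 0.0, 1.0
a = np.linspace(start, stop, 30)
b = np.linspace(start, stop, 3000)

# Keep the values of b that do not appear in a. With 30 vs 3000 points
# only the two endpoints coincide exactly, so 2998 points remain here.
filtered_b = np.setdiff1d(b, a)
print(b.size, filtered_b.size)  # 3000 2998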
I'll start explaining my problem from far back, so you can suggest completely different approaches and understand my custom objects and functions.
Over the years I have recorded many bicycle GPS tracks (.gpx). I decided to merge these (mostly overlapping) tracks into a large graph and merge/remove most of the track points. So far, I have managed to simplify the tracks (a feature of the gpxpy module that removes about 90% of track points while preserving the positions of corners) and load them into my current program.
My current Python 3 program loads the gpx tracks and optimises the graph with four scans. Here are the planned steps in my program:
Import points from gpx (working)
Join points located close to each other (working)
Merge edges under small angles (the problem is with this step)
Remove points on straights (the angle between both edges is over 170 degrees). This looks like it is working.
Clean-up by resetting unique indexing of points (working)
Final checking of all edges in graph.
In my program I started counting the steps from 0, because the first one is simply opening and parsing the file. Stack Overflow doesn't let me start the numbering from 0.
To store the graph, I have a dictionary punktid ("points" in Estonian), where a punkt (point) object is stored at key uid/ui (unique ID). The unique ID is also stored in the point itself. The weight attribute is used in the 2nd and 3rd steps to find the average of points while taking earlier merges into account.
class punkt:
    def __init__(self, lo, la, idd, edge=None, ele=0, wei=1):
        self.lng = lo      # Longitude
        self.lat = la      # Latitude
        self.uid = idd     # Unique ID
        # Note: a mutable default argument (edge=set()) would be shared
        # between all instances, so None is used as the sentinel instead.
        self.edges = edge if edge is not None else set()  # Set of neighbour nodes
        self.att = ele     # Elevation
        self.weight = wei  # Used to get weighted average
>>> punktid
{1: <__main__.punkt object at 0x0000006E9A9F7FD0>,
2: <__main__.punkt object at 0x0000006E9AADC470>, 3: ...}
>>> punktid[1].__dict__
{'weight': 90, 'uid': 9000, 'att': 21.09333333333333, 'lat': 59.41757, 'lng': 24.73907, 'edges': {1613, 1218, 1530}}
As you can see, there is a minor bug in the clean-up, where uid was not updated. I have fixed it by now, but I left it in so you can see the scale of the graph. The largest index in punktid was 1699/11787.
Getting to the core problem
Let's say I have 3 points: A, B and C (i, lyhem(2) and lyhem(0) respectively in the following code slice). A has a common edge with B and with C, but B and C might not have a common edge. C is closer to A than B is. To reduce the size of the graph, I want to move C closer to the edge AB (while respecting the weights of B and C) and redirect AB through C.
The solution I came up with is to find a temporary point D on AB which is closest to C, then find the weighted average of D and C, save it as E, and redirect all of C's edges and AB to it. A simplified figure: note that E=(C+D)/2 is not completely accurate. I cannot add more than two links, but I have 2 additional images illustrating my problem.
The biggest problem was finding the coordinates of D. I found a possible solution on the Mathematica site, but it contains a ± sign, because there are two candidate coordinates, whereas I know the line the point lies on. Anyway, I don't know how to implement it correctly, and my code has become quite messy:
# 2nd run: merge edges under small angles
for i in set(punktid.keys()):
    try:
        naabrid1 = frozenset(punktid[i].edges)  # naabrid / neighbours
        for e in naabrid1:
            t = set(naabrid1)
            t.remove(e)
            for u in t:
                try:
                    a = nurk_3(punktid[i], punktid[e], punktid[u])  # Returns angle EIU in degrees. 0 <= a <= 180
                    if a < 10:
                        de = ((punktid[i].lat - punktid[e].lat)**2 +
                              (punktid[i].lng - punktid[e].lng)**2)  # distance i-e (squared)
                        du = ((punktid[i].lat - punktid[u].lat)**2 +
                              (punktid[i].lng - punktid[u].lng)**2)  # distance i-u (squared)
                        b = radians(a)
                        if du < de:
                            lyhem = [u, du, e]  # lühem in English is shorter,
                        else:                   # but currently it should be lähem/closer
                            lyhem = [e, de, u]
                        if sin(b)*lyhem[1] < r:
                            lr = abs(sin(b)*lyhem[1])
                            ml = tan(nurk_coor(punktid[i], punktid[lyhem[0]]))  # Lühema tõus / slope towards the closer point (C)
                            mp = tan(nurk_coor(punktid[i], punktid[lyhem[2]]))  # Pikema / ...towards the farther point (B)
                            mr = -1/ml  # Ristsirge / perpendicular ...BD
                            p1 = (punktid[i].lng + lyhem[1]*(1/(1 + ml**2)**0.5),
                                  punktid[i].lat + lyhem[1]*(ml/(1 + ml**2)**0.5))
                            p2 = (punktid[i].lng - lyhem[1]*(1/(1 + ml**2)**0.5),
                                  punktid[i].lat - lyhem[1]*(ml/(1 + ml**2)**0.5))
                            d1 = ((punktid[lyhem[0]].lat - p1[1])**2 +
                                  (punktid[lyhem[0]].lng - p1[0])**2)**0.5  # distance C-p1
                            d2 = ((punktid[lyhem[0]].lat - p2[1])**2 +
                                  (punktid[lyhem[0]].lng - p2[0])**2)**0.5  # distance C-p2
                            if d1 < d2:    # I experimented with one idea,
                                x = p1[0]  # but it made things worse.
                                y = p1[1]  # Originally I simply used p1 coordinates.
                            else:
                                x = p2[0]
                                y = p2[1]
                            lo = punktid[lyhem[2]].weight*p2[0]  # Finding weighted average
                            la = punktid[lyhem[2]].weight*p2[1]
                            la += punktid[lyhem[0]].weight*punktid[lyhem[0]].lat
                            lo += punktid[lyhem[0]].weight*punktid[lyhem[0]].lng
                            kaal = punktid[lyhem[2]].weight + punktid[lyhem[0]].weight  # kaal = weight
                            c = (la/kaal, lo/kaal)
                            punktid[ui] = punkt(c[1], c[0], ui,
                                                punktid[lyhem[0]].edges,
                                                punktid[lyhem[0]].att, kaal)
                            punktid[i].edges.remove(lyhem[2])
                            punktid[lyhem[2]].edges.remove(i)
                            try:
                                for n in punktid[ui].edges:   # In all neighbours,
                                    try:                      # remove the link to the old point
                                        punktid[n].edges.remove(lyhem[0])
                                    except KeyError:
                                        pass                  # it didn't link to the current point
                                    punktid[n].edges.add(ui)  # and add the new point
                                    if log:
                                        printf(punktid[n].edges, 'naabri ' + str(n) + ' edges')
                            except KeyError:  # the neighbour itself has been removed
                                pass          # (in the same merge); ignore
                            punktid[ui].edges.add(lyhem[2])
                            punktid[lyhem[2]].edges.add(ui)
                            punktid.pop(lyhem[0])
                            ui += 1
                except KeyError:  # u has been removed
                    pass
    except KeyError:  # i has been removed
        pass
This is a code segment and is likely not to run after copy-pasting because of missing variables/functions. The new point is calculated inside the third if-statement, from if sin(b)*lyhem[1] < r down to the punktid[ui] = ... assignment. After that comes the redirecting of the old edges to the new node.
Stating the question clearly: how do I find the point D on ray AB that is closest to C (so that CD is perpendicular to AB), given the coordinates of A, B and C? And how do I implement it in Python 3.5?
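For reference, that projection can be computed without any ± ambiguity using a dot product. A minimal sketch with plain (x, y) tuples rather than the program's punkt objects (project_onto_ray is a made-up name):

def project_onto_ray(a, b, c):
    """Return D, the point on the ray from a through b closest to c.
    a, b, c are (x, y) tuples; if the projection falls behind a, it is
    clamped to a, so D always lies on the ray."""
    abx, aby = b[0] - a[0], b[1] - a[1]
    acx, acy = c[0] - a[0], c[1] - a[1]
    t = (acx*abx + acy*aby) / (abx*abx + aby*aby)  # scalar projection of AC onto AB
    t = max(t, 0.0)                                # stay on the ray
    return (a[0] + t*abx, a[1] + t*aby)

print(project_onto_ray((0, 0), (4, 0), (2, 3)))  # (2.0, 0.0); CD is perpendicular to AB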
PS. (Meta) If somebody needs the full source, how could I provide it (uploading a single text file without registration)? Pastebin, or pasting (spamming) it here? If I upload it to another site, how do I provide the link, given that newbie users are limited to two?
I have a list of lists where each element represents the average height (an integer) of one square metre of the map (one number = one square metre). For example:
map=[
[1,1,1,1],
[1,1,2,2],
[1,2,2,2]
] # where 1 and 2 are the average heights at those coordinates.
I'm trying to implement a method that, given a position, looks for the area around it that has the same height. Let's call these 'flat areas'.
I found a solution in the flood-fill algorithm. However, I'm having some problems when it comes to writing the code. I get a
RuntimeError: maximum recursion depth exceeded
I have no idea where my problem is. Here is the code of the function:
def zona_igual_alcada(self, pos, zones=[], h=None):
    x, y = pos
    if h == None:
        h = base_terreny.base_terreny.__getitem__(self, (x, y))
    if base_terreny.base_terreny.__getitem__(self, (x, y)) != h:
        return
    if x in range(0, self.files) and y in range(0, self.columnes):
        if base_terreny.base_terreny.__getitem__(self, (x, y)) == h:
            zones.append((x, y))
            terreny.zona_igual_alcada(self, (x-1, y), zones, h)
            terreny.zona_igual_alcada(self, (x+1, y), zones, h)
            terreny.zona_igual_alcada(self, (x, y-1), zones, h)
            terreny.zona_igual_alcada(self, (x, y+1), zones, h)
    return set(zones)
You're not doing anything to "mark" the zones you have already visited, so you are doing the same zones over and over until the stack fills up.
This isn't a particularly efficient way to do a flood fill, so if you have a large number of zones you will be better off looking for a more efficient algorithm to do the flood fill (eg. scanline fill).
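For illustration, here is a hedged sketch of an iterative flood fill that marks visited cells, written against a plain list-of-lists grid rather than the terreny classes (flat_area and the grid layout are my own choices):

def flat_area(grid, pos):
    """Return the set of cells connected to pos that share its height."""
    rows, cols = len(grid), len(grid[0])
    h = grid[pos[0]][pos[1]]
    seen = set()   # the "mark" that the recursive version is missing
    stack = [pos]
    while stack:
        x, y = stack.pop()
        if (x, y) in seen:
            continue  # already visited, so no infinite revisiting
        if not (0 <= x < rows and 0 <= y < cols) or grid[x][y] != h:
            continue
        seen.add((x, y))
        stack.extend([(x-1, y), (x+1, y), (x, y-1), (x, y+1)])
    return seen

grid = [[1, 1, 1, 1],
        [1, 1, 2, 2],
        [1, 2, 2, 2]]
print(flat_area(grid, (2, 2)))  # {(1, 2), (1, 3), (2, 1), (2, 2), (2, 3)}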
I'm trying to solve a problem related to graphs in Python. Since it's a competitive programming problem, I'm not using any third-party packages.
The problem presents a graph in the form of a 5 X 5 square grid.
A bot is assumed to be at a user-supplied position on the grid. The grid is indexed at (0,0) on the top left and (4,4) on the bottom right. Each cell in the grid is represented by one of the following 3 characters: ‘b’ (ascii value 98) indicates the bot’s current position, ‘d’ (ascii value 100) indicates a dirty cell and ‘-’ (ascii value 45) indicates a clean cell in the grid.
For example below is a sample grid where the bot is at 0 0:
b---d
-d--d
--dd-
--d--
----d
The goal is to clean all the cells in the grid, in minimum number of steps.
A step is defined as a task, where either
i) The bot changes its position
ii) The bot changes the state of the cell (from d to -)
Assume that initially the position marked as b need not be cleaned. The bot is allowed to move UP, DOWN, LEFT and RIGHT.
My approach
I've read a couple of tutorials on graphs and decided to model the graph as a 25 x 25 adjacency matrix, with 0 representing no path and 1 representing a path (since we can move only in 4 directions). Next, I decided to apply the Floyd–Warshall all-pairs shortest path algorithm to it and then sum up the values of the paths.
But I have a feeling that it won't work.
I'm in a dilemma: the problem is either one of the following:
i) a minimum spanning tree (which I'm unable to do, as I'm not able to model and store the grid as a graph), or
ii) A* search (again a wild guess, but with the same problem: I'm not able to model the grid as a graph properly).
I'd be thankful if you could suggest a good approach to problems like these. Also, some hints and pseudocode for various forms of graph-based problems (or links to those) would be helpful. Thanks!
I think you're asking two questions here.
1. How do I represent this problem as a graph in Python?
As the robot moves around, he'll be moving from one dirty square to another, sometimes passing through some clean spaces along the way. Your job is to figure out the order in which to visit the dirty squares.
# Code is untested and may contain typos. :-)

# A list of the (x, y) coordinates of all of the dirty squares.
dirty_squares = [(0, 4), (1, 1), etc.]
n = len(dirty_squares)

# Everywhere after here, refer to dirty squares by their index
# into dirty_squares.
def compute_distance(i, j):
    return (abs(dirty_squares[i][0] - dirty_squares[j][0])
            + abs(dirty_squares[i][1] - dirty_squares[j][1]))

# distances[i][j] is the cost to move from dirty square i to
# dirty square j.
distances = []
for i in range(n):
    distances.append([compute_distance(i, j) for j in range(n)])

# The x, y coordinates of where the robot starts.
start_node = (0, 0)

# first_move_distances[i] is the cost to move from the robot's
# start location to dirty square i.
first_move_distances = [
    abs(start_node[0] - dirty_squares[i][0])
    + abs(start_node[1] - dirty_squares[i][1])
    for i in range(n)]

# order is a list of dirty-square indices.
def cost(order):
    if not order:
        return 0  # Cleaning 0 dirty squares is free.
    return (first_move_distances[order[0]]
            + sum(distances[order[i]][order[i+1]]
                  for i in range(len(order)-1)))
Your goal is to find a way to reorder list(range(n)) that minimizes the cost.
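For tiny inputs you can sanity-check cost() by brute force; a sketch using the names above (with up to 24 dirty squares the factorial blow-up makes this hopeless, which motivates the pruning below):

import itertools

# Try every visit order and keep the cheapest; only viable for small n.
best = min(itertools.permutations(range(n)), key=cost)
print(best, cost(best))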
2. How do I find the minimum number of moves to solve this problem?
As others have pointed out, the generalized form of this problem is intractable (NP-Hard). You have two pieces of information that help constrain the problem to make it tractable:
The graph is a grid.
There are at most 24 dirty squares.
I like your instinct to use A* here. It's often good for solving find-the-minimum-number-of-moves problems. However, A* requires a fair amount of code. I think you'd be better off going with a Branch-and-Bound approach (sometimes called Branch-and-Prune), which should be almost as efficient but is much easier to implement.
The idea is to start enumerating all possible solutions using a depth-first-search, like so:
# Each list represents a sequence of dirty nodes.
[]
[1]
[1, 2]
[1, 2, 3]
[1, 3]
[1, 3, 2]
[2]
[2, 1]
[2, 1, 3]
Every time you're about to recurse into a branch, check to see if that branch is more expensive than the cheapest solution found so far. If so, you can skip the whole branch.
If that's not efficient enough, add a function to calculate a lower bound on the remaining cost. Then if cost([2]) + lower_bound(set([1, 3])) is more expensive than the cheapest solution found so far, you can skip the whole branch. The tighter lower_bound() is, the more branches you can skip.
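A hedged sketch of that search, reusing the n, distances and first_move_distances defined above (best_cost, best_order, search and this trivial lower_bound are my own names and choices):

import math

best_cost = math.inf
best_order = None

def lower_bound(remaining):
    # Trivial bound: reaching each remaining dirty square takes at least
    # one move. A tighter bound would prune more branches.
    return len(remaining)

def search(order, cost_so_far, remaining):
    global best_cost, best_order
    if not remaining:
        if cost_so_far < best_cost:
            best_cost, best_order = cost_so_far, list(order)
        return
    if cost_so_far + lower_bound(remaining) >= best_cost:
        return  # this whole branch cannot beat the cheapest solution so far
    for i in sorted(remaining):
        step = first_move_distances[i] if not order else distances[order[-1]][i]
        order.append(i)
        search(order, cost_so_far + step, remaining - {i})
        order.pop()

search([], 0, set(range(n)))
print(best_order, best_cost)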
Let's say V = {v | v = b or v = d}, and build the complete graph G(V, E). You can calculate the cost of each edge in E with a time complexity of O(n^2). Afterwards the problem becomes exactly the same as: start at a specified vertex and find a shortest path of G which covers V.
This has been called the Travelling Salesman Problem (TSP) since 1832.
The problem can certainly be stored as a graph. The cost between nodes (dirty cells) is their Manhattan distance. Ignore the cost of cleaning cells, because that total cost will be the same no matter what path is taken.
This problem looks to me like the Minimum Rectilinear Steiner Tree problem. Unfortunately, that problem is NP-hard, so you'll need to come up with an approximation (a minimum spanning tree based on Manhattan distance), if I am correct.
The situation is as follows:
There are N arrays.
In each array (0..N-1) there are (x,y) tuples (cartesian coordinates) stored
The length of each array can be different
I want to extract the subset of coordinate combinations which make up a complete rectangle of size N. In other words: all the cartesian coordinates are adjacent to each other.
Example:
findRectangles({
{*(1,1), (3,5), (6,9)},
{(9,4), *(2,2), (5,5)},
{(5,1)},
{*(1,2), (3,6)},
{*(2,1), (3,3)}
})
yields the following:
[(1,1),(1,2),(2,1),(2,2)],
...,
...(other solutions)...
No two points can come from the same set.
I first just calculated the cartesian product, but this quickly becomes infeasible (my current use-case has 18 arrays of points, each containing roughly 10 different coordinates).
You can use hashing to great effect:
hash each point (keeping track of which list it is in)
for each pair of points (a,b) and (c,d):
if (a,d) exists in another list, and (c,b) exists in yet another list:
yield rectangle(...)
When I say exists, I mean do something like:
hashesToPoints = {}
for p in points:
    hashesToPoints.setdefault(hash(p), set()).add(p)

for p1 in points:
    for p2 in points:
        p3, p4 = mixCoordinates(p1, p2)
        if p3 in hashesToPoints[hash(p3)] and {{p3 doesn't share a bin with p1,p2}}:
            if p4 in hashesToPoints[hash(p4)] and {{p4 doesn't share a bin with p1,p2,p3}}:
                yield Rectangle(p1, p2)
This is O(#bins^2 * items_per_bin^2) ~ 30,000, which is downright speedy in your case of 18 arrays and 10 items per bin -- much better than the outer-product approach, which is... much worse, at O(items_per_bin^#bins) ~ 3 trillion. =)
minor sidenote:
You can reduce both the base and exponent in your computation by making multiple passes of "pruning". e.g.
remove each point that is not corectilinear with another point in the X or Y direction
then maybe remove each point that is not corectilinear with 2 other points, in both X and Y direction
You can do this by sorting according to the X-coordinate, repeat for the Y-coordinate, in O(P log(P)) time in terms of number of points. You may be able to do this at the same time as the hashing too. If a bad guy is arranging your input, he can make this optimization not work at all. But depending on your distribution you may see significant speedup.
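For instance, here is a hedged sketch of one reading of the first pruning pass (prune is a made-up helper; points is a flat list of (x, y) tuples):

from collections import Counter

def prune(points):
    # A rectangle corner needs a partner sharing its x AND a partner
    # sharing its y, so drop points lacking either. Repeat the pass
    # until no more points are removed for the full effect.
    xs = Counter(x for x, y in points)
    ys = Counter(y for x, y in points)
    return [(x, y) for x, y in points if xs[x] > 1 and ys[y] > 1]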
Let XY be your set of arrays. Construct two new sets X and Y, where X equals XY with all arrays sorted to x-coordinate and Y equals XY with all arrays sorted to y-coordinate.
For each point (x0,y0) in any of the arrays in X: find every point (x0,y1) with the same x-coordinate and a different y-coordinate in the remaining arrays from X
For each such pair of points (if it exists): search Y for points (x1,y0) and (x1,y1)
Let C be the size of the largest array. Then sorting all sets takes time O(N*C*log(C)). In step 1, finding a single matching point takes time O(N*log(C)) since all arrays in X are sorted. Finding all such points is in O(C*N), since there are at most C*N points overall. Step 2 takes time O(N*log(C)) since Y is sorted.
Hence, the overall asymptotic runtime is in O(C * N^2 * log(C)^2).
For C==10 and N==18, you'll get roughly 10,000 operations. Multiply that by 2, since I dropped that factor due to Big-O notation.
The solution has the further benefit of being extremely simple to implement. All you need is arrays, sorting and binary search; the first two are very likely built into the language already, and binary search is extremely simple to write.
Also note that this is the runtime in the worst case where all rectangles start at the same x-coordinate. In the average case, you'll probably do much better than this.
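For illustration, a hedged sketch of the binary-search lookup from step 1 (points_with_same_x is a made-up name; each array is a list of (x, y) tuples sorted by x, as in X):

import bisect

def points_with_same_x(arrays, skip, x0, y0):
    """Yield (array_index, point) for every point (x0, y1) with y1 != y0
    found in an array other than arrays[skip]."""
    for k, arr in enumerate(arrays):
        if k == skip:
            continue
        i = bisect.bisect_left(arr, (x0,))       # binary search: first point with x >= x0
        while i < len(arr) and arr[i][0] == x0:  # walk the run of equal x
            if arr[i][1] != y0:
                yield k, arr[i]
            i += 1

arrays = [sorted([(1, 1), (3, 5), (6, 9)]),
          sorted([(9, 4), (2, 2), (5, 5)]),
          sorted([(1, 2), (3, 6)])]
print(list(points_with_same_x(arrays, 0, 1, 1)))  # [(2, (1, 2))]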