Finding intersections - python

Given a scenario with millions of potentially overlapping bounding boxes of variable sizes, each less than 5 km in width:
Create a fast function findIntersections(Longitude, Latitude, Radius) whose output is a list of the IDs of those bounding boxes whose origin lies inside the circle described by the arguments.
How do I solve this problem elegantly?

This is normally done using an R-tree data structure.
Databases such as MySQL and PostgreSQL have GIS modules that use an R-tree under the hood to quickly retrieve locations within a certain proximity to a point on a map.
From http://en.wikipedia.org/wiki/R-tree:
R-trees are tree data structures that are similar to B-trees, but are used for spatial access methods, i.e., for indexing multi-dimensional information; for example, the (X, Y) coordinates of geographical data. A common real-world usage for an R-tree might be: "Find all museums within 2 kilometres (1.2 mi) of my current location".
The data structure splits space with hierarchically nested, and possibly overlapping, minimum bounding rectangles (MBRs, otherwise known as bounding boxes, i.e. "rectangle", which is what the "R" in R-tree stands for).
The Priority R-Tree (PR-tree) is a variant that has a worst-case running time of "O((N/B)^(1-1/d) + T/B) I/Os, where N is the number of d-dimensional (hyper-)rectangles stored in the R-tree, B is the disk block size, and T is the output size."
In practice most real-world queries will have a much quicker average case run time.
FYI, in addition to the other great code posted, there's some cool stuff like SpatiaLite and the SQLite R-tree module.
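Without pulling in a database, the pruning idea behind an R-tree can be sketched in pure Python with a coarse grid index standing in for the tree (all names, the cell size, and the use of degrees as the radius unit are assumptions for illustration; a real implementation would use a proper R-tree library or PostGIS, and metric distances):

```python
import math

CELL_DEG = 0.05  # grid cell size in degrees (assumption: boxes are < 5 km wide)

class GridIndex:
    """Coarse grid standing in for an R-tree: buckets box origins by cell,
    so a query only inspects the cells the query circle can touch."""
    def __init__(self):
        self.cells = {}  # (cx, cy) -> list of (box_id, lon, lat)

    def insert(self, box_id, lon, lat):
        key = (int(lon // CELL_DEG), int(lat // CELL_DEG))
        self.cells.setdefault(key, []).append((box_id, lon, lat))

    def find_intersections(self, lon, lat, radius_deg):
        """IDs of boxes whose origin lies within radius_deg of (lon, lat)."""
        hits = []
        span = int(math.ceil(radius_deg / CELL_DEG))
        cx, cy = int(lon // CELL_DEG), int(lat // CELL_DEG)
        for gx in range(cx - span, cx + span + 1):
            for gy in range(cy - span, cy + span + 1):
                for box_id, bx, by in self.cells.get((gx, gy), ()):
                    if math.hypot(bx - lon, by - lat) <= radius_deg:
                        hits.append(box_id)
        return hits
```

A real R-tree replaces the flat grid with hierarchically nested MBRs, but the query pattern (prune by region, then test candidates exactly) is the same.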

PostGIS is an open-source GIS extension for PostgreSQL.
It provides the ST_Intersects and ST_Intersection functions.
If you're interested, you can dig around and see how they're implemented there:
http://svn.osgeo.org/postgis/trunk/postgis/

GiST seems like a better, more general approach:
http://en.wikipedia.org/wiki/GiST

Related

Irregular Polygon Sub-division?

I have a problem in which I have an irregular, polygon-shaped container that needs to be sub-divided so that it contains certain smaller rectangles of specific dimensions. I previously tried addressing this with a bin-packing approach, which turned out to be too time-consuming. As a result, I want to try sub-division-based approaches, where the algorithm ensures that the sub-division contains the required smaller rectangles while the remaining space can be considered empty. Any suggestions on fast algorithms for generating these sub-divisions, preferably using the shapely library?
The irregular polygon container would look something like the diagram below. The edges of the container are always at right angles, and it is known that the items to be packed all fit into the given container with free space left over. The constraints are the same as in regular bin packing: items must not overlap, and all items must lie completely within the container. Items can be rotated (preferably at right angles).

Python. Data structure for storing objects coordinates on map

I'm developing a game server. I want a data structure for storing the 3D coordinates (x, y, z) of the objects on the map. These objects can dynamically change their position (move, teleport, be destroyed, etc.). I want to store these coordinates in a specific order according to their visual position on the map, so I think a graph might be a good choice, but I'm not sure.
Please note that I need to store real-time coordinates for many objects (about 4-5 thousand). Is there a library or algorithm that implements what I need?
You want a spatial structure like an octree (for 3D spaces) or a quadtree (for 2D spaces).
They partition their contents into buckets that group neighboring coordinates together. This makes checks around a coordinate quick, because only one bucket of coordinates (or, for larger checks, the buckets immediately surrounding that coordinate's bucket) needs to be examined. This can greatly limit the number of objects that have to be checked.
These structures are commonly used for collision detection in games.
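A minimal point quadtree sketch showing the bucketing idea (class and method names, and the bucket capacity, are my own; the same scheme extends to an octree by splitting into eight cubes instead of four squares):

```python
class Quadtree:
    """Minimal point quadtree: each node holds up to CAPACITY points,
    then splits its square region into four child quadrants."""
    CAPACITY = 4

    def __init__(self, x, y, half):
        self.x, self.y, self.half = x, y, half  # center and half-width
        self.points = []                        # (px, py, payload)
        self.children = None                    # four subtrees after a split

    def _contains(self, px, py):
        return abs(px - self.x) <= self.half and abs(py - self.y) <= self.half

    def insert(self, px, py, payload):
        if not self._contains(px, py):
            return False
        if self.children is None:
            if len(self.points) < self.CAPACITY:
                self.points.append((px, py, payload))
                return True
            self._split()
        return any(c.insert(px, py, payload) for c in self.children)

    def _split(self):
        h = self.half / 2
        self.children = [Quadtree(self.x + dx, self.y + dy, h)
                         for dx in (-h, h) for dy in (-h, h)]
        old, self.points = self.points, []
        for px, py, payload in old:  # redistribute into the new quadrants
            any(c.insert(px, py, payload) for c in self.children)

    def query(self, qx, qy, r):
        """Payloads of all points in the square of half-width r around (qx, qy)."""
        if abs(qx - self.x) > self.half + r or abs(qy - self.y) > self.half + r:
            return []  # the query box cannot overlap this node's region at all
        out = [p for (px, py, p) in self.points
               if abs(px - qx) <= r and abs(py - qy) <= r]
        if self.children:
            for c in self.children:
                out.extend(c.query(qx, qy, r))
        return out
```

The region check at the top of `query` is what prunes whole buckets of objects without ever touching the points inside them.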

Get n largest regions from binary image

I am given a large binary image (every pixel is either 1 or 0).
I know that in that image there are multiple regions (a region is defined as a set of neighboring 1s which are enclosed by 0s).
The goal is to find the largest one (in terms of pixel count or enclosed area; either would work for me for now).
My current planned approach is to:
start with an array of arrays of coordinates of the 1s (or 0s, whatever represents a 'hit')
until no more steps can be made:
for the current region (which is a set of coordinates) do:
see if any region interfaces with the current region; if yes, merge them together, if no, continue with the next iteration
My question is: is there a more efficient way of doing this, and are there already tested (bonus points for parallel or GPU-accelerated) implementations out there (in any of the big libraries) ?
You could flood-fill every region with a unique ID, mapping each ID to the size of its region.
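A pure-Python sketch of that idea, using an iterative BFS flood fill so deep regions don't blow the recursion limit (function names are my own; library implementations like OpenCV's are far faster):

```python
from collections import deque

def label_regions(img):
    """Flood-fill each 4-connected region of 1s with a unique ID.
    Returns (label_image, {region_id: pixel_count})."""
    h, w = len(img), len(img[0])
    labels = [[0] * w for _ in range(h)]
    sizes = {}
    next_id = 0
    for sy in range(h):
        for sx in range(w):
            if img[sy][sx] == 1 and labels[sy][sx] == 0:
                next_id += 1
                labels[sy][sx] = next_id
                queue, count = deque([(sy, sx)]), 0
                while queue:  # BFS over the region's 4-neighbors
                    y, x = queue.popleft()
                    count += 1
                    for ny, nx in ((y-1, x), (y+1, x), (y, x-1), (y, x+1)):
                        if (0 <= ny < h and 0 <= nx < w
                                and img[ny][nx] == 1 and labels[ny][nx] == 0):
                            labels[ny][nx] = next_id
                            queue.append((ny, nx))
                sizes[next_id] = count
    return labels, sizes

def largest_region(img):
    _, sizes = label_regions(img)
    return max(sizes, key=sizes.get) if sizes else None
```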
You want to use connected component analysis (a.k.a. labeling). It is more or less what you suggest doing, but there are extremely efficient algorithms out there. Answers to this question explain some of the algorithms. See also connected-components.
This library collects different efficient algorithms and compares them.
From within Python, you probably want to use OpenCV. cv.connectedComponentsWithStats does connected component analysis and outputs statistics, among other things the area for each connected component.
Regarding your suggestion: using coordinates of pixels rather than the original image matrix directly is highly inefficient. Looking for neighbor pixels in an image is trivial; looking for the same in a list of coordinates requires expensive searches.

python: sorting two lists of polygons for intersections

I have two big lists of polygons.
Using python, I want to take each polygon in list 1, and find the results of its geometric intersection with the polygons in list 2 (I'm using shapely to do this).
So for polygon i in list 1, there may be several polygons in list 2 that would intersect with it.
The problem is that both lists are big, and if I simply nest two loops and run the intersection command for every possible pair of polygons, it takes a really long time. I'm not sure whether preceding the intersection with a boolean test would speed this up significantly (e.g. if intersects: return intersection).
What would be a good way to sort or organize these two lists of polygons to make the intersection tests more efficient? Is there a sorting algorithm appropriate to this situation that I could implement in Python?
I am relatively new to programming and have no background in discrete mathematics, so if you know of an existing algorithm that I should use (which I assume exists for these kinds of situations), please link to it or give some explanation that could help me actually implement it in Python.
Also, if there's a better StackExchange site for this question, let me know. I feel like it kind of bridges general python programming, gis, and geometry, so I wasn't really sure.
Quadtrees are often used for the purpose of narrowing down the sets of polygons that need to be checked against each other - two polygons only need to be checked against each other if they both occupy at least one of the same regions in the quadtree. How deep you make your quadtree (in the case of polygons, as opposed to points) is up to you.
Even just dividing your space up into smaller constant-size areas would speed up the intersection detection (if your polygons are small and sparse enough). You make a grid and mark each polygon as belonging to some cells in the grid, then find cells that contain more than one polygon and run the intersection calculations for those polygons only. This optimization is the easiest to code, but the least effective. The second easiest and more effective way would be quadtrees. Then there are BSP trees, KD-trees, and BVH trees, which are probably the most effective, but the hardest to code.
Edit:
Another optimization would be the following: find the left-most and right-most vertices of each polygon and put them in a list. Sort the list, then sweep it from left to right to find polygons whose bounding boxes overlap in x, and run the intersection calculations only for those polygons.
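That sweep can be sketched as follows: sort polygons by the left edge of their bounding box, keep a list of "active" boxes whose x-interval is still open, and report pairs that also overlap in y. All names here are my own; in a real pipeline each candidate pair would then go to shapely's intersects/intersection for the exact geometric test:

```python
def candidate_pairs(boxes):
    """boxes: {poly_id: (minx, miny, maxx, maxy)}.
    Sweep left to right over x and return id pairs whose bounding boxes
    overlap on both axes -- only these need a real intersection test."""
    order = sorted(boxes, key=lambda pid: boxes[pid][0])  # by left edge
    active, pairs = [], []
    for pid in order:
        minx, miny, maxx, maxy = boxes[pid]
        # drop boxes whose right edge is left of the sweep line
        active = [a for a in active if boxes[a][2] >= minx]
        for a in active:
            # x-intervals overlap by construction; check y as well
            if boxes[a][1] <= maxy and miny <= boxes[a][3]:
                pairs.append((a, pid))
        active.append(pid)
    return pairs
```

Compared with the nested double loop, this only pays the expensive polygon-vs-polygon test for pairs whose bounding boxes actually overlap.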

Distance by sea calculator, intermediate coordinates?

How do I calculate distance between 2 coordinates by sea? I also want to be able to draw a route between the two coordinates.
The only solution I've found so far is to split a map into pixels, identify each pixel as LAND or SEA, find the path using the A* algorithm, and then transform the pixels back to coordinates.
There are some software packages I could buy but none have online extensions. A service that calculates distances between sea ports and plots the path on a map is searates.com
Beware of the fact that maps can distort distances. For example, in a Mercator projection, segments far away from the equator represent less actual distance than segments of equal length near the equator. If you just assign uniform cost to your pixels/squares/etc., you will end up with non-optimal routing and erroneous distance calculations.
If you project a grid on your map (pixels being just one particular grid out of many possible ones) and search for the optimal path using A*, all you need to do to get the search algorithm to behave properly is set the edge weight according to the real distance along the surface of the sphere (earth) and not the distance on the map.
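Setting edge weights to the real distance along the surface of the sphere comes down to the haversine formula; a minimal sketch (function name and the mean-radius constant are my own choices):

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two (lat, lon) points in degrees."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))
```

A one-degree step along a meridian is about 111 km anywhere, while a one-degree step along a parallel shrinks with latitude, which is exactly the Mercator distortion described above: equal-looking grid edges must get different weights.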
Beware that simply saying "sea or not-sea" is not enough to determine navigability. There are also issues of depth, traffic routing (e.g. shipping traffic through the English Channel is split into lanes) and political considerations (territorial waters etc.). You also want to add routes manually for channels that are too small to show up on the map (Panama, Suez) and adjust their cost to cover any overhead incurred.
Pretty much you'll need to split the sea into pixels and do something like A*. You could optimize it a bit by coalescing contiguous pixels into larger areas, but if you keep everything squares it'll probably make the search easier. The search would no longer be Manhattan-style, but if you had large enough squares, the additional connection decision time would be more than made up for.
Alternatively, you could iteratively "grow" polygons from all of your ports, building up convex polygons (so that any point within the polygon is reachable from any other without going outside, you want to avoid the PacMan shape, for instance), although this is a refinement/complication/optimization of the "squares" approach I first mentioned. The key is that you know once you're in an area that you can get to anywhere else in that area.
I don't know if this helps, sorry. It's been a long day. Good luck, though. It sounds like a fun problem!
Edit: Forgot to mention, you could also preprocess your area into a quadtree. That is, take your entire map and split it in half vertically and horizontally (you don't need to do both splits at the same time, and if you want to spend some time making "better" splits, you can do that later), and do that recursively until each node is entirely land or sea. From this you can trivially make a network of connections (just connect neighboring leaves), and the A* should be easy enough to implement from there. This'll probably be the easiest way to implement my first suggestion anyway. :)
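The grid-search idea above can be sketched with a minimal A* over a sea/land grid (4-connected, unit edge cost; all names are my own, and a real router would use great-circle edge weights plus the navigability constraints mentioned earlier):

```python
import heapq
import itertools

def astar_sea(grid, start, goal):
    """grid: list of strings, '.' = sea, '#' = land.
    Returns the (row, col) cells of a shortest all-sea path, or None."""
    rows, cols = len(grid), len(grid[0])
    tie = itertools.count()  # tiebreaker so the heap never compares cells

    def h(cell):  # Manhattan distance: admissible on a unit-cost 4-grid
        return abs(cell[0] - goal[0]) + abs(cell[1] - goal[1])

    open_heap = [(h(start), next(tie), 0, start, None)]
    came_from, g_cost = {}, {start: 0}
    while open_heap:
        _, _, g, cell, parent = heapq.heappop(open_heap)
        if cell in came_from:
            continue  # already expanded via a cheaper route
        came_from[cell] = parent
        if cell == goal:
            path = []
            while cell is not None:  # walk parents back to the start
                path.append(cell)
                cell = came_from[cell]
            return path[::-1]
        r, c = cell
        for nr, nc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
            if 0 <= nr < rows and 0 <= nc < cols and grid[nr][nc] == '.':
                ng = g + 1
                if ng < g_cost.get((nr, nc), float('inf')):
                    g_cost[(nr, nc)] = ng
                    heapq.heappush(open_heap,
                                   (ng + h((nr, nc)), next(tie), ng, (nr, nc), cell))
    return None  # goal not reachable by sea
```

With the quadtree preprocessing suggested in the edit, the same search runs over leaf regions instead of individual cells, which shrinks the graph dramatically.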
I reached a satisfactory solution. It is along the lines of what you suggested and what I had in mind initially, but it took me a while to figure out the software and GIS concepts (I am a GIS newbie). If someone bumps into something similar, here's my setup: PostGIS for PostgreSQL, maps from Natural Earth, the GIS editing software QGIS and OpenJump, and pgRouting for the routing algorithms.
The Natural Earth maps needed some processing to be useful: I joined the marine polygons and the rivers to be able to get accurate paths to the most inland points. Then I used the 1-degree graticules to get paths from one continent to another (I need to find a more elegant solution than this because some paths look like chessboard squares). All these operations can be done from the command line using PostGIS, but I found it easier to use the desktop software (next, next). An alternative to the Natural Earth maps might be OpenStreetMap, but the planet.osm dump is around 200 GB and that discouraged me.
I think this setup also solves the distance accuracy problem, PostGIS takes into account the Earth's actual form and distances should be pretty accurate.
I still need to do some testing and fine tunings but I can say it can calculate and draw a route from any 2 points on the world's coastlines (no small isolated islands yet) and display the routing points names (channels, seas, rivers, oceans).
