Server Side Google Markers Clustering - Python/Django

After experimenting with a client-side approach to clustering large numbers of Google markers, I decided it won't be possible for my project (a social network with 28,000+ users).
Are there any examples of clustering the coordinates on the server side - preferably in Python/Django?
The way I would like this to work is to gradually index the markers based on their proximity (radius) and zoom level.
In other words, when a new user registers, he/she is automatically assigned to a certain 'group' of markers that are close to each other, thus incrementing that group's counter. What gets sent to the client is just a small number of 'groups'. Only when the map's zoom level/scale is 1:1 are actual users shown on the map.
That way the client side will have to deal only with 10-50 markers per request/zoom level.

This is a paid service that uses server-side clustering, but I'm not sure how it works. I'm guessing that they just use your data to generate the markers to be shown at each zoom level.
Update: This tutorial demonstrates a basic server-side clustering function. It's written in PHP for the Static Maps API, but you could use it as a starting point.

You might want to take a look at the DBSCAN and OPTICS pages on Wikipedia; these look very suitable for clustering places on a map. There is also a page about Cluster Analysis that shows all the possible algorithms you can use; most would be trivial to implement in the language of your choice.
With 28k+ points, you might want to skip Django and jump into C/C++ directly, and you certainly shouldn't expect this to be calculated in real time in response to web requests.

One way to do it would be to define a grid with a unit size based on the zoom level. You then collect all the items within a grid cell by rounding their lat,lon to one decimal place: for example, a point at 42.2003x73.4021 falls into the cell 42.2x73.4, between the neighbouring cells at 42.2x73.3 and 42.2x73.5.
If there are one or more points in a grid cell, you place a marker in the center of that cell.
You then hook up the zoomend event and change your grid size accordingly, and redraw the markers.
http://code.google.com/apis/maps/documentation/reference.html#GMap2.zoomend
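A minimal Python sketch of that scheme (the function name and the returned dict shape are my own; on zoomend you would recompute with a smaller cell_size and redraw):

    from collections import defaultdict

    def grid_cluster(points, cell_size):
        """points: iterable of (lat, lon); cell_size: degrees per cell,
        e.g. 0.1 for the one-decimal-place grid described above."""
        cells = defaultdict(list)
        for lat, lon in points:
            # floor-divide to a cell index; nearby points share a key
            cells[(lat // cell_size, lon // cell_size)].append((lat, lon))
        markers = []
        for (ci, cj), members in cells.items():
            markers.append({
                "lat": (ci + 0.5) * cell_size,  # center of the cell
                "lon": (cj + 0.5) * cell_size,
                "count": len(members),
            })
        return markers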

You can try my server-side clustering Django app:
https://github.com/biodiv/anycluster
It provides k-means and grid-based clustering.
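This is not anycluster's actual API, but if you just want to prototype the k-means variant, a minimal sketch with scipy (the coords array and marker dict are illustrative):

    import numpy as np
    from scipy.cluster.vq import kmeans2

    # coords: (n, 2) array of lat/lon pairs for all users
    coords = np.array([[52.52, 13.40], [52.53, 13.41], [48.85, 2.35], [48.86, 2.34]])

    # cluster into k groups; each centroid becomes one map marker
    centroids, labels = kmeans2(coords, 2, minit='points')
    for i, (lat, lon) in enumerate(centroids):
        print({"lat": lat, "lon": lon, "count": int(np.sum(labels == i))})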

Related

Compressing big GeoJSON/Shapefile datasets for viewing in a web browser

So I have a shapefile that is 3GB in size and, as you can imagine, my browser doesn't like it. How can I compress the data I have, which is either in lon/lat coordinates or points on an X,Y grid?
I saw a video on Computerphile about Discrete Cosine Transforms for reducing high-dimensionality data, but being a programmer and not a mathematician I don't know if this is even possible. I have tried taking a point every 10 steps in the file, like so: map[0:100000:10], but this had an undesirable and very lossy effect.
I would ideally like to have my data work like Google Earth, where the resolution adjusts to your viewport altitude. So when you zoom in, higher-frequency data is presented in the viewport, limiting the number of points, but I don't know how they do this, and Google returns nothing of value.
The last point is: since these are just vectors, is there any type of vector compression I could use? I'm not too great at math, so as you can imagine, when I look into this I get confused fairly quickly. I understand SciPy has some DCT functionality built in, and I know it has a whole bunch of other features which I don't understand; perhaps I could use this?
I can answer the "level of detail" part: you can experiment with Leaflet (a JavaScript mapping library). You could then define a "coarse" layer which is displayed at low zoom levels and "high detail" layers that are only displayed at higher zoom levels. You probably need to capture the map's zoomend event and load/unload your layers from there.
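To pre-generate those coarse layers (rather than taking every 10th point, which is what made map[0:100000:10] so lossy), a hedged sketch using geopandas' Douglas-Peucker simplification; the file names and tolerance values are placeholders to tune:

    import geopandas as gpd

    gdf = gpd.read_file("data.shp")  # placeholder path to the 3GB shapefile

    # tolerance is in the data's coordinate units (degrees here); a larger
    # tolerance drops more vertices, giving a smaller file for low zooms
    for name, tolerance in [("coarse", 0.1), ("medium", 0.01), ("detail", 0.001)]:
        out = gdf.copy()
        out["geometry"] = gdf.geometry.simplify(tolerance, preserve_topology=True)
        out.to_file(f"layer_{name}.geojson", driver="GeoJSON")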
One solution to this problem is to use a Web Map Service (WMS) like GeoServer or MapServer that stores your shapefile (though a spatial database like PostGIS would be better) on the server and sends a rendered image (often broken down into cacheable tiles) to the browser.

How to analyse/calculate circumference of human body parts from point cloud or 3d objects?

I am using Win10, Python and C#. I want to calculate the circumference of human body parts (belly, biceps, etc.) using point clouds or 3D scans like .stl, .obj, .ply. I can currently get a point cloud of the human body with Kinect v2, and I have point clouds and scanned 3D human bodies in .stl, .obj and .ply formats.
I need some ideas and info about this. I don't know how to analyse what I have or how to calculate what I want.
Here I found an example of what I am trying to do, but it doesn't need to be perfectly stable like that; it's for a school homework assignment. Maybe you can give me some ideas about how to achieve my goal. Thank you for your help.
https://www.youtube.com/watch?time_continue=48&v=jOvaZGloNRo
I get the 3D scanned object with Kinect v2 and use PCL to convert it into a point cloud.
I don't know about using PCL with Python or C#. In general you are looking at the following steps:
Filtering the points down to the region of interest
Segmenting the shape
Extracting the parameters
If you're interested in Python only, then OpenCV might be the best option. You can also develop the core logic in C++ and wrap it for Python or C#. C++ also has some nice UI libraries (Qt, nanogui). Please see the following details for achieving the objective with PCL.
Filtering
CropBox or PassThrough can be used for this. It will give results similar to those shown in the image, assuming the frame has been chosen properly. If not, the point cloud can easily be transformed.
Segmenting the shape
Assuming you want an average circumference, you might need to experiment with the Circle2D, Circle3D and Cylinder models. More details regarding usage and the API are here. The method chosen can be simple SAC (Sample Consensus) like RANSAC (Random SAC) or an advanced method like LMedS (Least Median of Squares) or MLESAC (Maximum Likelihood Estimation SAC).
Extracting the parameters
All models have a radius field which can be used to find the circumference using the standard formula (2*pi*r).
Disclaimer: note that the fitted shape is a circle, not an ellipse, and the cylinders are right cylinders. So if the measured object (arm or bicep) is not circular, the computed value might not be close to ground truth in extreme cases.
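If PCL proves awkward to use from Python, the same three steps can be prototyped in plain numpy. This is a simple algebraic (Kasa) circle fit rather than PCL's sample-consensus models, and the function name and tolerance are illustrative:

    import numpy as np

    def circumference_at_height(points, z_center, z_tol=0.01):
        """points: (n, 3) array. Fit a circle to the horizontal slice
        |z - z_center| < z_tol and return its circumference."""
        # 1. Filtering: keep only the points near the chosen height
        sl = points[np.abs(points[:, 2] - z_center) < z_tol]
        x, y = sl[:, 0], sl[:, 1]

        # 2. Fit: x^2 + y^2 = 2*cx*x + 2*cy*y + d is linear in (cx, cy, d)
        A = np.column_stack([2 * x, 2 * y, np.ones_like(x)])
        (cx, cy, d), *_ = np.linalg.lstsq(A, x**2 + y**2, rcond=None)
        r = np.sqrt(d + cx**2 + cy**2)

        # 3. Parameter extraction: the standard formula
        return 2 * np.pi * r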

Package to vizualize clustering in python or Java?

I am doing an agent based modeling and currently have this set up in Python, but I can switch over to Java if necessary.
I have a dataset on Twitter (11 million nodes and 85 million directed edges), and I have set up a dictionary/hashmap so that the key is a specific user A and its value is a list of all the followers (people that follow user A). The "nodes" are actually just the integer ID numbers (unique), and there is no other data. I want to be able to visualize this data through some method of clustering. Not all individual nodes have to be visualized, but I want the nodes with the n most followers to be visualized clearly, and the surrounding area around that node would represent all the people who follow it. I'm modeling the spread of something throughout the map, so I need the nodes and areas around the nodes to change colors. Ideally, it would be a continuous visualization, but I don't mind it just taking snapshots at every ith iteration.
Additionally, I was thinking of having the clusters be separated such that:
if person A and person B have enough followers to be visualized individually, and person A and B are connected (one follows the other or maybe even both ways), then they are both visualized, but are visually separated from each other despite being connected so that the visualization is clearer.
Anyway, I was wondering whether there is a package in Python (preferably) or Java that would allow one to do this semi-easily.
Gephi has a very nice GUI and an associated Java toolkit. You can experiment with visual layout in the GUI until you have everything looking the way you like and then code up your own version using the toolkit.
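If you end up staying in Python instead, here is a hedged networkx/matplotlib sketch of the snapshot approach; note that networkx will not cope with 11 million nodes, so you would subsample or aggregate first (the followers dict below is a toy stand-in for yours):

    import networkx as nx
    import matplotlib.pyplot as plt

    followers = {1: [2, 3, 5], 2: [3], 4: [1, 2]}  # user -> list of followers

    G = nx.DiGraph()
    G.add_edges_from((f, user) for user, fs in followers.items() for f in fs)

    # highlight the n most-followed users; everyone else is background
    top = set(sorted(G.nodes, key=G.in_degree, reverse=True)[:2])
    colors = ["red" if n in top else "lightgray" for n in G.nodes]

    pos = nx.spring_layout(G, seed=7)   # fixed layout so snapshots line up
    nx.draw(G, pos, node_color=colors, with_labels=True)
    plt.savefig("snapshot_000.png")     # one file per ith iteration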

How to use R-Tree for plotting large number of map markers on google maps

After searching SO and multiple articles I haven't found a solution to my problem.
What I am trying to achieve is to load 20,000 markers on Google Maps.
R-Tree seems like a good approach but it's only helpful when searching for points within the visible part of the map. When the map is zoomed out it will return all of the points and...crash the browser.
There is also the problem with dragging the map and at the end of dragging re-running the query.
I would like to know how I can use an R-Tree and still achieve all of the above.
As noted, R-Tree won't help you when you're looking at a zoomed-out view. This problem is often addressed by marker clustering, because showing 20,000 points in a browser window isn't that useful.
Marker Manager is an open-source JavaScript library which addresses this, but there are others.
With a very great number of markers, you may need to look at server-side clustering (where the R-Tree may come in handy!). Here is one discussion of it, and its Google cache, because the link is dead at the time of writing.
If you don't want to bother with clustering, then just truncate your marker list at a preset number, maybe a few hundred (which you can determine by usability testing), and display some indication that more are available as the user zooms in.
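For the viewport-query-plus-cap approach, a hedged sketch with the Python rtree package (the markers list, cap, and function name are mine):

    from itertools import islice
    from rtree import index

    markers = [(40.71, -74.00), (51.51, -0.13), (48.86, 2.35)]  # (lat, lon)

    idx = index.Index()
    for i, (lat, lon) in enumerate(markers):
        idx.insert(i, (lon, lat, lon, lat))  # a point is a degenerate box

    def viewport_markers(west, south, east, north, cap=300):
        hits = idx.intersection((west, south, east, north))
        shown = [markers[i] for i in islice(hits, cap + 1)]
        more = len(shown) > cap  # tell the client to show "zoom in for more"
        return shown[:cap], more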

Distance by sea calculator, intermediate coordinates?

How do I calculate distance between 2 coordinates by sea? I also want to be able to draw a route between the two coordinates.
The only solution I have found so far is to split a map into pixels, identify each pixel as LAND or SEA, and then find the path using the A* algorithm. Then transform the pixels back into coordinates.
There are some software packages I could buy but none have online extensions. A service that calculates distances between sea ports and plots the path on a map is searates.com
Beware of the fact that maps can distort distances. For example, in a Mercator projection, segments far away from the equator represent less actual distance than segments of equal length near the equator. If you just assign a uniform cost to your pixels/squares/etc., you will end up with non-optimal routing and erroneous distance calculations.
If you project a grid on your map (pixels being just one particular grid out of many possible ones) and search for the optimal path using A*, all you need to do to get the search algorithm to behave properly is set the edge weight according to the real distance along the surface of the sphere (earth) and not the distance on the map.
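Concretely, that means the A* step cost (and the heuristic) should come from great-circle distance, e.g. the haversine formula, rather than a constant per cell. A sketch, where cell_center is assumed to map a grid cell to its lat/lon:

    import math

    EARTH_RADIUS_KM = 6371.0

    def haversine_km(lat1, lon1, lat2, lon2):
        """Great-circle distance between two lat/lon points, in km."""
        p1, p2 = math.radians(lat1), math.radians(lat2)
        dp = p2 - p1
        dl = math.radians(lon2 - lon1)
        a = math.sin(dp / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dl / 2) ** 2
        return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))

    # in the A* loop, instead of `g + 1` per step:
    #   g_new = g + haversine_km(*cell_center(current), *cell_center(neighbor))
    # the same function also gives an admissible heuristic h(n, goal).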
Beware that simply saying "sea or not-sea" is not enough to determine navigability. There are also issues of depth, traffic routing (e.g. shipping traffic through the English Channel is split into lanes) and political considerations (territorial waters etc.). You also want to add routes manually for canals that are too small to show up on the map (Panama, Suez) and adjust their cost to cover any overhead incurred.
Pretty much you'll need to split the sea into pixels and run something like A*. You could optimize it a bit by coalescing contiguous pixels into larger areas, but keeping everything square will probably make the search easier. The search would no longer be Manhattan-style, but with large enough squares the additional connection-decision time would be more than made up for.
Alternatively, you could iteratively "grow" polygons from all of your ports, building up convex polygons (so that any point within the polygon is reachable from any other without going outside; you want to avoid the Pac-Man shape, for instance), although this is a refinement/complication/optimization of the "squares" approach I first mentioned. The key is that once you're in an area, you know you can get to anywhere else in that area.
I don't know if this helps, sorry. It's been a long day. Good luck, though. It sounds like a fun problem!
Edit: Forgot to mention, you could also preprocess your area into a quadtree. That is, take your entire map and split it in half vertically and horizontally (you don't need to do both splits at the same time, and if you want to spend some time making "better" splits, you can do that later), and do that recursively until each node is entirely land or sea. From this you can trivially make a network of connections (just connect neighboring leaves), and the A* should be easy enough to implement from there. This'll probably be the easiest way to implement my first suggestion anyway. :)
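A rough sketch of that recursive split (dict-based for brevity; grid[y][x] is assumed True for sea, and connecting neighbouring leaves for A* is left out):

    def build_quadtree(grid, x0, y0, x1, y1):
        """Split the region [x0,x1) x [y0,y1) until each node is
        entirely land or entirely sea."""
        cells = [grid[y][x] for y in range(y0, y1) for x in range(x0, x1)]
        if all(cells) or not any(cells):
            # uniform region: a leaf, marked land or sea
            return {"sea": cells[0], "bounds": (x0, y0, x1, y1)}
        mx, my = (x0 + x1 + 1) // 2, (y0 + y1 + 1) // 2
        quads = [(x0, y0, mx, my), (mx, y0, x1, my),
                 (x0, my, mx, y1), (mx, my, x1, y1)]
        return {"bounds": (x0, y0, x1, y1),
                "children": [build_quadtree(grid, *q) for q in quads
                             if q[2] > q[0] and q[3] > q[1]]}  # skip empty splits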
I reached a satisfactory solution. It is along the lines of what you suggested and what I had in mind initially, but it took me a while to figure out the software and GIS concepts; I am a GIS newbie. If someone bumps into something similar again, here's my setup: PostGIS on PostgreSQL, maps from Natural Earth, the GIS editing software QGIS and OpenJUMP, and routing algorithms from pgRouting.
The Natural Earth maps needed some processing to be useful: I joined the marine polygons and the rivers to be able to get accurate paths to the most inland points. Then I used the 1-degree graticules to get paths from one continent to another (I need to find a more elegant solution than this because some paths look like chessboard patterns). All these operations can be done from the command line using PostGIS, but I found it easier to use the desktop software (next, next). An alternative to the Natural Earth maps might be OpenStreetMap, but the planet.osm dump is around 200GB and that discouraged me.
I think this setup also solves the distance-accuracy problem: PostGIS takes into account the Earth's actual shape, so distances should be pretty accurate.
I still need to do some testing and fine-tuning, but I can say it can calculate and draw a route between any 2 points on the world's coastlines (no small isolated islands yet) and display the names of the routing points (channels, seas, rivers, oceans).
