How to determine the projection or coordinate reference system given spatial points - python

I am just getting started with spatial analysis and am stuck at a point.
I have a crime dataset where the points are given in latitude and longitude. I have another dataset (a shapefile of Chicago), and I would like to plot all the lat/long points on top of a map drawn from the shapefile's polygons.
The problem is that the shapefile contains polygon coordinates in a format I cannot identify. I retrieved the shapefile from
https://data.cityofchicago.org/Facilities-Geographic-Boundaries/Boundaries-Neighborhoods/9wp7-iasj
From the above download I use the Neighborhoods_2012b.shp file
Latitude Longitude from crime data:
POINT (-87.680162979 41.998718085)
POINT (-87.746717696 41.934629749)
Polygon shapes in the Chicago shapefile (all values are positive):
POLYGON ((1182322.0429 1876674.730700001, 1182...
POLYGON ((1176452.803199999 1897600.927599996,...
I tried transforming the latitude/longitude coordinates into different Mercator projections (EPSG:3857, EPSG:3395), but these projections give me both positive and negative values.
epsg:3857:
POINT (-9760511.095493518 5160787.421333898)
POINT (-9767919.932699846 5151192.321624438)
I even tried transforming all the lat/long points into UTM (using the Python utm library), which does give me all positive values, but it still doesn't seem to be the right format, as the plots are at very different scales.
Using the utm Python library (utm.from_latlon):
POINT (4649857.621612935 443669.2483944244)
POINT (4642787.870839979 438095.1726599361)
I am not sure how to handle this situation. Is there a way to determine what projection is used, given the spatial points?
I'd be glad for any help.

The prj file says:
PROJCS["NAD_1983_StatePlane_Illinois_East_FIPS_1201_Feet",GEOGCS["GCS_North_American_1983",DATUM["D_North_American_1983",SPHEROID["GRS_1980",6378137.0,298.257222101]],PRIMEM["Greenwich",0.0],UNIT["Degree",0.0174532925199433]],PROJECTION["Transverse_Mercator"],PARAMETER["False_Easting",984250.0],PARAMETER["False_Northing",0.0],PARAMETER["Central_Meridian",-88.33333333333333],PARAMETER["Scale_Factor",0.999975],PARAMETER["Latitude_Of_Origin",36.66666666666666],UNIT["Foot_US",0.3048006096012192]]
I opened the layer in QGIS and it did not use the .prj file directly. However, with the information from the .prj file, you can use the CRS selector to retrieve it: search for NAD83 Illinois East and choose the entry whose unit is feet, as the .prj file suggests. EPSG:6455 is a good one, for instance. I think you now have enough information to continue...
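As a minimal sketch of that workflow (assuming GeoPandas; EPSG:6455 is the code suggested above, though using the shapefile's own CRS, as below, is equivalent and more robust), you would reproject the lat/long points into the shapefile's CRS rather than guess the projection:
import geopandas as gpd
from shapely.geometry import Point

neighborhoods = gpd.read_file("Neighborhoods_2012b.shp")
print(neighborhoods.crs)  # reports the State Plane CRS read from the .prj

crimes = gpd.GeoDataFrame(
    geometry=[Point(-87.680162979, 41.998718085),
              Point(-87.746717696, 41.934629749)],
    crs="EPSG:4326",  # the crime points are plain WGS84 lat/long
)

# Reproject the points into the polygons' CRS so both plot on one axis.
crimes = crimes.to_crs(neighborhoods.crs)
ax = neighborhoods.plot(color="lightgrey")
crimes.plot(ax=ax, color="red")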

Related

Points in Polygons. How can I match them spatially with given coordinates?

I have a dataset of georeferenced Flickr posts (ca. 35k, picture below) and an unrelated dataset of georeferenced polygons (ca. 40k, picture below); both are currently pandas dataframes. The polygons do not cover the entire area where Flickr posts are possible. I am having trouble understanding how to sort many different points into many different polygons (or check whether they are close). In the end I want a map with the points from the Flickr data inside polygons, colored by an attribute (Tag). I am trying to do this in Python. Do you have any ideas or recommendations?
[Images: point dataframe, polygon dataframe]
Since you don't have any sample data to load and play with, my answer will be descriptive, outlining some possible strategies for the problem you are trying to solve.
I assume that these polygons represent something like addresses, and that you essentially want to assign each geolocated Flickr post to the nearest best match among the polygons.
First of all, you need to identify or acquire information on the precision of those Flickr geolocations: how far off could they be, given the numerous sources of error? (The reason behind those errors is not your concern, but the amount of error is.) This gives you a circle of confusion (2D) or, more likely, a sphere of confusion (3D). Why 3D? You might have a Flickr post made from a certain elevation in a high-rise apartment, so all of (x: latitude, y: longitude, z: altitude) may need to be considered. You have to study the data and any other information available to you to determine the best option here (a 2D or 3D space of confusion).
Once you have figured out the type of N-D space of confusion, you will need a distance metric (typically just the distance between two points); call this sigma. To be on the safe side, find all the addresses (geopolygons) within a radius of 1 sigma, and additionally within 2 sigma; these are your candidate target addresses. For each of these addresses, compute the distances from the Flickr geolocation to the polygon's centroid and to the four corners of its rectangular outer bounding box.
You will then want to rank these addresses for each Flickr geolocation, based on the distances to all five points. You will need a way of distinguishing a Flickr point that is far from a big building's center (the distance from the centroid could be much larger than the distances from the corners) but close to its edges, versus a different property with a smaller area footprint.
Each Flickr point would thus have multiple predictions of which polygon it belongs to, each with a different probability (convert the distance-based scores into probabilities).
So, for any Flickr location you choose, you should be able to show the top-k geopolygons that location could belong to, with probabilities.
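As a minimal sketch of the matching step (my own assumption: converting your dataframes to GeoPandas GeoDataFrames; flickr_df, poly_df, and sigma are illustrative names), you could combine an exact containment join with a nearest-polygon fallback:
import geopandas as gpd

points = gpd.GeoDataFrame(
    flickr_df,
    geometry=gpd.points_from_xy(flickr_df["lon"], flickr_df["lat"]),
    crs="EPSG:4326")
polys = gpd.GeoDataFrame(poly_df, geometry="geometry", crs="EPSG:4326")

# Exact containment first ...
matched = gpd.sjoin(points, polys, how="left", predicate="within")

# ... then, for points inside no polygon, fall back to the nearest polygon
# within a tolerance (the 1-2 sigma idea above). For real-world distances,
# reproject both layers to a projected CRS before this step.
unmatched = points[matched["index_right"].isna()]
nearest = gpd.sjoin_nearest(unmatched, polys, max_distance=sigma)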
For visualizations, I would suggest using HoloViews with Datashader, as those are built to handle data at this scale (a small rasterization sketch follows the references below). Also, please take a look at leafmap (or geemap).
References
holoviews: https://holoviews.org/
datashader: https://datashader.org/
leafmap: https://leafmap.org/
geemap: https://geemap.org/
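As referenced above, a minimal Datashader sketch for rasterizing the ~35k points (column and variable names are illustrative assumptions):
import pandas as pd
import datashader as ds
import datashader.transfer_functions as tf

df = pd.DataFrame({"lon": lons, "lat": lats})  # your Flickr coordinates

# Rasterize the points onto an 800x600 grid and shade the counts.
canvas = ds.Canvas(plot_width=800, plot_height=600)
agg = canvas.points(df, "lon", "lat")
tf.shade(agg).to_pil().save("flickr_points.png")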

How do I label the pointcloud if I have the 3D boxes of the objects annotated?

I am trying to annotate my point cloud data. I found a number of tools, but could only access the demo version of the 3D Point Cloud tool by Supervisely. Once the annotation is complete (by drawing 3D boxes around the objects), the output annotation is a JSON file containing the class, global position, orientation, and dimensions of each box. How do I use this file to assign labels to the points inside these 3D boxes? I want the output in either a .pcd or a .bin file.
The output of the JSON file is as follows:
[{"id":36698,"name":"vel_1558647824006782.pcd","annotations":[{"className":"Car","geometryType":"cuboid","classId":957,"geometry":{"position":{"x":9.539855967959713,"y":18.342023271012913,"z":0.43944128482454614},"rotation":{"x":0,"y":0,"z":0},"dimensions":{"x":5.691547052392309,"y":1.6625674002633986,"z":1.757779283656416}}},{"className":"ground","geometryType":"cuboid","classId":958,"geometry":{"position":{"x":28.890481890779242,"y":8.463823613489927,"z":-1.0314986175132965},"rotation":{"x":0,"y":0,"z":0},"dimensions":{"x":96.34273328620523,"y":18.714553504372063,"z":1.0544185995045456}}}]}]
I thought of using PCL's CropBox filter, but is there another way around it? It would also help if someone could point me to other point cloud annotation tools that might be better suited to this problem.
I was able to write a C++ program that reads the JSON file and uses PCL's CropBox filter to solve the problem. The methodology is as follows:
1) Read the JSON file using nlohmann/json:
#include <fstream>
#include <nlohmann/json.hpp>  // provides nlohmann::json
std::ifstream ifs("somepath.json");
nlohmann::json j = nlohmann::json::parse(ifs);
2) Extract the "position" (centroid of the cuboid), the "rotation" (orientation) of the cuboid, the "dimensions" of the cuboid, and the "className" for each box. The code below shows one way to extract the position data into a std::vector (i iterates over the boxes):
std::vector<float> position {
    j[0]["annotations"][i]["geometry"]["position"]["x"].get<float>(),
    j[0]["annotations"][i]["geometry"]["position"]["y"].get<float>(),
    j[0]["annotations"][i]["geometry"]["position"]["z"].get<float>()};
3) Get the max and min (x, y, z) coordinates of the vertices of the box. These go as input to the CropBox filter. (Note: the min/max point does not have to correspond to a single vertex; Xmin is the minimum x over all 8 vertices, Ymin the minimum y over all 8 vertices, and so on.)
4) Use the CropBox filter from PCL. This gives you the indices of all the points inside the given box. Examples can be found here and here.
5) Depending upon the class of the box, assign different colors to the points at those indices of the point cloud.
6) Save the point cloud.
This is a generalized way of labeling every point of a point cloud from an input JSON file containing the position, dimensions, and orientation of the 3D boxes.
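For reference, here is a rough Python/NumPy sketch of steps 3-5 for the axis-aligned case (rotation is 0 in the sample JSON above); this is my own sketch, not the C++ program, and points is assumed to be an (N, 3) array already loaded from the .pcd file:
import json
import numpy as np

with open("somepath.json") as f:
    boxes = json.load(f)[0]["annotations"]

labels = np.zeros(len(points), dtype=int)  # 0 = unlabeled
for class_id, box in enumerate(boxes, start=1):
    center = np.array([box["geometry"]["position"][k] for k in "xyz"])
    dims = np.array([box["geometry"]["dimensions"][k] for k in "xyz"])
    lo, hi = center - dims / 2, center + dims / 2   # step 3: min/max corners
    inside = np.all((points >= lo) & (points <= hi), axis=1)  # step 4
    labels[inside] = class_id                       # step 5: per-class label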

Plotting gridded data using KML

We are beginning a project to visualize the results of a finite volume (FV) calculation using Google Earth. The FV data is essentially 2d (lat/long) data consisting of a Cartesian array of values (sea surface height, for example). Each value should be mapped to a color from some colormap, and then displayed as a single mesh cell in a gridded array suitable for Google Earth. The Cartesian array could be 100x100 or larger.
My question is, do we construct a polygon for each mesh cell C_{ij} in the array, assigning a color corresponding to that cell's q_{ij} value? This would seem to create a huge KML file if the coordinates of the four corners of every mesh cell must be described (10,000 polygons for a 100x100 array, for example).
Or are there KML tools we could use that would allow us to specify, for example, the lower and upper coordinates of the array, a generic mesh cell size (e.g. dX, dY values), and the array of q data (or, equivalently, colours) that should be used to fill the "patch"?
Alternatively, we could create an image file containing, for example, a rendered image of our data array (created by some other means), which is then referenced from the KML file.
Our aim is to use PyKML for this project.
Any suggestions would be very helpful.
After much digging around, I think I now have a better understanding of what Google Earth can and cannot do (or is not designed to do). Google Earth is not designed as a visualization tool for numerical data. That does not mean it cannot be done, but one must create the image files elsewhere and then overlay them onto Google Earth. For example, this link provides instructions for visualizing the output of a fire modeling code:
http://www.openwfm.org/wiki/Visualization_in_Google_Earth
The instructions there suggest how pseudocolor plots can be used, in at least one special case, to visualize output in Google Earth.
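Following that approach, a minimal sketch of the KML side using pykml's element factory (the image file name and bounding box below are placeholders; the data image itself must be rendered by other means):
from pykml.factory import KML_ElementMaker as KML
from lxml import etree

# Reference a pre-rendered image of the data array as a GroundOverlay.
doc = KML.kml(
    KML.GroundOverlay(
        KML.name("FV sea surface height"),
        KML.Icon(KML.href("sea_surface_height.png")),
        KML.LatLonBox(
            KML.north(42.0), KML.south(41.0),
            KML.east(-70.0), KML.west(-71.0),
        ),
    )
)
print(etree.tostring(doc, pretty_print=True).decode())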

How to draw coastlines over a custom map without resampling

I would like to display a satellite image (preferably using Python, but other solutions are welcome). It consists of a floating-point parameter P with dimensions NxM, and each pixel is geolocated by latitude and longitude fields (each of size NxM). So I would like to:
(1) create an image of parameter P with an associated color scale. The image should not be resampled, so it should have dimension NxM
(2) display coastlines over this image
Currently, I can do (1) using PIL. I can also use the basemap library to display an image and coastlines, but I don't know how to do so without reprojection, staying in the image's native projection at size NxM.
Edit: the parameter P does not contain any information about the coastline; only the locations (lat, lon) of the pixels should be used to overlay the coastline. The coastline coordinates can be obtained from GSHHS, for example, which is what the basemap library itself uses.
If all you're trying to do is enhance the boundaries between land and water, it might be good to use a high-pass filter.
For instance, start with a test image (the classic Lena image), apply a highpass filter, and then overlay the highpass result on top of the original (more details and examples can be found here).
You can find filters in scipy here.
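A minimal sketch of the highpass-then-overlay idea with scipy, using scipy's built-in ascent test image as a stand-in for Lena (in older scipy versions it lives in scipy.misc instead of scipy.datasets):
import matplotlib.pyplot as plt
from scipy import datasets, ndimage

image = datasets.ascent().astype(float)

# High-pass = original minus a low-pass (Gaussian-blurred) copy.
lowpass = ndimage.gaussian_filter(image, sigma=3)
highpass = image - lowpass

# Overlay the high-frequency detail back onto the original to sharpen edges.
sharpened = image + highpass

plt.imshow(sharpened, cmap="gray")
plt.show()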
For those in the community still looking for an answer to this question: the method I am currently implementing (for very similar purposes; I'm trying to test the geolocation of satellite data) requires a landmask.
There are landmask datasets available all over the place online, each with different rules and characteristics. I am working with netCDF4 data in Python, and my landmask is a gridded .nc dataset in which ocean elements have the value 1 and land elements have the value 0.
Iterating through my satellite data, I multiply each latitude and longitude value by the number of landmask elements per degree. In my case there are 120 elements per degree in lat/lon, so:
lon_inds = (lons*120).astype(int)
lat_inds = (lats*120).astype(int)
A more general way of writing this would substitute 120 with
len(lons)/360
len(lats)/180
respectively, where lons and lats here are the landmask's coordinate arrays. Both of these operations are nearly instantaneous on numpy arrays (which is what the Python netCDF4 module returns).
Now I create a mask of my own. It must have the same dimensions as the data array (for those not intimately acquainted with satellite data: the data, lats, and lons arrays all have identical dimensions):
my_mask = np.zeros(data.shape, dtype=int)
Now all we need to do is set values in the mask where there is a coastline. This is done by iterating through the lat_inds and lon_inds arrays, looking up the landmask value at
landmask[lon_inds[i,j],lat_inds[i,j]]
and changing the value of
mask[i,j]
to 1 if any of the neighbors
landmask[lon_inds[i,j]-1,lat_inds[i,j]]
landmask[lon_inds[i,j]+1,lat_inds[i,j]]
landmask[lon_inds[i,j],lat_inds[i,j]-1]
landmask[lon_inds[i,j],lat_inds[i,j]+1]
are not equal to 0. (A smoother coastline can be generated by also checking the diagonal neighbors, but this should not be necessary, as you should ideally be using a landmask dataset with finer spatial resolution than your satellite data.)
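A vectorized sketch of that neighbor check (my own interpretation of the rule above: a pixel is coastline when it is land, value 0, but has at least one ocean neighbor, value 1; edge wrap-around handling is omitted for brevity):
import numpy as np

center = landmask[lon_inds, lat_inds]
neighbor_sum = (landmask[lon_inds - 1, lat_inds]
                + landmask[lon_inds + 1, lat_inds]
                + landmask[lon_inds, lat_inds - 1]
                + landmask[lon_inds, lat_inds + 1])

# Land pixel with at least one ocean neighbor -> coastline.
my_mask = np.where((center == 0) & (neighbor_sum != 0), 1, 0)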

Getting Easting & Northing Values from geopy

I have a table full of longitude/latitude pairs in decimal format (e.g., -41.547, 23.456). I want to display the values in "easting and northing"/UTM format. Does geopy provide a way to convert from decimal degrees to UTM? I can see in the code that it will parse UTM values, but I don't see how to get them back out, and the geopy Google Group has gone the way of all things.
Nope. You need to reproject your points, and geopy isn't going to do that for you.
What you need is libgdal and some Python bindings. I always use the bindings in GeoDjango, but there are other alternatives.
EDIT: It is just a mathematical formula, but it's non-trivial. There are thousands of different ways to represent the surface of the Earth. See here for a huge but incomplete list.
There are two parts to a geographic projection of the Earth-- a coordinate system and a datum. The latter is essentially a three-dimensional model of the planet. When you say you want to convert latitude/longitude points to UTM values, you're missing a couple of pieces of the puzzle.
Let's assume that your lat/long points are based on the WGS84 datum, because that's a pretty common standard for lat/long points these days. You want to convert those points to a UTM coordinate system. But to which UTM coordinate system? There are 60 of them.
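Once you have picked a zone, the reprojection itself is a one-liner in, for example, pyproj (my own sketch, not geopy; zone 19N, EPSG:32619, matches the Massachusetts coordinates in the follow-up below):
from pyproj import Transformer

# WGS84 lat/long -> UTM zone 19N. always_xy=True means (lon, lat) order.
transformer = Transformer.from_crs("EPSG:4326", "EPSG:32619", always_xy=True)
easting, northing = transformer.transform(-70.896716, 42.519540)
print(easting, northing)  # easting/northing in metres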
I think I may have over-complicated things. All I wanted was the DMS values (so 42.519540, -70.896716 becomes 42º31'10.34" N 70º53'48.18" W). You can get this by creating a geopy Point object with your latitude and longitude, then calling format(). However, as of this writing, format() is broken and requires the patch here.
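If you would rather not patch geopy, decimal degrees to DMS is simple enough to do by hand (an illustrative sketch; the helper name is my own):
def to_dms(value, positive, negative):
    hemi = positive if value >= 0 else negative
    value = abs(value)
    d = int(value)                       # whole degrees
    m = int((value - d) * 60)            # whole minutes
    s = (value - d - m / 60) * 3600      # remaining seconds
    return f"{d}º{m:02d}'{s:05.2f}\" {hemi}"

print(to_dms(42.519540, "N", "S"), to_dms(-70.896716, "E", "W"))
# -> 42º31'10.34" N 70º53'48.18" W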
