I have a CSV file with a column called the_geom that stores a MultiPolygon (WKT) representation of NYC; that is, NYC is divided into multipolygons, and each row's multipolygon is stored in the_geom.
My question is how can I get the list of all possible points in each multipolygon?
What I have done so far is simply pick a random point in a multipolygon, then check whether the selected point is contained in the multipolygon. The code to do so is:
import random
import shapely.wkt
from shapely.geometry import Point

multipolygon = shapely.wkt.loads(row['the_geom'])
minx, miny, maxx, maxy = multipolygon.bounds
pnt1 = Point(random.uniform(minx, maxx), random.uniform(miny, maxy))
while not multipolygon.contains(pnt1):
    pnt1 = Point(random.uniform(minx, maxx), random.uniform(miny, maxy))
How do I get the list of all points within a multipolygon?
Should I convert each multipolygon to a polygon first so I can get the list of all possible points in each multipolygon?
I have seen solutions on Google that suggest using GeoPandas.explode(), but I'm not sure whether converting each multipolygon to polygons is the correct approach.
I'd like to know what is the correct way to get a list of all points within a multipolygon, thanks!
Note that a (multi)polygon contains infinitely many interior points, so "all points" is only meaningful for the vertices that define the boundaries. To get all points defining the boundaries of the polygons which make up the MultiPolygon, you can iterate through the shapes using the MultiPolygon.geoms attribute, e.g.:
np.hstack([np.array(g.boundary.xy) for g in multipolygon.geoms])
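For instance, with a small hypothetical MultiPolygon (the geometries here are made up for illustration):

```python
import numpy as np
from shapely.geometry import MultiPolygon, Polygon

# a toy MultiPolygon: a unit square and a triangle
mp = MultiPolygon([
    Polygon([(0, 0), (1, 0), (1, 1), (0, 1)]),
    Polygon([(2, 2), (3, 2), (3, 3)]),
])

# stack the boundary vertices of every member polygon side by side;
# row 0 holds the x coordinates, row 1 the y coordinates
coords = np.hstack([np.array(g.boundary.xy) for g in mp.geoms])
print(coords.shape)  # (2, 9): 5 vertices for the closed square ring + 4 for the triangle
```

Each ring repeats its first vertex at the end (the ring is closed), which is why the square contributes 5 coordinate pairs rather than 4.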
I would like to convert an image (.tiff) into Shapely points. There are 45 million pixels, I need a way to accomplish this without a loop (currently taking 15+ hours)
For example, I have a .tiff file which when opened is a 5000x9000 array. The values are pixel values (colors) that range from 1 to 215.
I open the file with rasterio.open('xxxx.tif').
The desired EPSG is 32615.
I need to preserve the pixel value but also attach geospatial positioning. This is to be able to sjoin over a polygon to see if the points are inside. I can handle the transform after processing, but I cannot figure a way to accomplish this without a loop. Any help would be greatly appreciated!
If you just want a boolean array indicating whether the points are within any of the geometries, I'd dissolve the shapes into a single MultiPolygon then use shapely.vectorized.contains. The shapely.vectorized module is currently not covered in the documentation, but it's really good to know about!
Something along the lines of
import shapely.ops
import shapely.vectorized

# for a gridded dataset with 2-D arrays lats, lons
# and a list of shapely polygons/multipolygons all_shapes
XX = lons.ravel()
YY = lats.ravel()
single_multipolygon = shapely.ops.unary_union(all_shapes)
in_any_shape = shapely.vectorized.contains(single_multipolygon, XX, YY)
If you're looking to identify which shape the points are in, use geopandas.points_from_xy to convert your x, y point coordinates into a GeometryArray, then use geopandas.sjoin to find the index of the shape corresponding to each (x, y) point:
import geopandas

geoarray = geopandas.points_from_xy(XX, YY)
points_gdf = geopandas.GeoDataFrame(geometry=geoarray)
shapes_gdf = geopandas.GeoDataFrame(geometry=all_shapes)
shape_index_by_point = geopandas.sjoin(
    shapes_gdf, points_gdf, how='right', predicate='contains',
)
This is still a large operation, but it's vectorized and will be significantly faster than a looped solution. The geopandas route is also a good option if you'd like to convert the projection of your data or use other geopandas functionality.
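Applied to the original raster question, the per-pixel coordinates themselves can also be built without a loop. A minimal sketch with a tiny made-up array and a hand-rolled affine mapping (the origin and pixel-size numbers are placeholders; with a real file, rasterio.transform.xy(src.transform, rows, cols) accepts whole index arrays and does the same job):

```python
import numpy as np
import geopandas

# hypothetical stand-ins for the real raster
values = np.arange(12).reshape(3, 4)   # pretend pixel values (real case: 5000x9000)
x0, y0 = 500000.0, 4500000.0           # upper-left corner (made up)
dx, dy = 30.0, -30.0                   # pixel size; dy is negative, rows go south

rows, cols = np.indices(values.shape)  # row/col index of every pixel at once

# pixel-center coordinates, fully vectorized
XX = x0 + (cols.ravel() + 0.5) * dx
YY = y0 + (rows.ravel() + 0.5) * dy

# one point per pixel, pixel value preserved, ready for sjoin
points = geopandas.GeoDataFrame(
    {"value": values.ravel()},
    geometry=geopandas.points_from_xy(XX, YY),
    crs="EPSG:32615",
)
```

For 45 million pixels this is a single array operation per axis rather than 45 million Python-level iterations, which is where the 15+ hours were going.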
I am using rasterio to convert a geopandas dataframe of points to a GeoTIFF raster.
For that I am using this Python code:
import rasterio
from rasterio import features

with rasterio.open("somepath/rasterized.tif", 'w+', **meta) as out:
    out.nodata = 0
    out_arr = out.read(1)
    # create a generator of (geometry, value) pairs to use in rasterizing
    shapes = ((geom, value * 2) for geom, value in zip(gdf.geometry, gdf["PositionConfidence"]))
    burned = features.rasterize(shapes=shapes, fill=0, out=out_arr, transform=out.transform)
    out.write_band(1, burned)
    out.write_band(2, burned)
    out.write_band(3, burned)
    out_arr1 = out.read(1)
The code writes not a fixed value but a value derived from each point.
The problem is that there are many points per raster pixel; with the approach above, only a single point's value is burned to each pixel.
What I am looking for is to burn the average of all point values falling in a pixel to that pixel.
Thanks for your help
I found an approach that works:
First, sort all the points by pixel into a dictionary. Then, for each pixel, compute the value we need, in this case the average of some attribute, over all points in that pixel. Finally, assign this value to a new point at the pixel's coordinates and rasterize those new points instead of the old ones.
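The binning and averaging steps above can also be done without an explicit per-point loop. A sketch with numpy/pandas, using made-up coordinates, values, and grid parameters:

```python
import numpy as np
import pandas as pd

# hypothetical inputs: point coordinates, their values, grid origin and pixel size
xs = np.array([0.2, 0.3, 1.7, 1.9, 0.6])
ys = np.array([0.1, 0.4, 0.5, 0.2, 1.5])
vals = np.array([10.0, 20.0, 30.0, 50.0, 7.0])
x0, y0, dx, dy = 0.0, 0.0, 1.0, 1.0

# integer pixel index of every point (vectorized binning)
col = ((xs - x0) // dx).astype(int)
row = ((ys - y0) // dy).astype(int)

# group points by pixel and average their values
df = pd.DataFrame({"row": row, "col": col, "val": vals})
mean_per_pixel = df.groupby(["row", "col"])["val"].mean()
print(mean_per_pixel)
# (0, 0) -> 15.0   (0, 1) -> 40.0   (1, 0) -> 7.0
```

The averaged values can then be written into the output array directly (one assignment per occupied pixel) or turned back into one point per pixel center and rasterized as before.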
I have downloaded the velocity field of the Greenland ice sheet from the CCI website as a NetCDF file. However, the projection is given as (see below, where x ranges between [-639750,855750] and y [-655750,-3355750])
How can I project these data to actual lat/lon coordinates in the NetCDF file? Thanks already! For the ones interested: the file can be downloaded here: http://products.esa-icesheets-cci.org/products/downloadlist/IV/
Variables:
crs
Size: 1x1
Dimensions:
Datatype: int32
Attributes:
grid_mapping_name = 'polar_stereographic'
standard_parallel = 70
straight_vertical_longitude_from_pole = -45
false_easting = 0
false_northing = 0
unit = 'meter'
latitude_of_projection_origin = 90
spatial_ref = 'PROJCS["WGS 84 / NSIDC Sea Ice Polar Stereographic North",GEOGCS["WGS 84",DATUM["WGS_1984",SPHEROID["WGS 84",6378137,298.257223563,AUTHORITY["EPSG","7030"]],AUTHORITY["EPSG","6326"]],PRIMEM["Greenwich",0,AUTHORITY["EPSG","8901"]],UNIT["degree",0.0174532925199433,AUTHORITY["EPSG","9122"]],AUTHORITY["EPSG","4326"]],PROJECTION["Polar_Stereographic"],PARAMETER["latitude_of_origin",70],PARAMETER["central_meridian",-45],PARAMETER["scale_factor",1],PARAMETER["false_easting",0],PARAMETER["false_northing",0],UNIT["metre",1,AUTHORITY["EPSG","9001"]],AXIS["X",EAST],AXIS["Y",NORTH],AUTHORITY["EPSG","3413"]]'
y
Size: 5401x1
Dimensions: y
Datatype: double
Attributes:
units = 'm'
axis = 'Y'
long_name = 'y coordinate of projection'
standard_name = 'projection_y_coordinate'
x
Size: 2992x1
Dimensions: x
Datatype: double
Attributes:
units = 'm'
axis = 'X'
long_name = 'x coordinate of projection'
standard_name = 'projection_x_coordinate'
If you want to transform the whole grid from its native Polar Stereographic coordinates to a geographic (longitude by latitude) grid, you'll probably want to use a tool like gdalwarp. I don't think that's the question you're asking, though.
If I'm reading your question correctly, you want to pick points out of the file and locate them as lon/lat coordinate pairs. I'm assuming that you know how to get a location as an XY pair out of your netCDF file, along with the velocity values at that location. I'm also assuming that you're doing this in Python, since you put that tag on this question.
Once you've got an XY pair, you just need a function (with a bunch of parameters) to transform it to lon/lat. You can find that function in the pyproj module.
Pyproj wraps the proj4 C library, which is very widely used for coordinate system transformations. If you have an XY pair in projected coordinates and you know the definition of the projected coordinate system, you can use pyproj's transform function like this:
import pyproj
# Output coordinates are in WGS 84 longitude and latitude
projOut = pyproj.Proj(init='epsg:4326')
# Input coordinates are in meters on the Polar Stereographic
# projection given in the netCDF file
projIn = pyproj.Proj('+proj=stere +lat_0=90 +lat_ts=70 +lon_0=-45 '
                     '+k=1 +x_0=0 +y_0=0 +datum=WGS84 +units=m +no_defs',
                     preserve_units=True)
# here is a coordinate pair near the middle of your data set
x, y = 0.0, -2000000
# transform x,y to lon/lat
lon, lat = pyproj.transform(projIn, projOut, x, y)
# answer: lon = -45.0; lat = 71.6886
... and there you go. Note that the output longitude is -45.0, which should give you a nice warm feeling, since the input X coordinate was 0, and -45.0 is the central meridian of the data set's projection. If you want your answer in radians instead of degrees, set the radians kwarg in the transform function to True.
Now for the hard part, which is actually the thing you do first: defining the projIn and projOut that are used as arguments for the transform function. These are the input and output coordinate systems for the transformation. They are Proj objects, and they hold a mess of parameters for the coordinate system transformation equations. The proj4 developers have encapsulated them all in a tidy set of functions, and the pyproj developers have put a nice Python wrapper around them, so you and I don't have to keep track of all the details. I will be grateful to them for all the days that remain to me.
The output coordinate system is trivial:
projOut = pyproj.Proj(init='epsg:4326')
The pyproj library can build a Proj object from an EPSG code. 4326 is the EPSG code for WGS 84 lon/lat. Done.
Setting projIn is harder, because your netCDF file defines its coordinate system with a WKT string, which (I'm pretty sure) can't be read directly by proj4 or pyproj. However, pyproj.Proj() will take a proj4 parameter string as an argument. I've already given you the one you need for this operation, so you can just take my word for it that this
+proj=stere +lat_0=90 +lat_ts=70 +lon_0=-45 +k=1 +x_0=0 +y_0=0 +datum=WGS84 +units=m +no_defs
is the equivalent of this (which is copied directly from your netCDF file):
PROJCS["WGS 84 / NSIDC Sea Ice Polar Stereographic North",
GEOGCS["WGS 84",
DATUM["WGS_1984",
SPHEROID["WGS 84",6378137,298.257223563,
AUTHORITY["EPSG","7030"]],
AUTHORITY["EPSG","6326"]],
PRIMEM["Greenwich",0,AUTHORITY["EPSG","8901"]],
UNIT["degree",0.0174532925199433,AUTHORITY["EPSG","9122"]],
AUTHORITY["EPSG","4326"]],
PROJECTION["Polar_Stereographic"],
PARAMETER["latitude_of_origin",70],
PARAMETER["central_meridian",-45],
PARAMETER["scale_factor",1],
PARAMETER["false_easting",0],
PARAMETER["false_northing",0],
UNIT["metre",1,AUTHORITY["EPSG","9001"]],
AXIS["X",EAST],
AXIS["Y",NORTH],
AUTHORITY["EPSG","3413"]]
If you want to be able to do this more generally, you'll need another module to convert WKT coordinate system definitions to proj4 parameter strings. One such module is osgeo.osr and there's an example program at this blog post that shows you how to do that conversion.
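As an aside, recent pyproj versions (2+) make this easier: pyproj.CRS can be built straight from the WKT string, or from the EPSG code 3413 embedded in it, and pyproj.Transformer replaces the now-deprecated pyproj.transform. A sketch reproducing the sample point above:

```python
from pyproj import CRS, Transformer

# EPSG:3413 is the authority code embedded in the netCDF file's WKT;
# CRS.from_wkt(wkt_string) would work just as well
proj_in = CRS.from_epsg(3413)
transformer = Transformer.from_crs(proj_in, "EPSG:4326", always_xy=True)

lon, lat = transformer.transform(0.0, -2000000.0)
# lon = -45.0 (the central meridian), lat about 71.69, matching the result above
```

The always_xy=True flag keeps the coordinate order as (x/lon, y/lat) regardless of each CRS's declared axis order, which avoids a common source of swapped results.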
I have a map in galactic coordinates and I need to save it in equatorial coordinates in another file. I know I can use:
import healpy as hp
map=hp.read_map('file.fits')
map_rot=hp.mollview(map, coord=['G','C'], return_projected_map=True)
and this should return a 2D numpy array stored in map_rot. But when I read map_rot, I found out it is a masked_array filled ONLY with -inf values, with mask=False and fill_value=-1.6735e+30 (so, apparently, -inf is not a mask). Moreover, the total number of elements of map_rot does not match the number of pixels I would expect for a map (npix = 12 * nside**2). For example, for nside=256 I would expect npix=786432, while map_rot has 400*800=320000 elements. What's going on?
(I have already seen this post, but I have a map in polarization, so I need to rotate Stokes' parameters. Since mollview knows how to do that, I was trying to obtain the new map directly from mollview. )
One way to work around this is to save the output, for instance with pickle:
import healpy as hp, pickle
map=hp.read_map('file.fits')
map_rot=hp.mollview(map, coord=['G','C'], return_projected_map=True)
pickle.dump(map_rot, open( "/path/map.p", "wb"))
The return value of hp.mollview() has a format that can be displayed using the standard imshow() function. So next time you want to plot it, just do the following
import matplotlib.pyplot as plt

map_rot = pickle.load(open("/path/map.p", "rb"))
plt.imshow(map_rot)
map_rot describes the pixels of the entire matplotlib window, including the white area around the ellipsoid (-inf is color-coded as white).
In contrast, mollview() accepts only the array of HEALPix pixels which reside in the ellipsoid, i.e. an array of length hp.pixelfunc.nside2npix(NSIDE).
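The size mismatch the questioner saw is therefore expected: mollview returns the projected image raster (by default 400x800 for xsize=800), not the HEALPix pixel array. A quick sanity check of the two counts (plain arithmetic, no healpy needed):

```python
nside = 256
npix = 12 * nside ** 2       # HEALPix pixel count: what read_map returns
mollview_cells = 400 * 800   # default mollview projected-image grid
print(npix, mollview_cells)  # 786432 320000
```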