st_make_grid method equivalent in Python

Is there an equivalent in Python to the very good st_make_grid method of the sf package from r-spatial? The method creates a rectangular grid geometry over the bounding box of a polygon.
I would like to do exactly the same as the solution proposed in this question, i.e. divide a polygon into several squares of an area that I choose. Thanks for your help.
Alternatively, I could use rpy2 to run an R script that executes st_make_grid, taking a shapely polygon as input and returning the square polygons to be read back with shapely. Would this be efficient when there are many polygons to process?

Would this be efficient when there are many polygons to process?
Certainly not. There's no built-in Python version, but the function below does the trick. If you need performance, make sure pygeos is installed in your environment.
def make_grid(polygon, edge_size):
    """
    polygon : shapely.geometry.Polygon
    edge_size : length of the grid cell
    """
    from itertools import product
    import numpy as np
    import geopandas as gpd

    bounds = polygon.bounds
    # Cell centres spaced edge_size apart, covering the bounding box
    x_coords = np.arange(bounds[0] + edge_size / 2, bounds[2], edge_size)
    y_coords = np.arange(bounds[1] + edge_size / 2, bounds[3], edge_size)
    combinations = np.array(list(product(x_coords, y_coords)))
    # Square cells: buffer each centre by half the edge (cap_style=3 gives squares)
    squares = gpd.points_from_xy(combinations[:, 0], combinations[:, 1]).buffer(edge_size / 2, cap_style=3)
    # Keep only cells that intersect the input polygon
    return gpd.GeoSeries(squares[squares.intersects(polygon)])
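For instance, a minimal usage sketch (the rectangle and cell size here are made up for illustration):
from shapely.geometry import Polygon
poly = Polygon([(0, 0), (3, 0), (3, 2), (0, 2)])
grid = make_grid(poly, 0.5)  # 0.5 x 0.5 squares covering the rectangle
print(len(grid), grid.total_bounds)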

Related

Mask raster by extent in Python using rasterio

I want to clip one raster based on the extent of another (smaller) raster. First I determine the coordinates of the corners of the smaller raster using
import rasterio as rio
import gdal
from shapely.geometry import Polygon

src = gdal.Open("smaller_file.tif")
ulx, xres, xskew, uly, yskew, yres = src.GetGeoTransform()
lrx = ulx + (src.RasterXSize * xres)
lry = uly + (src.RasterYSize * yres)
geometry = [[ulx, lry], [ulx, uly], [lrx, uly], [lrx, lry]]
This gives me the following output: geometry = [[-174740.0, 592900.0], [-174740.0, 2112760.0], [900180.0, 2112760.0], [900180.0, 592900.0]]. (Note that the CRS is EPSG:32651.)
Now I would like to clip the larger file using rio.mask.mask(). According to the documentation, the shapes argument should be a GeoJSON-like dict or an object that implements the Python geo interface protocol (such as a Shapely Polygon). Therefore I create a Shapely Polygon out of the variable geometry, using
roi = Polygon(geometry)
Now everything is ready to use the rio.mask.mask() function.
with rio.open("larger_file.tif") as larger_file:
    output = rio.mask.mask(larger_file, roi, crop=True)
But this gives me the following error:
TypeError: 'Polygon' object is not iterable
What am I doing wrong? Or if someone knows a more elegant way to do it, please let me know.
(Unfortunately I cannot upload the two files since they're too large.)
I found your question when I needed to figure out this kind of clipping myself. I got the same error and fixed it the following way:
rasterio.mask.mask expects an iterable of features, not a single geometry: the algorithm runs masking over several features bundled in an iterable (e.g. a list or tuple), so we need to pass it our polygon inside one.
The code you posted works after the following change:
roi = [Polygon(geometry)]
All we have to do is enclose the geometry in a list or tuple, and rasterio.mask then works as expected.
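Putting it together, a minimal sketch of the whole clipping step (the filename is a placeholder):
import rasterio as rio
import rasterio.mask
from shapely.geometry import Polygon

geometry = [[-174740.0, 592900.0], [-174740.0, 2112760.0],
            [900180.0, 2112760.0], [900180.0, 592900.0]]
roi = [Polygon(geometry)]  # note the enclosing list

with rio.open("larger_file.tif") as src:
    out_image, out_transform = rio.mask.mask(src, roi, crop=True)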

Find which polygon each point is in

I am new to Python, so I apologize for the rudimentary programming skills; I am aware I am using a few too many for loops (coming from Matlab, old habits are dragging me down).
I have millions of points (timestep, long, lat, pointID) and hundreds of irregular non-overlapping polygons (vertex_long, vertex_lat, polygonID); see the points and polygons format sample.
I want to know which polygon contains each point.
I was able to do it this way:
from matplotlib import path

def inpolygon(lon_point, lat_point, lon_poly, lat_poly):
    shape = lon_point.shape
    lon_point = lon_point.reshape(-1)
    lat_point = lat_point.reshape(-1)
    lon_poly = lon_poly.values.reshape(-1)
    lat_poly = lat_poly.values.reshape(-1)
    points = [(lon_point[i], lat_point[i]) for i in range(lon_point.shape[0])]
    polys = path.Path([(lon_poly[i], lat_poly[i]) for i in range(lon_poly.shape[0])])
    return polys.contains_points(points).reshape(shape)
And then
import numpy as np
import pandas as pd

Areas_Lon = Areas.iloc[:, 0]
Areas_Lat = Areas.iloc[:, 1]
Areas_ID = Areas.iloc[:, 2]
Unique_Areas = np.unique(Areas_ID)

Areas_true = np.zeros((Areas_ID.shape[0], Unique_Areas.shape[0]))
for i in range(Areas_ID.shape[0]):
    for ii in range(Unique_Areas.shape[0]):
        Areas_true[i, ii] = (Areas_ID[i] == Unique_Areas[ii])

Areas_Lon_Vertex = np.zeros(Unique_Areas.shape[0], dtype=object)
Areas_Lat_Vertex = np.zeros(Unique_Areas.shape[0], dtype=object)
for i in range(Unique_Areas.shape[0]):
    Areas_Lon_Vertex[i] = Areas_Lon[Areas_true[:, i] == 1]
    Areas_Lat_Vertex[i] = Areas_Lat[Areas_true[:, i] == 1]

import f_inpolygon as inpolygon

Areas_in = np.zeros((Unique_Areas.shape[0], Points.shape[0]))
for i in range(Unique_Areas.shape[0]):
    for ii in range(Points.shape[0]):
        Areas_in[i, ii] = inpolygon.inpolygon(Points[ii, 2], Points[ii, 3],
                                              Areas_Lon_Vertex[i], Areas_Lat_Vertex[i])
This way the final outcome Areas_in (see the Areas_in format sample) contains as many rows as polygons and as many columns as points, where each column is true (1) at the row of the polygon containing that point (1st polygon ID -> 1st row, and so on).
The code works, but very slowly for what it is supposed to do. When locating points in a regular grid or within a point radius I have successfully implemented a KDtree, which increases the speed dramatically, but I can't do the same (or anything faster) for irregular non-overlapping polygons.
I have seen some related questions, but rather than asking which polygon a point is in, they ask whether a point is inside a polygon or not.
Any ideas please?
Have you tried geopandas' spatial join?
Install the package using pip:
pip install geopandas
or conda:
conda install -c conda-forge geopandas
Then you should be able to read the data as a GeoDataFrame:
import geopandas
from shapely.geometry import Point

df = geopandas.read_file("file_name1.csv")        # you can read shp files too
right_df = geopandas.read_file("file_name2.csv")  # you can read shp files too

# Convert into a geometry column
geometry = [Point(xy) for xy in zip(df['longitude'], df['latitude'])]
# Coordinate reference system: WGS84 (the older {'init': 'epsg:4326'} form is deprecated)
crs = 'EPSG:4326'

# Creating a geographic DataFrame
left_df = geopandas.GeoDataFrame(df, crs=crs, geometry=geometry)
Then you can apply sjoin:
jdf = geopandas.sjoin(left_df, right_df, how='inner', op='intersects', lsuffix='left', rsuffix='right')
The options for op are:
intersects
contains
within
For a point layer joined against polygons, 'within' or 'intersects' is what you want ('contains' applies when the polygons are the left frame); see the sketch below.
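For instance, a toy sketch with made-up squares and points (newer geopandas versions use predicate= instead of op=):
import geopandas
from shapely.geometry import Point, Polygon

polys = geopandas.GeoDataFrame(
    {'polygonID': [1, 2]},
    geometry=[Polygon([(0, 0), (1, 0), (1, 1), (0, 1)]),
              Polygon([(1, 0), (2, 0), (2, 1), (1, 1)])],
    crs='EPSG:4326')
pts = geopandas.GeoDataFrame(
    {'pointID': [10, 11, 12]},
    geometry=[Point(0.5, 0.5), Point(1.5, 0.2), Point(3.0, 3.0)],
    crs='EPSG:4326')

joined = geopandas.sjoin(pts, polys, how='left', op='within')
print(joined[['pointID', 'polygonID']])  # polygonID is NaN for the point outside both squares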

Why is transforming a shapely polygon not working in some cases?

I'm trying to calculate the size of a polygon of geographic coordinates using shapely, which seems to require a transformation into a suitable projection to yield a result in square meters. I found a couple of examples online, but I couldn't get it working for my example polygon.
I therefore tried to use the same example polygons that came with the code snippets I found, and I noticed that it works for some while not for others. To reproduce the results, here's the minimal example code:
import json
import pyproj
from shapely.ops import transform
from shapely.geometry import Polygon, mapping
from functools import partial
coords1 = [(-97.59238135821987, 43.47456565304017),
           (-97.59244690469288, 43.47962399877412),
           (-97.59191951546768, 43.47962728271748),
           (-97.59185396090983, 43.47456565304017),
           (-97.59238135821987, 43.47456565304017)]
coords1 = reversed(coords1) # Not sure if important, but https://geojsonlint.com says it's wrong handedness
# Doesn't seem to affect the error message though
coords2 = [(13.65374516425911, 52.38533382814119),
           (13.65239769133293, 52.38675829106993),
           (13.64970274383571, 52.38675829106993),
           (13.64835527090953, 52.38533382814119),
           (13.64970274383571, 52.38390931824483),
           (13.65239769133293, 52.38390931824483),
           (13.65374516425911, 52.38533382814119)]
coords = coords1 # DOES NOT WORK
#coords = coords2 # WORKS
polygon = Polygon(coords)
# Print GeoJON to check on https://geojsonlint.com
print(json.dumps(mapping(polygon)))
projection = partial(pyproj.transform,
                     pyproj.Proj('epsg:4326'),
                     pyproj.Proj('esri:54009'))
transform(projection, polygon)
Both coords1 and coords2 are just copied from code snippets that supposedly work; however, only coords2 works for me. I've used https://geojsonlint.com to check whether there's a difference between the two polygons, and it flags the handedness/orientation of the first polygon as invalid GeoJSON. I don't know if shapely even cares, but reversing the order -- after which https://geojsonlint.com says it's valid GeoJSON and shows the polygon on the map -- does not change the error.
So, it works with coords2, but when I use coords1 I get the following error:
~/env/anaconda3/envs/py36/lib/python3.6/site-packages/shapely/geometry/base.py in _repr_svg_(self)
398 if xmin == xmax and ymin == ymax:
399 # This is a point; buffer using an arbitrary size
--> 400 xmin, ymin, xmax, ymax = self.buffer(1).bounds
401 else:
402 # Expand bounds by a fraction of the data ranges
ValueError: not enough values to unpack (expected 4, got 0)
I assume there's something different about coords1 (and the example polygon from my own data) that causes the problem, but I cannot tell what could be different compared to coords2.
In short, what's the difference between coords1 and coords2, with one working and the other not?
UPDATE: I got it working by adding always_xy=True to the definition of the projection. Together with the newer syntax provided by pyproj, avoiding partial, the working snippet looks like this:
project = pyproj.Transformer.from_proj(
    pyproj.Proj('epsg:4326'),  # source coordinate system
    pyproj.Proj('epsg:3857'),  # destination coordinate system
    always_xy=True
)
transform(project.transform, polygon)
To be honest, even after reading the docs, I don't really know what always_xy is doing, hence I don't want to provide this as an answer.
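A quick check of the two axis orders hints at what it changes (a sketch, reusing the first vertex of coords1 and the esri:54009 target from the original snippet):
import pyproj

# EPSG:4326 is defined with (latitude, longitude) axis order, so by default
# pyproj treats the first input as latitude. coords1 starts with a longitude
# of about -97.6, which is outside the valid latitude range, hence inf.
t_default = pyproj.Transformer.from_crs('EPSG:4326', 'ESRI:54009')
t_xy = pyproj.Transformer.from_crs('EPSG:4326', 'ESRI:54009', always_xy=True)
print(t_default.transform(-97.59238135821987, 43.47456565304017))  # (inf, inf)
print(t_xy.transform(-97.59238135821987, 43.47456565304017))       # finite (x, y)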
I think you did well; the only issue is that reversed() does not create a new list, it returns an iterator.
Try using this function to create a reversed copy of the list:
def rev_slice(mylist):
    '''
    Return a reversed copy of a list.
    mylist: a list
    '''
    return mylist[::-1]
Execute the function like so:
coords = rev_slice(coords1)

incorrect estimate_normals with open3d?

I am trying to calculate the normals of a point cloud formed by three planes, each aligned with an axis.
In Matlab the function pcnormals gives me a coherent result, while when I try to do the same with estimate_normals in open3d the result is incorrect.
The code is here:
import numpy as np
from open3d import *

# Raw strings avoid backslash-escape issues in Windows paths
pcd = read_point_cloud(r"D:\Artificial.txt", format='xyz')
estimate_normals(pcd, search_param=KDTreeSearchParamKNN(knn=25))
x = np.concatenate((np.asarray(pcd.points), np.asarray(pcd.normals)), axis=1)
np.savetxt(r"D:\ArtificialN_python.txt", x, delimiter=',')
I have also tried different knn values and other search_param settings, but the result is similar.
I enclose images of the clouds coloured according to the third component of the normals (red = horizontal, green = inclined), calculated with Matlab and with Python:
matlab result: [image]
python result: [image]
Does anybody know what this might be due to?
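One possible culprit (a hedged guess, not a confirmed answer): open3d estimates normals by local PCA, which leaves the sign of each normal ambiguous, whereas Matlab's pcnormals orients them. A sketch using the current open3d API to propagate a consistent orientation:
import open3d as o3d

pcd = o3d.io.read_point_cloud(r"D:\Artificial.txt", format='xyz')
pcd.estimate_normals(search_param=o3d.geometry.KDTreeSearchParamKNN(knn=25))
# PCA normals have an arbitrary sign; orient them consistently across the cloud
pcd.orient_normals_consistent_tangent_plane(25)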

Can anyone please explain how this python code works line by line?

I am working on image processing right now in Python, using numpy and scipy all the time. I have one piece of code that can enlarge an image, but I am not sure how it works.
So could some scipy/numpy expert please explain it to me line by line? I am always eager to learn.
import numpy as N
import os.path
import scipy.signal
import scipy.interpolate
import matplotlib.pyplot as plt
import matplotlib.cm as cm

def enlarge(img, rowscale, colscale, method='linear'):
    x, y = N.meshgrid(N.arange(img.shape[1]), N.arange(img.shape[0]))
    pts = N.column_stack((x.ravel(), y.ravel()))
    xx, yy = N.mgrid[0.:float(img.shape[1]):1/float(colscale),
                     0.:float(img.shape[0]):1/float(rowscale)]
    large = scipy.interpolate.griddata(pts, img.flatten(), (xx, yy), method).T
    large[-1, :] = large[-2, :]
    large[:, -1] = large[:, -2]
    return large
Thanks a lot.
First, a grid of coordinates is created, with one point per pixel of the original image.
x, y = N.meshgrid(N.arange(img.shape[1]), N.arange(img.shape[0]))
These pixel coordinates are stacked into the variable pts, which will be needed later.
pts = N.column_stack((x.ravel(), y.ravel()))
After that, it creates a mesh grid with one point per pixel of the enlarged image; if the original image were 200x400 with colscale set to 4 and rowscale set to 2, the mesh grid would have (200*4)x(400*2) or 800x800 points.
xx, yy = N.mgrid[0.:float(img.shape[1]):1/float(colscale),
                 0.:float(img.shape[0]):1/float(rowscale)]
Using scipy, the values at the points in pts are interpolated onto the larger grid. Interpolation is the manner in which missing values are filled in or estimated, usually when going from a smaller set of points to a larger one.
large = scipy.interpolate.griddata(pts, img.flatten(), (xx, yy), method).T
As for the last two lines: the enlarged grid extends slightly past the last original pixel coordinate, where griddata cannot interpolate and returns NaN, so the final row and column appear to be patched by copying their neighbours.
large[-1, :] = large[-2, :]
large[:, -1] = large[:, -2]
return large
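As a quick hypothetical check of the function, a 4x4 gradient doubled in both directions:
img = N.arange(16, dtype=float).reshape(4, 4)
big = enlarge(img, 2, 2)
print(big.shape)  # (8, 8)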
