Geopandas buffer and intersect - python

I am using the geojson file from the OpenData Vancouver website and I am trying to find the zoning classifications that fall within 5 km of a "Historical Area".
So I am buffering all historical areas by 5 km (my data is projected), performing the intersect operation, and using the intersect results as a boolean index:
buffs = gdf_26910[gdf_26910['zoning_classification']=='Historical Area']['geometry'].buffer(5000)
gdf_26910[buffs.intersects(gdf_26910['geometry'])]
However, this is the output I am getting:
zoning_category zoning_classification zoning_district object_id geometry area centroid buffer5K
87 HA Historical Area HA-1A 78541 POLYGON ((492805.516 5458679.305, 492805.038 5... 3848.384041 POINT (492778.785 5458699.947) POLYGON ((497803.807 5458548.605, 497803.124 5...
111 HA Historical Area HA-3 78640 POLYGON ((491358.402 5458065.050, 491309.735 5... 66336.339719 POINT (491183.139 5458103.162) POLYGON ((492818.045 5453267.595, 492421.697 5...
180 HA Historical Area HA-1A 78836 POLYGON ((492925.194 5458575.204, 492929.600 5... 90566.768532 POINT (492753.969 5458456.804) POLYGON ((487583.872 5458086.263, 487564.746 5...
683 HA Historical Area HA-1 78779 POLYGON ((492925.194 5458575.204, 492802.702 5... 69052.427940 POINT (492606.372 5458621.753) POLYGON ((487874.100 5456398.633, 487789.801 5...
1208 HA Historical Area HA-2 78833 POLYGON ((492332.139 5458699.308, 492346.989 5... 179805.027166 POINT (492343.437 5458944.412) POLYGON ((489822.136 5454379.453, 489755.087 5...
Clearly, I am getting a match for the Historical Areas and not all the other geometries that intersect the buffers.
I have plotted the buffers and the output looks correct:
#Plot
base=gdf_26910.plot()
buffs.plot(ax=base, color='red', alpha=0.25)
I have also opened the data in QGIS and verified that there are 5 'Historical Areas' and they are all adjacent to 'Comprehensive Development'. So, the matching rows after the intersect operation should be "Comprehensive Development" at the least.
Where am I going wrong?

Two core points:
You need to work in meters for a 5 km buffer, so I have used estimate_utm_crs() for the projection. I have also used cap_style and join_style so that the buffered polygon follows the original shape more closely.
I have used sjoin() instead of the mask approach in your code. This produces duplicates where a geometry matches more than one buffer, so de-dupe using pandas groupby().first().
UPDATE: changed to predicate="within" and used folium to visualise (which may help you understand how the geometry is working).
import geopandas as gpd
import folium
gdf_26910 = gpd.read_file(
"https://opendata.vancouver.ca/explore/dataset/zoning-districts-and-labels/download/?format=geojson&timezone=Europe/London&lang=en"
)
buffs = gdf_26910.loc[gdf_26910["zoning_classification"] == "Historical Area"]
# buffer distance is in CRS units, so buffer in a metre-based CRS (5 km = 5000 m), then project back
buffs = (
    buffs.to_crs(buffs.estimate_utm_crs())
    .buffer(5000, cap_style=2, join_style=3)
    .to_crs(gdf_26910.crs)
)
# this warns so is clearly bad !
# gdf_26910[buffs.intersects(gdf_26910['geometry'])]
# some geometries intersect multiple historical areas, take first intersection from sjoin()
gdf_5km = (
    gdf_26910.reset_index()
    .sjoin(buffs.to_frame(), predicate="within")
    .groupby("index")
    .first()
    .set_crs(gdf_26910.crs)
)
m = buffs.explore(name="buffer")
gdf_5km.explore("zoning_classification", m=m, name="within")
gdf_26910.explore("zoning_classification", m=m, name="all", legend=False)
folium.LayerControl().add_to(m)
m
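To illustrate why the original boolean mask only returned the historical areas themselves: GeoSeries.intersects compares geometries row by row after aligning on the index, so each buffer is only tested against the zone that carries the same index label, which is the zone it was built from. Below is a minimal sketch with made-up squares standing in for the zoning polygons. The toy coordinates and the 1 m buffer are assumptions for illustration, not the Vancouver data.

```python
import warnings

import geopandas as gpd
from shapely.geometry import box

# Toy zones in a metre-based CRS (hypothetical geometries, not the real dataset).
zones = gpd.GeoDataFrame(
    {
        "zoning_classification": [
            "Historical Area",
            "Comprehensive Development",
            "Residential",
        ]
    },
    geometry=[box(0, 0, 1, 1), box(1.5, 0, 2.5, 1), box(10, 0, 11, 1)],
    crs="EPSG:26910",
)

is_historical = zones["zoning_classification"] == "Historical Area"
buffs = zones.loc[is_historical, "geometry"].buffer(1)

# Element-wise comparison: rows are paired by index label, so the buffer at
# index 0 is only compared with zone 0. Unmatched rows come back as False.
with warnings.catch_warnings():
    warnings.simplefilter("ignore")  # geopandas warns that the indices differ
    mask = buffs.intersects(zones["geometry"])
elementwise_hits = zones.loc[mask, "zoning_classification"].tolist()

# Spatial join: every zone is tested against every buffer.
joined = gpd.sjoin(zones, gpd.GeoDataFrame(geometry=buffs), predicate="intersects")
sjoin_hits = sorted(joined["zoning_classification"].unique())

print(elementwise_hits)
print(sjoin_hits)
```

With these toy shapes the mask should pick up only the historical area, while the join should also return the neighbouring zone that the buffer touches.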

Related

How to create Isopolygon using python?

I have a set of points for a location and I am trying to create an isoline from them. To generate the isolines I used a convex hull and alphashape, which produce a box-shaped polygon with straight-cut edges, as shown below. How do I get a proper isoline shape? What is the best way to generate an isochrone using Python?
print(df)
id latitude longitude geometry
8758520180 53.334261 -2.569419 POINT (-2.56942 53.33426)
9339285446 53.346211 -2.575348 POINT (-2.57535 53.34621)
616761660 53.340828 -2.566912 POINT (-2.56691 53.34083)
9454070930 53.338889 -2.574538 POINT (-2.57454 53.33889)
9454071045 53.339388 -2.574591 POINT (-2.57459 53.33939)
and so on.
import alphashape
polygon= alphashape.alphashape(df['geometry'], 0.20)
GeoDataFrame(polygon, crs="EPSG:4326", geometry=p_df['geometry'])
final output by alphashape:-
Expected output (sketch):-
points :-

Is there a way to convert a polygon shapefile into coordinates in Python?

I am trying to download satellite images from Sentinel 2 through ESA Sentinel data hub.
The code that I am using to get the shapefile layer's extent for the query does not return lat/long coordinates but rather strange numbers. I carefully followed the practical instructions with no luck.
Any advice or help on how to solve this issue would be much appreciated!
Below is the code:
# Get the shapefile layer's extent
driver = ogr.GetDriverByName("ESRI Shapefile")
ds = driver.Open(shapefile, 0)
lyr = ds.GetLayer()
extent = lyr.GetExtent()
print("Extent of the area of interest (shapefile):\n", extent)
# get projection information from the shapefile to reproject the images to
outSpatialRef = lyr.GetSpatialRef().ExportToWkt()
ds = None # close file
print("\nSpatial referencing information of the shapefile:\n", outSpatialRef)
Extent of the area of interest (shapefile):
(363337.9978, 406749.40699999966, 565178.6085999999, 633117.0013999995)
Spatial referencing information of the shapefile:
PROJCS["OSGB_1936_British_National_Grid",GEOGCS["GCS_OSGB 1936",DATUM["OSGB_1936",SPHEROID["Airy_1830",6377563.396,299.3249646]],PRIMEM["Greenwich",0.0],UNIT["Degree",0.0174532925199433]],PROJECTION["Transverse_Mercator"],PARAMETER["false_easting",400000.0],PARAMETER["false_northing",-100000.0],PARAMETER["central_meridian",-2.0],PARAMETER["scale_factor",0.9996012717],PARAMETER["latitude_of_origin",49.0],UNIT["Meter",1.0]]
# extent of our shapefile in the right format for the Data Hub API.
def bbox(extent):
    # Create a Polygon from the extent tuple
    box = ogr.Geometry(ogr.wkbLinearRing)
    box.AddPoint(extent[0], extent[2])
    box.AddPoint(extent[1], extent[2])
    box.AddPoint(extent[1], extent[3])
    box.AddPoint(extent[0], extent[3])
    box.AddPoint(extent[0], extent[2])
    poly = ogr.Geometry(ogr.wkbPolygon)
    poly.AddGeometry(box)
    return poly
# Let's see what it does
print(extent)
print(bbox(extent))
(363337.9978, 406749.40699999966, 565178.6085999999, 633117.0013999995)
POLYGON ((363337.9978 565178.6086 0,406749.407 565178.6086 0,406749.407 633117.001399999 0,363337.9978 633117.001399999 0,363337.9978 565178.6086 0))
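It turns out that the coordinate system of the shapefile is crucial: it should be in GCS_WGS_1984 (lat/long), whereas this shapefile is in the projected OSGB 1936 British National Grid, so the extent is reported in metres.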
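As a rough sketch of that conversion, the extent can be reprojected to longitude/latitude before building the query. This assumes the shapefile's CRS corresponds to EPSG:27700 (the usual code for OSGB 1936 / British National Grid) and that pyproj is installed. Neither is stated in the original post.

```python
from pyproj import Transformer

# Extent as returned by lyr.GetExtent(): (min_x, max_x, min_y, max_y) in metres.
extent = (363337.9978, 406749.40699999966, 565178.6085999999, 633117.0013999995)

# always_xy=True keeps the axis order as (x, y) = (longitude, latitude).
# EPSG:27700 is an assumption based on the WKT printed above.
to_wgs84 = Transformer.from_crs("EPSG:27700", "EPSG:4326", always_xy=True)

min_x, max_x, min_y, max_y = extent
corners = [(min_x, min_y), (max_x, min_y), (max_x, max_y), (min_x, max_y)]
lonlat = [to_wgs84.transform(x, y) for x, y in corners]

# Rebuild an extent tuple in the same (min_x, max_x, min_y, max_y) layout.
lons = [lon for lon, _ in lonlat]
lats = [lat for _, lat in lonlat]
extent_wgs84 = (min(lons), max(lons), min(lats), max(lats))
print(extent_wgs84)
```

The resulting tuple has the same layout as the original extent, so it could be passed to the bbox() helper unchanged. Reprojecting the shapefile itself to WGS84 would be an alternative.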

in GeoPandas, select (line string) data within a latitude longitude box defined by user

I have a geopandas dataframe consisting of a combination of LineStrings and MultiLineStrings. I would like to select those LineStrings and MultiLineStrings containing a point within a box (defined by me) of latitude longitude, for which I don't have a geometry. In other words, I have some mapped USGS fault traces and I would like to pick a square inset of those fault lines within a certain distance from some lat/lons. So far I've had some success unwrapping just coordinates from the entire data frame and only saving points that fall within a box of lat/lon, but then I no longer keep the original geometry or information saved in the data frame. (i.e. like this:)
xvals=[]
yvals=[]
for flt in qfaults['geometry']:
    for coord in flt.coords:
        if coord[1] >= centroid[1]-1 and coord[1] <= centroid[1]+1 and coord[0]<=centroid[0]+1 and coord[0]>=centroid[0]-1:
            xvals.append(coord[0])
            yvals.append(coord[1])
Is there any intuition as to how to do this using the GeoPandas data frame? Thanks in advance.
GeoPandas has .cx indexer which works exactly like this. See https://geopandas.readthedocs.io/en/latest/docs/user_guide/indexing.html
Syntax is gdf.cx[xmin:xmax, ymin:ymax]
world = geopandas.read_file(geopandas.datasets.get_path('naturalearth_lowres'))
southern_world = world.cx[:, :0]
western_world = world.cx[:0, :]
western_europe = world.cx[1:10, 40:60]
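Applied to the fault-trace question, the same indexer can take a box built around the centre point. The sketch below uses a hypothetical qfaults frame with three invented lines, since the real USGS data is not available here; the 1 degree half-width mirrors the loop in the question.

```python
import geopandas as gpd
from shapely.geometry import LineString, MultiLineString

# Hypothetical stand-in for the fault traces (lon/lat coordinates).
qfaults = gpd.GeoDataFrame(
    {"name": ["near", "far", "crossing"]},
    geometry=[
        LineString([(0.0, 0.0), (0.5, 0.5)]),
        LineString([(5.0, 5.0), (6.0, 6.0)]),
        MultiLineString([[(2.0, 2.0), (3.0, 3.0)], [(0.9, 0.9), (4.0, 4.0)]]),
    ],
    crs="EPSG:4326",
)

centroid = (0.0, 0.0)  # (lon, lat) of the point of interest
half_width = 1.0       # box half-size in degrees

# .cx keeps every row whose geometry intersects the box; attributes are preserved.
subset = qfaults.cx[
    centroid[0] - half_width : centroid[0] + half_width,
    centroid[1] - half_width : centroid[1] + half_width,
]
print(subset["name"].tolist())
```

Note that .cx returns whole geometries that touch the box rather than cutting them at its edges. If the lines need to be trimmed to the inset, geopandas.clip is one option.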

Which multipolygon does the point of longitude and latitude belong to in Python?

I have the longitude and latitude and the expected result is that whichever multipolygon the point is in, I get the name or ID of the multipolygon.
import geopandas as gpd
world = gpd.read_file('/Control_Areas.shp')
world.plot()
Output
0 MULTIPOLYGON (((-9837042.000 6137048.000, -983...
1 MULTIPOLYGON (((-11583146.000 5695095.000, -11...
2 MULTIPOLYGON (((-8542840.287 4154568.013, -854...
3 MULTIPOLYGON (((-10822667.912 2996855.452, -10...
4 MULTIPOLYGON (((-13050304.061 3865631.027, -13.
Previous attempts:
I have tried fiona, shapely and geopandas to get that done but I have struggled horribly to make progress on this. The closest I have gotten is the within and contains function, but the area of work I have struggled is the transformation of multipolygon to polygon successfully as well and then utilising the power of within and contains to get the desired output.
The shapefile has been downloaded from here.
world.crs gives {'init': 'epsg:3857'} (Web Mercator projection) so you should first reproject your GeoDataFrame in the WGS84 projection if you want to keep the latitude-longitude coordinate system of your point.
world = world.to_crs("EPSG:4326")
Then you can use the intersects method of GeoPandas to find the indexes of the Polygons that contain your point.
For example for the city of New York:
from shapely.geometry import Point
# shapely points are (x, y), i.e. (longitude, latitude)
NY_pnt = Point(-74.005941, 40.712784)
world[["ID","NAME"]][world.intersects(NY_pnt)]
which results in:
ID NAME
20 13501 NEW YORK INDEPENDENT SYSTEM OPERATOR
You can check the result with shapely's within method:
NY_pnt.within(world["geometry"][20])
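One detail worth checking when building the point: shapely's Point takes (x, y), which in a latitude/longitude CRS means (longitude, latitude). The sketch below uses a rough, made-up rectangle in place of the control-area polygon to show how swapping the two values moves the point out of the area.

```python
from shapely.geometry import Point, box

# Rough lon/lat rectangle around New York State (illustrative only).
area = box(-80.0, 40.0, -71.0, 45.5)

lat, lon = 40.712784, -74.005941

lon_lat_point = Point(lon, lat)  # x = longitude, y = latitude
lat_lon_point = Point(lat, lon)  # swapped: x = 40.7, y = -74.0

print(lon_lat_point.within(area))
print(lat_lon_point.within(area))
```

Only the (longitude, latitude) ordering falls inside the rectangle.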
If you have multiple points, you can create a GeoDataFrame and use the sjoin method:
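# shapely points are (x, y), i.e. (longitude, latitude)
NY_pnt = Point(-74.005941, 40.712784)
LA_pnt = Point(-118.243683, 34.052235)
points_df = gpd.GeoDataFrame({'geometry': [NY_pnt, LA_pnt]}, crs='EPSG:4326')
# recent GeoPandas versions use predicate= (older releases used op=)
results = gpd.sjoin(points_df, world, predicate='within')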
results[['ID', 'NAME']]
Output:
ID NAME
0 13501 NEW YORK INDEPENDENT SYSTEM OPERATOR
1 11208 LOS ANGELES DEPARTMENT OF WATER AND POWER

geopandas sjoin returning empty rows

I have a table of polygons of all UK output areas structured as such:
newpoly
OBJECTID OA11CD LAD11CD Shape__Are Shape__Len TCITY15NM geometry
67519 67520 E00069658 E06000018 3.396296e+04 1006.464423 Nottingham POLYGON ((456069.067 340766.874, 456057.000 34...
67520 67521 E00069659 E06000018 1.014138e+05 1404.327776 Nottingham POLYGON ((456691.549 340778.104, 456557.864 34...
67521 67522 E00069660 E06000018 1.812783e+04 731.882609 Nottingham POLYGON ((456945.994 340821.233, 456969.220 34...
67522 67523 E00069661 E06000018 2.765546e+04 1112.317587 Nottingham POLYGON ((456527.178 340669.119, 456484.993 34...
67523 67524 E00069662 E06000018 3.647822e+04 964.989153 Nottingham POLYGON ((456301.845 340419.759, 456244.357 34...
and a table of points structured like:
restaurants
name latitude longitude geometry
0 Restaurant Sat Bains with rooms 52.925050 -1.167712 POINT (-1.16771 52.92505)
1 Revolution Hockley 52.954090 -1.144025 POINT (-1.14403 52.95409)
2 Revolution Cornerhouse 52.955517 -1.150088 POINT (-1.15009 52.95552)
But when I do:
spatial_join = gpd.sjoin(restaurants, newpoly, op = 'contains')
spatial_join
0 rows match.
the geometry column of the restaurants were made via:
restaurants = pd.read_csv('Restaurants_clean.csv')
restaurants = gpd.GeoDataFrame(
restaurants, geometry=gpd.points_from_xy(restaurants.longitude, restaurants.latitude))
I have tried different 'op' arguments but the same problem occurs. I am convinced that there must be a match because all UK output areas exist in the table.
Am I missing something?
You are using different projections. I am sure GeoPandas sjoin actually warns you about that. Create your point layer in the following way:
restaurants = pd.read_csv('Restaurants_clean.csv')
restaurants = gpd.GeoDataFrame(
restaurants,
geometry=gpd.points_from_xy(restaurants.longitude, restaurants.latitude),
crs=4326)
restaurants = restaurants.to_crs(newpoly.crs)
I am first specifying the CRS of the input (as 4326, which is the EPSG code of WGS84, i.e. lon/lat coordinates) and then I am re-projecting the data to the same CRS that newpoly has (I assume 27700).
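A small self-contained check of this, using one restaurant from the question and a hypothetical rectangle in place of the real output areas (the rectangle and its EPSG:27700 CRS are assumptions). It also tries the "contains" predicate: with points on the left side of the join, "contains" asks whether a point contains a polygon, so "within" (or "intersects") is the predicate that can match here.

```python
import warnings

import geopandas as gpd
from shapely.geometry import Point, box

# Hypothetical output-area polygon in British National Grid metres.
newpoly = gpd.GeoDataFrame(
    {"OA11CD": ["demo"]},
    geometry=[box(450000, 335000, 465000, 345000)],
    crs="EPSG:27700",
)

# One restaurant from the question, built from lon/lat without a CRS.
restaurants = gpd.GeoDataFrame(
    {"name": ["Revolution Hockley"]},
    geometry=[Point(-1.144025, 52.954090)],
)

with warnings.catch_warnings():
    warnings.simplefilter("ignore")  # geopandas warns about the CRS mismatch
    before = gpd.sjoin(restaurants, newpoly, predicate="within")

# Declare the input CRS, then reproject to match the polygons.
restaurants = restaurants.set_crs(4326).to_crs(newpoly.crs)
after = gpd.sjoin(restaurants, newpoly, predicate="within")
contains = gpd.sjoin(restaurants, newpoly, predicate="contains")

print(len(before), len(after), len(contains))
```

The join should be empty before reprojection, return the restaurant afterwards, and stay empty for "contains" even once the CRS matches.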
