The task is to make an address popularity map for Moscow. Basically, it should look like this:
https://nbviewer.jupyter.org/github/python-visualization/folium/blob/master/examples/GeoJSON_and_choropleth.ipynb
For my map I use public geojson: http://gis-lab.info/qa/moscow-atd.html
The only data I have is point coordinates; there is no information about the district each point belongs to.
Question 1:
Do I have to manually check, for each district, whether a point belongs to it, or is there a more efficient way to do this?
Question 2:
If there is no easier way, how can I get all the coordinates for each district from the geojson file (link above)?
import pandas as pd
import numpy as np
import geopandas as gpd
from shapely.geometry import Point
# Read the Moscow area shapefile with geopandas
districts = gpd.read_file('mo-shape/mo.shp')
# Construct a mock user dataset
moscow = [55.7, 37.6]
data = (
    np.random.normal(size=(100, 2)) *
    np.array([[.25, .25]]) +
    np.array([moscow])
)
my_df = pd.DataFrame(data, columns=['lat', 'lon'])
my_df['pop'] = np.random.randint(500, 100000, size=len(data))
# Create Point objects from the user data
geom = [Point(x, y) for x, y in zip(my_df['lon'], my_df['lat'])]
# and a geopandas dataframe using the same crs from the shape file
my_gdf = gpd.GeoDataFrame(my_df, geometry=geom)
my_gdf.crs = districts.crs
# Then join, using the default how='inner'
# (predicate= replaces the older op= keyword in geopandas 0.10+)
gpd.sjoin(districts, my_gdf, predicate='contains')
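For Question 2, the district outlines can also be pulled straight out of the GeoJSON with the standard json module. This is only a minimal sketch: the inline sample stands in for the real file, and the "NAME" property key is an assumption, so check which property actually holds the district name in your data.

```python
import json

# Tiny stand-in for the districts file; with real data you would use
# json.load() on the opened file. The "NAME" key is an assumption.
sample = json.loads("""
{"type": "FeatureCollection", "features": [
  {"type": "Feature", "properties": {"NAME": "A"},
   "geometry": {"type": "Polygon",
                "coordinates": [[[37.5, 55.7], [37.6, 55.7], [37.6, 55.8], [37.5, 55.7]]]}},
  {"type": "Feature", "properties": {"NAME": "B"},
   "geometry": {"type": "MultiPolygon",
                "coordinates": [[[[37.7, 55.7], [37.8, 55.7], [37.8, 55.8], [37.7, 55.7]]]]}}
]}
""")

def outer_rings(feature_collection, name_key="NAME"):
    """Map each district name to a list of outer rings of (lon, lat) tuples."""
    rings = {}
    for feature in feature_collection["features"]:
        geom = feature["geometry"]
        if geom["type"] == "Polygon":
            polygons = [geom["coordinates"]]
        elif geom["type"] == "MultiPolygon":
            polygons = geom["coordinates"]
        else:
            continue  # skip points, lines, etc.
        name = feature["properties"][name_key]
        # polygon[0] is the outer ring; any later rings are holes
        rings[name] = [[tuple(pt) for pt in polygon[0]] for polygon in polygons]
    return rings

district_rings = outer_rings(sample)
```

Note that GeoJSON stores positions as (longitude, latitude), so the tuples come out in x, y order.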
Thanks to @BobHaffner, I tried to solve the problem using geopandas.
Here are my steps:
I downloaded the shapefiles for Moscow.
From a list of tuples containing x and y (longitude and latitude) coordinates I created a list of Points.
Assuming the dataframe read from the shapefile contains polygons, I can write a simple loop that checks whether each Point is inside a polygon.
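For reference, the "simple loop" idea can be sketched without any library using the classic ray-casting test. This is a hedged illustration, not what shapely does internally: it treats coordinates as planar, ignores holes, and the square and sample points are made up.

```python
def point_in_polygon(lon, lat, ring):
    """Ray casting: True if (lon, lat) lies inside the ring of (lon, lat) vertices."""
    inside = False
    n = len(ring)
    for i in range(n):
        x1, y1 = ring[i]
        x2, y2 = ring[(i + 1) % n]
        # Does this edge straddle the horizontal line through the point?
        if (y1 > lat) != (y2 > lat):
            # Longitude at which the edge crosses that line
            x_cross = x1 + (lat - y1) * (x2 - x1) / (y2 - y1)
            if lon < x_cross:
                inside = not inside
    return inside

# Made-up square around Moscow and two test points (lon, lat)
square = [(37.0, 55.0), (38.0, 55.0), (38.0, 56.0), (37.0, 56.0)]
points = [(37.6, 55.7), (39.0, 55.7)]
flags = [point_in_polygon(lon, lat, square) for lon, lat in points]
```

Here the first point falls inside the square and the second does not.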
I am new to Python, so I apologize for the rudimentary programming skills. I am aware I am using too many "for" loops (coming from Matlab, it is dragging me down).
I have millions of points (timestep, long, lat, pointID) and hundreds of irregular non-overlapping polygons (vertex_long, vertex_lat, polygonID).
I want to know what polygon contains each point.
I was able to do it this way:
from matplotlib import path
def inpolygon(lon_point, lat_point, lon_poly, lat_poly):
    shape = lon_point.shape
    lon_point = lon_point.reshape(-1)
    lat_point = lat_point.reshape(-1)
    lon_poly = lon_poly.values.reshape(-1)
    lat_poly = lat_poly.values.reshape(-1)
    points = [(lon_point[i], lat_point[i]) for i in range(lon_point.shape[0])]
    polys = path.Path([(lon_poly[i], lat_poly[i]) for i in range(lon_poly.shape[0])])
    return polys.contains_points(points).reshape(shape)
And then
import numpy as np
import pandas as pd
Areas_Lon = Areas.iloc[:,0]
Areas_Lat = Areas.iloc[:,1]
Areas_ID = Areas.iloc[:,2]
Unique_Areas = np.unique(Areas_ID)
Areas_true=np.zeros((Areas_ID.shape[0],Unique_Areas.shape[0]))
for i in range(Areas_ID.shape[0]):
    for ii in range(Unique_Areas.shape[0]):
        Areas_true[i,ii]=(Areas_ID[i]==Unique_Areas[ii])
Areas_Lon_Vertex=np.zeros(Unique_Areas.shape[0],dtype=object)
Areas_Lat_Vertex=np.zeros(Unique_Areas.shape[0],dtype=object)
for i in range(Unique_Areas.shape[0]):
    Areas_Lon_Vertex[i]=(Areas_Lon[(Areas_true[:,i]==1)])
    Areas_Lat_Vertex[i]=(Areas_Lat[(Areas_true[:,i]==1)])
import f_inpolygon as inpolygon
Areas_in=np.zeros((Unique_Areas.shape[0],Points.shape[0]))
for i in range (Unique_Areas.shape[0]):
    # loop over all points (Points is the array the result is sized by)
    for ii in range (Points.shape[0]):
        Areas_in[i,ii]=(inpolygon.inpolygon(Points[ii,2], Points[ii,3], Areas_Lon_Vertex[i], Areas_Lat_Vertex[i]))
This way the final outcome, Areas_in, contains as many rows as polygons and as many columns as points; each column is 1 (true) in the row of the polygon that contains the point (1st given polygon ID --> 1st row, and so on).
The code works, but very slowly for what it is supposed to do. When locating points in a regular grid or within a radius of a point I have successfully used a KD-tree, which increases the speed dramatically, but I can't do the same, or anything faster, for irregular non-overlapping polygons.
I have seen some related questions, but they asked whether a point is inside a given polygon rather than which polygon contains a point.
Any idea please?
Have you tried a GeoPandas spatial join?
Install the package using pip
pip install geopandas
or conda
conda install -c conda-forge geopandas
Then you should be able to read the data as a GeoDataFrame
import geopandas
from shapely.geometry import Point
df = geopandas.read_file("file_name1.csv") # you can read shp files too.
right_df = geopandas.read_file("file_name2.csv") # you can read shp files too.
# Convert into geometry column
# (cast to float in case the CSV columns were read as strings)
geometry = [Point(xy) for xy in zip(df['longitude'].astype(float), df['latitude'].astype(float))]
# Coordinate reference system: WGS84
crs = "EPSG:4326"
# Creating a Geographic data frame
left_df = geopandas.GeoDataFrame(df, crs=crs, geometry=geometry)
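Then you can apply sjoin
jdf = geopandas.sjoin(left_df, right_df, how='inner', predicate='intersects', lsuffix='left', rsuffix='right')
The options for predicate (called op in older geopandas releases) include:
intersects
contains
within
With points in the left frame and polygons in the right one, intersects and within give the same matches for points strictly inside a polygon; contains would only match if the polygons were the left frame.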
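Part of the reason a spatial join beats the double loop is that most point/polygon pairs can be rejected cheaply before any exact test runs; as far as I know, geopandas does this with a spatial index over bounding boxes. Below is a dependency-free sketch of just that prefilter idea. It scans the boxes linearly instead of using a real index, and all names and coordinates are made up for illustration.

```python
def bounding_box(ring):
    """Return (min_lon, min_lat, max_lon, max_lat) for a list of (lon, lat) vertices."""
    lons = [lon for lon, _ in ring]
    lats = [lat for _, lat in ring]
    return min(lons), min(lats), max(lons), max(lats)

def candidate_polygons(point, boxes):
    """IDs of polygons whose bounding box contains the point."""
    lon, lat = point
    return [
        poly_id
        for poly_id, (min_lon, min_lat, max_lon, max_lat) in boxes.items()
        if min_lon <= lon <= max_lon and min_lat <= lat <= max_lat
    ]

# Two made-up triangles keyed by a hypothetical polygon ID
polygons = {
    "west": [(0.0, 0.0), (1.0, 0.0), (0.5, 1.0)],
    "east": [(2.0, 0.0), (3.0, 0.0), (2.5, 1.0)],
}
# Boxes are computed once, then reused for every point
boxes = {poly_id: bounding_box(ring) for poly_id, ring in polygons.items()}

test_points = [(0.5, 0.5), (2.9, 0.9), (5.0, 5.0)]
candidates = {pt: candidate_polygons(pt, boxes) for pt in test_points}
```

A candidate is not yet a match: (2.9, 0.9) lies in the box of "east" but outside the triangle itself, so the exact point-in-polygon test still has to run, only on far fewer pairs.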
I am new to shapefiles and mapping in python so I was hoping to get some help with overlaying data points from a shapefile on a density map.
To be honest, I am a beginner with mapping and reading in shapefiles, so what I have so far is not much.
I have started off using pyshp but if there are better packages out there to do this then I would love any feedback.
The following code is to create the base map of the LA area:
def get_base_map(rides_clean):
    # folium.Map takes `location` (singular)
    return folium.Map(location=[rides_clean.start_lat.mean(),
                                rides_clean.start_lon.mean()],
                      zoom_start=20, tiles='cartodbpositron')
The following code is to create the density/heat map:
from folium import plugins
# .as_matrix() was removed from pandas; .to_numpy() is the replacement
stationArr = rides_clean[['start_lat', 'start_lon']][:40000].to_numpy()
get_base_map(rides_clean).add_child(plugins.HeatMap(stationArr,
                                                    radius=40, max_val=300))
The following code is the same heat map but with route lines added:
draw_route_lines(get_base_map(rides_clean), routedf_vol).add_child(
    plugins.HeatMap(stationArr, radius=40, max_val=300))
I want to see data points from the shapefile shown as markers on top of the density plot.
It is possible to do this with pyshp. I've only ever used Matplotlib to plot shapefile points on a map, but this method will create two arrays which will be the x and y coordinates of each point you'd like to plot. The first snippet is used if you have multiple shapes in your shapefile, while the second can be used if you only have one shape.
import shapefile
import numpy as np
sf = shapefile.Reader('/path/to/shapefile')
point_list = []
for shape in sf.shapes():
    # points is an attribute holding the shape's (x, y) pairs
    point_list.extend(shape.points)
point_list = np.array(point_list)
x = point_list[:,0]
y = point_list[:,1]
And for a shapefile with only a single shape:
import shapefile
import numpy as np
sf = shapefile.Reader('/path/to/shapefile')
point_list = np.array(sf.shape(0).points)
x = point_list[:,0]
y = point_list[:,1]
You can tell how many shapes are in your shapefile with len(sf.shapes()); sf.shapes() itself returns a list of all the shapes. From your question it appeared you wanted to plot the data as point markers rather than lines; sorry if this is not the case.
I want to create a visualization on a map using folium. In the map I want to observe how many items are related to a particular geographical point building a heatmap. Below is the code I'm using.
import pandas as pd
import folium
from folium import plugins
data = [[41.895278,12.482222,2873494.0,20.243001,20414,7.104243],
[41.883850,12.333330,3916.0,0.835251,4,1.021450],
[41.854241,12.567000,22263.0,1.132390,35,1.572115],
[41.902147,12.590388,19505.0,0.839181,37,1.896950],
[41.994240,12.48520,16239.0,1.383981,25,1.539504]]
df = pd.DataFrame(columns=['latitude','longitude','population','radius','count','normalized'],data=data)
middle_lat = df['latitude'].median()
middle_lon = df['longitude'].median()
m = folium.Map(location=[middle_lat, middle_lon],tiles = "Stamen Terrain",zoom_start=11)
# convert to an (n, 3) nd-array (lat, lon, weight) for the heatmap
points = df[['latitude', 'longitude', 'normalized']].dropna().values
# plot heatmap
plugins.HeatMap(points, radius=15).add_to(m)
m.save(outfile='map.html')
In the resulting map, each point has the same radius. Instead, I want to create a heatmap in which each point's radius is proportional to the radius of the city it belongs to. I already tried passing the radii as a list, and also passing the values in a for loop, but neither works.
Any idea?
You need to add the points one after another, so that you can specify the radius for each point. Like this:
import numpy
pointArrays = numpy.split(points, len(points))
radii = [5, 10, 15, 20, 25]
for point, radius in zip(pointArrays, radii):
    plugins.HeatMap(point, radius=radius).add_to(m)
m.save(outfile='map.html')
Here you can see, each point has a different size.
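The radii list above is hard-coded. To make it follow the radius column from the question, one option is a simple min-max scaling into a pixel range. This is only a sketch: the 5 to 25 pixel range and the helper name scale_radii are my own choices, not anything folium prescribes.

```python
def scale_radii(values, min_px=5, max_px=25):
    """Min-max scale values into [min_px, max_px] and round to whole pixels."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [min_px] * len(values)  # all equal: avoid dividing by zero
    span = hi - lo
    return [round(min_px + (v - lo) / span * (max_px - min_px)) for v in values]

# The 'radius' column from the question's data
city_radius = [20.243001, 0.835251, 1.132390, 0.839181, 1.383981]
radii = scale_radii(city_radius)
```

The resulting list can replace the hard-coded radii in the loop. Because one city is much larger than the rest, linear scaling pushes the others close to the minimum; scaling the square root or logarithm of the values first may give a more readable map.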
I have created a GeoDataFrame using the code below, and would like to create a Folium map that plots the population estimate (pop_est) for each country. Do I have to create the JSON file, or can I use the GeoDataFrame directly?
import folium
import fiona
import geopandas as gpd
world = fiona.open(gpd.datasets.get_path('naturalearth_lowres'))
world = gpd.GeoDataFrame.from_features([feature for feature in world])
world = world[(world.pop_est > 0) & (world.name != "Antarctica")]
I used folium.map and geojson function, but it failed to create correct JSON files.
Thanks for the help!
The m.choropleth() code in @joris's answer is now deprecated. The following code produces the same result using the newer folium.Choropleth class:
m = folium.Map()
folium.Choropleth(world, data=world,
                  key_on='feature.properties.name',
                  columns=['name', 'pop_est'],
                  fill_color='YlOrBr').add_to(m)
folium.LayerControl().add_to(m)
m
In recent releases of folium, you don't need to convert the GeoDataFrame to geojson, but you can pass it directly. Connecting the population column to color the polygons is still somewhat tricky to get correct:
m = folium.Map()
m.choropleth(world, data=world, key_on='feature.properties.name',
             columns=['name', 'pop_est'], fill_color='YlOrBr')
m
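The tricky part is mostly the pairing of key_on with columns: key_on is a dotted path into each GeoJSON feature, and the first entry of columns names the data column whose values must equal what that path resolves to. The sketch below imitates that matching in plain Python so mismatches are easy to spot. It is a rough model of the idea, not folium's actual code, and the features and numbers are made up.

```python
def resolve_key(feature, key_on):
    """Follow a dotted path such as 'feature.properties.name' into a GeoJSON feature."""
    parts = key_on.split(".")
    if parts[0] != "feature":
        raise ValueError("key_on is expected to start with 'feature'")
    value = feature
    for part in parts[1:]:
        value = value[part]
    return value

# Made-up features and (key, value) rows, standing in for the
# geometries and for the columns=['name', 'pop_est'] data
features = [
    {"type": "Feature", "properties": {"name": "A"}, "geometry": None},
    {"type": "Feature", "properties": {"name": "C"}, "geometry": None},
]
rows = [("A", 10), ("B", 20)]

lookup = dict(rows)
matched = {}
unmatched = []
for feature in features:
    key = resolve_key(feature, "feature.properties.name")
    if key in lookup:
        matched[key] = lookup[key]
    else:
        unmatched.append(key)  # such polygons would get no data value
```

Running a check like this on your own names before plotting shows which polygons would end up without a colour because the key spelling differs between the geometry and the data.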
I have a pandas Dataframe with a few million rows, each with an X and Y attribute with their location in kilometres according to the WGS 1984 World Mercator projection (created using ArcGIS).
What is the easiest way to project these points back to degrees, without leaving the Python/pandas environment?
There is already a Python module called pyproj that can do these kinds of transformations for you. I agree it is not the easiest module to find via Google.
Many years later, this is how I would do this. Keeping everything in GeoPandas to minimise the possibility of footguns.
Some imports:
import pandas as pd
import geopandas as gpd
from shapely.geometry import Point
Create a dataframe (note the values must be in metres!)
df = pd.DataFrame({"X": [50e3, 900e3], "Y": [20e3, 900e3]})
Create geometries from the X/Y coordinates
df["geometry"] = df.apply(lambda row: Point(row.X, row.Y), axis=1)
Convert to a GeoDataFrame, setting the current CRS.
In this case EPSG:3857, the projection from the question.
gdf = gpd.GeoDataFrame(df, crs=3857)
Project it to the standard WGS84 CRS in degrees (EPSG:4326).
gdf = gdf.to_crs(4326)
And then (optionally), extract the X/Y coordinates in degrees back into standard columns:
gdf["X_deg"] = gdf.geometry.apply(lambda p: p.x)
gdf["Y_deg"] = gdf.geometry.apply(lambda p: p.y)
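As a dependency-free cross-check, the inverse of the spherical Web Mercator projection (EPSG:3857, the CRS used above) can be written out by hand. This sketch assumes inputs in metres, so values in kilometres need multiplying by 1000 first; an ellipsoidal Mercator variant would need a different inverse, so this is only a sanity check for EPSG:3857.

```python
import math

EARTH_RADIUS_M = 6378137.0  # sphere radius used by EPSG:3857

def web_mercator_to_degrees(x_m, y_m):
    """Convert EPSG:3857 metres to (longitude, latitude) in degrees."""
    lon = math.degrees(x_m / EARTH_RADIUS_M)
    lat = math.degrees(2.0 * math.atan(math.exp(y_m / EARTH_RADIUS_M)) - math.pi / 2.0)
    return lon, lat

# The origin maps to (0, 0); the second point is the first row
# of the example dataframe above (X=50e3, Y=20e3)
lon0, lat0 = web_mercator_to_degrees(0.0, 0.0)
lon1, lat1 = web_mercator_to_degrees(50e3, 20e3)
```

The values should agree closely with what gdf.to_crs(4326) returns for the same rows.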