I am learning how to work with spatial data in Python at the moment and have run into a problem: I'm trying to overlay some gridded data onto a Basemap instance, but the projections don't seem to match up. Could anyone please tell me what I've done wrong?
I have been assuming that lon/lat positions are absolute, and the projection should be handled by the basemap instance - is this not the case? Basically, is my error in the grid coordinate conversion stage, or the plotting stage?
Thanks very much in advance!
My example:
First, download the gridded data.
import numpy as np
import pyproj
import matplotlib.pyplot as plt
from mpl_toolkits.basemap import Basemap
# Import the gridded data I want to plot
# NB: the header describes the grid layout: LL corner is -200000 eastings, -200000 northings, in the British National Grid (BNG, EPSG:27700) system. The 'no data' value is -9999
grid = np.loadtxt('grid_data.txt', skiprows = 6, dtype = 'float')
grid = np.ma.masked_where(grid<0,grid)
# Take a look at the grid: it's a map of the UK!
plt.imshow(grid)
# Define a BNG coordinate grid for the data, based on information in the header:
# LLcorner eastings = -200000
# LLcorner northings = -200000
# grid box size = 5000
# All units in metres
lle = -200000 + 2500 # because we want to plot the box point at the centre of each box, not its lower left corner.
lln = -200000 + 2500
delta = 5000
nn,ne = grid.shape
ns = np.arange(0.,nn,1)*delta + lln
es = np.arange(0.,ne,1)*delta + lle
# convert to mesh of eastings/northings.
eastings,northings = np.meshgrid(es,ns)
# convert mesh to lon/lat (wgs84) using pyproj:
bng = pyproj.Proj(init='epsg:27700')
wgs84 = pyproj.Proj(init='epsg:4326')
lons,lats = pyproj.transform(bng,wgs84,eastings,northings)
lats = lats[::-1] # because otherwise the map is upside down (differences between imshow and pcolormesh)
# Get a basemap of the UK
m = Basemap(projection='merc',llcrnrlon=-15 ,llcrnrlat=47.8,urcrnrlon=7.2 ,urcrnrlat=62.5,resolution='h')
# project the lon/lat grids onto map coordinates
x,y = m(lons,lats)
# Draw the map!
m.drawcoastlines()
m.pcolormesh(x,y,grid,cmap=plt.cm.jet)
plt.show()
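One thing worth checking in the snippet above, offered as a guess rather than a confirmed diagnosis: reversing only `lats` breaks the pairing between each longitude and its latitude, because the lon/lat mesh derived from BNG is curvilinear. A minimal sketch of the alternative, flipping the data rows instead of the coordinates, assuming the text file stores its rows from north to south (the small `grid` below is made up for illustration):

```python
import numpy as np

# Hypothetical 3x4 grid standing in for the loaded file; rows are assumed
# to run north -> south, as is common for ASCII grid files.
grid = np.arange(12, dtype=float).reshape(3, 4)

lle = -200000 + 2500   # centre of the lower-left cell (eastings)
lln = -200000 + 2500   # centre of the lower-left cell (northings)
delta = 5000

nn, ne = grid.shape
ns = np.arange(nn) * delta + lln   # northings, south -> north
es = np.arange(ne) * delta + lle   # eastings, west -> east
eastings, northings = np.meshgrid(es, ns)

# Flip the data, not the coordinates, so row 0 matches the smallest northing.
grid_south_first = np.flipud(grid)
```

The eastings/northings mesh can then be converted to lon/lat and passed to `m.pcolormesh` together with `grid_south_first`, with no `lats[::-1]` step.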
I followed this excellent guide by Adam Symington and successfully created the following topographic map of Sabah (a state in Malaysia, which is a Southeast Asian nation). The awkward blob of black in the upper left corner is my attempt to plot certain coordinates on the map.
I would like to improve this diagram in the following ways:
EDIT: I have figured item (1) out and posted the solution below. (2) and (3) pending.
(1) [SOLVED] The sch dataframe contains coordinates of all schools in the state. I would like to plot these on the map. I suspect that it is currently going wonky because the axes are not "geo-axes" (meaning, not using lat/lon scales) - you can confirm this by setting ax.axis('on'). How do I get around this?
(2) I'd like to set the portion outside the actual territory to white. Calling ax.set_facecolor('white') isn't working. I know that the specific thing setting it to grey is the ax.imshow(hillshade, cmap='Greys', alpha=0.3) line (because changing the cmap changes the background); I just don't know how to alter it while keeping the color within the map as grey.
(3) If possible, I'd like the outline of the map to be black, but this is just pedantic.
All code to reproduce the diagram above is below. The downloadSrc function gets and saves the dependencies (a 5.7MB binary file containing the topographic data and a 0.05MB csv containing the coordinates of points to plot) in a local folder; you need only run that once.
import rasterio
from rasterio import mask as msk
import matplotlib.pyplot as plt
from matplotlib.colors import LinearSegmentedColormap
from matplotlib.colors import ListedColormap
import numpy as np
import pandas as pd
import geopandas as gpd
import earthpy.spatial as es
from shapely.geometry import Point
def downloadSrc(dl=1):
    if dl == 1:
        import os
        import requests
        # create the target folder; do not fail if it already exists
        os.makedirs('sabah', exist_ok=True)
        r = requests.get('https://raw.githubusercontent.com/Thevesh/Display/master/sabah_tiff.npy')
        with open('sabah/sabah_topog.npy', 'wb') as f:
            f.write(r.content)
        df = pd.read_csv('https://raw.githubusercontent.com/Thevesh/Display/master/schools.csv')
        df.to_csv('sabah/sabah_schools.csv')
# Set dl = 0 after first run; the files will be in your current working directory + /sabah
downloadSrc(dl=1)
# Load topography of Sabah, pre-saved from clipped tiff file (as per Adam Symington guide)
value_range = 4049
sabah_topography = np.load('sabah/sabah_topog.npy')
# Load coordinates of schools in Sabah
crs = 'EPSG:4326'  # the {'init': ...} dict form is deprecated in recent pyproj/geopandas
sch = pd.read_csv('sabah/sabah_schools.csv',usecols=['lat','lon'])
geometry = [Point(xy) for xy in zip(sch.lon, sch.lat)]
schools = gpd.GeoDataFrame(sch, crs=crs, geometry=geometry)
# Replicated directly from guide, with own modifications only to colours
sabah_colormap = LinearSegmentedColormap.from_list('sabah', ['lightgray', '#e6757b', '#CD212A', '#CD212A'], N=value_range)
background_color = np.array([1,1,1,1])
newcolors = sabah_colormap(np.linspace(0, 1, value_range))
newcolors = np.vstack((newcolors, background_color))
sabah_colormap = ListedColormap(newcolors)
hillshade = es.hillshade(sabah_topography[0], azimuth=180, altitude=1)
# Plot
plt.rcParams["figure.figsize"] = [5,5]
plt.rcParams["figure.autolayout"] = True
fig, ax = plt.subplots()
ax.imshow(sabah_topography[0], cmap=sabah_colormap)
ax.imshow(hillshade, cmap='Greys', alpha=0.3)
schools.plot(color='black', marker='x', markersize=10,ax=ax)
ax.axis('off')
plt.show()
As it turns out, I had given myself the hint to answering point (1), and also managed to solve (2).
For (1), the points simply needed to be rescaled, and we get this:
I did so by getting the max/min points of the map from the underlying shapefile, and then scaling it based on the max/min points of the axes, as follows:
# Get limit points
l = gpd.read_file('param_geo/sabah.shp')['geometry'].bounds
lat_min,lat_max,lon_min,lon_max = l['miny'].iloc[0], l['maxy'].iloc[0], l['minx'].iloc[0], l['maxx'].iloc[0]
xmin,xmax = ax.get_xlim()
ymin,ymax = ax.get_ylim()
# Load coordinates of schools in Sabah and rescale
crs = 'EPSG:4326'  # the {'init': ...} dict form is deprecated in recent pyproj/geopandas
sch = pd.read_csv('sabah/sabah_schools.csv',usecols=['lat','lon'])
sch.lat = ymin + (sch.lat - lat_min)/(lat_max - lat_min) * (ymax - ymin)
sch.lon = xmin + (sch.lon - lon_min)/(lon_max - lon_min) * (xmax - xmin)
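The rescaling above is a plain linear map from the geographic bounding box to the pixel extent of the axes. A minimal sketch of it as a reusable function; the name `rescale` and the sample numbers are made up for illustration, and it assumes the raster covers exactly the shapefile's bounding box:

```python
import numpy as np

def rescale(values, src_min, src_max, dst_min, dst_max):
    """Linearly map values from [src_min, src_max] onto [dst_min, dst_max]."""
    values = np.asarray(values, dtype=float)
    return dst_min + (values - src_min) / (src_max - src_min) * (dst_max - dst_min)

# Hypothetical numbers: latitudes between 4 and 8 degrees, image 400 pixels tall.
# With imshow, ax.get_ylim() returns (bottom, top) = (399.5, -0.5), so the
# y axis is inverted and northern points land near the top of the image.
rows = rescale([4.0, 6.0, 8.0], 4.0, 8.0, 399.5, -0.5)
```

Because the inverted y limits are passed through unchanged, no separate flip of the latitude axis is needed.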
For (2), the grey background is coming from the fact that the hillshade array has values outside the map area which are being mapped to grey. To remove the grey, we need to nullify these values.
In this specific case, we can take advantage of the fact that the top right corner of this map lies "outside" the territory (every country will have at least one such corner, because no country is a perfect rectangle):
top_right = hillshade[0,-1]
hillshade[hillshade == top_right] = np.nan
And voilà, a beautiful white background:
For (3), I suspect it requires us to rescale the Polygon from the shapefile in a manner similar to how we rescaled the coordinates.
I want to do spatial binning (using the median as the aggregation function), starting from a CSV file containing pollutant values measured at positions long and lat.
The resulting map should look something like this:
But with data covering a city's extent.
In this regard I found this tutorial, which is close to what I want to do, but I was not able to get the desired result.
I think I'm missing something about how to correctly use dissolve and plot the resulting data (preferably with Folium).
Is there any useful example code?
You have not provided sample data, so I have used global earthquakes as the set of points and the geometry of California as the scope / extent.
It's simple to create a grid using shapely.geometry.box().
I have shown the use of median, plus another aggfunc to demonstrate that multiple metrics can be calculated.
I have used folium to plot. This feature is new in geopandas 0.10.0: https://geopandas.org/en/stable/docs/user_guide/interactive_mapping.html
import geopandas as gpd
import shapely.geometry
import numpy as np
# equivalent of CSV, all earthquake points globally
gdf_e = gpd.read_file(
"https://earthquake.usgs.gov/earthquakes/feed/v1.0/summary/all_month.geojson"
)
# get geometry of bounding area. Have selected a state rather than a city
gdf_CA = gpd.read_file(
"https://raw.githubusercontent.com/glynnbird/usstatesgeojson/master/california.geojson"
).loc[:, ["geometry"]]
BOXES = 50
a, b, c, d = gdf_CA.total_bounds
# create a grid for California, could be a city
gdf_grid = gpd.GeoDataFrame(
geometry=[
shapely.geometry.box(minx, miny, maxx, maxy)
for minx, maxx in zip(np.linspace(a, c, BOXES), np.linspace(a, c, BOXES)[1:])
for miny, maxy in zip(np.linspace(b, d, BOXES), np.linspace(b, d, BOXES)[1:])
],
crs="epsg:4326",
)
# remove grid boxes created outside actual geometry
gdf_grid = gdf_grid.sjoin(gdf_CA).drop(columns="index_right")
# get earthquakes that have occurred within one of the grid geometries
gdf_e_CA = gdf_e.loc[:, ["geometry", "mag"]].sjoin(gdf_grid)
# get median magnitude of earthquakes in grid
gdf_grid = gdf_grid.join(
gdf_e_CA.dissolve(by="index_right", aggfunc="median").drop(columns="geometry")
)
# how many earthquakes in the grid
gdf_grid = gdf_grid.join(
gdf_e_CA.dissolve(by="index_right", aggfunc=lambda d: len(d))
.drop(columns="geometry")
.rename(columns={"mag": "number"})
)
# drop grids geometries that have no measures and create folium map
m = gdf_grid.dropna().explore(column="mag")
# for good measure - boundary on map too
gdf_CA["geometry"].apply(lambda g: shapely.geometry.MultiLineString([p.exterior for p in g.geoms])).explore(m=m)
Thanks to @Rob Raymond, I finally solved it with the following code:
import pandas as pd
import geopandas as gpd
import pyproj
import matplotlib.pyplot as plt
import numpy as np
import shapely
from folium import plugins
df=pd.read_csv('../Desktop/test_esri.csv')
gdf_monica = gpd.GeoDataFrame(
df, geometry=gpd.points_from_xy(df.long, df.lat))
gdf_monica=gdf_monica.set_crs('epsg:4326')
gdf_area = gpd.read_file('https://raw.githubusercontent.com/openpolis/geojson-italy/master/geojson/limits_IT_municipalities.geojson')#.loc[:, ["geometry"]]
gdf_area =gdf_area[gdf_area['name']=='Portici'].loc[:,['geometry']]
BOXES = 50
a, b, c, d = gdf_area.total_bounds
gdf_grid = gpd.GeoDataFrame(
geometry=[
shapely.geometry.box(minx, miny, maxx, maxy)
for minx, maxx in zip(np.linspace(a, c, BOXES), np.linspace(a, c, BOXES)[1:])
for miny, maxy in zip(np.linspace(b, d, BOXES), np.linspace(b, d, BOXES)[1:])
],
crs="epsg:4326",
)
# remove grid boxes created outside actual geometry
gdf_grid = gdf_grid.sjoin(gdf_area).drop(columns="index_right")
gdf_monica_binned = gdf_monica.loc[:, ["geometry", "CO"]].sjoin(gdf_grid)
# get median magnitude of CO pollutant
gdf_grid = gdf_grid.join(
gdf_monica_binned.dissolve(by="index_right", aggfunc="median").drop(columns="geometry")
)
# how many CO measurements in each grid cell
gdf_grid = gdf_grid.join(
gdf_monica_binned.dissolve(by="index_right", aggfunc=lambda d: len(d))
.drop(columns="geometry")
.rename(columns={"CO": "number"})
)
# drop grids geometries that have no measures and create folium map
m = gdf_grid.dropna().explore(column="CO")
# for good measure - boundary on map too
gdf_area["geometry"].apply(lambda g: shapely.geometry.MultiLineString([p.exterior for p in g.geoms])).explore(m=m)
which produces:
As you can understand, I have little or no knowledge regarding spatial analysis. I was not able to get correct results without using geojson data that describe a geometry within which the points of interest fall.
If anyone could add more insights... thanks!
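For cases where no boundary geometry is at hand, the same idea (assign each point to a grid cell, then take the median per cell) can be sketched with NumPy alone. This is an illustrative alternative, not the geopandas approach used above; the function name `bin_median` and the sample values are made up, and the grid simply spans the bounding box of the points:

```python
import numpy as np

def _cell_index(coord, n_cells):
    """Index of the grid cell (0 .. n_cells-1) each coordinate falls into."""
    span = coord.max() - coord.min()
    if span == 0:
        span = 1.0  # all points share one coordinate; avoid dividing by zero
    idx = ((coord - coord.min()) / span * n_cells).astype(int)
    # The maximum coordinate would land in cell n_cells; clip it into the last cell.
    return np.clip(idx, 0, n_cells - 1)

def bin_median(lon, lat, values, n_cells=4):
    """Median of `values` per grid cell; cells without points are absent."""
    lon, lat, values = (np.asarray(a, dtype=float) for a in (lon, lat, values))
    ix = _cell_index(lon, n_cells)
    iy = _cell_index(lat, n_cells)
    cells = {}
    for cx, cy in set(zip(ix.tolist(), iy.tolist())):
        mask = (ix == cx) & (iy == cy)
        cells[(cx, cy)] = float(np.median(values[mask]))
    return cells

# Made-up sample: three nearby readings and one far away.
lon = [0.0, 0.1, 0.2, 1.0]
lat = [0.0, 0.1, 0.2, 1.0]
co = [1.0, 3.0, 5.0, 7.0]
result = bin_median(lon, lat, co, n_cells=2)
```

The returned dictionary maps (column, row) cell indices to the median value, which could then be joined back onto polygons for plotting.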
I want to convert a pandas DataFrame to a spatially enabled geopandas one, as follows:
import pandas as pd
import geopandas
df=pd.read_csv('../Desktop/test_esri.csv')
df.head()
Then converted using:
gdf = geopandas.GeoDataFrame(
df, geometry=geopandas.points_from_xy(df.long, df.lat))
from pyproj import crs
crs_epsg = crs.CRS.from_epsg(4326)
gdf=gdf.set_crs('epsg:4326')
Then I want to superimpose a spatial grid as follows:
import numpy as np
import shapely
from pyproj import crs
# total area for the grid
xmin, ymin, xmax, ymax= gdf.total_bounds
# how many cells across and down
n_cells=30
cell_size = (xmax-xmin)/n_cells
# projection of the grid
# crs = "+proj=sinu +lon_0=0 +x_0=0 +y_0=0 +a=6371007.181 +b=6371007.181 +units=m +no_defs"
# create the cells in a loop
grid_cells = []
for x0 in np.arange(xmin, xmax+cell_size, cell_size ):
    for y0 in np.arange(ymin, ymax+cell_size, cell_size):
        # bounds
        x1 = x0-cell_size
        y1 = y0+cell_size
        grid_cells.append( shapely.geometry.box(x0, y0, x1, y1) )
cell = geopandas.GeoDataFrame(grid_cells, columns=['geometry'],
                              crs=crs.CRS('epsg:4326'))
Then merge the grid with geopandas dataframe:
merged = geopandas.sjoin(gdf, cell, how='left', predicate='within')
To finally compute the desired metric inside "dissolve":
# Compute stats per grid cell -- aggregate fires to grid cells with dissolve
dissolve = merged.dissolve(by="index_right", aggfunc="median")
But I think I did something wrong with the "cell" grid and I can't figure it out!!
An extract of the CSV file used can be found here.
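When checking a grid like the one above, it can help to generate the cell bounds in a way that is easy to inspect. A minimal sketch using only the standard library, with plain `(minx, miny, maxx, maxy)` tuples instead of shapely boxes; `make_cells` is a hypothetical helper and the bounds in the example are made up:

```python
import math

def make_cells(xmin, ymin, xmax, ymax, n_cells=30):
    """Return (minx, miny, maxx, maxy) tuples for square cells covering the bounds.

    `n_cells` is the number of cells across; the number of rows is derived
    from the same cell size so that the cells stay square.
    """
    cell_size = (xmax - xmin) / n_cells
    n_rows = math.ceil((ymax - ymin) / cell_size)
    cells = []
    for i in range(n_cells):
        for j in range(n_rows):
            minx = xmin + i * cell_size
            miny = ymin + j * cell_size
            cells.append((minx, miny, minx + cell_size, miny + cell_size))
    return cells

cells = make_cells(0.0, 0.0, 3.0, 2.0, n_cells=3)
```

Each tuple can be turned into a polygon with `shapely.geometry.box(*bounds)` before building the GeoDataFrame.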
I have plotted a heatmap with the following data.
I have thousands of rows; this is just a sample. I also wanted to see the Google Maps view of each coordinate, so I did something like this.
import folium
from folium.plugins import HeatMap
from folium.plugins import FastMarkerCluster
default_location=[11.1657, 45.4515]
m = folium.Map(location=default_location, zoom_start=13)
heat_data = [[row['lat'],row['lon']] for index, row in test.iterrows()]
# Plot it on the map
HeatMap(heat_data).add_to(m)
callback = ('function (row) {'
'var marker = L.marker(new L.LatLng(row[0], row[1]), {color: "red"});'
'var icon = L.AwesomeMarkers.icon({'
"icon: 'info-sign',"
"iconColor: 'white',"
"markerColor: 'green',"
"prefix: 'glyphicon',"
"extraClasses: 'fa-rotate-0'"
'});'
'marker.setIcon(icon);'
"var popup = L.popup({maxWidth: '300'});"
"const display_text = {text1: row[0], text2: row[1]};"
"var mytext = $(`<div id='mytext' class='display_text' style='width: 100.0%; height: 100.0%;'>\
<a href=https://www.google.com/maps?ll=${display_text.text1},${display_text.text2} target='_blank'>Open Google Maps</a></div>`)[0];"
"popup.setContent(mytext);"
"marker.bindPopup(popup);"
'return marker};')
m.add_child(FastMarkerCluster(heat_data, callback=callback))
# Display the map
m
Now for every gps coordinate I want to plot a small arrow or few small arrows in the angle of heading_direction and if possible show the distance_of_item in that angle from the gps coordinate. The expected outcome may be something like this.
In the above image, the location pointer is the gps coordinate, the direction and angle would be according to heading direction angle and there is a little star plotted which is the object. The object should be placed at a distance(in meters) mentioned in the dataset. I am not sure how to achieve that. Any lead or suggestions are most welcome. Thanks!
Given that your sample data is an image, I have used alternate GPS data (UK hospitals) and added distance and direction columns as random values.
Given that the requirement is to plot a marker at a location defined by distance and direction, the first step is to calculate the GPS co-ordinates of that location:
use a UTM CRS so that distance is meaningful
use high school maths to calculate x and y in the UTM CRS
convert the CRS back to WGS 84 to obtain GPS co-ordinates
You have tagged the question as plotly, so I have used mapbox line and scatter traces to demonstrate building a tiled map.
The sample data is 1200+ hospitals, and performance is decent.
The geopandas data frame could also be used to build folium tiles / markers. The key step is calculating the GPS co-ordinates.
import geopandas as gpd
import pandas as pd
import numpy as np
import shapely
import math
import plotly.express as px
import plotly.graph_objects as go
import io, requests
# get some public addresses - hospitals. data that has GPS lat / lon
dfhos = pd.read_csv(io.StringIO(requests.get("http://media.nhschoices.nhs.uk/data/foi/Hospital.csv").text),
sep="¬",engine="python",).loc[:, ["OrganisationName", "Latitude", "Longitude"]]
# debug with fewer records
# df = dfhos.loc[0:500]
df = dfhos
# to use CRS transformations use geopandas, initial data is WGS 84, transform to UTM geometry
# directions and distances are random
gdf = gpd.GeoDataFrame(
data=df.assign(
heading_direction=lambda d: np.random.randint(0, 360, len(d)),
distance_of_item=lambda d: np.random.randint(10 ** 3, 10 ** 4, len(d)),
),
geometry=df.loc[:, ["Longitude", "Latitude"]].apply(
lambda r: shapely.geometry.Point(r["Longitude"], r["Latitude"]), axis=1
),
crs="EPSG:4326",
).pipe(lambda d: d.to_crs(d.estimate_utm_crs()))
# standard high school geometry...
def new_point(point, d, alpha):
    # alpha is treated as a mathematical angle in degrees (counter-clockwise from east)
    alpha = math.radians(alpha)
    return shapely.geometry.Point(
        point.x + (d * math.cos(alpha)),
        point.y + (d * math.sin(alpha)),
    )
# calculate points based on direction and distance in UTM CRS. Then convert back to WGS 84 CRS
gdf["geometry2"] = gpd.GeoSeries(
gdf.apply(
lambda r: new_point(
r["geometry"], r["distance_of_item"], r["heading_direction"]
),
axis=1,
),
crs=gdf.geometry.crs,
).to_crs("EPSG:4326")
gdf = gdf.to_crs("EPSG:4326")
# plot lines to show start point and direct. plot markers of destinations for text of distance, etc
fig = px.line_mapbox(
lon=np.stack(
[gdf.geometry.x.values, gdf.geometry2.x.values, np.full(len(gdf), np.nan)],
axis=1,
).reshape([1, len(gdf) * 3])[0],
lat=np.stack(
[gdf.geometry.y.values, gdf.geometry2.y.values, np.full(len(gdf), np.nan)],
axis=1,
).reshape([1, len(gdf) * 3])[0],
).add_traces(
px.scatter_mapbox(
gdf,
lat=gdf.geometry2.y,
lon=gdf.geometry2.x,
hover_data=["distance_of_item", "OrganisationName"],
).data
)
# c = gdf.loc[]
fig.update_layout(mapbox={"style": "open-street-map", "zoom": 8, 'center': {'lat': 52.2316838387109, 'lon': -1.4577750831062155}}, margin={"l":0,"r":0,"t":0,"b":0})
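The `new_point` helper above treats the angle as a mathematical angle (counter-clockwise from east). If `heading_direction` is instead a compass bearing (clockwise from north), which is an assumption about the data rather than something stated in the question, the sine and cosine swap roles. A minimal sketch in projected (metre) coordinates, with a hypothetical function name `offset_by_bearing` and made-up UTM-like coordinates:

```python
import math

def offset_by_bearing(x, y, distance, bearing_deg):
    """Move `distance` metres from (x, y) along a compass bearing.

    The bearing is measured clockwise from north, so 0 = north and 90 = east.
    Coordinates must be in a projected CRS with metre units (e.g. UTM).
    """
    theta = math.radians(bearing_deg)
    return x + distance * math.sin(theta), y + distance * math.cos(theta)

east = offset_by_bearing(500000.0, 5800000.0, 100.0, 90.0)
north = offset_by_bearing(500000.0, 5800000.0, 100.0, 0.0)
```

The resulting x/y pair would still need converting back to WGS 84, as in the answer above, before being placed on a web map.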
I have a 2-D gridded file which represents the land-use categories for the place of interest.
I also have some lat/lon-based points distributed in this area.
import pandas as pd
import matplotlib.pyplot as plt
from netCDF4 import Dataset
## 2-d gridded files
nc_file = "./geo_em.d02.nc"
geo = Dataset(nc_file, 'r')
lu = geo.variables["LU_INDEX"][0,:,:]
lat = geo.variables["XLAT_M"][0,:]
lon = geo.variables["XLONG_M"][0,:]
## point files
point = pd.read_csv("./point_data.csv")
plt.pcolormesh(lon,lat,lu)
plt.scatter(point.lon, point.lat, color='r')
I want to extract the values of the 2-D gridded field at the grid cells these points fall in, but I am finding it difficult to write a simple function for that.
Is there an efficient way to do this?
Any advice would be appreciated.
PS
I have uploaded my files here
1. nc_file
2. point_file
I can propose a solution like this, where I simply loop over the points and select the grid cell nearest to each point.
#!/usr/bin/env ipython
import numpy as np
from netCDF4 import Dataset
import matplotlib.pylab as plt
import pandas as pd
# --------------------------------------
## 2-d gridded files
nc_file = "./geo_em.d02.nc"
geo = Dataset(nc_file, 'r')
lu = geo.variables["LU_INDEX"][0,:,:]
lat = geo.variables["XLAT_M"][0,:]
lon = geo.variables["XLONG_M"][0,:]
## point files
point = pd.read_csv("./point_data.csv")
plt.pcolormesh(lon,lat,lu)
#plt.scatter(point_data.lon,cf_fire_data.lat, color ='r')
# --------------------------------------------
# get data for points:
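dataout = []
# scale longitude differences by cos(latitude) so both axes are comparable
lon_ratio = np.cos(np.mean(lat)*np.pi/180.0)
for ii in range(len(point)):
    plon, plat = point.lon[ii], point.lat[ii]
    distmat = np.sqrt((lon_ratio*(lon-plon))**2 + (lat-plat)**2)
    # index of the single nearest grid cell
    kk = np.unravel_index(np.argmin(distmat), distmat.shape)
    dataout.append([float(lon[kk]), float(lat[kk]), float(lu[kk])])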
# ---------------------------------------------
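The loop above can also be written without an explicit Python loop. A minimal NumPy sketch on made-up data (the small `lon`, `lat` and `lu` arrays below stand in for the netCDF variables, and `nearest_cell_values` is a hypothetical helper). It builds a points-by-cells distance matrix, so for very large grids a spatial index would likely scale better:

```python
import numpy as np

def nearest_cell_values(lon, lat, field, plon, plat):
    """Value of `field` at the grid cell nearest to each (plon, plat) point."""
    lon, lat, field = (np.asarray(a, dtype=float) for a in (lon, lat, field))
    plon = np.asarray(plon, dtype=float)[:, None]
    plat = np.asarray(plat, dtype=float)[:, None]
    # Shrink longitude differences by cos(latitude) so both axes are comparable.
    scale = np.cos(np.deg2rad(lat.mean()))
    d2 = (scale * (lon.ravel()[None, :] - plon)) ** 2 + (lat.ravel()[None, :] - plat) ** 2
    # One argmin per point over all grid cells.
    return field.ravel()[np.argmin(d2, axis=1)]

# Made-up 2x3 grid of cell centres and land-use codes.
lon, lat = np.meshgrid([10.0, 11.0, 12.0], [50.0, 51.0])
lu = np.array([[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]])
values = nearest_cell_values(lon, lat, lu, plon=[10.1, 11.9], plat=[50.1, 50.9])
```

With the real data, `point.lon.values` and `point.lat.values` would take the place of the two short lists.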
I'm using Python's matplotlib and Basemap libraries.
I'm attempting to plot a list of GPS points around the city of Chicago for a project that I'm working on but it's not working. I've looked at all of the available examples, but despite copying and pasting them verbatim (and then changing the gps points) the map fails to render with the points plotted.
Here are some example points as they are stored in my code:
[(41.98302392, -87.71849159),
(41.77351707, -87.59144826),
(41.77508317, -87.58899995),
(41.77511247, -87.58646695),
(41.77514645, -87.58515301),
(41.77538531, -87.58611272),
(41.71339537, -87.56963306),
(41.81685612, -87.59757281),
(41.81697313, -87.59910809),
(41.81695808, -87.60049861),
(41.75894604, -87.55560586)]
and here's the code that I'm using to render the map (which doesn't work).
# -*- coding: utf-8 -*-
from pymongo import *
from mpl_toolkits.basemap import Basemap
import matplotlib.pyplot as plt
from collections import Counter
import ast
def routes_map():
    """
    doesn't work :(
    # map of chicago
    """
    all_locations = [] #<-- this is the example data above
    x = []
    y = []
    for loc in all_locations: #creates two lists for the x and y (lat,lon) coordinates
        x.append(float(loc[0]))
        y.append(float(loc[1]))
    # llcrnrlat,llcrnrlon,urcrnrlat,urcrnrlon
    # are the lat/lon values of the lower left and upper right corners
    # of the map.
    # resolution = 'i' means use intermediate resolution coastlines.
    # lon_0, lat_0 are the central longitude and latitude of the projection.
    loc = [41.8709, -87.6331]
    # setup Lambert Conformal basemap.
    m = Basemap(llcrnrlon=-90.0378,llcrnrlat=40.6046,urcrnrlon=-85.4277,urcrnrlat=45.1394,
                projection='merc',resolution='h')
    # draw coastlines.
    m.drawcoastlines()
    m.drawstates()
    # draw a boundary around the map, fill the background.
    # this background will end up being the ocean color, since
    # the continents will be drawn on top.
    m.drawmapboundary(fill_color='white')
    x1, y1 = m(x[:100],y[:100])
    m.plot(x1,y1,marker="o",alpha=1.0)
    plt.title("City of Chicago Bus Stops")
    plt.show()
This is what I get from running this code:
Does anyone have any tips as to what I'm doing wrong?
You are accidentally inputting latitude values as x and longitude values as y. In the example data you give, the first column is latitude and the second column is longitude, not the other way around as your code seems to think.
So use x.append(float(loc[1])) and y.append(float(loc[0])) instead of what you have.
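A short sketch of the fix on a few of the sample points, using plain Python only (the variable names `lats` and `lons` are illustrative). A Basemap instance called as `m(...)` expects longitudes first, then latitudes:

```python
# A few of the (latitude, longitude) pairs from the question.
all_locations = [
    (41.98302392, -87.71849159),
    (41.77351707, -87.59144826),
    (41.77508317, -87.58899995),
]

# Unzip into separate lists; the first element of each pair is the latitude.
lats, lons = (list(values) for values in zip(*all_locations))

# With a Basemap instance `m`, the projected coordinates would then be
# obtained as: x1, y1 = m(lons, lats)
```

Naming the lists `lats` and `lons` rather than `x` and `y` makes this kind of swap harder to miss.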