I have data on NY State Hospitals with georeferences and want to create a choropleth map with base map. I also tried layering with .plot() with no success.
When I run my code for layered plot no image is shown, and when I run contextily I get this error message:
HTTPError: Tile URL resulted in a 404 error. Double-check your tile url:
https://stamen-tiles-a.a.ssl.fastly.net/terrain/24/8388574/8388589.png
The conda environment:
*conda config --add channels conda-forge*
*conda config --add channels anaconda*
*conda create -n geo python=3.7.0 geopandas=0.4.0 spyder contextily*
*conda activate geo*
This is what I am running, beginning to end:
import pandas as pd
import geopandas as gp
import contextily as ctx
import matplotlib.pyplot as plt
from shapely.geometry import Point
%matplotlib inline
usa = gp.read_file("https://alicia.data.socrata.com/resource/cap4-bs3u.geojson")
NY_state = usa.loc[usa['state_abbr'] == 'NY']
ax = NY_state.to_crs(epsg=3857).plot(figsize=(10,10), alpha=0.5, edgecolor='k')
ctx.add_basemap(ax)
ax
The steps above work well and create a nice map. Below is how I manipulated the NY State Hospital data to create a GeoDataFrame and the value for the choropleth map (Infections Observed/Infections Predicted):
polpro_new = pd.read_csv('https://health.data.ny.gov/api/views/utrt-zdsi/rows.csv?accessType=DOWNLOAD&api_foundry=true')
polpro_new.rename(columns = {'New Georeferenced Column':'Georef'}, inplace = True)
Geo_df = polpro_new.Georef.str.split(expand=True)
Geo_df = Geo_df.dropna()
Geo_df = polpro_new.Georef.str.split(expand=True)
Geo_df.rename(columns = {0:'Latitude', 1:'Longitude'}, inplace = True)
Geo_df['Latitude'] = Geo_df['Latitude'].str.replace(r'(', '')
Geo_df['Latitude'] = Geo_df['Latitude'].str.replace(r',', '')
Geo_df['Longitude'] = Geo_df['Longitude'].str.replace(r')', '')
New_df = pd.concat([polpro_new, Geo_df], axis=1, join='inner', sort=False)
clabsi1 = New_df[(New_df['Indicator Name']==
'CLABSI Overall Standardized Infection Ratio')]
clabsi_2008 = clabsi1[(clabsi1['Year']== 2008)]
df08 = pd.DataFrame([Point(xy) for xy in zip(clabsi_2008.loc[:,
'Longitude'].astype(float), clabsi_2008.loc[:,'Latitude'].astype(float))])
df08.rename(columns = {0:'geometry'}, inplace = True)
clabsi_08 = clabsi_2008.reset_index()
df08.reset_index()
New_df_08 = pd.concat([clabsi_08, df08], axis=1, sort=False)
New_df_08['IO_to_IP'] = New_df_08['Infections Observed']/New_df_08['Infections Predicted']
I used the 'coords' DataFrame to create a GeoDataFrame
coords = New_df_08[['IO_to_IP', 'geometry']]
geo_df = gp.GeoDataFrame(coords, crs = 3857, geometry = New_df_08['geometry'])
Below are the two ways I tried to plot the georeferenced data over a base map:
ny_plot = geo_df.plot(column='IO_to_IP', figsize=(10,10), alpha=0.5, edgecolor='k')
ctx.add_basemap(ny_plot)
ny_plot
and
geo_df.plot('IO_to_IP', ax=ax)
plt.show()
plt.savefig("my_plot")
I am able to create ny_plot, but the basemap is not added and I get this error:
HTTPError: Tile URL resulted in a 404 error. Double-check your tile url:
https://stamen-tiles-a.a.ssl.fastly.net/terrain/24/8388574/8388589.png
What might be the problem here? How do I go about fixing it?
Again the output I am looking for is a choropleth map of infection ratio (observed/predicted) on a base map of NY State
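One way to read the 404 is to look at where the requested tile sits. The sketch below is not from the original post; it applies the standard Web Mercator tiling scheme to turn the tile index in the error message into EPSG:3857 coordinates. The function name `tile_top_left` is an illustrative assumption.

```python
import math

# Half the width of the Web Mercator (EPSG:3857) square, in metres.
ORIGIN_SHIFT = math.pi * 6378137.0

def tile_top_left(z, x, y):
    """Return the EPSG:3857 (x, y) of the top-left corner of tile z/x/y."""
    tile_size = 2 * ORIGIN_SHIFT / (2 ** z)  # tile edge length in metres
    return (-ORIGIN_SHIFT + x * tile_size,
            ORIGIN_SHIFT - y * tile_size)

# Tile index taken from the error message above.
x_m, y_m = tile_top_left(24, 8388574, 8388589)
print(round(x_m, 1), round(y_m, 1))
```

The corner comes out roughly 81 m west and 45 m north of the projection origin, values of the same magnitude as New York's longitude and latitude in degrees. That would be consistent with the points holding degree values while being labelled `crs = 3857`. If so, declaring the points as EPSG:4326 and calling `to_crs(epsg=3857)` before plotting, as is already done for `NY_state`, may be worth trying.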
Related
I'm struggling to get a Bokeh map. The cell runs but does not show anything. It takes about 50s. I can get a blank map to display, but nothing I have tried has worked.
Jupyter version 6.4.12 run through Anaconda 2.3.2
import pandas as pd
import numpy as np
from bokeh.plotting import figure, show, output_notebook
from bokeh.tile_providers import CARTODBPOSITRON, get_provider
from bokeh.models import ColumnDataSource, LinearColorMapper, ColorBar, NumeralTickFormatter
from bokeh.palettes import PRGn, RdYlGn
from bokeh.transform import linear_cmap, factor_cmap
from bokeh.layouts import row, column
from bokeh.resources import INLINE
pd.set_option('display.max_columns', None)
output_notebook(INLINE)
I have Lat & Lon coordinates in my dataset, which I discovered I need to convert to mercator coordinates.
# Define function to switch from lat/long to mercator coordinates
def x_coord(x, y):
    lat = x
    lon = y
    r_major = 6378137.000
    x = r_major * np.radians(lon)
    scale = x/lon
    y = 180.0/np.pi * np.log(np.tan(np.pi/4.0 + lat * (np.pi/180.0)/2.0)) * scale
    return (x, y)
# Define coord as tuple (lat,long)
df['coordinates'] = list(zip(df['LATITUDE'], df['LONGITUDE']))
# Obtain list of mercator coordinates
mercators = [x_coord(x, y) for x, y in df['coordinates'] ]
# Create mercator column in our df
df['mercator'] = mercators
# Split that column out into two separate columns - mercator_x and mercator_y
df[['mercator_x', 'mercator_y']] = df['mercator'].apply(pd.Series)
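For comparison, here is a minimal standard-library sketch of the same conversion. It is algebraically equivalent to `x_coord` above, but it does not compute `scale = x/lon`, so it does not divide by the longitude. The name `to_web_mercator` is an assumption for illustration.

```python
import math

R_MAJOR = 6378137.0  # WGS84 semi-major axis in metres

def to_web_mercator(lat, lon):
    """Convert latitude/longitude in degrees to Web Mercator x/y in metres."""
    x = R_MAJOR * math.radians(lon)
    y = R_MAJOR * math.log(math.tan(math.pi / 4.0 + math.radians(lat) / 2.0))
    return x, y

x0, y0 = to_web_mercator(0.0, 0.0)      # the origin; no division by longitude
x180, _ = to_web_mercator(0.0, 180.0)   # eastern edge of the projection
print(x0, y0, x180)
```

With NumPy, the same two expressions can be applied to whole columns at once (`np.radians`, `np.log`, `np.tan`), which avoids building a list of tuples row by row.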
From there, this is my code cell for the plot:
tile = get_provider('CARTODBPOSITRON')
source = ColumnDataSource(data = df)
palette = PRGn[11]
color_mapper = linear_cmap(field_name = 'FIRE_SIZE', palette = palette,
low=df['FIRE_SIZE'].min(), high = df['FIRE_SIZE'].max())
tooltips = [('Fire Year', '@FIRE_YEAR'),('State','@STATE')]
p = figure(title = 'Fire Locations',
x_axis_type = 'mercator',
y_axis_type = 'mercator',
x_axis_label = 'Longitude',
y_axis_label = 'Latitude',
tooltips = tooltips)
p.add_tile(tile)
p.circle(x = 'mercator_x',
y = 'mercator_y',
color = color_mapper,
size = 10,
fill_alpha = 0.7,
source = source)
color_bar = ColorBar(color_mapper = color_mapper['transform'],
                     formatter = NumeralTickFormatter(format='0.0[0000]'),
                     label_standoff = 13, width = 8, location = (0,0))
p.add_layout(color_bar, 'right')
show(p)
The cell runs, but nothing shows. There are no errors. I confirmed that I can get a plot to display using this code:
#Test
tile = get_provider('CARTODBPOSITRON')
p = figure(x_range = (-2000000, 2000000),
y_range = (1000000, 7000000),
x_axis_type = 'mercator',
y_axis_type = 'mercator')
p.add_tile(tile)
show(p)
This is a large dataset, with 2,303,566 entries. I have checked that I have no null values in any of the columns that I am using, as well as verifying the correct data types (lat/lon are float64).
Returning to answer my own question here. After doing some more testing based on helpful comments I received from @mosc9575 and @bigreddot, I determined that the size of my dataset is the reason for Bokeh failing to display the map. I used a single point first, and then a small slice of my dataframe, and the map displayed just fine.
I hope this is helpful to someone else at some point!
Thanks to everyone who assisted.
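A minimal sketch of that workaround, assuming a random subset is acceptable for the visualisation. `downsample` and `max_points` are hypothetical names, and the rows below are stand-ins for the real fire records.

```python
import random

def downsample(rows, max_points, seed=0):
    """Return at most max_points rows, chosen at random without replacement."""
    if len(rows) <= max_points:
        return list(rows)
    return random.Random(seed).sample(rows, max_points)

# Stand-in for the large dataset: (mercator_x, mercator_y) pairs.
rows = [(float(i), float(i) * 2.0) for i in range(100_000)]
subset = downsample(rows, 5_000)
print(len(subset))
```

With a pandas DataFrame, `df.sample(n=5000, random_state=0)` gives a comparable subset that can then be passed to `ColumnDataSource`.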
I am trying to get a DEM raster to line up with a shapefile in Python, but it will not show up no matter what I do. This is for a lab exercise, and the rest of the exercise relies on these lining up, as I will be extracting data from the raster and polygon layers to a point layer.
I know how to do all this "by hand" in ArcGIS, but the point of the exercise is to use R or Python (the professor did an example with R, but we can use whichever, and I have been learning Python the past couple of months for a work project). In the class notes, he says that both files are in EPSG 3847, but the shapefile was missing the CRS, so I added the CRS to it in geopandas.
The DEM appears to be EPSG 3006 (even though it was supposed to be in 3847), so I tried converting it to EPSG 3847 and it still does not show up. So then I tried going the other way and converting the shapefile to EPSG 3006, which did not help either.
import contextily as cx
import geopandas as gpd
import rasterio
from rasterio.plot import show
from rasterio.crs import CRS
from rasterio.plot import show as rioshow
import matplotlib.pyplot as plt
#data files
abisveg = gpd.read_file(r'/content/drive/MyDrive/Stackoverflow/Sweden/abisveg_polygon.shp')
abisveg_3847 = abisveg.set_crs(epsg = 3847)
abisveg_3006 = abisveg_3847.to_crs(epsg = 3006)
src = rasterio.open(r'/content/drive/MyDrive/Stackoverflow/Sweden/nh_75_6.tif')
DEM = src.read()
### creating plot grid
fig = plt.figure(figsize = (20,20), constrained_layout = True)
gs = fig.add_gridspec(1,3)
ax1 = fig.add_subplot(gs[0,0])
ax2 = fig.add_subplot(gs[0,1], sharex = ax1, sharey = ax1)
ax3 = fig.add_subplot(gs[0,2], sharex = ax1, sharey = ax1)
### Plot 1 - Basemap Only
abisveg_3006.plot(ax = ax1, color = 'none')
cx.add_basemap(ax1, crs = 3006)
ax1.set_aspect('equal')
ax1.set_title("Basemap of AOI")
### Plot 2 - DEM
# abisveg_3847.plot(ax = ax2, color = 'none')
show(DEM, ax=ax2, cmap = "Greys")
cx.add_basemap(ax2, crs = 3006)
ax2.set_aspect('equal')
ax2.set_title('Digital Elevation Model of AOI')
### Plot 3 - Vegetation Types
abisveg_3006.plot(ax = ax3, column = "VEGKOD", cmap = "viridis")
cx.add_basemap(ax3, crs = 3006)
ax3.set_aspect('equal')
ax3.set_title("Vegetation Types")
3 Panel map with missing DEM:
https://i.imgur.com/taG2U9Q.jpg
Trying to plot the files in Matplotlib has not worked, b/c they do not align at all. I am using contextily for the basemap, and have set the basemap CRS to EPSG 3847 (or 3006, depending on which version of the GIS files I was using). The shapefile shows up in the correct location no matter the projection, but the Raster does not show up. What's weird is that if I open everything up in ArcGIS, it all lines up correctly.
If I plot just the DEM all by itself, it shows up, though I don't know where on the earth it is plotting.
fig = plt.figure(figsize = (10,10), constrained_layout = True)
show(DEM, cmap = "Greys")
DEM just by itself:
https://i.imgur.com/KyYu7jc.jpg
I have my code in a colab notebook here:
https://colab.research.google.com/drive/1VAZ3dgf0QS2PPBOl8KJ2FXtB2oRj0qJ8?usp=share_link
The files are here:
https://drive.google.com/drive/folders/1t-xvpIcLOIR9uYXOguJ7KyKqt7wuYSNc?usp=share_link
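One thing that may be worth checking: `src.read()` returns a plain array, and an array on its own carries no georeferencing, so it is drawn in pixel coordinates unless a transform is supplied. The sketch below illustrates the difference for a north-up raster. The origin, pixel size, and raster dimensions are made-up values, not read from `nh_75_6.tif`, and both function names are assumptions.

```python
def pixel_extent(width, height):
    """Extent (left, right, bottom, top) matplotlib's imshow uses by default."""
    return (-0.5, width - 0.5, height - 0.5, -0.5)

def georeferenced_extent(origin_x, origin_y, pixel_size, width, height):
    """Extent of a north-up raster, given its top-left corner and pixel size."""
    left = origin_x
    right = origin_x + width * pixel_size
    top = origin_y
    bottom = origin_y - height * pixel_size
    return (left, right, bottom, top)

# Hypothetical raster: 2500 x 2500 pixels, 50 m cells, top-left at (600000, 7600000).
print(pixel_extent(2500, 2500))
print(georeferenced_extent(600000.0, 7600000.0, 50.0, 2500, 2500))
```

In rasterio terms, the second case corresponds to passing the open dataset (`show(src, ax=ax2)`) or the array together with `transform=src.transform`; both are existing options of `rasterio.plot.show`. Whether that resolves the misalignment here depends on the file's actual CRS.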
You could give EOmaps a try... it uses matplotlib/cartopy for plotting and handles re-projecting the data and shapes to the plot-crs
from pathlib import Path
from eomaps import Maps
import geopandas as gpd
p = Path(r"path to the data folder")
# read shapefile
abisveg = gpd.read_file(p / 'abisveg_polygon.shp').set_crs(epsg = 3847)
# create a map in epsg=3006
m = Maps(crs=3006, figsize=(10, 8))
# add stamen-terrain basemap
m.add_wms.OpenStreetMap.add_layer.stamen_terrain()
# plot shapefile (zorder=2 to be on top of the DEM)
m.add_gdf(abisveg, column=abisveg.VEGKOD, cmap="viridis", ec="k", lw=0.2, alpha=0.5, zorder=2)
# plot DEM
m2 = m.new_layer_from_file.GeoTIFF(p / "nh_75_6.tif", cmap="Greys", zorder=1)
m.ax.set_extent((589913.0408156103, 713614.6619114348, 7495264.310799116, 7618965.93189494),
Maps.CRS.epsg(3006))
I'm trying to make a kdeplot using geopandas.
This is my code:
Downloading the shapefile:
URL = "https://data.sfgov.org/api/geospatial/wkhw-cjsf?method=export&format=Shapefile"
response = requests.get(URL)
open('pd_data.zip', 'wb').write(response.content)
with zipfile.ZipFile('./pd_data.zip', 'r') as zip_ref:
    zip_ref.extractall('./ShapeFiles')
Making the geopandas data frame
data = train.groupby(['PdDistrict']).count().iloc[:,0]
data = pd.DataFrame({ "district": data.index,
"incidences": data.values})
california_map = str(list(pathlib.Path('./ShapeFiles').glob('*.shp'))[0])
gdf = gdp.read_file(california_map)
gdf = pd.merge(gdf, data, on = 'district')
Note: I didn't include the link to the train set because it's not important for this question(use any data you want)
This is the part that I don't get,
what arguments should I pass to the kdeplot function, like where I pass the shape file and where I pass the data?
ax = gplt.kdeplot(
data, clip=gdf.geometry,
shade=True, cmap='Reds',
projection=gplt.crs.AlbersEqualArea())
gplt.polyplot(boroughs, ax=ax, zorder=1)
I had a few challenges setting up an environment that did not produce kernel crashes; I ended up using non-wheel versions of shapely and pygeos.
A few things are covered in the kdeplot documentation: a basic kdeplot takes pointwise data as input. You did not provide a sample of your data, so I am not sure that it is pointwise. I have simulated pointwise data: 100 points within each of the districts in the referenced geometry.
I have found that I cannot use the clip and projection parameters together; it is one or the other, not both.
The shapefile is passed to clip.
import geopandas as gpd
import pandas as pd
import numpy as np
import geoplot as gplt
import geoplot.crs as gcrs
# setup starting point to match question
url = "https://data.sfgov.org/api/geospatial/wkhw-cjsf?method=export&format=Shapefile"
gdf = gpd.read_file(url)
# generate candidate points in each district, then keep 100 per district
r = np.random.RandomState(42)
N = 5000
data = pd.concat(
    [
        gpd.GeoSeries(
            gpd.points_from_xy(*[r.uniform(*g.bounds[s::2], N) for s in (0, 1)]),
            crs=gdf.crs,
        ).loc[lambda s: s.intersects(g.buffer(-0.003))]
        for _, g in gdf["geometry"].items()
    ]
)
data = (
    gpd.GeoDataFrame(geometry=data)
    .sjoin(gdf)
    .groupby("district")
    .sample(100, random_state=42)
    .reset_index(drop=True)
)
ax = gplt.kdeplot(
data,
clip=gdf,
fill=True,
cmap="Reds",
# projection=gplt.crs.AlbersEqualArea(),
)
gplt.polyplot(gdf, ax=ax, zorder=1)
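The simulation step in the answer (draw uniform points in each district's bounding box, keep those that fall inside the district) can be illustrated without any GIS library. This is a simplified sketch: it uses a plain ray-casting test instead of `intersects`, skips the `buffer(-0.003)` inset, and the triangular "district" and both function names are hypothetical.

```python
import random

def point_in_polygon(x, y, polygon):
    """Ray-casting test: True if (x, y) lies inside the polygon's vertex ring."""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Count edges crossed by a horizontal ray from (x, y) towards +x.
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside

def sample_points(polygon, n, seed=42):
    """Draw n uniform points inside polygon by rejection from its bounding box."""
    rng = random.Random(seed)
    xs = [p[0] for p in polygon]
    ys = [p[1] for p in polygon]
    points = []
    while len(points) < n:
        x = rng.uniform(min(xs), max(xs))
        y = rng.uniform(min(ys), max(ys))
        if point_in_polygon(x, y, polygon):
            points.append((x, y))
    return points

# Hypothetical triangular "district".
triangle = [(0.0, 0.0), (4.0, 0.0), (0.0, 3.0)]
points = sample_points(triangle, 100)
print(len(points))
```

The resulting points play the role of `data` in the answer: pointwise input for `kdeplot`, with the district polygons passed separately to `clip`.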
Can anyone guide me on how to plot a column value against latitude and longitude? The data I want to plot with Python is shown below. I have run the code but it isn't working. Kindly guide me on how to do it.
Data in CSV File :
Longitude Latitude RSRP
71.676847 29.376015 -89
71.676447 29.376115 -101
71.677847 29.376215 -90
Code :
import pandas as pd
import geopandas as gpd
import matplotlib.pyplot as plt
df = pd.read_csv('C:\\Users\\uwx630237\\BWPMR.csv')
gdf = gpd.GeoDataFrame(df)
Lon = df['Longitude']
Lat = df['Latitude']
RSRP = df['RSRP']
Required output picture
Without a map as background on the plot, you can use df.plot.scatter(). Here are the relevant lines of code that you can try:
# ... previous lines of code
# add a column, named `color`, and set values in it
df.loc[:, 'color'] = 'green' # set all rows -> color=green
df.loc[df['RSRP'] < -100, 'color'] = 'red' # set some rows -> color=red
# plot the data as a scatter plot
df.plot.scatter( x='Longitude' , y='Latitude', s=20, color=df['color'], alpha=0.8 )
The output will look like this:
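For reference, the coloring rule used above can be checked on the three sample rows from the question without any plotting library. This is only a sketch of the threshold logic; `rsrp_color` is a hypothetical helper, and the rows are assumed to be separated by single spaces.

```python
import csv
import io

# The three sample rows from the question, as space-separated text.
SAMPLE = """Longitude Latitude RSRP
71.676847 29.376015 -89
71.676447 29.376115 -101
71.677847 29.376215 -90
"""

def rsrp_color(rsrp, threshold=-100):
    """Return 'red' for readings below the threshold, 'green' otherwise."""
    return 'red' if rsrp < threshold else 'green'

reader = csv.DictReader(io.StringIO(SAMPLE), delimiter=' ')
colors = [rsrp_color(float(row['RSRP'])) for row in reader]
print(colors)  # ['green', 'red', 'green']
```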
You can achieve that using folium. Here is a toy example how to add data to a map of San Francisco
import folium
import folium.plugins
import branca
import branca.colormap as cm
colormap = cm.LinearColormap(colors=['red', 'lightblue'], index=[90, 100], vmin=90, vmax=100)
sanfrancisco_map = folium.Map(location=[37.77, -122.42], zoom_start=12)
lat = list(df.latitude)
lon = list(df.longitude)
RSRP = list(df.RSRP)
# draw one circle per row, coloured by its RSRP value
for loc, value in zip(zip(lat, lon), RSRP):
    folium.Circle(
        location=loc,
        radius=10,
        fill=True,
        color=colormap(value),
    ).add_to(sanfrancisco_map)
# add the colormap legend to the map
sanfrancisco_map.add_child(colormap)
I have a small CSV that has 6 coordinates from Birmingham, England. I read the CSV with pandas, then transformed it into a GeoPandas DataFrame, converting my latitude and longitude columns into Shapely Points. I am now trying to plot my GeoDataFrame and all I can see are the points. How do I get the Birmingham map represented as well? A good documentation source on GeoPandas would be strongly appreciated too.
from shapely.geometry import Point
import geopandas as gpd
import pandas as pd
df = pd.read_csv('SiteLocation.csv')
df['Coordinates'] = list(zip(df.LONG, df.LAT))
df['Coordinates'] = df['Coordinates'].apply(Point)
# Building the GeoDataframe
geo_df = gpd.GeoDataFrame(df, geometry='Coordinates')
geo_df.plot()
The GeoPandas documentation contains an example on how to add a background to a map (https://geopandas.readthedocs.io/en/latest/gallery/plotting_basemap_background.html), which is explained in more detail below.
You will have to deal with tiles, which are (PNG) images served through a web server, with a URL like
http://.../Z/X/Y.png, where Z is the zoom level, and X and Y identify the tile.
The GeoPandas documentation shows how to set tiles as backgrounds for your plots, fetching the correct ones and doing all the otherwise difficult work of spatial syncing, etc.
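To make the Z/X/Y scheme concrete, here is a small sketch of the commonly used "slippy map" formula that maps a longitude/latitude pair to a tile index at a given zoom. The URL template and the function names are placeholders for illustration, not a real tile provider or part of contextily.

```python
import math

def lonlat_to_tile(lon, lat, zoom):
    """Return the (x, y) index of the tile containing a WGS84 point."""
    n = 2 ** zoom  # number of tiles along each axis at this zoom level
    x = int((lon + 180.0) / 360.0 * n)
    y = int((1.0 - math.asinh(math.tan(math.radians(lat))) / math.pi) / 2.0 * n)
    return x, y

def tile_url(template, zoom, x, y):
    """Fill a {z}/{x}/{y} style URL template."""
    return template.format(z=zoom, x=x, y=y)

# Hypothetical template; real providers publish their own.
TEMPLATE = "https://tiles.example.com/{z}/{x}/{y}.png"
x, y = lonlat_to_tile(-1.89, 52.48, 10)  # roughly Birmingham, England
print(tile_url(TEMPLATE, 10, x, y))
```

Each increase in zoom doubles the number of tiles along each axis, which is why a high zoom over a large extent requests many tiles.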
Installation
Assuming GeoPandas is already installed, you need the contextily package in addition. If you are on Windows, you may want to take a look at "How to install Contextily?"
Use case
Create a python script and define the contextily helper function
import contextily as ctx
def add_basemap(ax, zoom, url='http://tile.stamen.com/terrain/tileZ/tileX/tileY.png'):
    xmin, xmax, ymin, ymax = ax.axis()
    basemap, extent = ctx.bounds2img(xmin, ymin, xmax, ymax, zoom=zoom, url=url)
    ax.imshow(basemap, extent=extent, interpolation='bilinear')
    # restore original x/y limits
    ax.axis((xmin, xmax, ymin, ymax))
and play
import matplotlib.pyplot as plt
from shapely.geometry import Point
import geopandas as gpd
import pandas as pd
# Let's define our raw data, whose epsg is 4326
df = pd.DataFrame({
'LAT' :[-22.266415, -20.684157],
'LONG' :[166.452764, 164.956089],
})
df['coords'] = list(zip(df.LONG, df.LAT))
# ... turn them into geodataframe, and convert our
# epsg into 3857, since web map tiles are typically
# provided as such.
geo_df = gpd.GeoDataFrame(
df, crs ={'init': 'epsg:4326'},
geometry = df['coords'].apply(Point)
).to_crs(epsg=3857)
# ... and make the plot
ax = geo_df.plot(
figsize= (5, 5),
alpha = 1
)
add_basemap(ax, zoom=10)
ax.set_axis_off()
plt.title('Kaledonia : From Hienghène to Nouméa')
plt.show()
Note: you can play with the zoom to find a good resolution for the map. Changing the resolution implicitly calls for changing the x/y limits as well.
Just want to add the use case concerning zooming whereby the basemap is updated according to the new xlim and ylim coordinates. A solution I have come up with is:
First set callbacks on the ax that can detect xlim_changed and ylim_changed
Once both have been detected as changed get the new plot_area calling ax.get_xlim() and ax.get_ylim()
Then clear the ax and re-plot the basemap and any other data
Example for a world map showing the capitals. You notice when you zoom in the resolution of the map is being updated.
import geopandas as gpd
import matplotlib.pyplot as plt
import contextily as ctx
figsize = (12, 10)
osm_url = 'http://tile.stamen.com/terrain/{z}/{x}/{y}.png'
EPSG_OSM = 3857
EPSG_WGS84 = 4326
class MapTools:
    def __init__(self):
        self.cities = gpd.read_file(
            gpd.datasets.get_path('naturalearth_cities'))
        self.cities.crs = EPSG_WGS84
        self.cities = self.convert_to_osm(self.cities)
        self.fig, self.ax = plt.subplots(nrows=1, ncols=1, figsize=figsize)
        self.callbacks_connect()
        # get extent of the map for all cities
        self.cities.plot(ax=self.ax)
        self.plot_area = self.ax.axis()
    def convert_to_osm(self, df):
        return df.to_crs(epsg=EPSG_OSM)
    def callbacks_connect(self):
        self.zoomcallx = self.ax.callbacks.connect(
            'xlim_changed', self.on_limx_change)
        self.zoomcally = self.ax.callbacks.connect(
            'ylim_changed', self.on_limy_change)
        self.x_called = False
        self.y_called = False
    def callbacks_disconnect(self):
        self.ax.callbacks.disconnect(self.zoomcallx)
        self.ax.callbacks.disconnect(self.zoomcally)
    def on_limx_change(self, _):
        self.x_called = True
        if self.y_called:
            self.on_lim_change()
    def on_limy_change(self, _):
        self.y_called = True
        if self.x_called:
            self.on_lim_change()
    def on_lim_change(self):
        xlim = self.ax.get_xlim()
        ylim = self.ax.get_ylim()
        self.plot_area = (*xlim, *ylim)
        self.blit_map()
    def add_base_map_osm(self):
        if abs(self.plot_area[1] - self.plot_area[0]) < 100:
            zoom = 13
        else:
            zoom = 'auto'
        try:
            basemap, extent = ctx.bounds2img(
                self.plot_area[0], self.plot_area[2],
                self.plot_area[1], self.plot_area[3],
                zoom=zoom,
                url=osm_url,)
            self.ax.imshow(basemap, extent=extent, interpolation='bilinear')
        except Exception as e:
            print(f'unable to load map: {e}')
    def blit_map(self):
        self.ax.cla()
        self.callbacks_disconnect()
        cities = self.cities.cx[
            self.plot_area[0]:self.plot_area[1],
            self.plot_area[2]:self.plot_area[3]]
        cities.plot(ax=self.ax, color='red', markersize=3)
        print('*'*80)
        print(self.plot_area)
        print(f'{len(cities)} cities in plot area')
        self.add_base_map_osm()
        self.callbacks_connect()
    @staticmethod
    def show():
        plt.show()
def main():
    map_tools = MapTools()
    map_tools.show()
if __name__ == '__main__':
    main()
Runs on Linux Python3.8 with following pip installs
affine==2.3.0
attrs==19.3.0
autopep8==1.4.4
Cartopy==0.17.0
certifi==2019.11.28
chardet==3.0.4
Click==7.0
click-plugins==1.1.1
cligj==0.5.0
contextily==1.0rc2
cycler==0.10.0
descartes==1.1.0
Fiona==1.8.11
geographiclib==1.50
geopandas==0.6.2
geopy==1.20.0
idna==2.8
joblib==0.14.0
kiwisolver==1.1.0
matplotlib==3.1.2
mercantile==1.1.2
more-itertools==8.0.0
munch==2.5.0
numpy==1.17.4
packaging==19.2
pandas==0.25.3
Pillow==6.2.1
pluggy==0.13.1
py==1.8.0
pycodestyle==2.5.0
pyparsing==2.4.5
pyproj==2.4.1
pyshp==2.1.0
pytest==5.3.1
python-dateutil==2.8.1
pytz==2019.3
rasterio==1.1.1
requests==2.22.0
Rtree==0.9.1
Shapely==1.6.4.post2
six==1.13.0
snuggs==1.4.7
urllib3==1.25.7
wcwidth==0.1.7
Note especially requirement for contextily==1.0rc2
On windows I use Conda (P3.7.3) and don't forget to set the User variables:
GDAL c:\Users\<username>\Anaconda3\envs\<your environment>\Library\share\gdal
PROJLIB c:\Users\<username>\Anaconda3\envs\<your environment>\Library\share
Try df.unary_union. The function will aggregate points into a single geometry.
Jupyter Notebook can plot it