I'm working with Folium for the first time, and attempting to make a choropleth map of housing values in North Carolina using Zillow data as the source. I've run into lots of issues along the way, and right now I'm stuck on how to add colors to the map: properties valued around 100k should be green, with the gradient slowly shifting to orange as values approach 850k.
At the moment the map does generate the zip code data fine, but all of the polygons are a black-grey color. It's also not showing a color key or map name, and I have a feeling some of my earlier code could be off.
import folium
import pandas as pd
import requests
import os
working_directory = os.getcwd()
print(working_directory)
path = working_directory + '/Desktop/NCHomes.csv'
df = pd.read_csv(path)
df.head()
df['Homes'].min(), df['Homes'].max()
INDICATOR = 'North Carolina Home Values by Zip Code'
data = df[df['RegionName'] == INDICATOR]
max_value = data['Homes'].max()
data = data[data['Homes'] == max_value]
data.head()
geojson_url = 'https://raw.githubusercontent.com/OpenDataDE/State-zip-code-GeoJSON/master/nc_north_carolina_zip_codes_geo.min.json'
response = requests.get(geojson_url)
geojson = response.json()
geojson
geojson['features'][0]
map_data = data[['RegionName', 'Homes']]
map_data.head()
M = folium.Map(location=[20, 10], zoom_start=2)
folium.Choropleth(
geo_data=geojson,
data=map_data,
columns=['RegionName', 'Homes'],
fill_color='YlOrRd',
fill_opacity=0.7,
line_opacity=0.2,
legend_name=INDICATOR
).add_to(M)
M
You can specify the threshold_scale parameter (called bins in newer Folium releases) as follows:
folium.Choropleth(
geo_data=geojson,
data=map_data,
columns=['RegionName', 'Homes'],
fill_color='YlOrRd',
fill_opacity=0.7,
line_opacity=0.2,
threshold_scale=[100000, 850000],
legend_name=INDICATOR
).add_to(M)
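As far as I know, Folium expects the list of thresholds to span the whole range of the data, so a two-value list is usually too coarse for a gradient. Here is a minimal sketch, assuming evenly spaced steps are acceptable, that builds edges between 100k and 850k and widens them to include the smallest and largest value. The helper name `make_bin_edges` and the sample values are made up for illustration.

```python
def make_bin_edges(values, low=100_000, high=850_000, steps=5):
    """Return sorted bin edges from low to high, widened to cover all values."""
    # Evenly spaced edges between the two thresholds named in the question.
    inner = [low + i * (high - low) / steps for i in range(steps + 1)]
    # Add an outer edge on each side only if the data goes beyond low/high.
    outer = [min(min(values), low), max(max(values), high)]
    return sorted(set(inner + outer))


# Hypothetical home values, not taken from the Zillow file.
homes = [85_000, 120_000, 430_000, 910_000]
edges = make_bin_edges(homes)
print(edges)
```

The resulting list could then be passed as threshold_scale (or bins). Note that the polygons will probably stay uncolored unless key_on also points at the GeoJSON property that holds the zip code, and unless map_data still contains one row per zip code.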
I'm working on Python with a dataset that has data about a numerical variable for each italian region, like this:
import numpy as np
import pandas as pd
regions = ['Trentino Alto Adige', "Valle d'Aosta", 'Veneto', 'Lombardia', 'Emilia-Romagna', 'Toscana', 'Friuli-Venezia Giulia', 'Liguria', 'Piemonte', 'Marche', 'Lazio', 'Umbria', 'Abruzzo', 'Sardegna', 'Puglia', 'Molise', 'Basilicata', 'Calabria', 'Sicilia', 'Campania']
df = pd.DataFrame([regions,[10+(i/2) for i in range(20)]]).transpose()
df.columns = ['region','quantity']
df.head()
I would like to generate a map of Italy in which the colour of each region depends on the numeric value of the variable quantity (df['quantity']), i.e., a choropleth map.
How can I do it?
You can use geopandas.
The region names in your df don't exactly match those in the geojson. I'm sure you can find another geojson, or alter the names so they match.
import pandas as pd
import geopandas as gpd
regions = ['Trentino Alto Adige', "Valle d'Aosta", 'Veneto', 'Lombardia', 'Emilia-Romagna', 'Toscana', 'Friuli-Venezia Giulia', 'Liguria', 'Piemonte', 'Marche', 'Lazio', 'Umbria', 'Abruzzo', 'Sardegna', 'Puglia', 'Molise', 'Basilicata', 'Calabria', 'Sicilia', 'Campania']
df = pd.DataFrame([regions,[10+(i/2) for i in range(20)]]).transpose()
df.columns = ['region','quantity']
#Download a geojson of the region geometries
gdf = gpd.read_file(filename=r'https://raw.githubusercontent.com/openpolis/geojson-italy/master/geojson/limits_IT_municipalities.geojson')
gdf = gdf.dissolve(by='reg_name') #The geojson is too detailed, dissolve boundaries by reg_name attribute
gdf = gdf.reset_index()
#gdf.reg_name[~gdf.reg_name.isin(regions)] Two regions are missing in your df
#16 Trentino-Alto Adige/Südtirol
#18 Valle d'Aosta/Vallée d'Aoste
gdf = pd.merge(left=gdf, right=df, how='left', left_on='reg_name', right_on='region')
ax = gdf.plot(
column="quantity",
legend=True,
figsize=(15, 10),
cmap='OrRd',
missing_kwds={'color': 'lightgrey'});
ax.set_axis_off();
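The two unmatched regions could be aligned with a small lookup table before the merge. This is only a sketch: the spellings on the right are the two reg_name values listed as missing in the comments above, and the helper name `to_geojson_name` is made up.

```python
# Left: spelling used in the question's df. Right: reg_name in the geojson,
# as reported by the commented-out check in the answer.
NAME_FIXES = {
    "Trentino Alto Adige": "Trentino-Alto Adige/Südtirol",
    "Valle d'Aosta": "Valle d'Aosta/Vallée d'Aoste",
}


def to_geojson_name(region):
    """Return the geojson spelling of a region, or the name unchanged."""
    return NAME_FIXES.get(region, region)


regions = ["Trentino Alto Adige", "Valle d'Aosta", "Veneto"]
fixed = [to_geojson_name(r) for r in regions]
print(fixed)
```

With pandas, df['region'] = df['region'].replace(NAME_FIXES) before the merge should have the same effect, so that no region ends up light grey.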
I'm having trouble getting some air pollution data to show different colors in a choropleth map using folium. Please let me know where my code may be going wrong. I think it is the key_on parameter, but I need help.
This is how my map turns out.
What I would like is for the mean concentration of the air pollution data to show up on the map but the map is still greyed out.
Here are the files I used:
Geojson file - Used "download zip" in upper right of this website https://gist.github.com/miguelpaz/edbc79fc55447ae736704654b3b2ef90#file-uhf42-geojson
Data file - Exported data from here https://a816-dohbesp.nyc.gov/IndicatorPublic/VisualizationData.aspx?id=2023,719b87,122,Summarize
Here is my code:
import pandas as pd
import geopandas as gpd
import folium
#clean pollution data
pm_df1 = pd.read_csv('/work/Fine Particulate Matter (PM2.5).csv',header = 5, usecols = ['GeoTypeName', 'Borough','Geography', 'Geography ID','Mean (mcg per cubic meter)'], nrows = 140)
#limit dataframe to rows with neighborhood (UHF 42) that matches geojson file
pm_df2 = pm_df1[(pm_df1['GeoTypeName'] == 'Neighborhood (UHF 42)')]
pm_df2
#clean geojson file
uhf_df2 = gpd.read_file('/work/uhf42.geojson', driver='GeoJSON')
uhf_df2.head()
#drop row 1 that has no geography
uhf_df3 = uhf_df2.iloc[1:]
uhf_df3.head()
## create a map
pm_testmap = folium.Map(location=[40.65639,-73.97379], tiles = "cartodbpositron", zoom_start=10)
# generate choropleth map
pm_testmap.choropleth(
geo_data=uhf_df3,
data=pm_df2,
columns=['Geography', 'Mean (mcg per cubic meter)'],
key_on='feature.properties.uhf_neigh', #think this is where I mess up.
fill_color='BuPu',
fill_opacity=0.2,
line_opacity=0.7,
legend_name='Average dust concentration',
smooth_factor=0)
# display map
pm_testmap
You are right that the problem is with key_on.
Both datasets contain the UHF neighborhood name, but written in completely different forms, so the data must be preprocessed before the two can be linked.
I can't see your data, and it would help if you showed df.head() for both, so this explanation is based on the files I checked through the links you provided.
In your geojson file, uhf_neigh simply says Northeast Bronx. However, your PM data appears to list the region as Bronx: Northeast Bronx. Something like the following seems necessary to unify the names before plotting the map.
uhf_df2['UHF_NEIGH'] = uhf_df2['BOROUGH']+ ': ' + uhf_df2['UHF_NEIGH']
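An alternative is to clean the data side instead and strip the borough prefix from the PM names. This is a minimal sketch, assuming the names really look like "Bronx: Northeast Bronx" as described above. The helper `strip_borough_prefix` is made up for illustration.

```python
def strip_borough_prefix(name):
    """Drop a leading 'Borough: ' part if present, otherwise return the name."""
    # Split only on the first ': ' so later colons in a name are kept.
    return name.split(": ", 1)[-1].strip()


samples = ["Bronx: Northeast Bronx", "Northeast Bronx"]
cleaned = [strip_borough_prefix(s) for s in samples]
print(cleaned)
```

On a DataFrame this could be applied with pm_df2['Geography'].map(strip_borough_prefix). Whichever side is changed, the values in the data column and in the geojson property named in key_on have to end up identical.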
I tried to run it with your data and code, but the map was not displayed at all. In principle there should be no problem in your code, since it associates the place name in the data frame with the place name in the geojson. I gave up on matching by name and matched on the area code instead, and then the map was displayed. The provided csv file failed to load as is, so I deleted the unnecessary lines before loading it. I also read the geojson with the json module instead of geopandas.
import pandas as pd
import geopandas as gpd
import json
import folium
pm_df1 = pd.read_csv('./data/test_20211221.csv')
pm_df1 = pm_df1[['GeoTypeName', 'Borough', 'Geography', 'Geography ID', 'Mean (mcg per cubic meter)']]
pm_df2 = pm_df1[(pm_df1['GeoTypeName'] == 'Neighborhood (UHF 42)')]
with open('./data/uhf42.geojson') as f:
    uhf_df3 = json.load(f)
pm_testmap = folium.Map(location=[40.65639,-73.97379], tiles = "cartodbpositron", zoom_start=10)
# generate choropleth map (folium.Choropleth replaces the deprecated Map.choropleth method)
folium.Choropleth(
    geo_data=uhf_df3,
    data=pm_df2,
    columns=['Geography ID', 'Mean (mcg per cubic meter)'],
    key_on='feature.properties.uhfcode', # join on the UHF code instead of the neighborhood name
    fill_color='BuPu',
    fill_opacity=0.2,
    line_opacity=0.7,
    legend_name='Average dust concentration',
    smooth_factor=0).add_to(pm_testmap)
# display map
pm_testmap
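A quick way to see whether a key_on choice can work is to compare the keys on both sides before drawing anything. Below is a minimal sketch with made-up sample data; `unmatched_keys` is a hypothetical helper, not part of Folium. It compares values exactly, since to my knowledge Folium also looks the keys up by exact value.

```python
def unmatched_keys(geojson, property_name, data_keys):
    """Return the data keys that no feature carries in the given property."""
    geo_keys = {
        feature["properties"].get(property_name)
        for feature in geojson["features"]
    }
    # Exact comparison on purpose: the string "102" is not the integer 102.
    return [key for key in data_keys if key not in geo_keys]


# Made-up stand-in for the UHF geojson, with only the properties filled in.
sample_geojson = {
    "type": "FeatureCollection",
    "features": [
        {"type": "Feature", "properties": {"uhfcode": 101}, "geometry": None},
        {"type": "Feature", "properties": {"uhfcode": 102}, "geometry": None},
    ],
}

missing = unmatched_keys(sample_geojson, "uhfcode", [101, "102", 999])
print(missing)
```

If the list contains most of the data keys, the map will probably come out grey. A list like the one above points at two different causes: a type mismatch ("102" as text) and a code that is simply absent from the geojson (999).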
I have been working with an Accident Database from Seattle which contains the coordinates of around 200,000 accidents. What I want to do is to group those accidents geographically in districts, for example. To that end I visualised the grouping on a map using Folium but now I don't know how to extract those same groups into a new column in my database (or if it is even possible).
Here is what I have been doing with Folium and the result:
import folium
from folium import plugins
#Using Seattle's latitude and longitude
latitude = 47.608013
longitude = -122.335167
seattle_map = folium.Map(location = [latitude, longitude], zoom_start = 12)
incidents = plugins.MarkerCluster().add_to(seattle_map)
for lat, lng, label in zip(database.Y, database.X, database.SEVERITYCODE):
    folium.Marker(
        location=[lat, lng],
        icon=None,
        popup=folium.Popup(label),
    ).add_to(incidents)
seattle_map
If you want to add the districts of Seattle, you can use this GitHub repository, which has all kinds of geographical data about Seattle: https://github.com/seattleio/seattle-boundaries-data
For instance, if you want to add the zip code areas to the map, you can use the geojson file like this:
latitude = 47.608013
longitude = -122.335167
url = "https://raw.githubusercontent.com/seattleio/seattle-boundaries-data/master/data/zip-codes.geojson"
seattle_map = folium.Map(location = [latitude, longitude], zoom_start = 12)
folium.GeoJson(
url,
name='zip_code'
).add_to(seattle_map)
seattle_map
If you want to attach the zip code areas to the collision data, the best option is Folium's Choropleth map. You first need to work out which zip code area each collision belongs to; I use the shapely library for that. The code could look like this:
import json
import requests
import folium
import pandas as pd
from shapely.geometry import shape, Point
# Url of the geojson with zipcode of Seattle
url = "https://raw.githubusercontent.com/seattleio/seattle-boundaries-data/master/data/zip-codes.geojson"
# Import data of the collisions in Seattle
df = pd.read_csv("Collisions.csv")
# Keep only lat and long
df_clean = df.loc[:, ["X", "Y"]]
df_clean = df_clean.dropna()
r = requests.get(url)
features = r.json()["features"]  # parse the geojson once, not once per row
for index, row in df_clean.iterrows():
    point = Point(row["X"], row["Y"])  # X is longitude, Y is latitude
    for feature in features:
        polygon = shape(feature['geometry'])
        if polygon.contains(point):
            df_clean.loc[index, 'ZCTA5CE10'] = feature["properties"]['ZCTA5CE10']
            break
df_clean = df_clean.dropna()
result = df_clean.groupby(["ZCTA5CE10"])["X"].count()
result = pd.DataFrame(result)
result.reset_index(level=0, inplace=True)
#Using Seattle's latitude and longitude
latitude = 47.608013
longitude = -122.335167
seattle_map = folium.Map(location = [latitude, longitude], zoom_start = 12)
folium.Choropleth(
geo_data=url,
name='choropleth',
data=result,
columns=["ZCTA5CE10", 'X'],
key_on='feature.properties.ZCTA5CE10',
fill_color='YlOrRd',
).add_to(seattle_map)
seattle_map
There is a lot of room to improve the performance of this code (a for loop is clearly not the best way to create a new DataFrame column).
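One cheap speed-up, swapped in here as an illustration rather than taken from the answer, is a bounding-box pre-filter: compute each polygon's bounding box once and run the exact containment test only on polygons whose box contains the point. The sketch below uses plain GeoJSON dictionaries and made-up square "zip codes". The helpers `bbox_of`, `candidates` and `square` are hypothetical names.

```python
def bbox_of(geometry):
    """Return (min_x, min_y, max_x, max_y) of a GeoJSON Polygon or MultiPolygon."""
    coords = geometry["coordinates"]
    polygons = [coords] if geometry["type"] == "Polygon" else coords
    xs = [pt[0] for poly in polygons for ring in poly for pt in ring]
    ys = [pt[1] for poly in polygons for ring in poly for pt in ring]
    return min(xs), min(ys), max(xs), max(ys)


def candidates(indexed_features, x, y):
    """Yield only the features whose bounding box contains the point (x, y)."""
    for (min_x, min_y, max_x, max_y), feature in indexed_features:
        if min_x <= x <= max_x and min_y <= y <= max_y:
            yield feature


def square(x0, y0, zip_code):
    """Build a made-up 1x1 square feature for the demonstration."""
    ring = [[x0, y0], [x0 + 1, y0], [x0 + 1, y0 + 1], [x0, y0 + 1], [x0, y0]]
    return {
        "type": "Feature",
        "properties": {"ZCTA5CE10": zip_code},
        "geometry": {"type": "Polygon", "coordinates": [ring]},
    }


features = [square(0, 0, "98101"), square(5, 5, "98102")]
# Compute every bounding box once, before looping over the points.
indexed = [(bbox_of(f["geometry"]), f) for f in features]
hits = [f["properties"]["ZCTA5CE10"] for f in candidates(indexed, 0.5, 0.5)]
print(hits)
```

In the loop above, polygon.contains(point) would then only be called for the features returned by candidates. A spatial join with geopandas.sjoin is another option and is probably faster still for 200,000 points.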
I am surely missing something in the choropleth configuration. Please find the code below.
import pandas as pd
import folium
df = pd.read_csv("https://cocl.us/sanfran_crime_dataset",index_col=0)
# group by neighborhood
sf = df.groupby('PdDistrict').count()
sf = pd.DataFrame(sf,columns=['Category']) # remove unneeded columns
sf.reset_index(inplace=True) # default index, otherwise groupby column becomes index
sf.rename(columns={'PdDistrict':'Neighborhood','Category':'Count'}, inplace=True)
sf.sort_values(by='Count', inplace=True, ascending=False)
sf
# San Francisco latitude and longitude values
latitude = 37.77
longitude = -122.42
sf_neighborhood_geo = 'https://raw.githubusercontent.com/codeforamerica/click_that_hood/master/public/data/san-francisco.geojson'
# Create map
sf_map = folium.Map(location=[latitude,longitude], zoom_start=12)
# Use json file TEST based on class
sf_map.choropleth(
geo_data=sf_neighborhood_geo,
data=sf,
columns=['Neighborhood','Count'],
key_on='name',
fill_color='YlOrRd',
fill_opacity='0.7',
line_opacity='0.3',
legend_name='Crime Rate in San Francisco, by Neighborhood')
folium.LayerControl().add_to(sf_map)
# display the map
sf_map
Please let me know which part of the choropleth is not correct.
First of all, please use class folium.Choropleth() instead of method choropleth() which is deprecated.
For example, for your problem:
m = folium.Map(location=[latitude,longitude], zoom_start=12)
folium.Choropleth(geo_data=sf_neighborhood_geo,
name='choropleth',
data=sf,
columns=['Neighborhood','Count'],
key_on='feature.properties.name',
fill_color='YlOrRd',
fill_opacity=0.7,
line_opacity=0.2,
legend_name='Crime Rate in San Francisco, by Neighborhood').add_to(m)
folium.LayerControl().add_to(m)
Having said that, there are two problems in your code:
1. According to the geojson file, key_on='name' should be key_on='feature.properties.name'.
2. The Neighborhood column in your DataFrame does not contain names that appear in the geojson file, so the areas will most likely not be colored by your data.
In order to obtain a meaningful choropleth map, the names in sf_neighborhood_geo should correspond to the values in sf['Neighborhood'].
I'm trying to follow the blog post from Domino lab, Creating interactive crime maps with Folium, and I found that its code is too old to run Folium's choropleth maps and markers. An older version on the Domino platform (2015) seems to work, but the latest IPython notebook doesn't. So I'm guessing Folium changed something about markers? I tried to find the change but couldn't. Is anyone familiar with this library? If so, please give me some advice.
My code below:
from IPython.display import HTML
def display(m, height=500):
    """Takes a folium instance and embed HTML."""
    m._build_map()
    srcdoc = m.HTML.replace('"', '&quot;')
    embed = HTML('<iframe srcdoc="{0}" '
                 'style="width: 100%; height: {1}px; '
                 'border: none"></iframe>'.format(srcdoc, height))
    return embed
import folium
import pandas as pd
SF_COORDINATES = (37.76, -122.45)
crimedata = pd.read_csv('data/SFPD_Incidents_-_Current_Year__2015_.csv')
#for speed purposes
MAX_RECORDS = 1000
#create empty map zoomed in on San Francisco
map = folium.Map(location=SF_COORDINATES, zoom_start=12)
#add a marker for every record in the filtered data, use a clustered view
for each in crimedata[0:MAX_RECORDS].iterrows():
    map.simple_marker(
        location = [each[1]['Y'],each[1]['X']],
        clustered_marker = True)
display(map)
#definition of the boundaries in the map
district_geo = r'data/sfpddistricts.json'
#calculating total number of incidents per district
crimedata2 = pd.DataFrame(crimedata['PdDistrict'].value_counts().astype(float))
crimedata2.to_json('data/crimeagg.json')
crimedata2 = crimedata2.reset_index()
crimedata2.columns = ['District', 'Number']
#creation of the choropleth
map1 = folium.Map(location=SF_COORDINATES, zoom_start=12)
map1.geo_json(geo_path = district_geo,
data_out = 'data/crimeagg.json',
data = crimedata2,
columns = ['District', 'Number'],
key_on = 'feature.properties.DISTRICT',
fill_color = 'YlOrRd',
fill_opacity = 0.7,
line_opacity = 0.2,
legend_name = 'Number of incidents per district')
display(map1)
Not sure if you mean markers (popups) or the choropleth method itself isn't working?
The map1.geo_json() method is deprecated.
Instead, try:
map1.choropleth(geo_path = district_geo,
data_out = 'data/crimeagg.json',
data = crimedata2,
columns = ['District', 'Number'],
key_on = 'feature.properties.DISTRICT',
fill_color = 'YlOrRd',
fill_opacity = 0.7,
line_opacity = 0.2,
legend_name = 'Number of incidents per district')
The map.choropleth method worked for me, but don't know if they fixed the popup issue for choropleth maps. Hope this helps!
The mapObject.choropleth method is being deprecated.
folium.GeoJson is the suggested method as per this GitHub issue: https://github.com/python-visualization/folium/issues/589
A comment in that issue links to this example, which shows how to build the choropleth:
http://nbviewer.jupyter.org/github/python-visualization/folium/blob/master/examples/GeoJSON_and_choropleth.ipynb?flush_cache=true
TLDR
replace geo_json with GeoJson
and for the args like fill_color, use fillColor: <hex_color> in the style_function dictionary kwarg.
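To make the style_function idea concrete, here is a rough sketch of a function that picks a fillColor per feature from a small palette. It is only an illustration: the counts, the palette and the helper name `make_style_function` are made up, and DISTRICT is the property name used in the question's key_on.

```python
def make_style_function(values, key, palette):
    """Build a style_function that colors a feature by its value.

    values  -- dict mapping a feature property value to a number
    key     -- name of the feature property used for the lookup
    palette -- list of hex colors, ordered from low to high
    """
    low, high = min(values.values()), max(values.values())
    span = (high - low) or 1  # avoid dividing by zero if all values are equal

    def style_function(feature):
        value = values.get(feature["properties"].get(key))
        if value is None:
            fill = "#cccccc"  # grey for features without data
        else:
            index = int((value - low) / span * (len(palette) - 1))
            fill = palette[index]
        return {"fillColor": fill, "color": "black",
                "weight": 1, "fillOpacity": 0.7}

    return style_function


counts = {"MISSION": 120.0, "RICHMOND": 40.0}  # made-up incident counts
palette = ["#ffffb2", "#fd8d3c", "#bd0026"]    # light yellow to dark red
style = make_style_function(counts, "DISTRICT", palette)
example = style({"properties": {"DISTRICT": "MISSION"}})
print(example)
```

The returned function could then be passed along the lines of folium.GeoJson(district_geo, style_function=style).add_to(map1). Unlike the choropleth helper, this approach does not draw a legend by itself.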