Python Folium Choropleth plots - python

I am trying to plot a choropleth map of the UK using a GeoJSON data file which I downloaded from here: https://data.gov.uk/dataset/regions-december-2016-full-extent-boundaries-in-england2
Below is an example of the json data:
{
"type":"FeatureCollection",
"features":[
{
"type":"Feature",
"properties":{"objectid":1,"rgn16cd":"E12000001","rgn16nm":"North East","bng_e":417313,"bng_n":600358,"long":-1.72889996,"lat":55.2970314,"st_areashape":8675727008.425964,"st_lengthshape":795456.8022925043},
"geometry":{
"type":"MultiPolygon",
"coordinates":[[[[-2.0301237629331097,55.80991509288915],[-2.030069429494278,55.80991420787532],[-2.0300215494803053,55.80992140589199],[-2.0300040593387223,55.80993039246682],
My csv file looks like this:
[image: csv]
I essentially just want to plot the Taxi column using folium.
The problem is the plot does not show anything. I used the following code.
import pandas as pd
import os
import json
# read in population data
df = pd.read_csv('map-data.csv')
import folium
from branca.utilities import split_six
state_geo = 'Regions_December_2016_Full_Extent_Boundaries_in_England.geojson'
m = folium.Map(location=[55, 4], zoom_start=5)
m.choropleth(
geo_data=state_geo,
data=df,
columns=['LA-Code', 'Taxi'],
key_on='feature.properties.rgn16cd',
fill_color='YlGn',
fill_opacity=0.7,
line_opacity=0.2,
legend_name='h',
highlight=True
)
m
I think the problem relates to the key_on argument.
I can access the correct code in the json file using something like this:
geodata['features'][0]['properties']['rgn16cd']
which gives me back the correct LA code (E12000001), but it does not seem to work in the code above. I also tried using features instead of feature in the key_on argument, but that gives me an error:
AttributeError: 'NoneType' object has no attribute 'get'
Does anyone have any ideas what the problem is? Thanks.
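One way to test the key_on suspicion without drawing anything is to compare the codes in the CSV column with the codes stored in the GeoJSON properties. The sketch below uses only the standard library; the inline sample data and the helper name unmatched_keys are illustrative assumptions, not part of folium or of the real files.

```python
import csv
import io
import json

# Tiny stand-ins for the real files (assumed shapes, not the actual data).
geojson_text = json.dumps({
    "type": "FeatureCollection",
    "features": [
        {"type": "Feature", "properties": {"rgn16cd": "E12000001"}, "geometry": None},
        {"type": "Feature", "properties": {"rgn16cd": "E12000002"}, "geometry": None},
    ],
})
csv_text = "LA-Code,Taxi\nE12000001,10\nE12000009,25\n"


def unmatched_keys(geo, rows, prop, column):
    """Return (CSV keys missing from the GeoJSON, GeoJSON keys missing from the CSV)."""
    geo_keys = {feature["properties"][prop] for feature in geo["features"]}
    csv_keys = {row[column] for row in rows}
    return csv_keys - geo_keys, geo_keys - csv_keys


geo = json.loads(geojson_text)
rows = list(csv.DictReader(io.StringIO(csv_text)))
only_in_csv, only_in_geo = unmatched_keys(geo, rows, "rgn16cd", "LA-Code")
print("in CSV only:", sorted(only_in_csv))
print("in GeoJSON only:", sorted(only_in_geo))
```

If both sets come back empty for the real files, the join keys agree and the cause is more likely elsewhere, for example in how the map is displayed.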

From the folium library's documentation on github:
To display it in a Jupyter notebook, simply ask for the object representation:
In : m
It is likely that the source of your issue is that you are not in a Jupyter notebook. Saving the map as an HTML file and opening it in a browser works fine, without changing the JSON file. Try the code below:
import pandas as pd
import folium
# read in population data
df = pd.read_csv('map-data.csv')
state_geo = 'Regions_December_2016.geojson'
m = folium.Map(location=[55, 4], zoom_start=5)
m.choropleth(
geo_data=state_geo,
data=df,
columns=['LA-Code', 'Taxi'],
key_on='feature.properties.rgn16cd',
fill_color='YlGn',
fill_opacity=0.7,
line_opacity=0.2,
legend_name='h',
highlight=True
)
m.save("my_map.html")
To open the map from the script, you can call your web browser through subprocess.call or os.system by adding these lines at the end of your script:
import os
os.system("firefox my_map.html")
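If Firefox is not installed, or the script has to run on several platforms, the standard-library webbrowser module may be a more portable option. This is a minimal sketch; the helper names map_uri and open_map are made up for illustration, and the file name is the one used in the answer above.

```python
import webbrowser
from pathlib import Path


def map_uri(filename):
    """Build a file:// URI for a saved map so the default browser can open it."""
    return Path(filename).resolve().as_uri()


def open_map(filename="my_map.html"):
    # webbrowser.open hands the URI to the system's default browser.
    return webbrowser.open(map_uri(filename))
```

Calling open_map() right after m.save("my_map.html") should open the saved map in whichever browser is the system default.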

Related

Plotting a geopandas dataframe using plotly

I have a geopandas dataframe, which consists of the region name (District), the geometry column, and the Amount column. My goal is to plot a choropleth map using the method described here:
https://plotly.com/python/choropleth-maps/#using-geopandas-data-frames
Here’s a snippet of my dataframe
I also checked that my columns were in the right format/type.
And here's the code I used to plot the map
fig = px.choropleth(merged,
geojson=merged.geometry,
locations=merged.index,
color="Amount")
fig.update_geos(fitbounds="locations", visible=False)
fig.show()
It produced the figure below,
which is obviously not the right figure. For some reason, it doesn't show the map; instead it shows a line, and when I zoom in, I can see the map, but it has lines running through it. Like this:
Has anyone run into a similar problem? If so, how were you able to resolve it?
The Plotly version I am using is 4.7.0. I have tried upgrading to the most recent version, but it still didn't work.
Any help is greatly appreciated. Please find my code and the data on my github.
I'm posting @tgrandje's comment, which solved the problem, as an answer: reproject the geometries to EPSG:4326 (longitude/latitude) before plotting. Thanks to @Poopah and @tgrandje for the opportunity to post it.
import pandas as pd
import plotly.express as px
import geopandas as gpd
import pyproj
# reading in the shapefile
fp = "./data/"
map_df = gpd.read_file(fp)
map_df.to_crs(pyproj.CRS.from_epsg(4326), inplace=True)
df = pd.read_csv("./data/loans_amount.csv")
# join the geodataframe with the cleaned up csv dataframe
merged = map_df.set_index('District').join(df.set_index('District'))
#merged = merged.reset_index()
merged.head()
fig = px.choropleth(merged, geojson=merged.geometry, locations=merged.index, color="Amount")
fig.update_geos(fitbounds="locations", visible=False)
fig.show()
Another possible source of the problem (when using Plotly graph_objects) is mentioned in this answer over at gis.stackexchange.com:
The locations argument has to point to a column that matches GeoJSON's 'id's.
The geojson argument expects a dictionary.
To solve your problem, you should: (i) point locations to the dataframe's index, and (ii) turn your GeoJSON string to a dictionary.
It's not exactly the answer to your question, but I thought my problem was the same as yours and this helped me. So I am including the answer here.
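The two points from that answer can be checked with the standard library alone. The sketch below assumes a small made-up GeoJSON string and a hypothetical list of locations; it only shows the string-to-dictionary conversion and the id comparison, not the Plotly call itself.

```python
import json

# Assumed stand-in for a GeoJSON document held as a string.
geojson_text = '''{"type": "FeatureCollection", "features": [
  {"type": "Feature", "id": "A", "properties": {"District": "North"}, "geometry": null},
  {"type": "Feature", "id": "B", "properties": {"District": "South"}, "geometry": null}
]}'''

# (ii) turn the GeoJSON string into a dictionary
geojson_dict = json.loads(geojson_text)
feature_ids = [feature["id"] for feature in geojson_dict["features"]]

# (i) hypothetical locations, e.g. the values of a dataframe index
locations = ["A", "B", "C"]
missing = [loc for loc in locations if loc not in feature_ids]
print("locations without a matching feature id:", missing)
```

Any location listed as missing would have no shape to colour, which is one way a map can end up blank.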

Plotly Choropleth Map not Displaying Correctly

First off, I am asking this question because I do not believe it is related to the similarly titled question "plotly choropleth not plotting data"; I believe my problem has something to do with the custom GeoJSON boundaries and/or how I am accessing the data.
I use the crime.csv table from https://www.kaggle.com/ankkur13/boston-crime-data, and I am using a GEOJSON file from Boston Analytics (Fetched in the script). Currently, my code runs without errors, but it does not load the data onto the plot.
import pandas as pd
import plotly.graph_objs as go
from urllib.request import urlopen
import json
# Read Dataset
# Located at: https://www.kaggle.com/ankkur13/boston-crime-data
df = pd.read_csv("crime.csv")
with urlopen('http://bostonopendata-boston.opendata.arcgis.com/datasets/9a3a8c427add450eaf45a470245680fc_5.geojson?outSR={%22latestWkid%22:2249,%22wkid%22:102686}') as response:
    pd_districts = json.load(response)
df_agg = df.groupby("DISTRICT").agg(CRIMES=("YEAR","count"))
df_agg.reset_index(inplace=True)
df_agg = df_agg[df_agg['DISTRICT'] != 'nan']
df['DISTRICT'] = df['DISTRICT'].apply(lambda x: str(x))
fig = go.Figure(go.Choroplethmapbox(geojson=pd_districts,
locations=df_agg['DISTRICT'].unique(),
z=df_agg['CRIMES'],
featureidkey="features.properties.DISTRICT")
)
fig.update_layout(mapbox_style="carto-positron")
fig.update_geos(fitbounds="locations")
fig.show()
I believe it has something to do with my featureidkey. However, I have tried multiple variants such as properties.DISTRICT, properties.ID, but to no avail.
I also ran a few sanity checks to make sure that the data was accessible, and here they are:
print(df_agg['DISTRICT'].unique())
print(pd_districts['features'][0]['properties']['ID'])
print(df['DISTRICT'].dtype)
Any help would be appreciated.

Pandas pyplot throwing error "no numeric data to plot" when the dataset clearly has correct data

I have this very basic piece of code for learning box plots in matplotlib.pyplot. I am following a tutorial where it works perfectly well for the instructor but not for me. It's literally the same code, so I would like to know whether this feature has changed.
import pandas
import matplotlib.pyplot as plt
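# This is read from a CSV file, but the dataset is easily available on the web, especially on Kaggle
# Raw string so the backslashes in the Windows path are not treated as escape sequences
url = r"D:\PycharmProjects\ML\Datasets\pima-indians-diabetes-database\diabetes.csv"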
names = ['preg','plas','pres','skin','test','mass','pedi','age','class']
data = pandas.read_csv(url,names = names)
# this is where the issue arises
data.plot(kind='box', subplots=True, layout=(3,3), sharex=False, sharey=False)
plt.show()
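Before plotting, add this line to your code (the question imports pandas without the pd alias, so the call is written as pandas.to_numeric):
data = data.apply(pandas.to_numeric)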
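A possible cause of the error, assuming the CSV file already has its own header row: passing names= alone makes pandas read that header row as data, so every column contains text and nothing numeric is left to plot. The sketch below uses a made-up two-column miniature of such a file to show the effect and two ways around it; the sample values and column names are illustrative only.

```python
import io

import pandas as pd

# Assumed miniature of a CSV that already has a header row.
csv_text = "Pregnancies,Glucose\n6,148\n1,85\n"
names = ["preg", "plas"]

# names= alone: the file's header line is read as the first data row.
with_header_as_data = pd.read_csv(io.StringIO(csv_text), names=names)

# header=0 tells pandas to discard the file's header row and use names instead.
clean = pd.read_csv(io.StringIO(csv_text), names=names, header=0)

# Alternative: coerce non-numeric cells to NaN, then drop the affected rows.
coerced = with_header_as_data.apply(pd.to_numeric, errors="coerce").dropna()

print(with_header_as_data.dtypes)
print(clean.dtypes)
```

Without errors="coerce", pd.to_numeric raises an error on the header strings, so the coercing variant is the safer one when the header row has slipped into the data.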

python file does not run properly on Spyder

I am using Spyder as my Python IDE.
I tried to run this Python code:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import os
path = os.path.join(os.getcwd(), 'ex1data1.txt')
data = pd.read_csv(path, header = None, names = ['Population', 'Profit'])
data.head()
In theory, it is supposed to show me a table of data, since I am calling data.head() at the end, but when I run it, it only shows:
I thought it was supposed to display a new window with the data table.
What is wrong?
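You are calling data.head() in a .py script. When a script runs, the value of a bare expression is discarded; only an interactive console echoes it. The script therefore reads the data but never prints it. Try print(data.head())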
You want to print your data.head() so do that...
print(data.head())
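The difference can be reproduced without pandas or Spyder. The sketch below runs a tiny made-up script the way Python runs a .py file and captures what it writes to standard output; the script text and variable names are illustrative assumptions.

```python
import contextlib
import io

# A three-line "script": an assignment, a bare expression, then the same value printed.
script = "value = [1, 2, 3]\nvalue[:2]\nprint(value[:2])\n"

captured = io.StringIO()
with contextlib.redirect_stdout(captured):
    # Mode "exec" is how Python compiles a .py file; bare expressions are not echoed.
    exec(compile(script, "<script>", "exec"), {})

output = captured.getvalue()
print(repr(output))
```

Only the print call contributes to the captured output; the bare expression on the second line of the script produces nothing.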

How do I resolve a glyph not showing or BAD_COLUMN_NAME error with bokeh when importing my own excel or csv data?

I'm getting an error trying to plot different excel and csv data with bokeh.
The examples in the tutorial work, but not when I am making my own dataframes with read_csv or read_excel in pandas. The error says
(BAD_COLUMN_NAME): Glyph refers to nonexistent column name:
Sometimes I just get a blank styled figure object with no plots instead of the error message. My column names are correct; I copy-pasted them from the os.listdir() output. However, bokeh says the data frame doesn't have these columns and therefore cannot plot them.
My code is below:
from bokeh.io import output_notebook,show
from bokeh.plotting import figure
from bokeh.models import ColumnDataSource
import pandas as pd
import os
output_notebook()
df = pd.read_csv('weightData.csv')
source = ColumnDataSource(df)
p = figure(width=400,height=350)
p.circle('Weight','Fat mass',size=10, color='orange',
x_range_name='Weight', y_range_name='Fat Mass',
fill_alpha=0.3)
show(p)
Here's a screenshot of the error I'm getting:
You are not passing the source to p.circle(...). You need to pass source=source as an argument. Unless you pass it a source, p.circle will just create and use a default (empty) one.
