Plotly choropleth can't read State iso codes - python

I want to visualize the number of crimes by state using plotly express.
This is the code :
import plotly.express as px
fig = px.choropleth(grouped, locations="Code",
                    color="Incident",
                    hover_name="Code",
                    animation_frame='Year',
                    scope='usa')
fig.show()
The dataframe itself looks like this:
I only get blank map:
What is wrong with the code?

The map is blank because locationmode is not set. By default, px.choropleth treats the values in locations as three-letter ISO country codes (locationmode='ISO-3'), so two-letter state abbreviations such as AL match nothing. Adding locationmode='USA-states' fixes this; the official reference has a similar example. The sample data below was created to resemble yours.
df.head()
   Year Code       State  incident
0  1980   AL     Alabama      1445
1  1980   AK      Alaska       970
2  1980   AZ     Arizona      3092
3  1980   AR    Arkansas      1557
4  1980   CA  California      1614
import plotly.express as px
fig = px.choropleth(grouped,
                    locations='Code',
                    locationmode='USA-states',
                    color='incident',
                    hover_name="Code",
                    animation_frame='Year',
                    scope="usa")
fig.show()
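Because 'USA-states' matches on two-letter postal abbreviations, checking the Code column before plotting can rule out bad values as the cause of a blank map. Here is a minimal sketch in plain Python. The helper name find_unmatched_state_codes and the hard-coded code list are my own assumptions, not part of plotly, and the check assumes matching is exact (upper case, no surrounding spaces):

```python
# Two-letter USPS abbreviations for the 50 states plus DC.
# Hard-coded for illustration; this list is not taken from plotly itself,
# and whether DC is actually drawn depends on plotly's built-in geometry.
US_STATE_CODES = frozenset("""
AL AK AZ AR CA CO CT DE FL GA HI ID IL IN IA KS KY LA ME MD
MA MI MN MS MO MT NE NV NH NJ NM NY NC ND OH OK OR PA RI SC
SD TN TX UT VT VA WA WV WI WY DC
""".split())


def find_unmatched_state_codes(codes):
    """Return the distinct values that are not exact two-letter state codes."""
    return sorted({c for c in codes if c not in US_STATE_CODES})


# Hypothetical sample: lower-case, full-name and padded values are flagged.
sample = ["AL", "AK", "az", "Alabama", "CA", " TX"]
unmatched = find_unmatched_state_codes(sample)
print(unmatched)
```

With the data frame from the question, this could be called as find_unmatched_state_codes(grouped['Code']); an empty list means every code is a plain state abbreviation.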

Related

python graph not showing up

So I tried graphing a data frame using pandas and when I typed it out there is a blank image that shows up with no errors or anything. I was hoping someone knows what the problem could be and how I can solve it.
I was wondering if this is a backend issue or what. Thank you!
For faster answers, please post the code as text along with sample data so the problem can be reproduced. Since I have neither, the cause is a guess: I suspect the country name is not being retrieved from the dictionary. I adapted a sample from the official reference to your code. I took the 10 most populous countries in the 2007 sample data, then plotted the time series for those countries from the original data frame. The loop iterates over a dictionary that maps country names to arbitrary colors.
import plotly.express as px
from plotly.subplots import make_subplots
df1 = px.data.gapminder().query('year==2007').sort_values('pop', ascending=False).head(10)
df1
            country continent  year  lifeExp         pop  gdpPercap iso_alpha  iso_num
299           China      Asia  2007   72.961  1318683096    4959.11       CHN      156
707           India      Asia  2007   64.698  1110396331    2452.21       IND      356
1619  United States  Americas  2007   78.242   301139947    42951.7       USA      840
719       Indonesia      Asia  2007    70.65   223547000    3540.65       IDN      360
179          Brazil  Americas  2007    72.39   190010647     9065.8       BRA       76
1175       Pakistan      Asia  2007   65.483   169270617    2605.95       PAK      586
107      Bangladesh      Asia  2007   64.062   150448339    1391.25       BGD       50
1139        Nigeria    Africa  2007   46.859   135031164    2013.98       NGA      566
803           Japan      Asia  2007   82.603   127467972    31656.1       JPN      392
995          Mexico  Americas  2007   76.195   108700891    11977.6       MEX      484
# create dict country and color
colors = px.colors.sequential.Plasma
color = {k:v for k,v in zip(df1.country,colors)}
{'China': '#0d0887',
'India': '#46039f',
'United States': '#7201a8',
'Indonesia': '#9c179e',
'Brazil': '#bd3786',
'Pakistan': '#d8576b',
'Bangladesh': '#ed7953',
'Nigeria': '#fb9f3a',
'Japan': '#fdca26',
'Mexico': '#f0f921'}
# top10 data: all years for the ten countries selected above
# (@ refers to a Python variable inside a pandas query string)
df1_top10 = px.data.gapminder().query('country in @df1.country')
import plotly.graph_objects as go
fig = go.Figure()
colors = px.colors.sequential.Plasma
# one line trace per country, colored from the dictionary
for k,v in color.items():
    fig.add_trace(go.Scatter(
        x=df1_top10[df1_top10['country']==k]['year'],
        y=df1_top10[df1_top10['country']==k]['lifeExp'],
        name=k,
        mode='markers+text+lines',
        marker_color='black',
        marker_size=3,
        line=dict(color=color[k]),
        yaxis='y1'))
fig.update_layout(
    title="Top 10 Country wise Life Ladder trend",
    xaxis_title="Year",
    yaxis_title="Life Ladder",
    template='ggplot2',
    font=dict(size=16,
              color="Black",
              family="Garamond"
              ),
    xaxis=dict(showgrid=True),
    yaxis=dict(showgrid=True)
)
fig.show()

ValueError when creating Plotly Choropleth

I'm trying to create a choropleth chart using plotly express. I have two files, my geojson file and my data file. An example snippet for one country in my geojson file is below:
{'type': 'Feature',
'properties': {'ADMIN': 'Aruba', 'ISO_A3': 'ABW'},
'geometry': {'type': 'Polygon',
'coordinates': [[[-69.99693762899992, 12.577582098000036],
[-69.93639075399994, 12.53172435100005],
[-69.92467200399994, 12.519232489000046],
[-69.91576087099992, 12.497015692000076],
[-69.88019771999984, 12.453558661000045],
[-69.87682044199994, 12.427394924000097],
[-69.88809160099993, 12.417669989000046],
[-69.90880286399994, 12.417792059000107],
[-69.93053137899989, 12.425970770000035],
[-69.94513912699992, 12.44037506700009],
[-69.92467200399994, 12.44037506700009],
[-69.92467200399994, 12.447211005000014],
[-69.95856686099992, 12.463202216000099],
[-70.02765865799992, 12.522935289000088],
[-70.04808508999989, 12.53115469000008],
[-70.05809485599988, 12.537176825000088],
[-70.06240800699987, 12.546820380000057],
[-70.06037350199995, 12.556952216000113],
[-70.0510961579999, 12.574042059000064],
[-70.04873613199993, 12.583726304000024],
[-70.05264238199993, 12.600002346000053],
[-70.05964107999992, 12.614243882000054],
[-70.06110592399997, 12.625392971000068],
[-70.04873613199993, 12.632147528000104],
[-70.00715084499987, 12.5855166690001],
[-69.99693762899992, 12.577582098000036]]]},
'id': 'ABW'}
The head of df is shown below; the 'data' column will be used to color the map.
  Location_Site               Country    City                 Cluster_Name Market_Type    data   id
2        IT-MIL                 Italy   Milan                        Italy      Mature  73.14%  ITA
3        ES-MAD                 Spain  Madrid                       Iberia      Mature  55.27%  ESP
4        PT-LIS              Portugal  Lisbon                       Iberia      Medium  45.71%  PRT
5        AE-DXB  United Arab Emirates   Dubai  EMEA Emerging Markets (EEM)    Emerging  62.98%  ARE
6        EG-CAI                 Egypt   Cairo  EMEA Emerging Markets (EEM)    Emerging  20.36%  EGY
The below code snippet is what I'm trying to execute to plot my choropleth graph
fig = px.choropleth(df,
locations = 'id',
geojson = data,
color = 'data')
fig.show()
I am receiving the below error after execution:
ValueError: The first argument to the plotly.graph_objs.layout.Template
constructor must be a dict or
an instance of :class:`plotly.graph_objs.layout.Template`
Any ideas on what might be creating this error? Thanks!
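To solve your problem, you need to tie the id values in the data frame to the ISO_A3 property of the GeoJSON features, which is what featureidkey does. Because the sample GeoJSON contains only Aruba, I changed the id of the Italy row from ITA to ABW so that one row matches, and the map was drawn.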
import plotly.express as px
geo_data = {'type': 'Feature',
'properties': {'ADMIN': 'Aruba', 'ISO_A3': 'ABW'},
'geometry': {'type': 'Polygon',
'coordinates': [[[-69.99693762899992, 12.577582098000036],
[-69.93639075399994, 12.53172435100005],
[-69.92467200399994, 12.519232489000046],
[-69.91576087099992, 12.497015692000076],
[-69.88019771999984, 12.453558661000045],
[-69.87682044199994, 12.427394924000097],
[-69.88809160099993, 12.417669989000046],
[-69.90880286399994, 12.417792059000107],
[-69.93053137899989, 12.425970770000035],
[-69.94513912699992, 12.44037506700009],
[-69.92467200399994, 12.44037506700009],
[-69.92467200399994, 12.447211005000014],
[-69.95856686099992, 12.463202216000099],
[-70.02765865799992, 12.522935289000088],
[-70.04808508999989, 12.53115469000008],
[-70.05809485599988, 12.537176825000088],
[-70.06240800699987, 12.546820380000057],
[-70.06037350199995, 12.556952216000113],
[-70.0510961579999, 12.574042059000064],
[-70.04873613199993, 12.583726304000024],
[-70.05264238199993, 12.600002346000053],
[-70.05964107999992, 12.614243882000054],
[-70.06110592399997, 12.625392971000068],
[-70.04873613199993, 12.632147528000104],
[-70.00715084499987, 12.5855166690001],
[-69.99693762899992, 12.577582098000036]]]},
'id': 'ABW'}
import pandas as pd
import numpy as np
import io
data = '''
Location_Site Country City Cluster_Name Market_Type data id
2 IT-MIL Italy Milan Italy Mature 73.14% ABW
3 ES-MAD Spain Madrid Iberia Mature 55.27% ESP
4 PT-LIS Portugal Lisbon Iberia Medium 45.71% PRT
5 AE-DXB "United Arab Emirates" Dubai "EMEA Emerging Markets (EEM)" Emerging 62.98% ARE
6 EG-CAI Egypt Cairo "EMEA Emerging Markets (EEM)" Emerging 20.36% EGY
'''
df = pd.read_csv(io.StringIO(data), sep=r'\s+')
fig = px.choropleth(df,
locations = 'id',
geojson = geo_data,
featureidkey="properties.ISO_A3",
color = 'data')
fig.show()
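To see which rows will actually be drawn, it can help to reproduce the matching by hand: each value in locations is compared with the value found at featureidkey in every feature. The sketch below is a plain-Python illustration of that idea and not plotly's internal code; resolve_key and split_by_match are hypothetical helper names, and the geometry is left out because only the keys matter for matching.

```python
def resolve_key(feature, dotted_key):
    """Follow a dotted path such as 'properties.ISO_A3' into a GeoJSON feature."""
    value = feature
    for part in dotted_key.split("."):
        value = value[part]
    return value


def split_by_match(locations, features, featureidkey="id"):
    """Split locations into those with and without a matching feature."""
    known = {resolve_key(f, featureidkey) for f in features}
    matched = [loc for loc in locations if loc in known]
    unmatched = [loc for loc in locations if loc not in known]
    return matched, unmatched


# Geometry omitted on purpose: only the identifying keys are used here.
features = [
    {"type": "Feature",
     "properties": {"ADMIN": "Aruba", "ISO_A3": "ABW"},
     "id": "ABW"},
]
ids = ["ABW", "ESP", "PRT", "ARE", "EGY"]  # the id column after the edit above

matched, unmatched = split_by_match(ids, features, "properties.ISO_A3")
print(matched)    # ids that have a feature to color
print(unmatched)  # ids that would be silently left off the map
```

With the full country GeoJSON loaded, the unmatched list should be empty; anything left in it points to a code that differs between the two files.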

Plotting a Histogram

I need to plot a histogram for the data below, country wise quantity sum.
Country Quantity
0 United Kingdom 4263829
1 Netherlands 200128
2 EIRE 142637
3 Germany 117448
4 France 110480
5 Australia 83653
6 Sweden 35637
7 Switzerland 30325
8 Spain 26824
9 Japan 25218
So far I have tried this, but I am unable to specify the axes myself:
df.plot(x='Country', y='Quantity', kind='hist', bins=10)
Try a bar plot instead of a histogram:
df.plot.bar(x='Country', y='Quantity')
Try this :
import matplotlib.pyplot as plt
plt.bar(df['Country'],df['Quantity'])
plt.show()
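The reason kind='hist' cannot give a per-country chart is that a histogram ignores the Country column: it bins the Quantity values and counts how many rows fall into each bin. A minimal pure-Python sketch of equal-width binning on the data from the question shows the effect. The helper equal_width_counts is hypothetical and only approximates what the plotting library does:

```python
def equal_width_counts(values, bins=10):
    """Count how many values fall into each of `bins` equal-width intervals."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / bins
    counts = [0] * bins
    for v in values:
        # The maximum value lands exactly on the upper edge, so clamp it
        # into the last bin.
        idx = min(int((v - lo) / width), bins - 1)
        counts[idx] += 1
    return counts


quantities = [4263829, 200128, 142637, 117448, 110480,
              83653, 35637, 30325, 26824, 25218]
counts = equal_width_counts(quantities)
print(counts)  # → [9, 0, 0, 0, 0, 0, 0, 0, 0, 1]
```

Nine countries share the first bin and the United Kingdom sits alone in the last one, which is why a bar chart of Quantity per Country is the better fit here.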

How to do a point in polygon query efficiently using geopandas?

I have a shapefile that has all the counties for the US, and I am doing a bunch of queries at a lat/lon point and then finding what county the point lies in. Right now I am just looping through all the counties and doing pnt.within(county). This isn't very efficient. Is there a better way to do this?
Your situation looks like a typical case where spatial joins are useful. The idea of spatial joins is to merge data using geographic coordinates instead of using attributes.
Three possibilities in geopandas:
intersects
within
contains
It seems like you want within, which is possible using the following syntax:
geopandas.sjoin(points, polygons, how="inner", predicate='within')
Note: You need to have installed rtree to be able to perform such operations. If you need to install this dependency, use pip or conda to install it
Example
As an example, let's plot European cities. The two example datasets are
import geopandas
import matplotlib.pyplot as plt
world = geopandas.read_file(geopandas.datasets.get_path('naturalearth_lowres'))
cities = geopandas.read_file(geopandas.datasets.get_path('naturalearth_cities'))
countries = world[world['continent'] == "Europe"].rename(columns={'name':'country'})
countries.head(2)
pop_est continent country iso_a3 gdp_md_est geometry
18 142257519 Europe Russia RUS 3745000.0 MULTIPOLYGON (((178.725 71.099, 180.000 71.516...
21 5320045 Europe Norway -99 364700.0 MULTIPOLYGON (((15.143 79.674, 15.523 80.016, ...
cities.head(2)
name geometry
0 Vatican City POINT (12.45339 41.90328)
1 San Marino POINT (12.44177 43.93610)
cities is a worldwide dataset and countries is a Europe-wide dataset.
Both datasets need to be in the same coordinate reference system. If they are not, use .to_crs before joining.
data_merged = geopandas.sjoin(cities, countries, how="inner", predicate='within')
Finally, to see the result, let's draw a map:
f, ax = plt.subplots(1, figsize=(20,10))
data_merged.plot(ax=ax)
countries.plot(ax=ax, alpha=0.25, linewidth=0.1)
plt.show()
and the underlying dataset merges together the information we need
data_merged.head(5)
name geometry index_right pop_est continent country iso_a3 gdp_md_est
0 Vatican City POINT (12.45339 41.90328) 141 62137802 Europe Italy ITA 2221000.0
1 San Marino POINT (12.44177 43.93610) 141 62137802 Europe Italy ITA 2221000.0
192 Rome POINT (12.48131 41.89790) 141 62137802 Europe Italy ITA 2221000.0
2 Vaduz POINT (9.51667 47.13372) 114 8754413 Europe Austria AUT 416600.0
184 Vienna POINT (16.36469 48.20196) 114 8754413 Europe Austria AUT 416600.0
Here, I used inner join method but that's a parameter you can change if, for instance, you want to keep all points, including those not within a polygon.
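Much of the speed-up over looping with pnt.within(county) comes from a spatial index, which compares cheap bounding boxes first and runs the exact geometric test only on the candidates that survive. The following is a simplified pure-Python sketch of that idea, not how geopandas is implemented. It scans the boxes linearly instead of using a tree and uses ray casting for the exact test; all function names and the two toy regions are hypothetical.

```python
def bounding_box(polygon):
    """Return (min_x, min_y, max_x, max_y) for a list of (x, y) vertices."""
    xs = [x for x, _ in polygon]
    ys = [y for _, y in polygon]
    return min(xs), min(ys), max(xs), max(ys)


def point_in_polygon(point, polygon):
    """Ray casting: toggle on each edge crossed by a ray going right from the point."""
    px, py = point
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        if (y1 > py) != (y2 > py):  # edge straddles the horizontal ray
            x_cross = x1 + (py - y1) * (x2 - x1) / (y2 - y1)
            if px < x_cross:
                inside = not inside
    return inside


def locate(point, regions, boxes):
    """Return the name of the first region containing the point, or None."""
    px, py = point
    for name, polygon in regions.items():
        min_x, min_y, max_x, max_y = boxes[name]
        # Cheap rejection: skip the exact test when the point is outside the box.
        if not (min_x <= px <= max_x and min_y <= py <= max_y):
            continue
        if point_in_polygon(point, polygon):
            return name
    return None


regions = {
    "A": [(0, 0), (4, 0), (4, 4), (0, 4)],  # square
    "B": [(5, 0), (9, 0), (7, 4)],          # triangle
}
boxes = {name: bounding_box(poly) for name, poly in regions.items()}

print(locate((1, 1), regions, boxes))      # inside the square
print(locate((7, 1), regions, boxes))      # inside the triangle
print(locate((5.2, 3.5), regions, boxes))  # in B's box but outside B itself
```

The last point shows why the exact test is still needed after the box check. A real spatial index also avoids the linear scan over boxes, which is where the gain with thousands of counties comes from.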

How do I make a plot like the one given below with df.plot function?

I have this data for 2007, with population in millions and GDP in billions; the index column is country.
continent year lifeExpectancy population gdpPerCapita GDP Billions
country
China Asia 2007 72.961 1318.6831 4959.11485 6539.50093
India Asia 2007 64.698 1110.39633 2452.21041 2722.92544
United States Americas 2007 78.242 301.139947 42951.6531 12934.4585
Indonesia Asia 2007 70.65 223.547 3540.65156 791.502035
Brazil Americas 2007 72.39 190.010647 9065.80083 1722.59868
Pakistan Asia 2007 65.483 169.270617 2605.94758 441.110355
Bangladesh Asia 2007 64.062 150.448339 1391.25379 209.311822
Nigeria Africa 2007 46.859 135.031164 2013.97731 271.9497
Japan Asia 2007 82.603 127.467972 31656.0681 4035.1348
Mexico Americas 2007 76.195 108.700891 11977.575 1301.97307
I am trying to plot a histogram as the following:
This was plotted using matplotlib (code below), and I want to get this with df.plot method.
The code for plotting with matplotlib:
ax = data.plot(y=[3],kind = "bar")
data.plot(y = [3,5],kind = "bar",secondary_y = True,ax = ax,style='g:', figsize = (24, 6))
plt.show()
You could use df.plot() on just the columns you want plotted, and pass the column that belongs on the right-hand axis as the secondary_y argument:
data[['population','gdpPerCapita']].plot(kind='bar', secondary_y='gdpPerCapita')
If you want to set the y labels for each side, you have to get all the axes of the plot (in this case two y axes) and set the labels on each.
ax1, ax2 = plt.gcf().get_axes()
ax1.set_ylabel('Population')
ax2.set_ylabel('GDP')
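An alternative sketch avoids relying on the order returned by get_axes(): when secondary_y names only some of the plotted columns, pandas exposes the secondary axis as right_ax on the axes object it returns. The three-row frame below is a subset of the question's data, and the Agg backend is chosen only so that the sketch runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; drop this line in a notebook
import matplotlib.pyplot as plt
import pandas as pd

# Subset of the question's data: population in millions, GDP per capita.
data = pd.DataFrame(
    {
        "population": [1318.6831, 1110.39633, 301.139947],
        "gdpPerCapita": [4959.11485, 2452.21041, 42951.6531],
    },
    index=pd.Index(["China", "India", "United States"], name="country"),
)

# The returned axes is the primary (left) one; the secondary is ax.right_ax.
ax = data[["population", "gdpPerCapita"]].plot(kind="bar", secondary_y="gdpPerCapita")
ax.set_ylabel("Population (millions)")
ax.right_ax.set_ylabel("GDP per capita")

plt.savefig("population_gdp.png")  # hypothetical output file name
```

Labeling through ax and ax.right_ax keeps each label tied to the axis it describes, even if other axes are later added to the figure.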
