Choropleth map not filled in / hover isn't working - Python

I created a small Excel file listing the confirmed cases, deaths, and recovered cases of the coronavirus in the U.S., but I can't get the choropleth map to work.
Here's my code:
import pandas as pd
import chart_studio.plotly as py
import plotly.graph_objs as go
from plotly.offline import download_plotlyjs, init_notebook_mode, plot, iplot
init_notebook_mode(connected=True)
col_Names = ["State", "Country", "Time Discovered", "Confirmed", "Deaths", "Recovered"]
df = pd.read_csv("coronavirusUS.csv", names=col_Names)
data = dict(type='choropleth',
            colorscale='magma',
            locations=df['State'],
            locationmode='USA-states',
            z=df['Confirmed'],
            text=df['Confirmed'],
            marker=dict(line=dict(color='rgb(255, 255, 255)', width=2)),
            colorbar={'title': 'Coronavirus in the U.S'})
layout = dict(title='Coronavirus in the US',
              geo=dict(scope='usa',
                       showlakes=True,
                       lakecolor='rgb(85, 173, 240)'))
choromap = go.Figure(data=[data], layout=layout)
iplot(choromap)
And then my map comes out looking like this:
[image: empty map]
As you can see, the colorbar is accurate, but the map itself is blank, except for the lakes.
Here's the .csv file I'm referring to:
[image: data table]
I'm using jupyter notebook.
I've tried switching from a .xls to .csv, but that didn't work.
Thanks in advance.
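One thing worth ruling out, sketched here under assumptions rather than as a confirmed diagnosis: Plotly's 'USA-states' location mode matches two-letter state abbreviations such as NY, so a State column holding full names would leave the regions unfilled. The lookup table and the to_state_code helper below are hypothetical and deliberately partial; they only illustrate the normalization step.

```python
# Hypothetical, partial lookup table: extend it to cover every state in the file.
STATE_CODES = {
    "California": "CA",
    "New York": "NY",
    "Texas": "TX",
    "Washington": "WA",
}


def to_state_code(name):
    """Return the two-letter code for a state name, or None if it is unknown."""
    cleaned = name.strip()
    # Values that already look like a two-letter code are passed through.
    if len(cleaned) == 2 and cleaned.isalpha():
        return cleaned.upper()
    return STATE_CODES.get(cleaned.title())


names = ["Washington", " new york ", "CA", "Atlantis"]
codes = [to_state_code(n) for n in names]
print(codes)
```

With pandas, the same helper could be applied as df['State'].map(to_state_code) before the column is passed to locations. Any None in the result points at a name the table does not cover yet.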

Related

Visualize a 408x408 numpy array as a heatmap

Hello, I want to visualize the Sandbox game map. I have collected the data from the API and now I want to create a heatmap-style visualization where the color changes depending on how many times each land has been sold. I'm looking for a Python tool or GUI that will let me visualize a 408x408 numpy array. I've tried the seaborn heatmap, but it doesn't look clean (see image); even if I set figsize to (200, 200) it's not big enough for my needs. I want a visualization that can fill the whole screen, where each land is big enough that I can write something on it (potentially the price). An even better option would be a big map with sliders.
Perhaps it's possible to do what I want with seaborn's heatmap, but I'm not very familiar with it.
Here's the code I used for visualization:
import matplotlib.pyplot as plt
import seaborn as sns
import numpy as np
arr = np.random.rand(408, 408)
x_labels = list(range(-204, 204))
y_labels = list(reversed(range(-204, 204)))
fig, ax = plt.subplots(figsize=(100, 100))
sns.heatmap(arr, square=True, xticklabels=x_labels, yticklabels=y_labels, ax=ax)
ax.tick_params(axis="both", labelsize=40)
Visualizing an array this large with seaborn or Matplotlib alone is difficult.
Instead, we can use Plotly together with the Dash library and add a slider, so that only a portion of the data is shown at a time.
I have used these two libraries:
import plotly.express as px
from dash import Dash, dcc, html, Input, Output
import numpy as np
import pandas as pd
# Create sample data
arr = np.random.rand(408, 408)
x_labels = list(range(-204, 204))
y_labels = list(reversed(range(-204, 204)))
# Convert to a DataFrame (pass the label list itself, not a nested list)
df_data = pd.DataFrame(arr, index=y_labels, columns=x_labels)
app = Dash(__name__)
# How many rows and columns to show at a time
show_item_limit = 20
app.layout = html.Div([
    html.H4('Range'),
    dcc.Graph(id="graph"),
    html.P("Select range"),
    dcc.Slider(
        min=0,
        max=408 - show_item_limit,
        step=show_item_limit,
        value=0,
        id='my-slider'
    ),
])
@app.callback(
    Output("graph", "figure"),
    Input("my-slider", "value"))
def filter_heatmap(selected_value):
    # selected_value is passed in from the slider
    df = df_data  # replace with your own data source
    # Slice the same window of rows and columns
    filtered_df = df.iloc[selected_value:selected_value + show_item_limit,
                          selected_value:selected_value + show_item_limit]
    # Build the heatmap with Plotly
    fig = px.imshow(filtered_df,
                    text_auto=True,
                    labels=dict(x="X-range", y="y-range"),
                    x=filtered_df.columns,
                    y=filtered_df.index
                    )
    return fig
# Newer Dash releases use app.run(debug=True) instead
app.run_server(debug=True)
See the output image: Output from code
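The windowing idea behind the slider can also be sketched without Dash. This is a minimal sketch and not part of the code above: window_bounds is a hypothetical helper that clamps the selected start so the slice never runs past the edge of the grid, and the 10x10 list stands in for the 408x408 array.

```python
def window_bounds(start, size, total):
    """Return (begin, end) for a window of `size` cells inside `total` cells."""
    if size <= 0 or total <= 0:
        raise ValueError("size and total must be positive")
    size = min(size, total)
    # Clamp the start so the window always stays inside the grid.
    begin = max(0, min(start, total - size))
    return begin, begin + size


# Small stand-in grid (the real map would be 408 x 408).
grid = [[row * 10 + col for col in range(10)] for row in range(10)]
begin, end = window_bounds(start=8, size=4, total=10)
# Slice the same window of rows and columns, as the callback does.
window = [row[begin:end] for row in grid[begin:end]]
print(begin, end)
print(window[0])
```

Clamping like this would let the slider use any step size while still showing a full window at the right and bottom edges of the map.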

Bokeh unable to display Berlin in Germany map

I'm using Bokeh and GeoPandas to plot an interactive map of Germany. Germany has 16 states in total, but the plot shows only 15. It does not display Berlin, the capital city (Berlin is also a state). I'm using a shapefile as the input for the map. I have tried different shapefiles and looked for different solutions, but I'm unable to find the root of the problem. Please have a look at the code and the output.
`
import pandas as pd
# Import geopandas package
import geopandas as gpd
# Read in shapefile and examine data
germany = gpd.read_file('Igismap/Germany_Polygon.shp')
pop_states = germany
vargeojson = pop_states.to_json()
import json
from bokeh.io import show, output_notebook
from bokeh.models import (ColumnDataSource,
                          GeoJSONDataSource, HoverTool,
                          LinearColorMapper)
from bokeh.layouts import column, row, widgetbox
from bokeh.plotting import figure
output_notebook()
# Input GeoJSON source that contains features for plotting
geosource = GeoJSONDataSource(geojson = vargeojson)
tools = "pan, wheel_zoom, box_zoom, reset"
p = figure(title='All states of Germany',
           plot_height=600,
           plot_width=600,
           toolbar_location='right',
           tools=tools)
p.xgrid.grid_line_color = None
p.ygrid.grid_line_color = None
# Add patch renderer to figure.
states = p.patches('xs', 'ys', source=geosource,
                   line_color="grey",
                   line_width=0.25,
                   fill_alpha=1)
# Create hover tool
p.add_tools(HoverTool(renderers=[states],
                      tooltips=[('Lander', '@name')]))
show(p)
`
Click here to see the output of above code.... and
Click here to see the desired output
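A quick way to narrow this down is to check whether Berlin is present in the GeoJSON at all, or is present but not visible in the plot. The sketch below is only a diagnostic under assumptions: the name property is taken from the hover tooltip in the code, feature_summary is a hypothetical helper, and the tiny inline GeoJSON is a stand-in for vargeojson.

```python
import json


def feature_summary(geojson_text, name_key="name"):
    """Return (name, geometry type) pairs for every feature in a GeoJSON string."""
    collection = json.loads(geojson_text)
    summary = []
    for feature in collection.get("features", []):
        props = feature.get("properties") or {}
        geometry = feature.get("geometry") or {}
        summary.append((props.get(name_key), geometry.get("type")))
    return summary


# Tiny stand-in for the string returned by GeoDataFrame.to_json().
sample = json.dumps({
    "type": "FeatureCollection",
    "features": [
        {"type": "Feature", "properties": {"name": "Berlin"},
         "geometry": {"type": "Polygon",
                      "coordinates": [[[13.1, 52.3], [13.7, 52.3],
                                       [13.7, 52.7], [13.1, 52.3]]]}},
        {"type": "Feature", "properties": {"name": "Brandenburg"},
         "geometry": {"type": "MultiPolygon", "coordinates": []}},
    ],
})
summary = feature_summary(sample)
print(summary)
```

If Berlin shows up in the summary, the data is there and the problem is more likely in how the shapes are drawn; if it does not, the shapefile itself is missing the feature.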

Plotly Choropleth Map Not Showing Data

I'm running this in Jupyter Notebook. I'll attach my full code. I'm using a csv file from Kaggle to plot the cumulative coronavirus cases throughout different countries in the world.
Here's the link to the Kaggle dataset download: https://www.kaggle.com/sudalairajkumar/novel-corona-virus-2019-dataset
I'm using the "covid_19_data.csv" file.
import chart_studio.plotly as py
import plotly.graph_objs as go
import pandas as pd
import cufflinks as cf  # needed for cf.go_offline() below (assuming cf is cufflinks)
from plotly.offline import download_plotlyjs, init_notebook_mode, iplot, plot
init_notebook_mode(connected=True)
cf.go_offline()
df = pd.read_csv('covid_19_data.csv')
data = dict(type='choropleth',
            locations=df['Country/Region'],
            z=df['Confirmed'],
            text=df['Province/State'],
            colorbar={'title': 'Cases of COVID-19'})
layout = dict(title='2020 Global Coronavirus Cases',
              geo=dict(showframe=False, projection={'type': 'natural earth'}))
choromap = go.Figure(data=[data], layout=layout)
iplot(choromap)
The output is a gray map of the world. There is a legend with color, and a title as well. I'm confused why the data is not being plotted!
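One likely thing to check, stated as an assumption rather than a confirmed diagnosis: choropleth traces default to locationmode='ISO-3', so full country names in locations are not matched unless locationmode='country names' is set. The sketch below also collapses the rows to one total per country, since the file has one row per province. country_totals and make_trace are hypothetical helpers, and the sample rows are made up purely for illustration; they assume the data has already been filtered to a single observation date.

```python
from collections import defaultdict


def country_totals(rows):
    """Sum confirmed cases per country from (country, confirmed) pairs."""
    totals = defaultdict(float)
    for country, confirmed in rows:
        totals[country] += confirmed
    return dict(totals)


def make_trace(totals):
    """Build a choropleth trace dict keyed by country name."""
    countries = sorted(totals)
    return dict(
        type="choropleth",
        locations=countries,
        locationmode="country names",  # the default is 'ISO-3'
        z=[totals[c] for c in countries],
        text=countries,
        colorbar={"title": "Cases of COVID-19"},
    )


# Made-up rows for illustration only, not real figures.
rows = [("Canada", 3.0), ("Canada", 4.0), ("Italy", 10.0)]
trace = make_trace(country_totals(rows))
print(trace["locations"], trace["z"])
```

The resulting dict could be passed as go.Figure(data=[trace], layout=layout) in place of data.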

Bokeh plots an empty map from a shapefile converted to geojson and don't know what's wrong

I have some sample code to plot a map of Ontario using Bokeh. The code reads in the shapefile and converts it to GeoJSON, as suggested by examples available on the internet.
The shapefile source data is the Ontario census subdivision geographic boundary from the StatsCan website downloaded as a shapefile.
Image screenshot: https://imgur.com/xn1Zzdh
The result so far is an empty chart and I can't figure out what's wrong.
The shapefile is loaded first as a geopandas dataframe and converted to geojson.
Apologies for my lack of stackoverflow etiquette. I'm a new user.
%matplotlib inline
import matplotlib.pyplot as plt
import pandas as pd
import geopandas
import os
from bokeh.plotting import figure, output_file, show, save,output_notebook
from bokeh.models import GeoJSONDataSource, LinearColorMapper, ColorBar
from bokeh.palettes import brewer
pd.options.display.max_rows = 10
workspace = r'C:\Users\user\Documents\lcsd000b16a_e'
CSD_LAYER = geopandas.read_file(os.path.join(workspace,r"lcsd000b16a_e.shp"))
ONT_CSD = CSD_LAYER[CSD_LAYER['PRUID']=='35']
ONT_CSD['geometry'].head()
1372 POLYGON ((7202895.13143 1077367.822855, 720382...
1373 POLYGON ((7205717.394285 1098087.974285, 72058...
1374 POLYGON ((7169056.905715 1216085.682855, 71693...
1614 POLYGON ((7162217.717145 948748.982855, 716229...
1809 POLYGON ((7506330.95143 1116872.145715, 750632...
# # Get the CRS of our grid
CRS = ONT_CSD.crs
print('FROM:' + str(CRS))
ONT_CSD = ONT_CSD.to_crs(epsg=3857) #transform to webmercator
print('TO: '+ str(ONT_CSD.crs))
FROM:{'init': 'epsg:3347'}
TO: {'init': 'epsg:3857', 'no_defs': True}
import json
#read data to json file
ONT_CSD_json = json.loads(ONT_CSD.to_json())
#convert to string like object
ONT_CSD_JSON_DATA = json.dumps(ONT_CSD_json)
#Input GeoJSON source that contains features for plotting.
geosource = GeoJSONDataSource(geojson = ONT_CSD_JSON_DATA)
#Create figure object.
p = figure(title = 'test', plot_height = 600 , plot_width = 950)
p.xgrid.grid_line_color = None
p.ygrid.grid_line_color = None
#Add patch renderer to figure.
p.patch('xs','ys', source = geosource,
line_color = 'black', line_width = 1, fill_alpha = 0.75)

Plotly: How to set values for major ticks / gridlines for timeseries on x-axis?

Background:
This question is related, but not identical, to Plotly: How to retrieve values for major ticks and gridlines?. A similar question has also been asked but not answered for matplotlib here: How do I show major ticks as the first day of each months and minor ticks as each day?
Plotly is fantastic, and maybe the only thing that bothers me is the autoselection of ticks / gridlines and the labels chosen for the x-axis like in this plot:
Plot 1:
I think the natural thing to display here is the first of each month (depending on the period, of course), or maybe even just an abbreviated month name like 'Jan' on each tick. I realize there are both technical and visual challenges because months are not all the same length. But does anyone know how to do this?
Reproducible snippet:
import plotly
import cufflinks as cf
from plotly.offline import download_plotlyjs, init_notebook_mode, plot, iplot
import pandas as pd
import numpy as np
from IPython.display import HTML
from IPython.core.display import display, HTML
import copy
# setup
init_notebook_mode(connected=True)
np.random.seed(123)
cf.set_config_file(theme='pearl')
# Random data using cufflinks
df = cf.datagen.lines()
#df = df['UUN.XY']
fig = df.iplot(asFigure=True, kind='scatter',
xTitle='Dates',yTitle='Returns',title='Returns')
iplot(fig)
(updated answer for newer versions of plotly)
With newer versions of plotly, you can specify dtick = 'M1' to set gridlines at the beginning of each month. You can also format the display of the month through tickformat:
Snippet 1
fig.update_xaxes(dtick="M1",
                 tickformat="%b\n%Y"
                 )
Plot 1
And if you'd like to set the gridlines at every second month, just change "M1" to "M2"
Plot 2
Complete code:
# imports
import pandas as pd
import plotly.express as px
# data
df = px.data.stocks()
df = df.tail(40)
colors = px.colors.qualitative.T10
# plotly
fig = px.line(df,x = 'date',
y = [c for c in df.columns if c != 'date'],
template = 'plotly_dark',
color_discrete_sequence = colors,
title = 'Stocks',
)
fig.update_xaxes(dtick="M2",
tickformat="%b\n%Y"
)
fig.show()
Old Solution:
How to set the gridlines will depend entirely on what you'd like to display, and how the figure is built before you try to edit the settings. But to obtain the result specified in the question, you can do it like this.
Step 1:
Edit fig['data'][series]['x'] for each series in fig['data'].
Step 2:
Set tickmode, tickvals and ticktext in:
go.Layout(xaxis=go.layout.XAxis(tickmode='array',
                                tickvals=[some_values],
                                ticktext=[other_values])
          )
Result:
Complete code for a Jupyter Notebook:
# imports
import plotly
import cufflinks as cf
from plotly.offline import download_plotlyjs, init_notebook_mode, plot, iplot
import pandas as pd
import numpy as np
from IPython.display import HTML
from IPython.core.display import display, HTML
import copy
import plotly.graph_objs as go
# setup
init_notebook_mode(connected=True)
np.random.seed(123)
cf.set_config_file(theme='pearl')
#%qtconsole --style vim
# Random data using cufflinks
df = cf.datagen.lines()
# create figure setup
fig = df.iplot(asFigure=True, kind='scatter',
xTitle='Dates',yTitle='Returns',title='Returns')
# create df1 to mess around with while
# keeping the source intact in df
df1 = df.copy(deep = True)
df1['idx'] = range(0, len(df))
# time variable operations and formatting
df1['yr'] = df1.index.year
df1['mth'] = df1.index.month_name()
# function to replace a full month name with
# its lowercase abbreviation (the year is added
# separately further down)
def mthFormat(month):
    dDict = {'January':'jan','February':'feb', 'March':'mar',
             'April':'apr', 'May':'may','June':'jun', 'July':'jul',
             'August':'aug','September':'sep', 'October':'oct',
             'November':'nov', 'December':'dec'}
    mth = dDict[month]
    return(mth)
# replace month name with abbreviated month name
df1['mth'] = [mthFormat(m) for m in df1['mth']]
# remove adjacent duplicates for year and month
df1['yr'][df1['yr'].shift() == df1['yr']] = ''
df1['mth'][df1['mth'].shift() == df1['mth']] = ''
# select and format values to be displayed
df1['idx'][df1['mth']!='']
df1['display'] = df1['idx'][df1['mth']!='']
display = df1['display'].dropna()
displayVal = display.values.astype('int')
df_display = df1.iloc[displayVal]
df_display['display'] = df_display['display'].astype('int')
df_display['yrmth'] = df_display['mth'] + '<br>' + df_display['yr'].astype(str)
# set properties for each trace
for ser in range(0,len(fig['data'])):
    fig['data'][ser]['x'] = df1['idx'].values.tolist()
    fig['data'][ser]['text'] = df1['mth'].values.tolist()
    fig['data'][ser]['hoverinfo']='all'
# layout for entire figure
f2Data = fig['data']
f2Layout = go.Layout(
    xaxis = go.layout.XAxis(
        tickmode = 'array',
        tickvals = df_display['display'].values.tolist(),
        ticktext = df_display['yrmth'].values.tolist(),
        zeroline = False)
)
# plot figure with specified major ticks and gridlines
fig2 = go.Figure(data=f2Data, layout=f2Layout)
iplot(fig2)
Some important details:
1. Flexibility and limitations with iplot():
This approach with iplot() and editing all those settings is a bit clunky, but it's very flexible with regards to the number of columns / variables in the dataset, and arguably preferable to building each trace manually like trace1 = go.Scatter() for each and every column in the df.
2. Why do you have to edit each series / trace?
If you try to skip the middle part with
for ser in range(0,len(fig['data'])):
    fig['data'][ser]['x'] = df1['idx'].values.tolist()
    fig['data'][ser]['text'] = df1['mth'].values.tolist()
    fig['data'][ser]['hoverinfo']='all'
and try to set tickvals and ticktext directly on the entire plot, it will have no effect:
I think that's a bit weird, but I think it's caused by some underlying settings initiated by iplot().
3. One thing is still missing:
For this setup to work, tickvals and ticktext have the structure [0, 31, 59, 90] and ['jan<br>2015', 'feb<br>', 'mar<br>', 'apr<br>'], respectively. As a result, the x-axis hover text shows the position of the data point wherever tickvals and ticktext are empty:
Any suggestions on how to improve the whole thing are highly appreciated. Better solutions than my own will instantly receive Accepted Answer status!
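The tickvals / ticktext structure described in point 3 can be derived directly from a list of dates. This is a minimal standalone sketch, not the code used above: month_ticks is a hypothetical helper, and it assumes one data point per day in chronological order.

```python
from datetime import date, timedelta

MONTHS = ("jan", "feb", "mar", "apr", "may", "jun",
          "jul", "aug", "sep", "oct", "nov", "dec")


def month_ticks(dates):
    """Return (tickvals, ticktext) marking the first data point of each month."""
    tickvals, ticktext = [], []
    prev_year = prev_month = None
    for idx, day in enumerate(dates):
        if (day.year, day.month) != (prev_year, prev_month):
            # Show the year only where it changes; other labels keep it empty.
            year = str(day.year) if day.year != prev_year else ""
            tickvals.append(idx)
            ticktext.append(MONTHS[day.month - 1] + "<br>" + year)
            prev_year, prev_month = day.year, day.month
    return tickvals, ticktext


# 120 consecutive days starting on 1 January 2015.
days = [date(2015, 1, 1) + timedelta(days=i) for i in range(120)]
tickvals, ticktext = month_ticks(days)
print(tickvals)
print(ticktext)
```

The two lists could then be passed as tickvals and ticktext with tickmode='array', provided the x values of each trace are the integer positions, as in the loop over fig['data'] above.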
