Mapping time-series data with Plotly Dash, but without using MapBox - python

I have a huge dataset with geo-location and timestamps. My goal is to map this data on an interactive dashboard where the user can choose the timeframe to map. I started doing this in Plotly Dash, and most of the tutorials I saw use Mapbox, but given its limits in the free package, it is out of the question.
Is there some other (100% free) way to still interactively map this timeseries data using Dash?
I tried converting the dataset into a GeoJSON file and adding a GeoJSON layer, but the data still didn't show on the map. Combining different callback options from some online tutorials also produced only an empty map, with no data shown (those attempts are left as comments in the code snippet below).
Apart from this, I chose date range picker, but given the timestamp data, I would prefer to have the date picker available in detail (DD/MM/YYYY HH:MM:SS) on the dashboard (based on the data timestamps), though I am not sure if that's even possible.
Does someone have a suggestion on how to solve this?
Below I share a small part of the data:
timestamp id lat lon mode geometry
0 2022-04-01 13:48:38 15 52.5170365 13.3888599 AUTO POINT (52.5170365 13.3888599)
1 2022-04-01 13:48:40 15 52.5170365 13.3888599 AUTO POINT (52.5170365 13.3888599)
2 2022-04-01 13:48:42 15 52.5170365 13.3888599 AUTO POINT (52.5170365 13.3888599)
3 2022-04-01 13:49:18 15 52.5170375 13.3888605 AUTO POINT (52.5170375 13.3888605)
4 2022-04-01 13:49:34 15 52.5170375 13.3888605 AUTO POINT (52.5170375 13.3888605)
5 2022-04-01 13:49:52 15 52.5170375 13.3888605 AUTO POINT (52.5170375 13.3888605)
6 2022-04-01 13:50:10 15 52.5170385 13.3888609 AUTO POINT (52.5170385 13.3888609)
7 2022-04-01 13:50:46 15 52.5170385 13.3888609 AUTO POINT (52.5170385 13.3888609)
8 2022-04-01 13:51:24 15 52.5170395 13.3888614 AUTO POINT (52.5170395 13.3888614)
9 2022-04-01 13:51:46 15 52.5170395 13.3888614 AUTO POINT (52.5170395 13.3888614)
..and also the code I wrote for the map:
import dash
import dash_daq as daq
from dash import dcc, html, Input, Output
from pandas_datareader import data as web
from datetime import datetime as dt
from datetime import date
import dash_leaflet as dl
import dash_leaflet.express as dlx
from jupyter_dash import JupyterDash
import plotly.express as px
import plotly.graph_objects as go
import json
min_date='01/04/2022'
max_date=dt.today()
#gdf = #the data showed above
geojson = json.loads(gdf.to_json())
# Build the dashboard
app = JupyterDash(__name__)
app.layout = html.Div([
html.H2('Dashboard', style={'text-align':'center', 'font-family': 'sans-serif'}),
dcc.DatePickerRange(
id='date-picker-range',
min_date_allowed=min_date,
max_date_allowed=max_date,
initial_visible_month=dt(dt.today().year, 4, 1).date(),
start_date=dt(2022, 4, 1).date(),
end_date=dt.today(),
show_outside_days=True,
day_size=32,
display_format='DD/MM/YYYY',
clearable=True,
style={'text-align':'center', 'width': '100%', 'height': '60px', 'font-size': '10'}),
#dcc.Graph(id='my-graph', figure=fig),
dl.Map([
dl.TileLayer(maxZoom=20)
#dl.GeoJSON(data=geojson)
],
id='map',
center=[52.517, 13.388],
zoom=14.5,
style={'width': '400px', 'height': '400px', 'margin': "auto", "display": "block"})
])
#@app.callback(
#    Output("map", "figure"),
#    [Input('date-picker-range', 'start_date'),
#     Input('date-picker-range', 'end_date')])
if __name__ == '__main__':
    app.run_server(mode="external", debug=True)
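One possible cause of the empty map, offered as a guess rather than a confirmed diagnosis: the geometry column in the sample reads POINT (52.5170365 13.3888599), which puts the latitude in the x position, whereas GeoJSON positions are ordered [longitude, latitude]. Points built that way would land far from the map center used above. The sketch below filters rows by a full timestamp range (date and time) and builds a FeatureCollection with the coordinates in GeoJSON order. The function name to_feature_collection and the tiny stand-in frame are hypothetical; in a dashboard, a callback could presumably return this dictionary to the data property of a dl.GeoJSON layer, but that wiring is an assumption and is not shown here. As far as I know, dcc.DatePickerRange only selects whole days, so the time of day would have to come from a separate input.

```python
import pandas as pd

# Hypothetical stand-in for the asker's data (same columns as the sample above).
df = pd.DataFrame({
    "timestamp": pd.to_datetime([
        "2022-04-01 13:48:38", "2022-04-01 13:49:18",
        "2022-04-01 13:50:10", "2022-04-01 13:51:24",
    ]),
    "id": [15, 15, 15, 15],
    "lat": [52.5170365, 52.5170375, 52.5170385, 52.5170395],
    "lon": [13.3888599, 13.3888605, 13.3888609, 13.3888614],
})


def to_feature_collection(frame, start, end):
    """Return a GeoJSON FeatureCollection for rows with start <= timestamp <= end."""
    mask = (frame["timestamp"] >= pd.Timestamp(start)) & (frame["timestamp"] <= pd.Timestamp(end))
    features = [
        {
            "type": "Feature",
            # GeoJSON positions are [longitude, latitude], in that order.
            "geometry": {"type": "Point", "coordinates": [float(row.lon), float(row.lat)]},
            "properties": {"id": int(row.id), "timestamp": row.timestamp.isoformat()},
        }
        for row in frame[mask].itertuples(index=False)
    ]
    return {"type": "FeatureCollection", "features": features}


# Select a window down to the second, not just whole days.
geojson = to_feature_collection(df, "2022-04-01 13:49:00", "2022-04-01 13:50:30")
print(len(geojson["features"]))
```

Because the filter compares full timestamps, the same function works whether the bounds come from a date picker alone or from a date picker combined with a time input.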

Related

Dash RangeSlider automatically rounds marks

I am using the RangeSlider in Python Dash. This slider is supposed to allow users to select a range of dates to display, somewhere between the minimum and maximum years in the dataset. The issue that I am having is that each mark shows as 2k due to it being automatically rounded. The years range between 1784 and 2020, with a step of 10 each time. How do I get the marks to show as the actual dates and not just 2k? This is what I have below.
dcc.RangeSlider(sun['Year'].min(), sun['Year'].max(), 10,
value=[sun['Year'].min(), sun['Year'].max()], id='years')
You can use the marks attribute to set the label shown at each tick of the slider, as follows:
marks={i: '{}'.format(i) for i in range(1784,2021,10)}
The full code:
from dash import Dash, dcc, html
app = Dash(__name__)
app.layout = html.Div([
dcc.RangeSlider(1784, 2020,
id='non-linear-range-slider',
marks={i: '{}'.format(i) for i in range(1784,2021,10)},
value=list(range(1784,2021,10)),
dots=False,
step=10,
updatemode='drag'
),
html.Div(id='output-container-range-slider-non-linear', style={'margin-top': 20})
])
if __name__ == '__main__':
    app.run_server(debug=True, use_reloader=False)
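The marks dictionary can be built and inspected on its own, without running a Dash app. A minimal sketch, where year_marks is a hypothetical helper name and the bounds are the ones from the question:

```python
def year_marks(start, stop, step):
    """Build a {value: label} dict whose labels are the full year as a string."""
    return {year: str(year) for year in range(start, stop + 1, step)}


marks = year_marks(1784, 2020, 10)

# Show the first few entries and the last mark that was generated.
print(list(marks.items())[:3])
print(max(marks))
```

Note that with a step of 10 starting at 1784, the last mark is 2014 rather than 2020, because 2020 does not fall on a step.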

How to Increase Dot Size and Prevent Text Overlap in Plotly Scattergeo

I am using px.scattergeo to generate a scatter map of percentages centered over selected countries. There are two problems I encountered:
1. The dots are too small
2. The text between Korea and China overlaps
I was wondering if there were any solutions to these problems. Theoretically I could solve problem 2 by reducing the text size, but I am presenting this map, so I need the text to be larger.
My code is demonstrated below:
data='''
Country 2018 2019 2020 2021
China $50.00 $251.00 $2,123.00 $210.00
USA $541.00 $52.00 $32.00 $23.00
Korea $689.00 $444.00 $441.00 $456.00
'''
from io import StringIO
import pandas as pd
import plotly.express as px
# read the whitespace-separated table defined in `data` above
country=pd.read_csv(StringIO(data),sep=r'\s+')
df=country.melt(id_vars='Country')
# strip '$' and ',' so the amounts can be summed as numbers
df['value']=df['value'].str.replace(r'[$,]','',regex=True).astype(float)
agg_df=df[['variable','value']].groupby('variable').agg('sum').reset_index()
df=pd.merge(df,agg_df,on='variable',how='left')
df['Percentage']=(df.value_x/df.value_y*100).round(1)
fig=px.scatter_geo(df,locations='Country',locationmode='country names',size='Percentage',text='Percentage',color='value_x',
                   animation_frame='variable',width=1000,height=1000)
fig.update_traces(
    texttemplate='<b>%{text}', # use '%{text}' to show only percentage
    textposition='middle left'
)
# set the font size before rendering the figure
fig.update_layout(font_size=24)
fig.show()

Create a plot showing the duration of time a player was in the team

This is an example data frame; I will be working with much larger data frames.
I need to create a plot showing how long each player stayed at the club. This first plot is not broken down by team; a second plot will show the correlation between team and length of stay. I keep getting several errors, and I am unsure how to use matplotlib, assuming that is what I should be using to begin with.
Thank you in advance. Sorry if this has been answered before.
Name      Age  Team  Joined_on   Lost_on
Benjamin  18   A FC  2019-01-13  NaN
Natty     17   A FC  2016-05-06  2022-01-12
Smith     19   C FC  2016-01-13  NaN
Will      15   A FC  2020-03-09  NaN
Harry     20   B FC  2020-09-09  2021-01-01
As furas mentioned, this should be pretty direct once you convert the missing dates to today's dates and then apply a difference to extract the duration column.
Here's the complete code for the dataframe:
import pandas as pd
import numpy as np
from datetime import datetime
import matplotlib.pyplot as plt
#creating dataframe
data = {'Name':['Benjamin','Natty','Smith','Will','Harry'],
'Age':[18,17,19,15,20],
'Team':['A FC','A FC','C FC','A FC','B FC'],
'Joined_on': ['2019-01-13','2016-05-06','2016-01-13','2020-03-09','2020-09-09'],
'Lost_on': [np.nan,'2022-01-12',np.nan,np.nan,'2021-01-01']}
df = pd.DataFrame(data)
#convert to datetime formats (NaN becomes NaT)
df['Joined_on'] = pd.to_datetime(df['Joined_on'])
df['Lost_on'] = pd.to_datetime(df['Lost_on'])
#fill missing end dates with today's date
df['Lost_on'] = df['Lost_on'].fillna(pd.Timestamp.today().normalize())
#duration column
df['Duration'] = df['Lost_on'] - df['Joined_on'] #time delta
df
Outputs:
Because the Duration column in this table is in timedelta format, you can convert it to whole days and plot it. I prefer plotly, but if matplotlib is required in this situation, you can do it as follows:
#name and duration
plt.bar(df['Name'], df['Duration'].dt.days, edgecolor='white', linewidth=0.7)
plt.show()
#teams and duration
plt.bar(df.Team, df.Duration.dt.days)
plt.show()
I'm not sure what kind of correlation you're looking for, but you can use a similar framework and consult the matplotlib documentation for the chart type that fits.
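For the team-level view, one option is to aggregate the durations per team before plotting, so that each team gets a single bar. This is only a sketch of one reading of "correlation between team and duration": the fixed reference_date is an assumption that stands in for today's date so the numbers are reproducible, and the column name Duration_days is made up for this example.

```python
import pandas as pd

df = pd.DataFrame({
    "Name": ["Benjamin", "Natty", "Smith", "Will", "Harry"],
    "Team": ["A FC", "A FC", "C FC", "A FC", "B FC"],
    "Joined_on": ["2019-01-13", "2016-05-06", "2016-01-13", "2020-03-09", "2020-09-09"],
    "Lost_on": [None, "2022-01-12", None, None, "2021-01-01"],
})

# Assumption: a fixed reference date replaces "today" so the result is reproducible.
reference_date = pd.Timestamp("2022-06-01")

joined = pd.to_datetime(df["Joined_on"])
# Players who are still at the club have no end date; count them up to the reference date.
lost = pd.to_datetime(df["Lost_on"]).fillna(reference_date)
df["Duration_days"] = (lost - joined).dt.days

# Mean length of stay per team: one value per team, ready for a bar chart.
mean_days_by_team = df.groupby("Team")["Duration_days"].mean()
print(mean_days_by_team)
```

With matplotlib installed, mean_days_by_team.plot.bar() should draw one bar per team from this result.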

Plotly: How to animate a bar chart with multiple groups using plotly express?

I have a dataframe that looks like this:
I want to have one bar for old freq and one for new freq. Currently I have graph that looks like this:
This is what the code looks like:
freq_df['date'] = pd.to_datetime(freq_df['date'])
freq_df['hour'] = freq_df['hour'].astype(str)
fig = px.bar(freq_df, x="hour", y="old freq",hover_name = "date",
animation_frame= freq_df.date.dt.day)
fig.update_layout(transition = {'duration': 2000})
How do I add another bar?
Explanation about DF:
It has frequencies relevant to each hour in a specific date.
Edit:
One approach could be to create a category column and add old and new freq and assign values in another freq column. How do I do that :p ?
Edit:
Here is the DF
,date,hour,old freq,new freq
43,2020-09-04,18,273,224.0
44,2020-09-04,19,183,183.0
45,2020-09-04,20,99,111.0
46,2020-09-04,21,130,83.0
47,2020-09-04,22,48,49.0
48,2020-09-04,23,16,16.0
49,2020-09-05,0,8,6.0
50,2020-09-05,1,10,10.0
51,2020-09-05,2,4,4.0
52,2020-09-05,3,7,7.0
53,2020-09-05,4,25,21.0
54,2020-09-05,5,114,53.0
55,2020-09-05,6,284,197.0
56,2020-09-05,7,343,316.0
57,2020-09-05,8,418,419.0
58,2020-09-05,9,436,433.0
59,2020-09-05,10,469,396.0
60,2020-09-05,11,486,300.0
61,2020-09-05,12,377,140.0
62,2020-09-05,13,552,103.0
63,2020-09-05,14,362,117.0
64,2020-09-05,15,512,93.0
65,2020-09-05,16,392,41.0
66,2020-09-05,17,268,31.0
67,2020-09-05,18,223,30.0
68,2020-09-05,19,165,24.0
69,2020-09-05,20,195,15.0
70,2020-09-05,21,90,
71,2020-09-05,22,46,1.0
72,2020-09-05,23,17,1.0
The answer in two steps:
1. Perform a slight transformation of your data using pd.wide_to_long:
df_long = pd.wide_to_long(freq_df, stubnames='freq',
                          i=['date', 'hour'], j='type',
                          sep='_', suffix=r'\w+').reset_index()
2. Plot two groups of bar traces using:
fig1 = px.bar(df_long, x='hour', y = 'freq', hover_name = "date", color='type',
animation_frame= 'date', barmode='group')
This is the result:
The details:
If I understand your question correctly, you'd like to animate a bar chart where you've got one bar for each hour for your two frequencies freq_old and freq_new like this:
If that's the case, then your sample data is no good, since your animation criterion is hour per date and you've only got six observations (hours) for 2020-09-04 and then 24 observations for 2020-09-05. But don't worry: since your question triggered my interest, I made some sample data that will in fact work the way you seem to want it to.
The only real challenge is that px.bar will not accept y= [freq_old, freq_new], or something to that effect, to build your two bar series of different categories for you. But you can make px.bar build two groups of bars by providing a color argument.
However, you'll need a column to identify your different freqs like this:
0 new
1 old
2 new
3 old
4 new
5 old
6 new
7 old
8 new
9 old
In other words, you'll have to transform your dataframe, which originally has a wide format, to a long format like this:
date hour type day freq
0 2020-01-01 0 new 1 7.100490
1 2020-01-01 0 old 1 2.219932
2 2020-01-01 1 new 1 7.015528
3 2020-01-01 1 old 1 8.707323
4 2020-01-01 2 new 1 7.673314
5 2020-01-01 2 old 1 2.067192
6 2020-01-01 3 new 1 9.743495
7 2020-01-01 3 old 1 9.186109
8 2020-01-01 4 new 1 3.737145
9 2020-01-01 4 old 1 4.884112
And that's what this snippet does:
df_long = pd.wide_to_long(freq_df, stubnames='freq',
                          i=['date', 'hour'], j='type',
                          sep='_', suffix=r'\w+').reset_index()
stubnames uses a prefix to identify the columns you'd like to stack into a long format. And that's why I've renamed new_freq and old_freq to freq_new and freq_old, respectively. j='type' simply takes the last parts of your category names using sep='_' and produces the column that we need to tell the freqs from each other:
type
old
new
old
...
suffix='\w+' tells pd.wide_to_long that we're using non-integers as suffixes.
And that's it!
Complete code:
# imports
import plotly.express as px
import plotly.graph_objects as go
import pandas as pd
import numpy as np
import random
# sample data
observations = 24*5
np.random.seed(5)
freq_old = np.random.uniform(low=-1, high=1, size=observations).tolist()
freq_new = np.random.uniform(low=-1, high=1, size=observations).tolist()
# hourly timestamps starting 2020-01-01, split into date strings and hours
timestamps = pd.date_range('2020', periods=observations, freq=pd.Timedelta(hours=1))
date = timestamps.strftime('%Y-%m-%d').tolist()
hour = timestamps.hour.tolist()
# sample dataframe of a wide format such as yours
freq_df=pd.DataFrame({'date': date,
'hour':hour,
'freq_new':freq_new,
'freq_old':freq_old})
freq_df['day']=pd.to_datetime(freq_df['date']).dt.day
# attempt to make my random data look a bit
# like your real world data.
# but don't worry too much about that...
freq_df.freq_new = abs(freq_df.freq_new.cumsum())
freq_df.freq_old = abs(freq_df.freq_old.cumsum())
# sample dataframe of a long format that px.bar likes
df_long = pd.wide_to_long(freq_df, stubnames='freq',
                          i=['date', 'hour'], j='type',
                          sep='_', suffix=r'\w+').reset_index()
# plotly express bar chart with multiple bar groups.
fig = px.bar(df_long, x='hour', y = 'freq', hover_name = "date", color='type',
animation_frame= 'date', barmode='group')
# set up a sensible range for the y-axis
fig.update_layout(yaxis=dict(range=[df_long['freq'].min()*0.8,df_long['freq'].max()*1.2]))
fig.show()
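The reshaping step can also be tried on its own with the column names from the question, which are 'old freq' and 'new freq' rather than freq_old and freq_new. The sketch below uses three rows copied from the question's data and shows the rename that pd.wide_to_long needs before it can find the shared stub; plotting is left out so that only pandas is required.

```python
import pandas as pd

# A few rows copied from the question's data, with its original column names.
freq_df = pd.DataFrame({
    "date": ["2020-09-04", "2020-09-04", "2020-09-05"],
    "hour": [18, 19, 0],
    "old freq": [273, 183, 8],
    "new freq": [224.0, 183.0, 6.0],
})

# Move the shared stub "freq" to the front so wide_to_long can match it.
freq_df = freq_df.rename(columns={"old freq": "freq_old", "new freq": "freq_new"})

# Stack freq_old / freq_new into one "freq" column, labelled by a "type" column.
df_long = pd.wide_to_long(
    freq_df, stubnames="freq", i=["date", "hour"], j="type",
    sep="_", suffix=r"\w+",
).reset_index()

print(df_long)
```

Each (date, hour) pair now appears twice, once with type 'old' and once with type 'new', which is the shape the color argument of px.bar expects.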
I was able to create the bars for both the old and new frequencies, but only by using a separate plot for each day (Plotly Express bar charts don't seem to support multiple series). Here is the code for doing so:
# Import packages
import pandas as pd
import numpy as np
import plotly.graph_objs as go
import plotly
import plotly.express as px
from plotly.offline import init_notebook_mode
init_notebook_mode(connected=True)
# Start formatting data
allDates = np.unique(original_df.date)
numDates = allDates.shape[0]
print(numDates)
# draw one grouped bar chart per date
for i in range(numDates):
    df = original_df.loc[original_df.date == allDates[i]]
    oldFreqData = go.Bar(x=df["hour"].to_numpy(), y=df["old_freq"].to_numpy(), name="Old Frequency")
    newFreqData = go.Bar(x=df["hour"].to_numpy(), y=df["new_freq"].to_numpy(), name="New Frequency")
    fig = go.Figure(data=[oldFreqData,newFreqData])
    fig.update_layout(title=str(allDates[i]))
    fig.update_xaxes(title='Hour')
    fig.update_yaxes(title='Frequency')
    fig.show()
where original_df is the dataframe DF from your question, with its frequency columns named old_freq and new_freq.
Here is the output:
However, if you prefer the use of the animation frame from Plotly Express, you can have two separate plots: one for old frequencies and one for new using this code:
# Reformat data
df = original_df
dates = pd.to_datetime(np.unique(df.date)).strftime('%Y-%m-%d')
numDays = dates.shape[0]
print(numDays)
hours = np.arange(0,24)
numHours = hours.shape[0]
allDates = []
allHours = []
oldFreqs = []
newFreqs = []
for i in range(numDays):
    for j in range(numHours):
        allDates.append(dates[i])
        allHours.append(j)
        # rows for this date and hour (empty if the hour is missing)
        match = df.loc[(df.date == dates[i]) & (df.hour == j)]
        if match.shape[0] != 0: # If data not missing
            oldFreqs.append(match.old_freq.to_numpy()[0])
            newFreqs.append(match.new_freq.to_numpy()[0])
        else:
            oldFreqs.append(0)
            newFreqs.append(0)
d = {'Date': allDates, 'Hour': allHours, 'Old_Freq': oldFreqs, 'New_Freq': newFreqs}
df2 = pd.DataFrame(data=d)
# Create px plot with animation
fig = px.bar(df2, x="Hour", y="Old_Freq", hover_data=["Old_Freq","New_Freq"], animation_frame="Date")
fig.show()
fig2 = px.bar(df2, x="Hour", y="New_Freq", hover_data=["Old_Freq","New_Freq"], animation_frame="Date")
fig2.show()
and here is the plot from that code:

Callback function not upgrading graphs

I have a dataset and I want to plot some graphs. I created a Plotly graph that loads correctly, with a callback function that lets me select the year of the data to show. The dataset loads correctly and there are no missing values or errors, but whenever I try to change the year, nothing happens.
my dataset has columns like this
codass Q NF CURS
240011 1 7 2010
240011 2 5 2010
240012 1 2 2011
I tried starting with a blank graph, and nothing happens. The data is fine, because the initial graph loads correctly; the problem is in the updating procedure.
import pandas as pd
import numpy as np
import dash
import dash_core_components as dcc
from dash.dependencies import Input, Output
import dash_html_components as html
import plotly.graph_objs as go
FaseIni = pd.read_csv('/Users/Jordi/Documents/Universitat/TFG/tfg/Spyder path/qfaseini18.csv',sep=';',encoding='utf-8')
Q=[1,2]
anys=FaseIni['CURS'].unique()
pv = FaseIni.pivot_table( index=['CODASS'], columns=['Q'], values=['NF'], fill_value=0)
trace1=go.Bar(x=pv.index, y =pv[('NF',1)],name='q1')
trace2=go.Bar(x=pv.index, y =pv[('NF',2)],name='q2')
app = dash.Dash()
app.layout = html.Div([
html.Div([
dcc.Dropdown(
id='Anys',
options= [{'label':'2010' , 'value':2010 },{'label':'2011' , 'value':2011 }],
value =2010,
)
]),
html.Div([
dcc.Graph(
id='notes',
figure={
'data':[trace1,trace2]
}
)
])
])
@app.callback(
    Output('notes','figure'),
    [Input('Anys','value')])
def update_graph(Anys):
    pv2 = FaseIni.loc[FaseIni['CURS'] == Anys]
    pv2 = pv2.pivot_table( index=['CODASS'], columns=['Q'], values=['NF'],
                           fill_value=0)
    trace3=go.Bar(x=pv2.index, y =pv2[('Anys',1)],name='q1')
    trace4=go.Bar(x=pv2.index, y =pv2[('Anys',2)],name='q2')
    return {'data':[trace3,trace4]}
The initial graph loads correctly, it just doesn't update
I had this problem the other day. I'm not sure if this will fix it, because I don't have the csv file to test it, but the pivot table is built with values=['NF'], so its columns are keyed by 'NF' rather than 'Anys'. Try changing to this:
trace3=go.Bar(x=pv2.index, y =pv2[('NF',1)],name='q1')
trace4=go.Bar(x=pv2.index, y =pv2[('NF',2)],name='q2')
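To see which column keys the pivoted frame actually exposes inside the callback, the pivot can be reproduced outside Dash. The sketch below uses a small stand-in frame built from the three rows shown in the question (the real CSV is not available, so this is an assumption about its shape), and pivot_for_year is a hypothetical helper name that mirrors the two pandas steps of update_graph.

```python
import pandas as pd

# Stand-in for the CSV, using the rows shown in the question.
FaseIni = pd.DataFrame({
    "CODASS": [240011, 240011, 240012],
    "Q": [1, 2, 1],
    "NF": [7, 5, 2],
    "CURS": [2010, 2010, 2011],
})


def pivot_for_year(year):
    """Filter by year and pivot the same way the callback does."""
    selected = FaseIni.loc[FaseIni["CURS"] == year]
    return selected.pivot_table(index=["CODASS"], columns=["Q"], values=["NF"], fill_value=0)


pv2 = pivot_for_year(2010)

# The column keys are (values name, Q), so the first level is 'NF'.
print(list(pv2.columns))
print(("Anys", 1) in pv2.columns)
```

In this stand-in data, the year 2011 only has rows with Q equal to 1, so a ('NF', 2) column would be missing for that year; if the real data has such gaps, the callback needs to handle them as well.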
