Interactive plotly boxplot with ipywidgets - python

I am trying to create an interactive boxplot with ipywidgets and Plotly.
I started by looking at this example
While this is fine, I'd like to change the groupings of the boxplot based on a dropdown input.
With interact I can do this:
import datetime
import numpy as np
import pandas as pd
import plotly.graph_objects as go
from ipywidgets import widgets
df = pd.read_csv(
    'https://raw.githubusercontent.com/yankev/testing/master/datasets/nycflights.csv')
df = df.drop(df.columns[[0]], axis=1)
from ipywidgets import interact
def view_image(col):
    fig = go.FigureWidget()
    for val in df[col].unique():
        groupData = df.query(f'{col} == "{val}"')
        fig.add_trace(
            go.Box(y = groupData['distance'],
                   name = val)
        )
    fig.show()
interact(view_image, col = ['origin', 'carrier'])
And the result is that I can change the column based on which the data is grouped.
However, I would like to have more control on the widgets, like in the official example.
This is what I am trying (and failing):
# Assign an empty figure widget with two traces
gdata = []
for origin in df.origin.unique():
    groupData = df.query(f'origin == "{origin}"')
    gdata.append(
        go.Box(y = groupData['distance'],
               name = origin)
    )
g = go.FigureWidget(data=gdata,
                    layout=go.Layout(
                        title=dict(
                            text='NYC FlightDatabase'
                        ),
                        barmode='overlay'
                    ))
def response_box(change):
    col = column.value
    with g.batch_update():
        gdata = []
        for val in df[col].unique():
            groupData = df.query(f'{col} == "{val}"')
            gdata.append(
                go.Box(y = groupData['distance'],
                       name = val)
            )
        g.data = gdata
column = widgets.Dropdown(
    options=['origin','carrier']
)
column.observe(response_box, 'value')
container2 = widgets.HBox([column])
widgets.VBox([container2,
              g])
Note that since I have new groupings, I cannot just go into g.data[index].y and change per index, but I have to re-generate the figure as in the interact function.
This particular iteration gives me a "you cannot update data directly" error. I tried in a few different ways, but I don't seem to find one that works.
Any idea?

It's not clear how you want to interact with the dimensions of the data, so I've gone with defining the x and color of the figure, plus filtering by origin, dest, or carrier.
Box plots are far simpler to create using Plotly Express, so I have used that.
It then really simplifies to passing parameters. I have used interact (https://ipywidgets.readthedocs.io/en/latest/examples/Using%20Interact.html) as a decorator.
import datetime
import numpy as np
import pandas as pd
import plotly.graph_objects as go
import plotly.express as px
from ipywidgets import widgets
from ipywidgets import interact
df = pd.read_csv(
    "https://raw.githubusercontent.com/yankev/testing/master/datasets/nycflights.csv"
)
df = df.drop(df.columns[[0]], axis=1)
@interact
def view_image(
    col=widgets.Dropdown(
        description="Plot:", value="carrier", options=["origin", "carrier"]
    ),
    filtercol=widgets.Dropdown(
        description="Filter by:", value="carrier", options=["origin", "dest", "carrier"]
    ),
    filter=widgets.Text(
        description="Filter:", value=""
    ),
):
    # apply the filter only if it matches some rows; otherwise plot all data
    if df[filtercol].eq(filter).any():
        dfp = df.loc[df[filtercol].eq(filter)]
    else:
        dfp = df
    fig = px.box(dfp, x=col, y="distance", color=col)
    go.FigureWidget(fig.to_dict()).show()

Related

Mixing Plotly/ipywidgets to modify the x axis of a scatter plot

I want to mix Plotly with a dropdown widget, the idea being to make some scatter plots and modify the x axis through the widget. Let's say that my dataset is the following :
import plotly.graph_objects as go
import pandas as pd
import ipywidgets as widgets
import seaborn as sns
df = sns.load_dataset('diamonds')
And my target is the column carat. What I tried so far is to create the scatters, include them into the widget and display it :
predictors = df.columns.tolist()
predictors.remove("carat")
target = df["carat"]
data = []
for predictor in predictors:
    chart = go.Scatter(x = df[predictor],
                       y = target,
                       mode="markers")
    fig = go.Figure(data=chart)
    data.append((predictor,fig))
widgets.Dropdown(options = [item[0] for item in data],
                 value = [item[0] for item in data][0],
                 description = "Select :",
                 disabled=False)
Yet, I am new to ipywidgets/plotly and don't understand what is not working here, since it displays the widget but not the charts even when I change its value. How can I modify the code so that it finally displays the charts when selecting a predictor ?
You can use interact to read the values from the DropDown and plot your graph:
import plotly.graph_objects as go
import pandas as pd
import seaborn as sns
from ipywidgets import widgets
from ipywidgets import interact
import plotly.express as px
df = sns.load_dataset('diamonds')
predictors = df.columns.tolist()
predictors.remove("carat")
target = df["carat"]
@interact
def read_values(
    predictor=widgets.Dropdown(
        description="Select :", value="clarity", options=predictors
    )
):
    fig = px.scatter(df, x = predictor, y = target)
    go.FigureWidget(fig.to_dict()).show()

Visualize a 408x408 numpy array as a heatmap

Hello I want to visualize the sandbox game map. I have collected the data from API and now I want to create a heatmap kind of visualization, where the color changes depending on how many times the land's been sold. I'm looking for a Python tool / GUI that will let me visualize a 408x408 numpy array. I've tried the seaborn heatmap, but it doesn't look clean (see image), even If I try to set figsize to (200, 200) it's not big enough for my needs. I want to have a visualization on potentially whole screen, where each land is big enough so that I can write something on it (potentially price). Better option would be to have a big map with sliders.
Perhaps it's possible to do what I want using Seaborn's heatmap, but I'm not very familiar with it.
Here's the code I used for visualization:
import matplotlib.pyplot as plt
import seaborn as sns
import numpy as np
arr = np.random.rand(408, 408)
x_labels = list(range(-204, 204))
y_labels = list(reversed(range(-204, 204)))
fig, ax = plt.subplots(figsize=(100, 100))
sns.heatmap(arr, square=True, xticklabels=x_labels, yticklabels=y_labels, ax=ax)
ax.tick_params(axis="both", labelsize=40)
Visualizing such large data with seaborn or Matplotlib will be difficult.
Instead, we can use Plotly together with the Dash library, which lets us add a slider to view one portion of the data at a time.
import plotly.express as px
from dash import Dash, dcc, html, Input, Output
import numpy as np
import pandas as pd
# creating data
arr = np.random.rand(408, 408)
x_labels = list(range(-204, 204))
y_labels = list(reversed(range(-204, 204)))
# Converted to dataframe
df_data = pd.DataFrame(arr, index=y_labels, columns=x_labels)
app = Dash(__name__)
# How many items to show at a time
show_item_limit = 20
app.layout = html.Div([
    html.H4('Range'),
    dcc.Graph(id="graph"),
    html.P("Select range"),
    dcc.Slider(
        min = 0,
        max = 408-show_item_limit,
        step = show_item_limit,
        value = 0,
        id= 'my-slider'
    ),
])
@app.callback(
    Output("graph", "figure"),
    Input("my-slider", "value"))
def filter_heatmap(selected_value):
    # Selected value will be passed from Slider
    df = df_data # replace with your own data source
    # We can filter the data here: same window on rows and columns
    stop = selected_value + show_item_limit
    filtered_df = df.iloc[selected_value:stop, selected_value:stop]
    # Update using plotly
    fig = px.imshow(filtered_df,
                    text_auto=True,
                    labels=dict(x="X-range", y="y-range"),
                    x = filtered_df.columns,
                    y = filtered_df.index
                    )
    return fig
app.run_server(debug=True)
See the output image: Output from code
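The slicing step inside the callback can be tried on its own, without Dash. The sketch below uses a hypothetical window helper with separate row and column offsets (an assumption on my part: two sliders, one per axis, would let you pan the map in both directions) and a deterministic array in place of the random one.

```python
import numpy as np
import pandas as pd


def window(df, row_start, col_start, size):
    """Return a size x size tile of df starting at the given positional offsets."""
    return df.iloc[row_start:row_start + size, col_start:col_start + size]


# Deterministic 408x408 grid with the same labels as in the answer
arr = np.arange(408 * 408).reshape(408, 408)
grid = pd.DataFrame(
    arr,
    index=list(reversed(range(-204, 204))),   # y labels, top row is 203
    columns=list(range(-204, 204)),           # x labels, left column is -204
)

# Tile for row slider = 0 and column slider = 20
tile = window(grid, 0, 20, 20)
```

Each slider value would feed one of the two offsets, and the resulting tile is what gets passed to px.imshow.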

Display only single line on hover, hide all other lines

Is there is a way to hide all other lines on a figure, when I am hovering a single one?
Example:
import numpy as np
import pandas as pd
import plotly.express as px
np.random.seed(42)
data = np.random.randint(0, 10, (5, 10))
fig = px.line(data_frame=pd.DataFrame(data))
fig.show()
will produce:
and I want this, when I hover particular line (but without manually switching off all other lines and keeping X, Y axes dims):
UPD: inside jupyter notebook
The following suggestion draws from the posts Plotly Dash: How to reset the "n_clicks" attribute of a dash-html.button? and Plotly-Dash: How to code interactive callbacks for hover functions in plotly dash and will let you display a single trace by hovering on any point or part of the line. The rest of the traces aren't completely hidden though, but are made grey and transparent in the background so that you can more easily make another selection.
To reset the figure and make all traces fully visible at the same time, just click Clear Selection. I understand that you would prefer a "plain" Jupyter approach to obtain this functionality, but you'd be missing out on the true power of plotly which reveals itself to the full extent only through Dash and JupyterDash. With this suggestion, you won't see any difference between Dash and "plain" Jupyter since the figure, or app, is displayed inline with app.run_server(mode='inline')
Plot 1 - Upon launch. Or after clicking Clear selection
Plot 2 - Selection = trace a
Plot 3 - Selection = trace b
Complete code
import pandas as pd
import plotly.graph_objects as go
import numpy as np
import dash
import dash_core_components as dcc
import dash_html_components as html
import plotly.express as px
from dash.dependencies import Input, Output
from jupyter_dash import JupyterDash
# pandas and plotly settings
pd.options.plotting.backend = "plotly"
# app info
app = JupyterDash(__name__)
# sample data and figure setup
df = pd.DataFrame(np.random.randint(-1,2,size=(200, 12)), columns=list('abcdefghijkl'))
df = df.cumsum()#.reset_index()
fig = df.plot(title = 'Selected traces = all', template='plotly_dark')#.update_traces(line_color = 'rgba(50,50,50,0.5)')
set_fig = go.Figure(fig)
# remember the original color of each trace so it can be restored on hover
colors = {d.name:d.line.color for i, d in enumerate(set_fig.data)}
# jupyterdash app
app.layout = html.Div([html.Button('Clear selection', id='clearit', n_clicks=0),
                       dcc.Graph(id='hoverfig',figure=fig,#clear_on_unhover = True
                                 ),])
# callbacks
@app.callback(
    Output('hoverfig', 'figure'),
    [Input('hoverfig', 'hoverData'), Input('clearit', 'n_clicks')])
def display_hover_data(hoverData, resetHover):
    changed_id = [p['prop_id'] for p in dash.callback_context.triggered][0]
    if 'clearit' in changed_id:
        return set_fig
    else:
        try:
            fig2 = fig.update_layout(title = 'Selected trace = ' + fig.data[hoverData['points'][0]['curveNumber']].name)
            fig2.for_each_trace(lambda t: t.update(line_color = 'rgba(50,50,50,0.5)',line_width = 1) if t.name != fig.data[hoverData['points'][0]['curveNumber']].name else t.update(line_color = colors[t.name], line_width = 2))
            return fig2
        except:
            return fig
app.run_server(mode='inline', port = 8070, dev_tools_ui=True,
               dev_tools_hot_reload =True, threaded=True)
import plotly.graph_objects as go
import numpy as np
import pandas as pd
# callback function for on_hover
def hide_traces_on_hover(trace, points, selector):
    if len(points.point_inds)==1: # identify hover
        i = points.trace_index # get the index of the hovered trace
        f.data[i].visible = True # keep the hovered trace visible
        # create a list of traces you want to hide
        hide_traces = [l_trace for idx, l_trace in enumerate(f.data) if idx != i]
        for l_trace in hide_traces: # iterate over hide_traces
            l_trace.visible = 'legendonly' # hide all remaining traces
# callback function to unhide traces on click
def unhide_traces_on_click(trace, points, selector):
    for l_trace in f.data:
        l_trace.visible = True
# your sample data frame
np.random.seed(42)
data = np.random.randint(0, 10, (5, 10))
df = pd.DataFrame(data)
f = go.FigureWidget() # create figure widget
f.update_yaxes(autorange=False) # set auto range to false
# define the range of the y-axis using min and max functions
f.update_layout(yaxis_range=[df.values.min()-1, df.values.max()+1])
# create your traces from the data frame
for col in df.columns:
    trace = go.Scatter(x=df.index, y=df[col], mode='lines+markers')
    f.add_trace(trace)
# assign your functions to each trace
for trace in f.data:
    trace.on_hover(hide_traces_on_hover)
    trace.on_click(unhide_traces_on_click)
f
If you are running into issues, here is the jupyter notebook support documentation for using FigureWidget and here is the jupyter lab support documentation. Make sure you have the ipywidgets package installed. Also, just as a FYI, here is the FigureWidget documentation.
When you hover over a marker in the graph.
When you click on any marker of the visible trace all the hidden traces will become visible again.

Plotly: How to set values for major ticks / gridlines for timeseries on x-axis?

Background:
This question is related, but not identical, to Plotly: How to retrieve values for major ticks and gridlines?. A similar question has also been asked but not answered for matplotlib here: How do I show major ticks as the first day of each months and minor ticks as each day?
Plotly is fantastic, and maybe the only thing that bothers me is the autoselection of ticks / gridlines and the labels chosen for the x-axis like in this plot:
Plot 1:
I think the natural thing to display here is the first of each month (depending on the period, of course). Or maybe even just an abbreviated month name like 'Jan' on each tick. I realize both the technical and even visual challenges due to the fact that all months are not of equal length. But does anyone know how to do this?
Reproducible snippet:
import plotly
import cufflinks as cf
from plotly.offline import download_plotlyjs, init_notebook_mode, plot, iplot
import pandas as pd
import numpy as np
from IPython.display import HTML
from IPython.core.display import display, HTML
import copy
# setup
init_notebook_mode(connected=True)
np.random.seed(123)
cf.set_config_file(theme='pearl')
# Random data using cufflinks
df = cf.datagen.lines()
#df = df['UUN.XY']
fig = df.iplot(asFigure=True, kind='scatter',
xTitle='Dates',yTitle='Returns',title='Returns')
iplot(fig)
(updated answer for newer versions of plotly)
With newer versions of plotly, you can specify dtick = 'M1' to set gridlines at the beginning of each month. You can also format the display of the month through tickformat:
Snippet 1
fig.update_xaxes(dtick="M1",
                 tickformat="%b\n%Y"
                 )
Plot 1
And if you'd like to set the gridlines at every second month, just change "M1" to "M2"
Plot 2
Complete code:
# imports
import pandas as pd
import plotly.express as px
# data
df = px.data.stocks()
df = df.tail(40)
colors = px.colors.qualitative.T10
# plotly
fig = px.line(df,x = 'date',
y = [c for c in df.columns if c != 'date'],
template = 'plotly_dark',
color_discrete_sequence = colors,
title = 'Stocks',
)
fig.update_xaxes(dtick="M2",
tickformat="%b\n%Y"
)
fig.show()
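If you need explicit tick positions rather than dtick (for example with an older plotly version, or for custom labels), the first day of each month can be computed with pandas and passed through tickvals and ticktext. A minimal sketch, assuming a date range of my own choosing; month_ticks is a hypothetical helper.

```python
import pandas as pd


def month_ticks(start, end):
    """Return (tickvals, ticktext) for the first day of each month in [start, end]."""
    firsts = pd.date_range(start, end, freq="MS")    # "MS" means month start
    return list(firsts), list(firsts.strftime("%b"))  # abbreviated month names


tickvals, ticktext = month_ticks("2015-01-15", "2015-04-10")

# Assumed usage on an existing figure with a date x-axis:
#   fig.update_xaxes(tickmode="array", tickvals=tickvals, ticktext=ticktext)
```

Because the range starts mid-January, the first tick lands on 1 February; months whose first day falls outside the range get no tick.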
Old Solution:
How to set the gridlines will depend entirely on what you'd like to display, and how the figure is built before you try to edit the settings. But to obtain the result specified in the question, you can do it like this.
Step1:
Edit fig['data'][series]['x'] for each series in fig['data'].
Step2:
Set tickvals and ticktext (with tickmode = 'array') in:
go.Layout(xaxis = go.layout.XAxis(tickvals = [some_values],
                                  ticktext = [other_values])
          )
Result:
Complete code for a Jupyter Notebook:
# imports
import plotly
import cufflinks as cf
from plotly.offline import download_plotlyjs, init_notebook_mode, plot, iplot
import pandas as pd
import numpy as np
from IPython.display import HTML
from IPython.core.display import display, HTML
import copy
import plotly.graph_objs as go
# setup
init_notebook_mode(connected=True)
np.random.seed(123)
cf.set_config_file(theme='pearl')
#%qtconsole --style vim
# Random data using cufflinks
df = cf.datagen.lines()
# create figure setup
fig = df.iplot(asFigure=True, kind='scatter',
xTitle='Dates',yTitle='Returns',title='Returns')
# create df1 to mess around with while
# keeping the source intact in df
df1 = df.copy(deep = True)
df1['idx'] = range(0, len(df))
# time variable operations and formatting
df1['yr'] = df1.index.year
df1['mth'] = df1.index.month_name()
# function to replace month name with
# abbreviated month name AND year
# if the month is january
def mthFormat(month):
    dDict = {'January':'jan','February':'feb', 'March':'mar',
             'April':'apr', 'May':'may','June':'jun', 'July':'jul',
             'August':'aug','September':'sep', 'October':'oct',
             'November':'nov', 'December':'dec'}
    mth = dDict[month]
    return(mth)
# replace month name with abbreviated month name
df1['mth'] = [mthFormat(m) for m in df1['mth']]
# remove adjacent duplicates for year and month
df1['yr'][df1['yr'].shift() == df1['yr']] = ''
df1['mth'][df1['mth'].shift() == df1['mth']] = ''
# select and format values to be displayed
df1['idx'][df1['mth']!='']
df1['display'] = df1['idx'][df1['mth']!='']
display = df1['display'].dropna()
displayVal = display.values.astype('int')
df_display = df1.iloc[displayVal]
df_display['display'] = df_display['display'].astype('int')
df_display['yrmth'] = df_display['mth'] + '<br>' + df_display['yr'].astype(str)
# set properties for each trace
for ser in range(0,len(fig['data'])):
    fig['data'][ser]['x'] = df1['idx'].values.tolist()
    fig['data'][ser]['text'] = df1['mth'].values.tolist()
    fig['data'][ser]['hoverinfo']='all'
# layout for entire figure
f2Data = fig['data']
f2Layout = go.Layout(
xaxis = go.layout.XAxis(
tickmode = 'array',
tickvals = df_display['display'].values.tolist(),
ticktext = df_display['yrmth'].values.tolist(),
zeroline = False)#,
)
# plot figure with specified major ticks and gridlines
fig2 = go.Figure(data=f2Data, layout=f2Layout)
iplot(fig2)
Some important details:
1. Flexibility and limitations with iplot():
This approach with iplot() and editing all those settings is a bit clunky, but it's very flexible with regards to the number of columns / variables in the dataset, and arguably preferable to building each trace manually like trace1 = go.Scatter() for each and every column in the df.
2. Why do you have to edit each series / trace?
If you try to skip the middle part with
for ser in range(0,len(fig['data'])):
    fig['data'][ser]['x'] = df1['idx'].values.tolist()
    fig['data'][ser]['text'] = df1['mth'].values.tolist()
    fig['data'][ser]['hoverinfo']='all'
and try to set tickvals and ticktext directly on the entire plot, it will have no effect:
I think that's a bit weird, but I think it's caused by some underlying settings initiated by iplot().
3. One thing is still missing:
In order for this setup to work, the structure of tickvals and ticktext is [0, 31, 59, 90] and ['jan<br>2015', 'feb<br>', 'mar<br>', 'apr<br>'], respectively. This causes the x-axis hovertext to show the position of the data where tickvals and ticktext are empty:
Any suggestions on how to improve the whole thing are highly appreciated. Better solutions than my own will instantly receive Accepted Answer status!

How do I add space between the tick labels and the graph in plotly (python)?

If I create a horizontal bar graph using plotly, the labels for each bar are right up against the graph. I'd like to add some space/pad/margin between the label and the graph. How can I do this?
Example:
import plotly.offline as py
import plotly.graph_objs as go
labels = ['Alice','Bob','Carl']
vals = [2,5,4]
data = [go.Bar(x=vals, y=labels, orientation='h')]
fig = go.Figure(data)
py.iplot(fig)
Just use the pad parameter in margin. Check the example from the docs here.
Code:
import plotly.offline as py
import plotly.graph_objs as go
labels = ['Alice','Bob','Carl']
vals = [2,5,4]
data = [go.Bar(x=vals, y=labels, orientation='h')]
layout = go.Layout(
margin=dict(
pad=20
),
title = 'hbar',
)
fig = go.Figure(data=data,layout=layout)
py.plot(fig, filename='horizontal-bar.html')
And the plot should look something like this:
Shorter solution:
fig.update_layout(margin_pad=10)
I think you could add some code like this.
import plotly.offline as py
import plotly.graph_objs as go
labels = ['Alice','Bob','Carl']
vals = [2,5,4]
data = [go.Bar(x=vals, y=labels, orientation='h')]
layout = dict(yaxis=dict(ticksuffix=" "))
fig = go.Figure(data=data,layout=layout)
py.iplot(fig)
Adding a suffix fixes this problem easily. I checked the Plotly reference; it also has a more suitable key named tickformat, but it is hard to use, so I didn't use it.
