I'm trying to create a dropdown button in plotly that would allow for plotting multiple vectors at once for a subset of data (or, to have multiple traces for the same dropdown button). These subsets would be chosen via the above-mentioned button.
Here's a toy example in plotly.express:
import pandas as pd
import plotly.express as px
import plotly.graph_objects as go
df = pd.DataFrame(
dict(
x=[1.0, 2.0, 3.0, 4.0],
y1=[1.0, 2.0, 3.0, 4.0],
y2=[2.0, 4.0, 6.0, 8.0],
z=["a", "a", "b", "b"],
)
)
fig = px.scatter(df, x="x", y=["y1", "y2"], symbol="z")
fig.show()
What I'd like to achieve is a dropdown button that would select a subset for distinct z values ("a" or "b").
Unfortunately, go.Scatter does not seem to accept multiple y arrays, and I end up with a complete mess:
zs = df.z.unique()
dropdown_buttons = []
fig = go.Figure()
for i, val in enumerate(zs):
    df_ = df.query(f'z=="{val}"')
    fig.add_trace(
        go.Scatter(
            x=df_["x"],
            y=df_[["y1", "y2"]],
            name=val,
        )
    )
    dropdown_buttons.append(
        {
            "label": val,
            "method": "update",
            "args": [
                {"visible": [x == i for x in range(len(zs))]},
                {"title": val},
            ],
        }
    )
fig.update_layout(
{
"updatemenus": [
{
"type": "dropdown",
"showactive": True,
"active": 0,
"buttons": dropdown_buttons,
}
]
}
)
fig.show()
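One possible way around this, sketched here under assumptions rather than taken from an accepted answer: add one trace per (z value, y column) pair, since each go.Scatter takes a single one-dimensional y, and let each button show all traces of its z value. The helper name build_visibility_buttons and its traces_per_group parameter are hypothetical. The sketch only builds the button dictionaries and assumes the traces were added group by group.

```python
def build_visibility_buttons(group_values, traces_per_group):
    """Return one 'update' button per group that shows only that group's traces.

    Assumption: traces were added to the figure group by group, with
    `traces_per_group` consecutive traces (one per y column) for each group.
    """
    n_traces = len(group_values) * traces_per_group
    buttons = []
    for i, val in enumerate(group_values):
        # A trace belongs to group i if its position falls in that group's block
        visible = [trace_ix // traces_per_group == i for trace_ix in range(n_traces)]
        buttons.append(
            {
                "label": str(val),
                "method": "update",
                "args": [{"visible": visible}, {"title": {"text": str(val)}}],
            }
        )
    return buttons


# Two z values ("a", "b") and two y columns ("y1", "y2") give four traces
buttons = build_visibility_buttons(["a", "b"], traces_per_group=2)
print(buttons[0]["args"][0]["visible"])  # [True, True, False, False]
```

To use it with the figure above, the loop would add go.Scatter(x=df_["x"], y=df_[col]) once for each col in ["y1", "y2"], and the returned list would replace dropdown_buttons. The traces of the other groups probably also need visible=False at creation so that the initial view matches the active button.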
I have multiple clusters and each datapoint in the cluster has a special group. I am trying to highlight selected data points (with yellow color) in the plotly scatter plot based on selected value (group) from a dropdown list.
Here is a code to generate sample data:
import pandas as pd
import numpy as np
def generate_random_cluster(name, size, loc_x, loc_y, groups=['A','B','C'], p=None):
    return pd.DataFrame({
        'name': name,
        'x': np.random.normal(loc=loc_x, size=size),
        'y': np.random.normal(loc=loc_y, size=size),
        # draw from the groups argument instead of a hard-coded list
        'group': np.random.choice(groups, size=size, p=p)
    })
groups = ['A','B','C']
cluster_1 = generate_random_cluster(name='cluster_1', size=15, loc_x=3, loc_y=2, groups=groups, p=[0.7, 0.2, 0.1])
cluster_2 = generate_random_cluster(name='cluster_2', size=35, loc_x=9, loc_y=5, groups=groups, p=[0.2, 0.7, 0.1])
cluster_3 = generate_random_cluster(name='cluster_3', size=20, loc_x=6, loc_y=8, groups=groups, p=[0.1, 0.2, 0.7])
data = pd.concat([cluster_1, cluster_2, cluster_3]).reset_index(drop=True)
data.head()
Which returns a dataframe like this:
name        x         y         group
cluster_1   3.198048  0.385736  B
cluster_1   1.784080  2.608631  A
cluster_1   4.160103  2.119545  A
cluster_1   2.522486  1.994962  B
cluster_1   4.073054  1.204167  A
I am quite new to plotly, but based on the documentation I thought I just needed to use the update_layout method like this:
import plotly.graph_objects as go
cluster_colors = {'cluster_1': 'green', 'cluster_2': 'red', 'cluster_3': 'blue'}
layout = go.Layout(
xaxis = go.layout.XAxis(
showticklabels=False),
yaxis = go.layout.YAxis(
showticklabels=False
)
)
fig = go.Figure(layout=layout)
for cluster_ix, (cluster, df) in enumerate(data.groupby('name')):
    customdata = df['group']
    fig.add_scatter(
        x=df['x'],
        y=df['y'],
        name=cluster,
        mode='markers',
        customdata=customdata,
        hovertemplate="<br>".join([
            "X: %{x}",
            "Y: %{y}",
            "Group: %{customdata}"
        ]),
        marker_color=[cluster_colors[cluster] for _ in range(len(df))],
    )
def highlight_group(group):
    result = []
    for tracer_ix, tracer in enumerate(fig["data"]):
        colors = ["yellow" if datapoint_group == group else cluster_colors[fig["data"][tracer_ix]["name"]] for datapoint_group in fig["data"][tracer_ix]["customdata"]]
        result.append(colors)
    return result
fig.update_layout(
updatemenus=[
{
"buttons": [
{
"label": group,
"method": "update",
"args": [
{"marker": {"color": highlight_group(group)}}
],
}
for group in groups
]
}
],
margin={"l": 0, "r": 0, "t": 25, "b": 0},
height=700
)
fig.show()
This generates a plot like this:
But when I change the value from the dropdown list, every marker turns black:
How to correctly highlight selected markers?
Based on @jmmease's answer here in the plotly forums, I believe you can replace the nested marker dictionary with the flattened "marker.color" key:
fig.update_layout(
updatemenus=[
{
"buttons": [
{
"label": group,
"method": "update",
"args": [
{"marker.color": highlight_group(group)}
],
}
for group in groups
]
}
],
margin={"l": 0, "r": 0, "t": 25, "b": 0},
height=700
)
Here is the result:
This accomplishes what you asked in your original question, but from a design perspective, you might want to add another dropdown option so that you can select no groups – otherwise, once you select a group, you cannot return the figure to its original state.
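Since your code is pretty robust, you can iterate through groups+["None"] instead of groups to create the buttons (this leaves groups itself unmodified). No data point belongs to a group called "None", so that option gives every marker its cluster color again, and you get another dropdown option with the label None: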
fig.update_layout(
updatemenus=[
{
"buttons": [
{
"label": group,
"method": "update",
"args": [
{"marker.color": highlight_group(group)}
],
}
for group in groups+["None"]
]
}
],
margin={"l": 0, "r": 0, "t": 25, "b": 0},
height=700
)
Then the result looks like this:
This next part is beyond the scope of your original question, but the legend may become confusing. Each legend entry is tied to a cluster (a trace), not to an individual marker color. Once you select a group to color "yellow", a cluster contains some yellow markers and some markers in their original color, and I believe plotly has to choose a single color for the legend entry, probably the color of the first marker in the trace.
For example, once we select Group B from the dropdown, cluster 3 is mostly blue markers as you defined when creating the figure, but there is also a mixture of yellow markers from Group B, and this causes the legend entry to be colored yellow. The same issue exists for cluster 2 which is mostly red markers but contains some Group B yellow markers. If I think of a solution, I'll update my answer.
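As a rough illustration of what the buttons above send to the figure, here is a minimal sketch of the color lists built independently of fig. The function name highlight_colors and its arguments are made up for this example. It assumes one list of group labels and one base color per trace, in trace order.

```python
def highlight_colors(trace_groups, trace_base_colors, selected):
    """Build one color list per trace, painting points of `selected` yellow.

    trace_groups: list (one entry per trace) of per-point group labels.
    trace_base_colors: list (one entry per trace) of the cluster color.
    """
    return [
        ["yellow" if g == selected else base for g in groups]
        for groups, base in zip(trace_groups, trace_base_colors)
    ]


trace_groups = [["A", "B"], ["B", "B", "C"]]
base_colors = ["green", "red"]

colors = highlight_colors(trace_groups, base_colors, "B")
print(colors)  # [['green', 'yellow'], ['yellow', 'yellow', 'red']]

# The outer list has one entry per trace, which is the shape the
# flattened "marker.color" key expects in the button args
button = {"label": "B", "method": "update", "args": [{"marker.color": colors}]}

# A label that matches no group restores the base colors
reset = highlight_colors(trace_groups, base_colors, "None")
```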
I am trying to build a plotly scatterplot in Jupyter Lab to be able to see dependencies between various columns in a DataFrame.
I want to have two dropdown menus (corresponding to the X and Y axes), in each of which a full list of the DF columns will be available. When I select a column in any of the menus, the data on the appropriate axis should be replaced by the column I selected (so, if I select the same column for X and Y, I would expect a straight line).
Below is my current implementation with a sample DataFrame:
import numpy as np
import pandas as pd
import plotly.graph_objects as go
# Creating the DataFrame
temp = pd.DataFrame(np.random.randint(0, 1000, (100, 10)))
col_list = ['A', 'B', 'C', 'D', 'E', 'F', 'G', 'H', 'I', 'J']
temp.columns = col_list
# Init figure with the A column on both axes by default
fig = go.Figure()
default_col = 0
fig.add_trace(
go.Scatter(
x=temp[col_list[default_col]].values,
y=temp[col_list[default_col]].values,
name="Metric correlation",
mode="markers"
),
)
fig.update_xaxes(title_text=col_list[default_col])
fig.update_yaxes(title_text=col_list[default_col])
col_list = temp.columns
# Building options for each of the lists
btns_x = [
dict(
label=c,
method="update",
args=[
{"x": temp[c].fillna(0).values,
'xaxis': {'title': c}
}],
) for c in col_list]
btns_y = [
dict(
label=c,
method="update",
args=[
{"y": temp[c].fillna(0).values,
'yaxis': {'title': c}
}],
) for c in col_list]
# Adding the lists to the figure
fig.update_layout(
updatemenus=[
dict(
buttons=btns_x,
# method="update",
direction="down",
pad={"r": 10, "t": 10},
showactive=True,
x=0.1,
xanchor="left",
y=1.1,
yanchor="top"
),
dict(
buttons=btns_y,
# method="update",
direction="down",
pad={"r": 10, "t": 10},
showactive=True,
x=0.1,
xanchor="right",
y=1.1,
yanchor="top"
),
]
)
fig.update_layout(width=1000, height=1000)
fig.show()
The figure draws correctly initially:
Still, there are a few problems:
When I change values in a dropdown, it only works once. On the next tries nothing happens
If I first change the value on one dropdown and then on the other, all data disappears from the graph (see screenshot below)
The axes labels are not being updated
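It's just about being systematic with the list comprehensions. With method "update", the first element of args is the trace update and the second is the layout update, so the axis title belongs in the second dict. The new x or y array also has to be wrapped in an outer list so that it is applied to the single trace as a whole. The code below fully works: it allows selection of any column and updates the appropriate axis title.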
import pandas as pd
import numpy as np
import plotly.express as px
temp = pd.DataFrame(np.random.randint(0, 1000, (100, 10)))
col_list = ["A", "B", "C", "D", "E", "F", "G", "H", "I", "J"]
temp.columns = col_list
fig = px.scatter(temp, x="A", y="B")
fig.update_layout(
updatemenus=[
{
"buttons": [
{
"label": c,
"method": "update",
"args": [
{axis: [temp[c]]},
{f"{axis}axis": {"title": {"text": c}}},
],
}
for c in temp.columns
],
"x": 0 if axis == "x" else 0.1,
"y": 1.2,
}
for axis in "xy"
]
)
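To make the structure of those args easier to see, here is a small sketch that builds the same two menus from plain Python data, without pandas or a figure. The helper name axis_menus is hypothetical, and the menu positions simply copy the values used above.

```python
def axis_menus(columns):
    """Build one dropdown per axis from a mapping of column name -> values."""
    menus = []
    for axis, x_pos in (("x", 0), ("y", 0.1)):
        buttons = [
            {
                "label": name,
                "method": "update",
                "args": [
                    # trace update: outer list holds one entry per trace
                    {axis: [values]},
                    # layout update: title of the matching axis
                    {f"{axis}axis": {"title": {"text": name}}},
                ],
            }
            for name, values in columns.items()
        ]
        menus.append({"buttons": buttons, "x": x_pos, "y": 1.2})
    return menus


menus = axis_menus({"A": [1, 2], "B": [3, 4]})
print(menus[1]["buttons"][1]["args"])
# [{'y': [[3, 4]]}, {'yaxis': {'title': {'text': 'B'}}}]
```

The returned list can be passed as fig.update_layout(updatemenus=menus).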
I'm trying to create a button that switches the x and y axes of a plotly.js figure, so that x becomes the y axis and y becomes the x axis.
Reading the documentation, the only thing I could find concerns reversing the range using the autorange attribute.
Is there a way to simply switch x and y without having to create a new figure from scratch?
This is tagged as python. The code below flips x and y in python. A similar approach could be used in JavaScript for the structure of updatemenus.
import pandas as pd
import numpy as np
import plotly.express as px
df = pd.DataFrame(
{"var1": np.random.uniform(1, 5, 30), "var2": np.random.uniform(4, 10, 30)}
)
fig = px.scatter(df, x="var1", y="var2")
fig.update_layout(
updatemenus=[
{
"buttons": [
{
"label": combi,
"method": "restyle",
"args": [
{"x": [fig.data[0][combi[0]]], "y": [fig.data[0][combi[1]]]}
],
}
for combi in ["xy", "yx"]
]
}
]
)
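The button structure can also be sketched with plain lists, which may make it easier to port to JavaScript. The function name swap_buttons is an assumption for illustration; it only builds the dictionaries and does not touch a figure.

```python
def swap_buttons(x, y):
    """Return two 'restyle' buttons: original orientation and swapped."""
    data = {"x": list(x), "y": list(y)}
    return [
        {
            "label": combi,
            "method": "restyle",
            # combi[0] names the source of the new x, combi[1] of the new y;
            # each array is wrapped in a list because there is one trace
            "args": [{"x": [data[combi[0]]], "y": [data[combi[1]]]}],
        }
        for combi in ("xy", "yx")
    ]


buttons = swap_buttons([1, 2], [3, 4])
print(buttons[1]["args"][0])  # {'x': [[3, 4]], 'y': [[1, 2]]}
```

Since "restyle" only changes trace attributes, the axis titles stay as they were. Swapping them too would need method "update" with a second, layout dict.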
I made a choropleth map with a continuous color scale and split it into 4 sub-map facets. The problem is that there is a single color bar for all 4 maps, and even if I try to change the color scale, it is applied to all maps together. Is there a way to set a different color scale for each facet?
fig = px.choropleth(df,
geojson=counties,
locations='id',
color='count',
facet_col='age_group',
facet_col_wrap=2,
color_continuous_scale='BuGn',
hover_name='county',
width=1000,
height=900,
animation_frame='year')
fig.update_geos(fitbounds="locations")
fig.show()
You have not provided sample geojson or data, so I have picked up US states and simulated a dataframe. Your code pretty much works unchanged (add the featureidkey parameter).
It's a case of updating the traces in the figure and in the animation frames to use a separate coloraxis each.
With the traces updated, the position and colorscale of each coloraxis need to be set.
import plotly.express as px
import requests
import numpy as np
import pandas as pd
# get some geojson
geojson = requests.get(
"https://raw.githubusercontent.com/nvkelso/natural-earth-vector/master/geojson/ne_110m_admin_1_states_provinces.geojson"
).json()
counties = {
k: v
if k != "features"
else [
{
k: v if k != "properties" else {"id": i, "name": v["name"]}
for k, v in f.items()
}
for i, f in enumerate(v)
]
for k, v in geojson.items()
}
# construct a dataframe of the structure implied in the question
df = (
pd.json_normalize(counties["features"])
.pipe(lambda d: d.drop(columns=[c for c in d.columns if not "properties" in c]))
.rename(columns={"properties.id": "id", "properties.name": "county"})
.merge(pd.DataFrame({"year": range(2015, 2023)}), how="cross")
.merge(pd.DataFrame({"age_group": ["<18", "18-30", "30-65", ">65"]}), how="cross")
.pipe(
lambda d: d.assign(
count=np.random.randint(1, 50, len(d))
* (pd.factorize(d["age_group"])[0] + 1)
)
)
)
fig = px.choropleth(
df,
geojson=counties,
locations="id",
featureidkey="properties.id",
color="count",
facet_col="age_group",
facet_col_wrap=2,
color_continuous_scale="BuGn",
hover_name="county",
width=1000,
height=900,
animation_frame="year",
)
fig.update_geos(fitbounds="locations")
# update traces to use different coloraxis
for i, t in enumerate(fig.data):
t.update(coloraxis=f"coloraxis{i+1}")
for fr in fig.frames:
# update each of the traces in each of the animation frames
for i, t in enumerate(fr.data):
t.update(coloraxis=f"coloraxis{i+1}")
# position / config all coloraxis
fig.update_layout(
coloraxis={"colorbar": {"x": -0.2, "len": 0.5, "y": 0.8}},
coloraxis2={
"colorbar": {
"x": 1.2,
"len": 0.5,
"y": 0.8,
},
"colorscale": fig.layout["coloraxis"]["colorscale"],
},
coloraxis3={
"colorbar": {"x": -0.2, "len": 0.5, "y": 0.3},
"colorscale": fig.layout["coloraxis"]["colorscale"],
},
coloraxis4={
"colorbar": {"x": 1.2, "len": 0.5, "y": 0.3},
"colorscale": fig.layout["coloraxis"]["colorscale"],
},
)
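The four hand-written coloraxis entries could probably be generated instead. The sketch below is one way to do it, assuming a two-column facet grid and the same left/right placement as above. The helper name coloraxis_layout is hypothetical, and the y positions are evenly spaced approximations, not the exact values used above.

```python
def coloraxis_layout(n_facets, colorscale):
    """Build layout entries for one coloraxis per facet (two-column grid)."""
    n_cols = 2
    n_rows = -(-n_facets // n_cols)  # ceiling division
    layout = {}
    for i in range(n_facets):
        row, col = divmod(i, n_cols)
        # The first axis is named "coloraxis", later ones "coloraxis2", ...
        key = "coloraxis" if i == 0 else f"coloraxis{i + 1}"
        layout[key] = {
            "colorscale": colorscale,
            "colorbar": {
                "x": -0.2 if col == 0 else 1.2,  # left or right of the maps
                "y": 1 - (row + 0.5) / n_rows,   # centered on the facet row
                "len": 1 / n_rows,
            },
        }
    return layout


layout = coloraxis_layout(4, "BuGn")
print(list(layout))  # ['coloraxis', 'coloraxis2', 'coloraxis3', 'coloraxis4']
```

The result would be applied with fig.update_layout(**layout). A different colorscale per facet could be passed by turning colorscale into a list indexed by i.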
I would like to hide specific nodes (in my case, the rightmost) while preserving the size of intermediate nodes. As a simplistic example:
import plotly.graph_objects as go
link_data = dict(
source = [0,1,1],
target = [1,2,2],
value = [1,1,1]
)
node_data = dict(
label = ['a','b','c'],
)
fig = go.Figure(
data = [go.Sankey(
node = node_data,
link = link_data
)]
)
fig.show()
Results in:
But I want something more like this:
Some approaches I've tried:
I can remove the extra b-to-c connection and feed it back to b. This preserves the height of node b, but adds a circular link (which I don't want). This might be ok if I could remove the loop.
I can specify link colors as ['grey','white','white'] (or 'rgba(0,0,0,0)' in place of 'white') and node colors as ['blue','blue','white'], but this isn't the best looking: it adds a large pad of space to the right. It also seems to add unnecessary elements to the figure (more important to me for performance when my figure is complex).
-Python 3.8, Plotly 5.3.1
I am re-using the approach to creating a Sankey plot from "plotly sankey graph data formatting".
I used a slightly more sophisticated approach that is similar to your second approach. As you have noted, this does mean two things:
there is space to the right of the chart
the hover info is still there!
I have extended the sample data to show that node d is invisible as well, since it is also an end node with no flows going out of it.
import pandas as pd
import numpy as np
import plotly.graph_objects as go
import plotly.express as px
links = [
{"source": "a", "target": "b", "value": 1},
{"source": "b", "target": "c", "value": 1},
{"source": "b", "target": "c", "value": 1},
{"source": "b", "target": "d", "value": 1}
]
df = pd.DataFrame(links)
nodes = np.unique(df[["source", "target"]], axis=None)
nodes = pd.Series(index=nodes, data=range(len(nodes)))
invisible = set(df["target"]) - set(df["source"])
fig = go.Figure(
go.Sankey(
node={
"label": [n if not n in invisible else "" for n in nodes.index],
"color": [
px.colors.qualitative.Plotly[i%len(px.colors.qualitative.Plotly)]
if not n in invisible
else "rgba(0,0,0,0)"
for i, n in enumerate(nodes.index)
],
"line": {"width": 0},
},
link={
"source": nodes.loc[df["source"]],
"target": nodes.loc[df["target"]],
"value": df["value"],
"color": [
"lightgray" if not n in invisible else "rgba(0,0,0,0)"
for n in df["target"]
],
},
)
)
fig
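The same idea can be sketched without pandas or numpy, which may help when checking which nodes end up hidden. The function name hide_terminal_nodes, the constant TRANSPARENT, and the default colors are assumptions for this example. A node is treated as hidden when it appears as a target but never as a source.

```python
TRANSPARENT = "rgba(0,0,0,0)"


def hide_terminal_nodes(links, node_color="steelblue", link_color="lightgray"):
    """Return (node, link) dicts for go.Sankey with end nodes made transparent."""
    sources = {link["source"] for link in links}
    targets = {link["target"] for link in links}
    hidden = targets - sources  # nodes with no outgoing flow
    names = sorted(sources | targets)
    index = {name: i for i, name in enumerate(names)}
    node = {
        "label": ["" if n in hidden else n for n in names],
        "color": [TRANSPARENT if n in hidden else node_color for n in names],
        "line": {"width": 0},
    }
    link = {
        "source": [index[l["source"]] for l in links],
        "target": [index[l["target"]] for l in links],
        "value": [l["value"] for l in links],
        # links that end in a hidden node are hidden as well
        "color": [TRANSPARENT if l["target"] in hidden else link_color for l in links],
    }
    return node, link


links = [
    {"source": "a", "target": "b", "value": 1},
    {"source": "b", "target": "c", "value": 1},
    {"source": "b", "target": "c", "value": 1},
    {"source": "b", "target": "d", "value": 1},
]
node, link = hide_terminal_nodes(links)
print(node["label"])  # ['a', 'b', '', '']
```

The two dicts can be passed as go.Sankey(node=node, link=link). As with the code above, the hidden nodes still take up space and still respond to hover.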