Extracting plotly.express selection in JupyterLab - python

I want to extract the indices or a mask from a selection made in a plotly.express figure. The figure is created in JupyterLab.
import plotly.express as px
df = px.data.iris()
fig = px.scatter(df, x="sepal_width", y="sepal_length", color="species")
fig.show()
This figure shows the untouched figure.
This figure shows an arbitrary selection. From this selection, I would like to extract a list of indices, a boolean mask, or anything else that lets me pull the selected rows out of the original DataFrame.
There seem to be some attributes and functions meant to help with this, such as fig.data[0].selectedpoints, but I have been unable to make use of them.
The plotly version is '4.14.3'.

As far as I know, there is no way to get the range selected by the user. The attribute you pointed out in your question, selectedpoints, exists so that the author of a figure can highlight specific points. It is set by the figure's creator rather than read back from a user's selection. I have adapted the following example from this page.
import plotly.express as px
import plotly.graph_objects as go
df = px.data.iris()
fig = go.Figure()
fig.add_trace(go.Scatter(x=df['sepal_width'],
                         y=df['sepal_length'],
                         mode='markers',
                         marker=dict(color='rgb(0, 45, 240)', size=10)))
fig.update_layout(width=600,
                  height=550,
                  autosize=False,
                  xaxis=dict(zeroline=False),
                  hovermode='closest')
fig.show()
# indices of the points to highlight
inds = [15+k for k in range(30)]
fig.data[0].update(selectedpoints=inds,
                   selected=dict(marker=dict(color='red')),  # color of selected points
                   unselected=dict(marker=dict(color='rgb(200,200, 200)',  # color of unselected points
                                               opacity=0.9)))
fig.show()
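Once a list of point indices is available, whether written by hand as in inds above or, as an assumption not covered in this answer, collected some other way such as a selection callback on a go.FigureWidget trace, turning it into a boolean mask is straightforward. A minimal sketch follows; indices_to_mask is a hypothetical helper name, and the row count of 150 is the size of the iris dataset.

```python
def indices_to_mask(n_rows, indices):
    """Return a list of n_rows booleans that is True at each given position."""
    selected = set(indices)
    if any(i < 0 or i >= n_rows for i in selected):
        raise IndexError("selected index outside the data")
    return [i in selected for i in range(n_rows)]

inds = list(range(15, 45))          # same positions as in the answer above
mask = indices_to_mask(150, inds)   # 150 rows in px.data.iris()
print(sum(mask))                    # number of selected rows
```

With pandas, df[mask] or df.iloc[inds] then returns the selected rows. Note that these are positions within a single trace: px.scatter(..., color="species") draws one trace per species, so each trace's indices count from the start of that species' rows rather than from the start of df.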

Related

How to set color per column with Plotly

It seems so simple, and yet I just can't find a solution after reading a lot.
I would like to plot 2 columns of my pandas DataFrame and set a color for each.
color_dic = {"Close":'#565454',"MA":"red"}
fig = data.plot(x=data.index,y=["Close","MA"],template="simple_white",color=color_dic)
This is not the right way to do it, but what would be an equivalent way to get this?
Also, how can I add a scatter trace on top of this with a different color?
You can do this in many ways, and you can take a look at Plotly: How to define colors in a figure using plotly.graph_objects and plotly.express? for some details. But since you're specifically asking how to assign a color to a trace by the name of the source data in a pandas dataframe, I would use color_discrete_map = color_dict , where color_dict is a dictionary that contains {"Close":'#565454',"MA":"red"}, like this:
fig = df.plot(x=df.index, y=["Close","MA"], template="simple_white",
              color_discrete_map=color_dict)
Plot 1:
To include another trace, I would use fig.add_trace along with the trace type of choice like this:
fig.add_trace(go.Scatter(x=df.index, y=df['Close']*2,
                         mode='lines',
                         line_color='blue'))
Plot 2:
Complete code:
import numpy as np
import pandas as pd
import plotly.graph_objects as go
pd.options.plotting.backend = "plotly"
df = pd.DataFrame({"Close":[1,2,3,4,5,8,7,8],"MA":[2,2,2,3,4,4,6,7]})
color_dict = {"Close":'#565454',"MA":"red"}
fig = df.plot(x=df.index, y=["Close","MA"], template="simple_white",
              color_discrete_map=color_dict)
fig.add_trace(go.Scatter(x=df.index, y=df['Close']*2,
                         mode='lines',
                         line_color='blue'))
fig.show()
There are many ways to plot with plotly, but with graph_objects you can do:
import pandas as pd
import plotly.graph_objects as go
data={"Close":[1,2,3,2,1,2,3,4],"MA":[0,1,2,3,4,5,2,1]}
df=pd.DataFrame(data)
fig = go.Figure()
color_dic = {"Close":'#565454',"MA":"red"}
# Add one trace per column, colored by column name
for col in df.columns:
    fig.add_trace(go.Scatter(x=df.index, y=df[col],
                             mode='lines+markers',
                             name=col,
                             marker_color=color_dic[col]))
fig.show()
result:

Plotly bar chart not ascending/descending

I have a bar chart in plotly that I have produced, however, it is not in any type of order. How would I sort to ascending or descending?
What I am doing:
fig = px.bar(data, x='Old_SKU', y='u_power')
fig = data.sort_values('u_power', ascending=True)
fig.show()
I'm not sure what your desired output is, or what your data looks like. In any case, fig in plotly terms is normally a plotly figure object. When you run fig = data.sort_values('u_power', ascending=True), you're not building a figure but sorting a dataframe, and the sorted dataframe replaces the figure you had just created. So far I can only imagine that you'd like to sort a dataset that looks like this:
... into this:
Or maybe you're expecting a continuous increase or decrease? In that case you will have to share a dataset. Nevertheless, with a few tweaks depending on your dataset, the following snippet should not be far from a working solution:
import plotly.express as px
import numpy as np
import pandas as pd
var = np.random.randint(low=2, high=6, size=20).tolist()
data = pd.DataFrame({'u_power': var,
                     'Old_SKU': np.arange(0, len(var))})
# fig = px.bar(data, x='Old_SKU', y='u_power', barmode='stack')
fig = px.bar(data.sort_values('u_power'), x='Old_SKU', y='u_power', barmode='stack')
# Old_SKU is numeric here, so make the x axis categorical; otherwise the bars
# stay at their numeric positions regardless of row order
fig.update_xaxes(type='category')
fig.show()

Plotly: How to create a line plot with different style and color for each variable?

I am trying to create a graph with 10 different lines with different colours and markers using Plotly express. Something similar to this:
I can create a nice looking graph with different colours using the px.line function as the documentation suggests. My code looks like this:
import plotly.express as px
import numpy as np
import pandas as pd
rand_elems = []
for i in range(10):
    rand_elems.append(np.random.randn(25))
data = pd.DataFrame(rand_elems)
px.line(data_frame=data.T)
and my line graph looks like this:
where each variable is a (25,) numpy array with random values from the standard normal distribution (created with np.random.randn(25)).
Is there a way I can add different styles to each line? Other plotting libraries are also welcome as I couldn't find a solution for this in Plotly's documentation.
I understand there is a limit of line styles I could use. Maybe I could cycle through them and the colours? What would be a good solution for this?
EDIT: The graph purpose is solely to show that the signals are random and within the standard normal distribution limits.
I have taken @vestland's code and adapted it into a simpler version that achieves exactly what I needed.
The idea is to use SymbolValidator() from plotly.validators.scatter.marker, as noted in @vestland's answer. One could also shuffle this list for more distinct results.
Sample running code:
import plotly.express as px
from itertools import cycle
from plotly.validators.scatter.marker import SymbolValidator
# data
df = px.data.gapminder()
df = df[df['country'].isin(['Canada', 'USA', 'Norway', 'Sweden', 'Germany'])]
# plotly
fig = px.line(df, x='year', y='lifeExp',
              color='country',
              line_dash='continent')
raw_symbols = SymbolValidator().values
# Take only the symbol names; this slice assumes the list alternates number, name.
symbols_names = raw_symbols[::-2]
markers = cycle(symbols_names)
# set unique marker style for different countries
fig.update_traces(mode='lines+markers')
for d in fig.data:
    d.marker.symbol = next(markers)
    d.marker.size = 10
fig.show()
This code produces the following graph:
Thanks a lot for the insights, @vestland. I wonder if Plotly could make this a parameter option in future versions.
Update:
If your DataFrame, like mine, has no grouping column to pass to the line_dash parameter, you can also cycle through a list of line styles and overwrite the style of each trace in fig.data through its line["dash"] property, as presented below:
Code to generate data and import packages.
import numpy as np
import pandas as pd
import plotly.express as px
from itertools import cycle
from random import shuffle
from plotly.validators.scatter.marker import SymbolValidator
rand_elems = []
for i in range(10):
    rand_elems.append(np.random.randn(25))
data = pd.DataFrame(rand_elems)
data.index = [f"Signal {i}" for i in range(10)]
Code to generate different marker and line styles for each column in the DataFrame.
line_styles_names = ['solid', 'dot', 'dash', 'longdash', 'dashdot', 'longdashdot']
line_styles = cycle(line_styles_names)
fig = px.line(data_frame=data.T,
              color_discrete_sequence=px.colors.qualitative.Bold_r,
              title="First 10 Signals Visualization")
symbols_names = list(set([i.replace("-open", "").replace("-dot", "") for i in SymbolValidator().values[::-2]]))
shuffle(symbols_names)
markers = cycle(symbols_names)
_ = fig.update_traces(mode='lines+markers')
for d in fig.data:
    d.line["dash"] = next(line_styles)
    d.marker.symbol = next(markers)
    d.marker.size = 10
fig.show()
With this, the result should look similar to the following, which is exactly what I was looking for.
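The [::-2] slice above relies on how SymbolValidator().values happens to be laid out. As far as I can tell, some plotly releases list each symbol as a (number, name) pair while others also include the number as a string, in which case the slice picks up entries that are not names. A layout-independent sketch is to keep only the entries that are non-numeric strings; symbol_names is a hypothetical helper, and the two sample lists are hand-written stand-ins for the real validator output.

```python
def symbol_names(raw_values):
    """Keep only the symbol names, dropping numeric codes and numeric strings."""
    return [v for v in raw_values if isinstance(v, str) and not v.isdigit()]

# Hand-written samples imitating two possible layouts of SymbolValidator().values
pairs = [0, "circle", 100, "circle-open", 1, "square"]
triples = [0, "0", "circle", 100, "100", "circle-open", 1, "1", "square"]

print(symbol_names(pairs))
print(symbol_names(triples))
```

In the code above, this would mean replacing the slice with symbol_names(SymbolValidator().values).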
px.line is perfect for datasets of high diversity, like different categories for an array of countries across several continents, because you can distinguish the categories using arguments like color='country' and line_dash='continent' to assign colors and dash styles to them. Here's an example using a subset of the built-in dataset px.data.gapminder():
plot 1
code 1
import plotly.express as px
# data
df = px.data.gapminder()
df = df[df['country'].isin(['Canada', 'USA', 'Norway', 'Sweden', 'Germany'])]
# plotly
fig = px.line(df, x='year', y='lifeExp',
              color='country',
              line_dash='continent')
fig.show()
But you seem to be interested in different marker shapes as well, and there does not seem to be built-in functionality for those that is as easy to use as color and line_dash. So what follows is a way to cycle through the available marker shapes and apply them to, for example, the different countries. You could of course define your own sequence by picking shapes from the marker styles, like:
['arrow-bar-left', 'asterisk', 'arrow-right', 'line-ne', 'circle-cross', 'y-left']
But you can also grab a bunch of styles based on raw_symbols = SymbolValidator().values, refine those findings a bit and add those to, for example, country names.
Here's the result
And here's how you do it:
import plotly.express as px
from itertools import cycle
from plotly.validators.scatter.marker import SymbolValidator
# data
df = px.data.gapminder()
df = df[df['country'].isin(['Canada', 'USA', 'Norway', 'Sweden', 'Germany'])]
# plotly
fig = px.line(df, x='year', y='lifeExp',
              color='country',
              line_dash='continent')
# retrieve a bunch of markers
raw_symbols = SymbolValidator().values
namestems = []
namevariants = []
symbols = []
for i in range(0,len(raw_symbols),3):
    name = raw_symbols[i+2]
    symbols.append(raw_symbols[i])
    namestems.append(name.replace("-open", "").replace("-dot", ""))
    namevariants.append(name[len(namestems[-1]):])
markers = cycle(list(set(namestems)))
# set unique marker style for different countries
fig.update_traces(mode='lines+markers')
for d in fig.data:
    d.marker.symbol = next(markers)
    d.marker.size = 10
fig.show()
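To see why cycling through two style lists of different lengths gives distinct combinations for a while, here is a small standalone sketch with no plotting involved. The dash names are those used earlier on this page; the four marker names are an arbitrary pick for illustration.

```python
from itertools import cycle, islice

dashes = ['solid', 'dot', 'dash', 'longdash', 'dashdot', 'longdashdot']
markers = ['circle', 'square', 'diamond', 'cross']

# Pair the two endless cycles and take one (dash, marker) pair per trace
n_traces = 10
styles = list(islice(zip(cycle(dashes), cycle(markers)), n_traces))

for i, (dash, marker) in enumerate(styles):
    print(i, dash, marker)

# Pairs only start repeating after lcm(6, 4) = 12 traces
print(len(set(styles)) == n_traces)
```

Each pair could then be assigned inside the for d in fig.data loop, in the same way as d.marker.symbol above.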
The way I know is to create an individual graph object for each line you want to plot (each with its own style), then collect all the graph objects in a list and pass it to the data argument of go.Figure().
See this blog for an example.

Plotly: How to add vertical lines at specified points?

I have a data frame plot of a time series along with a list of numeric values at which I'd like to draw vertical lines. The plot is an interactive one created using the cufflinks package. Here is an example of three time series with 1000 time values each, where I'd like to draw vertical lines at 500 and 800. My attempt using "axvline" is based on suggestions I've seen in similar posts:
import numpy as np
import pandas as pd
import cufflinks
np.random.seed(123)
X = np.random.randn(1000,3)
df=pd.DataFrame(X, columns=['a','b','c'])
fig=df.iplot(asFigure=True,xTitle='time',yTitle='values',title='Time Series Plot')
fig.axvline([500,800], linewidth=5,color="black", linestyle="--")
fig.show()
The error message states 'Figure' object has no attribute 'axvline'.
I'm not sure whether this message is due to my lack of understanding about basic plots or stems from a limitation of using igraph.
The answer:
To add a line to an existing plotly figure, just use:
fig.add_shape(type='line',...)
The details:
I gather this is the post you've seen, since you're mixing in matplotlib. As has been stated in the comments, axvline has nothing to do with plotly; it was only used as an example of how you could have done it with matplotlib. Using plotly, I'd go for fig.add_shape(go.layout.Shape(type="line")). But before you try it out for yourself, please be aware that cufflinks has been deprecated. I really liked cufflinks, but there are now better options for building both quick and detailed graphs. If you'd like to stick to one-liners similar to iplot, I'd suggest using plotly.express. The only hurdle in your case is changing your dataset from a wide format to the long format preferred by plotly.express. The snippet below does just that to produce the following plot:
Code:
import numpy as np
import pandas as pd
import plotly.express as px
# data
np.random.seed(123)
X = np.random.randn(1000,3)
df=pd.DataFrame(X, columns=['a','b','c'])
df['id'] = df.index
df = pd.melt(df, id_vars='id', value_vars=df.columns[:-1])
# plotly line figure
fig = px.line(df, x='id', y='value', color='variable')
# lines to add as (label, x-position) pairs; a dict would drop the repeated label 'a'
lines = [('a', 500), ('c', 700), ('a', 900), ('b', 950)]
# add lines using absolute references
for label, x_pos in lines:
    fig.add_shape(type='line',
                  yref="y",
                  xref="x",
                  x0=x_pos,
                  y0=df['value'].min()*1.2,
                  x1=x_pos,
                  y1=df['value'].max()*1.2,
                  line=dict(color='black', width=3))
    fig.add_annotation(
        x=x_pos,
        y=1.06,
        yref='paper',
        showarrow=False,
        text=label)
fig.show()
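One thing to watch for when the same label should mark more than one position: a Python dict keeps a single value per key, so a repeated key in a dict literal silently drops the earlier position. A list of (label, position) pairs keeps every line. A short standalone sketch:

```python
# A dict literal with a repeated key keeps only the last value for that key
as_dict = {'a': 500, 'c': 700, 'a': 900, 'b': 950}
print(as_dict)    # three entries; the line at 500 is gone

# A list of (label, x-position) pairs preserves all four lines
as_pairs = [('a', 500), ('c', 700), ('a', 900), ('b', 950)]
for label, x_pos in as_pairs:
    print(label, x_pos)
```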
I'm not sure if this is what you want, but adding two scatter traces seems to work:
import numpy as np
import pandas as pd
import cufflinks
np.random.seed(123)
X = np.random.randn(1000,3)
df=pd.DataFrame(X, columns=['a','b','c'])
fig = df.iplot(asFigure=True,xTitle='time',yTitle='values',title='Time Series Plot')
# each vertical line is a scatter trace with constant x
fig.add_scatter(x=[500]*100, y=np.linspace(-4,4,100), name='lower')
fig.add_scatter(x=[800]*100, y=np.linspace(-4,4,100), name='upper')
fig.show()
Output:

Multiple series in a trace for plotly

I dynamically generate a pandas dataframe where columns are months, index is day-of-month, and values are cumulative revenue. This is fairly easy, b/c it just pivots a dataframe that is month/dom/rev.
But now I want to plot it in plotly. Since every month the columns will expand, I don't want to manually add a trace per month. But I can't seem to have a single trace incorporate multiple columns. I could've sworn this was possible.
revs = Scatter(
    x=df.index,
    y=[df['2016-Aug'], df['2016-Sep']],
    name=['rev', 'revvv'],
    mode='lines'
)
data=[revs]
fig = dict( data=data)
iplot(fig)
This generates an empty graph, no errors. Ideally I'd just pass df[df.columns] to y. Is this possible?
You were probably thinking about cufflinks. You can plot a whole dataframe with Plotly using the iplot function without data replication.
An alternative would be to use pandas.plot to get a matplotlib object, which is then converted via plotly.tools.mpl_to_plotly and plotted. The whole procedure can be shortened to one line:
plotly.plotly.plot_mpl(df.plot().figure)
The output is virtually identical, just the legend needs tweaking.
import random
from collections import OrderedDict
import pandas as pd
import plotly
import cufflinks as cf
data = OrderedDict()
for month in ['2016-Aug', '2016-Sep']:
    data[month] = [random.randrange(i * 10, i * 100) for i in range(1, 30)]
# using cufflinks
df = pd.DataFrame(data, index=[i for i in range(1, 30)])
fig = df.iplot(asFigure=True, kind='scatter', filename='df.html')
plot_url = plotly.offline.plot(fig)
print(plot_url)
# using mpl_to_plotly
plot_url = plotly.offline.plot(plotly.tools.mpl_to_plotly(df.plot().figure))
print(plot_url)
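If neither cufflinks nor matplotlib is wanted, the usual plain-plotly pattern is one trace per column, built in a loop so that new months are picked up automatically. Below is a minimal sketch using plain dicts in the same style as the question's fig = dict(data=data); the columns mapping is a hypothetical stand-in for the DataFrame, with made-up numbers.

```python
# Hypothetical stand-in for the pivoted DataFrame: month -> cumulative revenue
columns = {
    '2016-Aug': [10, 25, 40, 70],
    '2016-Sep': [5, 30, 55, 90],
}
days = [1, 2, 3, 4]  # day-of-month index shared by all columns

# One scatter trace per column, however many columns there are
data = [
    dict(type='scatter', x=days, y=values, name=month, mode='lines')
    for month, values in columns.items()
]
fig = dict(data=data)

print([trace['name'] for trace in fig['data']])
```

With a real DataFrame the loop would run over df.columns and use df[col] for y; the resulting fig can then be passed to iplot as in the question.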
