Plotly Dash - not fully understanding the extendData feature with stacked barplot - python

I've created a stacked barplot in a dash app that updates the number of data points in a trace according to input from a slider, as shown:
Two different states:
Unfortunately, the updates on the image are painfully slow (note that the update mode is 'drag' and not the default). I want fluid action on the slider as it's moved. I've tried to implement a clientside callback to update the data, as shown in this post: Plotly/Dash display real time data in smooth animation (thanks to emher for that great response), but I can't get it working. I find the interactions between inputs and outputs hard to understand without more complex examples, and I'm not even sure whether this is the best solution or whether it works with my plot type. I think the problem has to do with the several go.Bar() calls inside the main go.Figure() call, but I'm honestly not sure, and I have a really hard time finding documentation on extendData or clientside callbacks.
I'm attaching a very basic working example of the structure of my plots. It doesn't run especially slowly because the amount of data is small, but ideally I would like fully smooth responsiveness as the user drags the slider around. I would really appreciate some help in making clientside callbacks work here, or any other method that allows this without having to ping the AWS container for every re-render. (On the JS side, I'm not sure whether I should be re-rendering the plot in Plotly.js or only updating the data, or how to use dcc.Store with all of this.) Thanks for any guidance.
import datetime
import dash
import dash_html_components as html
from dash.dependencies import Input, Output
import dash_core_components as dcc
import plotly.graph_objects as go
trace1 = [1, 2, 3, 2, 3, 4, 5, 2]
trace2 = [3, 4, 2, 5, 3, 1, 3, 2]
dates = [datetime.date(2019, 1, 1), datetime.date(2019, 1, 2), datetime.date(2019, 1, 3), datetime.date(2019, 1, 4),
         datetime.date(2019, 1, 5), datetime.date(2019, 1, 6), datetime.date(2019, 1, 7), datetime.date(2019, 1, 8)]
def serve_layout():
    return html.Div(children=[
        dcc.Graph(id="bar-chart"),
        html.Div(" Select the starting date:"),
        html.Br(),
        html.Div(dcc.Slider(
            id='slider',
            min=0,
            max=len(dates) - 1,
            value=0,
            updatemode='drag'))
    ])
app = dash.Dash(__name__)
@app.callback(
    Output("bar-chart", "figure"),
    [Input("slider", "value")])
def update_bar_chart(day):
    # Rebuild both stacked traces from the selected starting day onward
    fig = go.Figure(data=[
        go.Bar(name='Trace 1', x=dates[day:], y=trace1[day:], marker_color='blue'),
        go.Bar(name='Trace 2', x=dates[day:], y=trace2[day:], marker_color='red')])
    fig.update_layout({'barmode': 'stack'})
    return fig
app.layout = serve_layout
if __name__ == '__main__':
    app.run_server()
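One possible way to avoid the server round trip, sketched here under assumptions rather than as a confirmed fix: precompute one plain-dict figure per slider position, keep that list in a dcc.Store, and let a clientside callback pick the entry for the current slider value. The names `make_figure` and `FIGURES` and the store id `figures` are hypothetical. Only the pure-Python precomputation is runnable below; the Dash wiring is described in comments.

```python
import datetime
import json

trace1 = [1, 2, 3, 2, 3, 4, 5, 2]
trace2 = [3, 4, 2, 5, 3, 1, 3, 2]
dates = [datetime.date(2019, 1, d) for d in range(1, 9)]


def make_figure(day):
    """Build the stacked bar figure for one slider position as a plain dict."""
    x = [d.isoformat() for d in dates[day:]]  # ISO strings are JSON-serializable
    return {
        "data": [
            {"type": "bar", "name": "Trace 1", "x": x, "y": trace1[day:],
             "marker": {"color": "blue"}},
            {"type": "bar", "name": "Trace 2", "x": x, "y": trace2[day:],
             "marker": {"color": "red"}},
        ],
        "layout": {"barmode": "stack"},
    }


# One figure per slider value. In the layout this list could be stored as
# dcc.Store(id="figures", data=FIGURES) (hypothetical id).
FIGURES = [make_figure(day) for day in range(len(dates))]

# The clientside callback would then only index into the stored list, e.g.
#   app.clientside_callback(
#       "function(day, figures) { return figures[day]; }",
#       Output("bar-chart", "figure"),
#       Input("slider", "value"),
#       State("figures", "data"),
#   )
payload = json.dumps(FIGURES)  # checks that the store contents serialize cleanly
```

The trade-off is memory: the browser holds every precomputed figure, so this only suits a modest number of slider positions and data points.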

Related

Dynamic update open-street-map position mark in plotly.graph_objects in python ( dash )

In the dash application, I have a map displayed with the center in my position and on it I have a point (mark) in the center again as my position. However, I need to dynamically change my position on the map.
The map itself changes its center without problems, but the point (marker) does not, as if it is anchored in its original position and does not move to a new position (the new center of the map).
Here is a simple piece of code that shows it nicely. After running it, the map is redrawn every 2 seconds as I want, but the point remains in its last (original) position. Does anyone know what I'm doing wrong?
import random
import dash
from dash import dcc,html
from dash.dependencies import Input, Output
import plotly.graph_objects as go
def get_fig():
    # random position as center
    lat = 48.8 + 1/random.randrange(300, 800)
    lon = 20.1 + 1/random.randrange(300, 800)
    fig = go.Figure()
    # blue point - should be in the center of the map, why is it not?
    fig.add_trace(
        go.Scattermapbox(
            lat=[lat],
            lon=[lon],
            mode="markers",
            marker=go.scattermapbox.Marker(size=20),
        )
    )
    # open street map with center at lat:lon
    fig.update_layout(
        height=552,
        width=640,
        margin={"r": 2, "t": 2, "b": 2, "l": 2},
        autosize=True,
        mapbox=dict(style="open-street-map", center=dict(lat=lat, lon=lon), zoom=16),
    )
    return fig
app = dash.Dash()
@app.callback(
    Output("graph", "figure"),
    Input("my_interval", "n_intervals"))
def update_bar_chart(dummy):
    return get_fig()
app.layout = html.Div([
    dcc.Graph(id="graph"),
    dcc.Interval(id="my_interval", interval=2000, n_intervals=0, disabled=False),
])
app.run_server(debug=True, use_reloader=False)
I use Python 3.9, Plotly 5.11.0 and Dash 2.7.0.
I've tried several different ways, but they all fail in the same way.

Detecting pattern in OHLC data in Python [closed]

Closed. This question needs to be more focused. It is not currently accepting answers.
Closed 2 years ago.
I have the following set of OHLC data:
[[datetime.datetime(2020, 7, 1, 6, 30), '0.00013449', '0.00013866', '0.00013440', '0.00013857', '430864.00000000', 1593579599999, '59.09906346', 1885, '208801.00000000', '28.63104974', '0', 3.0336828016952944], [datetime.datetime(2020, 7, 1, 7, 0), '0.00013854', '0.00013887', '0.00013767', '0.00013851', '162518.00000000', 1593581399999, '22.48036621', 809, '78014.00000000', '10.79595625', '0', -0.02165439584236435], [datetime.datetime(2020, 7, 1, 7, 30), '0.00013851', '0.00013890', '0.00013664', '0.00013780', '313823.00000000', 1593583199999, '43.21919087', 1077, '157083.00000000', '21.62390537', '0', -0.5125983683488642], [datetime.datetime(2020, 7, 1, 8, 0), '0.00013771', '0.00013818', '0.00013654', '0.00013707', '126925.00000000', 1593584999999, '17.44448931', 428, '56767.00000000', '7.79977280', '0', -0.46474475346744676], [datetime.datetime(2020, 7, 1, 8, 30), '0.00013712', '0.00013776', '0.00013656', '0.00013757', '62261.00000000', 1593586799999, '8.54915420', 330, '26921.00000000', '3.69342184', '0', 0.3281796966161107], [datetime.datetime(2020, 7, 1, 9, 0), '0.00013757', '0.00013804', '0.00013628', '0.00013640', '115154.00000000', 1593588599999, '15.80169390', 510, '52830.00000000', '7.24924784', '0', -0.8504761212473579], [datetime.datetime(2020, 7, 1, 9, 30), '0.00013640', '0.00013675', '0.00013598', '0.00013675', '66186.00000000', 1593590399999, '9.02070446', 311, '24798.00000000', '3.38107106', '0', 0.25659824046919455], [datetime.datetime(2020, 7, 1, 10, 0), '0.00013655', '0.00013662', '0.00013577', '0.00013625', '56656.00000000', 1593592199999, '7.71123423', 367, '27936.00000000', '3.80394497', '0', -0.2196997436836377], [datetime.datetime(2020, 7, 1, 10, 30), '0.00013625', '0.00013834', '0.00013625', '0.00013799', '114257.00000000', 1593593999999, '15.70194874', 679, '56070.00000000', '7.70405037', '0', 1.2770642201834814], [datetime.datetime(2020, 7, 1, 11, 0), '0.00013812', '0.00013822', '0.00013630', '0.00013805', '104746.00000000', 1593595799999, 
'14.39147417', 564, '46626.00000000', '6.39959586', '0', -0.05068056762237037], [datetime.datetime(2020, 7, 1, 11, 30), '0.00013805', '0.00013810', '0.00013720', '0.00013732', '37071.00000000', 1593597599999, '5.10447229', 231, '16349.00000000', '2.25258584', '0', -0.5287939152480996], [datetime.datetime(2020, 7, 1, 12, 0), '0.00013733', '0.00013741', '0.00013698', '0.00013724', '27004.00000000', 1593599399999, '3.70524540', 161, '15398.00000000', '2.11351192', '0', -0.06553557125171522], [datetime.datetime(2020, 7, 1, 12, 30), '0.00013724', '0.00013727', '0.00013687', '0.00013717', '27856.00000000', 1593601199999, '3.81864840', 140, '11883.00000000', '1.62931445', '0', -0.05100553774411102], [datetime.datetime(2020, 7, 1, 13, 0), '0.00013716', '0.00013801', '0.00013702', '0.00013741', '83867.00000000', 1593602999999, '11.54964001', 329, '42113.00000000', '5.80085155', '0', 0.18226888305628908], [datetime.datetime(2020, 7, 1, 13, 30), '0.00013741', '0.00013766', '0.00013690', '0.00013707', '50299.00000000', 1593604799999, '6.90474065', 249, '20871.00000000', '2.86749244', '0', -0.2474346845207872], [datetime.datetime(2020, 7, 1, 14, 0), '0.00013707', '0.00013736', '0.00013680', '0.00013704', '44745.00000000', 1593606599999, '6.13189248', 205, '14012.00000000', '1.92132206', '0', -0.02188662727072625], [datetime.datetime(2020, 7, 1, 14, 30), '0.00013704', '0.00014005', '0.00013703', '0.00013960', '203169.00000000', 1593608399999, '28.26967457', 904, '150857.00000000', '21.00600041', '0', 1.8680677174547595]]
That looks like this:
I'm trying to detect a pattern that looks like the one above in other sets of OHLC data. It doesn't have to be the same, it only needs to be similar, i.e. the number of candles doesn't have to be the same. Just the shape needs to be similar.
The problem:
I don't know where to start to accomplish this. I know it's not easy to do, but I'm sure there is a way to do this.
What I have tried:
Until now, I only managed to cut away manually the OHLC data that I don't need, so that I can only have the patterns I want. Then, I plotted it using a Pandas dataframe:
import mplfinance as mpf
import numpy as np
import pandas as pd
df = pd.DataFrame([x[:6] for x in OHLC],
                  columns=['Date', 'Open', 'High', 'Low', 'Close', 'Volume'])
format = '%Y-%m-%d %H:%M:%S'
df['Date'] = pd.to_datetime(df['Date'], format=format)
df = df.set_index(pd.DatetimeIndex(df['Date']))
df["Open"] = pd.to_numeric(df["Open"],errors='coerce')
df["High"] = pd.to_numeric(df["High"],errors='coerce')
df["Low"] = pd.to_numeric(df["Low"],errors='coerce')
df["Close"] = pd.to_numeric(df["Close"],errors='coerce')
df["Volume"] = pd.to_numeric(df["Volume"],errors='coerce')
mpf.plot(df, type='candle', figscale=2, figratio=(50, 50))
What I thought: A possible solution to this problem is using neural networks, so I would have to feed images of the patterns I want to an NN and let it loop through other charts to see if it can find the patterns I specified. Before going this way, I was looking for simpler solutions, since I don't know much about neural networks, and I don't know what kind of NN I would need or what tools I would be supposed to use.
Another solution I was thinking about was the following: I would need to somehow convert the pattern I want to find in other datasets into a series of values. So, for example, the OHLC data I posted above would be quantified somehow, and in another set of OHLC data I would just need to find values that get close to the pattern I want. This approach is only a rough idea for now, and I don't know how to put it in code.
A tool I was suggested to use: Stumpy
What I need:
I don't need the exact code. I only need an example, an article, a library or any other source that can point me toward how to detect a pattern that I specify in an OHLC data set. I hope I was specific enough; any kind of advice is appreciated!
Stumpy will work for you.
Basic Methodology
The basic gist of the algorithm is to compute a matrix profile of a data stream, and then use that to find areas that are similar. (You can think of the matrix profile as a sliding window that gives a rating of how closely two patterns match, using z-normalized Euclidean distance.)
This article explains matrix profiles in a pretty straightforward way. Here's an excerpt that explains what you want:
Simply put, a motif is a repeated pattern in a time series and a discord is an anomaly.
With the Matrix Profile computed, it is simple to find the top-K number of motifs or
discords. The Matrix Profile stores the distances in Euclidean space meaning that a
distance close to 0 is most similar to another sub-sequence in the time series and a
distance far away from 0, say 100, is unlike any other sub-sequence. Extracting the lowest
distances gives the motifs and the largest distances gives the discords.
The benefits of using a matrix profile can be found here.
The gist of what you want to do is compute the matrix profile, then look for minima. Minima mean the sliding window matched another place well.
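To make the "sliding window plus z-normalized Euclidean distance" idea concrete, here is a minimal plain-Python sketch. It illustrates the distance measure only and is not how stumpy computes the profile internally (stumpy uses much faster algorithms). The helper names and the toy numbers are made up for illustration.

```python
import math


def z_normalize(values):
    """Rescale a window to zero mean and unit standard deviation."""
    mean = sum(values) / len(values)
    std = math.sqrt(sum((v - mean) ** 2 for v in values) / len(values))
    if std == 0:
        return [0.0] * len(values)  # flat window: no shape to compare
    return [(v - mean) / std for v in values]


def distance_profile(query, series):
    """Distance from the query to every window of the same length in series."""
    m = len(query)
    q = z_normalize(query)
    profile = []
    for start in range(len(series) - m + 1):
        w = z_normalize(series[start:start + m])
        profile.append(math.sqrt(sum((a - b) ** 2 for a, b in zip(q, w))))
    return profile


series = [5, 5, 6, 5, 10, 30, 20, 5, 5, 6, 5, 5]  # toy data
query = [1, 3, 2]  # same up-then-down shape as 10, 30, 20 at a different scale

profile = distance_profile(query, series)
# The minimum of the profile marks the window that matches the query best
best = min(range(len(profile)), key=profile.__getitem__)
```

Because each window is z-normalized before comparison, the query matches the window starting at index 4 even though the absolute values differ by a factor of ten.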
This example shows how to use it to find repeating patterns in one data set:
To reproduce their results myself, I navigated to the DAT file and downloaded it myself, then opened and read it instead of using their broken urllib calls to get the data.
Replace
context = ssl.SSLContext() # Ignore SSL certificate verification for simplicity
url = "https://www.cs.ucr.edu/~eamonn/iSAX/steamgen.dat"
raw_bytes = urllib.request.urlopen(url, context=context).read()
data = io.BytesIO(raw_bytes)
with
steam_df = None
with open("steamgen.dat", "r") as data:
    steam_df = pd.read_csv(data, header=None, sep=r"\s+")
I also had to add some plt.show() calls since I ran it outside of Jupyter. With those tweaks, you can run their example and see how it works.
Here's the full code I used, so you don't have to repeat what I did:
import pandas as pd
import stumpy
import numpy as np
import matplotlib.pyplot as plt
from matplotlib.patches import Rectangle
import urllib
import ssl
import io
import os
def change_plot_size(width, height, plt):
    fig_size = plt.rcParams["figure.figsize"]
    fig_size[0] = width
    fig_size[1] = height
    plt.rcParams["figure.figsize"] = fig_size
    plt.rcParams["xtick.direction"] = "out"
change_plot_size(20, 6, plt)
colnames = ["drum pressure", "excess oxygen", "water level", "steam flow"]
# Read the locally downloaded steamgen.dat instead of fetching it with urllib
steam_df = None
with open("steamgen.dat", "r") as data:
    steam_df = pd.read_csv(data, header=None, sep=r"\s+")
steam_df.columns = colnames
steam_df.head()
plt.suptitle("Steamgen Dataset", fontsize="25")
plt.xlabel("Time", fontsize="20")
plt.ylabel("Steam Flow", fontsize="20")
plt.plot(steam_df["steam flow"].values)
plt.show()
m = 640
mp = stumpy.stump(steam_df["steam flow"], m)
true_P = mp[:, 0]
fig, axs = plt.subplots(2, sharex=True, gridspec_kw={"hspace": 0})
plt.suptitle("Motif (Pattern) Discovery", fontsize="25")
axs[0].plot(steam_df["steam flow"].values)
axs[0].set_ylabel("Steam Flow", fontsize="20")
rect = Rectangle((643, 0), m, 40, facecolor="lightgrey")
axs[0].add_patch(rect)
rect = Rectangle((8724, 0), m, 40, facecolor="lightgrey")
axs[0].add_patch(rect)
axs[1].set_xlabel("Time", fontsize="20")
axs[1].set_ylabel("Matrix Profile", fontsize="20")
axs[1].axvline(x=643, linestyle="dashed")
axs[1].axvline(x=8724, linestyle="dashed")
axs[1].plot(true_P)
def compare_approximation(true_P, approx_P):
    fig, ax = plt.subplots(gridspec_kw={"hspace": 0})
    ax.set_xlabel("Time", fontsize="20")
    ax.axvline(x=643, linestyle="dashed")
    ax.axvline(x=8724, linestyle="dashed")
    ax.set_ylim((5, 28))
    ax.plot(approx_P, color="C1", label="Approximate Matrix Profile")
    ax.plot(true_P, label="True Matrix Profile")
    ax.legend()
    plt.show()
approx = stumpy.scrump(steam_df["steam flow"], m, percentage=0.01, pre_scrump=False)
approx.update()
approx_P = approx.P_
seed = np.random.randint(100000)
np.random.seed(seed)
approx = stumpy.scrump(steam_df["steam flow"], m, percentage=0.01, pre_scrump=False)
compare_approximation(true_P, approx_P)
# Refine the profile
for _ in range(9):
    approx.update()
approx_P = approx.P_
compare_approximation(true_P, approx_P)
# Pre-processing
approx = stumpy.scrump(
    steam_df["steam flow"], m, percentage=0.01, pre_scrump=True, s=None
)
approx.update()
approx_P = approx.P_
compare_approximation(true_P, approx_P)
Self join vs. join against target
Note that this example was a "self join", meaning it was looking for repeated patterns in its own data. You'll want to join with the target you are looking to match.
Looking at the signature of stumpy.stump shows you how to do this:
def stump(T_A, m, T_B=None, ignore_trivial=True):
    """
    Compute the matrix profile with parallelized STOMP
    This is a convenience wrapper around the Numba JIT-compiled parallelized
    `_stump` function which computes the matrix profile according to STOMP.
    Parameters
    ----------
    T_A : ndarray
        The time series or sequence for which to compute the matrix profile
    m : int
        Window size
    T_B : ndarray
        The time series or sequence that contain your query subsequences
        of interest. Default is `None` which corresponds to a self-join.
    ignore_trivial : bool
        Set to `True` if this is a self-join. Otherwise, for AB-join, set this
        to `False`. Default is `True`.
    Returns
    -------
    out : ndarray
        The first column consists of the matrix profile, the second column
        consists of the matrix profile indices, the third column consists of
        the left matrix profile indices, and the fourth column consists of
        the right matrix profile indices.
    """
What you'll want to do is pass the data (pattern) you want to look for as T_B and the larger data set you want to search in as T_A, with ignore_trivial=False because this is an AB-join rather than a self-join. The window size m specifies how large a search window you want (this will probably be the length of your T_B data, or smaller if you prefer).
Once you have the matrix profile, do a simple search for the indices of the lowest values. Each window starting at one of those indices is a good match. You may also want to define a threshold so that you only count a match if at least one value in the matrix profile falls below it.
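The "lowest values under a threshold" search could look roughly like the sketch below. It works on any list of distances, so it does not depend on stumpy. The `exclusion` parameter is an added assumption (it keeps overlapping windows around the same minimum from being reported several times), and the profile values are made up.

```python
def find_matches(profile, threshold, exclusion):
    """Return start indices of the best non-overlapping windows below threshold."""
    order = sorted(range(len(profile)), key=lambda i: profile[i])
    matches = []
    for i in order:
        if profile[i] > threshold:
            break  # candidates are sorted, so all remaining ones are worse
        # Skip windows that start too close to an already accepted match
        if all(abs(i - j) >= exclusion for j in matches):
            matches.append(i)
    return matches


profile = [4.0, 0.3, 0.2, 0.9, 3.5, 3.8, 0.1, 0.4, 4.2]  # made-up distances
matches = find_matches(profile, threshold=0.5, exclusion=3)
no_matches = find_matches(profile, threshold=0.05, exclusion=3)
```

Matches come back ordered from best to worst, and an empty list means nothing in the searched data was close enough to the pattern.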
Another thing to realize is that your data set is really several correlated data sets (Open, High, Low, Close, and Volume). You'll have to decide which you want to match. Maybe you want a good match just for the opening prices, or maybe you want a good match for all of them. You'll have to decide what a good match means and calculate the matrix for each, then decide what to do if only one or a couple of those subsets match. For example, one data set may match the opening prices well, but close prices don't match as well. Another set's volume may match and that's it. Maybe you'll want to see if the normalized prices match (meaning you'd only be looking at the shape and not the relative magnitudes, i.e. a $1 stock going to $10 would look the same as a $10 one going to $100). All of that is pretty straightforward once you can compute a matrix profile.
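One way to handle the "only some columns match" case is to compute one profile per column and then record which columns fall under the threshold at each window start. The sketch below assumes the per-column profiles already exist. The function name, the threshold and the numbers are illustrative, not taken from real data.

```python
def columns_matching(profiles, threshold):
    """For each window start, list the columns whose distance is within threshold."""
    length = min(len(p) for p in profiles.values())
    return [
        [name for name, p in profiles.items() if p[i] <= threshold]
        for i in range(length)
    ]


profiles = {  # made-up per-column distances for three window starts
    "Open": [0.2, 1.5, 0.3],
    "Close": [0.4, 1.7, 1.2],
    "Volume": [2.0, 0.1, 0.2],
}
hits = columns_matching(profiles, threshold=0.5)

# Strict rule: every column has to match
full_matches = [i for i, cols in enumerate(hits) if len(cols) == len(profiles)]
# Looser rule: only the price columns have to match
price_matches = [i for i, cols in enumerate(hits) if {"Open", "Close"} <= set(cols)]
```

Which rule counts as "a good match" is the decision described above; the code only makes each option explicit.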

3D surface plot never shows any data

The 3D surface plot in plotly never shows the data. The plot itself shows up, but nothing appears in it, as if I had plotted an empty DataFrame.
At first, I tried something like the solution I found here (Plotly Plot surface 3D not displayed), but had the same result: another plot with no data.
df3 = pd.DataFrame({'x':[1, 2, 3, 4, 5],'y':[10, 20, 30, 40, 50],'z': [5, 4, 3, 2, 1]})
iplot(dict(data=[Surface(x=df3['x'], y=df3['y'], z=df3['z'])]))
And so I tried the code from the plotly website (the first cell of this notebook: https://plot.ly/python/3d-scatter-plots/), exactly as it is there, just to see if their example worked, but I get an error.
I am getting this:
https://lh3.googleusercontent.com/sOxRsIDLVkBGKTksUfVqm3HtaSQAN_ybQq2HLA-aclzEU-9ekmvd1ETdfsswC2SdbysizOI=s151
But I should get this:
https://lh3.googleusercontent.com/5Hy2Z-97_vwd3ftKBA6dYZfikJHnA-UMEjd3PHvEvdBzw2m2zeEHBtneLC1jzO3RmE2lyw=s151
Observation: could not post the images because of lack of reputation.
In order to plot a surface you have to provide a z value for every (x, y) grid point. In this case your x and y are series of size 5, which means that your z should have shape (5, 5).
If I had a bit more info I could give you more details, but for a minimal working example try passing a (5, 5) dataframe, numpy array or even a list of lists as the z value of data.
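As a small illustration of what a (5, 5) z looks like, the sketch below fills the grid from a hypothetical `height` function (not something from the question) using plain lists. With 1D x and y, plotly's Surface trace reads z as one row per y value and one column per x value.

```python
x = [1, 2, 3, 4, 5]
y = [10, 20, 30, 40, 50]


def height(xv, yv):
    """Hypothetical surface function, used only to fill the grid."""
    return xv + yv / 10


# One row per y value, one column per x value: z[i][j] belongs to (x[j], y[i])
z = [[height(xv, yv) for xv in x] for yv in y]

shape = (len(z), len(z[0]))  # (5, 5), the shape a surface needs here
```

The resulting list of lists can be passed as the z argument in place of the 1D column from the question.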
EDIT:
In a notebook environment the following code works for me:
from plotly import offline
from plotly import graph_objs as go
offline.init_notebook_mode(connected=False)
df3 = {'x':[1, 2, 3, 4, 5],'y':[10, 20, 30, 40, 50],'z': [[5, 4, 3, 2, 1]]*5}
offline.iplot(dict(data=[go.Surface(x=df3['x'], y=df3['y'], z=df3['z'])]))
as shown here:
I'm using plotly 3.7.0.

Plotly / Dash: Multiple filters

I would like to implement a few data filters to preselect the data by certain criteria. These filters should be diagrams themselves, i.e. a pie chart (e.g. where one can select a continent) and a timeline (e.g. where one can select a time span). Most importantly, I need to apply multiple filters from multiple diagrams without them resetting every time I filter by selecting in another diagram.
However, I do not know how to implement this. I found something old using dash.dependencies.Events, but that is not supported anymore.
Whenever I filter by a criterion in diagram A and then want to filter by another criterion from diagram B, diagram A gets reset.
Since this is probably a situation encountered by many people, and since Dash does not seem to support this natively, I wanted to ask whether anyone has a workaround for this?
//edit: Here is a simple example. I can filter by clicking on a datapoint on the bar graph above. But whenever I click on a point on the line graph below, it resets the settings from the bar graph. I want to keep both.
import datetime
import dash
import dash_core_components as dcc
import dash_html_components as html
import plotly
from dash.dependencies import Input, Output
external_stylesheets = ['https://codepen.io/chriddyp/pen/bWLwgP.css']
app = dash.Dash(__name__, external_stylesheets=external_stylesheets)
app.layout = html.Div([
    dcc.Graph(id='graph')
])
# Multiple components can update everytime interval gets fired.
@app.callback(Output('graph', 'figure'),
              [Input('graph', 'selectedData')])
def update_graph_live(input):
    print(input)
    data = {
        'x': [1,2,3,4,5],
        'y': [1,2,3,4,5],
        'a': [0,-1,-2],
        'b': [100,101,102]
    }
    # Create the graph with subplots
    fig = plotly.tools.make_subplots(rows=2, cols=1, vertical_spacing=0.2)
    fig['layout']['margin'] = {
        'l': 30, 'r': 10, 'b': 30, 't': 10
    }
    fig['layout']['legend'] = {'x': 0, 'y': 1, 'xanchor': 'left'}
    fig['layout']['clickmode'] = 'event+select'
    fig.append_trace({
        'x': data['x'],
        'y': data['y'],
        'name': 'xy',
        'type': 'bar',
    }, 1, 1)
    fig.append_trace({
        'x': data['a'],
        'y': data['b'],
        'name': 'ab',
        'mode': 'lines+markers',
        'type': 'scatter'
    }, 2, 1)
    return fig
if __name__ == '__main__':
    app.run_server(debug=True)
Right now your problem seems to be that selecting data in any of the subplots changes the selectedData of the graph component, and the figure of that same component is the output of your function. So Input('graph', 'selectedData') triggers the very callback that redraws, and thereby resets, the figure.
What you need to do is split the subplots into separate dcc.Graph components and use dash.dependencies.State to read and maintain each graph's selectedData property.
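The filtering logic behind that suggestion can be sketched in plain Python. The `rows` records, their `category` and `year` keys, and the function names are hypothetical. The only Dash-specific assumption is that selectedData is either None or a dict with a "points" list whose entries carry an "x" value. In an app, a callback taking one graph's selectedData as Input and the other's as State could call `apply_filters` and write the result to a third component.

```python
def selected_x(selected_data):
    """Return the x values in a graph's selectedData, or None if nothing is selected."""
    if not selected_data or not selected_data.get("points"):
        return None
    return {point["x"] for point in selected_data["points"]}


def apply_filters(rows, bar_selection, line_selection):
    """Keep rows that satisfy every active filter; an empty selection filters nothing."""
    categories = selected_x(bar_selection)
    years = selected_x(line_selection)
    return [
        row for row in rows
        if (categories is None or row["category"] in categories)
        and (years is None or row["year"] in years)
    ]


rows = [  # hypothetical records
    {"category": "A", "year": 2019},
    {"category": "A", "year": 2020},
    {"category": "B", "year": 2020},
]
bar = {"points": [{"curveNumber": 0, "pointNumber": 0, "x": "A", "y": 2}]}
line = {"points": [{"curveNumber": 0, "pointNumber": 1, "x": 2020, "y": 2}]}

both = apply_filters(rows, bar, line)      # both filters active
only_bar = apply_filters(rows, bar, None)  # line graph has no selection
```

Since neither filter graph is an Output of the callback, a selection in one graph is not wiped out when the other one changes.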
Thank you for your answers. I managed to find a workaround.
First of all, as @russellr mentioned, dash.dependencies.State can be passed, but it will not trigger a callback. I would like callbacks to be triggered by multiple filters without them resetting each other.
Now, from the point of view of the good people at Dash, disabling this reset would enable endless loops, so the current behaviour makes a lot of sense.
The way I did it is that I introduced Dropdown lists for filtering, and callbacks only go from the value of the Dropdown to the figure of the graph.
I select multiple conditions on the dropdowns and get the (non-interactive) visualisations from them. It might not be as pretty, but the app still got good usability feedback.

An elegant way to add long names and units to plots with Holoviews

I've started to use Holoviews with Python3 and Jupyter notebooks, and I'm looking for a good way to put long names and units on my plot axis. An example looks like this:
import holoviews as hv
import pandas as pd
from IPython.display import display
hv.notebook_extension()
dataframe = pd.DataFrame({"time": [0, 1, 2, 3],
                          "photons": [10, 30, 20, 15],
                          "norm_photons": [0.33, 1, 0.67, 0.5],
                          "rate": [1, 3, 2, 1.5]}, index=[0, 1, 2, 3])
hvdata = hv.Table(dataframe, kdims=["time"])
display(hvdata.to.curve(vdims='rate'))
This gives me a nice plot, but instead of 'time' on the x-axis and 'rate' on the y-axis, I would prefer something like 'Time (ns)' and 'Rate (1/s)', but I don't want to type that in the code every time.
I've found this blog post by PhilippJFR which kind of does what I need, but the DFrame() function which he uses is deprecated, so I would like to avoid using that, if possible. Any ideas?
Turns out it's easy to do but hard to find in the documentation. You just pass a holoviews.Dimension instead of a string as the kdims parameter:
hvdata = hv.Table(dataframe, kdims=[hv.Dimension('time', label='Time', unit='ns')])
display(hvdata.to.curve(vdims=hv.Dimension('rate', label='Rate', unit='1/s')))
You can find good alternatives in this SO question:
Setting x and y labels with holoviews
I like doing it like this:
Creating a tuple with the name of the variable and the long name you would like to see printed on the plot:
hvdata = hv.Table(
    dataframe,
    kdims=[('time', 'Time (ns)')],
    vdims=[('rate', 'Rate (1/s)')],
)
