I have a large dataset of around 500 parquet files with about 42 million samples.
In order to read those files I'm using Dask which does a great job.
In order to display them, I downsampled the Dask DataFrame in the most basic way (something like dd[::200]) and plotted it using Plotly.
So far everything works great.
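As a rough illustration of what the basic stride downsampling above keeps and drops, here is a minimal pure-Python sketch. It is not what Dask or plotly-resampler do internally; the helper names, the bucket size and the synthetic signal are all hypothetical and chosen only for the demonstration.

```python
def stride_downsample(values, step):
    """Keep every `step`-th sample, which is what values[::step] does."""
    return values[::step]


def minmax_downsample(values, bucket_size):
    """Keep the minimum and maximum of each bucket, in their original order."""
    out = []
    for start in range(0, len(values), bucket_size):
        bucket = values[start:start + bucket_size]
        lo_idx = min(range(len(bucket)), key=bucket.__getitem__)
        hi_idx = max(range(len(bucket)), key=bucket.__getitem__)
        # A set avoids emitting the same sample twice when min and max coincide.
        for idx in sorted({lo_idx, hi_idx}):
            out.append(bucket[idx])
    return out


# Synthetic data (an assumption for the demo): flat signal with one spike.
signal = [0.0] * 1000
signal[333] = 5.0

strided = stride_downsample(signal, 200)    # samples 0, 200, 400, 600, 800
bucketed = minmax_downsample(signal, 200)   # min and max of each block of 200
```

With this input the stride version misses the spike entirely, while the per-bucket min/max version retains it at the cost of a few more points.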
Now, I would like to have an interactive figure, but I don't want it to open in a browser tab, use Jupyter, or anything of that kind. I just want it to open a figure window the way matplotlib does.
In order to do so, I found a great solution that uses QWebEngineView:
plotly: how to make a standalone plot in a window?
My simplified code looks something like this:
import dask.dataframe as dd
import time
import plotly.graph_objects as go

def show_in_window(fig):
    import sys, os
    import plotly.offline
    from PyQt5.QtCore import QUrl
    from PyQt5.QtWebEngineWidgets import QWebEngineView
    from PyQt5.QtWidgets import QApplication
    plotly.offline.plot(fig, filename='temp.html', auto_open=False)
    app = QApplication(sys.argv)
    web = QWebEngineView()
    file_path = os.path.abspath(os.path.join(os.path.dirname(__file__), "temp.html"))
    web.load(QUrl.fromLocalFile(file_path))
    web.show()
    sys.exit(app.exec_())

def create_plot(df,x_param,y_param):
    fig.add_trace(go.Scattergl(x = df[x_param] , y = df[y_param], mode ='markers'))

fig = go.Figure()
ddf = dd.read_parquet("results_parq/*.parquet")
create_data_for_plot(ddf,'t','reg',1)
fig.update_layout(showlegend=False)
show_in_window(fig)
QUESTION:
Since the dataset is large and I want to use a smarter downsample method, I would like to use a library called plotly-resampler (https://predict-idlab.github.io/plotly-resampler/getting_started.html#how-to-use) which dynamically changes the amount of samples based on the zooming level. However, it uses Dash.
I thought to do something like:
fig = FigureResampler(go.Figure())
ddf = dd.read_parquet("results_parq/*.parquet")
create_data_for_plot(ddf,'t','reg',1)
show_in_window(fig)
This creates a plot with the smart resampling, but the resampling does not change when the zoom level changes (it is basically stuck on its initial sampling).
Is there any solution that would give me a Dash figure in a separate window instead of a browser tab, while keeping the functionality of Dash?
Thank you
I believe you could store it in a local file, then use the code
import webbrowser
new = 1 # open in a new browser window, if possible (2 would request a new tab)
# open an HTML file on my own (Windows) computer
url = "file://d/testdata.html"
webbrowser.open(url,new=new)
to open it in a new window
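A hand-written file:// URL is easy to get wrong, so one option is to let pathlib build it. The following is only a sketch of that idea: the helper names are hypothetical, and "temp.html" is simply the file name used in the question.

```python
import webbrowser
from pathlib import Path


def local_file_url(path):
    """Build a file:// URL from a local path, resolving it to an absolute path first."""
    return Path(path).resolve().as_uri()


def open_local_html(path):
    """Ask the default browser to open the file in a new window (new=1)."""
    return webbrowser.open(local_file_url(path), new=1)


# Only the URL is built here; call open_local_html("temp.html") to launch the browser.
url = local_file_url("temp.html")
```

Whether a separate window or a tab actually appears still depends on the browser, since webbrowser only passes the request along.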
I have an ipywidgets.interact slider bar on a long-ish running process. This creates a situation where, when I move the slider bar, several values get buffered and I sit and wait for a while for the output to "catch up" to the point to which I've moved the slider bar. I'd like to set the number of values that get buffered when I use the slider bar.
Example:
from ipywidgets import interact
import matplotlib.pyplot as plt
import cv2
from skimage import io
image = io.imread('https://www.wikipedia.org/portal/wikipedia.org/assets/img/Wikipedia-logo-v2.png')
@interact
def edges(low=100, high=150, aperture=3):
    plt.imshow(cv2.Canny(image, low, high, apertureSize=aperture))
Try moving the slider around and watch the image continue updating for a while after you stop. I'm on a laptop, so your mileage may vary if you have a beast of a machine.
How can I set the "framerate" of the interact function?
The continuous_update setting is what you want to disable for the sliders. However, I'm not 100% sure you can use it with the simple decorator approach though? Did you try this:
from ipywidgets import interact
import matplotlib.pyplot as plt
import cv2
from skimage import io
image = io.imread('https://www.wikipedia.org/portal/wikipedia.org/assets/img/Wikipedia-logo-v2.png')
@interact(continuous_update=False)
def edges(low=100, high=150, aperture=3):
    plt.imshow(cv2.Canny(image, low, high, apertureSize=aperture))
I tried it and it runs without complaining about the @interact(continuous_update=False) line. However, it isn't slow for me even without that setting, so it is hard to tell whether it has the desired effect.
The setting is certainly available for your sliders if you define them yourself instead of relying on @interact to create the sliders automatically.
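Defining the sliders yourself could look roughly like the sketch below. The slider ranges are assumptions made for illustration, and the body of edges is a placeholder so the snippet runs without OpenCV or an image; in the notebook it would be the plt.imshow(cv2.Canny(...)) call from the question.

```python
from ipywidgets import IntSlider, interactive


def edges(low=100, high=150, aperture=3):
    # Placeholder body (assumption): the real function would draw the Canny edges.
    return (low, high, aperture)


# continuous_update=False makes a slider report its value only when it is released.
# The min/max/step values below are illustrative guesses, not taken from the question.
sliders = {
    "low": IntSlider(value=100, min=0, max=255, continuous_update=False),
    "high": IntSlider(value=150, min=0, max=255, continuous_update=False),
    "aperture": IntSlider(value=3, min=3, max=7, step=2, continuous_update=False),
}

# interactive() builds the widget without displaying it; in a notebook,
# display(widget) or interact(edges, **sliders) would show the controls.
widget = interactive(edges, **sliders)
```

With this setup the function should be called once per slider release rather than once per intermediate value.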
I am trying to backtest stock data using the backtrader library in Python, and I use this simple strategy:
class CrossOver(bt.SignalStrategy):
    def __init__(self):
        sma=bt.ind.SMA(period=50)
        price=self.data
        crossover=bt.ind.CrossOver(price,sma)
        self.signal_add(bt.SIGNAL_LONG,crossover)
Then I run it and try to plot it and display in streamlit
cerebro=bt.Cerebro()
cerebro.addstrategy(CrossOver)
cerebro.adddata(data)
cerebro.run()
pl=cerebro.plot()
st.pyplot(pl)
But I am not able to see the graph in Streamlit. Does anyone know how to display backtrader's graph in Streamlit? Thanks in advance.
I'm not that familiar with backtrader, so I took an example from their documentation on how to create a plot. The data used in the plot can be downloaded from their GitHub repository.
The solution contains the following steps:
1. Make sure we use a matplotlib backend that doesn't display plots to the user, because backtrader's plot() function displays the plot itself and we only want it to appear in the Streamlit app. This can be done using:
matplotlib.use('Agg')
2. Get the matplotlib figure from backtrader's plot() function. This can be done using:
figure = cerebro.plot()[0][0]
3. Display the plot in Streamlit. This can be done using:
st.pyplot(figure)
All together:
import streamlit as st
import backtrader as bt
import matplotlib
# Use a backend that doesn't display the plot to the user
# we want only to display inside the Streamlit page
matplotlib.use('Agg')
# --- Code from the backtrader plot example
# data can be found in their GitHub repo
class St(bt.Strategy):
    def __init__(self):
        self.sma = bt.indicators.SimpleMovingAverage(self.data)
data = bt.feeds.BacktraderCSVData(dataname='2005-2006-day-001.txt')
cerebro = bt.Cerebro()
cerebro.adddata(data)
cerebro.addstrategy(St)
cerebro.run()
figure = cerebro.plot()[0][0]
# show the plot in Streamlit
st.pyplot(figure)
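The two matplotlib-related steps can be tried without backtrader or Streamlit. In this sketch a hand-built nested list stands in for the value returned by cerebro.plot(), which is an assumption based on the [0][0] indexing used above; the helper name is hypothetical.

```python
import io

import matplotlib

matplotlib.use("Agg")  # non-interactive backend: nothing pops up on screen

import matplotlib.pyplot as plt


def first_figure(nested_figures):
    """Return the first figure of the first group, mirroring cerebro.plot()[0][0]."""
    return nested_figures[0][0]


fig, ax = plt.subplots()
ax.plot([0, 1, 2], [0, 1, 4])

nested = [[fig]]  # stand-in (assumption) for what cerebro.plot() returns
picked = first_figure(nested)

# Render to memory to confirm the figure is usable without a display.
buffer = io.BytesIO()
picked.savefig(buffer, format="png")
png_bytes = buffer.getvalue()
```

A figure object obtained this way is the kind of value st.pyplot expects.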
Before plotting using matplotlib, you must specify your display's DPI if you have a high DPI display, since otherwise the image is too small. I have a 4K display, so I definitely need to do this. (I think that matplotlib should automatically do this for you, but that is another topic...)
As a first attempt to specify the DPI, consider the code below. It manually specifies the display's DPI and then creates and plots a test DataFrame:
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import sys
# method #1: manually specify my display's DPI:
dpi = 163 # this value is valid for my Dell U2718Q 4K (3840 x 2160) display
plt.rcParams["figure.dpi"] = dpi
print("plt.matplotlib.rcParams[\"figure.dpi\"] = " + str(plt.matplotlib.rcParams["figure.dpi"]))
# define a test DataFrame (here, chose to calculate sin and cos over their range of 2 pi):
n = 100
x = (2 * np.pi / n) * np.arange(n)
df = pd.DataFrame(
    {
        "sin(x)": np.sin(x),
        "cos(x)": np.cos(x),
    }
)
# plot the DataFrame:
df.plot(figsize = (12, 8), title = "sin and cos", grid = True, color = ["red", "green"])
When I put the code above into a file and run it all at once in PyCharm, everything behaves exactly as expected: the script completes without error, the plot is generated at the correct size, and the plot remains open in a window after the script ends.
So far, so good.
But the code above is brittle: run it on a computer with a different display DPI, and the image will not be sized correctly.
Doing a web search, I found this link, which has code that claims to automatically determine your display's DPI. My (slight) adaptation of the code is this:
# method #2: call code to determine my display's DPI (only works if the backend is Qt)
if plt.get_backend() == "Qt5Agg":
    from matplotlib.backends.qt_compat import QtWidgets
    qApp = QtWidgets.QApplication(sys.argv)
    plt.matplotlib.rcParams["figure.dpi"] = qApp.desktop().physicalDpiX()
If I modify my file to use the code above ("method #2") instead of the manual DPI setting ("method #1"), I find that the script completes without error, but the plot only comes up for a brief instant before being automatically closed!
By successively commenting out lines in the "method #2" code, starting with the last and working backwards, I have determined that the culprit is the call to QtWidgets.QApplication(sys.argv).
In particular, if I reduce the "method #2" code to just this
if plt.get_backend() == "Qt5Agg":
    from matplotlib.backends.qt_compat import QtWidgets
    QtWidgets.QApplication(sys.argv)
I get this plot auto close behavior.
Another defect is that the original "method #2" code calculates the DPI of my monitor, a Dell U2718Q, to be 160, when it really is 163: in this link go to p. 3 / 4 and look at the Pixels per inch (PPI) spec.
Does anyone know of a solution to this?
Better code to determine the DPI?
A modification of the "method #2" code which will not cause plots to auto close?
Is this a bug that needs to be reported to matplotlib or Qt?
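The 163 figure quoted above can be sanity-checked from the panel geometry. This is a small sketch, not part of the original code: the resolution is the one given in the question, while the 27-inch diagonal is an assumption about the monitor, and the function name is hypothetical.

```python
import math


def pixels_per_inch(width_px, height_px, diagonal_in):
    """Pixel density from the pixel resolution and the diagonal size in inches."""
    return math.hypot(width_px, height_px) / diagonal_in


# 3840 x 2160 comes from the question; the 27.0 inch diagonal is an assumption.
ppi = pixels_per_inch(3840, 2160, 27.0)
print(round(ppi, 1))
```

The result rounds to 163, which matches the spec sheet value rather than the 160 reported by Qt.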
I am using pyqtgraph to plot some data and noticed that when I move the plot from my laptop screen to a second monitor, the scaling on the plot is affected:
[Screenshot: plot on the laptop monitor]
[Screenshot: plot on the external monitor]
notice that the axes got "compressed", and the plot is no longer scaled properly on the second monitor.
I found others reporting similar issues on the web, but could not find any real solution. One solution suggested was to make the monitors' resolutions the same. I don't like this solution because I'd have to sacrifice laptop resolution to accommodate my lower resolution external monitor.
The other solution I found was to add the line app.setAttribute(QtCore.Qt.AA_Use96Dpi) to the main loop, prior to instantiating the QApplication as shown below, to allegedly have Qt ignore the OS's DPI settings:
def main():
    import sys
    app = QtWidgets.QApplication(sys.argv)
    app.setAttribute(QtCore.Qt.AA_Use96Dpi)
    MainWindow = GraphWindow()
    MainWindow.show()
    sys.exit(app.exec_())
This seems to work at first, because the plotted data is scaled properly on the axes. However, it doesn't really work: adding this line changed the scaling of the axes on the laptop, as shown below (the same data is now plotted on axes that span 0 to 7000 on the x-axis and -2 to -26 dB on the y-axis):
[Screenshot: laptop plot with AA_Use96Dpi set]
It did, however, "fix" the issue when moving the plot onto the second monitor, where it now looks like the first "original" laptop plot shown above.
This is particularly worrisome, because the laptop output after the app.setAttribute(QtCore.Qt.AA_Use96Dpi) instruction "looks" right but misrepresents the actual data. I could easily have missed this had I included this instruction when I first plotted the data.
What is the right way to have the plot accurately display regardless of the OS's DPI setting and monitor resolutions? It is very strange that the plotted data seems disassociated with the axis values.
Here is a minimal reproducible sample:
from PyQt5 import QtWidgets, QtCore
from pyqtgraph import PlotWidget, plot
import pyqtgraph as pg
import sys # We need sys so that we can pass argv to QApplication
import os
from numpy.random import seed
from numpy.random import randint

class MainWindow(QtWidgets.QMainWindow):
    def __init__(self, *args, **kwargs):
        super(MainWindow, self).__init__(*args, **kwargs)
        self.graphWidget = pg.PlotWidget()
        self.setCentralWidget(self.graphWidget)
        x = [1,2,3,4,5,6,7,8,9,10]
        seed(1)
        y = randint(5,35,10)
        # plot data: x, y values
        self.graphWidget.plot(x, y)

def main():
    app = QtWidgets.QApplication(sys.argv)
    app.setAttribute(QtCore.Qt.AA_Use96Dpi)
    main = MainWindow()
    main.show()
    sys.exit(app.exec_())

if __name__ == '__main__':
    main()
The setAttribute solution never worked for me in that way and the windll manipulation makes the gui blurry...
Adding the following two lines before app = QApplication(sys.argv) solved my problem:
QApplication.setHighDpiScaleFactorRoundingPolicy(Qt.HighDpiScaleFactorRoundingPolicy.PassThrough)
QtCore.QCoreApplication.setAttribute(QtCore.Qt.AA_EnableHighDpiScaling, True)
Answers can be found here: https://github.com/pyqtgraph/pyqtgraph/issues/756
Quick Summary of this issue:
There are essentially two ways to solve this problem.
1. Make your app DPI-aware (by Androwei)
import ctypes
import platform
def make_dpi_aware():
    if int(platform.release()) >= 8:
        ctypes.windll.shcore.SetProcessDpiAwareness(True)
# add this code before "app = QtWidgets.QApplication(sys.argv)"
make_dpi_aware()
2. Set Qt.HighDpiScaleFactorRoundingPolicy to PassThrough (by andybarry)
# add this code before "app = QtWidgets.QApplication(sys.argv)"
QtWidgets.QApplication.setHighDpiScaleFactorRoundingPolicy(QtCore.Qt.HighDpiScaleFactorRoundingPolicy.PassThrough)
I have tried both, and they both work perfectly! Thanks to these contributors. Hope you can find this useful as well.
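The first option calls a Windows-only library, so a slightly more defensive variant may be useful when the same script also runs on other platforms. This is a sketch under assumptions, not the contributors' code: the constant name is mine, its value 1 is what True evaluates to in the original call, and replacing the platform.release() check with a try/except is a design choice of this example.

```python
import ctypes
import sys

PROCESS_SYSTEM_DPI_AWARE = 1  # same numeric value the original call passes as True


def make_dpi_aware():
    """Try to mark the process as DPI-aware; return True only if the call was made."""
    if sys.platform != "win32":
        return False  # shcore is a Windows-only library
    try:
        ctypes.windll.shcore.SetProcessDpiAwareness(PROCESS_SYSTEM_DPI_AWARE)
    except (AttributeError, OSError):
        # Older Windows versions may not ship shcore.dll or this function.
        return False
    return True


# As in the summary above, call this before creating the QApplication.
dpi_aware = make_dpi_aware()
```

The boolean return value only reports whether the call was attempted successfully; it does not verify how Qt scales the window afterwards.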
I am trying to embed a streaming bokeh plot into an HTML file using the autoload_server function:
from bokeh.client import push_session
from bokeh.embed import autoload_server
from bokeh.models import ColumnDataSource
from bokeh.plotting import figure, curdoc
data = dict(x=[], y=[])
source = ColumnDataSource(data)
plot = figure()
plot.circle(source=source, x='x', y='y')
counter = -1
def update_data():
    global xDate, yWind, counter
    counter += 1
    xDate = counter
    yWind = counter
    new_data_wind = dict(x=[xDate], y=[yWind])
    source.stream(new_data_wind, 300)
curdoc().add_root(plot)
curdoc().add_periodic_callback(update_data, 300)
session = push_session(curdoc())
script = autoload_server(plot, session_id=session.id)
print(script)
I basically start a bokeh server by using: "bokeh serve" and then run the code and insert the given script into an HTML file.
At first, no plot would be displayed, but after adding --allow-websocket-origin=localhost:63342 to the bokeh serve command, the page would show the plot grid, but no data is displayed.
Does someone have an idea as to why the data streaming function doesn't seem to work or what I can change to make the embedded plot stream the data?
I'm thankful for any further input, since I have yet to find any on the Internet.
EDIT
I've found the solution to my problem and will leave it here if anyone encounters something similar:
The code fragment:
session.loop_until_closed()
needs to be added to the end of the example above, so the session is looped and the final plot gets updated inside the browser.
I'll just post my answer as seen above, so this won't show up as an unanswered question anymore:
The code fragment:
session.loop_until_closed()
needs to be added to the end of the example above, so the session is looped and the final plot gets updated inside the browser.