Plotly figures in jupyter notebook take huge amounts of memory

I may be doing something really stupid, but I've been using plotly offline in my Jupyter notebook with:
import numpy as np
import plotly.offline as py
py.init_notebook_mode(connected=True)
from plotly.graph_objs import *
I'm trying to display a sequence of images that can be navigated with a slider. The entire numpy array with the image data is 50 images x 64 wide x 64 tall.
I put that into the following slider function, which I pieced together from code I found online. The Figure object that's returned is not very large. However, once plotly's iplot is called, the size of my Jupyter notebook on disk (as measured by ls -l) is really big, about 15 MB, even though the numpy source data is only about 1 MB. This becomes unmanageable for larger or multiple figures. Does anyone know what's going on?
def slider_ims(imgs):
    imgs = np.flip(imgs, 1)
    data = [dict(
        type='heatmap',
        z=imgs[step, :, :],
        visible=False,
        showscale=False,
        xaxis="x",
        yaxis="y",
        name='z = ' + str(step)) for step in np.arange(imgs.shape[0])]
    data[0]['visible'] = True
    steps = []
    for i in range(len(data)):
        step = dict(
            method='restyle',
            args=['visible', [False] * len(data)],
            label=str(i)
        )
        step['args'][1][i] = True  # Toggle i'th trace to "visible"
        steps.append(step)
    sliders = [dict(
        active=0,
        currentvalue={"prefix": "Frame: "},
        pad={"t": 50},
        steps=steps,
        ticklen=0,
        minorticklen=0
    )]
    layout = Layout(
        sliders=sliders,
        font=Font(family='Balto'),
        width=800,
        height=600,
    )
    fig = Figure(data=data, layout=layout)
    py.iplot(fig)
    return fig

You want smaller ipynb files? Don't store output cells.
If you only care about the on-disk size of your notebooks, you can change your Jupyter configuration to stop writing output cells to the ipynb file, so that only your code is saved on disk. Whenever you open a notebook, the output cells will be empty and you need to re-run the notebook to get them back. You have to decide whether this fits with how you use notebooks.
You can set this up by editing your jupyter_notebook_config.py configuration file, which is typically located in your home directory under ~/.jupyter (Windows: C:\Users\USERNAME\.jupyter\). If it does not exist yet, you can generate it from the terminal with jupyter notebook --generate-config.
In this configuration file, add a pre-save hook that strips output cells before saving, as described in the documentation:
def scrub_output_pre_save(model, **kwargs):
    """scrub output before saving notebooks"""
    # only run on notebooks
    if model['type'] != 'notebook':
        return
    # only run on nbformat v4
    if model['content']['nbformat'] != 4:
        return
    for cell in model['content']['cells']:
        if cell['cell_type'] != 'code':
            continue
        cell['outputs'] = []
        cell['execution_count'] = None
c.FileContentsManager.pre_save_hook = scrub_output_pre_save
Bonus benefit: Stripping output cells like this is also a great way to get readable diffs for source control, e.g. git.
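For a notebook that was already saved with its outputs, the same scrubbing logic can be applied to the file directly. The following is only a sketch using the standard library: strip_outputs is a hypothetical helper (not part of Jupyter), and it assumes the file is a plain nbformat v4 JSON notebook.

```python
import json
from pathlib import Path


def strip_outputs(path):
    """Remove outputs from every code cell of an nbformat v4 notebook, in place.

    Hypothetical helper: mirrors the pre-save hook above, but works on a
    notebook file that already exists on disk.
    """
    path = Path(path)
    nb = json.loads(path.read_text(encoding="utf-8"))
    # Only touch nbformat v4 notebooks, like the hook does
    if nb.get("nbformat") != 4:
        return False
    for cell in nb.get("cells", []):
        if cell.get("cell_type") != "code":
            continue
        cell["outputs"] = []
        cell["execution_count"] = None
    path.write_text(json.dumps(nb, indent=1) + "\n", encoding="utf-8")
    return True
```

Calling strip_outputs("my_notebook.ipynb") (the filename is a placeholder) rewrites that file without outputs, so keep a backup if the outputs still matter.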

Plotly figures are generally large. Your notebook grew because the inline plot (py.iplot) saves the figure inside the notebook itself.
If you don't want your notebook to be so large, use the regular plot function (py.plot) instead, which saves the figure to a separate file.
See the plotly documentation for details.
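One plausible contributor to the size (an assumption, not something this thread confirms) is that the figure is stored in the notebook as JSON text, and decimal text takes more room than 8-byte binary floats. The sketch below uses only the standard library, with random values standing in for the 50 x 64 x 64 image data; json_vs_binary_size is a hypothetical helper. It does not by itself account for the full 15 MB.

```python
import json
import random


def json_vs_binary_size(n_frames=50, height=64, width=64, seed=0):
    """Compare float64 binary size with JSON text size for the same values."""
    rng = random.Random(seed)
    # Nested lists standing in for the (frames, rows, cols) image array
    frames = [[[rng.random() for _ in range(width)]
               for _ in range(height)]
              for _ in range(n_frames)]
    n_values = n_frames * height * width
    binary_bytes = n_values * 8            # 8 bytes per float64 value
    json_bytes = len(json.dumps(frames))   # ASCII output, so chars == bytes
    return binary_bytes, json_bytes


binary_bytes, json_bytes = json_vs_binary_size()
print(f"binary: {binary_bytes / 1e6:.2f} MB, JSON text: {json_bytes / 1e6:.2f} MB")
```

If the data allows it, storing fewer digits per value (for example integer pixel values) would shrink the JSON text accordingly.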

In my case, I used the following to avoid saving the plot along with the notebook. I prefer to save the figure as a separate HTML page to keep the notebook small.
import plotly.offline as pyo
pyo.plot(fig, filename="example.html")

Related

saving a sankey to a file from jupyter

I have a loop where I want to render some sankeys to .png files.
Checking some docs here:
https://nbviewer.jupyter.org/github/ricklupton/ipysankeywidget/blob/master/examples/Exporting%20Images.ipynb
I would expect sankey.save_svg('test.svg') to work.
From reading those docs, there's a caveat that trying to display and then save won't work, but I'm not trying to display, I just want to save a list of images. However I'm getting the same warnings.warn('No png image available! Try auto_save_png() instead?') error and cannot save.
If I run one line at a time, and return the sankey and let it display in the normal run of a python notebook, things work ok... So it seems there's something that happens when you let the notebook display a sankey that isn't happening when I'm just trying in a loop to render them to files.
from ipywidgets import Layout
from ipysankeywidget import SankeyWidget
def draw_one(use_case, limit=50):
    df = Cli.query_paths(use_case=use_case, limit=limit)
    layout = Layout(width="2000", height="500")
    fpath = f'./data/ignored/images/{use_case}-{limit}.png'
    # with or without: .auto_save_png(fpath)
    sankey = SankeyWidget(links=df.to_dict('records'), layout=layout)
    sankey.save_png(fpath)
cases = [
    'INTL',
    'PAC',
]
def loop():
    for use_case in cases:
        print('sk:', use_case)
        sk = draw_one(use_case, limit=50)
loop()

Plotly line graphs are buggy and lose their interactivity when rendered in a Panel dashboard

I am currently creating a dashboard using Plotly Express and Panel.
The charts work fine when created with Plotly, but once rendered in the dashboard they become buggy: no hover information pops up when hovering over data points, the Plotly tools such as zoom, pan and reset disappear, and my mouse cursor turns into the double-sided horizontal arrow you would normally see when hovering over a scroll bar.
I will include enough code to recreate one of the charts inside and outside the dashboard, so anyone who wants to can test whether they get the same result.
from pathlib import Path
import pandas as pd
import panel as pn
import plotly.express as px
import requests
# Getting BTC TWh csv straight from the download link so that whenever the notebook is run, the data is the latest available
req = requests.get('https://static.dwcdn.net/data/cFnri.csv')
url_content = req.content
# Write inside a context manager so the file is flushed and closed before it is read back
with open('Resources/btc_twh.csv', 'wb') as csv_file:
    csv_file.write(url_content)
# Creating BTC TWh dataframe from csv
btc_twh_df = pd.read_csv(Path("Resources/btc_twh.csv"))
# Changing column names for readability and replacing '/' with '-' for plotting reasons
btc_twh_df.columns = ["date", "btc_estimated", "btc_minimum"]
btc_twh_df["date"] = btc_twh_df["date"].str.replace("/", "-")
btc_twh_df.head()
# Plot ETH/BTC TWh with Plotly
btc_twh_chart = px.line(
    btc_twh_df,
    y="btc_estimated",
    title="BTC Energy Usage",
    color_discrete_map={"btc_estimated": "orange"},
    labels={"btc_estimated": "BTC"})
btc_twh_chart.update_layout(xaxis_title="", yaxis_title="Energy Usage in TWh", template="plotly_dark", height=600, width=1000)
btc_twh_chart.show()
###################
# Plot chart within the dashboard
# TWh panes and column
twh_pane = pn.pane.Plotly(btc_twh_chart)
# Dashboard
dashboard = pn.Tabs(
    ("Welcome", twh_pane)
)
dashboard.servable()
See how that goes and let me know, it might be my 2017 Macbook Pro, although I haven't had processor issues so far, and my gut tells me it feels more like a bug.
If it works fine and you feel like investigating further here is my repo
https://github.com/kez4twez/mining_energy_usage
Feel free to clone it and try and run the whole dashboard.ipynb file and see what happens.
Thanks

Python plotly sankey export broken

I have a python sankey chart which works well when exporting the html but looks completely broken when exporting it to other file formats
import plotly.graph_objects as go
fig = go.Figure(data=[go.Sankey(
node = dict(label = data["label"]),
link = dict(source = data["source"],target = data["target"],value = data["value"])
)])
fig.write_image("sankey.svg")
fig.write_image("sankey.eps")
fig.write_image("sankey.png")
fig.write_html("sankey.html")
HTML Screenshot
PNG Export (SVG, EPS differ a bit but also look broken)
I'm using python 3.8.5 with the kaleido 0.0.3 engine.
Additionally, I've tried Orca 1.2.1 but got the same results.

The answer is actually very easy. Though most charts can figure out the required size on their own, the sankey chart apparently can't. So you just have to set explicit dimensions for every export of a sankey chart (yes, even for vector formats like eps and svg).
Also worth mentioning: a minimum size is required. While my example now looks fine at 1920x1080, a size of 1280x720 looks broken even with vector graphics.
fig = go.Figure(...)
fig.update_layout(width=1920, height=1080)
fig.write_image(...)

How to show the full graph in a jupyter cell and export the full image instead of the part of the embedded image?

I have generated a plotly table and tried to export the embedded table, but only part of the table is exported. Can anyone tell me how to show the whole table in a Jupyter cell and export it correctly? Thanks
The way I tried to export the table is:
import os
if not os.path.exists("imagesplotly"):
    os.mkdir("imagesplotly")
fig.write_image("imagesplotly/fig2.pdf")

You can adjust your pdf table by setting height in fig.write_image like this fig.write_image("C://imagesplotly//fig1.pdf", height=yourDesiredHeight).
Code 1 - The figure:
import plotly.graph_objects as go
import numpy as np
np.random.seed(123)
lst1=np.random.uniform(low=0, high=100, size=10).tolist()
lst2=np.random.uniform(low=0, high=100, size=10).tolist()
fig = go.Figure(data=[go.Table(cells=dict(values=[np.random.rand(100,1), np.random.rand(100,1)]))])
fig.show()
Jupyter output:
I thought it would be a good idea to multiply the default number of pixels for a plotly table row by the length of your data, but I couldn't figure out the defaults. It sure isn't 225 per row, but the settings in the snippet below (225 * 10 = 2250 pixels in total, since lst1 has 10 elements) will capture the entire 100-row table.
Code 2 - The pdf output:
pix = 225
datasize = len(lst1)
fig.write_image("C://imagesplotly//fig1.pdf", height=pix*datasize)
pdf output (abbreviated):
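The height arithmetic above can be written as a small helper. This is only a sketch: table_export_height is a hypothetical function, and row_px, header_px and margin_px are caller-supplied guesses rather than plotly defaults, which this answer did not determine.

```python
def table_export_height(n_rows, row_px, header_px=0, margin_px=0):
    """Estimate the pixel height needed to show n_rows table rows.

    row_px, header_px and margin_px are guesses supplied by the caller,
    not values read from plotly.
    """
    if n_rows < 0:
        raise ValueError("n_rows must be non-negative")
    return int(round(header_px + margin_px + n_rows * row_px))


# The answer's numbers: 2250 px for 100 rows works out to 22.5 px per row
height = table_export_height(n_rows=100, row_px=22.5)
print(height)  # 2250
```

The result can then be passed as the height argument of fig.write_image, as in Code 2.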

Output widget appears outside tab widget when using nbconvert on jupyter notebook with ipywidgets

I created a notebook which should display plots in a tab widget. As far as I understand, to include something like plots in the tab widget, I need to wrap it in the output widget. In the notebook itself it works but when I convert it to html via nbconvert it produces the wrong output.
Widgets like sliders, buttons or text appear within the tab widget where they should, but when I use the output widget to catch the plot (or even some text from a print() function) it appears before the tab environment which itself is then empty.
This is how it should look, with one plot per tab (works in the notebook):
Plots in tabs within notebook
And this is how it looks after nbconvert (in html). The plots appear before the tab environment:
Plots before tabs in html
Please note that nbconvert includes other widgets fine and also tabs with other content.
This is the used code:
# Import libraries
import pandas as pd
import matplotlib.pyplot as plt
import ipywidgets as widgets
import numpy as np
# Generated data for plotting
data = pd.DataFrame()
for i in range(5):
    data[i] = np.random.normal(size=50)
This part works in the notebook but not in the html. As you will see, it seems to be related to the Output widget, as it works with neither plots nor printed text.
# This does not work with plots
children = []
for i in range(data.shape[1]):
    out = widgets.Output()
    with out:
        fig, axes = plt.subplots()
        data[i].hist(ax=axes)
        plt.show()
    children.append(out)
tab = widgets.Tab()
tab.children = children
for i in range(len(children)):
    tab.set_title(i, "Plot " + str(i))
tab
# And this does not work with printed output
children = []
for i in range(5):
    out = widgets.Output()
    with out:
        print("This is text", i)
    children.append(out)
tab = widgets.Tab()
tab.children = children
for i in range(len(children)):
    tab.set_title(i, "Text " + str(i))
tab
However, if I use a different widget type (e.g. Text) it is displayed correctly in the notebook and the html output from nbconvert.
# This works with the Text widget
children = []
for i in range(5):
    out = widgets.Text(description="P" + str(i))
    children.append(out)
tab = widgets.Tab()
tab.children = children
for i in range(len(children)):
    tab.set_title(i, "Text " + str(i))
tab
So, is there something I can change to make this actually work? What I need in the end is a way to display N plots in N tabs...

I had the same problem. Updating nbconvert (for example via pip install --upgrade nbconvert) solved it.
Quoting from https://github.com/jupyter/nbconvert/issues/923:
MSeal commented on 2020-09-07:
Yes, this issue was resolved in jupyter/nbclient#24 and nbclient >= 0.4.0 has the fix. NBconvert 6.0 (which should be releasing tomorrow) defaults to using nbclient and the issue overall should disappear. You can try it out on the 6.0.0rc0 release today if you like.
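To see whether an environment already has the versions named in the quote, something like the following sketch could be used. version_tuple and has_fix are hypothetical helpers; the comparison is deliberately simple and treats a pre-release such as 6.0.0rc0 the same as 6.0.0.

```python
from importlib import metadata


def version_tuple(version):
    """Turn '6.0.7' into (6, 0, 7); stops at the first non-numeric part (e.g. 'rc0')."""
    parts = []
    for piece in version.split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break
        parts.append(int(digits))
        if digits != piece:
            break  # a suffix such as 'rc0' ends the numeric part
    return tuple(parts)


def has_fix(package, minimum):
    """Return True/False for an installed package, None if it is not installed."""
    try:
        installed = metadata.version(package)
    except metadata.PackageNotFoundError:
        return None
    return version_tuple(installed) >= version_tuple(minimum)


# Minimum versions taken from the quoted comment
for package, minimum in [("nbclient", "0.4.0"), ("nbconvert", "6.0.0")]:
    print(package, has_fix(package, minimum))
```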
