Why is my Datashader plot saved from HoloViews such a low resolution?

I'm trying to save plots generated with holoviews, using the bokeh backend, to a png. To do this I'm using the following code:
import holoviews as hv
from holoviews.operation.datashader import datashade
curve: hv.Curve = hv.Curve(__some_data_for_curve__)
hv.save(datashade(curve), output_path, backend='bokeh')
Unfortunately the saved png is not rendered properly:
When I instead use
import panel as pn
import holoviews as hv
from holoviews.operation.datashader import datashade
curve: hv.Curve = hv.Curve(__some_data_for_curve__)
pn.serve(datashade(curve))
I get a nicely rendered Plot:
This leads to the assumption that datashader does not properly render the image when just saving the plot to file. Does anybody have an idea on how to get datashader to finish rendering before saving?

Good question about a subtle issue.
What's happening is that hv.save exports the "initial" rendering of a HoloViews object, before any subsequent hooks or streams take effect. The initial rendering includes an RGB image that is the result of HoloViews calling Datashader with initial height and width values determined by arguments to the datashade call (height=400 and width=400 by default). When you are viewing the plot interactively, the initial call is soon updated and overwritten with the size of the actual frame used in the plot as it gets laid out on your screen. Because your screen is usually much larger than 400x400, you won't normally even see the low-res version unless you save the file.
The other issue is that the default height and width are deliberately set to relatively low values, in order not to waste much time on a plot that most users will never see.
If you want the initial save to use a higher resolution, you can add arguments to the datashade call with specific values like height=400, width=1024 or you can just tell it "scale up by 4X" using pixel_ratio=4.
You can also set those parameters globally at the start of your script or notebook, if you always want high-res exports:
from holoviews.operation.datashader import ResamplingOperation
ResamplingOperation.width = 1000
ResamplingOperation.height = 1000
ResamplingOperation.pixel_ratio = 2
Or if you always want higher res, you can put those settings into your ~/.config/holoviews/holoviews.rc file.

Related

How can I scale inline matplotlib figures within JupyterLab?

I'm trying out JupyterLab having used Jupyter notebooks for some time. I use the standard %matplotlib inline magic at the start. I've noticed that JupyterLab displays matplotlib figures much larger than Jupyter notebooks used to.
Is there a way to force JupyterLab to display the images in a smaller window/area? I know I can change the figsize I pass when creating the figure, but that does not scale the text/labels within the figure and I end up with effectively oversized labels and titles.
Ideally within JupyterLab I'd like to be able to set it up so images fit in an area I can define the size of and if they're larger they get scaled to fit.
I've been reading the JupyterLab docs but nothing leaps out at me at solving this particular problem.
Update: I'm running JupyterLab in Chrome. Chrome displays images up to the full width of the browser window; if the window is narrower than the full size of the image, the image is scaled to fit. This is fully dynamic: if you shrink the width of the window, the image rescales on the fly. I changed my figsize parameter (and carefully adjusted font sizes to work) and I got a reasonably sized figure in JupyterLab. I noticed that when I saved this to a jpg and put that in a PowerPoint doc it was quite small (3,2). So I enlarged it, but it became blurred. So I regenerated it with dpi=1200. The figure in JupyterLab got bigger. So JupyterLab does not respect the figsize. It's making some kind of judgement based on the number of pixels in the image.
Update 2: This piece of code demonstrates that the JupyterLab front end doesn't display images according to the figsize parameter but according to the product of figsize and dpi (up to the width of the screen, after which the image is scaled to fit, presumably by Chrome itself). Note that the font size you see on the screen scales only with dpi and not with figsize (as it should).
import numpy as np
import matplotlib.pyplot as plt
%matplotlib inline
xys = np.random.multivariate_normal([0.0, 0.0], [[1.0,-0.5],[-0.5,1.0]], 50)
# One figure per (figsize, dpi) combination
for figsize in [(3,2),(6,4)]:
    for dpi in [25,50,100]:
        fig = plt.figure(figsize=figsize, dpi=dpi)
        ax = fig.add_subplot(1,1,1)
        ax.scatter(xys[:,0], xys[:,1])
        ax.set_title('figsize = {}, dpi = {}'.format(figsize, dpi))
A workaround is to generate figures in JupyterLab at a low dpi setting but save them at a high dpi setting for publication.
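A small sketch of that workaround, assuming plain Matplotlib. The helper name `png_size` is made up for illustration, and the `Agg` backend is selected only so the snippet also runs outside a notebook (drop that line when using `%matplotlib inline`). The figure is created at a low dpi for display, and the `dpi` argument of `savefig` overrides it at save time.

```python
import io
import struct

import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs anywhere
import matplotlib.pyplot as plt
import numpy as np

rng = np.random.default_rng(0)
xys = rng.multivariate_normal([0.0, 0.0], [[1.0, -0.5], [-0.5, 1.0]], 50)

# Low dpi keeps the displayed figure small: 3 in x 50 dpi = 150 px wide.
fig, ax = plt.subplots(figsize=(3, 2), dpi=50)
ax.scatter(xys[:, 0], xys[:, 1])

def png_size(figure, dpi):
    """Save `figure` to an in-memory PNG and return (width, height) in pixels."""
    buf = io.BytesIO()
    figure.savefig(buf, format="png", dpi=dpi)
    # A PNG stores width and height as big-endian 32-bit ints at bytes 16..24.
    return struct.unpack(">II", buf.getvalue()[16:24])

screen_size = png_size(fig, fig.dpi)  # pixel size at the display dpi
print_size = png_size(fig, 300)       # same figure saved at a publication dpi
```

Because fonts scale with dpi, the saved image keeps the same proportions between text and plot as the one on screen; only the pixel count changes.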

Select text in Bokeh plot

I would like to be able to search for specific words in my Bokeh plot. Say that I have a very simple plot:
import numpy as np
from bokeh.plotting import figure, show, output_file
x = np.linspace(0, 4*np.pi, 100)
y = np.sin(x)
TOOLS = "pan,wheel_zoom,box_zoom,reset,save,box_select"
p1 = figure(title="Some sample title", tools=TOOLS)
p1.circle(x, y, legend_label="sin(x)")  # recent Bokeh versions use legend_label instead of legend
output_file("legend.html", title="legend.py example")
show(p1)
Which results in
I would like to be able to search the text in my browser using [ctrl+f] or [cmd+f]. Is there any way to do that? I would like to be able to search for the title and/or for labels, so in this case, example queries would be one of {sample, title,1,0.5}. Of course this example is hypothetical, but I think it's enough to illustrate the question.
Is there any way to use browser search functionality inside a Bokeh plot?

There is currently no way to do this in Bokeh, because it renders to an HTML5 canvas object, so the browser sees only the final result of the rendering. If you're willing to use Bokeh's sister library HoloViews, however, it has both a Bokeh backend and an SVG backend. When a plot is rendered through that SVG backend, your browser has access to all the text elements.
When evaluating plotting libraries for this purpose, what you're looking for is basically an SVG backend. It's usually easy to find a list of supported backends in each library's documentation.
Also note that "having all individual plot elements accessible to the browser" and "plotting a lot of data points" are conflicting goals. The HTML5 canvas backend works well for plotting lots of data (even more so with datashader) partly because it only exposes the final plot image to the browser. If you want to expose the details of your plot to the browser (e.g. via the SVG backend), you should expect a performance hit at some point as your plots get bigger (more data) or otherwise more complex, compared to the HTML5 canvas backend.
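To illustrate what "access to all the text elements" means for SVG output, here is a minimal sketch that uses Matplotlib directly instead of HoloViews (an assumption made to keep the example self-contained). Setting `svg.fonttype` to `"none"` tells Matplotlib to write titles and labels as real `<text>` elements, which a browser can find with ctrl+f, instead of converting glyphs to outline paths.

```python
import io

import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs anywhere
import matplotlib.pyplot as plt
import numpy as np

# Emit text as real <text> elements instead of converting glyphs to paths.
plt.rcParams["svg.fonttype"] = "none"

x = np.linspace(0, 4 * np.pi, 100)
fig, ax = plt.subplots()
ax.plot(x, np.sin(x), "o", label="sin(x)")
ax.set_title("Some sample title")
ax.legend()

# Write the SVG into memory and decode it so the markup can be inspected.
buf = io.BytesIO()
fig.savefig(buf, format="svg")
svg_text = buf.getvalue().decode("utf-8")
```

The title and legend label appear verbatim in `svg_text`, so they stay searchable once the SVG is opened in or embedded into a web page.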

There is no way to do this. Bokeh plots are not textual DOM elements; everything is rendered on an HTML raster canvas, which the browser sees only as a rectangular area of RGBA pixels.

How to use Python Seaborn Visualizations in PowerPoint?

I created some figures with Seaborn in a Jupyter Notebook. I would now like to present those figures in a PowerPoint presentation.
I know that it is possible to export the figures as png and include them in the presentation. But then they would be static, and if something changes in the dataframe, the picture would stay the same. Is there an option to have a dynamic figure in PowerPoint? Something like a small Jupyter Notebook you could display in the slides?

You could try Anaconda Fusion (also the video here), which lets you use Python inside of Excel. This could possibly work since you can link figures/data elements between Excel and PowerPoint (but special restrictions might apply when the figure is created via Python rather than standard Excel). Anaconda Fusion is free to try for a couple of months.
Another solution would be to use the Jupyter Notebook to create your presentation instead of PowerPoint. Go to View -> Cell Toolbar -> Slideshow and you can choose which code cells should become slides.
A third approach would be to create an animation of the figure as the data frame changes and then include the animation (GIF or video) in PowerPoint.

The following procedure probably isn't the most elegant solution, but it will let you produce a Seaborn plot, store it as an image file, and export the same image to an open PowerPoint presentation. Depending on whether you set LinkToFile to True or False, the images will or will not update when the source changes. I'm messing around with this using cells in Spyder, but it should work in a Jupyter notebook as well. Make sure that you have a folder named c:\pptSeaborn\.
Here it is:
# Some imports
import numpy as np
import seaborn as sns
import os
import matplotlib.pyplot as plt
import win32com.client
import win32api
os.chdir('C:/pptSeaborn')
# Settings for some random data
mu = 0
sigma = 1
simulation = np.random.normal(mu, sigma, 10)
# Make seaborn plot from simulated data. Save as image file.
def SeabornPlot(data, filename = 'c:\\pptSeaborn\\snsPlot.png'):
    # fill=True is the current spelling; older seaborn versions used shade=True
    ax = sns.kdeplot(data, fill=True)
    fig = ax.get_figure()
    fig.savefig(filename, bbox_inches='tight', dpi = 440)
    plt.close(fig)
# Import image file to active powerpoint presentation
def SeabornPPT(plotSource, linkImage):
    Application = win32com.client.Dispatch("PowerPoint.Application")
    Presentation = Application.ActivePresentation
    # Append a new slide at the end (12 is the blank slide layout)
    slidenr = Presentation.Slides.Count + 1
    Base = Presentation.Slides.Add(slidenr, 12)
    gph = Base.Shapes.AddPicture(FileName=plotSource,
                                 LinkToFile=linkImage, SaveWithDocument=True,
                                 Left=50, Top=25, Width=800, Height=500)
    Presentation.Slides(slidenr).Select()
# Produce data, save plot as image, and export image to powerpoint
SeabornPlot(data = simulation)
SeabornPPT(plotSource = 'c:\\pptSeaborn\\snsPlot.png', linkImage = False)
Now, if you have an open PowerPoint presentation and run this whole thing five times, you will get something like this:
If you go ahead and save this somewhere, and reopen it, it will still look the same.
Now you can set linkImage = True, and run the whole thing five times again. Depending on the random data generated, you will still get five slides with different graphs.
But NOW, if you save the presentation and reopen it, all plots will look the same because they're linked to the same image file:
The next step could be to wrap the whole thing into a function that takes filename and LinkToFile as arguments. You could also add an argument for whether or not the procedure makes a new slide each time an image is exported. I hope you find my suggestion useful. I liked your question, and I'm hoping to see a few other suggestions as well.

We now went with this approach:
You can save the figures as a .png file and insert this into PowerPoint. There is an option when inserting it so that the picture is updated every time you open PowerPoint, retrieving a new version of the file from the folder I saved it to. So when I make changes in Seaborn, a new version of the file is automatically saved as a picture, which will then be updated in PowerPoint.
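The Python side of that workflow can be sketched as follows. This is only an illustration under stated assumptions: the function name `export_plot` and the file location are hypothetical, and a plain Matplotlib histogram stands in for the Seaborn figure (a Seaborn plot is saved the same way through its Matplotlib figure). The point is simply that every run overwrites the same file, which is the one the linked picture in PowerPoint points to.

```python
import os
import tempfile

import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs anywhere
import matplotlib.pyplot as plt
import numpy as np

def export_plot(data, path):
    """Draw a histogram of `data` and overwrite the image file at `path`."""
    fig, ax = plt.subplots()
    ax.hist(data, bins=20)
    fig.savefig(path, dpi=200, bbox_inches="tight")
    plt.close(fig)
    return path

# Hypothetical location; in practice this is the file PowerPoint links to.
linked_png = os.path.join(tempfile.gettempdir(), "linked_plot.png")

rng = np.random.default_rng(0)
export_plot(rng.normal(size=100), linked_png)   # first version of the picture
export_plot(rng.normal(size=1000), linked_png)  # overwrites the same file
```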

Exporting matplotlib plot to holoviews

I'm using a library called GPy to fit a Gaussian process model and plot the output. The library has its own plotting functionality, and returns a matplotlib figure.
I'd like to use this output in a holoviews element, as part of a dynamic map. This feels like it should be possible, but I can't find a good way to do it.
I had wondered about reading the matplotlib figure into a numpy image array and sending this to a holoviews Raster element - but the only way to do this seems to be saving the figure to a file, which does not seem a good option.

Great question! At the point where you define holoviews elements, they are still backend-agnostic. Matplotlib and the other backends only come into play when the elements are actually rendered. So no, you cannot take a matplotlib figure as such and pipe it into a holoviews element.
That leaves you two options:
1. Extract the data from the matplotlib figure in some way, or get hold of those data from GPy in some other way, and create a holoviews element from that, or
2. Use the code that generates the matplotlib figure in a panel (https://panel.pyviz.org) app.
Number 2 is closer to what you are probably imagining, but without a minimal working example I cannot say much more.
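On the question's idea of turning the figure into an image array: writing a file is not the only route. The sketch below assumes Matplotlib's Agg canvas and renders a figure into a NumPy array entirely in memory. The helper name `figure_to_rgba` is made up for illustration, and the small line plot is only a stand-in for the figure GPy would return.

```python
import numpy as np
from matplotlib.backends.backend_agg import FigureCanvasAgg
from matplotlib.figure import Figure

def figure_to_rgba(fig):
    """Render a Matplotlib figure off-screen and return an (H, W, 4) uint8 array."""
    canvas = FigureCanvasAgg(fig)  # attach an Agg canvas: no window, no file
    canvas.draw()
    return np.asarray(canvas.buffer_rgba()).copy()

# Stand-in for the figure a plotting library would hand back.
fig = Figure(figsize=(4, 3), dpi=100)
ax = fig.add_subplot(1, 1, 1)
ax.plot([0, 1, 2], [0, 1, 4])

rgba = figure_to_rgba(fig)  # 4 in x 100 dpi wide, 3 in x 100 dpi tall
```

The resulting array is a static picture of the figure, so it could feed an image-style element such as the Raster mentioned in the question, but axes, zooming, and hover would not be live holoviews features.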

matplotlib shows different figure than saves from the show() window

I plot rather complex data with matplotlib's imshow(), so I prefer to first visually inspect if it is all right, before saving. So I usually call plt.show(), see if it is fine, and then manually save it with a GUI dialog in the show() window. And everything was always fine, but recently I started getting a weird thing. When I save the figure I get a very wrong picture, though it looks perfectly fine in the matplotlib's interactive window.
If I zoom to a specific location and then save what I see, I get a fine figure.
So, this is the correct one (a small area of the picture, saved with zooming first):
And this one is a zoom into approximately the same area of the figure, after I saved it all:
For some reason the pixels in the second one are much bigger! That is very bad for me: as you can see, it loses a lot of detail there.
Unfortunately, my code is quite complicated and I wasn't able to reproduce it with some randomly generated data. This problem appeared after I started to plot two triangles of the picture separately: I read my two huge data files with np.loadtxt(), get np.triu(data1) and np.tril(data2), mask zeroes, NAs, -inf and +inf and then plot them on the same axes with plt.imshow(data, interpolation='none', origin='lower', extent=extent). I do lots of other things to make it nicer, but I guess that doesn't matter, because it all worked like a charm before.
Please, let me know, if you need to know anything else specific from my code, that could be relevant to this problem.

When you save a figure as png/jpg you are forced to rasterize it, that is, to convert it to a finite number of pixels. If you want to keep the full resolution, you have a few options:
Use a very high dpi parameter, like 900. Saving the plot will be slow, and many image viewers will take some time to open it, but the information is there and you can always crop it.
Save the image data, the exact numbers you used to make the plot. Whenever you need to inspect it, load it in Matplotlib in interactive mode, navigate to your desired corner, and save it.
Use SVG: it is a vector graphics format, so you are not limited to pixels.
Here is how to use SVG:
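import numpy as np
import matplotlib
matplotlib.use('SVG')  # select the SVG backend before importing pyplot
import matplotlib.pyplot as plt
# Generate the image (random placeholder data; substitute your own array)
image = np.random.rand(50, 50)
plt.imshow(image, interpolation='none')
plt.savefig('output_image.svg')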
Edit:
To save a true SVG you need to use the SVG backend from the beginning, which is, unfortunately, incompatible with interactive mode. Some backends, like GTKCairo, seem to allow both, but the result is still rasterized, not a true SVG.
This may be a bug in matplotlib; at least, to the best of my knowledge, it is not documented.
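A related route, not one of the options listed above, is Matplotlib's `imsave`, which writes the array itself to an image file with one output pixel per array element, independent of figure size and dpi. The sketch below uses made-up random data and an in-memory buffer in place of a real file. Note that `imsave` writes only the colour-mapped data, without axes, ticks, or labels.

```python
import io

import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs anywhere
import matplotlib.pyplot as plt
import numpy as np

# Illustrative stand-in for a large matrix that would be shown with imshow.
rng = np.random.default_rng(0)
data = rng.random((300, 400))

# imsave writes the array itself: one output pixel per array element.
buf = io.BytesIO()
plt.imsave(buf, data, cmap="viridis", origin="lower", format="png")

# Read the PNG back to check that no resolution was lost.
buf.seek(0)
saved = plt.imread(buf, format="png")
```

The saved image has the same height and width as `data`, so no detail is merged into larger pixels.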
