Run Bokeh server on Azure Databricks? - python

I locally use Bokeh server to visualize data. I tried doing this in Azure's version of Databricks as well, but couldn't get even the first lines of this simple example to run:
from bokeh.io import push_notebook, show, output_notebook
from bokeh.plotting import figure
output_notebook() # <- fails
This fails with the following error:
TypeError: publish_display_data() missing 1 required positional
argument: 'data'
I investigated further and found out that databricks is apparently built open IPython 2.2.0, which is over 4 years old!
import IPython
IPython.__version__ # Returns '2.2.0'
Is there anything I can do? Did anyone have success with running a bokeh server in Databricks? I want to have some kind of interactive Dashboard, and Databricks' own dashboard is extremely limited

As you note, IPython 2.2.0 is ancient. I'm not sure how far back you'd have to go in Bokeh releases to find one that supports it. The function publish_display_data is a Juypter/IPython API, and unfortunately it has seen a few breaking changes over the years. The Bokeh project used to maintain a compatibility polyfill for it to try to smooth over these changes, and support older versions, but it was removed in this commit last year:
https://github.com/bokeh/bokeh/commit/fb3f9cc4f9e9af786698462a9849e46c0ea34cf2
After that commit, 4.3 is the minimum notebook version for any use. Before that commit, some set of earlier Jupyter releases will work, but I can't say exactly how much earlier, and I can't guarantee that an emebedded Bokeh server apps would work (i.e. very possibly only inline standalone plots would work) Embedded Bokeh server apps have never been tested on anything earlier than Jupyter 4.3 and I would never make a claim that Bokeh supports embedded apps in notebook versions older than that.
TLDR; I highly doubt things are workable on IPython 2.2.0

Related

Matplotlib figures not generating in GitHub CodeSpaces

I just started using Codespaces. In my python file I have this code:
import matplotlib.pyplot as plt
import pandas as pd
print("Hello")
titanic_data = pd.read_csv("https://raw.githubusercontent.com/datasciencedojo/datasets/master/titanic.csv")
titanic_data = titanic_data[titanic_data['Age'].notnull()]
titanic_data['Fare'] = titanic_data['Fare'].fillna(titanic_data['Fare'].mean())
titanic_data = titanic_data.drop_duplicates()
plt.scatter(titanic_data['Age'], titanic_data['Fare'])
plt.show()
print("Goodbye")
When I run this on my local machine, this works perfectly. I can see the console logs, and the figure appears as a new window:
However, when I run this in Codespaces, I can see all of the code running without any errors, but it does not show the figure. Is this a known limitation or a feature that is not yet supported? Is there another way I can plot figures in Codespaces?
They mention this in the docs docs:
The default container image that's used by GitHub Codespaces includes a set of machine learning libraries that are preinstalled in your codespace. For example, Numpy, pandas, SciPy, Matplotlib, seaborn, scikit-learn, Keras, PyTorch, Requests, and Plotly.
It sounds like it should be supported out of the box. Is additional configuration required?
Based on the experimentation I have done thus far, plotting these diagrams as one would do in a local dev environment is not (yet?) possible.
For this specific case, the next best solution was to create a new GitHub Codespace from this repo: https://github.com/education/codespaces-teaching-template-py
Once the repo has been cloned into the Codespace, navigate to an existing .ipynb file or create your own.
Inside there you'll be able to run chunks of custom code and plot figures.
The big limitation I see is that the figure cannot be interacted with the same way that one would be able to on a local machine (zooming, panning, etc).
As always, don't forget to shut your Codespace down when you're done using it!

How to change the default backend in matplotlib from 'QtAgg' to 'Qt5Agg' in Pycharm?

Qt5Agg is necessary to use the mayavi 3D visualization package. I have installed PyQt5 and mayavi using pip in a separate copied conda environment. The default backend then changes from TkAgg to QtAgg. This is a bit weird because in an earlier installation in a different PC the default changed directly to Qt5Agg. I always check the backend using the following commands from the python console :
import matplotlib
matplotlib.get_backend()
Even with the backend being 'QtAgg', I am able to use mayavi from the terminal without any issue but not when I do so in Pycharm. Here I get a non-responsive empty window (image below) :
Image of the non-responsive window
I have been able to get rid of this issue by explicitly using Qt5Agg instead of QtAgg before the plt call :
import matplotlib
matplotlib.use('Qt5Agg')
import matplotlib.pyplot as plt
But I would prefer a better way than using the above in every script that I write. As I had mentioned earlier, I already have mayavi installed and have used it successfully it in Pycharm in a different PC and there the default backend is 'Qt5Agg' and hence there is no need to change the backend explicitly.
Is there anything obvious that I'm overlooking ? Can you please let me know of a way to change the default backend for matplotlib from QtAgg to Qt5Agg after PyQt5 installation using pip ?
Thanks in advance !!
Thanks to #PaulH's comment, I was able to solve the issue. Owing to #mx0's suggestion, I shall now explicitly mention the fix below so that others can also benefit from it.
In a particular conda environment, if matplotlib package is installed, then there will be a 'matplotlibrc' file stored somewhere that defines what the default backend will be whenever matplotlib is imported from that conda environment. The location of this 'matplotlibrc' can be found using the following commands :
import matplotlib
matplotlib.matplotlib_fname()
Please look into the following link if there's any deprecation issue with the above commands :
https://matplotlib.org/stable/tutorials/introductory/customizing.html#customizing-with-matplotlibrc-files
Once the location of the 'matplotlibrc' file is known, open it and simply uncomment one line inside this file. Just change the backend from :
##backend: Agg
to :
backend: Qt5Agg
And that's it. All the plot window troubles in PyCharm will be solved as far as the mayavi 3D visualization package is concerned. For any other use, where a specific backend is necessary, you can also set the default to any other backend of choice.

Bokeh + interactive widgets + PythonAnywhere

I was unable to find a minimal working example for an interactive web app using bokeh and bokeh widgets that runs on PythonAnywhere.
Ideally, I would like to have a simple plot of a relatively complicated function (which I do not know analytically, but I have SymPy compute that for me) which should be replotted when a parameter changes.
All code that I have found so far does not do that, e.g. https://github.com/bokeh/bokeh/tree/master/examples, or refers to obsolete versions of bokeh.
Most of the documentation deals with running a bokeh-server, but there is no indication on how to have this running with WSGI (which is how PythonAnywhere handles the requests). For this reasone I have tried embedding a Bokeh plot within a Flask app. However, as far as I understand, in order to have interactive Bokeh widgets (which should trigger some computation in Python) do require a bokeh-server. I am not particularly attached to using either Flask or Bokeh, if I can achive a similar result with some other simpler tools. Unfortunately, a Jupyter notebook with interactive widgets does not seems to be an option in PythonAnywhere.
I have installed bokeh 0.12 on Python 3.5.
I have managed to run a simple bokeh plot within a flask app, but I am unable to use the Bokeh widgets.
Here is a working example of Jupyter notebook with interactive widgets on pythonanywhere:
%pylab inline
import matplotlib.pyplot as plt
from ipywidgets import interact
def plot_power_function(k):
xs = range(50)
dynamic_ys = [x ** k for x in xs]
plt.plot(xs, dynamic_ys)
interact(plot_power_function, k=[1, 5, 0.5])
PythonAnywhere does have the ipywidgets module pre-installed. But if you are not seeing the interactive widgets, make sure that you have run jupyter nbextension enable --py widgetsnbextension from a bash console to get it enabled for your notebooks. You will have to restart the jupyter server after enabling this extension (by killing the relevant jupyter processes from the consoles running processes list on the pythonanywhere dashboard).
As of Bokeh 0.12.5 you can embed Bokeh server applications directly in Jupyter notebooks. This is the best and most robust way to have interactive Bokeh plots and widgets (backed by real python code) in a notebook.
You can study an example of this in this demo notebook:
https://github.com/bokeh/bokeh/blob/master/examples/howto/server_embed/notebook_embed.ipynb
A screencast of that notebook in action is below:

Using matplotlib *without* TCL

Exactly what the title says. Is there a way to use the matplotlib library without installing TCL? Please don't tell me to bite the bullet and install TCL - I know how to do it but for my own (ok maybe silly) reasons I don't want to.
I don't care about displaying the plots, I only want to be able to output them in a png. I tried various things (using different backends etc) but matplotlib always wanted to find tcl to work :( Why is TCL so essential for matplotlib?
Also, please notice that I am using windows -- I have installed everything that could be required (numpy, pandas, matplotlib) using pip.
#gerrit's solution is the correct one (I was trying to change the backends but I was doing it after loading pyplot -- the important thing seems to be that you need to change the backend immediately after imporing matplotlib). Here's a small example using it:
import matplotlib
matplotlib.use('Agg')
import matplotlib.pyplot as plt
fig, ax = plt.subplots( nrows=1, ncols=1 )
ax.plot([0,1,2], [10,20,3])
fig.savefig('foo.png')
plt.close(fig)
This will output a file named 'foo.png' without using TCL \o/
Matplotlib 3.0 and newer
(Added to answer in October 2018)
Starting with Matplotlib 3, released on 19 September 2018, the problem described in the question should not occur. From the what's new part of the documentation:
The default backend no longer must be set as part of the build process. Instead, at run time, the builtin backends are tried in sequence until one of them imports.
Headless linux servers (identified by the DISPLAY env not being defined) will not select a GUI backend.
So, as long as you make sure DISPLAY is not defined, you should not run into any problems with the backend when running in a script on a headless Linux server.
Matplotlib 2.2 and older
(Original answer May 2016)
Immediately after loading matplotlib, enter
matplotlib.use('Agg')
Do this before loading pyplot, if at all.
By default, Matplotlib uses the TkAgg backend, which requires Tcl. If you don't want to display the plots, Agg is fine. Other alternatives include WX and QTAgg, but both require the installation of additional libraries.
Alternately, you can set this directive in your matplotlibrc file:
backend : Agg
For details, see the Matplotlib Usage FAQ on What is a backend?.

python's save figure button does not work in mac, trying to solve it unsuccessfully, how to do this?

I updated my python distribution yesterday to EPD 7.3-2 (64-bit). I am working on a mac with snow leopard.
Now the plot device of matplotlib is broken in at least two ways:
the "save" button doesn't work and makes the terminal or ipython crash and
the only way to see the figure is to have it in front of you, there is no python figure icon in the dock.
I did my homework and these same problems were reported here and here.
I tried to follow the instructions to fix this given in here, but this is the error that I get:
$python install_pythonw.py `which python`/../..
/Library/Frameworks/EPD64.framework/Versions/Current/.Python does not exist; exiting.
Indeed, I looked at the given folder and I could not find a .Python file. I added a comment at the answer to this problem but so far no one has replied to it :( :(
Any idea of how to fix this?
thanks!
I have seen this problem a few times, and it seems to be a problem in some backends. Also, it doesn't seem normal that your session crashes after 4 or 5 plots. In particular, the MacOSX backend seems buggy.
As you installed the EPD, I think it's less likely that your installation is broken.
The solution seems to be using a different backend. You can try with ipython --pylab a few backends, try their features and see if the save button works. You can try the following:
ipython --pylab=wx
ipython --pylab=tk
ipython --pylab=osx
The last one is the option that you're probably using right now, so perhaps not the best. If you just call ipython --pylab, it will use the default backend from your ~/.matplotlib/matplotlibrc file. Once you find a working backend you can change the default by editing that file. Look for a line like this:
backend : MacOSX
(your version may have a different backend.) Just change that setting to WXAgg, TkAgg, or Qt4Agg. With the --pylab option the names are slightly different, they don't have the Agg part. My favourite backend for OSX is the Qt4Agg backend, but I don't think it ships with EPD and the save button also doesn't work! But either WXAgg or TkAgg should work fine.
Other ways of changing the backend in a script are:
import matplotlib
matplotlib.use('WXAgg')
or
matplotlib.rcParams['backend'] = 'WXAgg'

Categories