Data security with mpld3 - python

I often plot data that I do not want to leave my personal computer. I notice that when using mpld3 to generate plots in the Jupyter notebook, https://mpld3.github.io is accessed. I think it's just pulling plotting scripts, but is there any risk of my plotted data being sent off my computer when using mpld3? Is there an "offline mode" I could use mpld3 with?

Try passing local=True to either mpld3.enable_notebook() or mpld3.display(). As the FAQ states:
Setting this to True will copy the mpld3 and d3 JavaScript libraries to the notebook directory, and will use the appropriate path within IPython (/files/*.js) to load the libraries


Matplotlib figures not generating in GitHub CodeSpaces

I just started using Codespaces. In my python file I have this code:
import matplotlib.pyplot as plt
import pandas as pd
print("Hello")
titanic_data = pd.read_csv("https://raw.githubusercontent.com/datasciencedojo/datasets/master/titanic.csv")
titanic_data = titanic_data[titanic_data['Age'].notnull()]
titanic_data['Fare'] = titanic_data['Fare'].fillna(titanic_data['Fare'].mean())
titanic_data = titanic_data.drop_duplicates()
plt.scatter(titanic_data['Age'], titanic_data['Fare'])
plt.show()
print("Goodbye")
When I run this on my local machine, it works perfectly: I can see the console logs, and the figure appears in a new window.
However, when I run this in Codespaces, I can see all of the code running without any errors, but it does not show the figure. Is this a known limitation or a feature that is not yet supported? Is there another way I can plot figures in Codespaces?
They mention this in the docs:
The default container image that's used by GitHub Codespaces includes a set of machine learning libraries that are preinstalled in your codespace. For example, Numpy, pandas, SciPy, Matplotlib, seaborn, scikit-learn, Keras, PyTorch, Requests, and Plotly.
It sounds like it should be supported out of the box. Is additional configuration required?
Based on the experimentation I have done thus far, plotting these diagrams as one would do in a local dev environment is not (yet?) possible.
For this specific case, the next best solution was to create a new GitHub Codespace from this repo: https://github.com/education/codespaces-teaching-template-py
Once the repo has been cloned into the Codespace, navigate to an existing .ipynb file or create your own.
Inside there you'll be able to run chunks of custom code and plot figures.
The big limitation I see is that the figure cannot be interacted with the same way that one would be able to on a local machine (zooming, panning, etc).
As always, don't forget to shut your Codespace down when you're done using it!
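If an interactive window is not required, one possible workaround (not part of the answer above, and offered only as a sketch) is to render with matplotlib's non-interactive Agg backend and save the figure to a file, which can then be opened from the Codespaces file explorer. The data below is a stand-in for the Titanic columns, and the file name scatter.png is an arbitrary choice.

```python
from pathlib import Path

import matplotlib
matplotlib.use("Agg")  # non-interactive backend; needs no display server
import matplotlib.pyplot as plt

# Stand-in data (assumption): swap in titanic_data['Age'] and titanic_data['Fare'].
ages = [22, 38, 26, 35, 54]
fares = [7.25, 71.28, 7.93, 53.10, 51.86]

fig, ax = plt.subplots()
ax.scatter(ages, fares)
ax.set_xlabel("Age")
ax.set_ylabel("Fare")

output_path = Path("scatter.png")  # hypothetical file name
fig.savefig(output_path)           # write the figure instead of calling plt.show()
plt.close(fig)
```

The saved image is static, so the same limitation on zooming and panning applies.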

Exporting Jupyter notebook with plotly to html not displaying correctly when offline

I am using JupyterLab, and everything works fine within JupyterLab even when I am offline. However, whenever I try to export the report to HTML, the plotly plots are not rendered. If I turn on my internet connection, the plots are rendered just fine.
Here is a sample code:
import pandas as pd
import numpy as np
import plotly.express as px
df = pd.DataFrame(np.random.randn(100,4), columns='A B C D'.split())
px.scatter(df, x='A',y='B')
I have tried following the troubleshooting guide for plotly shown here. Additionally I tried installing on a fresh environment.
If I use the following:
import plotly.io as pio
pio.renderers.default = "jupyterlab"
The offline HTML includes a static plot; however, I would very much like to have interactivity enabled.
I have noticed that the files differ in size: the static pages are only around 700 KB, whereas when I try to save them as interactive they are about 4 MB.
Is this not possible in JupyterLab, or am I missing something?
If you want to be able to have interactivity while being offline, you need to add the plotly.js to the output html.
You can achieve that like this:
import plotly.io as pio
pio.renderers.default='notebook'
Actually, this should be done by default in JupyterLab (you can tell by the increased file size; in your case it will be >4 MB). So if that doesn't work, I suspect a bug. I think I've experienced something similar. Here's my browser console output when using your example and exporting it to HTML:
For some reason, the included plotly.js seems to depend on require.js, which is not included in the HTML export. Instead, your page will try to load it from a CDN, which fails when you are offline (as seen in the screenshot).
Now, what you can do is to manually include a local version of require.js. Get a copy here. Then, in your Notebook add the following at the top:
%%HTML
<script src="require.js"></script>
Then export your notebook to HTML. Make sure the exported file is in the same folder as the require.js file you downloaded before, and open it in the browser.
There should be no more error messages in the console, and your chart should appear and work interactively.
Edit: If you want to share your notebook, this might be suboptimal, as it requires you to also distribute the require.js script. You can also include the whole script directly in your notebook. Just put the %%js cell magic at the top of a code cell and paste the content of the require.js file you downloaded below it.
As you are trying to export it to HTML, don't forget jupyter's way (.html). Also with the "Open with" button on jupyter, you can see the maximum file size that it can handle. And most likely the storage wouldn't be the issue.

Pycharm Jupyter Interactive Matplotlib

Interactive matplotlib plotting is already a thing, but does not work properly in Pycharm, when used within a jupyter notebook.
The %matplotlib notebook magic does not work (it throws no error, but I get <IPython ... JavaScript object> instead of a plot). If I plot normally (with or without plt.show()) I just get a PNG and cannot interact in any way (even if, e.g., sliders are visible).
I couldn't find any answers elsewhere to this exact problem. It might be working in the browser version of jupyter, but I would like to stick to using PyCharm.
Pycharm v 2017.3 Community Edition
You can try:
import matplotlib
matplotlib.use('Qt5Agg')
import matplotlib.pyplot as plt
instead of importing only pyplot.
It's a trick I found on the JetBrains forum:
https://intellij-support.jetbrains.com/hc/en-us/community/posts/115000736584-SciView-in-PyCharm-2017-3-reduces-functionality-of-Matplotlib
It works for me. With this, you actually skip SciView and plot in a normal matplotlib window.

Using matplotlib *without* TCL

Exactly what the title says. Is there a way to use the matplotlib library without installing TCL? Please don't tell me to bite the bullet and install TCL - I know how to do it but for my own (ok maybe silly) reasons I don't want to.
I don't care about displaying the plots, I only want to be able to output them in a png. I tried various things (using different backends etc) but matplotlib always wanted to find tcl to work :( Why is TCL so essential for matplotlib?
Also, please notice that I am using windows -- I have installed everything that could be required (numpy, pandas, matplotlib) using pip.
@gerrit's solution is the correct one (I was trying to change the backend, but I was doing it after loading pyplot; the important thing seems to be that you need to change the backend immediately after importing matplotlib). Here's a small example using it:
import matplotlib
matplotlib.use('Agg')
import matplotlib.pyplot as plt
fig, ax = plt.subplots( nrows=1, ncols=1 )
ax.plot([0,1,2], [10,20,3])
fig.savefig('foo.png')
plt.close(fig)
This will output a file named 'foo.png' without using TCL \o/
Matplotlib 3.0 and newer
(Added to answer in October 2018)
Starting with Matplotlib 3, released on 19 September 2018, the problem described in the question should not occur. From the what's new part of the documentation:
The default backend no longer must be set as part of the build process. Instead, at run time, the builtin backends are tried in sequence until one of them imports.
Headless linux servers (identified by the DISPLAY env not being defined) will not select a GUI backend.
So, as long as you make sure DISPLAY is not defined, you should not run into any problems with the backend when running in a script on a headless Linux server.
Matplotlib 2.2 and older
(Original answer May 2016)
Immediately after loading matplotlib, enter
matplotlib.use('Agg')
Do this before loading pyplot, if at all.
By default, Matplotlib uses the TkAgg backend, which requires Tcl. If you don't want to display the plots, Agg is fine. Other alternatives include WX and QTAgg, but both require the installation of additional libraries.
Alternately, you can set this directive in your matplotlibrc file:
backend : Agg
For details, see the Matplotlib Usage FAQ on What is a backend?.
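Besides matplotlib.use() and matplotlibrc, the backend can also be selected through the MPLBACKEND environment variable, which as far as I know takes precedence over the matplotlibrc setting. The sketch below sets it from within Python purely for illustration; in practice it would more likely be set in the shell before launching the script. The file name foo_env.png is arbitrary.

```python
import os

# Must be set before matplotlib is first imported in this process.
os.environ["MPLBACKEND"] = "Agg"

import matplotlib
import matplotlib.pyplot as plt

# Report which backend was picked up (letter case may vary between versions).
backend_name = matplotlib.get_backend()
print(backend_name)

fig, ax = plt.subplots()
ax.plot([0, 1, 2], [10, 20, 3])
fig.savefig("foo_env.png")  # written without any GUI toolkit
plt.close(fig)
```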

Choosing a matplotlib backend for a specific IPython profile

matplotlib has a config file and IPython has its own. Which one has precedence when it comes to setting things like matplotlib backends?
For example, say my config file for matplotlib says to use a specific backend, but then I modify my IPython startup or config files to use a different one. Which one would be used when I start IPython and import matplotlib?
More generally, what is the right way to set things up so that different profiles use different matplotlib backends or matplotlib configurations?
IPython configuration is used, as IPython itself chooses the matplotlib backend.
For reference, see IPython:core/pylabtools.py:activate_matplotlib and notice how matplotlib.use(backend) is called explicitly.
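As a sketch of the per-profile setup asked about, each profile's ipython_config.py can name the backend IPython should activate. The profile name "plots" and the choice of agg are illustrative assumptions; c.InteractiveShellApp.matplotlib is the configuration option behind IPython's --matplotlib command-line flag.

```python
# ~/.ipython/profile_plots/ipython_config.py
# (hypothetical profile, created with: ipython profile create plots)
c = get_config()  # provided by IPython when it loads this file

# Backend that IPython activates at startup, for this profile only.
c.InteractiveShellApp.matplotlib = "agg"
```

Starting IPython with ipython --profile=plots would then use this setting, while other profiles keep their own.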
