Output a Document (preferably a PDF) from Python

I've got a Python script (3.5) that runs a whole battery of tests. The user selects which of the N possible tests get run, so anywhere from 1 to N tests may run.
Right now, I'm just outputting the results of the test to a plot with matplotlib and that looks OK, but they're just saved as individual files. Also, the code has some PASS/FAIL criteria for each test.
So the issue is that I would like to use some Python tool to output the whole story of the test sequence as a PDF. I'd like to keep some boilerplate text and update some values in the middle. For example:
Test was run for 7 minutes. Maximum power was recorded as -69.5dBm.
The thing that is the same each time is:
Test was run for ___ minutes. Maximum power was recorded as ___ dBm.
The number of minutes and the maximum power are pulled in from the results of the test. The graph is pulled in from matplotlib.
So, for each test, I'd like to append some Boiler-plate text, fill in some blanks with real data, and slap in an image where appropriate, and I'd like to do it with Python.
I looked at some of the suggestions on SO, but for the most part it looks like the solutions are for appending or watermarking existing PDFs. Also, Google turns up a lot of results for automating the generation of Python Code Documentation...Not Code for generating Documentation with Python.
Reportlab looked promising, but it looks like it's been abandoned.
Also, I'm not married to the requirement that the output be a PDF. This is for internal use only, so there's some flexibility. HTML, Word or something else that the user can convert to a PDF manually afterwards is fine too. I know PDFs can be somewhat troublesome due to their binary nature.

You can do all of this directly in matplotlib.
It is possible to create several figures and afterwards save them all to the same pdf document using matplotlib.backends.backend_pdf.PdfPages.
The text can be set using the text() command.
Here is a basic example on how that would work.
import matplotlib.pyplot as plt
from matplotlib.backends.backend_pdf import PdfPages
import numpy as np
N = 10
text = "The maximum is {}, the minimum is {}"
# create N figures
for i in range(N):
    data = np.random.rand(28)
    fig = plt.figure(i + 1)
    ax = fig.add_subplot(111)
    ax.plot(np.arange(28), data)
    # add some text to the figure
    ax.text(0.1, 1.05, text.format(data.max(), data.min()), transform=ax.transAxes)
# Saving all figures to the same pdf
pp = PdfPages('multipage.pdf')
for i in range(N):
    fig = plt.figure(i + 1)
    fig.savefig(pp, format='pdf')
pp.close()
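
Applied to the question's use case, the same idea might look like the sketch below: one page per test, with the boilerplate sentence filled in from the results and the plot placed underneath. The names TEMPLATE, results and write_report, the sample values, and the page layout are all assumptions made for illustration; only PdfPages, fig.text() and the usual plotting calls come from matplotlib.

```python
import os
import tempfile

import matplotlib
matplotlib.use("Agg")  # render off-screen so no display is needed
import matplotlib.pyplot as plt
from matplotlib.backends.backend_pdf import PdfPages

# Boilerplate sentence with blanks to fill in from each test's results.
TEMPLATE = ("Test was run for {minutes} minutes. "
            "Maximum power was recorded as {max_dbm} dBm.")

# Hypothetical results; a real script would collect these from the tests.
results = [
    {"name": "Test A", "minutes": 7, "max_dbm": -69.5, "passed": True,
     "samples": [-80.0, -72.5, -69.5, -75.0]},
    {"name": "Test B", "minutes": 3, "max_dbm": -71.2, "passed": False,
     "samples": [-90.0, -71.2, -85.0]},
]


def write_report(path, results):
    """Write one PDF page per test and return the number of pages."""
    with PdfPages(path) as pdf:
        for r in results:
            fig, ax = plt.subplots(figsize=(8.27, 11.69))  # A4 portrait, inches
            fig.subplots_adjust(top=0.55)  # keep the upper part free for text
            verdict = "PASS" if r["passed"] else "FAIL"
            fig.text(0.1, 0.90, "{}: {}".format(r["name"], verdict), fontsize=16)
            fig.text(0.1, 0.85, TEMPLATE.format(**r))  # extra keys are ignored
            ax.plot(r["samples"], marker="o")
            ax.set_xlabel("sample")
            ax.set_ylabel("power (dBm)")
            pdf.savefig(fig)
            plt.close(fig)
        return pdf.get_pagecount()


report_path = os.path.join(tempfile.gettempdir(), "test_report.pdf")
page_count = write_report(report_path, results)
```

Using PdfPages as a context manager closes the file even if one of the tests raises while its page is being drawn.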

Related

How to get a plot out of LuxPy package

I'd like to plot the color-rendering information for a given spectrum using the lx.cri.plot_cri_graphics(SPD) function of the LuxPy package.
There is the following example code on page 17 of this official tutorial which passes a predefined spectrum (SPD) to the mentioned function.
"The LuxPy.cri subpackage also supports a function that
[...] provides TM30-like graphical output. For example, the code below generates the output in Fig. 7"
import luxpy as lx
SPD = lx._CIE_ILLUMINANTS['F4']
data,_,_ = lx.cri.plot_cri_graphics(SPD)
[Figure 7: TM30-like graphical output; image not included]
In my case, the program just finishes without plot, errors or terminal output.
The data object in the above example contains a bunch of data and is of type dict, but no plot appears and the script just finishes. Many of the package's classes have dedicated .plot() member functions, which, as I understand it, should all work the same way. But I do not get any plots popping up for any of those either (like I'm used to from working with Matplotlib).
Is there anything I need to do besides calling the .plot_cri_graphics() function? Something that may be self-explanatory for someone more experienced?
Do I have to pass data to another plot function to actually get a plot output?

The official tutorial uses IPython with inline plotting (see section 5.1: %matplotlib inline, see Plotting for details). If you run the example as a plain Python file, you'll need to import matplotlib.pyplot as plt and call plt.show() at the end to actually show the plot.

How can I test if matplotlib has produced the correct plot?

I would like to write tests for code that produces matplotlib plots. Ideally the tests would be able to decide if the output was appropriate without my input.
I have decoupled the data setup into easily testable functions, but I'm unsure how I could decouple the plotting or test the outcome without visual inspection. Is this something anyone has dealt with before?
Is there an established practice for testing in situations like this?
Ideally I would like something like this:
fig, ax = plt.subplots()
ax.plot(x_data, y_data)
ax.set_xlabel('x')
ax.set_ylabel('y')
assertTrue(fig == expected_fig)

Yes, the Matplotlib developers have dealt with that. And the practice they've established is this:
Write a test that produces a plot figure.
Save that figure as an image file in a temporary folder.
Assert that output image and corresponding "baseline" image are the same.
The test will fail the first time it is run. You then inspect the image, that one time, and use it as the reference for future tests simply by copying the file to the folder holding baseline images.
You may be able to re-use the Matplotlib test fixtures (source code, API documentation) and look at the Matplotlib tests to see how they are used in practice. Essentially, the comparison mechanism loads both image files via PIL, converts them to NumPy arrays, and tests that the two arrays are equal. Though there is also a way to specify a tolerance and allow minor deviations.
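
The steps above can be sketched with matplotlib.testing.compare.compare_images, which returns None when two image files match within the tolerance and a message string otherwise. This is a minimal illustration rather than the way Matplotlib's own test suite is organised: the make_plot helper, the file names and the temporary folder are assumptions, and the "baseline" is generated on the spot instead of being a reviewed reference image.

```python
import os
import tempfile

import matplotlib
matplotlib.use("Agg")  # render off-screen so the test needs no display
import matplotlib.pyplot as plt
from matplotlib.testing.compare import compare_images


def make_plot(path, slope=1.0):
    """Hypothetical plotting code under test: draw a line and save it."""
    fig, ax = plt.subplots()
    xs = [0, 1, 2, 3]
    ax.plot(xs, [slope * x for x in xs])
    ax.set_xlabel("x")
    ax.set_ylabel("y")
    fig.savefig(path)
    plt.close(fig)


tmpdir = tempfile.mkdtemp()
baseline = os.path.join(tmpdir, "baseline.png")  # stands in for the stored reference
same = os.path.join(tmpdir, "same.png")
changed = os.path.join(tmpdir, "changed.png")

make_plot(baseline)
make_plot(same)
make_plot(changed, slope=-1.0)  # deliberately different output

# None means "equal within tol"; otherwise a human-readable message is returned.
result_same = compare_images(baseline, same, tol=0)
result_changed = compare_images(baseline, changed, tol=0)
```

In a real test suite the baseline file would live in version control, and a small non-zero tol can absorb minor rendering differences between machines.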

How to use Python Seaborn Visualizations in PowerPoint?

I created some figures with Seaborn in a Jupyter Notebook. I would now like to present those figures in a PowerPoint presentation.
I know that it is possible to export the figures as png and include them in the presentation. But then they would be static, and if something changes in the dataframe, the picture would be the same. Is there an option to have a dynamic figure in PowerPoint? Something like a small Jupyter Notebook you could Display in the slides?

You could try Anaconda Fusion (also the video here), which lets you use Python inside of Excel. This could possibly work since you can link figures/data elements between Excel and PowerPoint (but special restrictions might apply when the figure is created via Python rather than standard Excel). Anaconda Fusion is free to try for a couple of months.
Another solution would be to use the Jupyter Notebook to create your presentation instead of PowerPoint. Go to View -> Cell Toolbar -> Slideshow and you can choose which code cells should become slides.
A third approach would be to create an animation of the figure as the data frame changes and then include the animation (GIF or video) in PowerPoint.

The following procedure probably isn't the most elegant solution, but it lets you produce a Seaborn plot, store it as an image file, and export the same image to an open PowerPoint presentation. Depending on whether you set LinkToFile to True or False, the images will or will not update when the source file changes. I've been experimenting with this using cells in Spyder, but it should work in a Jupyter notebook as well. Make sure that you have a folder named c:\pptSeaborn\.
Here it is:
# Some imports
import os
import numpy as np
import seaborn as sns
import matplotlib.pyplot as plt
import win32com.client  # part of pywin32, Windows only
os.chdir('C:/pptSeaborn')
# Settings for some random data
mu = 0
sigma = 1
simulation = np.random.normal(mu, sigma, 10)
# Make seaborn plot from simulated data. Save as image file.
def SeabornPlot(data, filename='c:\\pptSeaborn\\snsPlot.png'):
    # fill=True replaces the deprecated shade=True of older seaborn versions
    ax = sns.kdeplot(data, fill=True)
    fig = ax.get_figure()
    fig.savefig(filename, bbox_inches='tight', dpi=440)
    plt.close(fig)
# Import image file to active powerpoint presentation
def SeabornPPT(plotSource, linkImage):
    Application = win32com.client.Dispatch("PowerPoint.Application")
    Presentation = Application.Activepresentation
    # Append a new slide at the end; layout 12 is a blank slide
    slidenr = Presentation.Slides.Count + 1
    Base = Presentation.Slides.Add(slidenr, 12)
    gph = Base.Shapes.AddPicture(FileName=plotSource,
                                 LinkToFile=linkImage, SaveWithDocument=True,
                                 Left=50, Top=25, Width=800, Height=500)
    Presentation.slides(slidenr).select()
# Produce data, save plot as image, and export image to powerpoint
SeabornPlot(data=simulation)
SeabornPPT(plotSource='c:\\pptSeaborn\\snsPlot.png', linkImage=False)
Now, if you have an open PowerPoint presentation and run this whole thing five times, you will get five new slides, each showing a different plot.
If you go ahead and save this somewhere and reopen it, it will still look the same.
Now you can set linkImage = True, and run the whole thing five times again. Depending on the random data generated, you will still get five slides with different graphs.
But NOW, if you save the presentation and reopen it, all plots will look the same because they're linked to the same image file.
The next step could be to wrap the whole thing into a function that takes filename and LinkToFile as arguments. You could also add an option for whether or not the procedure makes a new slide each time an image is exported. I hope you find my suggestion useful. I liked your question, and I'm hoping to see a few other suggestions as well.

We now went with this approach:
You can save the figures as a .png file and insert this into PowerPoint. There is an option when inserting it so that the picture is updated every time you open PowerPoint, retrieving a new version of the file from the folder I saved it to. So when I make changes in Seaborn, a new version of the file is automatically saved as a picture, which is then updated in PowerPoint.

Efficient visualisation in Python

I have data (generated by an algorithm I wrote for it) for a random process which consists of coalescing and branching random walks on a finite space that I would like to visualize using python and probably something from matplotlib.
The data look like this:
A list of lists of the states of the process at the times when something changes (a walk moves to an empty spot, coalesces with another one, or a new particle is born). Say the process lives on {0,1,2,3,4}; the list then looks something like this:
[[0,1,2,0,2],...,[1,0,2,2,0]], so at the beginning the process has particles at positions 1, 2 and 4 (there are two different kinds of particles, so "1" indicates the presence of the first type and "2" of the second, while "0" means nothing is there).
And I have also the list of events that alter the process, so a list of lists of the form
[place,time,type]
so I know what happens where and at what time (which corresponds to writing appropriate marks in the graphical representation, for example an arrow to the left if the event was that a particle moved to the left).
I wrote something like this :
import pylab as P
P.plot(-spacebound,0,spacebound,maxtime)
while something in the process:
    current=listofevents.pop(0)
    for i that are nonempty at current time:
        P.arrow() in a way corresponding to the data
P.show()
This works, but it is extremely slow, so that for a big process it takes an enormous amount of time to make this visualization (while generating the process data takes a few seconds at most for rather extreme parameters: a big space, a long time and a high rate of particle births, which means a lot of events changing the process often).
I am pretty sure using arrows like this is pretty idiotic, but since I've only visualized things in R so far (I could of course simply export my data from python and visualize them in R but I want to avoid that) I am also very green at doing this in Python.
I tried some googling, found out about matplotlib and looked at some tutorials there and apart from the arrows I also tried just visualizing the states of the process (without the events) by looping plt.scatter() over all the states, but while this is slightly faster, it is still extremely slow and it also looks messy.
So how would I plot this in a sensible way? Even a link to something like "learn to do plotting in Python properly" is welcome as an answer. Thanks!

matplotlib is not primarily meant for interactive plotting. It is used for generating article-quality plots. For interactive plots you could try Chaco or other libraries. The Chaco approach is to create a plot and link it to the data, so that the chart updates automatically when you update the data.
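
Whichever library is used, creating one artist per arrow or per state inside a Python loop tends to be slow, so it may help to draw everything with a few vectorised calls. The sketch below is only an illustration in matplotlib under stated assumptions: the states fit in a 2-D NumPy array with one row per event, and the events are moves encoded as (site, event index, direction) rows, which is a hypothetical simplification of the [place, time, type] format in the question.

```python
import os
import tempfile

import matplotlib
matplotlib.use("Agg")  # render off-screen so no display is needed
import matplotlib.pyplot as plt
import numpy as np

# Hypothetical process history: one row per event, one column per site.
# 0 = empty, 1 = particle of the first type, 2 = particle of the second type.
states = np.array([
    [0, 1, 2, 0, 2],
    [1, 0, 2, 0, 2],
    [1, 0, 2, 2, 0],
])

# Hypothetical moves as (site, event index, direction), direction -1 = left.
moves = np.array([
    [1, 0, -1],
    [4, 1, -1],
])

fig, ax = plt.subplots()

# A single imshow call draws every state of the process at once.
image = ax.imshow(states, origin="lower", aspect="auto",
                  interpolation="nearest", cmap="viridis")

# A single quiver call draws all arrows instead of one arrow() per event.
arrows = ax.quiver(moves[:, 0], moves[:, 1],
                   moves[:, 2], np.zeros(len(moves)),
                   angles="xy", scale_units="xy", scale=1, color="red")

ax.set_xlabel("site")
ax.set_ylabel("event index")

out_path = os.path.join(tempfile.gettempdir(), "process.png")
fig.savefig(out_path)
plt.close(fig)
```

The vertical axis here is the event index rather than the actual event time; plotting against real times would need a different call such as pcolormesh with explicit row boundaries.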

Short guide how to use gnuplot with python?

I'm trying to draw a graph in Python using Gnuplot. I'm having a hard time finding any guides or tutorials on how to get started.
What I'm wondering: what files/programs are necessary (I'm using Ubuntu), and where do I begin?
If anyone could recommend a good tutorial, that would be much appreciated!
Thank you!

You could try gnuplot.py. It is an interface to gnuplot that I have used in the past.
The website gives some guidance, and there are some example scripts in the distribution.
In fact, it is very easy to run gnuplot directly from Python. The gnuplot.py source code will give you valuable hints. See also here and here for other alternatives.
As others recommend, the alternative is to use matplotlib. Matplotlib is great and I use it as my main visualization library. The downside is that it can become slow when working with a large amount of data. In that case gnuplot is a good option.

Your approach depends on what you already have and what you want to work with. To plot a graph with gnuplot you need two things:
A gnuplot script that describes what the resulting plot should look like (title, axis description, legend...)
A data file, which holds the data you want to plot
If you already have, let's say, the gnuplot script file and you simply want to write new data files using Python, then this approach is sound in my opinion. Simply export the data in the format you used in your data files before and run gnuplot from within Python with something like
import subprocess
# Run gnuplot on the script and wait for it to finish;
# check=True raises CalledProcessError if gnuplot exits with an error.
subprocess.run(["gnuplot", "<scriptname>"], check=True)
Don't forget that you may have to change the path to the data file in your gnuplot script if you write out new data files. So something like this:
plot "<path>" ...
If you don't yet have a gnuplot script you want to use you can definitely write one and use that from this point on, but using python there are also other alternatives.
You could take a look at matplotlib, a plotting library whose plot command works much like Matlab's. It is very well documented and there are lots of tutorials and examples online you can learn from and work with.
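
The data-file-plus-script workflow described in this answer can be sketched end to end as follows. The file names, the sample points and the choice of the svg terminal are assumptions for illustration; the script is only run if a gnuplot executable is found on the PATH.

```python
import os
import shutil
import subprocess
import tempfile

workdir = tempfile.mkdtemp()
data_path = os.path.join(workdir, "data.dat")
script_path = os.path.join(workdir, "plot.gp")
output_path = os.path.join(workdir, "plot.svg")

# 1. Write the data file: one "x y" pair per line.
points = [(0, 0.0), (1, 1.0), (2, 4.0), (3, 9.0)]
with open(data_path, "w") as f:
    for x, y in points:
        f.write("{} {}\n".format(x, y))

# 2. Write the gnuplot script that describes the plot.
#    Single-quoted gnuplot strings are used for paths so that
#    backslashes are taken literally.
script_lines = [
    "set terminal svg",
    "set output '{}'".format(output_path),
    "set title 'Example'",
    "set xlabel 'x'",
    "set ylabel 'y'",
    "plot '{}' using 1:2 with linespoints title 'data'".format(data_path),
]
with open(script_path, "w") as f:
    f.write("\n".join(script_lines) + "\n")

# 3. Run gnuplot on the script, if it is installed.
gnuplot = shutil.which("gnuplot")
if gnuplot is not None:
    completed = subprocess.run([gnuplot, script_path])
    print("gnuplot exit code:", completed.returncode)
else:
    print("gnuplot not found; wrote", script_path, "and", data_path)
```

On Ubuntu, gnuplot itself is usually installed from the distribution's package repository before a script like this can produce the output file.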

As a gnuplot fan, I use this gnuplot wrapper https://github.com/mzechmeister/python/wiki/gplot.py.
Here is a demo snippet
from gplot import *
gplot.term('wxt')
gplot.title('"gplot.py"').grid()
gplot.xlabel('"time"')
gplot([1,2,0,4,3.5], 'w l, sin(x), "<seq 10" us 1:(cos($1))')

About 10 years later, let me draw attention to autogpy (Autognuplotpy).
Autogpy aims at a full generation of gnuplot scripts (and suitably dumped data) from python.
For instance, the python code
import autogpy
import numpy as np
xx = np.linspace(0,6,100)
yy = np.sin(xx)
zz = np.cos(xx)
with autogpy.AutogpyFigure("test_figure") as figure:
    # gnuplot-like syntax
    figure.plot(r'with lines t "sin"',xx,yy)
    # matplotlib-like syntax
    figure.plot(xx,zz,u='1:2',w='lines',label='cos')
generates the gnuplot script
set terminal epslatex size 9.9cm,8.cm color colortext standalone 'phv,12 ' linewidth 2
set output 'fig.latex.nice/plot_out.tex'
p "fig__0__.dat" with lines t "sin",\
"fig__1__.dat" u 1:2 with lines t "cos"
and dumps readable data.
It supports the LaTeX, TikZ and PNG terminals, and can easily be extended to more.
Disclaimer: I am the author.
