So I have a module that is supposed to read stuff from a couple of csv files, do some math and create plots out of it. I call the function that does this from inside a jupyter notebook.
The structure is roughly like this
Module:
def fun(file):
data = math(file)
plt.plot(data)
plt.show()
def multi_fun(files):
with Pool(cpu_count(), maxtasksperchild=1) as pool:
m = pool.imap_unordered(fun, files)
for res in m:
pass
And in the notebook multifun([file1, file2])
Done like this the plots don't pop up in the notebook. However if I change it to
def fun(file):
data = math(file)
plt.plot(data)
print("")
plt.show()
The plots to appear.
Adding the print statement to multi_fun or the notebook does not work. Adding print('', end='') also does not show the plots, however print(u'\u200c', end='') does.
I don't really need this to work (was more for debugging), but I'm super curious how this happens. Is there some magic going on with print forcing some joining of the threads?
Related
I am using Bokeh to print out a multiple sequence alignment. So far, it has worked perfectly. For example:
[*Previous code that was not included*]
aln = AlignIO.read(alignment_file,'fasta')
p = view_alignment(aln, plot_width=900)
pn.pane.Bokeh(p)
However, I am trying to introduce an if-statement as it follows, it simply does not print anything:
if result == 1:
print('The gen you selected does not have any ortholog.')
else:
aln = AlignIO.read(alignment_file,'fasta')
p = view_alignment(aln, plot_width=900)
pn.pane.Bokeh(p)
I have introduced a print('Check if it works') inside the else part of the if-statement and it does print it. But the bokeh panel doesn't get printed.
I am using JupyterLab to try these things version 3.2.1. Not entirely sure what's happening. The reason why I am including this if-statement is to filter the gen that do not have any orthologs. Otherwise, it would print simply one sequence and it does not make sense for what I am trying to achieve here.
Any suggestion?
I found the solution to my problem.
I imported show from bokeh.io.
from bokeh.io import show
Finally simply call the graph that it was already form as:
show(p)
Everything is working as it should now!
I need to export multiple plots with Python and Matplotlib but I have issues with the memory management.
I'm using Jupyter Notebook in Windows and the code is structured as shown in the working example here below. Basically, I'm generating a daily plot over a period of several years.
Following the memory usage I can see that usually for the first 4 "months" the memory usage is fairly stable but then it increases more than 15MB for each "day". With the complete code this leads to a memory error.
I tried to use plt.close() as suggested here. I also tried plt.close(fig), plt.close('all') and to manually delete some variables with del. Finally I've put gc.collect in the loop. Nothing seems to have any effect.
import numpy as np
import matplotlib.pyplot as plt
import gc
sz = 1000
i=0
for year in [2014,2015,2016,2017]:
for month in [1,2,3,4,5,9,10,11,12]:
for day in range(1,30):
# Plot
fig, ((ax1,ax2),(ax3,ax4)) = plt.subplots(2,2,figsize=(14,10))
x1 = np.arange(1,sz+1,1)
y1 = np.random.random(sz)
ax1.scatter(x1,y1)
ax2.scatter(x1,y1)
ax3.scatter(x1,y1)
ax4.scatter(x1,y1)
filename = 'test/test_output{}{}{}.png'.format(year,month,day)
plt.tight_layout()
plt.savefig(filename, bbox_inches='tight')
#plt.close(fig)
#plt.close('all')
plt.close()
del x1, y1, ax1, ax2, ax3, ax4, fig
# counter
i=i+1
# Monthly operations
print(filename,i)
gc.collect()
# Triggered outside the loops
gc.collect()
On the other hand, if I interrupt the code (or I wait for the "Memory error" to occur) and then I run manually gc.collect, the memory is cleaned right away.
Why gc.collect does not have any effect in the nested loops?
Is there a way to improve the memory management for this code?
Introduction
As I am coming from matlab, I am used to an interactive interface where a script can update figures while it is running. During the processing each figure can be re-sized or even closed. This probably means that each figure is running in its own thread which is obviously not the case with matplotlib.
IPython can imitate the Matlab behavior using the magic command %pylab or %matplotlib which does something that I don't understand yet and which is the very point of my question.
My goal is then to allow standalone Python scripts to work as Matlab does (or as IPython with %matplotlib does). In other words, I would like this script to be executed from the command line. I am expecting a new figure that pop-up every 3 seconds. During the execution I would be able to zoom, resize or even close the figure.
#!/usr/bin/python
import matplotlib.pyplot as plt
import time
def do_some_work():
time.sleep(3)
for i in range(10):
plt.plot([1,2,3,4])
plt.show() # this is way too boilerplate, I'd like to avoid it too.
do_some_work()
What alternative to %matplotlib I can use to manipulate figures while a script is running in Python (not IPython)?
What solutions I've already investigated?
I currently found 3 way to get a plot show.
1. %pylab / %matplotlib
As tom said, the use of %pylab should be avoided to prevent the namespace to be polluted.
>>> %pylab
>>> plot([1,2,3,4])
This solution is sweet, the plot is non-blocking, there is no need for an additionnal show(), I can still add a grid with grid() afterwards and I can close, resize or zoom on my figure with no additional issues.
Unfortunately the %matplotlib command is only available on IPython.
2. from pylab import * or from matplotlib.pyplot import plt
>>> from pylab import *
>>> plot([1,2,3,4])
Things are quite different here. I need to add the command show() to display my figure which is blocking. I cannot do anything but closing the figure to execute the next command such as grid() which will have no effect since the figure is now closed...
** 3. from pylab import * or from matplotlib.pyplot import plt + ion()**
Some suggestions recommend to use the ion() command as follow:
>>> from pylab import *
>>> ion()
>>> plot([1,2,3,4])
>>> draw()
>>> pause(0.0001)
Unfortunately, even if the plot shows, I cannot close the figure manually. I will need to execute close() on the terminal which is not very convenient. Moreover the need for two additional commands such as draw(); pause(0.0001) is not what I am expecting.
Summary
With %pylab, everything is wonderful, but I cannot use it outside of IPython
With from pylab import * followed by a plot, I get a blocking behavior and all the power of IPython is wasted.
from pylab import * followed by ion offers a nice alternative to the previous one, but I have to use the weird pause(0.0001) command that leads to a window that I cannot close manually (I know that the pause is not needed with some backends. I am using WxAgg which is the only one that works well on Cygwin x64.
This question advices to use matplotlib.interactive(True). Unfortunately it does not work and gives the same behavior as ion() does.
Change your do_some_work function to the following and it should work.
def do_some_work():
plt.pause(3)
For interactive backends plt.pause(3) starts the event loop for 3 seconds so that it can process your resize events. Note that the documentation says that it is an experimental function and that for complex animations you should use the animation module.
The, %pylab and %matplotlib magic commands also start an event loop, which is why user interaction with the plots is possible. Alternatively, you can start the event loop with %gui wx, and turn it off with %gui. You can use the IPython.lib.guisupport.is_event_loop_running_wx() function to test if it is running.
The reason for using ion() or ioff() is very well explained in the 'What is interactive mode' page. In principle, user interaction is possible without IPython. However, I could not get the interactive-example from that page to work with the Qt4Agg backend, only with the MacOSX backend (on my Mac). I didn't try with the WX backend.
Edit
I did manage to get the interactive-example to work with the Qt4Agg backend by using PyQt4 instead of PySide (so by setting backend.qt4 : PyQt4 in my ~/.config/matplotlibrc file). I think the example doesn't work with all backends. I submitted an issue here.
Edit 2
I'm afraid I can't think of a way of manipulating the figure while a long calculation is running, without using threads. As you mentioned: Matplotlib doesn't start a thread, and neither does IPython. The %pylab and %matplotlib commands alternate between processing commands from the read-eval-print loop and letting the GUI processing events for a short time. They do this sequentially.
In fact, I'm unable to reproduce your behavior, even with the %matplotlib or %pylab magic. (Just to be clear: in ipython I first call %matplotlib and then %run yourscript.py). The %matplotlib magic puts Matplotlib in interactive-mode, which makes the plt.show() call non-blocking so that the do_some_work function is executed immediately. However, during the time.sleep(3) call, the figure is unresponsive (this becomes even more apparent if I increase the sleeping period). I don't understand how this can work at your end.
Unless I'm wrong you'll have to break up your calculation in smaller parts and use plt.pause (or even better, the animation module) to update the figures.
My advice would be to keep using IPython, since it manages the GUI event loop for you (that's what pylab/pylot does).
I tried interactive plotting in a normal interpreter and it worked the way it is expected, even without calling ion() (Debian unstable, Python 3.4.3+, Matplotlib 1.4.2-3.1). If I recall it right, it's a fairly new feature in Matplotlib.
Alternatively, you can also use Matplotlib's animation capabilities to update a plot periodically:
import numpy as np
import matplotlib.pyplot as plt
import matplotlib.animation as animation
import time
plt.ion()
tt = np.linspace(0, 1, 200)
freq = 1 # parameter for sine
t0 = time.time() # for measuring ellapsed time
fig, ax = plt.subplots()
def draw_func(i):
""" This function gets called repeated times """
global freq # needed because freq is changed in this function
xx = np.sin(2*np.pi*freq*tt)/freq
ax.set_title("Passed Time: %.1f s, " % (time.time()-t0) +
"Parameter i=%d" % i)
ax.plot(tt, xx, label="$f=%d$ Hz" % freq)
ax.legend()
freq += 1
# call draw_func every 3 seconds 1 + 4 times (first time is initialization):
ani = animation.FuncAnimation(fig, draw_func, np.arange(4), interval=3000,
repeat=False)
# plt.show()
Checkout matplotlib.animation.FuncAnimation for details. You'll find further examples in the examples section.
I am running a python code that simulates and predicts conditions inside a rapid compression machine (such as temperature, pressure, chemical species, etc.). The code is set up so that it calculates the time evolution of all these parameters (which are saved in numpy arrays) and then outputs them to a function, which creates and saves plots of these parameters against time.
These arrays are very large, and the function creates about 14 plots using matplotlib. The function will get to about the 7th plot (sometimes more, sometimes less), when I get an error reading "python.exe has stopped working". Not sure if it's a memory issue because the plots are so big or what's going on.
I have tried both (as you'll see in my sample code) plt.figure.clf() and plt.close(). I've even tried doing a time.sleep inbetween plots. The following is an example snippet of my function that does the plotting (not the real code, just an example necessary to illustrate my problem).
I am using Windows, Python 2.7, and Matplotlib 1.4.2
def graph_parameters(time, a, b, c, d, e): #a,b,c,d,e are my parameters
delay = 10
plt.figure(1)
plt.figure(facecolor="white")
plt.plot(time, a)
plt.ylabel('a')
plt.xlabel('Time(s)')
plt.savefig('fig1.png')
plt.figure(1).clf()
plt.close('all')
time.sleep(delay)
plt.figure(2)
plt.figure(facecolor="white")
plt.plot(time, b)
plt.ylabel('b')
plt.xlabel('Time(s)')
plt.savefig('fig2.png')
plt.figure(2).clf()
plt.close('all')
time.sleep(delay)
etc.
Thanks in advance.
I'm not completely sure, but I think it was a memory issue. I found a way (from another stackoverflow entry) to delete variables I don't need. Per my example above, if I only want to keep 'time_' and parameters 'a_','b_','c_','d_' and 'e_', I run the following at the end of my main code:
graph_array = np.array([time_, a_, b_, c_, d_, e_])
#Delete variables that aren't going to be plotted in order to free up memory
attributes = sys.modules[__name__]
for name in dir():
if name[0]!='_' and np.all( name!= np.array(['graph_array', 'attributes','np', 'gc', 'graph_parameters']) ):
delattr(attributes, name)
gc.collect()
graph_parameters(*graph_array)
Basically what the for loop does is keep private and magical function names and names of specified variables, and deletes all other names. I got this idea from the answer to the following link. How do I clear all variables in the middle of a Python script?
I watched my task manager as the code ran, and there was a significant drop in memory usage (~1,600,000 K to ~100,000 K) by Python right before plots began being saved, indicating that this method did free up memory. Also, no crash.
This question already has answers here:
pylab.ion() in python 2, matplotlib 1.1.1 and updating of the plot while the program runs
(2 answers)
Closed 8 years ago.
There are a lot of questions about matplotlib, pylab, pyplot, ipython, so I'm sorry if you're sick of seeing this asked. I'll try to be as specific as I can, because I've been looking through people's questions and looking at documentation for pyplot and pylab, and I still am not sure what I'm doing wrong. On with the code:
Goal: plot a figure every .5 seconds, and update the figure as soon as the plot command is called.
My attempt at coding this follows (running on ipython -pylab):
import time
ion()
x=linspace(-1,1,51)
plot(sin(x))
for i in range(10):
plot([sin(i+j) for j in x])
#see **
print i
time.sleep(1)
print 'Done'
It correctly plots each line, but not until it has exited the for loop. I have tried forcing a redraw by putting draw() where ** is, but that doesn't seem to work either. Ideally, I'd like to have it simply add each line, instead of doing a full redraw. If redrawing is required however, that's fine.
Additional attempts at solving:
just after ion(), tried adding hold(True) to no avail.
for kicks tried show() for **
The closest answer I've found to what I'm trying to do was at plotting lines without blocking execution, but show() isn't doing anything.
I apologize if this is a straightforward request, and I'm looking past something so obvious. For what it's worth, this came up while I was trying to convert matlab code from class to some python for my own use. The original matlab (initializations removed) which I have been trying to convert follows:
for i=1:time
plot(u)
hold on
pause(.01)
for j=2:n-1
v(j)=u(j)-2*u(j-1)
end
v(1)= pi
u=v
end
Any help, even if it's just "look up this_method" would be excellent, so I can at least narrow my efforts to figuring out how to use that method. If there's any more information that would be useful, let me know.
from pylab import *
import time
ion()
tstart = time.time() # for profiling
x = arange(0,2*pi,0.01) # x-array
line, = plot(x,sin(x))
for i in arange(1,200):
line.set_ydata(sin(x+i/10.0)) # update the data
draw() # redraw the canvas
pause(0.01)
print 'FPS:' , 200/(time.time()-tstart)
ioff()
show()
########################
The above worked for me nicely. I ran it in spyder editor in pythonxy2.7.3 under win7 OS.
Note the pause() statement following draw() followed by ioff() and show().
The second answer to the question you linked provides the answer: call draw() after every plot() to make it appear immediately; for example:
import time
ion()
x = linspace(-1,1,51)
plot(sin(x))
for i in range(10):
plot([sin(i+j) for j in x])
# make it appear immediately
draw()
time.sleep(1)
If that doesn't work... try what they do on this page: http://www.scipy.org/Cookbook/Matplotlib/Animations
import time
ion()
tstart = time.time() # for profiling
x = arange(0,2*pi,0.01) # x-array
line, = plot(x,sin(x))
for i in arange(1,200):
line.set_ydata(sin(x+i/10.0)) # update the data
draw() # redraw the canvas
print 'FPS:' , 200/(time.time()-tstart)
The page mentions that the line.set_ydata() function is the key part.
had the exact same problem with ipython running on my mac. (Enthought Distribution of python 2.7 32bit on Macbook pro running snow leopard).
Got a tip from a friend at work. Run ipython from the terminal with the following arguments:
ipython -wthread -pylab
This works for me. The above python code from "Daniel G" runs without incident, whereas previously it didn't update the plot.
According to the ipython documentation:
[-gthread, -qthread, -q4thread, -wthread, -pylab:...] They provide
threading support for the GTK, Qt (versions 3 and 4) and WXPython
toolkits, and for the matplotlib library.
I don't know why that is important, but it works.
hope that is helpful,
labjunky
Please see the answer I posted here to a similar question. I could generate your animation without problems using only the GTKAgg backend. To do this, you need to add this line to your script (I think before importing pylab):
matplotlib.use('GTkAgg')
and also install PyGTK.