I am running Python code that simulates and predicts conditions inside a rapid compression machine (such as temperature, pressure, chemical species, etc.). The code calculates the time evolution of all these parameters (which are saved in numpy arrays) and then passes them to a function, which creates and saves plots of these parameters against time.
These arrays are very large, and the function creates about 14 plots using matplotlib. The function gets to about the 7th plot (sometimes more, sometimes less) before I get an error reading "python.exe has stopped working". I'm not sure whether it's a memory issue because the plots are so big, or something else.
I have tried both (as you'll see in my sample code) clearing the figure with clf() and plt.close(). I've even tried doing a time.sleep in between plots. The following is an example snippet of my plotting function (not the real code, just an example necessary to illustrate my problem).
I am using Windows, Python 2.7, and Matplotlib 1.4.2
import time
import matplotlib.pyplot as plt

def graph_parameters(t, a, b, c, d, e):  # a, b, c, d, e are my parameters
    # (the time argument is named 't' here so it doesn't shadow the time module)
    delay = 10

    fig = plt.figure(1, facecolor="white")  # one call; a bare plt.figure() after
                                            # plt.figure(1) would open a second figure
    plt.plot(t, a)
    plt.ylabel('a')
    plt.xlabel('Time(s)')
    plt.savefig('fig1.png')
    fig.clf()
    plt.close('all')
    time.sleep(delay)

    fig = plt.figure(2, facecolor="white")
    plt.plot(t, b)
    plt.ylabel('b')
    plt.xlabel('Time(s)')
    plt.savefig('fig2.png')
    fig.clf()
    plt.close('all')
    time.sleep(delay)

    # ... and so on for c, d and e
Thanks in advance.
I'm not completely sure, but I think it was a memory issue. I found a way (from another Stack Overflow entry) to delete variables I don't need. Per my example above, if I only want to keep 'time_' and the parameters 'a_', 'b_', 'c_', 'd_' and 'e_', I run the following at the end of my main code:
import sys
import gc
import numpy as np

graph_array = np.array([time_, a_, b_, c_, d_, e_])

# Delete variables that aren't going to be plotted in order to free up memory
attributes = sys.modules[__name__]
for name in dir():
    if name[0] != '_' and np.all(name != np.array(
            ['graph_array', 'attributes', 'np', 'gc', 'graph_parameters'])):
        delattr(attributes, name)
gc.collect()

graph_parameters(*graph_array)
Basically, what the for loop does is keep private and magic names plus the names of the specified variables, and delete all other names. I got this idea from the answer to the following question: How do I clear all variables in the middle of a Python script?
I watched my task manager as the code ran, and there was a significant drop in Python's memory usage (~1,600,000 K to ~100,000 K) right before the plots began being saved, indicating that this method did free up memory. Also, no crash.
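For what it's worth, the fourteen near-identical blocks in graph_parameters can also be collapsed into a loop that reuses one figure per plot and closes it immediately after saving. A minimal sketch (parameter names as in my example above):

import matplotlib.pyplot as plt

def graph_parameters(t, a, b, c, d, e):
    # one (label, data) pair per plot; extend the list as needed
    for i, (label, data) in enumerate(
            [('a', a), ('b', b), ('c', c), ('d', d), ('e', e)], start=1):
        fig = plt.figure(facecolor="white")
        plt.plot(t, data)
        plt.ylabel(label)
        plt.xlabel('Time(s)')
        plt.savefig('fig%d.png' % i)
        plt.close(fig)  # release the figure's memory before starting the next one

This way at most one figure is alive at a time, regardless of how many plots are made.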
I am trying to solve 2 coupled systems of equations, called here system A and system B. One of these 2 systems is an ODE system.
To avoid copying the data shared between the 2 systems, I would like to have a structure with pointers. For that, I use NumPy's view mechanism.
A bit of code:
import numpy as np
import scipy.integrate

def f(t, y, k):  # placeholder right-hand side; the real one is not shown here
    return -k * y

t0, t1, dt = 0.0, 5.0, 1.0

data = np.ones((5, 2))
data[:, 1] *= 2
y = np.array([0.0, 0.0])  # no matter, default value

r = scipy.integrate.ode(f)
r.set_integrator('dopri5', rtol=1e-3, atol=1e-6)
r.set_f_params(0.05)
#r.set_initial_value(y, t0); r._y = data[2]  # apparently equivalent
r.set_initial_value(data[2], t0)             # apparently equivalent

print(np.shares_memory(r.y, y))
print(np.shares_memory(r.y, data))
Here, in the initial state, I have synchronization between r.y (system A) and data[2] (the variable named data holds the data of system B). If I modify one, the other is also modified, and vice versa. Typing the command r.y.base confirms that r.y is just a view of the array named data. That is the behavior I want.
Now the problem starts. If I advance my ODE system:
while r.successful() and r.t < t1:
    r.integrate(r.t + dt, step=True)
    print(r.t + dt, r.y)
    print(np.shares_memory(r.y, data))
    print(data)
data and r.y are no longer synchronized; r.y is no longer a view of data.
It looks like the integrate function creates a new instance for its attribute r.y rather than updating it in place. I have read the source code of this function
https://github.com/scipy/scipy/blob/v0.19.1/scipy/integrate/_ode.py#L396
but it quickly drops into Fortran code, and my understanding stops there.
How can I solve (or get around) this problem other than by copying the data between r.y and data (which also implies managing the synchronization manually)?
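(For reference, the manual-copy workaround I would like to avoid looks roughly like this; the slice assignment writes r.y back into data in place after every step:)

while r.successful() and r.t < t1:
    r.integrate(r.t + dt, step=True)
    data[2][:] = r.y  # manual synchronization: copy the new values back in place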
Is it possible that this is a bug in scipy?
Thanks for your help.
What I am trying to do is have a script compute something, prepare a plot, and show the already-obtained results as a pylab figure, in Python 2 (specifically Python 2.7) with a stable matplotlib (which is 1.1.1).
In Python 3 (Python 3.2.3 with a matplotlib git build, version 1.2.x), this works fine. As a simple example (simulating a lengthy computation by time.sleep()), consider:
import pylab
import time
import random

dat = [0, 1]
pylab.plot(dat)
pylab.ion()
pylab.draw()
for i in range(18):
    dat.append(random.uniform(0, 1))
    pylab.plot(dat)
    pylab.draw()
    time.sleep(1)
In Python 2 (version 2.7.3 with matplotlib 1.1.1), the code runs cleanly without errors but does not show the figure. A little trial and error with the Python 2 interpreter seemed to suggest replacing pylab.draw() with pylab.show(); doing this once is apparently sufficient (not, as with draw, calling it after every change/addition to the plot). Hence:
import pylab
import time
import random

dat = [0, 1]
pylab.plot(dat)
pylab.ion()
pylab.show()
for i in range(18):
    dat.append(random.uniform(0, 1))
    pylab.plot(dat)
    #pylab.draw()
    time.sleep(1)
However, this doesn't work either. Again, it runs cleanly but does not show the figure. It seems to show it only when waiting for user input. It is not clear to me why this is, but the plot is finally shown when a raw_input() is added to the loop:
import pylab
import time
import random

dat = [0, 1]
pylab.plot(dat)
pylab.ion()
pylab.show()
for i in range(18):
    dat.append(random.uniform(0, 1))
    pylab.plot(dat)
    #pylab.draw()
    time.sleep(1)
    raw_input()
With this, the script will of course wait for user input while showing the plot, and will not continue computing the data before the user hits enter. That was not the intention.
This may be caused by the different versions of matplotlib (1.1.1 and 1.2.x) or by the different Python versions (2.7.3 and 3.2.3).
Is there any way to accomplish in Python 2 with a stable (1.1.1) matplotlib what the above script (the first one) does in Python 3 with matplotlib 1.2.x:
- computing data (which takes a while, in the above example simulated by time.sleep()) in a loop or iterated function and
- (while still computing) showing what has already been computed in previous iterations
- and not bothering the user to continually hit enter for the computation to continue
Thanks; I'd appreciate any help...
You want the pause function to give the GUI framework a chance to re-draw the screen:
import random
import matplotlib.pyplot as plt

dat = [0, 1]
fig = plt.figure()
ax = fig.add_subplot(111)
Ln, = ax.plot(dat)
ax.set_xlim([0, 20])
plt.ion()
plt.show()
for i in range(18):
    dat.append(random.uniform(0, 1))
    Ln.set_ydata(dat)
    Ln.set_xdata(range(len(dat)))
    plt.pause(1)
print 'done with loop'
You don't need to create a new Line2D object on every pass through the loop; you can just update the data in the existing one.
Documentation:
pause(interval)
Pause for *interval* seconds.
If there is an active figure it will be updated and displayed,
and the gui event loop will run during the pause.
If there is no active figure, or if a non-interactive backend
is in use, this executes time.sleep(interval).
This can be used for crude animation. For more complex
animation, see :mod:`matplotlib.animation`.
This function is experimental; its behavior may be changed
or extended in a future release.
A really overkill method is to use the matplotlib.animation module. On the flip side, it gives you a nice way to save the data if you want (ripped from my answer to Python- 1 second plots continous presentation).
example, api, tutorial
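For completeness, a minimal FuncAnimation sketch of the same update-the-line idea (assuming a GUI backend; it is equivalent to the pause loop above):

import random
import matplotlib.pyplot as plt
from matplotlib.animation import FuncAnimation

dat = [0, 1]
fig, ax = plt.subplots()
ln, = ax.plot(dat)
ax.set_xlim([0, 20])
ax.set_ylim([0, 1])

def update(frame):
    # append a new point and update the existing Line2D in place
    dat.append(random.uniform(0, 1))
    ln.set_data(range(len(dat)), dat)
    return ln,

ani = FuncAnimation(fig, update, frames=18, interval=1000, repeat=False)
plt.show()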
Some backends (in my experience "Qt4Agg") require the pause function, as #tcaswell suggested.
Other backends (in my experience "TkAgg") seem to just update on draw() without requiring a pause. So another solution is to switch your backend, for example with matplotlib.use('TkAgg').
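A minimal sketch of that backend switch; note that matplotlib.use() must be called before pyplot/pylab is imported for the first time:

import matplotlib
matplotlib.use('TkAgg')  # must come before the first pyplot/pylab import

import pylab
# ... plot and draw() as in the question; TkAgg redraws without pause()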
In Matlab, you can use drawnow to view a calculation's result while it's in progress. I have tried a similar syntax in Python, both with matplotlib and mayavi.
I know it's possible to animate in one dimension with ion and set_data. However, animating in two dimensions (via imshow) is also useful, and I couldn't find an easy way to do it.
I know it's possible to animate using a function call but that's not as useful for algorithm development (since you can't use IPython's %run and query your program).
In matplotlib, I can use
from pylab import arange, imshow, draw

N = 16
first_image = arange(N*N).reshape(N, N)
myobj = imshow(first_image)
for i in arange(N*N):
    first_image.flat[i] = 0
    myobj.set_data(first_image)
    draw()
to animate an image, but this script doesn't respond to <Ctrl-C> -- it hangs and disables future animations (on this machine). Despite this SO answer, different ways of invoking this animation process do not work. How do I view 2D data as it's being calculated?
EDIT: I have since made a package called python-drawnow to implement the following answer.
You just want to visualize the data from some complicated computation and not smoothly animate an image, correct? Then you can just define some simple functions:
import time
from pylab import close, draw, ion

def drawnow(draw_fig, wait_secs=1):
    """
    draw_fig : callable, no args (by use of Python's global scope); your
        function to draw the figure. It should include the figure() call,
        just like you'd normally do it. However, you must leave out the
        show().
    wait_secs : optional, how many seconds to wait. Note that if this is 0
        and your computation is fast, you don't really see the plot update.
        Does not work in ipy-qt; only works in the IPython shell.
    """
    close()
    draw_fig()
    draw()
    time.sleep(wait_secs)

def drawnow_init():
    ion()
An example of this:
from pylab import figure, imshow, linspace, meshgrid, arange

def draw_fig():
    figure()
    imshow(z, interpolation='nearest')
    #show()

N = 16
x = linspace(-1, 1, num=N)
x, y = meshgrid(x, x)
z = x**2 + y**2

drawnow_init()
for i in arange(2*N):
    z.flat[i] = 0
    drawnow(draw_fig)
Notice this requires that the variables you're plotting be global. That shouldn't be a problem, since it seems the variables you want to visualize are global.
This method responds fine to Ctrl-C and is visible even during fast computations (through wait_secs).
EDIT: SOLVED
But I'm not sure how... I moved as much as I could to pylab instead of pyplot, implemented a multiprocessing approach described [here][1], and at that point it was working fine. Then, in an attempt to pinpoint the issue, I reversed my changes step by step. But it kept working, even without the multiprocessing, even with pyplot, etc. Only when I take out fig.clf() does it now stop working, which is what most people seem to experience but was not my issue initially. Well, maybe it was; maybe the clf() statement wasn't in the right place or something. Thanks!
EDIT: ISSUE STILL NOT SOLVED
That's very surprising. I have now moved my savefig() call into an external module, which I import when executing my script. Here is the function:
def savethefig(fig, outpath, luname):
    plt.savefig(outpath + luname + 'testredtarget.png', dpi=500)
So now I do something like:
from myfile import savethefig

fig = plt.figure()
ax1 = fig.add_subplot(311)
pftmap = zeros(shape=(1800, 3600))*nan
for i in range(len(pftspatialnames)):
    pftmap[indlat, indlon] = data[i, :]
    fig = pylab.gcf()
    a1 = ax1.imshow(pftmap, interpolation='nearest', origin='upper', aspect='auto')
    savethefig(fig, outpath, luname)
I went through it step by step, line by line, and the RAM definitely goes up when hitting the savefig() call inside the external function. It goes up by about 500 MB. Then, when going back to the main script, that memory is not released.
Aren't external functions supposed to clean up after themselves? I must be missing something...
ORIGINAL POST:
I'm using Python EPD 7.3-2 (32 bit) (Python 2.7.3).
I have a program that does some computations, then maps some of the results and saves the images with matplotlib.
That's a pretty large amount of data, and if I try to map and save too many images, I hit a memory limit; which I shouldn't, since I re-use the same figure and don't create new variables. I've struggled with this for a while, tried all the figure clearing/deleting solutions, changed the backend used by matplotlib, but nothing does it: every time the code hits the savefig function, it adds a lot of memory and does not release it later.
I'm far from being an expert on memory (or on Python, for that matter), but here is one diagnostic attempt I ran, using the module objgraph:
from numpy import *
import gc
import objgraph
import matplotlib
matplotlib.use('Agg')
from matplotlib.figure import Figure
from matplotlib.backends.backend_agg import FigureCanvasAgg as FigureCanvas
from matplotlib import pyplot as plt
from matplotlib import pylab

outpath = '/Users/yannick/Documents/outputs/'
pftspatialnames = ['forest', 'shrubs', 'grass']

# Computations (not shown); they define data, indlat and indlon

fig = plt.figure()
for i in range(len(pftspatialnames)):
    pftmap = zeros(shape=(1800, 3600))*nan
    pftmap[indlat, indlon] = data[i, :]
    fig = pylab.gcf()
    ax1 = plt.subplot2grid((3, 1), (0, 0))
    a1 = ax1.imshow(pftmap, interpolation='nearest', origin='upper', aspect='auto')
    del(pftmap)
    gc.collect()
    print 'MEMORY LEAK DETECTOR before figsave'
    objgraph.show_growth()
    plt.savefig(outpath + pftspatialnames[i] + 'testredtarget.png', dpi=500)
    print 'MEMORY LEAK DETECTOR after figsave'
    objgraph.show_growth()
Pretty big maps (and there are 3 of them, 3 subplots; only one is shown here), but it handles them pretty well. It takes about 4 figures to saturate memory. Some of this may be superfluous (e.g. deleting pftmap), but I was trying everything to free some memory.
And here is the printed output:
MEMORY LEAK DETECTOR before figsave
dict 6640 +2931
weakref 3218 +1678
instance 1440 +1118
tuple 3501 +939
function 12486 +915
Bbox 229 +229
instancemethod 684 +171
Line2D 147 +147
TransformedBbox 115 +115
Path 127 +114
MEMORY LEAK DETECTOR after figsave
dict 7188 +548
Path 422 +295
weakref 3494 +276
instance 1679 +239
tuple 3703 +202
function 12670 +184
TransformedPath 87 +87
instancemethod 741 +57
Line2D 201 +54
FontProperties 147 +36
So before calling savefig there are a lot of new objects (that's normal; I do a bunch of stuff in the code beforehand). But then merely calling savefig adds about 550 dicts, etc. Could that be the source of my problem? Note that this happens only the first time I call savefig. Any subsequent call does the following:
MEMORY LEAK DETECTOR before figsave
MEMORY LEAK DETECTOR after figsave
tuple 3721 +6
dict 7206 +6
function 12679 +3
list 2001 +3
weakref 3503 +3
instance 1688 +3
Bbox 260 +3
but memory still keeps growing, and soon I hit the memory limit.
Thanks a lot!
Googled anecdotal evidence suggests that executing matplotlib.pyplot.close after a savefig will reclaim/free the memory associated with the figure. See the pyplot docs for all the calling options.
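A minimal sketch of that pattern, adapted to a loop like the one in the question (the array sizes and file names here are illustrative):

import matplotlib
matplotlib.use('Agg')  # non-interactive backend, as in the question
import matplotlib.pyplot as plt
import numpy as np

for i in range(3):
    fig = plt.figure()
    plt.imshow(np.random.rand(100, 100), interpolation='nearest')
    plt.savefig('map%d.png' % i, dpi=100)
    plt.close(fig)  # drop pyplot's reference so the figure can be freed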
I forget which Stack Overflow thread/website I found this solution on, but using the figure directly (and not touching the pyplot state machine) usually does the trick. So:
from matplotlib.figure import Figure
from matplotlib.backends.backend_agg import FigureCanvasAgg as FigureCanvas

fig = Figure()
canvas = FigureCanvas(fig)
# ... draw on fig via fig.add_subplot(), ax.imshow(), etc. ...
canvas.print_figure(outpath+pftspatialnames[i]+'testredtarget.png', dpi=500)
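This presumably helps because figures created this way are never registered with pyplot's global figure manager, so nothing except your own variables holds a reference to them; once fig and canvas go out of scope, the garbage collector can reclaim the memory.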
I am basically building a 3D scatter plot using primitive UV spheres, and am running into memory issues when attempting to create more than a couple hundred points at one time. I am limited on my laptop with a 2.1 GHz processor, but wanted to know if there is a better way to write this:
import bpy
import random

count = 0
while count < 5:
    bpy.ops.mesh.primitive_uv_sphere_add(
        size=.3,
        location=(random.randint(-9, 9), random.randint(-9, 9),
                  random.randint(-9, 9)),
        rotation=(0, 0, 0))
    count += 1
I realize that with such a simple script any performance increase is likely negligible but wanted to give it a shot anyway.
Some possible suggestions:
I would pre-calculate the x, y, z values, store them in mathutils Vectors, and add them to a dict to be iterated over.
Duplication should provide a smaller memory footprint than instantiating new objects, e.g. bpy.ops.object.duplicate_move(OBJECT_OT_duplicate={"linked": False}, TRANSFORM_OT_translate={"value": (0, 0, 0)}).
Edit:
Doing further research, it appears that each time a bpy.ops.* operator is called, the redraw function runs. One user documented an exponential increase in the time taken to generate UV spheres.
CoDEmanX provided the following code snippet to another user:
import bpy

bpy.ops.object.select_all(action='DESELECT')
bpy.ops.mesh.primitive_uv_sphere_add()
sphere = bpy.context.object

for i in range(-1000, 1000, 2):
    ob = sphere.copy()
    ob.location.y = i
    #ob.data = sphere.data.copy()  # uncomment this, if you want full copies and no linked duplicates
    bpy.context.scene.objects.link(ob)

bpy.context.scene.update()
Then it is just a case of adapting the code to set the object locations from the pre-calculated dict:
obj.location = location_dict[i]
A combined sketch follows below.
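Putting both suggestions together, a minimal sketch (location_dict and the point count are illustrative; this assumes the pre-2.8 scene.objects.link() API used in the snippet above):

import bpy
import random
from mathutils import Vector

# Suggestion 1: pre-calculate the random locations
location_dict = {}
for i in range(200):
    location_dict[i] = Vector((random.randint(-9, 9),
                               random.randint(-9, 9),
                               random.randint(-9, 9)))

# Suggestion 2: create one template sphere, then make linked duplicates of it
bpy.ops.mesh.primitive_uv_sphere_add(size=.3)
sphere = bpy.context.object

for i in location_dict:
    ob = sphere.copy()          # linked duplicate: shares the mesh data
    ob.location = location_dict[i]
    bpy.context.scene.objects.link(ob)

bpy.context.scene.update()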