How to generate a histogram animation with many values - python

The animation updates very slowly: n only increases by 3 each time, but my data has 10000 elements. It tries to draw every single frame (n=1, n=2, n=3, ...), and the hist function is really expensive. I don't know if there is any way to skip frames, e.g. go from n=1 straight to n=500 and then to n=1000.
import matplotlib.animation as animation
import matplotlib.gridspec as gridspec
import numpy as np
import matplotlib.pyplot as plt
n=10000
# x1, x2, x3, x4 are the four data arrays (n values each); their definition is not shown here
def update(curr):
    if curr==n:
        a.event_source.stop()
    first_histogram.cla()
    sec_histogram.cla()
    thi_histogram.cla()
    for_histogram.cla()
    first_histogram.hist(x1[:curr], bins=np.arange(-6,2,0.5))
    sec_histogram.hist(x2[:curr], bins=np.arange(-1,15,1))
    thi_histogram.hist(x3[:curr], bins=np.arange(2,22,1))
    for_histogram.hist(x4[:curr], bins=np.arange(13,21,1))
    first_histogram.set_title('n={}'.format(curr))
fig=plt.figure()
gspec=gridspec.GridSpec(2,2)
first_histogram=plt.subplot(gspec[0,0])
sec_histogram=plt.subplot(gspec[0,1])
thi_histogram=plt.subplot(gspec[1,0])
for_histogram=plt.subplot(gspec[1,1])
a = animation.FuncAnimation(fig,update,blit=True,interval=1,repeat=False)
How can I make it faster? Thank you!

There are several things to note here.
blit=True is not useful when clearing the axes in between frames. It would either not take effect, or you would get wrong tick labels on the axes.
It would only be useful if the axes limits did not change from frame to frame. However, in a normal histogram, where more and more data is animated, the limits necessarily have to change; otherwise your bars either grow out of the axes, or you do not see the low counts at the start. As an alternative, you could plot a normalized histogram (i.e. a density plot).
Also, interval=1 is not useful. You will not be able to animate 4 subplots at a 1 millisecond frame interval on any normal system; matplotlib is too slow for that. Consider, however, that the human brain usually cannot resolve frame rates above some 25 fps, i.e. 40 ms per frame, anyway. That is probably the frame rate to aim for (although matplotlib may not achieve it).
So a way to set this up is simply via
import numpy as np
import matplotlib.pyplot as plt
import matplotlib.animation as animation
x1 = np.random.normal(-2.5, 1, 10000)
def update(curr):
    ax.clear()
    ax.hist(x1[:curr], bins=np.arange(-6,2,0.5))
    ax.set_title('n={}'.format(curr))
fig, ax = plt.subplots()
a = animation.FuncAnimation(fig, update, frames=len(x1), interval=40, repeat=False, blit=False)
plt.show()
If you want to reach the final number of items in the list more quickly, use fewer frames. E.g. for a 25 times faster animation, show only every 25th state,
a = animation.FuncAnimation(fig, update, frames=np.arange(0, len(x1)+1, 25),
                            interval=40, repeat=False, blit=False)
This code runs at a frame rate of about 11 fps (an interval of ~85 ms), so it is slower than specified; we could therefore set interval=85 directly.
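Applied to the four-panel figure from the question, the same idea might look like the sketch below. The distributions used for the four data sets are hypothetical stand-ins (the question does not show how x1..x4 are built), and the step of 500 is just the value the asker mentioned; this is an illustration of the frames argument, not a tuned solution.

```python
import numpy as np
import matplotlib.pyplot as plt
import matplotlib.animation as animation

n = 10000
rng = np.random.default_rng(0)
# Hypothetical stand-ins for the question's x1..x4 (not given in the post).
data = [rng.normal(-2.5, 1, n), rng.gamma(2, 1.5, n),
        rng.exponential(2, n) + 7, rng.uniform(14, 20, n)]
# Bin edges taken from the question's code.
bins = [np.arange(-6, 2, 0.5), np.arange(-1, 15, 1),
        np.arange(2, 22, 1), np.arange(13, 21, 1)]

fig, axes = plt.subplots(2, 2)
axes = axes.ravel()

def update(curr):
    # Redraw each panel with the first `curr` samples.
    for ax, x, b in zip(axes, data, bins):
        ax.clear()
        ax.hist(x[:curr], bins=b)
    axes[0].set_title('n={}'.format(curr))

step = 500
# Only every 500th state is drawn: 500, 1000, ..., 10000.
frames = np.arange(step, n + 1, step)
ani = animation.FuncAnimation(fig, update, frames=frames,
                              interval=40, repeat=False, blit=False)
# Call plt.show() to display the animation.
```

Because frames is a finite sequence, the animation stops by itself at n, so the manual event_source.stop() check from the question is not needed here.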
In order to increase the frame rate, one may use blitting.
For that, the axes limits must not be updated at all, so we set them once at the beginning, which leads to a different plot. To optimize further, you may precompute all the histograms to show.
import numpy as np
import matplotlib.pyplot as plt
import matplotlib.animation as animation
x1 = np.random.normal(-2.5, 1, 10000)
bins = np.arange(-6,2,0.5)
hist = np.empty((len(x1), len(bins)-1))
for i in range(len(x1)):
    hist[i, :], _ = np.histogram(x1[:i], bins=bins)
def update(i):
    for bar, y in zip(bars, hist[i,:]):
        bar.set_height(y)
    text.set_text('n={}'.format(i))
    return list(bars) + [text]
fig, ax = plt.subplots()
ax.set_ylim(0,hist.max()*1.05)
bars = ax.bar(bins[:-1], hist[0,:], width=np.diff(bins), align="edge")
text = ax.text(.99,.99, "", ha="right", va="top", transform=ax.transAxes)
ani = animation.FuncAnimation(fig, update, frames=len(x1), interval=1, repeat=False, blit=True)
plt.show()
Running this code gives me a frame rate of 215 fps (4.6 ms per frame), so we could set the interval to 4.6 ms.
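The precomputation loop above calls np.histogram once per prefix of the data. As a possible alternative, the sketch below bins every sample once with np.digitize and then takes a running sum, which should give the same counts. The function name cumulative_histograms and its step parameter are made up for this illustration and are not part of the answer's code.

```python
import numpy as np

def cumulative_histograms(data, bins, step=1):
    """Counts of data[:k] per bin, for k = 0, step, 2*step, ...

    Returns an integer array of shape (n_frames, len(bins) - 1).
    """
    data = np.asarray(data)
    n_bins = len(bins) - 1
    # Bin index of each sample: -1 below the range, n_bins at or above the last edge.
    idx = np.digitize(data, bins) - 1
    # np.histogram treats the last bin as closed on the right; mirror that here.
    idx[data == bins[-1]] = n_bins - 1
    in_range = (idx >= 0) & (idx < n_bins)
    # Row k marks the bin hit by sample k-1; row 0 stays empty (no samples yet).
    one_hot = np.zeros((len(data) + 1, n_bins), dtype=np.int64)
    rows = np.arange(1, len(data) + 1)[in_range]
    one_hot[rows, idx[in_range]] = 1
    # The running sum over samples gives the histogram of every prefix.
    counts = np.cumsum(one_hot, axis=0)
    return counts[::step]

rng = np.random.default_rng(0)
x1 = rng.normal(-2.5, 1, 1000)
bins = np.arange(-6, 2, 0.5)
hist = cumulative_histograms(x1, bins, step=25)  # one row per 25 added samples
```

Row j of the result can be used in place of hist[i, :] in the update function, with i = j * step for the title text.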
Tested in python 3.10 and matplotlib 3.5.1
10000 samples creates a 40MB animation, which exceeds the 2MB limit for posting a gif.
The following animation example uses 500 samples, x1 = np.random.normal(-2.5, 1, 500)

Related

Matplotlib pyplot in real time

I have a while loop that generates two lists of numbers, and at the end I plot them using matplotlib.pyplot.
I'm doing
while True:
    #....
    plt.plot(list1)
    plt.plot(list2)
    plt.show()
But in order to see the progression I have to close the plot window.
Is there a way to refresh it with the new data every x seconds?
The most robust way to do what you want is to use matplotlib.animation. Here's an example of animating two lines, one representing sine and one representing cosine.
import numpy as np
import matplotlib.pyplot as plt
import matplotlib.animation as animation
fig, ax = plt.subplots()
sin_l, = ax.plot(np.sin(0))
cos_l, = ax.plot(np.cos(0))
ax.set_ylim(-1, 1)
ax.set_xlim(0, 5)
dx = 0.1
def update(i):
    # i is a counter for each frame.
    # We'll increment x by dx each frame.
    x = np.arange(0, i) * dx
    sin_l.set_data(x, np.sin(x))
    cos_l.set_data(x, np.cos(x))
    return sin_l, cos_l
ani = animation.FuncAnimation(fig, update, frames=51, interval=50)
plt.show()
For your particular example, you would get rid of the while True and put the logic inside that while loop in the update function. Then, you just have to make sure to do set_data instead of making a whole new plt.plot call.
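A rough sketch of that restructuring is shown below. The function step is a hypothetical stand-in for whatever the body of the while loop computes (random numbers here), and the one-second interval is an arbitrary choice for "every x seconds".

```python
import random
import matplotlib.pyplot as plt
import matplotlib.animation as animation

list1, list2 = [], []

fig, ax = plt.subplots()
line1, = ax.plot([], [])
line2, = ax.plot([], [])

def step():
    # Hypothetical stand-in for one pass through the original while loop.
    list1.append(random.random())
    list2.append(random.random())

def update(frame):
    step()
    # Reuse the existing lines instead of calling plt.plot again.
    line1.set_data(list(range(len(list1))), list1)
    line2.set_data(list(range(len(list2))), list2)
    # Let the axes grow with the data.
    ax.relim()
    ax.autoscale_view()
    return line1, line2

# interval is in milliseconds; 1000 refreshes the plot about once per second.
ani = animation.FuncAnimation(fig, update, interval=1000,
                              cache_frame_data=False)
# Call plt.show() to open the window and start the updates.
```

Since no frames argument is given, the animation keeps running until the window is closed, which mirrors the original while True loop.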
More details can be found in this nice blog post, the animation API, or the animation examples.
I think what you're looking for is the "animation" feature.
Here is an example
This example is a second one.

matplotlib animation duration

The code below shows and saves an animation of a succession of random matrices. My question is how I can adjust the duration of the saved animation. The only parameters I have here are fps and dpi: the first controls how long each frame is shown, and the second controls the quality of the image. What I actually want is to control the number of frames that are saved, i.e. how many of the matrices are stored.
import numpy as np
import matplotlib.pyplot as plt
import matplotlib.animation as animation
fig = plt.figure()
N = 5
A = np.random.rand(N,N)
im = plt.imshow(A)
def updatefig(*args):
    im.set_array(np.random.rand(N,N))
    return im,
ani = animation.FuncAnimation(fig, updatefig, interval=200, blit=True)
ani.save('try_animation.mp4', fps=10, dpi=80) #Frame per second controls speed, dpi controls the quality
plt.show()
I am wondering if I should add more parameters. I tried to look for the appropriate one in the class documentation in matplotlib, but I was unsuccessful:
http://matplotlib.org/api/animation_api.html#module-matplotlib.animation
Years later, I have built this example, which I come back to every time I need to see how the parameters of the animation relate to each other. I decided to share it here for whoever may find it useful.
tl;dr:
For the saved animation the duration is going to be frames * (1 / fps) (in seconds)
For the displayed animation the duration is going to be frames * interval / 1000 (in seconds)
The code below allows you to play with these settings in an environment that gives immediate visual feedback.
This code builds a clock that ticks according to the parameters:
import numpy as np
import matplotlib.pyplot as plt
import matplotlib.animation as animation
fig = plt.figure(figsize=(16, 12))
ax = fig.add_subplot(111)
# You can initialize this with whatever
im = ax.imshow(np.random.rand(6, 10), cmap='bone_r', interpolation='nearest')
def animate(i):
    aux = np.zeros(60)
    aux[i] = 1
    image_clock = np.reshape(aux, (6, 10))
    im.set_array(image_clock)
ani = animation.FuncAnimation(fig, animate, frames=60, interval=1000)
ani.save('clock.mp4', fps=1.0, dpi=200)
plt.show()
This will generate and save an animation in which a black square moves across a large white rectangle as time passes. There are 60 boxes, so you can build a clock that sweeps over them in a minute.
Now, the important thing to note is that two parameters determine how fast the black box moves: interval in the animation.FuncAnimation function and fps in the ani.save function. The first controls the speed of the animation that you display, and the second the speed of the animation that you save.
As the code above stands, you will generate 60 frames, and they are displayed at 1 frame per second. That means that the clock ticks every second. If you want the saved animation's clock to tick every two seconds, set fps=0.5. If you want the displayed animation's clock to tick every two seconds, set interval=2000.
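The two duration formulas can be checked with a few lines of plain Python. The helper names below are invented for this illustration; they only restate the arithmetic from the tl;dr and are not matplotlib functions.

```python
def saved_duration_s(frames, fps):
    """Length in seconds of the file written by ani.save(..., fps=fps)."""
    return frames * (1 / fps)

def displayed_duration_s(frames, interval_ms):
    """Length in seconds of one pass of the on-screen animation."""
    return frames * interval_ms / 1000

def frames_for_duration(duration_s, fps):
    """Number of frames needed for a saved animation of the given length."""
    return round(duration_s * fps)

# The clock example: 60 frames at fps=1.0 and interval=1000 ms.
clock_saved = saved_duration_s(60, 1.0)            # one minute
clock_displayed = displayed_duration_s(60, 1000)   # one minute
# Slowing the clock to one tick every two seconds.
slow_saved = saved_duration_s(60, 0.5)
slow_displayed = displayed_duration_s(60, 2000)
```

Read the other way round, frames_for_duration(1, 10) gives the frames value needed for a one-second file at fps=10.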
[I will edit the longer explanation as soon as I have time]
The documentation reveals that FuncAnimation accepts an argument frames, which controls the total number of frames played. Your code could thus read
import numpy as np
import matplotlib.pyplot as plt
import matplotlib.animation as animation
fig = plt.figure()
N = 5
A = np.random.rand(N,N)
im = plt.imshow(A)
def updatefig(*args):
    im.set_array(np.random.rand(N,N))
    return im,
ani = animation.FuncAnimation(fig, updatefig, frames=10, interval=200, blit=True)
ani.save('try_animation.mp4', fps=10, dpi=80) #Frame per second controls speed, dpi controls the quality
plt.show()
to play 10 frames.

Updating the x-axis values using matplotlib animation

I am trying to use matplotlib.animation.ArtistAnimation to animate two subplots. I want the x-axis to advance as the animation progresses: the total length of the animation is 100, but at any time the subplot shows only a window of 25 time values, starting at 0-24 and scrolling up to 100.
A great example is given here. The link uses FuncAnimation and updates the x-axis labels in a rolling fashion using plot().axes.set_xlim() and incrementing the x-values. The code is available via the link below the YouTube video in the link provided.
I have appended code below that shows my attempts to replicate these results but the x-limits seem to take on their final values instead of incrementing with time. I have also tried incrementing the solution (as opposed to the axis) by only plotting the values in the window that will be seen in the subplot, but that does not increment the x-axis values. I also tried to implement autoscaling but the x-axis still does not update.
I also found this question which is virtually the same problem, but the question was never answered.
Here is my code:
import matplotlib.pylab as plt
import matplotlib.animation as anim
import numpy as np
#create image with format (time,x,y)
image = np.random.rand(100,10,10)
#setup figure
fig = plt.figure()
ax1=fig.add_subplot(1,2,1)
ax2=fig.add_subplot(1,2,2)
#set up viewing window (in this case the 25 most recent values)
repeat_length = (np.shape(image)[0]+1)//4
ax2.set_xlim([0,repeat_length])
#ax2.autoscale_view()
ax2.set_ylim([np.amin(image[:,5,5]),np.amax(image[:,5,5])])
#set up list of images for animation
ims=[]
for time in range(np.shape(image)[0]):
    im = ax1.imshow(image[time,:,:])
    im2, = ax2.plot(image[0:time,5,5],color=(0,0,1))
    if time>repeat_length:
        lim = ax2.set_xlim(time-repeat_length,time)
    ims.append([im, im2])
#run animation
ani = anim.ArtistAnimation(fig,ims, interval=50,blit=False)
plt.show()
I only want the second subplot (ax2) to update the x-axis values.
Any help would be much appreciated.
If you don't need blitting
import matplotlib.pylab as plt
import matplotlib.animation as animation
import numpy as np
#create image with format (time,x,y)
image = np.random.rand(100,10,10)
#setup figure
fig = plt.figure()
ax1 = fig.add_subplot(1,2,1)
ax2 = fig.add_subplot(1,2,2)
#set up viewing window (in this case the 25 most recent values)
repeat_length = (np.shape(image)[0]+1)//4
ax2.set_xlim([0,repeat_length])
#ax2.autoscale_view()
ax2.set_ylim([np.amin(image[:,5,5]),np.amax(image[:,5,5])])
#set up list of images for animation
im = ax1.imshow(image[0,:,:])
im2, = ax2.plot([], [], color=(0,0,1))
def func(n):
    im.set_data(image[n,:,:])
    im2.set_xdata(np.arange(n))
    im2.set_ydata(image[0:n, 5, 5])
    if n>repeat_length:
        lim = ax2.set_xlim(n-repeat_length, n)
    else:
        # makes it look ok when the animation loops
        lim = ax2.set_xlim(0, repeat_length)
    return im, im2
ani = animation.FuncAnimation(fig, func, frames=image.shape[0], interval=30, blit=False)
plt.show()
will work.
If you need it to run faster, you will need to play games with the bounding box used for blitting so that the axes labels are updated.
If you are using blitting, you can call pyplot.draw() to redraw the entire figure each time you change the x or y axis.
This updates the whole figure, so it is relatively slow, but that is acceptable if you don't call it many times.
The following moves your axis, but is very slow.
import matplotlib.pylab as plt
import matplotlib.animation as anim
import numpy as np
image = np.random.rand(100,10,10)
repeat_length = (np.shape(image)[0]+1)//4
fig = plt.figure()
ax1 = fig.add_subplot(1,2,1)
im = ax1.imshow(image[0,:,:])
ax2 = plt.subplot(122)
ax2.set_xlim([0,repeat_length])
ax2.set_ylim([np.amin(image[:,5,5]),np.amax(image[:,5,5])])
im2, = ax2.plot(image[0:0,5,5],color=(0,0,1))
canvas = ax2.figure.canvas
def init():
    im = ax1.imshow(image[0,:,:])
    im2.set_data([], [])
    return im,im2,
def animate(time):
    time = time%len(image)
    im = ax1.imshow(image[time,:,:])
    im2, = ax2.plot(image[0:time,5,5],color=(0,0,1))
    if time>repeat_length:
        print(time)
        im2.axes.set_xlim(time-repeat_length,time)
        plt.draw()
    return im,im2,
ax2.get_yaxis().set_animated(True)
# call the animator. blit=True means only re-draw the parts that have changed.
ani = anim.FuncAnimation(fig, animate, init_func=init,
                         interval=0, blit=True, repeat=True)
plt.show()

matplotlib major display issue with dense data sets

I've run into a fairly serious issue with matplotlib and Python. I have a dense periodogram data set and want to plot it. The issue is that when there are more data points than can be plotted on a pixel, the package does not pick the min and max to display. This means a casual look at the plot can lead you to incorrect conclusions.
Here's an example of such a problem:
The dataset was plotted with plot() and scatter() overlaid. You can see that in the dense data fields, the blue line that connects the data does not reach the actual peaks, leading a human viewer to conclude the peak at ~2.4 is the maximum, when it's really not.
If you zoom-in or force a wide viewing window, it is displayed correctly. rasterize and aa keywords have no effect on the issue.
Is there a way to ensure that the min/max points of a plot() call are always rendered? Otherwise, this needs to be addressed in an update to matplotlib. I've never had a plotting package behave like this, and this is a pretty major issue.
Edit:
import numpy
from matplotlib.pyplot import plot, show
x = numpy.linspace(0,1,2000000)
y = numpy.random.random(x.shape)
y[1000000]=2
plot(x,y)
show()
This should replicate the problem, though it may depend on your monitor resolution. By dragging and resizing the window, you should see the problem. One data point should stick out at y=2, but that doesn't always display.
This is due to the path-simplification algorithm in matplotlib. While it's certainly not desirable in some cases, it's deliberate behavior to speed up rendering.
The simplification algorithm was changed at some point to avoid skipping "outlier" points, so newer versions of mpl don't exhibit this exact behavior (the path is still simplified, though).
If you don't want to simplify paths, then you can disable it in the rc parameters (either in your matplotlibrc file or at runtime).
E.g.
import matplotlib as mpl
mpl.rcParams['path.simplify'] = False
import matplotlib.pyplot as plt
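If you would rather not change the setting for the whole session, one option is to scope it with matplotlib's rc_context, as in the sketch below. The data is a small made-up version of the example from the question, and the figure is rendered into an in-memory buffer only so that the snippet does not depend on a display. Both the plot call and the rendering are placed inside the context as a precaution, since the setting may be read when the line's path is created rather than when it is drawn.

```python
import io
import numpy as np
import matplotlib as mpl
import matplotlib.pyplot as plt

# Small made-up data set with one outlier, in the spirit of the question.
x = np.linspace(0, 1, 5000)
y = np.random.default_rng(0).random(x.shape)
y[2500] = 2

setting_before = mpl.rcParams['path.simplify']
buf = io.BytesIO()
with mpl.rc_context({'path.simplify': False}):
    # Create and render the plot while simplification is switched off.
    fig, ax = plt.subplots()
    ax.plot(x, y)
    fig.savefig(buf, format='png')
# Outside the block the previous value is restored automatically.
setting_after = mpl.rcParams['path.simplify']
```

Turning simplification off makes rendering slower for large data sets, which is the trade-off the setting exists for.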
However, it may make more sense to use an "envelope" style plot. As a quick example:
import matplotlib.pyplot as plt
import numpy as np
def main():
    num = 10000
    x = np.linspace(0, 10, num)
    y = np.cos(x) + 5 * np.random.random(num)
    fig, (ax1, ax2) = plt.subplots(nrows=2)
    ax1.plot(x, y)
    envelope_plot(x, y, winsize=40, ax=ax2)
    plt.show()
def envelope_plot(x, y, winsize, ax=None, fill='gray', color='blue'):
    if ax is None:
        ax = plt.gca()
    # Coarsely chunk the data, discarding the last window if it's not evenly
    # divisible. (Fast and memory-efficient)
    numwin = x.size // winsize
    ywin = y[:winsize * numwin].reshape(-1, winsize)
    xwin = x[:winsize * numwin].reshape(-1, winsize)
    # Find the min, max, and mean within each window
    ymin = ywin.min(axis=1)
    ymax = ywin.max(axis=1)
    ymean = ywin.mean(axis=1)
    xmean = xwin.mean(axis=1)
    fill_artist = ax.fill_between(xmean, ymin, ymax, color=fill,
                                  edgecolor='none', alpha=0.5)
    line, = ax.plot(xmean, ymean, color=color, linestyle='-')
    return fill_artist, line
if __name__ == '__main__':
    main()

Matplotlib animation too slow ( ~3 fps )

I need to animate data as they come with a 2D histogram2d ( maybe later 3D but as I hear mayavi is better for that ).
Here's the code:
import numpy as np
import numpy.random
import matplotlib.pyplot as plt
import time, matplotlib
plt.ion()
# Generate some test data
x = np.random.randn(50)
y = np.random.randn(50)
heatmap, xedges, yedges = np.histogram2d(x, y, bins=5)
extent = [xedges[0], xedges[-1], yedges[0], yedges[-1]]
# start counting for FPS
tstart = time.time()
for i in range(10):
    x = np.random.randn(50)
    y = np.random.randn(50)
    heatmap, xedges, yedges = np.histogram2d(x, y, bins=5)
    plt.clf()
    plt.imshow(heatmap, extent=extent)
    plt.draw()
# calculate and print FPS (note: the loop above draws 10 frames, not 20)
print('FPS:', 20/(time.time()-tstart))
It returns 3 fps, too slow apparently. Is it the use of the numpy.random in each iteration? Should I use blit? If so how?
The docs have some nice examples but for me I need to understand what everything does.
Thanks to @Chris I took a look at the examples again and also found this incredibly helpful post in here.
As @bmu states in his answer (see post), using animation.FuncAnimation was the way for me.
import numpy as np
import matplotlib.pyplot as plt
import matplotlib.animation as animation
def generate_data():
    # do calculations and stuff here; the random array below is only a
    # placeholder for the 2D array you want the color map to show
    return np.random.rand(5, 5)
def update(data):
    mat.set_data(data)
    return mat
def data_gen():
    while True:
        yield generate_data()
fig, ax = plt.subplots()
mat = ax.matshow(generate_data())
plt.colorbar(mat)
ani = animation.FuncAnimation(fig, update, data_gen, interval=500,
                              save_count=50)
plt.show()
I suspect it is the use of np.histogram2d in each loop iteration, or the fact that in each iteration of the for loop you are clearing and drawing a new figure. To speed things up, you should create a figure once and just update the properties and data of the figure in a loop. Have a look through the matplotlib animation examples for some pointers on how to do this. Typically it involves calling matplotlib.pyplot.plot once and then, in a loop, calling set_xdata and set_ydata on the returned line.
In your case, however, take a look at the matplotlib animation example dynamic image 2. In this example the generation of data is separated from the animation of the data (which may not be a great approach if you have lots of data). By splitting these two parts up you can see which is causing the bottleneck, numpy.histogram2d or imshow (use time.time() around each part).
P.S. np.random.randn is a pseudo-random number generator. These tend to be simple linear generators which can generate many millions of (pseudo-)random numbers per second, so this is almost certainly not your bottleneck; drawing to screen is almost always a slower process than any number crunching.
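For the 2D histogram from the question, the "create once, then update" advice could be sketched as follows. The fixed bin edges and the fixed color range (vmin, vmax) are assumptions made so that neither the extent nor the color scale has to change between frames; they are not part of the original code.

```python
import numpy as np
import matplotlib.pyplot as plt
import matplotlib.animation as animation

rng = np.random.default_rng(0)
# Assumed fixed bin edges (5 bins per axis), so the image extent stays constant.
edges = np.linspace(-3, 3, 6)

def new_heatmap():
    # Stand-in for the incoming data: 50 fresh random points per frame.
    x = rng.standard_normal(50)
    y = rng.standard_normal(50)
    heatmap, _, _ = np.histogram2d(x, y, bins=[edges, edges])
    return heatmap

fig, ax = plt.subplots()
# Create the image artist once; vmin/vmax are assumed bounds for the counts.
im = ax.imshow(new_heatmap(),
               extent=[edges[0], edges[-1], edges[0], edges[-1]],
               vmin=0, vmax=10)

def update(frame):
    # Only swap the pixel data; no clf() and no new imshow() call.
    im.set_data(new_heatmap())
    return [im]

ani = animation.FuncAnimation(fig, update, frames=100,
                              interval=40, blit=True)
# Call plt.show() to display the animation.
```

Wrapping new_heatmap() and im.set_data() in time.time() calls separately is one way to see which of the two dominates, as suggested above.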
