Improve performance of drawing with Matplotlib - python

I am using matplotlib to plot more than one hundred graphs. This is currently too slow and I'd like to optimize the code a bit.
Each figure contains up to 20 lines that I am drawing this way (simplified):
f, ax = plt.subplots(1)
for i, y in enumerate(data):
    ax.plot(tasks, res, marker=markers[i], label=labels[i])
I suppose that the method plot is actually drawing too much stuff (such as the axis). I tried using line.set_ydata but this replaced the previous line.
Is there a way to do something similar but faster?
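One possible approach, offered as a sketch rather than a confirmed answer: create the figure and one Line2D artist per line a single time, then call set_ydata on every artist for each new graph. Using set_ydata on a single artist replaces that line, which may be what happened above; keeping one artist per line avoids that. The function name render_figures, its arguments, and the in-memory PNG output are assumptions made for illustration.

```python
import io

import numpy as np
import matplotlib
matplotlib.use("Agg")  # assumption: batch rendering, no window needed
import matplotlib.pyplot as plt


def render_figures(tasks, datasets, labels, markers):
    """Render one PNG per dataset while reusing a single figure.

    datasets is an iterable of arrays shaped (n_lines, len(tasks));
    this layout is hypothetical and stands in for the real data.
    """
    fig, ax = plt.subplots()
    # Create every line artist once, with placeholder y values.
    lines = [
        ax.plot(tasks, np.zeros(len(tasks)), marker=m, label=lab)[0]
        for m, lab in zip(markers, labels)
    ]
    ax.legend(loc="upper right")  # a fixed location avoids the "best" search

    images = []
    for data in datasets:
        # Only the y data changes between graphs.
        for line, y in zip(lines, data):
            line.set_ydata(y)
        ax.relim()            # recompute data limits from the updated lines
        ax.autoscale_view()   # rescale the axes to the new limits
        buf = io.BytesIO()
        fig.savefig(buf, format="png")
        images.append(buf.getvalue())
    plt.close(fig)
    return images
```

Whether this is faster in practice depends on the data and the backend, so it is worth timing against the original loop.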

Related

Plotting multiple scatter plots with acceptable speed

I have a lot of data to create scatter plots for, with each individual plot needing to be saved. Each plot shares the same axis. Currently I have this which works:
for i in dct:
    plt.figure()
    plt.scatter(time_values, dct[i])
    plt.title(i)
    plt.xlabel("Time")
    plt.ylabel("values")
    plt.xticks(x_labels,rotation=90)
    plt.savefig(os.path.join('some_file_path','image{}.png'.format(str(self.image_counter))))
    plt.close('all')
However, it is very slow at actually creating the graphs. The answer to "How could I save multiple plots in a folder using Python?" does what I want, but only for a normal plot. Is there any way I can implement something like this with a scatter plot? I have tried converting my data into a 2D array, but my x-axis values are strings, so it does not accept the array.
Actually, you can draw scatter plots with plt.plot(x, y, 'o') and re-use that code, for example.
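A minimal sketch of that idea, assuming the string x values can be placed at integer positions and shown as tick labels. The function name save_scatter_plots, its arguments, and the file naming are hypothetical; only plt.plot(x, y, 'o') comes from the answer above.

```python
import os

import numpy as np
import matplotlib
matplotlib.use("Agg")  # assumption: plots are only saved, never shown
import matplotlib.pyplot as plt


def save_scatter_plots(time_labels, dct, out_dir):
    """Save one scatter-style plot per entry of dct, reusing one figure."""
    x = np.arange(len(time_labels))  # numeric positions for the string labels
    fig, ax = plt.subplots()
    # 'o' draws markers only, which looks like a scatter plot.
    (points,) = ax.plot(x, np.zeros(len(x)), "o")
    ax.set_xlabel("Time")
    ax.set_ylabel("values")
    ax.set_xticks(x)
    ax.set_xticklabels(time_labels, rotation=90)

    paths = []
    for counter, (name, values) in enumerate(dct.items()):
        points.set_ydata(values)  # swap in the new y values
        ax.set_title(str(name))
        ax.relim()
        ax.autoscale_view()
        path = os.path.join(out_dir, "image{}.png".format(counter))
        fig.savefig(path)
        paths.append(path)
    plt.close(fig)
    return paths
```

The axes, labels, and ticks are built once here because every plot shares the same axis; only the marker positions and the title change per file.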

Superimposing some plots with a txt file

I am trying to reproduce the attached figure step by step. My problem was how to plot the colorbar in the above figure from my data. My data is cosmological and has 7 columns in total, with many rows. You can see that there are three different plots superimposed on one another. First, I tried to plot the small colorful lines in the body of the figure using two columns of the data. I did this with scatter plots, and then I needed to reproduce the colorbar part of the figure. This was not possible at the first attempt because the colorbar values were not part of the data. I then obtained the colorbar values by some calculations and added them as additional columns to the data. Now I could use the simple colorbar function for the colorbar part, and it worked. For the next step, I need to turn the small curved lines into dark solid lines.
How can I do plots in matplotlib?
import numpy as np
import scipy as sp
import matplotlib.pyplot as plt
data1 = np.loadtxt("bei_predic.txt", unpack=True)
B = np.log10(data1[3]/(4.*(data1[2])))
R = np.vstack((data1,B))
R = np.transpose(R)
D = R[~np.isnan(R).any(axis=1)]
A = plt.scatter(D[:,3],D[:,2], c=D[:,8])
cbar= plt.colorbar()
cbar.set_label("file", labelpad=+1)
plt.show()
If you could start off by telling us a little bit about the data that you are using that would be great. In order to plot the figure that you want, we must first load the data into some variables. Have you managed to do this?
Check out this example in which the author plots multicolored lines for some guidance.

matplotlib legend performance issue

I am using Jupyter-notebook with python 3.6.2 and matplotlib to plot some data.
When I plot my data, I want to add a legend to the plot (basically to know which line is which)
However calling plt.legend takes a lot of time (almost as much as the plot itself, which to my understanding should just be instant).
Minimal toy problem that reproduces the issue:
import numpy as np
import matplotlib.pyplot as plt
# Toy useless data (one million x 4)
my_data = np.random.rand(1000000,4)
plt.plot(my_data)
#plt.legend(['A','C','G','T'])
plt.show()
The data here is just random and useless, but it reproduces my problem:
If I uncomment the plt.legend line, the run takes almost double the time
Why? Shouldn't the legend just look at the plot, see that 4 plots have been made, and draw a box assigning each color to the corresponding string?
Why is a simple legend taking so much time?
Am I missing something?
Reproducing the answer by @bnaecker so that this question is answered:
By default, the legend will be placed in the "best" location, which requires computing how many points from each line are inside a potential legend box. If there are many points, this can take a while. Drawing is much faster when specifying a location other than "best", e.g. plt.legend(loc=3).
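As a small sketch of that fix applied to the toy problem, the legend below is given an explicit location ("lower left", which is the same as loc=3). The reduced array size and the fixed random seed are assumptions made to keep the example quick.

```python
import numpy as np
import matplotlib.pyplot as plt

# Smaller than the original one million rows so the sketch runs quickly.
rng = np.random.default_rng(0)
my_data = rng.random((10_000, 4))

fig, ax = plt.subplots()
lines = ax.plot(my_data)  # one Line2D per column

# An explicit location skips the search that loc="best" performs
# over the points of every line.
legend = ax.legend(lines, ["A", "C", "G", "T"], loc="lower left")

fig.canvas.draw()  # render once; call plt.show() instead when working interactively
```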

Dynamically modify & update plot in matplotlib

Claim 1: no precomputed array
Claim 2: no set_ydata
I have read the answers to "Dynamically updating plot in matplotlib" and "How to update a plot in matplotlib?", but their scenarios are different.
Case 0: you are iteratively solving a large linear algebra problem. How can you monitor the convergence of the solution in real time, so that you can stop the calculation as soon as the solution shows signs of blowing up and save the waiting time?
Case 1: you have a two-dimensional array and you are changing the value at a random (i, j). You can think of this as an Ising model, a percolation model, or whatever model you are familiar with. How do you show the evolution of the array?
Case 2: you have a triangular mesh and you are doing a depth-first search of the nodes and want to mark the path, or you are doing something simpler, such as tracing the mesh boundaries using a half-edge data structure. A real-time visualization is a massive help for checking the correctness of the algorithm implementation.
Enough background. Case 0 can be solved with the set_ydata method if the number of iterations is not large (or you don't care about memory), but what about cases 1 and 2? I have no idea. Currently my 'solution' is to save hundreds of figures to disk. I've tried to plot just one figure and update it, but matplotlib seems to prefer waiting until everything is done and then plotting everything at once. Can anybody tell me how to dynamically update a figure? Thank you very much!
Test code is below (currently it only plots once in the end):
from numpy import *
import matplotlib.pyplot as plt
from time import sleep
M = zeros((5,5))
fig = plt.figure(1)
for n in range(10):
    i = random.randint(5)
    j = random.randint(5)
    print('open site at ', i, j)
    M[i, j] = 1
    plt.imshow(M, 'gray', interpolation = 'nearest')
    plt.plot(j, i, 'ro') # also mark open site (I have reason to do this)
    fig.canvas.draw()
    sleep(0.3)
plt.show()
I just found the answer after finding this question: Matplotlib ion() function fails to be interactive
The code above can be made 'real-time' just by replacing the sleep(0.3) with plt.pause(0.3), no other change is necessary. This is quite unexpected and I never used plt.pause() function before, but it simply works.
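For reference, a sketch of the test code with that one change applied, wrapped in a function so the step count and pause length can be varied. The function name open_sites, its parameters, and the use of NumPy's default_rng are assumptions for illustration; the only change taken from the answer is plt.pause in place of sleep.

```python
import numpy as np
import matplotlib.pyplot as plt


def open_sites(n_steps=10, n=5, pause=0.3, seed=None):
    """Open random sites on an n x n grid, redrawing after each step."""
    rng = np.random.default_rng(seed)
    M = np.zeros((n, n))
    fig, ax = plt.subplots()
    for _ in range(n_steps):
        i, j = rng.integers(n, size=2)  # random site, each index in [0, n)
        M[i, j] = 1
        ax.imshow(M, cmap="gray", interpolation="nearest", vmin=0, vmax=1)
        ax.plot(j, i, "ro")  # also mark the open site
        # plt.pause redraws the figure and runs the GUI event loop for
        # the given time, which time.sleep does not do.
        plt.pause(pause)
    return M
```

Calling open_sites() and then plt.show() keeps the final window open. With a non-interactive backend nothing is displayed, although the loop still runs.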

Overlay transparent paths in matplotlib?

from pylab import *
plot(randn(1000), randn(1000), alpha=0.1)
[<matplotlib.lines.Line2D at 0x7f756e65a450>]
savefig('test.png')
gives this:
Where the paths are combined, and then the transparency is applied after. I want something like this:
This was post-edited in inkscape to break up the paths and then overlay them. This isn't practical with the data set I'm using, because it's too large, and basically crashes my computer when I try to open it in inkscape. Is there any way to do this in matplotlib itself?
Edit: the actual data I'm using is a single long vector of geophysical data, and I'm trying to plot a phase portrait with plot(vec[:-1], vec[1:]).
You could just use a loop to create the plot:
for i in range(100):
    plot(randn(10), randn(10), alpha=0.5, c='b')
will give you something similar (the transparency is "added" for every iteration of the loop):
Depending on your data set, however, I don't know how practical this approach would be.
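A different technique that may scale better than a Python loop of plot calls is a LineCollection, which holds every segment of the path as a separate line in one artist, so the transparency should accumulate where segments overlap. This is a sketch under assumptions, not the answer given above: the helper name phase_portrait_segments and the random-walk stand-in for the geophysical vector are hypothetical.

```python
import numpy as np
import matplotlib.pyplot as plt
from matplotlib.collections import LineCollection


def phase_portrait_segments(vec):
    """Split the path (vec[:-1], vec[1:]) into individual line segments."""
    points = np.column_stack([vec[:-1], vec[1:]])        # shape (N-1, 2)
    # Pair each point with its successor: shape (N-2, 2, 2).
    return np.stack([points[:-1], points[1:]], axis=1)


# Stand-in data: a random walk instead of the real geophysical vector.
rng = np.random.default_rng(0)
vec = np.cumsum(rng.standard_normal(1000))

segments = phase_portrait_segments(vec)
fig, ax = plt.subplots()
# Each segment is drawn on its own, so alpha is applied per segment.
ax.add_collection(LineCollection(segments, colors="b", alpha=0.1))
ax.autoscale()  # add_collection does not rescale the axes by itself
fig.canvas.draw()
```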
