Faster way to provide rotation to scatter point plots in matplotlib? - python

Currently I use the following to plot a set of rotated lines (geologic strike indicators). However, this section of code takes a long time even with only a modest number of strikes (5000). Each point has a unique rotation. Is there a way to give matplotlib a list with the rotations and perform the plotting faster than rotating one-by-one like this?
# sample is a 2D array of shape (3, N) holding the rows (x, y, theta),
# where theta is the amount (in degrees) I want to rotate each point by.
for i in range(len(sample.T)):
    t = matplotlib.markers.MarkerStyle(marker='|')
    t._transform = t.get_transform().rotate_deg(sample[2,i])
    plt.scatter(sample[0,i],sample[1,i],marker=t,s=50,c='0',linewidth=1)

Here you create 5000 individual scatter plots. That is for sure inefficient. You may use a solution I proposed in this answer, namely to set the individual markers as paths to a PathCollection. This would work similar to a scatter, with an additional argument m for the markers.
import numpy as np
import matplotlib.pyplot as plt
import matplotlib.markers as mmarkers
def mscatter(x,y,ax=None, m=None, **kw):
    import matplotlib.markers as mmarkers
    if not ax: ax=plt.gca()
    sc = ax.scatter(x,y,**kw)
    if (m is not None) and (len(m)==len(x)):
        paths = []
        for marker in m:
            if isinstance(marker, mmarkers.MarkerStyle):
                marker_obj = marker
            else:
                marker_obj = mmarkers.MarkerStyle(marker)
            path = marker_obj.get_path().transformed(
                        marker_obj.get_transform())
            paths.append(path)
        sc.set_paths(paths)
    return sc
np.random.seed(42)
data = np.random.rand(5000,3)
data[:,2] *= 360
markers = []
fig, ax = plt.subplots()
for i in range(len(data)):
    t = mmarkers.MarkerStyle(marker='|')
    t._transform = t.get_transform().rotate_deg(data[i,2])
    markers.append(t)
mscatter(data[:,0], data[:,1], m=markers, s=50, c='0', linewidth=1)
plt.show()
If we time this we find that this takes ~250 ms to create the plot with 5000 points and 5000 different angles. The loop solution would in contrast take more than 12 seconds.
So much for the general question of how to rotate many markers. For the special case here, it seems you want to use simple line markers. This can easily be done with a quiver plot. One may then turn the arrow heads off so that the arrows look like lines.
fig, ax = plt.subplots()
ax.quiver(data[:,0], data[:,1], 1,1, angles=data[:,2]+90, scale=1/10, scale_units="dots",
          units="dots", color="k", pivot="mid",width=1, headwidth=1, headlength=0)
The result is pretty much the same, with the benefit of this plot only taking ~80 ms, which is again three times faster than the PathCollection.
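A further option for plain line markers, not part of the answer above, is to compute the end points of each short segment with NumPy and draw them all in one LineCollection. This is only a sketch under assumptions: the helper name strike_segments and the half_length value are made up for illustration, the segment length is in data units (not in points, as with markers), and the headless Agg backend is chosen just so the snippet runs without a display.

```python
import numpy as np
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend so the sketch runs anywhere
import matplotlib.pyplot as plt
from matplotlib.collections import LineCollection

def strike_segments(x, y, theta_deg, half_length=0.01):
    """Hypothetical helper: return an (N, 2, 2) array of segment end points.

    Each segment is centred on (x, y). theta_deg = 0 gives a vertical line,
    like the unrotated '|' marker; positive angles rotate counter-clockwise.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    theta = np.deg2rad(np.asarray(theta_deg, dtype=float) + 90.0)
    dx = half_length * np.cos(theta)
    dy = half_length * np.sin(theta)
    start = np.column_stack([x - dx, y - dy])
    end = np.column_stack([x + dx, y + dy])
    return np.stack([start, end], axis=1)

rng = np.random.default_rng(42)
data = rng.random((5000, 3))
data[:, 2] *= 360

segments = strike_segments(data[:, 0], data[:, 1], data[:, 2])
fig, ax = plt.subplots()
ax.add_collection(LineCollection(segments, colors="k", linewidths=1))
ax.autoscale_view()  # collections do not rescale the axes on their own
```

Because the length is given in data units, the lines scale with the axes when zooming, which may or may not be what you want for strike symbols.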

Related

how to get different line colors depending on one variable for different plots in one single figure in python? [duplicate]

This question already has an answer here:
Drawing a colorbar aside a line plot, using Matplotlib
Let's say I have one figure with a certain number of plots, which looks like this one:
where the colors of the single plots are decided automatically by matplotlib. The code to obtain this is very simple:
for i in range(len(some_list)):
    x, y = some_function(dataset, some_list[i])
    plt.plot(x, y)
Now suppose that all these lines depend on a third variable z. I would like to convey this by plotting each line in a color that indicates the magnitude of z, possibly using a colormap and a colorbar on the right side of the figure. What would you suggest? I'd rather not use a legend, since my figures have many more lines than the ones I am showing. All the information I can find is about drawing one single line with different colors, which is not what I am looking for. Thank you in advance!
Here is some code that, in my opinion, you can easily adapt to your problem
import numpy as np
import matplotlib.pyplot as plt
from random import randint
# generate some data
N, vmin, vmax = 12, 0, 20
rd = lambda: randint(vmin, vmax)
segments_z = [((rd(),rd()),(rd(),rd()),rd()) for _ in range(N)]
# prepare for the colorization of the lines,
# first the normalization function and the colormap we want to use
norm = plt.Normalize(vmin, vmax)
cm = plt.cm.rainbow
# most important, plt.plot doesn't prepare the ScalarMappable
# that's required to draw the colorbar, so we'll do it instead
sm = plt.cm.ScalarMappable(cmap=cm, norm=norm)
# plot the segments, the segment color depends on z
for p1, p2, z in segments_z:
    x, y = zip(p1,p2)
    plt.plot(x, y, color=cm(norm(z)))
# draw the colorbar, note that we pass explicitly the ScalarMappable
# (recent Matplotlib versions also need the target Axes, hence ax=...)
plt.colorbar(sm, ax=plt.gca())
# I'm done, I'll show the results,
# you probably want to add labels to the axes and the colorbar.
plt.show()
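The key expression in the code above is cm(norm(z)). As a small illustration of what the two steps do, here is a sketch using the same range (0 to 20) and the rainbow colormap; the variable names fraction and rgba are chosen here only for the example.

```python
from matplotlib import cm, colors

vmin, vmax = 0, 20
norm = colors.Normalize(vmin, vmax)  # maps [vmin, vmax] linearly onto [0, 1]
cmap = cm.rainbow                    # maps [0, 1] onto RGBA tuples

z = 10
fraction = float(norm(z))            # position of z inside the range
rgba = cmap(fraction)                # (red, green, blue, alpha), each in [0, 1]
print(fraction, rgba)
```

Since the ScalarMappable is built from the same norm and colormap, the colorbar and the line colors agree by construction.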

Matplotlib: multiple 3D lines all get drawn using the final y-value in my loop

I am trying to plot multiple lines in a 3D figure. Each line represents a month: I want them displayed parallel in the y-direction.
My plan was to loop over a set of Y values, but I cannot make this work properly, as using the ax.plot command (see working code below) produces a dozen lines all at the position of the final Y value. Confusingly, swapping ax.plot for ax.scatter does produce a set of parallel lines of data (albeit in the form of a set of dots; ax.view_init set to best display the parallel aspect of the result).
How can I produce a plot with multiple parallel lines?
My current workaround is to replace the loop with a dozen different arrays of Y values, and that can't be the right answer.
from mpl_toolkits.mplot3d.axes3d import Axes3D
import matplotlib.pyplot as plt
import numpy as np
# preamble
fig = plt.figure()
ax = fig.add_subplot(111, projection='3d')
cs = ['r','g','b','y','r','g','b','y','r','g','b','y']
# x axis
X = np.arange(24)
# y axis
y = np.array([15,45,75,105,135,165,195,225,255,285,315,345])
Y = np.zeros(24)
# data - plotted against z axis
Z = np.random.rand(24)
# populate figure
for step in range(0,12):
    Y[:] = y[step]
    # ax.plot(X,Y,Z, color=cs[step])
    ax.scatter(X,Y,Z, color=cs[step])
ax.set_xlabel('X')
ax.set_ylabel('Y')
ax.set_zlabel('Z')
# set initial view of plot
ax.view_init(elev=80., azim=345.)
plt.show()
I'm still learning python, so simple solutions (or, preferably, those with copious explanatory comments) are greatly appreciated.
Use
ax.plot(X, np.array(Y), Z, color=cs[step])
or
Y = [y[step]] * 24
This looks like a bug in mpl where we are not copying the data when you hand it in, so every line shares the same np.array object, and when you update it, all of your lines change.
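The sharing described above can be reproduced without any plotting. The following sketch (shortened to three y-values and arrays of length 4; the lists shared and copied are illustrative names) stores the same buffer on every pass of the loop, next to an independent copy of it.

```python
import numpy as np

y = np.array([15, 45, 75])
Y = np.zeros(4)

shared, copied = [], []
for step in range(len(y)):
    Y[:] = y[step]           # overwrites the one existing buffer in place
    shared.append(Y)         # stores a reference to that same buffer
    copied.append(Y.copy())  # stores an independent snapshot

print([float(arr[0]) for arr in shared])  # [75.0, 75.0, 75.0]
print([float(arr[0]) for arr in copied])  # [15.0, 45.0, 75.0]
```

Both suggested fixes work for the same reason: np.array(Y) makes a copy by default, and Y = [y[step]] * 24 builds a new object on every iteration.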

Wireframe joins the wrong way in numpy matplotlib mplot3d

I'm trying to create a 3D wireframe in Python using matplotlib.
When I get to the actual graph plotting, however, the wireframe joins the wrong way, as shown in the images below.
How can I force matplotlib to join the wireframe along a certain axis?
My code is below:
import numpy as np
import matplotlib.pyplot as plt
from mpl_toolkits.mplot3d import axes3d
def rossler(x_n, y_n, z_n, h, a, b, c):
    #defining the rossler function
    x_n1=x_n+h*(-y_n-z_n)
    y_n1=y_n+h*(x_n+a*y_n)
    z_n1=z_n+h*(b+z_n*(x_n-c))
    return x_n1,y_n1,z_n1
#defining a, b, and c
a = 1.0/5.0
b = 1.0/5.0
c = 5
#defining time limits and steps
t_0 = 0
t_f = 32*np.pi
h = 0.01
steps = int((t_f-t_0)/h)
#3dify
c_list = np.linspace(5,10,6)
c_size = len(c_list)
c_array = np.zeros((c_size,steps))
for i in range (0, c_size):
    for j in range (0, steps):
        c_array[i][j] = c_list[i]
#create plotting values
t = np.zeros((c_size,steps))
for i in range (0, c_size):
    t[i] = np.linspace(t_0,t_f,steps)
x = np.zeros((c_size,steps))
y = np.zeros((c_size,steps))
z = np.zeros((c_size,steps))
binvar, array_size = x.shape
#initial conditions
x[0] = 0
y[0] = 0
z[0] = 0
for j in range(0, c_size-1):
    for i in range(array_size-1):
        c = c_list[j]
        #re-evaluate the values of the x-arrays depending on the initial conditions
        [x[j][i+1],y[j][i+1],z[j][i+1]]=rossler(x[j][i],y[j][i],z[j][i],t[j][i+1]-t[j][i],a,b,c)
fig = plt.figure()
ax = fig.add_subplot(111, projection='3d')
ax.plot_wireframe(t,x,c_array, rstride=10, cstride=10)
plt.show()
I am getting this as an output:
The same output from another angle:
Whereas I'd like the wireframe to join along the wave-peaks. Sorry, I can't give you an image I'd like to see, that's my problem, but I guess it'd be more like the tutorial image.
If I understood correctly, you want to link the 6 traces with polygons. You can do that by triangulating the traces 2 by 2, then plotting the surface with no edges or antialiasing. Choosing a good colormap may also help.
Just keep in mind that this will be a very heavy plot. The exported SVG weighs 10 MB :)
import matplotlib.tri as mtri
fig = plt.figure()
ax = fig.add_subplot(111, projection='3d')
for LineIndex in range(c_size-1):
    # If plotting all at once, you get a MemoryError. I'll plot each 6 points
    for Sample in range(0, array_size-1, 3):
        # I switched x and c_array, because the surface and the triangles
        # will look better by default
        X = np.concatenate([t[LineIndex,Sample:Sample+3], t[LineIndex+1,Sample:Sample+3]])
        Y = np.concatenate([c_array[LineIndex,Sample:Sample+3], c_array[LineIndex+1,Sample:Sample+3]])
        Z = np.concatenate([x[LineIndex,Sample:Sample+3], x[LineIndex+1,Sample:Sample+3]])
        T = mtri.Triangulation(X, Y)
        ax.plot_trisurf(X, Y, Z, triangles=T.triangles, edgecolor='none', antialiased=False)
ax.set_xlabel('t')
ax.set_zlabel('x')
plt.savefig('Test.png', format='png', dpi=600)
plt.show()
Here is the resulting image:
I'm quite unsure about what you're exactly trying to achieve, but I don't think it will work.
Here's what your data looks like when plotted layer by layer (without and with filling):
You're trying to plot this as a wireframe plot. Here's what a wireframe plot looks like, as per the manual:
Note the huge difference: a wireframe plot is essentially a proper surface plot, the only difference is that the faces of the surface are fully transparent. This also implies that you can only plot
single-valued functions of the form z(x,y), which are furthermore
specified on a rectangular mesh (at least topologically)
Your data is neither: your points are given along lines, and they are stacked on top of each other, so there's no chance that this is a single surface that can be plotted.
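For contrast, here is a minimal sketch of the kind of input a wireframe plot is designed for: a single-valued function z(x, y) sampled on a rectangular mesh. The function sin(x)*cos(y) and the grid sizes are arbitrary choices made for illustration, and the Agg backend is selected only so the snippet runs without a display.

```python
import numpy as np
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend so the sketch runs anywhere
import matplotlib.pyplot as plt
from mpl_toolkits.mplot3d import Axes3D  # registers the '3d' projection on older versions

x = np.linspace(-2, 2, 30)
y = np.linspace(-1, 1, 20)
X, Y = np.meshgrid(x, y)   # rectangular mesh: both arrays have shape (20, 30)
Z = np.sin(X) * np.cos(Y)  # exactly one z value per (x, y) mesh point

fig = plt.figure()
ax = fig.add_subplot(111, projection='3d')
ax.plot_wireframe(X, Y, Z, rstride=2, cstride=2)
```

Neighbouring entries of X, Y and Z are neighbours on the surface, which is what lets the wireframe connect them in both mesh directions.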
If you just want to visualize your functions above each other, here's how I plotted the above figures:
from mpl_toolkits.mplot3d.art3d import Poly3DCollection
fig = plt.figure()
ax = fig.add_subplot(111, projection='3d')
for zind in range(t.shape[0]):
    tnow,xnow,cnow = t[zind,:],x[zind,:],c_array[zind,:]
    hplot = ax.plot(tnow,xnow,cnow)
    # alternatively fill:
    stride = 10
    tnow,xnow,cnow = tnow[::stride],xnow[::stride],cnow[::stride]
    slice_from = slice(None,-1)
    slice_to = slice(1,None)
    xpoly = np.array([tnow[slice_from],
                      tnow[slice_to],
                      tnow[slice_to],
                      tnow[slice_from]]
                     ).T
    ypoly = np.array([xnow[slice_from],
                      xnow[slice_to],
                      np.zeros_like(xnow[slice_to]),
                      np.zeros_like(xnow[slice_from])]
                     ).T
    zpoly = np.array([cnow[slice_from],
                      cnow[slice_to],
                      cnow[slice_to],
                      cnow[slice_from]]
                     ).T
    tmppoly = [tuple(zip(xrow,yrow,zrow)) for xrow,yrow,zrow in zip(xpoly,ypoly,zpoly)]
    poly3dcoll = Poly3DCollection(tmppoly,linewidth=0.0)
    poly3dcoll.set_edgecolor(hplot[0].get_color())
    poly3dcoll.set_facecolor(hplot[0].get_color())
    ax.add_collection3d(poly3dcoll)
plt.xlabel('t')
plt.ylabel('x')
plt.show()
There is one other option: switching your coordinate axes, such that the (x,t) pair corresponds to a vertical plane rather than a horizontal one. In this case your functions for various c values are drawn on parallel planes. This allows a wireframe plot to be used properly, but since your functions have extrema in different time steps, the result is as confusing as your original plot. You can try using very few plots along the t axis, and hoping that the extrema are close. This approach needs so much guesswork that I didn't try to do this myself. You can plot each function as a filled surface instead, though:
from matplotlib.collections import PolyCollection
fig = plt.figure()
ax = fig.add_subplot(111, projection='3d')
for zind in range(t.shape[0]):
    tnow,xnow,cnow = t[zind,:],x[zind,:],c_array[zind,:]
    hplot = ax.plot(tnow,cnow,xnow)
    # alternative to fill:
    stride = 10
    tnow,xnow,cnow = tnow[::stride],xnow[::stride],cnow[::stride]
    slice_from = slice(None,-1)
    slice_to = slice(1,None)
    xpoly = np.array([tnow[slice_from],
                      tnow[slice_to],
                      tnow[slice_to],
                      tnow[slice_from]]
                     ).T
    ypoly = np.array([xnow[slice_from],
                      xnow[slice_to],
                      np.zeros_like(xnow[slice_to]),
                      np.zeros_like(xnow[slice_from])]
                     ).T
    tmppoly = [tuple(zip(xrow,yrow)) for xrow,yrow in zip(xpoly,ypoly)]
    polycoll = PolyCollection(tmppoly,linewidth=0.5)
    polycoll.set_edgecolor(hplot[0].get_color())
    polycoll.set_facecolor(hplot[0].get_color())
    ax.add_collection3d(polycoll,zdir='y',zs=cnow[0])
    hplot[0].set_color('none')
ax.set_xlabel('t')
ax.set_zlabel('x')
plt.show()
This results in something like this:
There are a few things to note, however.
3d scatter and wire plots are very hard to comprehend, due to the lacking depth information. You might be approaching your visualization problem in a fundamentally wrong way: maybe there are other options with which you can visualize your data.
Even if you do something like the plots I showed, you should be aware that matplotlib has historically been failing to plot complicated 3d objects properly. Now by "properly" I mean "with physically reasonable apparent depth", see also the mplot3d FAQ note describing exactly this. The core of the problem is that matplotlib projects every 3d object to 2d, and draws these pancakes on the screen one after the other. Sometimes the asserted drawing order of the pancakes doesn't correspond to their actual relative depth, which leads to artifacts that are both very obvious to humans and uncanny to look at. If you take a closer look at the first filled plot in this post, you'll see that the gold flat plot is behind the magenta one, even though it should be on top of it. Similar things often happen with 3d bar plots and convoluted surfaces.
When you're saying "Sorry, I can't give you an image I'd like to see, that's my problem", you're very wrong. It's not just your problem. It might be crystal clear in your head what you're trying to achieve, but unless you very clearly describe what you see in your head, the outside world will have to resort to guesswork. You can make the work of others and yourself alike easier by trying to be as informative as possible.

Can I pass a list of colors for points to matplotlib's 'Axes.plot()'?

I've got a lot of points to plot and am noticing that plotting them individually in matplotlib takes much longer (more than 100 times longer, according to cProfile) than plotting them all at once.
However, I need to color code the points (based on data associated with each one) and can't figure out how to plot more than one color for a given call to Axes.plot(). For example, I can get a result similar to the one I want with something like
fig, ax = matplotlib.pyplot.subplots()
rands = numpy.random.random_sample((10000,))
for x in range(10000):
    ax.plot(x, rands[x], 'o', color=str(rands[x]))
matplotlib.pyplot.show()
but would rather do something much faster like
fig, ax = matplotlib.pyplot.subplots()
rands = numpy.random.random_sample((10000,))
# List of colors doesn't work
ax.plot(range(10000), rands, 'o', color=[str(y) for y in rands])
matplotlib.pyplot.show()
but providing a list as the value for color doesn't work in this way.
Is there a way to provide a list of colors (and for that matter, edge colors, face colors , shapes, z-order, etc.) to Axes.plot() so that each point can potentially be customized, but all points can be plotted at once?
Using Axes.scatter() seems to get part way there, since it allows for individual setting of point color; but color is as far as that seems to go. (Axes.scatter() also lays out the figure completely differently.)
It is about 5 times faster for me to create the objects (patches) directly. To illustrate the example, I have changed the limits (which have to be set manually with this method). The circles themselves are drawn with matplotlib.path.Path.circle. Minimal working example:
import numpy as np
import matplotlib.pyplot as plt
from matplotlib.patches import Circle
from matplotlib.collections import PatchCollection
# N was undefined in the original snippet; 10000 matches the question
N = 10000
fig, ax = plt.subplots(figsize=(10,10))
rands = np.random.random_sample((N,))
patches = []
colors = []
for x in range(N):
    C = Circle((x/float(N), rands[x]), .01)
    colors.append([rands[x],rands[x],rands[x]])
    patches.append(C)
plt.axis('equal')
ax.set_xlim(0,1)
ax.set_ylim(0,1)
collection = PatchCollection(patches)
collection.set_facecolor(colors)
ax.add_collection(collection)
plt.show()
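As the question itself notes, Axes.scatter() does accept one color per point. For the color part of the problem, a sketch of the single-call version could look like this; the grey colormap is chosen to mimic the grey-level strings used in the question, and the marker size s=9 and the Agg backend are arbitrary choices for the example.

```python
import numpy as np
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend so the sketch runs anywhere
import matplotlib.pyplot as plt

N = 10000
rng = np.random.default_rng(0)
rands = rng.random(N)

fig, ax = plt.subplots()
# One call for all points: c holds one value per point and the colormap
# turns each value into a grey level (0 is black, 1 is white).
sc = ax.scatter(np.arange(N), rands, c=rands, cmap="gray", vmin=0, vmax=1, s=9)
```

This covers per-point colors and sizes only; per-point marker shapes still need separate calls or custom paths.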

Speeding up matplotlib scatter plots

I'm trying to make an interactive program which primarily uses matplotlib to make scatter plots of rather a lot of points (10k-100k or so). Right now it works, but changes take too long to render. Small numbers of points are ok, but once the number rises things get frustrating in a hurry. So, I'm working on ways to speed up scatter, but I'm not having much luck.
There's the obvious way to do things (the way it's implemented now)
(I realize the plot redraws without updating. I didn't want to alter the fps result with large calls to random).
import matplotlib.pyplot as plt
import numpy as np
import matplotlib as mpl
import time
X = np.random.randn(10000) #x pos
Y = np.random.randn(10000) #y pos
C = np.random.random(10000) #will be color
S = (1+np.random.randn(10000)**2)*3 #size
#build the colors from a color map
colors = mpl.cm.jet(C)
#there are easier ways to do static alpha, but this allows
#per point alpha later on.
colors[:,3] = 0.1
fig, ax = plt.subplots()
fig.show()
background = fig.canvas.copy_from_bbox(ax.bbox)
#this makes the base collection
coll = ax.scatter(X,Y,facecolor=colors, s=S, edgecolor='None',marker='D')
fig.canvas.draw()
sTime = time.time()
for i in range(10):
    print(i)
    #don't change anything, but redraw the plot
    ax.cla()
    coll = ax.scatter(X,Y,facecolor=colors, s=S, edgecolor='None',marker='D')
    fig.canvas.draw()
#note: as written, this expression is elapsed seconds per frame, not frames per second
print('%2.1f FPS'%( (time.time()-sTime)/10 ))
Which gives a speedy 0.7 fps
Alternatively, I can edit the collection returned by scatter. For that, I can change color and position, but don't know how to change the size of each point. That would I think look something like this
import matplotlib.pyplot as plt
import numpy as np
import matplotlib as mpl
import time
X = np.random.randn(10000) #x pos
Y = np.random.randn(10000) #y pos
C = np.random.random(10000) #will be color
S = (1+np.random.randn(10000)**2)*3 #size
#build the colors from a color map
colors = mpl.cm.jet(C)
#there are easier ways to do static alpha, but this allows
#per point alpha later on.
colors[:,3] = 0.1
fig, ax = plt.subplots()
fig.show()
background = fig.canvas.copy_from_bbox(ax.bbox)
#this makes the base collection
coll = ax.scatter(X,Y,facecolor=colors, s=S, edgecolor='None', marker='D')
fig.canvas.draw()
sTime = time.time()
for i in range(10):
    print(i)
    #don't change anything, but redraw the plot
    coll.set_facecolors(colors)
    coll.set_offsets( np.array([X,Y]).T )
    #for starters lets not change anything!
    fig.canvas.restore_region(background)
    ax.draw_artist(coll)
    fig.canvas.blit(ax.bbox)
#note: as written, this expression is elapsed seconds per frame, not frames per second
print('%2.1f FPS'%( (time.time()-sTime)/10 ))
This results in a slower 0.7 fps. I wanted to try using CircleCollection or RegularPolygonCollection, as this would allow me to change the sizes easily, and I don't care about changing the marker. But, I can't get either to draw so I have no idea if they'd be faster. So, at this point I'm looking for ideas.
I've been through this a few times trying to speed up scatter plots with large numbers of points, variously trying:
Different marker types
Limiting colours
Cutting down the dataset
Using a heatmap / grid instead of a scatter plot
And none of these things worked. Matplotlib is just not very performant when it comes to scatter plots. My only recommendation is to use a different plotting library, though I haven't personally found one that was suitable. I know this doesn't help much, but it may save you some hours of fruitless tinkering.
We are actively working on performance for large matplotlib scatter plots.
I'd encourage you to get involved in the conversation (http://matplotlib.1069221.n5.nabble.com/mpl-1-2-1-Speedup-code-by-removing-startswith-calls-and-some-for-loops-td41767.html) and, even better, test out the pull request that has been submitted to make life much better for a similar case (https://github.com/matplotlib/matplotlib/pull/2156).
HTH
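On the narrower point raised in the question, changing the size of each point in an existing collection: the collection returned by scatter has a set_sizes method alongside set_offsets and set_facecolors. The sketch below only shows the update calls and makes no claim about how much faster it is; the doubled sizes and random colors are placeholder data, and the Agg backend is used so it runs without a display.

```python
import numpy as np
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend so the sketch runs anywhere
import matplotlib.pyplot as plt

rng = np.random.default_rng(0)
n = 10000
X, Y = rng.standard_normal(n), rng.standard_normal(n)
S = (1 + rng.standard_normal(n) ** 2) * 3

fig, ax = plt.subplots()
coll = ax.scatter(X, Y, s=S, marker="D", edgecolor="none")

# Update the existing collection in place instead of calling scatter again.
new_sizes = S * 2.0                     # placeholder: just double every size
new_colors = plt.cm.jet(rng.random(n))  # (n, 4) RGBA array
new_colors[:, 3] = 0.1                  # per-point alpha
coll.set_sizes(new_sizes)
coll.set_facecolors(new_colors)
coll.set_offsets(np.column_stack([X, Y]))
fig.canvas.draw()
```

The same three setters can be called inside the redraw loop from the question before restoring the background and blitting.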
