Plotting large numbers of points and edges in matplotlib - python

I have some points (about 3000) and edges (about 6000) in this type of format:
points = numpy.array([1,2],[4,5],[2,7],[3,9],[9,2])
edges = numpy.array([0,1],[3,4],[3,2],[2,4])
where the edges are indices into points, so that the start and end coordinates of each edges is given by:
points[edges]
I am looking for a faster / better way to plot them. Currently I have:
from matplotlib import pyplot as plt
x = points[:,0].flatten()
y = points[:,1].flatten()
plt.plot(x[edges.T], y[edges.T], 'y-') # Edges
plt.plot(x, y, 'ro') # Points
plt.savefig('figure.png')
I read about lineCollections, but am unsure how to use them. Is there a way to use my data more directly? What is the bottleneck here?
Some more realistic test data, time to plot is about 132 seconds:
points = numpy.random.randint(0, 100, (3000, 2))
edges = numpy.random.randint(0, 3000, (6000, 2))

Well, I found the following which is much faster:
from matplotlib import pyplot as plt
from matplotlib.collections import LineCollection
lc = LineCollection(points[edges])
fig = plt.figure()
plt.gca().add_collection(lc)
plt.xlim(points[:,0].min(), points[:,0].max())
plt.ylim(points[:,1].min(), points[:,1].max())
plt.plot(points[:,0], points[:,1], 'ro')
fig.savefig('full_figure.png')
Is it still possible to do it faster?

You could also just do it in a single plot call, which is considerably faster than two (though probably essentially the same as adding a LineCollection).
import numpy
import matplotlib.pyplot as plt
points = numpy.array([[1,2],[4,5],[2,7],[3,9],[9,2]])
edges = numpy.array([[0,1],[3,4],[3,2],[2,4]])
x = points[:,0].flatten()
y = points[:,1].flatten()
plt.plot(x[edges.T], y[edges.T], linestyle='-', color='y',
markerfacecolor='red', marker='o')
plt.show()

Related

Custom "multicolored lines" in matplotlib [duplicate]

I have two list as below:
latt=[42.0,41.978567980875397,41.96622693388357,41.963791391892457,...,41.972407378075879]
lont=[-66.706920989908909,-66.703116557977069,-66.707351643324543,...-66.718218142021925]
now I want to plot this as a line, separate each 10 of those 'latt' and 'lont' records as a period and give it a unique color.
what should I do?
There are several different ways to do this. The "best" approach will depend mostly on how many line segments you want to plot.
If you're just going to be plotting a handful (e.g. 10) line segments, then just do something like:
import numpy as np
import matplotlib.pyplot as plt
def uniqueish_color():
"""There're better ways to generate unique colors, but this isn't awful."""
return plt.cm.gist_ncar(np.random.random())
xy = (np.random.random((10, 2)) - 0.5).cumsum(axis=0)
fig, ax = plt.subplots()
for start, stop in zip(xy[:-1], xy[1:]):
x, y = zip(start, stop)
ax.plot(x, y, color=uniqueish_color())
plt.show()
If you're plotting something with a million line segments, though, this will be terribly slow to draw. In that case, use a LineCollection. E.g.
import numpy as np
import matplotlib.pyplot as plt
from matplotlib.collections import LineCollection
xy = (np.random.random((1000, 2)) - 0.5).cumsum(axis=0)
# Reshape things so that we have a sequence of:
# [[(x0,y0),(x1,y1)],[(x0,y0),(x1,y1)],...]
xy = xy.reshape(-1, 1, 2)
segments = np.hstack([xy[:-1], xy[1:]])
fig, ax = plt.subplots()
coll = LineCollection(segments, cmap=plt.cm.gist_ncar)
coll.set_array(np.random.random(xy.shape[0]))
ax.add_collection(coll)
ax.autoscale_view()
plt.show()
For both of these cases, we're just drawing random colors from the "gist_ncar" coloramp. Have a look at the colormaps here (gist_ncar is about 2/3 of the way down): http://matplotlib.org/examples/color/colormaps_reference.html
Copied from this example:
import numpy as np
import matplotlib.pyplot as plt
from matplotlib.collections import LineCollection
from matplotlib.colors import ListedColormap, BoundaryNorm
x = np.linspace(0, 3 * np.pi, 500)
y = np.sin(x)
z = np.cos(0.5 * (x[:-1] + x[1:])) # first derivative
# Create a colormap for red, green and blue and a norm to color
# f' < -0.5 red, f' > 0.5 blue, and the rest green
cmap = ListedColormap(['r', 'g', 'b'])
norm = BoundaryNorm([-1, -0.5, 0.5, 1], cmap.N)
# Create a set of line segments so that we can color them individually
# This creates the points as a N x 1 x 2 array so that we can stack points
# together easily to get the segments. The segments array for line collection
# needs to be numlines x points per line x 2 (x and y)
points = np.array([x, y]).T.reshape(-1, 1, 2)
segments = np.concatenate([points[:-1], points[1:]], axis=1)
# Create the line collection object, setting the colormapping parameters.
# Have to set the actual values used for colormapping separately.
lc = LineCollection(segments, cmap=cmap, norm=norm)
lc.set_array(z)
lc.set_linewidth(3)
fig1 = plt.figure()
plt.gca().add_collection(lc)
plt.xlim(x.min(), x.max())
plt.ylim(-1.1, 1.1)
plt.show()
See the answer here to generate the "periods" and then use the matplotlib scatter function as #tcaswell mentioned. Using the plot.hold function you can plot each period, colors will increment automatically.
Cribbing the color choice off of #JoeKington,
import numpy as np
import matplotlib.pyplot as plt
def uniqueish_color(n):
"""There're better ways to generate unique colors, but this isn't awful."""
return plt.cm.gist_ncar(np.random.random(n))
plt.scatter(latt, lont, c=uniqueish_color(len(latt)))
You can do this with scatter.
I have been searching for a short solution how to use pyplots line plot to show a time series coloured by a label feature without using scatter due to the amount of data points.
I came up with the following workaround:
plt.plot(np.where(df["label"]==1, df["myvalue"], None), color="red", label="1")
plt.plot(np.where(df["label"]==0, df["myvalue"], None), color="blue", label="0")
plt.legend()
The drawback is you are creating two different line plots so the connection between the different classes is not shown. For my purposes it is not a big deal. It may help someone.

Adding a colorbar to pyplot [duplicate]

I have a sequence of line plots for two variables (x,y) for a number of different values of a variable z. I would normally add the line plots with legends like this:
import matplotlib.pyplot as plt
fig = plt.figure()
ax = fig.add_subplot(111)
# suppose mydata is a list of tuples containing (xs, ys, z)
# where xs and ys are lists of x's and y's and z is a number.
legns = []
for(xs,ys,z) in mydata:
pl = ax.plot(xs,ys,color = (z,0,0))
legns.append("z = %f"%(z))
ax.legends(legns)
plt.show()
But I have too many graphs and the legends will cover the graph. I'd rather have a colorbar indicating the value of z corresponding to the color. I can't find anything like that in the galery and all my attempts do deal with the colorbar failed. Apparently I must create a collection of plots before trying to add a colorbar.
Is there an easy way to do this? Thanks.
EDIT (clarification):
I wanted to do something like this:
import matplotlib.pyplot as plt
import matplotlib.cm as cm
fig = plt.figure()
ax = fig.add_subplot(111)
mycmap = cm.hot
# suppose mydata is a list of tuples containing (xs, ys, z)
# where xs and ys are lists of x's and y's and z is a number between 0 and 1
plots = []
for(xs,ys,z) in mydata:
pl = ax.plot(xs,ys,color = mycmap(z))
plots.append(pl)
fig.colorbar(plots)
plt.show()
But this won't work according to the Matplotlib reference because a list of plots is not a "mappable", whatever this means.
I've created an alternative plot function using LineCollection:
def myplot(ax,xs,ys,zs, cmap):
plot = lc([zip(x,y) for (x,y) in zip(xs,ys)], cmap = cmap)
plot.set_array(array(zs))
x0,x1 = amin(xs),amax(xs)
y0,y1 = amin(ys),amax(ys)
ax.add_collection(plot)
ax.set_xlim(x0,x1)
ax.set_ylim(y0,y1)
return plot
xs and ys are lists of lists of x and y coordinates and zs is a list of the different conditions to colorize each line. It feels a bit like a cludge though... I thought that there would be a more neat way to do this. I like the flexibility of the plt.plot() function.
(I know this is an old question but...) Colorbars require a matplotlib.cm.ScalarMappable, plt.plot produces lines which are not scalar mappable, therefore, in order to make a colorbar, we are going to need to make a scalar mappable.
Ok. So the constructor of a ScalarMappable takes a cmap and a norm instance. (norms scale data to the range 0-1, cmaps you have already worked with and take a number between 0-1 and returns a color). So in your case:
import matplotlib.pyplot as plt
sm = plt.cm.ScalarMappable(cmap=my_cmap, norm=plt.normalize(min=0, max=1))
plt.colorbar(sm)
Because your data is in the range 0-1 already, you can simplify the sm creation to:
sm = plt.cm.ScalarMappable(cmap=my_cmap)
EDIT: For matplotlib v1.2 or greater the code becomes:
import matplotlib.pyplot as plt
sm = plt.cm.ScalarMappable(cmap=my_cmap, norm=plt.normalize(vmin=0, vmax=1))
# fake up the array of the scalar mappable. Urgh...
sm._A = []
plt.colorbar(sm)
EDIT: For matplotlib v1.3 or greater the code becomes:
import matplotlib.pyplot as plt
sm = plt.cm.ScalarMappable(cmap=my_cmap, norm=plt.Normalize(vmin=0, vmax=1))
# fake up the array of the scalar mappable. Urgh...
sm._A = []
plt.colorbar(sm)
EDIT: For matplotlib v3.1 or greater simplifies to:
import matplotlib.pyplot as plt
sm = plt.cm.ScalarMappable(cmap=my_cmap, norm=plt.Normalize(vmin=0, vmax=1))
plt.colorbar(sm)
Here's one way to do it while still using plt.plot(). Basically, you make a throw-away plot and get the colorbar from there.
import matplotlib as mpl
import matplotlib.pyplot as plt
min, max = (-40, 30)
step = 10
# Setting up a colormap that's a simple transtion
mymap = mpl.colors.LinearSegmentedColormap.from_list('mycolors',['blue','red'])
# Using contourf to provide my colorbar info, then clearing the figure
Z = [[0,0],[0,0]]
levels = range(min,max+step,step)
CS3 = plt.contourf(Z, levels, cmap=mymap)
plt.clf()
# Plotting what I actually want
X=[[1,2],[1,2],[1,2],[1,2]]
Y=[[1,2],[1,3],[1,4],[1,5]]
Z=[-40,-20,0,30]
for x,y,z in zip(X,Y,Z):
# setting rgb color based on z normalized to my range
r = (float(z)-min)/(max-min)
g = 0
b = 1-r
plt.plot(x,y,color=(r,g,b))
plt.colorbar(CS3) # using the colorbar info I got from contourf
plt.show()
It's a little wasteful, but convenient. It's also not very wasteful if you make multiple plots as you can call plt.colorbar() without regenerating the info for it.
Here is a slightly simplied example inspired by the top answer given by Boris and Hooked (Thanks for the great idea!):
1. Discrete colorbar
Discrete colorbar is more involved, because colormap generated by mpl.cm.get_cmap() is not a mappable image needed as a colorbar() argument. A dummie mappable needs to generated as shown below:
import numpy as np
import matplotlib.pyplot as plt
import matplotlib as mpl
n_lines = 5
x = np.linspace(0, 10, 100)
y = np.sin(x[:, None] + np.pi * np.linspace(0, 1, n_lines))
c = np.arange(1, n_lines + 1)
cmap = mpl.cm.get_cmap('jet', n_lines)
fig, ax = plt.subplots(dpi=100)
# Make dummie mappable
dummie_cax = ax.scatter(c, c, c=c, cmap=cmap)
# Clear axis
ax.cla()
for i, yi in enumerate(y.T):
ax.plot(x, yi, c=cmap(i))
fig.colorbar(dummie_cax, ticks=c)
plt.show();
This will produce a plot with a discrete colorbar:
2. Continuous colorbar
Continuous colorbar is less involved, as mpl.cm.ScalarMappable() allows us to obtain an "image" for colorbar().
import numpy as np
import matplotlib.pyplot as plt
import matplotlib as mpl
n_lines = 5
x = np.linspace(0, 10, 100)
y = np.sin(x[:, None] + np.pi * np.linspace(0, 1, n_lines))
c = np.arange(1, n_lines + 1)
norm = mpl.colors.Normalize(vmin=c.min(), vmax=c.max())
cmap = mpl.cm.ScalarMappable(norm=norm, cmap=mpl.cm.jet)
cmap.set_array([])
fig, ax = plt.subplots(dpi=100)
for i, yi in enumerate(y.T):
ax.plot(x, yi, c=cmap.to_rgba(i + 1))
fig.colorbar(cmap, ticks=c)
plt.show();
This will produce a plot with a continuous colorbar:
[Side note] In this example, I personally don't know why cmap.set_array([]) is necessary (otherwise we'd get error messages). If someone understand the principles under the hood, please comment :)
As other answers here do try to use dummy plots, which is not really good style, here is a generic code for a
Discrete colorbar
A discrete colorbar is produced in the same way a continuous colorbar is created, just with a different Normalization. In this case a BoundaryNorm should be used.
import numpy as np
import matplotlib.pyplot as plt
import matplotlib.colors
n_lines = 5
x = np.linspace(0, 10, 100)
y = np.sin(x[:, None] + np.pi * np.linspace(0, 1, n_lines))
c = np.arange(1., n_lines + 1)
cmap = plt.get_cmap("jet", len(c))
norm = matplotlib.colors.BoundaryNorm(np.arange(len(c)+1)+0.5,len(c))
sm = plt.cm.ScalarMappable(norm=norm, cmap=cmap)
sm.set_array([]) # this line may be ommitted for matplotlib >= 3.1
fig, ax = plt.subplots(dpi=100)
for i, yi in enumerate(y.T):
ax.plot(x, yi, c=cmap(i))
fig.colorbar(sm, ticks=c)
plt.show()

Matplotlib 2D line plot with color as a function of a third variable, plus colorbar [duplicate]

I have two list as below:
latt=[42.0,41.978567980875397,41.96622693388357,41.963791391892457,...,41.972407378075879]
lont=[-66.706920989908909,-66.703116557977069,-66.707351643324543,...-66.718218142021925]
now I want to plot this as a line, separate each 10 of those 'latt' and 'lont' records as a period and give it a unique color.
what should I do?
There are several different ways to do this. The "best" approach will depend mostly on how many line segments you want to plot.
If you're just going to be plotting a handful (e.g. 10) line segments, then just do something like:
import numpy as np
import matplotlib.pyplot as plt
def uniqueish_color():
"""There're better ways to generate unique colors, but this isn't awful."""
return plt.cm.gist_ncar(np.random.random())
xy = (np.random.random((10, 2)) - 0.5).cumsum(axis=0)
fig, ax = plt.subplots()
for start, stop in zip(xy[:-1], xy[1:]):
x, y = zip(start, stop)
ax.plot(x, y, color=uniqueish_color())
plt.show()
If you're plotting something with a million line segments, though, this will be terribly slow to draw. In that case, use a LineCollection. E.g.
import numpy as np
import matplotlib.pyplot as plt
from matplotlib.collections import LineCollection
xy = (np.random.random((1000, 2)) - 0.5).cumsum(axis=0)
# Reshape things so that we have a sequence of:
# [[(x0,y0),(x1,y1)],[(x0,y0),(x1,y1)],...]
xy = xy.reshape(-1, 1, 2)
segments = np.hstack([xy[:-1], xy[1:]])
fig, ax = plt.subplots()
coll = LineCollection(segments, cmap=plt.cm.gist_ncar)
coll.set_array(np.random.random(xy.shape[0]))
ax.add_collection(coll)
ax.autoscale_view()
plt.show()
For both of these cases, we're just drawing random colors from the "gist_ncar" coloramp. Have a look at the colormaps here (gist_ncar is about 2/3 of the way down): http://matplotlib.org/examples/color/colormaps_reference.html
Copied from this example:
import numpy as np
import matplotlib.pyplot as plt
from matplotlib.collections import LineCollection
from matplotlib.colors import ListedColormap, BoundaryNorm
x = np.linspace(0, 3 * np.pi, 500)
y = np.sin(x)
z = np.cos(0.5 * (x[:-1] + x[1:])) # first derivative
# Create a colormap for red, green and blue and a norm to color
# f' < -0.5 red, f' > 0.5 blue, and the rest green
cmap = ListedColormap(['r', 'g', 'b'])
norm = BoundaryNorm([-1, -0.5, 0.5, 1], cmap.N)
# Create a set of line segments so that we can color them individually
# This creates the points as a N x 1 x 2 array so that we can stack points
# together easily to get the segments. The segments array for line collection
# needs to be numlines x points per line x 2 (x and y)
points = np.array([x, y]).T.reshape(-1, 1, 2)
segments = np.concatenate([points[:-1], points[1:]], axis=1)
# Create the line collection object, setting the colormapping parameters.
# Have to set the actual values used for colormapping separately.
lc = LineCollection(segments, cmap=cmap, norm=norm)
lc.set_array(z)
lc.set_linewidth(3)
fig1 = plt.figure()
plt.gca().add_collection(lc)
plt.xlim(x.min(), x.max())
plt.ylim(-1.1, 1.1)
plt.show()
See the answer here to generate the "periods" and then use the matplotlib scatter function as #tcaswell mentioned. Using the plot.hold function you can plot each period, colors will increment automatically.
Cribbing the color choice off of #JoeKington,
import numpy as np
import matplotlib.pyplot as plt
def uniqueish_color(n):
"""There're better ways to generate unique colors, but this isn't awful."""
return plt.cm.gist_ncar(np.random.random(n))
plt.scatter(latt, lont, c=uniqueish_color(len(latt)))
You can do this with scatter.
I have been searching for a short solution how to use pyplots line plot to show a time series coloured by a label feature without using scatter due to the amount of data points.
I came up with the following workaround:
plt.plot(np.where(df["label"]==1, df["myvalue"], None), color="red", label="1")
plt.plot(np.where(df["label"]==0, df["myvalue"], None), color="blue", label="0")
plt.legend()
The drawback is you are creating two different line plots so the connection between the different classes is not shown. For my purposes it is not a big deal. It may help someone.

How to fill rainbow color under a curve in Python matplotlib

I want to fill rainbow color under a curve. Actually the function matplotlib.pyplot.fill_between can fill area under a curve with a single color.
import matplotlib.pyplot as plt
import numpy as np
x = np.linspace(0, 100, 50)
y = -(x-50)**2 + 2500
plt.plot(x,y)
plt.fill_between(x,y, color='green')
plt.show()
Is there a knob I can tweak the color to be rainbow? Thanks.
This is pretty easy to hack if you want "fill" with a series of rectangles:
import numpy as np
import pylab as plt
def rect(x,y,w,h,c):
ax = plt.gca()
polygon = plt.Rectangle((x,y),w,h,color=c)
ax.add_patch(polygon)
def rainbow_fill(X,Y, cmap=plt.get_cmap("jet")):
plt.plot(X,Y,lw=0) # Plot so the axes scale correctly
dx = X[1]-X[0]
N = float(X.size)
for n, (x,y) in enumerate(zip(X,Y)):
color = cmap(n/N)
rect(x,0,dx,y,color)
# Test data
X = np.linspace(0,10,100)
Y = .25*X**2 - X
rainbow_fill(X,Y)
plt.show()
You can smooth out the jagged edges by making the rectangles smaller (i.e. use more points). Additionally you could use a trapezoid (or even an interpolated polynomial) to refine the "rectangles".
If you mean giving some clever argument to "color=" I'm afraid this doesn't exist to the best of my knowledge. You could do this manually by setting a quadratic line for each color and varying the offset. Filling between them with the correct colors will give a rainbowish This makes a fun project to learn some python but if you don't feel like trying here is an example:
import matplotlib.pyplot as plt
import numpy as np
x = np.linspace(0, 100, 50)
y_old = -(x-50)**2 + 2500
for delta, color in zip([2250, 2000, 1750, 1500, 1250, 1000], ["r", "orange", "g", "b", "indigo", "violet"] ):
y_new = -(x-50)**2 + delta
plt.plot(x, y, "-k")
plt.fill_between(x, y_old, y_new, color=color)
y_old = y_new
plt.ylim(0, 2500)
plt.show()
As you will notice this does not look like a rainbow. This is because the function we are using is a quadratic, in actual fact a rainbow is made of circles with different radii (there is also a fun maths project here!). This is also plotable by matplotlib, I would try this and make it so you can plot more than the 7 colors in the rainbow e.g plot 1000 colors spanning the entire spectrum to make it really look like a rainbow!
Here is a modified solution of the accepted answer that uses trapezoids instead of rectangles.
import numpy as np
import pylab as plt
# a solution that uses rectangles
def rect(x,y,w,h,c):
ax = plt.gca()
polygon = plt.Rectangle((x,y),w,h,color=c)
ax.add_patch(polygon)
# a solution that uses trapezoids
def polygon(x1,y1,x2,y2,c):
ax = plt.gca()
polygon = plt.Polygon( [ (x1,y1), (x2,y2), (x2,0), (x1,0) ], color=c )
ax.add_patch(polygon)
def rainbow_fill(X,Y, cmap=plt.get_cmap("jet")):
plt.plot(X,Y,lw=0) # Plot so the axes scale correctly
dx = X[1]-X[0]
N = float(X.size)
for n, (x,y) in enumerate(zip(X,Y)):
color = cmap(n/N)
# uncomment to use rectangles
# rect(x,0,dx,y,color)
# uncomment to use trapezoids
if n+1 == N: continue
polygon(x,y,X[n+1],Y[n+1],color)
# Test data
X = np.linspace(0,10,100)
Y = .25*X**2 - X
rainbow_fill(X,Y)
plt.show()

setting color range in matplotlib patchcollection

I am plotting a PatchCollection in matplotlib with coords and patch color values read in from a file.
The problem is that matplotlib seems to automatically scale the color range to the min/max of the data values. How can I manually set the color range? E.g. if my data range is 10-30, but I want to scale this to a color range of 5-50 (e.g. to compare to another plot), how can I do this?
My plotting commands look much the same as in the api example code: patch_collection.py
colors = 100 * pylab.rand(len(patches))
p = PatchCollection(patches, cmap=matplotlib.cm.jet, alpha=0.4)
p.set_array(pylab.array(colors))
ax.add_collection(p)
pylab.colorbar(p)
pylab.show()
Use p.set_clim([5, 50]) to set the color scaling minimums and maximums in the case of your example. Anything in matplotlib that has a colormap has the get_clim and set_clim methods.
As a full example:
import matplotlib
import matplotlib.pyplot as plt
from matplotlib.collections import PatchCollection
from matplotlib.patches import Circle
import numpy as np
# (modified from one of the matplotlib gallery examples)
resolution = 50 # the number of vertices
N = 100
x = np.random.random(N)
y = np.random.random(N)
radii = 0.1*np.random.random(N)
patches = []
for x1, y1, r in zip(x, y, radii):
circle = Circle((x1, y1), r)
patches.append(circle)
fig = plt.figure()
ax = fig.add_subplot(111)
colors = 100*np.random.random(N)
p = PatchCollection(patches, cmap=matplotlib.cm.jet, alpha=0.4)
p.set_array(colors)
ax.add_collection(p)
fig.colorbar(p)
fig.show()
Now, if we just add p.set_clim([5, 50]) (where p is the patch collection) somewhere before we call fig.show(...), we get this:

Categories