With matplotlib I am trying to plot 3D data as a 2D colormap. Each point has an x and a y coordinate, and a 'height' z. This height should determine the color a certain x/y region is colored in.
Here is the code I have been trying:
import random
import numpy as np
import matplotlib.pyplot as plt
x = []
y = []
z = []
for index in range(100):
    a = random.random()
    b = random.random()
    c = np.exp(-a*a - b*b)
    x.append(a)
    y.append(b)
    z.append(c)
cmap = plt.get_cmap('PiYG')
fig, ax = plt.subplots()
ax.pcolormesh(x, y, z, cmap=cmap)
But it gives an error
ValueError: not enough values to unpack (expected 2, got 1)
Maybe I am trying the wrong thing?
Remark: The three lists x, y, z are calculated for the example above, but in reality I just have three lists with "random" numbers in them that I want to visualize. I cannot calculate z given x and y.
I could also use imshow to create the plot I want, but I have to convert my original data into a matrix first. Maybe there is a function I can use?
pcolormesh might not be the right choice for this kind of problem. pcolormesh expects ordered cell edges as data rather than random data points. You could do this if you know your grid beforehand, e.g.
import matplotlib.pyplot as plt
import numpy as np
x = np.linspace(0, 1, 51)
# meshgrid makes a 2D grid of points
xx, yy = np.meshgrid(x, x)
z = np.exp(-xx**2 - yy**2)
fig, ax = plt.subplots()
ax.pcolormesh(xx, yy, z, cmap="PiYG")
which will give you
Alternatively, you could use one of the tri functions such as tripcolor with your existing setup
import random
import numpy as np
import matplotlib.pyplot as plt
x = []
y = []
z = []
for index in range(100):
    a = random.random()
    b = random.random()
    c = np.exp(-a*a - b*b)
    x.append(a)
    y.append(b)
    z.append(c)
fig, ax = plt.subplots()
ax.tripcolor(x, y, z, cmap="PiYG")
which will give
Note it would be simpler to use np.random to generate your data
x, y = np.random.random(size=(2, 100))
z = np.exp(-x**2 - y**2)
fig, ax = plt.subplots()
ax.tripcolor(x, y, z, cmap="PiYG")
The problem is the shape of the inputs: pcolormesh needs z as a 2D array (a matrix), with x and y describing the grid it sits on, but here all three are 1-dimensional.
In order to generate x and y axis, you could use:
x = []
y = []
for index in range(100):
    x.append(random.random())
    y.append(random.random())
Then you have to create a meshgrid:
X, Y = np.meshgrid(x, y)
Finally you can compute Z over the meshgrid:
Z = np.exp(-X**2 - Y**2)
In this way, your code:
cmap = plt.get_cmap('PiYG')
fig, ax = plt.subplots()
ax.pcolormesh(X, Y, Z, cmap=cmap)
gives:
If you cannot compute Z on the meshgrid, then you should not use pcolormesh.
Some alternatives could be:
3D scatterplot:
import random
import numpy as np
import matplotlib.pyplot as plt
x = []
y = []
z = []
for index in range(100):
    a = random.random()
    b = random.random()
    c = np.exp(-a*a - b*b)
    x.append(a)
    y.append(b)
    z.append(c)
cmap = plt.get_cmap('PiYG')
fig = plt.figure()
ax = fig.add_subplot(projection = '3d')
ax.scatter(x, y, z, c=z, cmap=cmap)  # c=z is needed for the colormap to take effect
plt.show()
2D colored scatterplot:
import random
import numpy as np
import matplotlib.pyplot as plt
x = []
y = []
z = []
for index in range(100):
    a = random.random()
    b = random.random()
    c = np.exp(-a*a - b*b)
    x.append(a)
    y.append(b)
    z.append(c)
cmap = plt.get_cmap('PiYG')
plt.style.use('seaborn-v0_8-darkgrid')  # called 'seaborn-darkgrid' before matplotlib 3.6
fig, ax = plt.subplots()
ax.scatter(x, y, c = z, cmap=cmap)
plt.show()
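If a regular image is still wanted (for imshow or pcolormesh), the scattered points can first be interpolated onto a grid. The following is only a minimal sketch: it assumes SciPy is available and uses scipy.interpolate.griddata, which none of the answers above rely on. The 50 x 50 grid size and the names xi, yi and zi are arbitrary choices for illustration.

```python
import numpy as np
from scipy.interpolate import griddata

# Scattered sample points, generated here only for illustration
rng = np.random.default_rng(0)
x, y = rng.random((2, 100))
z = np.exp(-x**2 - y**2)

# Regular grid spanning the data range (50 x 50 is an arbitrary choice)
xi = np.linspace(x.min(), x.max(), 50)
yi = np.linspace(y.min(), y.max(), 50)
xx, yy = np.meshgrid(xi, yi)

# Linear interpolation; grid nodes outside the convex hull of the points become NaN
zi = griddata((x, y), z, (xx, yy), method="linear")
```

The gridded result can then be drawn with, for example, ax.pcolormesh(xx, yy, zi, cmap="PiYG", shading="auto"). How well this represents the data depends on how densely the points cover the area, so tripcolor remains the more direct option.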
I have data of a plot in two arrays that are stored in an unsorted way, so the plot jumps from one place to another discontinuously:
I have tried one example of finding the closest point in a 2D array:
import numpy as np
def distance(pt_1, pt_2):
    pt_1 = np.array((pt_1[0], pt_1[1]))
    pt_2 = np.array((pt_2[0], pt_2[1]))
    return np.linalg.norm(pt_1-pt_2)
def closest_node(node, nodes):
    nodes = np.asarray(nodes)
    dist_2 = np.sum((nodes - node)**2, axis=1)
    return np.argmin(dist_2)
a = []
for x in range(50000):
    a.append((np.random.randint(0,1000),np.random.randint(0,1000)))
some_pt = (1, 2)
closest_node(some_pt, a)
Can I use it somehow to "clean" my data? (in the above code, a can be my data)
Exemplary data from my calculations is:
array([[ 2.08937872e+001, 1.99020033e+001, 2.28260611e+001,
6.27711094e+000, 3.30392288e+000, 1.30312878e+001,
8.80768833e+000, 1.31238275e+001, 1.57400130e+001,
5.00278061e+000, 1.70752624e+001, 1.79131456e+001,
1.50746185e+001, 2.50095731e+001, 2.15895974e+001,
1.23237801e+001, 1.14860312e+001, 1.44268222e+001,
6.37680265e+000, 7.81485403e+000],
[ -1.19702178e-001, -1.14050879e-001, -1.29711421e-001,
8.32977493e-001, 7.27437322e-001, 8.94389885e-001,
8.65931116e-001, -6.08199292e-002, -8.51922900e-002,
1.12333841e-001, -9.88131292e-324, 4.94065646e-324,
-9.88131292e-324, 4.94065646e-324, 4.94065646e-324,
0.00000000e+000, 0.00000000e+000, 0.00000000e+000,
-4.94065646e-324, 0.00000000e+000]])
After using radial_sort_line (of Joe Kington) I have received the following plot:
This is actually a problem that's tougher than you might think in general.
In your exact case, you might be able to get away with sorting by the y-values. It's hard to tell for sure from the plot.
Therefore, a better approach for somewhat circular shapes like this is to do a radial sort.
For example, let's generate some data somewhat similar to yours:
import numpy as np
import matplotlib.pyplot as plt
t = np.linspace(.2, 1.6 * np.pi)
x, y = np.cos(t), np.sin(t)
# Shuffle the points...
i = np.arange(t.size)
np.random.shuffle(i)
x, y = x[i], y[i]
fig, ax = plt.subplots()
ax.plot(x, y, color='lightblue')
ax.margins(0.05)
plt.show()
Okay, now let's try to undo that shuffle by using a radial sort. We'll use the centroid of the points as the center and calculate the angle to each point, then sort by that angle:
x0, y0 = x.mean(), y.mean()
angle = np.arctan2(y - y0, x - x0)
idx = angle.argsort()
x, y = x[idx], y[idx]
fig, ax = plt.subplots()
ax.plot(x, y, color='lightblue')
ax.margins(0.05)
plt.show()
Okay, pretty close! If we were working with a closed polygon, we'd be done.
However, we have one problem -- This closes the wrong gap. We'd rather have the angle start at the position of the largest gap in the line.
Therefore, we'll need to calculate the gap to each adjacent point on our new line and re-do the sort based on a new starting angle:
# Append the first point so that the gap between the last and first point is measured too
dx = np.diff(np.append(x, x[0]))
dy = np.diff(np.append(y, y[0]))
max_gap = np.abs(np.hypot(dx, dy)).argmax() + 1
x = np.append(x[max_gap:], x[:max_gap])
y = np.append(y[max_gap:], y[:max_gap])
Which results in:
As a complete, stand-alone example:
import numpy as np
import matplotlib.pyplot as plt
def main():
    x, y = generate_data()
    plot(x, y).set(title='Original data')
    x, y = radial_sort_line(x, y)
    plot(x, y).set(title='Sorted data')
    plt.show()

def generate_data(num=50):
    t = np.linspace(.2, 1.6 * np.pi, num)
    x, y = np.cos(t), np.sin(t)
    # Shuffle the points...
    i = np.arange(t.size)
    np.random.shuffle(i)
    x, y = x[i], y[i]
    return x, y

def radial_sort_line(x, y):
    """Sort unordered verts of an unclosed line by angle from their center."""
    # Radial sort
    x0, y0 = x.mean(), y.mean()
    angle = np.arctan2(y - y0, x - x0)
    idx = angle.argsort()
    x, y = x[idx], y[idx]
    # Split at opening in line
    # (append the first point so the gap between last and first is measured too)
    dx = np.diff(np.append(x, x[0]))
    dy = np.diff(np.append(y, y[0]))
    max_gap = np.abs(np.hypot(dx, dy)).argmax() + 1
    x = np.append(x[max_gap:], x[:max_gap])
    y = np.append(y[max_gap:], y[:max_gap])
    return x, y

def plot(x, y):
    fig, ax = plt.subplots()
    ax.plot(x, y, color='lightblue')
    ax.margins(0.05)
    return ax

main()
Sorting the data based on their angle relative to the center, as in @JoeKington's solution, might have problems with some parts of the data:
In [1]:
import scipy.spatial as ss
import matplotlib.pyplot as plt
import numpy as np
import re
%matplotlib inline
In [2]:
data=np.array([[ 2.08937872e+001, 1.99020033e+001, 2.28260611e+001,
6.27711094e+000, 3.30392288e+000, 1.30312878e+001,
8.80768833e+000, 1.31238275e+001, 1.57400130e+001,
5.00278061e+000, 1.70752624e+001, 1.79131456e+001,
1.50746185e+001, 2.50095731e+001, 2.15895974e+001,
1.23237801e+001, 1.14860312e+001, 1.44268222e+001,
6.37680265e+000, 7.81485403e+000],
[ -1.19702178e-001, -1.14050879e-001, -1.29711421e-001,
8.32977493e-001, 7.27437322e-001, 8.94389885e-001,
8.65931116e-001, -6.08199292e-002, -8.51922900e-002,
1.12333841e-001, -9.88131292e-324, 4.94065646e-324,
-9.88131292e-324, 4.94065646e-324, 4.94065646e-324,
0.00000000e+000, 0.00000000e+000, 0.00000000e+000,
-4.94065646e-324, 0.00000000e+000]])
In [3]:
plt.plot(data[0], data[1])
plt.title('Unsorted Data')
Out[3]:
<matplotlib.text.Text at 0x10a5c0550>
Note that the x values between 15 and 20 are not sorted correctly.
In [10]:
#Calculate the angle in degrees in the range (0, 360]
sort_index = np.angle(np.dot((data.T-data.mean(1)), np.array([1.0, 1.0j])), deg=True)
sort_index = np.where(sort_index>0, sort_index, sort_index+360)
#sorted the data by angle and plot them
sort_index = sort_index.argsort()
plt.plot(data[0][sort_index], data[1][sort_index])
plt.title('Data Sorted by angle relatively to the centroid')
plt.plot(data[0], data[1], 'r+')
Out[10]:
[<matplotlib.lines.Line2D at 0x10b009e10>]
We can sort the data based on a nearest neighbor approach, but since the x and y are of very different scale, the choice of distance metrics becomes an important issue. We will just try all the distance metrics available in scipy to get an idea:
In [7]:
def sort_dots(metrics, ax, start):
    dist_m = ss.distance.squareform(ss.distance.pdist(data.T, metrics))
    total_points = data.shape[1]
    points_index = set(range(total_points))
    sorted_index = []
    target = start
    ax.plot(data[0, target], data[1, target], 'o', markersize=16)
    points_index.discard(target)
    while len(points_index)>0:
        candidate = list(points_index)
        nneigbour = candidate[dist_m[target, candidate].argmin()]
        points_index.discard(nneigbour)
        points_index.discard(target)
        #print points_index, target, nneigbour
        sorted_index.append(target)
        target = nneigbour
    sorted_index.append(target)
    ax.plot(data[0][sorted_index], data[1][sorted_index])
    ax.set_title(metrics)
In [6]:
dmetrics = re.findall(r"pdist\(X\,\s+'(.*)'", ss.distance.pdist.__doc__)
In [8]:
f, axes = plt.subplots(4, 6, figsize=(16,10), sharex=True, sharey=True)
axes = axes.ravel()
for metrics, ax in zip(dmetrics, axes):
    try:
        sort_dots(metrics, ax, 5)
    except Exception:
        ax.set_title(metrics + '(unsuitable)')
It looks like the standardized Euclidean and Mahalanobis metrics give the best results. Note that we chose the 6th data point (index 5) as the starting point; it is the data point with the largest y value (use argmax to get the index, of course).
In [9]:
f, axes = plt.subplots(4, 6, figsize=(16,10), sharex=True, sharey=True)
axes = axes.ravel()
for metrics, ax in zip(dmetrics, axes):
    try:
        sort_dots(metrics, ax, 13)
    except Exception:
        ax.set_title(metrics + '(unsuitable)')
This is what happens if you choose the point with the maximum x value (index 13) as the starting point. It appears that the Mahalanobis metric is better than the standardized Euclidean one, as it is not affected by the starting point we choose.
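The greedy ordering itself does not depend on the plotting code. Here is a compact sketch of the same idea, assuming the standardized Euclidean metric ('seuclidean' in scipy.spatial.distance.cdist). The function name nn_order and the synthetic test points are made up for this illustration and are not part of the code above.

```python
import numpy as np
from scipy.spatial.distance import cdist


def nn_order(points, start=0, metric="seuclidean"):
    """Greedy nearest-neighbour ordering of an (n, 2) array of points."""
    dist = cdist(points, points, metric=metric)
    n = len(points)
    visited = np.zeros(n, dtype=bool)
    order = [start]
    visited[start] = True
    for _ in range(n - 1):
        # Distances from the current point, with visited points masked out
        d = np.where(visited, np.inf, dist[order[-1]])
        nxt = int(d.argmin())
        order.append(nxt)
        visited[nxt] = True
    return np.array(order)


# Shuffled points on a straight line with very different x and y scales
rng = np.random.default_rng(0)
t = rng.permutation(np.linspace(0, 1, 20))
pts = np.column_stack([25 * t, 0.9 * t])

# Start from one end of the line, as in the answer above
order = nn_order(pts, start=int(t.argmin()))
```

With real data you would call it as nn_order(data.T, start=5) and plot data[0][order] against data[1][order]. As the plots above show, the result still depends on the metric and on the starting point.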
If we assume that the data are 2D and that the x values should be increasing, then you could:
sort the x axis data, e.g. x_old and store the result in a different variable, e.g. x_new
for each element in the x_new find its index in the x_old array
re-order the elements in the y_axis array according to the indices that you got from previous step
I would do it with Python lists instead of numpy arrays, because the list.index method is easier to work with than numpy.where.
E.g. (and assume that x_old and y_old are your previous numpy variables for x and y axis respectively)
import numpy as np
x_new_tmp = x_old.tolist()
y_new_tmp = y_old.tolist()
x_new = sorted(x_new_tmp)
y_new = [y_new_tmp[x_new_tmp.index(i)] for i in x_new]
Then you can plot x_new and y_new
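The same reordering can also be done directly in NumPy with argsort. This has the side benefit of coping with repeated x values: list.index always returns the first match, so duplicated x values would all pick up the same y. A minimal sketch, with made-up x_old and y_old values standing in for the real data:

```python
import numpy as np

# Made-up stand-ins for the unsorted data (note the repeated x value 1.0)
x_old = np.array([3.0, 1.0, 2.0, 1.0])
y_old = np.array([30.0, 10.0, 20.0, 11.0])

# Indices that sort x; a stable sort keeps the original order of equal x values
order = np.argsort(x_old, kind="stable")
x_new = x_old[order]
y_new = y_old[order]

print(x_new)  # [1. 1. 2. 3.]
print(y_new)  # [10. 11. 20. 30.]
```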
I've a hat wave given some initial conditions. For x and y between 0.5 and 1, u = 2, otherwise u = 1. x and y vary from 0 to 2.
The wave is also time dependent, so I am plotting, then pausing, then clearing the 3D graph.
Here is my code:
# import the modules required
import numpy as np
import matplotlib.pyplot as plt
import pylab as py
from mpl_toolkits.mplot3d import Axes3D
#define the grid
X = np.linspace(0,2,21)
Y = np.linspace(0,2,21)
U = np.ones((21,21))
c = 1
dt = 0.01
dx = 0.1
dy = 0.1
#initial conditions
for i in range (21):
    if (0.5<=X[i]<=1):
        for j in range (21):
            if (0.5<=Y[j]<=1):
                U[i,j] = 2
#prop and plot
UP = np.ones((21,21))
for f in range (100):
    for i in range (21):
        for j in range (21):
            UP[i,j] = U[i,j] - ((c*dt)/(dx))*(U[i,j] - U[i-1,j]) - ((c*dt)/(dy))*(U[i,j] - U[i,j-1])
    U = UP
    fig = plt.figure(figsize=(11,7), dpi=100)
    ax = fig.gca(projection='3d')
    X,Y = np.meshgrid(X,Y)
    surf = ax.plot_wireframe(X,Y,U[:])
    plt.show()
You are getting that error because ax.plot_wireframe uses np.broadcast_arrays which expects all input arrays to have the same shape. What's happening is that for each iteration of for f in range (100): this line X,Y = np.meshgrid(X,Y) is changing the shape of X and Y. Paste these lines into your code and see for yourself.
print(X.shape, Y.shape)
X,Y = np.meshgrid(X,Y)
print(X.shape, Y.shape)
Try moving X,Y = np.meshgrid(X,Y) out of the loop.
#prop and plot
UP = np.ones((21,21))
X,Y = np.meshgrid(X,Y)
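for f in range (100):
    for i in range (21):
        for j in range (21):
            UP[i,j] = U[i,j] - ((c*dt)/(dx))*(U[i,j] - U[i-1,j]) - ((c*dt)/(dy))*(U[i,j] - U[i,j-1])
    U = UP.copy()  # copy, so that U and UP stay separate arrays in the next step
    fig = plt.figure(figsize=(11,7), dpi=100)
    ax = fig.add_subplot(projection='3d')  # fig.gca(projection='3d') no longer works in recent matplotlib
    surf = ax.plot_wireframe(X,Y,U[:])
    plt.show()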
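The shape change is easy to check in isolation. The sketch below uses the same 21-point axes as the question; the names Xg, Yg, Xg2 and Yg2 are only there for illustration.

```python
import numpy as np

X = np.linspace(0, 2, 21)
Y = np.linspace(0, 2, 21)

# First call: the 1D axes become 2D coordinate grids
Xg, Yg = np.meshgrid(X, Y)
print(Xg.shape, Yg.shape)    # (21, 21) (21, 21)

# Second call on the grids: meshgrid flattens its inputs, so the result
# no longer matches the (21, 21) shape of U
Xg2, Yg2 = np.meshgrid(Xg, Yg)
print(Xg2.shape, Yg2.shape)  # (441, 441) (441, 441)
```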
So I have a dataset that I am trying to bin into a matrix and then make a wireframe plot out of. When I show the plot, all that shows is a flat surface along the x=y line of the 3d image. I would like the full matrix to show. I have included my code as well as a sample of the stats.txt:
from numpy import *
from pylab import *
f = open('stats.txt')
bins = 10
xs = []
ys = []
for line in f:
    line = line.strip().split(' ')
    xs.append(float(line[0]))
    ys.append(float(line[1]))
xlin = linspace(min(xs),max(xs),bins+1)
ylin = linspace(min(ys),max(ys),bins+1)
matrix = zeros((bins,bins))
for i in range(bins):
    for j in range(bins):
        count = 0
        for s in range(len(xs)):
            if xs[s] >= xlin[i] and xs[s] <= xlin[i+1] and ys[s] >= ylin[j] and ys[s] <= ylin[j+1]:
                count +=1
        matrix[i,j] = count
print(matrix)
x = []
y = []
for i in range(bins):
    x.append([0.,1.,2.,3.,4.,5.,6.,7.,8.,9.])
for i in range(bins):
    y.append([0.,1.,2.,3.,4.,5.,6.,7.,8.,9.])
#for i in range(bins):
#    y.append(linspace(0,bins-1,bins))
import matplotlib.pyplot as plt
from mpl_toolkits.mplot3d.axes3d import Axes3D
fig = plt.figure()
ax = fig.add_subplot(111, projection='3d')
print(shape(x))
print(shape(y))
print(shape(matrix))
ax.plot_wireframe(x, y, matrix)
#plt.imshow(matrix,cmap=plt.cm.ocean)
plt.show()
Sample of stats.txt:
10385.8694574 114.758131279
11379.8955938 -166.830995639
10347.5572407 165.168099188
11698.0834105 110.188708959
12100.3323331 185.316597413
11530.3943217 287.99795812
11452.2864796 474.890116234
12181.4426414 149.266756079
10962.8512477 -544.794117131
10601.2128384 49.782478266
The problem with your code is that your x-coordinates are the same as the y-coordinates for every data point. Thus, you're effectively telling matplotlib that you only have values on the diagonal in the x-y-plane.
One possible solution would be to simply transpose your y-coordinates. However, using numpy's meshgrid function is probably a lot more convenient.
x,y = np.meshgrid(np.arange(bins),np.arange(bins))
fig = plt.figure()
ax = fig.add_subplot(111, projection='3d')
ax.plot_wireframe(x, y, matrix)
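As a side note, the counting loop can be replaced by np.histogram2d. The sketch below is an illustration on synthetic data (the real values would come from stats.txt), and the names counts, xc and yc are made up for it. It passes indexing="ij" to meshgrid so that the first axis of the coordinate grids corresponds to the x bins, which matches how matrix[i,j] is filled in the question.

```python
import numpy as np

bins = 10

# Synthetic stand-in for the two columns of stats.txt
rng = np.random.default_rng(0)
xs = rng.normal(11000, 600, size=500)
ys = rng.normal(100, 250, size=500)

# counts[i, j] is the number of points in x-bin i and y-bin j
counts, xedges, yedges = np.histogram2d(xs, ys, bins=bins)

# Bin centres, laid out so that x[i, j] belongs to x-bin i
xc = 0.5 * (xedges[:-1] + xedges[1:])
yc = 0.5 * (yedges[:-1] + yedges[1:])
x, y = np.meshgrid(xc, yc, indexing="ij")
```

ax.plot_wireframe(x, y, counts) then draws the wireframe with the axes in data units instead of bin indices.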
It's possible to fill between lines with a color:
http://matplotlib.sourceforge.net/examples/pylab_examples/fill_between_demo.html
It's also possible to use a continuous colormap for a line:
http://matplotlib.sourceforge.net/examples/pylab_examples/multicolored_line.html
Is it possible (and reasonably easy) to use a continuous colormap for the colored fill between two lines? For example, the color fill may change along x based on the difference between the two lines at x (or based on another set of data).
I found a solution to this problem. It builds on the brilliant but hacky solution of @Hooked. You create a 2D grid built from lots of small boxes. It's not the fastest solution, but it should be pretty flexible (more so than solutions which apply imshow to the patches).
import numpy as np
import pylab as plt
#Plot a rectangle
def rect(ax, x, y, w, h, c,**kwargs):
    #Varying only in x
    if len(c.shape) == 1:
        rect = plt.Rectangle((x, y), w, h, color=c, ec=c,**kwargs)
        ax.add_patch(rect)
    #Varying in x and y
    else:
        #Split into a number of bins
        N = c.shape[0]
        hb = h/float(N); yl = y
        for i in range(N):
            yl += hb
            rect = plt.Rectangle((x, yl), w, hb,
                                 color=c[i,:], ec=c[i,:],**kwargs)
            ax.add_patch(rect)
#Fill a contour between two lines
def rainbow_fill_between(ax, X, Y1, Y2, colors=None,
                         cmap=plt.get_cmap("Reds"),**kwargs):
    ax.plot(X,Y1,lw=0) # Plot so the axes scale correctly
    dx = X[1]-X[0]
    N = X.size
    #Pad a float or int to same size as x
    if (type(Y2) is float or type(Y2) is int):
        Y2 = np.array([Y2]*N)
    #No colors -- specify linear
    if colors is None:
        colors = []
        for n in range(N):
            colors.append(cmap(n/float(N)))
    #Varying only in x
    elif len(colors.shape) == 1:
        colors = cmap((colors-colors.min())
                      /(colors.max()-colors.min()))
    #Varying in x and y
    else:
        cnp = np.array(colors)
        colors = np.empty([colors.shape[0],colors.shape[1],4])
        for i in range(colors.shape[0]):
            for j in range(colors.shape[1]):
                colors[i,j,:] = cmap((cnp[i,j]-cnp[:,:].min())
                                     /(cnp[:,:].max()-cnp[:,:].min()))
    colors = np.array(colors)
    #Create the patch objects
    for (color,x,y1,y2) in zip(colors,X,Y1,Y2):
        rect(ax,x,y2,dx,y1-y2,color,**kwargs)
# Some Test data
X = np.linspace(0,10,100)
Y1 = .25*X**2 - X
Y2 = X
g = np.exp(-.3*(X-5)**2)
#Plot fill and curves changing in x only
fig, axs =plt.subplots(1,2)
colors = g
rainbow_fill_between(axs[0],X,Y1,Y2,colors=colors)
axs[0].plot(X,Y1,'k-',lw=4)
axs[0].plot(X,Y2,'k-',lw=4)
#Plot fill and curves changing in x and y
colors = np.outer(g,g)
rainbow_fill_between(axs[1],X,Y1,Y2,colors=colors)
axs[1].plot(X,Y1,'k-',lw=4)
axs[1].plot(X,Y2,'k-',lw=4)
plt.show()
The result is,
Your solution is great and flexible! The 2D case in particular is really nice. Maybe such a feature could be added to fill_between, if the function's colors keyword accepted an array of the same length as x and y?
Here is a simpler version for the 1D case using the fill_between function. It does the same thing, but because it uses trapezoids instead of rectangles the result is smoother.
import matplotlib as mpl
import matplotlib.pyplot as plt
import numpy as np
from scipy.stats import norm
# Select a color map
cmap = mpl.cm.bwr
# Some Test data
npts = 100
x = np.linspace(-4, 4, npts)
y = norm.pdf(x)
z = np.sin(2 * x)
normalize = mpl.colors.Normalize(vmin=z.min(), vmax=z.max())
# The plot
fig = plt.figure()
ax = fig.add_axes([0.12, 0.12, 0.68, 0.78])
plt.plot(x, y, color="gray")
for i in range(npts - 1):
    plt.fill_between([x[i], x[i+1]], [y[i], y[i+1]], color=cmap(normalize(z[i])))
cbax = fig.add_axes([0.85, 0.12, 0.05, 0.78])
cb = mpl.colorbar.ColorbarBase(cbax, cmap=cmap, norm=normalize, orientation='vertical')
cb.set_label("Sin function", rotation=270, labelpad=15)
plt.show()
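With many points, one fill_between call per segment creates a lot of separate artists. A possible variant, sketched here on the assumption that a single PolyCollection is acceptable, builds one trapezoid per segment and lets the collection map z to colors. The helper name trapezoid_verts is made up for this example, and the normal pdf is written out by hand so that only NumPy and matplotlib are needed.

```python
import numpy as np
import matplotlib.pyplot as plt
from matplotlib.collections import PolyCollection


def trapezoid_verts(x, y, base=0.0):
    """Return one 4-vertex polygon per segment between the curve and `base`."""
    return [
        [(x[i], base), (x[i], y[i]), (x[i + 1], y[i + 1]), (x[i + 1], base)]
        for i in range(len(x) - 1)
    ]


x = np.linspace(-4, 4, 100)
y = np.exp(-x**2 / 2) / np.sqrt(2 * np.pi)  # standard normal pdf
z = np.sin(2 * x)

verts = trapezoid_verts(x, y)

fig, ax = plt.subplots()
# One value per trapezoid: z at the left edge of each segment
coll = PolyCollection(verts, array=z[:-1], cmap="bwr", edgecolors="face")
ax.add_collection(coll)
ax.plot(x, y, color="gray")
ax.autoscale_view()
fig.colorbar(coll, ax=ax, label="Sin function")
```

Call plt.show() afterwards to display the figure. The colorbar is built from the collection itself, so no separate Normalize object or extra axes are needed.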