Matplotlib 3D Waterfall Plot with Colored Heights - python

I'm trying to visualise a dataset in 3D which consists of a time series (along y) of x-z data, using Python and Matplotlib.
I'd like to create a plot like the one below (which was made in Python: http://austringer.net/wp/index.php/2011/05/20/plotting-a-dolphin-biosonar-click-train/), but where the colour varies with Z - i.e. so the intensity is shown by a colormap as well as the peak height, for clarity.
An example showing the colormap in Z is (apparently made using MATLAB):
This effect can be created using the waterfall plot option in MATLAB, but I understand there is no direct equivalent of this in Python.
I have also tried using the plot_surface option in Python (below), which works ok, but I'd like to 'force' the lines running over the surface to only be in the x direction (i.e. making it look more like a stacked time series than a surface). Is this possible?
Any help or advice greatly welcomed. Thanks.

I have generated a function that replicates the matlab waterfall behaviour in matplotlib, but I don't think it is the best solution when it comes to performance.
I started from two examples in matplotlib documentation: multicolor lines and multiple lines in 3d plot. From these examples, I only saw possible to draw lines whose color varies following a given colormap according to its z value following the example, which is reshaping the input array to draw the line by segments of 2 points and setting the color of the segment to the z mean value between the 2 points.
Thus, given the input matrixes n,m matrixes X,Y and Z, the function loops over the smallest dimension between n,m to plot each line like in the example, by 2 points segments, where the reshaping to plot by segments is done reshaping the array with the same code as the example.
def waterfall_plot(fig,ax,X,Y,Z):
'''
Make a waterfall plot
Input:
fig,ax : matplotlib figure and axes to populate
Z : n,m numpy array. Must be a 2d array even if only one line should be plotted
X,Y : n,m array
'''
# Set normalization to the same values for all plots
norm = plt.Normalize(Z.min().min(), Z.max().max())
# Check sizes to loop always over the smallest dimension
n,m = Z.shape
if n>m:
X=X.T; Y=Y.T; Z=Z.T
m,n = n,m
for j in range(n):
# reshape the X,Z into pairs
points = np.array([X[j,:], Z[j,:]]).T.reshape(-1, 1, 2)
segments = np.concatenate([points[:-1], points[1:]], axis=1)
lc = LineCollection(segments, cmap='plasma', norm=norm)
# Set the values used for colormapping
lc.set_array((Z[j,1:]+Z[j,:-1])/2)
lc.set_linewidth(2) # set linewidth a little larger to see properly the colormap variation
line = ax.add_collection3d(lc,zs=(Y[j,1:]+Y[j,:-1])/2, zdir='y') # add line to axes
fig.colorbar(lc) # add colorbar, as the normalization is the same for all, it doesent matter which of the lc objects we use
Therefore, plots looking like matlab waterfall can be easily generated with the same input matrixes as a matplotlib surface plot:
import numpy as np; import matplotlib.pyplot as plt
from matplotlib.collections import LineCollection
from mpl_toolkits.mplot3d import Axes3D
# Generate data
x = np.linspace(-2,2, 500)
y = np.linspace(-2,2, 40)
X,Y = np.meshgrid(x,y)
Z = np.sin(X**2+Y**2)
# Generate waterfall plot
fig = plt.figure()
ax = fig.add_subplot(111, projection='3d')
waterfall_plot(fig,ax,X,Y,Z)
ax.set_xlabel('X') ; ax.set_xlim3d(-2,2)
ax.set_ylabel('Y') ; ax.set_ylim3d(-2,2)
ax.set_zlabel('Z') ; ax.set_zlim3d(-1,1)
The function assumes that when generating the meshgrid, the x array is the longest, and by default the lines have fixed y, and its the x coordinate what varies. However, if the size of the y dimension is larger, the matrixes are transposed, generating the lines with fixed x. Thus, generating the meshgrid with the sizes inverted (len(x)=40 and len(y)=500) yields:

with a pandas dataframe with the x axis as the first column and each spectra as another column
offset=0
for c in s.columns[1:]:
plt.plot(s.wavelength,s[c]+offset)
offset+=.25
plt.xlim([1325,1375])

Related

Matplotlib duplicated y axis

I am trying to plot two different lines from a same vector in python using Matplotlib. For this, I use an additional vector whose values on certain indices filter the general array to plot in a certain line. The code is:
import matplotlib.pyplot as plt
def visualize_method(general, changes):
'''Correctly plots the data to visualize reversals and trials'''
x = np.array([i for i in range(len(general))])
plt.plot(x[changes==0], general[changes==0],
x[changes==1], general[changes==1],
linestyle='--', marker='o')
plt.show()
When plotting the data, the result is:
As it can be observed, the y axis is "duplicated", how could I use the same y and x axis for this filtered plot?

Using numpy arrays to count the number of points within the cells of a regular grid

I am working with a large number of 3D points, each with x,y,z values stored in numpy arrays.
For background, the points will always fall within a cylinder of fixed radius, and height = max z value of the points.
My objective is to split the bounding cylinder (or column if it is easier) into e.g. 1 m height strata, and then count the number of points within each cell
of a regular grid (e.g. 1 m x 1 m) overlaid on each strata.
Conceptually, the operation would be the same as overlaying a raster and counting the points intersecting each pixel.
The grid of cells can form a square or a disk, it doesn't matter.
After a lot of searching and reading, my current thinking is to use some combination of numpy.linspace and numpy.meshgrid to generate the vertices of each cell stored within an array and test each cell against each point to see if it is 'in'. This seems inefficient, especially when working with thousands of points.
The numpy / scipy suite seems well suited to the problem, but I have not found a solution yet. Any suggestions would be much appreciated.
I have included a few example points and some code to visualize the data.
# Setup
import numpy as np
import matplotlib.pyplot as plt
from mpl_toolkits.mplot3d import Axes3D
# Load in X,Y,Z values from a sub-sample of 10 points for testing
# XY Values are scaled to a reasonable point of origin
z_vals = np.array([3.08,4.46,0.27,2.40,0.48,0.21,0.31,3.28,4.09,1.75])
x_vals = np.array([22.88,20.00,20.36,24.11,40.48,29.08,36.02,29.14,32.20,18.96])
y_vals = np.array([31.31,25.04,31.86,41.81,38.23,31.57,42.65,18.09,35.78,31.78])
# This plot is instructive to visualize the problem
fig = plt.figure()
ax = fig.add_subplot(111, projection='3d')
ax.scatter(x_vals, y_vals, z_vals, c='b', marker='o')
plt.show()
I am not sure I understand perfectly what you are looking for, but since every "cell" seems to have a 1m side for all directions, couldn't you:
round all your values to integers (rasterize your data) probably with some floor function;
create a bijection from these integer coordinates to something more convenient with something like:
(64**2)*x + (64)*y + z # assuming all values are in [0,63]
You can put z rather at the beginning if you want to more easely focus on height later
compute the histogram of each "cell" (several functions from numpy/scipy or numpy can do it);
revert the bijection if needed (ie. know the "true" coordinates of each cell once the count is known)
Maybe I didn't understand well, but in case it helps...
Thanks #Baruchel. It turns out the n-dimensional histograms suggested by #DilithiumMatrix provides a fairly simple solution to the problem I posted. After some reading, here is my current solution for anyone else that faces a similar problem.
As this is my first Python/Numpy effort any improvements/suggestions, especially regarding performance, would be welcome. Thanks.
# Setup
import numpy as np
import matplotlib.pyplot as plt
from mpl_toolkits.mplot3d import Axes3D
# Load in X,Y,Z values from a sub-sample of 10 points for testing
# XY Values are scaled to a reasonable point of origin
z_vals = np.array([3.08,4.46,0.27,2.40,0.48,0.21,0.31,3.28,4.09,1.75])
x_vals = np.array([22.88,20.00,20.36,24.11,40.48,29.08,36.02,29.14,32.20,18.96])
y_vals = np.array([31.31,25.04,31.86,41.81,38.23,31.57,42.65,18.09,35.78,31.78])
# Updated code below
# Variables needed for 2D,3D histograms
xmax, ymax, zmax = int(x_vals.max())+1, int(y_vals.max())+1, int(z_vals.max())+1
xmin, ymin, zmin = int(x_vals.min()), int(y_vals.min()), int(z_vals.min())
xrange, yrange, zrange = xmax-xmin, ymax-ymin, zmax-zmin
xedges = np.linspace(xmin, xmax, (xrange + 1), dtype=int)
yedges = np.linspace(ymin, ymax, (yrange + 1), dtype=int)
zedges = np.linspace(zmin, zmax, (zrange + 1), dtype=int)
# Make the 2D histogram
h2d, xedges, yedges = np.histogram2d(x_vals, y_vals, bins=(xedges, yedges))
assert np.count_nonzero(h2d) == len(x_vals), "Unclassified points in the array"
extent = [xedges[0], xedges[-1], yedges[0], yedges[-1]]
plt.imshow(h2d.transpose(), extent=extent, interpolation='none', origin='low')
# Transpose and origin must be used to make the array line up when using imshow, unsure why
# Plot settings, not sure yet on matplotlib update/override objects
plt.grid(b=True, which='both')
plt.xticks(xedges)
plt.yticks(yedges)
plt.xlabel('X-Axis')
plt.ylabel('Y-Axis')
plt.plot(x_vals, y_vals, 'ro')
plt.show()
# 3-dimensional histogram with 1 x 1 x 1 m bins. Produces point counts in each 1m3 cell.
xyzstack = np.stack([x_vals,y_vals,z_vals], axis=1)
h3d, Hedges = np.histogramdd(xyzstack, bins=(xedges, yedges, zedges))
assert np.count_nonzero(h3d) == len(x_vals), "Unclassified points in the array"
h3d.shape # Shape of the array should be same as the edge dimensions
testzbin = np.sum(np.logical_and(z_vals >= 1, z_vals < 2)) # Slice to test with
np.sum(h3d[:,:,1]) == testzbin # Test num points in second bins
np.sum(h3d, axis=2) # Sum of all vertical points above each x,y 'pixel'
# only in this example the h2d and np.sum(h3d,axis=2) arrays will match as no z bins have >1 points
# Remaining issue - how to get a r x c count of empty z bins.
# i.e. for each 'pixel' how many z bins contained no points?
# Possible solution is to reshape to use logical operators
count2d = h3d.reshape(xrange * yrange, zrange) # Maintain dimensions per num 3D cells defined
zerobins = (count2d == 0).sum(1)
zerobins.shape
# Get back to x,y grid with counts - ready for output as image with counts=pixel digital number
bincount_pixels = zerobins.reshape(xrange,yrange)
# Appears to work, perhaps there is a way without reshapeing?
PS if you are facing a similar problem scikit patch extraction looks like another possible solution.

bin 3d points into 3d bins in python

How can I bin 3d points into 3d bins? Is there a multi dimensional version for np.digitize?
I can use np.digitize separately for each dimension, like here. Is there a better solution?
Thanks!
You can do this with numpy.histogramdd(sample), where the number of bins in each direction and the physical range can be adjusted as with a 1D histogram. More info on the reference page. For more general statistics, like the mean of another variable per point in a bin, you can use the scipy scipy.stats.binned_statistic_dd function, see docs.
For your case with an array of three dimensional points, you would use this in the following way,
import numpy as np
from mpl_toolkits.mplot3d import Axes3D
import matplotlib.pyplot as plt
from scipy import stats
#Setup some dummy data
points = np.random.randn(1000,3)
hist, binedges = np.histogramdd(points, normed=False)
#Setup a 3D figure and plot points as well as a series of slices
fig = plt.figure()
ax1 = fig.add_subplot(111, projection='3d')
ax1.plot(points[:,0],points[:,1],points[:,2],'k.',alpha=0.3)
#Use one less than bin edges to give rough bin location
X, Y = np.meshgrid(binedges[0][:-1],binedges[1][:-1])
#Loop over range of slice locations (default histogram uses 10 bins)
for ct in [0,2,5,7,9]:
cs = ax1.contourf(X,Y,hist[:,:,ct],
zdir='z',
offset=binedges[2][ct],
level=100,
cmap=plt.cm.RdYlBu_r,
alpha=0.5)
ax1.set_xlim(-3, 3)
ax1.set_ylim(-3, 3)
ax1.set_zlim(-3, 3)
plt.colorbar(cs)
plt.show()
which gives a series of histogram slices of occupancy at each location,

Best way to plot a 3D matrix in python

I am trying to visualize 3D data. This is a full 3D matrix: each (x,y,z) coordinate has a value, unlike a surface or a collection of individual data vectors. The way I am trying to do this is to plot an opaque cube, where each edge of the cube shows the sum of the data over the orthogonal dimension.
Some example data -- basically, a blob centered at (3,5,7):
import numpy as np
(x,y,z) = np.mgrid[0:10,0:10, 0:10]
data = np.exp(-((x-3)**2 + (y-5)**2 + (z-7)**2)**(0.5))
edge_yz = np.sum(data,axis=0)
edge_xz = np.sum(data,axis=1)
edge_xy = np.sum(data,axis=2)
So the idea would be here to generate a 3D plot that showed a cube; each surface of the cube would show the appropriate 2D matrix edge_*. This would be like plotting 3 4-sided polygons at the appropriate 3D positions (or 6 if you did the back sides of the cube as well) except that each polygon is actually a matrix of values to be plotted in color.
My best approximation at the moment is to compute larger matrices that contained skewed versions of edge, and concatenate these into a single, larger 2D matrix, and imshow() that larger matrix. Seems pretty clumsy, and does a lot of work that some engine in matplotlib or m3plot or something I'm sure already does. It also only works to view a static image at a single view angle, but that's not something I need to overcome at the moment.
Is there a good way to plot these cube edges in a true 3D plot using an existing python tool? Is there a better way to plot a 3D matrix?
Falko's suggestion to use contourf works with a bit of finagling. It's a bit limited since at least my version of contourf has a few bugs where it sometimes renders one of the planes in front of other planes it should be behind, but for now only plotting either the three front or three back sides of the cube will do:
import numpy as np
import math
import matplotlib.pyplot as plot
import mpl_toolkits.mplot3d.axes3d as axes3d
def cube_marginals(cube, normalize=False):
c_fcn = np.mean if normalize else np.sum
xy = c_fcn(cube, axis=0)
xz = c_fcn(cube, axis=1)
yz = c_fcn(cube, axis=2)
return(xy,xz,yz)
def plotcube(cube,x=None,y=None,z=None,normalize=False,plot_front=False):
"""Use contourf to plot cube marginals"""
(Z,Y,X) = cube.shape
(xy,xz,yz) = cube_marginals(cube,normalize=normalize)
if x == None: x = np.arange(X)
if y == None: y = np.arange(Y)
if z == None: z = np.arange(Z)
fig = plot.figure()
ax = fig.gca(projection='3d')
# draw edge marginal surfaces
offsets = (Z-1,0,X-1) if plot_front else (0, Y-1, 0)
cset = ax.contourf(x[None,:].repeat(Y,axis=0), y[:,None].repeat(X,axis=1), xy, zdir='z', offset=offsets[0], cmap=plot.cm.coolwarm, alpha=0.75)
cset = ax.contourf(x[None,:].repeat(Z,axis=0), xz, z[:,None].repeat(X,axis=1), zdir='y', offset=offsets[1], cmap=plot.cm.coolwarm, alpha=0.75)
cset = ax.contourf(yz, y[None,:].repeat(Z,axis=0), z[:,None].repeat(Y,axis=1), zdir='x', offset=offsets[2], cmap=plot.cm.coolwarm, alpha=0.75)
# draw wire cube to aid visualization
ax.plot([0,X-1,X-1,0,0],[0,0,Y-1,Y-1,0],[0,0,0,0,0],'k-')
ax.plot([0,X-1,X-1,0,0],[0,0,Y-1,Y-1,0],[Z-1,Z-1,Z-1,Z-1,Z-1],'k-')
ax.plot([0,0],[0,0],[0,Z-1],'k-')
ax.plot([X-1,X-1],[0,0],[0,Z-1],'k-')
ax.plot([X-1,X-1],[Y-1,Y-1],[0,Z-1],'k-')
ax.plot([0,0],[Y-1,Y-1],[0,Z-1],'k-')
ax.set_xlabel('X')
ax.set_ylabel('Y')
ax.set_zlabel('Z')
plot.show()
plot_front=True
plot_front=False
Other data (not shown)
Take a look at MayaVI. The contour3d() function may be what you want.
Here's an answer I gave to a similar question with an example of the code and resulting plot https://stackoverflow.com/a/24784471/3419537

Waterfall plot python?

Is there a python module that will do a waterfall plot like MATLAB does? I googled 'numpy waterfall', 'scipy waterfall', and 'matplotlib waterfall', but did not find anything.
You can do a waterfall in matplotlib using the PolyCollection class. See this specific example to have more details on how to do a waterfall using this class.
Also, you might find this blog post useful, since the author shows that you might obtain some 'visual bug' in some specific situation (depending on the view angle chosen).
Below is an example of a waterfall made with matplotlib (image from the blog post):
(source: austringer.net)
Have a look at mplot3d:
# copied from
# http://matplotlib.sourceforge.net/mpl_examples/mplot3d/wire3d_demo.py
from mpl_toolkits.mplot3d import axes3d
import matplotlib.pyplot as plt
import numpy as np
fig = plt.figure()
ax = fig.add_subplot(111, projection='3d')
X, Y, Z = axes3d.get_test_data(0.05)
ax.plot_wireframe(X, Y, Z, rstride=10, cstride=10)
plt.show()
I don't know how to get results as nice as Matlab does.
If you want more, you may also have a look at MayaVi: http://mayavi.sourceforge.net/
The Wikipedia type of Waterfall chart one can obtain also like this:
import numpy as np
import pandas as pd
def waterfall(series):
df = pd.DataFrame({'pos':np.maximum(series,0),'neg':np.minimum(series,0)})
blank = series.cumsum().shift(1).fillna(0)
df.plot(kind='bar', stacked=True, bottom=blank, color=['r','b'])
step = blank.reset_index(drop=True).repeat(3).shift(-1)
step[1::3] = np.nan
plt.plot(step.index, step.values,'k')
test = pd.Series(-1 + 2 * np.random.rand(10), index=list('abcdefghij'))
waterfall(test)
I have generated a function that replicates the matlab waterfall behaviour in matplotlib. That is:
It generates the 3D shape as many independent and parallel 2D curves
Its color comes from a colormap in the z values
I started from two examples in matplotlib documentation: multicolor lines and multiple lines in 3d plot. From these examples, I only saw possible to draw lines whose color varies following a given colormap according to its z value following the example, which is reshaping the input array to draw the line by segments of 2 points and setting the color of the segment to the z mean value between these 2 points.
Thus, given the input matrixes n,m matrixes X,Y and Z, the function loops over the smallest dimension between n,m to plot each of the waterfall plot independent lines as a line collection of the 2 points segments as explained above.
def waterfall_plot(fig,ax,X,Y,Z,**kwargs):
'''
Make a waterfall plot
Input:
fig,ax : matplotlib figure and axes to populate
Z : n,m numpy array. Must be a 2d array even if only one line should be plotted
X,Y : n,m array
kwargs : kwargs are directly passed to the LineCollection object
'''
# Set normalization to the same values for all plots
norm = plt.Normalize(Z.min().min(), Z.max().max())
# Check sizes to loop always over the smallest dimension
n,m = Z.shape
if n>m:
X=X.T; Y=Y.T; Z=Z.T
m,n = n,m
for j in range(n):
# reshape the X,Z into pairs
points = np.array([X[j,:], Z[j,:]]).T.reshape(-1, 1, 2)
segments = np.concatenate([points[:-1], points[1:]], axis=1)
# The values used by the colormap are the input to the array parameter
lc = LineCollection(segments, cmap='plasma', norm=norm, array=(Z[j,1:]+Z[j,:-1])/2, **kwargs)
line = ax.add_collection3d(lc,zs=(Y[j,1:]+Y[j,:-1])/2, zdir='y') # add line to axes
fig.colorbar(lc) # add colorbar, as the normalization is the same for all
# it doesent matter which of the lc objects we use
ax.auto_scale_xyz(X,Y,Z) # set axis limits
Therefore, plots looking like matlab waterfall can be easily generated with the same input matrixes as a matplotlib surface plot:
import numpy as np; import matplotlib.pyplot as plt
from matplotlib.collections import LineCollection
from mpl_toolkits.mplot3d import Axes3D
# Generate data
x = np.linspace(-2,2, 500)
y = np.linspace(-2,2, 60)
X,Y = np.meshgrid(x,y)
Z = np.sin(X**2+Y**2)-.2*X
# Generate waterfall plot
fig = plt.figure()
ax = fig.add_subplot(111, projection='3d')
waterfall_plot(fig,ax,X,Y,Z,linewidth=1.5,alpha=0.5)
ax.set_xlabel('X'); ax.set_ylabel('Y'); ax.set_zlabel('Z')
fig.tight_layout()
The function assumes that when generating the meshgrid, the x array is the longest, and by default the lines have fixed y, and its the x coordinate what varies. However, if the size of the y array is longer, the matrixes are transposed, generating the lines with fixed x. Thus, generating the meshgrid with the sizes inverted (len(x)=60 and len(y)=500) yields:
To see what are the possibilities of the **kwargs argument, refer to the LineCollection class documantation and to its set_ methods.

Categories