Is there a python module that will do a waterfall plot like MATLAB does? I googled 'numpy waterfall', 'scipy waterfall', and 'matplotlib waterfall', but did not find anything.
You can make a waterfall plot in matplotlib with the PolyCollection class. See this specific example for more details on how to build a waterfall with this class.
Also, you might find this blog post useful, since the author shows that you can get a 'visual bug' in some situations (depending on the chosen view angle).
Below is an example of a waterfall made with matplotlib (image from the blog post):
(source: austringer.net)
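As a rough illustration of the PolyCollection approach, here is a minimal sketch. The data are synthetic Gaussian bumps, and the names `y_positions` and `verts` are made up for this example. Each trace becomes a closed polygon in the x-z plane and is placed at its own y value:

```python
import numpy as np
import matplotlib.pyplot as plt
from matplotlib.collections import PolyCollection

# Synthetic data (an assumption for illustration): one Gaussian bump per trace
x = np.linspace(0.0, 10.0, 200)
y_positions = np.arange(5.0)            # where each trace sits along the y axis

verts = []
for y0 in y_positions:
    z = np.exp(-(x - 3.0 - y0) ** 2)
    # Close each polygon by dropping to z = 0 at both ends of the trace
    verts.append([(x[0], 0.0), *zip(x, z), (x[-1], 0.0)])

poly = PolyCollection(verts, facecolors="white", edgecolors="black")

fig = plt.figure()
ax = fig.add_subplot(projection="3d")
# Each 2D polygon lives in the x-z plane and is placed at its own y value
ax.add_collection3d(poly, zs=y_positions, zdir="y")
ax.set_xlim(x.min(), x.max())
ax.set_ylim(y_positions.min(), y_positions.max())
ax.set_zlim(0.0, 1.0)
ax.set_xlabel("x"); ax.set_ylabel("trace"); ax.set_zlabel("z")
# Call plt.show() or fig.savefig(...) to view the result
```

The opaque white faces hide the traces behind them, which is what gives the waterfall look. As the blog post warns, the drawing order can look wrong from some view angles.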
Have a look at mplot3d:
# copied from
# http://matplotlib.sourceforge.net/mpl_examples/mplot3d/wire3d_demo.py
from mpl_toolkits.mplot3d import axes3d
import matplotlib.pyplot as plt
import numpy as np
fig = plt.figure()
ax = fig.add_subplot(111, projection='3d')
X, Y, Z = axes3d.get_test_data(0.05)
ax.plot_wireframe(X, Y, Z, rstride=10, cstride=10)
plt.show()
I don't know how to get results as nice as Matlab does.
If you want more, you may also have a look at MayaVi: http://mayavi.sourceforge.net/
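The Wikipedia type of waterfall chart (a running total drawn as floating bars) can be obtained like this:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
def waterfall(series):
    # Split the increments into positive and negative parts so they get different colors
    df = pd.DataFrame({'pos':np.maximum(series,0),'neg':np.minimum(series,0)})
    # Each bar starts where the running total stood before it
    blank = series.cumsum().shift(1).fillna(0)
    df.plot(kind='bar', stacked=True, bottom=blank, color=['r','b'])
    # Horizontal connector lines from the end of each bar to the start of the next
    step = blank.reset_index(drop=True).repeat(3).shift(-1)
    step.iloc[1::3] = np.nan
    plt.plot(step.index, step.values,'k')
test = pd.Series(-1 + 2 * np.random.rand(10), index=list('abcdefghij'))
waterfall(test)
plt.show()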
I have written a function that replicates the MATLAB waterfall behaviour in matplotlib. That is:
It generates the 3D shape as many independent and parallel 2D curves
Its color comes from a colormap in the z values
I started from two examples in the matplotlib documentation: multicolor lines and multiple lines in 3d plot. From these examples, the only way I found to draw a line whose color follows a colormap according to its z value is the one used in the multicolor example: reshape the input array so that the line is drawn as segments of 2 points, and set the color of each segment to the mean z value of its 2 points.
Thus, given the n,m input matrices X, Y and Z, the function loops over the smallest of the two dimensions n,m and plots each independent line of the waterfall as a line collection of the 2-point segments explained above.
def waterfall_plot(fig,ax,X,Y,Z,**kwargs):
    '''
    Make a waterfall plot
    Input:
        fig,ax : matplotlib figure and axes to populate
        Z : n,m numpy array. Must be a 2d array even if only one line should be plotted
        X,Y : n,m array
        kwargs : kwargs are directly passed to the LineCollection object
    '''
    # Set normalization to the same values for all plots
    norm = plt.Normalize(Z.min().min(), Z.max().max())
    # Check sizes to loop always over the smallest dimension
    n,m = Z.shape
    if n>m:
        X=X.T; Y=Y.T; Z=Z.T
        m,n = n,m
    for j in range(n):
        # reshape the X,Z into pairs
        points = np.array([X[j,:], Z[j,:]]).T.reshape(-1, 1, 2)
        segments = np.concatenate([points[:-1], points[1:]], axis=1)
        # The values used by the colormap are the input to the array parameter
        lc = LineCollection(segments, cmap='plasma', norm=norm, array=(Z[j,1:]+Z[j,:-1])/2, **kwargs)
        line = ax.add_collection3d(lc,zs=(Y[j,1:]+Y[j,:-1])/2, zdir='y') # add line to axes
    fig.colorbar(lc) # add colorbar, as the normalization is the same for all
    # it doesn't matter which of the lc objects we use
    ax.auto_scale_xyz(X,Y,Z) # set axis limits
Therefore, plots that look like a MATLAB waterfall can be easily generated with the same input matrices as a matplotlib surface plot:
import numpy as np; import matplotlib.pyplot as plt
from matplotlib.collections import LineCollection
from mpl_toolkits.mplot3d import Axes3D
# Generate data
x = np.linspace(-2,2, 500)
y = np.linspace(-2,2, 60)
X,Y = np.meshgrid(x,y)
Z = np.sin(X**2+Y**2)-.2*X
# Generate waterfall plot
fig = plt.figure()
ax = fig.add_subplot(111, projection='3d')
waterfall_plot(fig,ax,X,Y,Z,linewidth=1.5,alpha=0.5)
ax.set_xlabel('X'); ax.set_ylabel('Y'); ax.set_zlabel('Z')
fig.tight_layout()
The function assumes that, when generating the meshgrid, the x array is the longest, so by default the lines have fixed y and it is the x coordinate that varies. However, if the y array is longer, the matrices are transposed, generating the lines with fixed x. Thus, generating the meshgrid with the sizes inverted (len(x)=60 and len(y)=500) yields:
To see what can be passed through the **kwargs argument, refer to the LineCollection class documentation and to its set_ methods.
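As a small check of the segment construction used inside waterfall_plot, the reshaping can be tried on a tiny array first. This is only a sketch: `x_row` and `z_row` are made-up stand-ins for one row of X and Z.

```python
import numpy as np

# Tiny stand-in for one row of X and Z (values chosen only for illustration)
x_row = np.array([0.0, 1.0, 2.0, 3.0])
z_row = np.array([0.0, 2.0, 4.0, 2.0])

# Same reshaping as inside waterfall_plot: one (x, z) point per sample ...
points = np.array([x_row, z_row]).T.reshape(-1, 1, 2)
# ... then consecutive points are paired into 2-point segments
segments = np.concatenate([points[:-1], points[1:]], axis=1)
# The color value of each segment is the mean z of its two end points
segment_values = (z_row[1:] + z_row[:-1]) / 2

print(segments.shape)    # (3, 2, 2): 3 segments, 2 points each, (x, z) per point
print(segment_values)    # [1. 3. 3.]
```

A row with m samples therefore yields m-1 segments and m-1 color values, which is why the function averages neighbouring Z (and Y) values instead of using them directly.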
I'm using Python to create a 3D surface map. I have an array of data that I'm trying to plot as a 3D surface. The issue is that I have taken the log of the Z axis (necessary to show peaks in the data), which means the default colormap doesn't work (it displays one continuous color). I've tried using LogNorm to normalise the colormap, but this again produces one continuous color. I'm not sure whether I should be using the logged values to normalise the map, but if I do this the max is negative and produces an error?
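import numpy as np
import matplotlib.pyplot as plt
import matplotlib.ticker as mticker
import matplotlib.colors as colors
fig=plt.figure(figsize=(10,10))
ax=plt.axes(projection='3d')
def log_tick_formatter(val, pos=None):
    return "{:.2e}".format(10**val)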
ax.zaxis.set_major_formatter(mticker.FuncFormatter(log_tick_formatter))
X=np.arange(0,2,1)
Y=np.arange(0,3,1)
X,Y=np.meshgrid(X,Y)
Z=[[1.2e-11,1.3e-11,-1.8e-11],[6e-13,1.3e-13,2e-15]]
Z_min=np.amin(Z)
Z_max=np.amax(Z)
norm = colors.LogNorm(vmin=1e-15,vmax=(Z_max),clip=False)
ax.plot_surface(X,Y,np.transpose(np.log10(Z)),norm=norm,cmap='rainbow')
Just an example of the logarithmic colors and logarithmic data:
#!/usr/bin/env ipython
import numpy as np
import matplotlib as mpl
import matplotlib.pylab as plt
import matplotlib.colors as colors
# ------------------
X=np.arange(0,401,1);nx= np.size(X)
Y=np.arange(40,200,1);ny = np.size(Y)
X,Y=np.meshgrid(X,Y)
Z = 10000*np.random.random((ny,nx))
Z=np.array(Z)
# ------------------------------------------------------------
Z_min=np.amin(Z)
Z_max=np.amax(Z)
# ------------------------------------------------------------
norm = colors.LogNorm(vmin=np.nanmin(Z),vmax=np.nanmax(Z),clip=False)
# ------------------------------------------------------------
fig = plt.figure(figsize=(15,5));axs = [fig.add_subplot(131),fig.add_subplot(132),fig.add_subplot(133)]
p0 = axs[0].pcolormesh(X,Y,np.log10(Z),cmap='rainbow',norm=norm);plt.colorbar(p0,ax=axs[0]);
axs[0].set_title('Original method: NOT TO DO!')
p1 = axs[1].pcolormesh(X,Y,Z,cmap='rainbow',norm=norm);plt.colorbar(p1,ax=axs[1])
axs[1].set_title('Normalized colorbar, original data')
p2 = axs[2].pcolormesh(X,Y,np.log10(Z),cmap='rainbow');plt.colorbar(p2,ax=axs[2])
axs[2].set_title('Logarithmic data, original colormap')
plt.savefig('test.png',bbox_inches='tight')
# --------------------------------------------------------------
So the result is like this:
In the first case, we have used a logarithmic colormap and also taken the logarithm of the data, so the colorbar no longer works: the values on the map are small, while the colorbar limits are still those of the original, large values.
In the middle image, we use the logarithmic normalization with the original data, so it is easy to read off what is in the image and what the values are. The third case takes the logarithm of the data and keeps a plain colormap, so the colorbar just shows the power of 10 needed to interpret the coloured value on the plot.
So, in the end, I would suggest the middle method: use the logarithmic colorbar and original data.
Edit: the problem in your code is that you take the log of the data and then the norm takes it again when computing the colors. Simply remove the norm and pass vmin and vmax, in log units, directly to the drawing function:
ax.plot_surface(X, Y, np.transpose(np.log10(Z)), cmap='rainbow',vmin=np.log10(1e-15),vmax=np.log10(Z_max))
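To see why either "LogNorm on the raw data" or "linear limits on the logged data" is enough on its own, here is a small numeric check. It is a sketch with made-up positive values (the negative entry of the question's Z is replaced, since a logarithm needs positive input), and it only uses matplotlib's Normalize and LogNorm classes:

```python
import numpy as np
import matplotlib.colors as colors

# Positive sample values spanning several decades (made up for illustration)
Z = np.array([[1.2e-11, 1.3e-11, 1.8e-11],
              [6.0e-13, 1.3e-13, 2.0e-15]])
vmin, vmax = 1e-15, Z.max()

# Option 1: raw data with a logarithmic norm
log_norm = colors.LogNorm(vmin=vmin, vmax=vmax)
position_raw = np.asarray(log_norm(Z))

# Option 2: log10 of the data with a linear norm whose limits are also in log units
lin_norm = colors.Normalize(vmin=np.log10(vmin), vmax=np.log10(vmax))
position_logged = np.asarray(lin_norm(np.log10(Z)))

# Both options place every value at the same position in [0, 1] along the colormap
print(np.allclose(position_raw, position_logged))
```

Combining the two (logged data and LogNorm) applies the logarithm twice, which is what collapses the surface to a single color.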
You can use the facecolors argument of plot_surface to define the color of each face independently of z. Here's a simplified example:
import numpy as np
import matplotlib
import matplotlib.pyplot as plt
x = np.linspace(0,10,100)
y = np.linspace(0,10,100)
x,y = np.meshgrid(x,y)
z = np.sin(x+y)
fig, ax = plt.subplots(subplot_kw={"projection": "3d"})
cmap = plt.get_cmap('rainbow')  # matplotlib.cm.get_cmap is no longer available in recent matplotlib releases
def rescale_0_to_1(item):
    max_z = np.amax(item)
    min_z = np.amin(item)
    return (item - min_z)/(max_z-min_z)
rgba = cmap(rescale_0_to_1(z)) # some values of z to calculate color with
real_z = np.log(z+1) # real values of z to draw
surf = ax.plot_surface(x, y, real_z, cmap='rainbow', facecolors=rgba)
plt.show()
You can modify it to calculate the colors from x, y, or something completely unrelated.
I'm trying to visualise a dataset in 3D which consists of a time series (along y) of x-z data, using Python and Matplotlib.
I'd like to create a plot like the one below (which was made in Python: http://austringer.net/wp/index.php/2011/05/20/plotting-a-dolphin-biosonar-click-train/), but where the colour varies with Z - i.e. so the intensity is shown by a colormap as well as the peak height, for clarity.
An example showing the colormap in Z is (apparently made using MATLAB):
This effect can be created using the waterfall plot option in MATLAB, but I understand there is no direct equivalent of this in Python.
I have also tried using the plot_surface option in Python (below), which works ok, but I'd like to 'force' the lines running over the surface to only be in the x direction (i.e. making it look more like a stacked time series than a surface). Is this possible?
Any help or advice greatly welcomed. Thanks.
I have written a function that replicates the MATLAB waterfall behaviour in matplotlib, but I don't think it is the best solution when it comes to performance.
I started from two examples in the matplotlib documentation: multicolor lines and multiple lines in 3d plot. From these examples, the only way I found to draw a line whose color follows a colormap according to its z value is the one used in the multicolor example: reshape the input array so that the line is drawn as segments of 2 points, and set the color of each segment to the mean z value of its 2 points.
Thus, given the n,m input matrices X, Y and Z, the function loops over the smallest of the two dimensions n,m and plots each line as in the example, as 2-point segments, reusing the example's reshaping code to build the segments.
def waterfall_plot(fig,ax,X,Y,Z):
    '''
    Make a waterfall plot
    Input:
        fig,ax : matplotlib figure and axes to populate
        Z : n,m numpy array. Must be a 2d array even if only one line should be plotted
        X,Y : n,m array
    '''
    # Set normalization to the same values for all plots
    norm = plt.Normalize(Z.min().min(), Z.max().max())
    # Check sizes to loop always over the smallest dimension
    n,m = Z.shape
    if n>m:
        X=X.T; Y=Y.T; Z=Z.T
        m,n = n,m
    for j in range(n):
        # reshape the X,Z into pairs
        points = np.array([X[j,:], Z[j,:]]).T.reshape(-1, 1, 2)
        segments = np.concatenate([points[:-1], points[1:]], axis=1)
        lc = LineCollection(segments, cmap='plasma', norm=norm)
        # Set the values used for colormapping
        lc.set_array((Z[j,1:]+Z[j,:-1])/2)
        lc.set_linewidth(2) # set linewidth a little larger to see properly the colormap variation
        line = ax.add_collection3d(lc,zs=(Y[j,1:]+Y[j,:-1])/2, zdir='y') # add line to axes
    fig.colorbar(lc) # add colorbar, as the normalization is the same for all, it doesn't matter which of the lc objects we use
Therefore, plots that look like a MATLAB waterfall can be easily generated with the same input matrices as a matplotlib surface plot:
import numpy as np; import matplotlib.pyplot as plt
from matplotlib.collections import LineCollection
from mpl_toolkits.mplot3d import Axes3D
# Generate data
x = np.linspace(-2,2, 500)
y = np.linspace(-2,2, 40)
X,Y = np.meshgrid(x,y)
Z = np.sin(X**2+Y**2)
# Generate waterfall plot
fig = plt.figure()
ax = fig.add_subplot(111, projection='3d')
waterfall_plot(fig,ax,X,Y,Z)
ax.set_xlabel('X') ; ax.set_xlim3d(-2,2)
ax.set_ylabel('Y') ; ax.set_ylim3d(-2,2)
ax.set_zlabel('Z') ; ax.set_zlim3d(-1,1)
The function assumes that, when generating the meshgrid, the x array is the longest, so by default the lines have fixed y and it is the x coordinate that varies. However, if the y dimension is larger, the matrices are transposed, generating the lines with fixed x. Thus, generating the meshgrid with the sizes inverted (len(x)=40 and len(y)=500) yields:
With a pandas dataframe that has the x axis as the first column and each spectrum as another column:
offset=0
for c in s.columns[1:]:
    plt.plot(s.wavelength,s[c]+offset)
    offset+=.25
plt.xlim([1325,1375])
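The snippet above relies on an existing dataframe `s`. A self-contained variant might look like the sketch below, where the dataframe contents, the `spectrum_*` column names, and the `step` variable are all hypothetical and only the layout (x axis first, one column per spectrum) follows the description:

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

# Hypothetical stand-in for the dataframe `s`: first column is the x axis,
# every other column is one spectrum (column names are made up)
wavelength = np.linspace(1300.0, 1400.0, 201)
s = pd.DataFrame({"wavelength": wavelength})
for i, centre in enumerate([1340.0, 1350.0, 1360.0]):
    s[f"spectrum_{i}"] = np.exp(-((wavelength - centre) / 5.0) ** 2)

step = 0.25                                   # vertical gap between traces
spectra = s.columns[1:]
offsets = step * np.arange(len(spectra))      # 0, 0.25, 0.5, ...

fig, ax = plt.subplots()
for column, offset in zip(spectra, offsets):
    # Shift each spectrum upwards so the traces stack instead of overlapping
    ax.plot(s["wavelength"], s[column] + offset, label=column)
ax.set_xlim(1325, 1375)
ax.legend()
# Call plt.show() or fig.savefig(...) to view the result
```

This is a 2D "stacked traces" waterfall. The right offset depends on the amplitude of the data, so 0.25 is only a starting value.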
As explained by Joe Kington answering in this question : How can I make a scatter plot colored by density in matplotlib, I made a scatter plot colored by density. However, due to the complex distribution of my data, I would like to change the parameters used to calculate the density.
Here is the results with some fake data similar to mine :
I would want to calibrate the density calculations of gaussian_kde so that the left part of the plot looks like this :
I don't like the first plot because the groups of points influence the density of adjacent groups of points and that prevents me from analyzing the distribution within a group. In other words, even if each of the 8 groups have exactly the same distribution, that won't be visible on the graph.
I tried to modify the covariance_factor (like I once did for a 2d plot of density over x), but when gaussian_kde is used with multiple dimension arrays it returns a numpy.ndarray, not a "scipy.stats.kde.gaussian_kde" object. Plus, I don't even know if changing the covariance_factor will do it.
Here's my dummy code :
import numpy as np
import matplotlib.pyplot as plt
from scipy.stats import gaussian_kde
# Generate fake data
a = np.random.normal(size=1000)
b = np.random.normal(size=1000)
# Data for the first image
x = np.concatenate((a+10,a+10,a+20,a+20,a+30,a+30,a+40,a+40,a+80))
y = np.concatenate((b+10,b-10,b+10,b-10,b+10,b-10,b+10,b-10,b*4))
# Data for the second image
#x = np.concatenate((a+10,a+10,a+20,a+20,a+30,a+30,a+40,a+40))
#y = np.concatenate((b+10,b-10,b+10,b-10,b+10,b-10,b+10,b-10))
# Calculate the point density
xy = np.vstack([x,y])
z = gaussian_kde(xy)(xy)
# My unsuccessful attempt to modify the covariance, which would work in 1D with "z = gaussian_kde(x)"
#z.covariance_factor = lambda : 0.01
#z._compute_covariance()
# Sort the points by density, so that the densest points are plotted last
idx = z.argsort()
x, y, z = x[idx], y[idx], z[idx]
fig, ax = plt.subplots()
ax.scatter(x, y, c=z, s=50, edgecolor='none')
plt.show()
The solution could use another density calculator, I don't mind.
The goal is to make a density plot like the ones shown above, where I can play with the density parameters.
I'm using python 3.4.3
Did you have a look at Seaborn? It's not exactly what you're asking for, but it already has functions for generating density plots:
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
# Generate fake data
a = np.random.normal(size=1000)
b = np.random.normal(size=1000)
# Data for the first image
x = np.concatenate((a+10, a+10, a+20, a+20, a+30, a+30, a+40, a+40, a+80))
y = np.concatenate((b+10, b-10, b+10, b-10, b+10, b-10, b+10, b-10, b*4))
# Recent seaborn versions need keyword x/y and no longer accept stat_func
sns.jointplot(x=x, y=y, kind="hex")
sns.jointplot(x=x, y=y, kind="kde")
plt.show()
It gives one hexbin joint plot and one KDE joint plot.
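On the bandwidth question itself: gaussian_kde accepts a bw_method argument, and keeping the estimator object separate from the evaluated densities avoids the "it returns a numpy.ndarray" problem, since `gaussian_kde(xy)(xy)` is already the evaluated result. The sketch below uses smaller fake data than the question, and the bandwidth values 0.5 and 0.05 are arbitrary illustrations, not recommendations:

```python
import numpy as np
from scipy.stats import gaussian_kde

rng = np.random.default_rng(0)
a = rng.normal(size=300)
b = rng.normal(size=300)

# Two well separated groups, similar in spirit to the fake data in the question
x = np.concatenate((a + 10, a + 40))
y = np.concatenate((b + 10, b - 10))
xy = np.vstack([x, y])

# A scalar bw_method is used directly as the bandwidth factor;
# smaller values give narrower kernels, so groups influence each other less
kde_wide = gaussian_kde(xy, bw_method=0.5)
kde_narrow = gaussian_kde(xy, bw_method=0.05)

z_wide = kde_wide(xy)        # density evaluated at every data point
z_narrow = kde_narrow(xy)

print(kde_wide.factor, kde_narrow.factor)
print(z_wide.max(), z_narrow.max())
```

Either `z_wide` or `z_narrow` can then be passed as `c=` to `ax.scatter`, exactly as in the question's code. Whether a small bandwidth gives the desired per-group picture depends on the data, so it is worth trying a few values.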
How can I bin 3d points into 3d bins? Is there a multi dimensional version for np.digitize?
I can use np.digitize separately for each dimension, like here. Is there a better solution?
Thanks!
You can do this with numpy.histogramdd(sample), where the number of bins in each direction and the physical range can be adjusted as with a 1D histogram. More info on the reference page. For more general statistics, like the mean of another variable per bin, you can use the scipy.stats.binned_statistic_dd function, see docs.
For your case with an array of three-dimensional points, you would use it in the following way:
import numpy as np
from mpl_toolkits.mplot3d import Axes3D
import matplotlib.pyplot as plt
#Setup some dummy data
points = np.random.randn(1000,3)
# density=False returns raw counts (the old normed keyword is gone in recent NumPy)
hist, binedges = np.histogramdd(points, density=False)
#Setup a 3D figure and plot points as well as a series of slices
fig = plt.figure()
ax1 = fig.add_subplot(111, projection='3d')
ax1.plot(points[:,0],points[:,1],points[:,2],'k.',alpha=0.3)
#Use one less than bin edges to give rough bin location
X, Y = np.meshgrid(binedges[0][:-1],binedges[1][:-1])
#Loop over range of slice locations (default histogram uses 10 bins)
for ct in [0,2,5,7,9]:
    # hist is indexed [x,y,z] while the meshgrid arrays are indexed [y,x], hence the transpose
    cs = ax1.contourf(X,Y,hist[:,:,ct].T,
                      zdir='z',
                      offset=binedges[2][ct],
                      levels=100,
                      cmap=plt.cm.RdYlBu_r,
                      alpha=0.5)
ax1.set_xlim(-3, 3)
ax1.set_ylim(-3, 3)
ax1.set_zlim(-3, 3)
plt.colorbar(cs)
plt.show()
which gives a series of histogram slices of occupancy at each location.
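If what is needed is the bin index of every point, as np.digitize gives in 1D, rather than the counts, one possible sketch is to reuse the edges returned by histogramdd and digitize each dimension against them. The names `bin_index` and `counts` are made up for this example. scipy.stats.binned_statistic_dd also returns bin numbers, so it is an alternative if SciPy is available.

```python
import numpy as np

rng = np.random.default_rng(0)
points = rng.normal(size=(1000, 3))

# Reuse the edges chosen by histogramdd so both approaches share the same bins
hist, edges = np.histogramdd(points, bins=10)

# Digitize each dimension against its interior edges: indices run from 0 to 9,
# and the point sitting on the outermost right edge falls into the last bin
bin_index = np.column_stack([
    np.digitize(points[:, d], edges[d][1:-1]) for d in range(3)
])

# Rebuild the counts from the per-point indices as a consistency check
counts = np.zeros(hist.shape)
np.add.at(counts, tuple(bin_index.T), 1)

print(bin_index[:3])                   # (ix, iy, iz) for the first three points
print(np.array_equal(counts, hist))
```

This is still one np.digitize call per dimension, but taking the edges from histogramdd keeps the bins consistent with the histogram.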
I need to generate a stack of 2D polar plots (a 3D cylindrical plot) so that I can view a distorted cylinder. I want to use matplotlib since I already have it installed and want to distribute my code to others who only have matplotlib. For example, say I have a bunch of 2-D arrays. Is there any way I can do this without having to download an external package? Here's my code.
#!/usr/bin/env python
import matplotlib.pyplot as plt
import numpy as np
x = np.arange(-180.0,190.0,10)
theta = (np.pi/180.0 )*x # in radians
A0 = 55.0
offset = 60.0
R = [116.225,115.105,114.697,115.008,115.908,117.184,118.61,119.998,121.224,122.216,\
122.93,123.323,123.343,122.948,122.134,120.963,119.575,118.165,116.941,116.074,115.66\
,115.706,116.154,116.913,117.894,119.029,120.261,121.518,122.684,123.594,124.059,\
123.917,123.096,121.661,119.821,117.894,116.225]
fig = plt.figure()
ax = fig.add_axes([0.1,0.1,0.8,0.8],polar=True) # Polar plot
ax.plot(theta,R,lw=2.5)
ax.set_rmax(1.5*(A0)+offset)
plt.show()
I have 10 more similar 2D polar plots and I want to stack them up nicely. If there's any better way to visualize a distorted cylinder in 3D, I'm totally open to suggestions. Any help would be appreciated. Thanks!
If you want to stack polar charts using matplotlib, one approach is to use the mplot3d toolkit (Axes3D). You'll notice that I used polar coordinates first and then converted them to Cartesian when I was ready to plot them.
from numpy import *
from mpl_toolkits.mplot3d import Axes3D
import matplotlib.pyplot as plt
n = 1000
fig = plt.figure()
# fig.gca(projection='3d') no longer works in recent matplotlib releases
ax = fig.add_subplot(111, projection='3d')
for k in linspace(0, 5, 5):
    THETA = linspace(0, 2*pi, n)
    R = ones(THETA.shape)*cos(THETA*k)
    # Convert to Cartesian coordinates
    X = R*cos(THETA)
    Y = R*sin(THETA)
    ax.plot(X, Y, k-2)
plt.show()
If you play with the last argument of ax.plot, it controls the height of each slice. For example, if you want to project all of your data down to a single axis you would use ax.plot(X, Y, 0). For a more exotic example, you can map the height of the data onto a function, say a saddle ax.plot(X, Y, -X**2+Y**2 ). By playing with the colors as well, you could in theory represent multiple 4 dimensional datasets (though I'm not sure how clear this would be). Examples below:
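For the distorted-cylinder case from the question, the same polar-to-Cartesian conversion can also feed plot_surface, which joins the slices into a single skin. This is only a sketch: the radius profiles are synthetic (the formula for `R` is made up), and in practice each row of `R` would be one measured radius array.

```python
import numpy as np
import matplotlib.pyplot as plt

# Made-up radius profiles: one row per slice, one column per angle.
# In practice each row would be one measured R array from a 2D polar plot.
theta = np.radians(np.arange(-180.0, 190.0, 10.0))     # 37 angles, closed loop
heights = np.linspace(0.0, 10.0, 11)                   # 11 stacked slices
THETA, Z = np.meshgrid(theta, heights)
R = 120.0 + 4.0 * np.cos(3 * THETA + 0.2 * Z)          # distortion varies with height

# Polar to Cartesian, slice by slice
X = R * np.cos(THETA)
Y = R * np.sin(THETA)

fig = plt.figure()
ax = fig.add_subplot(projection="3d")
ax.plot_surface(X, Y, Z, cmap="viridis", alpha=0.6)
# Draw each slice outline on top so the individual polar curves stay visible
for row in range(len(heights)):
    ax.plot(X[row], Y[row], Z[row], color="k", linewidth=0.8)
ax.set_xlabel("x"); ax.set_ylabel("y"); ax.set_zlabel("slice height")
# Call plt.show() or fig.savefig(...) to view the result
```

Because a distortion of a few units on a radius of about 120 is hard to see, it may help to plot `R - R.mean()` plus a small base radius instead of `R` itself.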