Python matplotlib 3D bar plot with error bars

I am trying to get a 3D bar plot with error bars.
I am open to using matplotlib, seaborn, or any other Python library or tool.
Searching on SO, I found that 3D bar graphs can be drawn as several 2D plots (here, for example). This is my code:
import pandas as pd
import matplotlib.pyplot as plt
import numpy as np
from mpl_toolkits.mplot3d import Axes3D
dades01 = [54,43,24,104,32,63,57,14,32,12]
dades02 = [35,23,14,54,24,33,43,55,23,11]
dades03 = [12,65,24,32,13,54,23,32,12,43]
df_3d = pd.DataFrame([dades01, dades02, dades03]).transpose()
colors = ['r','b','g','y','c','m']  # 'p' is not a valid matplotlib color code
fig = plt.figure()
ax = fig.add_subplot(111, projection='3d')
z = list(df_3d)
for n, i in enumerate(df_3d):
    print('n', n)
    xs = np.arange(len(df_3d[i]))
    ys = [v for v in df_3d[i]]
    zs = z[n]
    cs = colors[n]
    print(' xs:', xs, 'ys:', ys, 'zs', zs, ' cs: ', cs)
    ax.bar(xs, ys, zs, zdir='y', color=cs, alpha=0.8)
ax.set_xlabel('X')
ax.set_ylabel('Y')
ax.set_zlabel('Z')
plt.show()
I get the 3D 'ish' plot.
My question is: How do I add error bars?
To keep it simple, let's try to add the same error bars to all the plots:
yerr=[10,10,10,10,10,10,10,10,10,10]
If I add my error bars in each '2D' plot:
ax.bar(xs, ys, zs, zdir='y', color=cs,yerr=[10,10,10,10,10,10,10,10,10,10], alpha=0.8)
it doesn't work:
AttributeError: 'LineCollection' object has no attribute 'do_3d_projection'
I have also tried to add:
#ax.errorbar(xs, ys, zs, yerr=[10,10,10,10,10,10,10,10,10,10], ls = 'none')
But again an error:
TypeError: errorbar() got multiple values for keyword argument 'yerr'
Any idea how I could get 3D plot bars with error bars?

To the best of my knowledge, there is no direct way to do this in 3D. However, you can use the workaround shown below. The solution is inspired by the one here. The trick is to plot, for each bar, a line through two vertically aligned points (the bar height plus and minus the error) and use _ as the marker so that it acts as the error bar cap.
yerr = np.array([10,10,10,10,10,10,10,10,10,10])
for n, i in enumerate(df_3d):
    xs = np.arange(len(df_3d[i]))
    ys = [v for v in df_3d[i]]
    zs = z[n]
    cs = colors[n]
    ax.bar(xs, ys, zs, zdir='y', color=cs, alpha=0.8)
    # One vertical segment per bar; the "_" marker draws the caps
    for k, height in enumerate(ys):
        ax.plot([xs[k], xs[k]], [zs, zs], [height+yerr[k], height-yerr[k]], marker="_", color=cs)
ax.set_xlabel('X')
ax.set_ylabel('Y')
ax.set_zlabel('Z')
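For reference, here is the same trick as a minimal self-contained sketch for a single bar series. The helper name error_bar_segments and the sample numbers are mine, chosen only for illustration, and the Agg backend is selected only so the script runs without a display. Newer matplotlib releases (3.4 onward, as far as I can tell) also provide Axes3D.errorbar, which may make the workaround unnecessary; the sketch sticks to plain ax.plot so it does not depend on that.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; drop this line for interactive use
import matplotlib.pyplot as plt


def error_bar_segments(xs, heights, errs, depth):
    """Return one (x, y, z) coordinate triple per vertical error-bar segment.

    Hypothetical helper: each segment runs from height - err to height + err
    at a fixed depth along the y axis.
    """
    segments = []
    for x, h, e in zip(xs, heights, errs):
        segments.append(([x, x], [depth, depth], [h - e, h + e]))
    return segments


heights = [54, 43, 24, 104, 32]   # sample bar heights
errs = [10, 10, 10, 10, 10]       # symmetric error per bar
xs = list(range(len(heights)))
depth = 0                         # where this series sits along y

fig = plt.figure()
ax = fig.add_subplot(111, projection="3d")
ax.bar(xs, heights, zs=depth, zdir="y", color="r", alpha=0.8)

segments = error_bar_segments(xs, heights, errs, depth)
for seg_x, seg_y, seg_z in segments:
    # The "_" marker is drawn at both end points and serves as the cap
    ax.plot(seg_x, seg_y, seg_z, marker="_", color="k")

ax.set_xlabel("X")
ax.set_ylabel("Y")
ax.set_zlabel("Z")
fig.canvas.draw()  # force a render to make sure everything is valid
```

For several series, call the helper once per series with a different depth, exactly as the loop over the DataFrame columns does above.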

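First of all, don't use a 3D plot when a 2D plot would suffice, which it would in this case. Using 3D plots for 2D data obscures the data unnecessarily.
Second, you can get your desired result with a pandas DataFrame that has a MultiIndex: unstacking it gives one column per series, which pandas plots as grouped bars. The yerr array has one row per series and one entry per group: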
df = pd.DataFrame({
    'a': list(range(5))*3,
    'b': [1, 2, 3]*5,
    'c': np.random.randint(low=0, high=10, size=15)
}).set_index(['a', 'b'])
fig, ax = plt.subplots(figsize=(10,6))
y_errs = np.random.random(size=(3, 5))
df.unstack().plot.bar(ax=ax, yerr=y_errs)
This produces a grouped bar plot with an error bar on every bar.
I'm using the 'bmh' style here (I called plt.style.use('bmh') earlier in my notebook), which is why it looks the way it does.
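If you would rather not go through pandas, roughly the same grouped layout can be sketched with matplotlib alone. This is only an illustration under my own assumptions: the random data, the 0.8 total group width, the capsize and the series labels are all made up, and the Agg backend is there only so the script runs headless.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; drop this line for interactive use
import matplotlib.pyplot as plt
import numpy as np

rng = np.random.default_rng(0)
n_groups, n_series = 5, 3
values = rng.integers(low=0, high=10, size=(n_series, n_groups))
errors = rng.random(size=(n_series, n_groups))  # one row of errors per series

width = 0.8 / n_series            # split each group's slot between the series
positions = np.arange(n_groups)

fig, ax = plt.subplots(figsize=(10, 6))
containers = []
for s in range(n_series):
    # Shift each series sideways so the bars of one group sit next to each other
    offset = (s - (n_series - 1) / 2) * width
    bars = ax.bar(positions + offset, values[s], width=width,
                  yerr=errors[s], capsize=3, label=f"series {s + 1}")
    containers.append(bars)

ax.set_xticks(positions)
ax.legend()
fig.canvas.draw()  # force a render to make sure everything is valid
```

Here ax.bar accepts yerr directly because this is an ordinary 2D Axes, which is exactly what fails in the 3D case.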

Related

How to project 2d plots (e.g. boxplot) to 3d in matplotlib?

I have found code to project 2d bar plots or scatter plots into 3d. For example, with the following code:
from mpl_toolkits.mplot3d import Axes3D
import numpy as np
import matplotlib.pyplot as plt
fig = plt.figure(figsize=(8,8))
ax = fig.add_subplot(111, projection='3d')
nbins = 50
for i, c, z in zip([0,1,2],['r', 'g', 'b', 'y'], [30, 20, 10, 0]):
    ys = np.random.normal(loc=10, scale=10, size=2000)
    hist, bins = np.histogram(ys, bins=nbins)
    xs = (bins[:-1] + bins[1:])/2
    ax.bar(xs, hist,zs=z, zdir='y', color=c, ec=c, alpha=0.8)
ax.set_xlabel('X')
ax.set_ylabel('Y')
ax.set_zlabel('Z')
plt.show()
However, I want to project other 2D plots into 3D, for example a boxplot. When I change the code above to use boxplot, I can no longer pass arguments such as "zs=z" and "zdir='y'" to place the 2D plots at different positions. What should I do to get boxplots arranged like the figure above? Thanks!

Matplotlib: categorical plot without strings and inversion of axes

Let's take this snippet of Python:
import matplotlib.pyplot as plt
x = [5,4,3,2,1,0]
x_strings = ['5','4','3','2','1','0']
y = [0,1,2,3,4,5]
plt.figure()
plt.subplot(311)
plt.plot(x, y, marker='o')
plt.subplot(312)
plt.plot(x_strings, y, marker='^', color='red')
plt.subplot(313)
plt.plot(x, y, marker='^', color='red')
plt.gca().invert_xaxis()
plt.show()
Which produces these three subplots:
In the top subplot the x values are automatically sorted increasingly despite their order in the given list. If I want to plot x vs. y exactly in the given order of x, then I have two possibilities:
1) Convert x values to strings and have a categorical plot -- that's the middle subplot.
2) Invert the x-axis -- that's the bottom subplot.
Question: is there any other way to do a sort of categorical plot, but without conversion of numbers into strings and without the inversion of the x-axis?
ADD-ON:
If I use set_xticklabels(list), then for some unclear reason the first element in the list is skipped (no matter if I refer to the x or to the x_strings list), and the resulting plot is also totally strange:
import matplotlib.pyplot as plt
x = [5,4,3,2,1,0]
x_strings = ['5','4','3','2','1','0']
y = [0,1,2,3,4,5]
fig, ax = plt.subplots()
ax.set_xticklabels(x)
ax.plot(x, y, marker='^', color='red')
plt.show()
Both of the approaches you tried are workable. Alternatively, you can always mimic a categorical plot by plotting against integer positions and setting the tick labels to your liking.
import matplotlib.pyplot as plt
x = [5,4,3,2,1,0]
y = [0,1,2,3,4,5]
fig, ax = plt.subplots()
ax.plot(range(len(y)), y, marker='^', color='red')
ax.set_xticks(range(len(y)))
ax.set_xticklabels(x)
plt.show()
I have found another way to do it that is neither categorical nor needs an explicit call to invert_xaxis()!
ax = plt.subplot()
ax.set_xlim(x[0],x[-1], auto=True) # this line does the trick
plt.plot(x, y, marker='^', color='red')

One legend entry when plotting several curves using one `plot` call

I am creating a grid by plotting several curves using one plot call as:
import matplotlib.pyplot as plt
import numpy as np
fig, ax = plt.subplots()
x = np.array([[0,1], [0,1], [0,1]])
y = np.array([[0,0], [1,1], [2,2]])
ax.plot([0,1],[0,2], label='foo', color='b')
ax.plot(x.T, y.T, label='bar', color='k')
ax.legend()
plt.show()
The resulting legend has as many 'bar' entries as there are curves (see below). I would like to have only one legend entry per plot call (in this case, 'bar' only once).
I want this such that I can have other plot commands (e.g. the one plotting the 'foo' curve) whose curves are automatically included in the legend if they have a label. I specifically want to avoid hand-selecting the handles when constructing the legend, but rather use matplotlib's feature to deal with this by yes/no including a label when plotting. How can I achieve this?
Here is one possible solution: you can use the fact that labels starting with an underscore do not produce legend entries. Setting all but the first label to "_" therefore keeps the other curves out of the legend.
import matplotlib.pyplot as plt
import numpy as np
fig, ax = plt.subplots()
x = np.array([[0,1], [0,1], [0,1]])
y = np.array([[0,0], [1,1], [2,2]])
ax.plot([0,1],[0,2], label='foo', color='b')
lines = ax.plot(x.T, y.T, label='bar', color='k')
plt.setp(lines[1:], label="_")
ax.legend()
plt.show()
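To check what the legend will pick up without looking at the figure, you can ask the Axes for its legend handles and labels. The sketch below repeats the code above with that one extra call; the Agg backend is my addition so it runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; drop this line for interactive use
import matplotlib.pyplot as plt
import numpy as np

fig, ax = plt.subplots()
x = np.array([[0, 1], [0, 1], [0, 1]])
y = np.array([[0, 0], [1, 1], [2, 2]])

ax.plot([0, 1], [0, 2], label="foo", color="b")
lines = ax.plot(x.T, y.T, label="bar", color="k")  # three lines, all labelled 'bar'
plt.setp(lines[1:], label="_")                     # hide all but the first

# Labels starting with an underscore are skipped here, just as in ax.legend()
handles, labels = ax.get_legend_handles_labels()
print(labels)
ax.legend()
```

Only the 'foo' line and the first 'bar' line should be reported, and any later labelled plot call is still picked up automatically.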
The following is one way that uses the existing legend handles and labels. You first get all the handles and labels (three for the 'bar' lines here) and then show only the first one. This also gives you control over both the order of the handles and which ones appear in the legend.
ax.plot(x.T, y.T, label='bar', color='k')
handles, labels = ax.get_legend_handles_labels()
ax.legend([handles[0]], [labels[0]], loc='best')
An alternative approach takes the legend entries only from a particular plot call (set of lines), ax1 in this case:
ax1 = ax.plot(x.T, y.T, label='bar', color='k')
plt.legend(handles=[ax1[0]], loc='best')
Extending it to your problem with two plot calls:
ax1 = ax.plot([0,1],[0,2], label='foo', color='b')
ax2 = ax.plot(x.T, y.T, label='bar', color='k')
plt.legend(handles=[ax1[0], ax2[1]], loc='best')
Another alternative, using a for loop as suggested by @SpghttCd:
for i in range(len(x)):
    ax.plot(x[i], y[i], label=('' if i==0 else '_') + 'bar', color='k')
ax.legend()
Maybe not very elegant, but the easiest and most straightforward way is to plot the curves without a label and then plot a single one of them again with the label you want:
import matplotlib.pyplot as plt
import numpy as np
fig, ax = plt.subplots()
x = np.array([[0,1], [0,1], [0,1]])
y = np.array([[0,0], [1,1], [2,2]])
ax.plot([0,1],[0,2], label='foo', color='b')
ax.plot(x.T, y.T, color='k')
ax.plot(x[0].T, y[0].T, label='bar', color='k')
ax.legend()
plt.show()

Turning 2D graphics into 3D in python

In 2D, my variable x holds the x and y coordinates:
x = [[0.72,0.82]]
And at some point in the code I use this:
plt.plot(x[i][0], x[i][1], 'go', markersize=15, alpha=.5)
Now x holds the x, y, and z coordinates:
x = [[0.72,0.82,-0.77]]
I want to reproduce the same effect as in 2D, only now in 3D. I tried something like:
fig = plt.figure()
ax = fig.add_subplot(111, projection='3d')
ax.set_xlim(-1, 1)
ax.set_ylim(-1, 1)
ax.set_zlim(-1, 1)
ax.set_xlabel('X')
ax.set_ylabel('Y')
ax.set_zlabel('Z')
ax.scatter(x[i][0], x[i][1], x[i][2], 'go', markersize=15, alpha=.5)
But I get the following error:
AttributeError: Unknown property markersize
P.S.: I'm using:
import matplotlib.pyplot as plt
from mpl_toolkits.mplot3d import Axes3D
I'd like to know how can I plot them correctly.
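Check the matplotlib reference for the ax.scatter arguments: markersize is not one of them (alpha is accepted). To change the size of the points, use the s argument, and pass the color and marker as c and marker rather than as a 'go' format string, something like: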
ax.scatter(xs, ys, zs, s=10, c=c, marker=m)
Note that s can also be an array of the same length as xs if you want each point to have its own size, for example proportional to its xs value.
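Applied to the data in the question, a minimal sketch could look like the one below. One detail worth knowing: s is the marker area in points squared, whereas markersize in plot is a length in points, so markersize=15 corresponds roughly to s=15**2. The variable names points and artist are mine, and the Agg backend is only there so the script runs headless.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; drop this line for interactive use
import matplotlib.pyplot as plt

points = [[0.72, 0.82, -0.77]]   # one (x, y, z) triple per point

fig = plt.figure()
ax = fig.add_subplot(111, projection="3d")
ax.set_xlim(-1, 1)
ax.set_ylim(-1, 1)
ax.set_zlim(-1, 1)
ax.set_xlabel("X")
ax.set_ylabel("Y")
ax.set_zlabel("Z")

markersize = 15                  # the size used with plt.plot in the 2D version
for px, py, pz in points:
    # 'go' becomes c='g', marker='o'; markersize becomes s (area in points**2)
    artist = ax.scatter([px], [py], [pz], s=markersize ** 2,
                        c="g", marker="o", alpha=0.5)
```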

Adding y=x to a matplotlib scatter plot if I haven't kept track of all the data points that went in

Here's some code that does scatter plot of a number of different series using matplotlib and then adds the line y=x:
import numpy as np, matplotlib.pyplot as plt, matplotlib.cm as cm, pylab
nseries = 10
colors = cm.rainbow(np.linspace(0, 1, nseries))
all_x = []
all_y = []
for i in range(nseries):
    x = np.random.random(12)+i/10.0
    y = np.random.random(12)+i/5.0
    plt.scatter(x, y, color=colors[i])
    all_x.extend(x)
    all_y.extend(y)
# Could I somehow do the next part (add identity_line) if I haven't been keeping track of all the x and y values I've seen?
identity_line = np.linspace(max(min(all_x), min(all_y)),
min(max(all_x), max(all_y)))
plt.plot(identity_line, identity_line, color="black", linestyle="dashed", linewidth=3.0)
plt.show()
In order to achieve this I've had to keep track of all the x and y values that went into the scatter plot so that I know where identity_line should start and end. Is there a way I can get y=x to show up even if I don't have a list of all the points that I plotted? I would think that something in matplotlib can give me a list of all the points after the fact, but I haven't been able to figure out how to get that list.
You don't need to know anything about your data per se. You can get away with what your matplotlib Axes object will tell you about the data.
See below:
import numpy as np
import matplotlib.pyplot as plt
# random data
N = 37
x = np.random.normal(loc=3.5, scale=1.25, size=N)
y = np.random.normal(loc=3.4, scale=1.5, size=N)
c = x**2 + y**2
# now sort it just to make it look like it's related
x.sort()
y.sort()
fig, ax = plt.subplots()
ax.scatter(x, y, s=25, c=c, cmap=plt.cm.coolwarm, zorder=10)
Here's the good part:
lims = [
np.min([ax.get_xlim(), ax.get_ylim()]), # min of both axes
np.max([ax.get_xlim(), ax.get_ylim()]), # max of both axes
]
# now plot both limits against each other
ax.plot(lims, lims, 'k-', alpha=0.75, zorder=0)
ax.set_aspect('equal')
ax.set_xlim(lims)
ax.set_ylim(lims)
fig.savefig('/Users/paul/Desktop/so.png', dpi=300)
Et voilà
In one line:
ax.plot([0,1],[0,1], transform=ax.transAxes)
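No need to modify the xlim or ylim. Note that this draws the diagonal of the axes box, so it coincides with y=x only when the x and y limits are the same.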
Starting with matplotlib 3.3 this has been made very simple with the axline method which only needs a point and a slope. To plot x=y:
ax.axline((0, 0), slope=1)
You don't need to look at your data to use this because the point you specify (i.e. here (0,0)) doesn't actually need to be in your data or plotting range.
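As a minimal sketch of how this fits the scatter example from the question (it needs matplotlib 3.3 or later; the random data, the seed and the styling are just placeholders I picked, and the Agg backend is only there so the script runs headless):

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; drop this line for interactive use
import matplotlib.pyplot as plt
import numpy as np

rng = np.random.default_rng(1)
fig, ax = plt.subplots()

# Several scatter series, without keeping track of the points anywhere
for i in range(3):
    ax.scatter(rng.random(12) + i / 10.0, rng.random(12) + i / 5.0)

# A line through (0, 0) with slope 1, i.e. y = x, spanning the whole Axes
identity = ax.axline((0, 0), slope=1, color="black", linestyle="dashed")
fig.canvas.draw()  # force a render to make sure everything is valid
```

The returned object is a regular line artist, so the usual setters such as set_linewidth work on it.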
If you set scalex and scaley to False, it saves a bit of bookkeeping. This is what I have been using lately to overlay y=x:
xpoints = ypoints = plt.xlim()
plt.plot(xpoints, ypoints, linestyle='--', color='k', lw=3, scalex=False, scaley=False)
or if you've got an Axes object:
xpoints = ypoints = ax.get_xlim()
ax.plot(xpoints, ypoints, linestyle='--', color='k', lw=3, scalex=False, scaley=False)
Of course, this won't give you a square aspect ratio. If you care about that, go with Paul H's solution.
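A quick way to convince yourself that scalex=False and scaley=False really leave the view alone is to compare the limits before and after adding the line. This is only a small sketch with made-up data; the names before, after and span are mine, and the Agg backend is there so it runs headless.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; drop this line for interactive use
import matplotlib.pyplot as plt

fig, ax = plt.subplots()
ax.scatter([1, 2, 3], [2, 4, 9])

before = (ax.get_xlim(), ax.get_ylim())   # limits chosen by the scatter plot

span = ax.get_xlim()
# The line is added, but autoscaling is not triggered for either axis
ax.plot(span, span, linestyle="--", color="k", lw=3,
        scalex=False, scaley=False)

after = (ax.get_xlim(), ax.get_ylim())
print(before == after)
```

Because the line spans only the current x range, it may leave the visible area early when the y range is much larger than the x range, as in this sample data.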
