One legend entry when plotting several curves using one `plot` call - python

I am creating a grid by plotting several curves using one plot call as:
import matplotlib.pyplot as plt
import numpy as np
fig, ax = plt.subplots()
x = np.array([[0,1], [0,1], [0,1]])
y = np.array([[0,0], [1,1], [2,2]])
ax.plot([0,1],[0,2], label='foo', color='b')
ax.plot(x.T, y.T, label='bar', color='k')
ax.legend()
plt.show()
The resulting legend has as many 'bar' entries as there are curves (see below). I would like only one legend entry per plot call (in this case, a single 'bar').
I want this so that other plot commands (e.g. the one plotting the 'foo' curve) are automatically included in the legend whenever they have a label. I specifically want to avoid hand-selecting the handles when constructing the legend, and instead rely on matplotlib's behaviour of including a curve in the legend only if it was given a label. How can I achieve this?

Here is one possible solution: labels that start with an underscore do not produce legend entries. Setting all but the first label to "_" therefore keeps those lines out of the legend.
import matplotlib.pyplot as plt
import numpy as np
fig, ax = plt.subplots()
x = np.array([[0,1], [0,1], [0,1]])
y = np.array([[0,0], [1,1], [2,2]])
ax.plot([0,1],[0,2], label='foo', color='b')
lines = ax.plot(x.T, y.T, label='bar', color='k')
plt.setp(lines[1:], label="_")
ax.legend()
plt.show()
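
If this pattern is needed in several places, it could be wrapped in a small helper. The following is only a sketch: `plot_with_single_label` is a hypothetical name, not a matplotlib function, and it relies on the same underscore rule described above.

```python
import matplotlib.pyplot as plt
import numpy as np


def plot_with_single_label(ax, x, y, label, **kwargs):
    """Plot several curves in one call but keep a single legend entry.

    Hypothetical helper for illustration; not part of matplotlib.
    """
    lines = ax.plot(x, y, label=label, **kwargs)
    # Labels starting with an underscore are skipped by ax.legend().
    for line in lines[1:]:
        line.set_label("_nolegend_")
    return lines


fig, ax = plt.subplots()
x = np.array([[0, 1], [0, 1], [0, 1]])
y = np.array([[0, 0], [1, 1], [2, 2]])
ax.plot([0, 1], [0, 2], label="foo", color="b")
plot_with_single_label(ax, x.T, y.T, label="bar", color="k")
legend = ax.legend()
legend_labels = [text.get_text() for text in legend.get_texts()]
```

Any other labelled `ax.plot` call is still picked up by `ax.legend()` automatically; call `plt.show()` afterwards to display the figure.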

Here is one way that uses the existing legend handles and labels: fetch all three handles and labels, then pass only the first of each to the legend. This also gives you control over the order of the entries and over which ones are shown.
ax.plot(x.T, y.T, label='bar', color='k')
handles, labels = ax.get_legend_handles_labels()
ax.legend([handles[0]], [labels[0]], loc='best')
An alternative approach takes the legend entries from one particular plot call only (the list of lines named ax1 in this case):
ax1 = ax.plot(x.T, y.T, label='bar', color='k')
plt.legend(handles=[ax1[0]], loc='best')
Extending it to your problem with two plot calls:
ax1 = ax.plot([0,1],[0,2], label='foo', color='b')
ax2 = ax.plot(x.T, y.T, label='bar', color='k')
plt.legend(handles=[ax1[0], ax2[1]], loc='best')
Another alternative, using a for loop as suggested by @SpghttCd:
for i in range(len(x)):
    ax.plot(x[i], y[i], label=('' if i==0 else '_') + 'bar', color='k')
ax.legend()

Maybe not quite elegant, but the easiest and most straightforward way is to make a second plot using a single pair of elements where you prescribe the 'label' you want!
import matplotlib.pyplot as plt
import numpy as np
fig, ax = plt.subplots()
x = np.array([[0,1], [0,1], [0,1]])
y = np.array([[0,0], [1,1], [2,2]])
ax.plot([0,1],[0,2], label='foo', color='b')
ax.plot(x.T, y.T, color='k')
ax.plot(x[0].T, y[0].T, label='bar', color='k')
ax.legend()
plt.show()

Related

How to plot a paired histogram using seaborn

I would like to make a paired histogram like the one shown here using the seaborn distplot.
This kind of plot can also be referred to as the back-to-back histogram shown here, or a bihistogram inverted/mirrored along the x-axis as discussed here.
Here is my code:
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
green = np.random.normal(20,10,1000)
blue = np.random.poisson(60,1000)
fig, ax = plt.subplots(figsize=(8,6))
sns.distplot(blue, hist=True, kde=True, hist_kws={'edgecolor':'black'}, kde_kws={'linewidth':2}, bins=10, color='blue')
sns.distplot(green, hist=True, kde=True, hist_kws={'edgecolor':'black'}, kde_kws={'linewidth':2}, bins=10, color='green')
ax.set_xticks(np.arange(-20,121,20))
ax.set_yticks(np.arange(0.0,0.07,0.01))
ax.spines['top'].set_visible(False)
ax.spines['right'].set_visible(False)
plt.show()
Here is the output:
When I use the method discussed here (plt.barh), I get the bar plot shown just below, which is not what I am looking for.
Or maybe I haven't understood the workaround well enough...
A simple/short implementation of python-seaborn-distplot similar to these kinds of plots would be perfect. I edited the figure of my first plot above to show the kind of plot I hope to achieve (though y-axis not upside down):
Any leads would be greatly appreciated.
You could use two subplots and invert the y-axis of the lower one and plot with the same bins.
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
df = pd.DataFrame({'a': np.random.normal(0,5,1000), 'b': np.random.normal(20,5,1000)})
fig =plt.figure(figsize=(5,5))
ax = fig.add_subplot(211)
ax2 = fig.add_subplot(212)
bins = np.arange(-20,40)
ax.hist(df['a'], bins=bins)
ax2.hist(df['b'],color='orange', bins=bins)
ax2.invert_yaxis()
Edit: improvements suggested by @mwaskom:
fig, axes = plt.subplots(nrows=2, ncols=1, sharex=True, figsize=(5,5))
bins = np.arange(-20,40)
for ax, column, color, invert in zip(axes.ravel(), df.columns, ['teal', 'orange'], [False,True]):
    ax.hist(df[column], bins=bins, color=color)
    if invert:
        ax.invert_yaxis()
plt.subplots_adjust(hspace=0)
Here is a possible approach using seaborn's distplot.
Seaborn doesn't return the created graphical elements, but the ax can be interrogated. To make sure the ax only contains the elements you want upside down, those elements can be drawn first. Then, all the patches (the rectangular bars) and the lines (the curve for the kde) can be given their height in negative. Optionally the x-axis can be set at y == 0 using ax.spines['bottom'].set_position('zero').
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
green = np.random.normal(20, 10, 1000)
blue = np.random.poisson(60, 1000)
fig, ax = plt.subplots(figsize=(8, 6))
sns.distplot(green, hist=True, kde=True, hist_kws={'edgecolor': 'black'}, kde_kws={'linewidth': 2}, bins=10,
             color='green')
for p in ax.patches:  # turn the histogram upside down
    p.set_height(-p.get_height())
for l in ax.lines:  # turn the kde curve upside down
    l.set_ydata(-l.get_ydata())
sns.distplot(blue, hist=True, kde=True, hist_kws={'edgecolor': 'black'}, kde_kws={'linewidth': 2}, bins=10,
             color='blue')
ax.set_xticks(np.arange(-20, 121, 20))
ax.set_yticks(np.arange(0.0, 0.07, 0.01))
ax.spines['top'].set_visible(False)
ax.spines['right'].set_visible(False)
pos_ticks = np.array([t for t in ax.get_yticks() if t > 0])
ticks = np.concatenate([-pos_ticks[::-1], [0], pos_ticks])
ax.set_yticks(ticks)
ax.set_yticklabels([f'{abs(t):.2f}' for t in ticks])
ax.spines['bottom'].set_position('zero')
plt.show()
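
The same idea of drawing one distribution with negative heights can also be sketched without seaborn, using only NumPy and matplotlib. This is an assumption-laden illustration rather than a drop-in replacement: it omits the KDE curve, and the 20 shared bins and the fixed random seed are arbitrary choices.

```python
import matplotlib.pyplot as plt
import numpy as np
from matplotlib.ticker import FuncFormatter

rng = np.random.default_rng(0)
green = rng.normal(20, 10, 1000)
blue = rng.poisson(60, 1000)

# Shared bin edges so that the bars of both samples line up.
bins = np.linspace(min(green.min(), blue.min()),
                   max(green.max(), blue.max()), 21)
blue_density, _ = np.histogram(blue, bins=bins, density=True)
green_density, _ = np.histogram(green, bins=bins, density=True)

fig, ax = plt.subplots(figsize=(8, 6))
widths = np.diff(bins)
# Upper half: blue sample with positive heights.
ax.bar(bins[:-1], blue_density, width=widths, align="edge",
       color="blue", edgecolor="black")
# Lower half: green sample mirrored by negating the heights.
ax.bar(bins[:-1], -green_density, width=widths, align="edge",
       color="green", edgecolor="black")
ax.axhline(0, color="black", linewidth=1)
# Show magnitudes instead of signed values on the y-axis.
ax.yaxis.set_major_formatter(
    FuncFormatter(lambda value, pos: f"{abs(value):.2f}"))
```

Computing the histograms explicitly avoids having to interrogate the axes for patches afterwards; call `plt.show()` to display the result.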

Matplotlib: categorical plot without strings and inversion of axes

Let's take this snippet of Python:
import matplotlib.pyplot as plt
x = [5,4,3,2,1,0]
x_strings = ['5','4','3','2','1','0']
y = [0,1,2,3,4,5]
plt.figure()
plt.subplot(311)
plt.plot(x, y, marker='o')
plt.subplot(312)
plt.plot(x_strings, y, marker='^', color='red')
plt.subplot(313)
plt.plot(x, y, marker='^', color='red')
plt.gca().invert_xaxis()
plt.show()
Which produces these three subplots:
In the top subplot the x values are automatically sorted increasingly despite their order in the given list. If I want to plot x vs. y exactly in the given order of x, then I have two possibilities:
1) Convert x values to strings and have a categorical plot -- that's the middle subplot.
2) Invert the x-axis -- that's the bottom subplot.
Question: is there any other way to do a sort of categorical plot, but without conversion of numbers into strings and without the inversion of the x-axis?
ADD-ON:
If I use set_xticklabels(list), then for some unclear reason the first element in the list is skipped (no matter if I refer to the x or to the x_strings list), and the resulting plot is also totally strange:
import matplotlib.pyplot as plt
x = [5,4,3,2,1,0]
x_strings = ['5','4','3','2','1','0']
y = [0,1,2,3,4,5]
fig, ax = plt.subplots()
ax.set_xticklabels(x)
ax.plot(x, y, marker='^', color='red')
plt.show()
Both attempted solutions seem possible. Alternatively, you can always mimic categorical plots by plotting integer numbers and setting the ticklabels to your liking.
import matplotlib.pyplot as plt
x = [5,4,3,2,1,0]
y = [0,1,2,3,4,5]
fig, ax = plt.subplots()
ax.plot(range(len(y)), y, marker='^', color='red')
ax.set_xticks(range(len(y)))
ax.set_xticklabels(x)
plt.show()
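
A reusable sketch of this approach is shown below. The helper name `plot_in_given_order` is hypothetical. The key point, as far as I understand it, is that the tick positions are fixed with `set_xticks` before the labels are assigned, which is the step missing from the ADD-ON attempt in the question, where the labels end up on whatever default ticks exist.

```python
import matplotlib.pyplot as plt

x = [5, 4, 3, 2, 1, 0]
y = [0, 1, 2, 3, 4, 5]


def plot_in_given_order(ax, x, y, **kwargs):
    """Plot y against the positions 0..n-1 and label the ticks with x.

    Hypothetical helper for illustration; not part of matplotlib.
    """
    positions = list(range(len(x)))
    lines = ax.plot(positions, y, **kwargs)
    # Fix the tick positions first, then attach one label per tick.
    ax.set_xticks(positions)
    ax.set_xticklabels([str(value) for value in x])
    return lines


fig, ax = plt.subplots()
plot_in_given_order(ax, x, y, marker="^", color="red")
tick_labels = [tick.get_text() for tick in ax.get_xticklabels()]
```

Note that the x-axis is then purely positional: the spacing between ticks no longer reflects the numeric distance between the values in `x`.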
I have found another way to do it, with neither a categorical axis nor a call to invert_xaxis():
ax = plt.subplot()
ax.set_xlim(x[0], x[-1], auto=True)  # this line does the trick: limits given in descending order reverse the axis direction
plt.plot(x, y, marker='^', color='red')

Z-order across axes when using matplotlib's twinx [duplicate]

In pyplot, you can change the order of different graphs using the zorder option or by changing the order of the plot() commands. However, when you add an alternative axis via ax2 = twinx(), the new axis will always overlay the old axis (as described in the documentation).
Is it possible to change the order of the axis to move the alternative (twinned) y-axis to background?
In the example below, I would like to display the blue line on top of the histogram:
import numpy as np
import matplotlib.pyplot as plt
import random
# Data
x = np.arange(-3.0, 3.01, 0.1)
y = np.power(x,2)
y2 = 1/np.sqrt(2*np.pi) * np.exp(-y/2)
data = [random.gauss(0.0, 1.0) for i in range(1000)]
# Plot figure
fig = plt.figure()
ax1 = fig.add_subplot(111)
ax2 = ax1.twinx()
ax2.hist(data, bins=40, density=True, color='g', zorder=0)
ax2.plot(x, y2, color='r', linewidth=2, zorder=2)
ax1.plot(x, y, color='b', linewidth=2, zorder=5)
ax1.set_ylabel("Parabola")
ax2.set_ylabel("Normal distribution")
ax1.yaxis.label.set_color('b')
ax2.yaxis.label.set_color('r')
plt.show()
Edit: For some reason, I am unable to upload the image generated by this code. I will try again later.
You can set the zorder of an axes, ax.set_zorder(). One would then need to remove the background of that axes, such that the axes below is still visible.
ax2 = ax1.twinx()
ax1.set_zorder(10)
ax1.patch.set_visible(False)
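
Put together with the data from the question, a minimal runnable sketch might look as follows. The seed and the use of `density=True` are my own choices, and `ax2.get_zorder() + 1` is used instead of a fixed value of 10 only to make the intent explicit.

```python
import matplotlib.pyplot as plt
import numpy as np

rng = np.random.default_rng(0)
x = np.arange(-3.0, 3.01, 0.1)
y = x**2
data = rng.normal(0.0, 1.0, 1000)

fig, ax1 = plt.subplots()
ax2 = ax1.twinx()
ax2.hist(data, bins=40, density=True, color="g")
ax1.plot(x, y, color="b", linewidth=2)

# Draw the original axes above the twinned one ...
ax1.set_zorder(ax2.get_zorder() + 1)
# ... and hide its background so the histogram underneath stays visible.
ax1.patch.set_visible(False)
```

The `zorder` arguments of the individual artists only order things within one axes, which is why the axes themselves have to be reordered here.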

Matplotlib: how to adjust zorder of second legend?

Here is an example that reproduces my problem:
import matplotlib.pyplot as plt
import numpy as np
data1,data2,data3,data4 = np.random.random(100),np.random.random(100),np.random.random(100),np.random.random(100)
fig,ax = plt.subplots()
ax.plot(data1)
ax.plot(data2)
ax.plot(data3)
ax2 = ax.twinx()
ax2.plot(data4)
plt.grid('on')
ax.legend(['1','2','3'], loc='center')
ax2.legend(['4'], loc=1)
How can I get the legend in the center to plot on top of the lines?
To get exactly what you have asked for, try the following. Note I have modified your code to define the labels when you generate the plot and also the colors so you don't get a repeated blue line.
import matplotlib.pyplot as plt
import numpy as np
data1,data2,data3,data4 = (np.random.random(100),
                           np.random.random(100),
                           np.random.random(100),
                           np.random.random(100))
fig,ax = plt.subplots()
ax.plot(data1, label="1", color="k")
ax.plot(data2, label="2", color="r")
ax.plot(data3, label="3", color="g")
ax2 = ax.twinx()
ax2.plot(data4, label="4", color="b")
# First get the handles and labels from the axes
handles1, labels1 = ax.get_legend_handles_labels()
handles2, labels2 = ax2.get_legend_handles_labels()
# Add the first legend to the second axis so it displays 'on top'
first_legend = plt.legend(handles1, labels1, loc='center')
ax2.add_artist(first_legend)
# Add the second legend as usual
ax2.legend(handles2, labels2)
plt.show()
That said, it would be clearer to use a single legend that contains all the lines. This is described in this SO post, and with the code above it can easily be achieved with
ax2.legend(handles1+handles2, labels1+labels2)
But obviously you may have your own reasons for wanting two legends.
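
For completeness, here is a self-contained sketch of the single-legend variant. The random data and the seed are placeholders; the colors follow the answer above.

```python
import matplotlib.pyplot as plt
import numpy as np

rng = np.random.default_rng(0)
fig, ax = plt.subplots()
for name, color in zip("123", "krg"):
    ax.plot(rng.random(100), label=name, color=color)
ax2 = ax.twinx()
ax2.plot(rng.random(100), label="4", color="b")

# Collect the handles and labels from both axes ...
handles1, labels1 = ax.get_legend_handles_labels()
handles2, labels2 = ax2.get_legend_handles_labels()
# ... and draw one legend on the top-most axes so no line covers it.
combined = ax2.legend(handles1 + handles2, labels1 + labels2, loc="center")
combined_labels = [text.get_text() for text in combined.get_texts()]
```

Because the legend belongs to `ax2`, which is drawn last, it sits above the lines of both axes.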

Adding y=x to a matplotlib scatter plot if I haven't kept track of all the data points that went in

Here's some code that does scatter plot of a number of different series using matplotlib and then adds the line y=x:
import numpy as np, matplotlib.pyplot as plt, matplotlib.cm as cm, pylab
nseries = 10
colors = cm.rainbow(np.linspace(0, 1, nseries))
all_x = []
all_y = []
for i in range(nseries):
    x = np.random.random(12)+i/10.0
    y = np.random.random(12)+i/5.0
    plt.scatter(x, y, color=colors[i])
    all_x.extend(x)
    all_y.extend(y)
# Could I somehow do the next part (add identity_line) if I haven't been keeping track of all the x and y values I've seen?
identity_line = np.linspace(max(min(all_x), min(all_y)),
                            min(max(all_x), max(all_y)))
plt.plot(identity_line, identity_line, color="black", linestyle="dashed", linewidth=3.0)
plt.show()
In order to achieve this I've had to keep track of all the x and y values that went into the scatter plot so that I know where identity_line should start and end. Is there a way I can get y=x to show up even if I don't have a list of all the points that I plotted? I would think that something in matplotlib can give me a list of all the points after the fact, but I haven't been able to figure out how to get that list.
You don't need to know anything about your data per se. You can get away with what your matplotlib Axes object will tell you about the data.
See below:
import numpy as np
import matplotlib.pyplot as plt
# random data
N = 37
x = np.random.normal(loc=3.5, scale=1.25, size=N)
y = np.random.normal(loc=3.4, scale=1.5, size=N)
c = x**2 + y**2
# now sort it just to make it look like it's related
x.sort()
y.sort()
fig, ax = plt.subplots()
ax.scatter(x, y, s=25, c=c, cmap=plt.cm.coolwarm, zorder=10)
Here's the good part:
lims = [
    np.min([ax.get_xlim(), ax.get_ylim()]),  # min of both axes
    np.max([ax.get_xlim(), ax.get_ylim()]),  # max of both axes
]
# now plot both limits against each other
ax.plot(lims, lims, 'k-', alpha=0.75, zorder=0)
ax.set_aspect('equal')
ax.set_xlim(lims)
ax.set_ylim(lims)
fig.savefig('/Users/paul/Desktop/so.png', dpi=300)
Et voilà
In one line:
ax.plot([0,1],[0,1], transform=ax.transAxes)
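No need to modify the xlim or ylim. Note that this draws the diagonal of the axes rectangle, so it coincides with y=x only when the x and y limits are identical.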
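
One thing worth checking with this one-liner is where the axes diagonal ends up in data coordinates. The sketch below uses made-up data and limits, and maps the top-right corner of the axes back to data space by composing `ax.transAxes` with the inverse of `ax.transData`.

```python
import matplotlib.pyplot as plt
import numpy as np

fig, ax = plt.subplots()
ax.scatter([0, 1, 2], [0, 10, 20])
ax.set_xlim(0, 2)
ax.set_ylim(0, 20)

# Diagonal of the axes rectangle, given in axes coordinates (0..1).
ax.plot([0, 1], [0, 1], transform=ax.transAxes)

# Map the top-right corner of the axes back into data coordinates.
axes_to_data = ax.transAxes + ax.transData.inverted()
corner_in_data = axes_to_data.transform((1.0, 1.0))
# With these limits the corner is (2, 20), which is not a point on y = x.
```

So the line only represents y=x when both axes share the same limits; otherwise it is simply the corner-to-corner diagonal of the plot area.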
Starting with matplotlib 3.3 this has been made very simple with the axline method which only needs a point and a slope. To plot x=y:
ax.axline((0, 0), slope=1)
You don't need to look at your data to use this because the point you specify (i.e. here (0,0)) doesn't actually need to be in your data or plotting range.
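
A minimal sketch with placeholder scatter data, assuming matplotlib 3.3 or newer. As far as I can tell, the anchor point may still be taken into account when the axes autoscale, so it can be worth setting the limits explicitly if the view ends up stretched.

```python
import matplotlib.pyplot as plt
import numpy as np

rng = np.random.default_rng(0)
fig, ax = plt.subplots()
ax.scatter(rng.random(12) + 3, rng.random(12) + 3)

# Infinite line through (0, 0) with slope 1, i.e. y = x.
identity = ax.axline((0, 0), slope=1, color="black", linestyle="dashed")
```

The line is redrawn for whatever limits are current, so it keeps spanning the axes after zooming or panning.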
If you set scalex and scaley to False, it saves a bit of bookkeeping. This is what I have been using lately to overlay y=x:
xpoints = ypoints = plt.xlim()
plt.plot(xpoints, ypoints, linestyle='--', color='k', lw=3, scalex=False, scaley=False)
or if you've got an axis:
xpoints = ypoints = ax.get_xlim()
ax.plot(xpoints, ypoints, linestyle='--', color='k', lw=3, scalex=False, scaley=False)
Of course, this won't give you a square aspect ratio. If you care about that, go with Paul H's solution.
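
The effect of `scalex=False` and `scaley=False` can be checked by comparing the limits before and after the call. The data below is a placeholder.

```python
import matplotlib.pyplot as plt
import numpy as np

rng = np.random.default_rng(0)
fig, ax = plt.subplots()
ax.scatter(rng.random(12) + 0.3, rng.random(12) + 0.6)

# Reading the limits here fixes the autoscaled view of the scatter data.
xlim_before, ylim_before = ax.get_xlim(), ax.get_ylim()

xpoints = ypoints = ax.get_xlim()
# scalex/scaley=False: the new line must not trigger another autoscale.
ax.plot(xpoints, ypoints, linestyle="--", color="k", lw=3,
        scalex=False, scaley=False)

xlim_after, ylim_after = ax.get_xlim(), ax.get_ylim()
```

The line spans the current x-range only, so data added later that widens the view will not be covered by it.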
