Matplotlib: categorical plot without strings and inversion of axes - python

Let's take this snippet of Python:
import matplotlib.pyplot as plt
x = [5,4,3,2,1,0]
x_strings = ['5','4','3','2','1','0']
y = [0,1,2,3,4,5]
plt.figure()
plt.subplot(311)
plt.plot(x, y, marker='o')
plt.subplot(312)
plt.plot(x_strings, y, marker='^', color='red')
plt.subplot(313)
plt.plot(x, y, marker='^', color='red')
plt.gca().invert_xaxis()
plt.show()
Which produces these three subplots:
In the top subplot the points are placed by numeric value, so the x values appear in increasing order regardless of their order in the given list. If I want to plot x vs. y exactly in the given order of x, then I have two possibilities:
1) Convert x values to strings and have a categorical plot -- that's the middle subplot.
2) Invert the x-axis -- that's the bottom subplot.
Question: is there any other way to do a sort of categorical plot, but without conversion of numbers into strings and without the inversion of the x-axis?
ADD-ON:
If I use set_xticklabels(list), then for some reason the first element of the list is skipped (whether I pass the x or the x_strings list), and the resulting plot looks wrong:
import matplotlib.pyplot as plt
x = [5,4,3,2,1,0]
x_strings = ['5','4','3','2','1','0']
y = [0,1,2,3,4,5]
fig, ax = plt.subplots()
ax.set_xticklabels(x)
ax.plot(x, y, marker='^', color='red')
plt.show()

Both attempted solutions seem possible. Alternatively, you can always mimic categorical plots by plotting integer numbers and setting the ticklabels to your liking.
import matplotlib.pyplot as plt
x = [5,4,3,2,1,0]
y = [0,1,2,3,4,5]
fig, ax = plt.subplots()
ax.plot(range(len(y)), y, marker='^', color='red')
ax.set_xticks(range(len(y)))
ax.set_xticklabels(x)
plt.show()

I have found another way to do it, without a categorical axis and without calling invert_xaxis(): set the x limits in the order of the data.
ax = plt.subplot()
ax.set_xlim(x[0],x[-1], auto=True) # this line plays the trick
plt.plot(x, y, marker='^', color='red')
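A minimal sketch of what this trick does, assuming the limits are set after plotting so that autoscaling cannot interfere. Passing descending limits to set_xlim draws the data in the given order, and matplotlib then reports the x-axis as inverted, so the visual result is the same as with invert_xaxis(). The Agg backend is an assumption made here so the sketch runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend, no window needed
import matplotlib.pyplot as plt

x = [5, 4, 3, 2, 1, 0]
y = [0, 1, 2, 3, 4, 5]

fig, ax = plt.subplots()
ax.plot(x, y, marker='^', color='red')

# Descending limits: the left edge is x[0] = 5, the right edge is x[-1] = 0.
ax.set_xlim(x[0], x[-1])

limits = ax.get_xlim()          # limits exactly as they were set
inverted = ax.xaxis_inverted()  # whether matplotlib treats the axis as inverted
```

Because the limits are set explicitly, any data outside the range x[0]..x[-1] would be cut off; with sorted data as in the question this is not an issue.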

Related

Create a scaled secondary y-axis in Matplotlib

The objective is to draw a scatter plot and add a secondary y-axis. Here, the secondary y-axis is just a scaled copy of the original y-axis.
Assume the scaling can be calculated as
y2=y1/2.5
where y1 is the y-axis of the scatter plot and y2 is its scaled copy.
This can be visualized as below.
However, using the code below,
import numpy as np
import matplotlib.pyplot as plt
x, y = np.random.random((2,50))
fig, ax1 = plt.subplots()
ax1.scatter(x, y*10, c='b')
ax2 = ax1.twinx()
y2=y/2.5
ax2.plot(1, 1, 'w-')
ax1.set_xlabel('X1_z')
ax1.set_ylabel('x1_y', color='g')
ax2.set_ylabel('x2_y', color='r')
which produced
There are three issues:
1) The secondary y-axis is not scaled properly.
2) As expected, but not intended, multiple horizontal lines extend from the secondary y-axis.
3) Is there a way to create the scaled y-axis without needing the line ax2.plot(1, 1, 'w-')?
May I know how to handle this?
As suggested in the comments, this can be done with secondary_yaxis:
x, y = np.random.random((2,50))
fig, ax = plt.subplots()
ax.scatter(x, y*10, c='b')
ax.set_xlabel('X1_z')
ax.set_ylabel('x1_y')
ax.set_title('Adding secondary y-axis')
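# forward transform: primary axis values -> secondary axis values
def a2b(y):
    return y/2.5
# inverse transform: secondary axis values -> primary axis values
def b2a(y):
    return 2.5*y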
secax = ax.secondary_yaxis('right', functions=(a2b,b2a))
secax.set_ylabel('x2_y')
plt.show()
Produced
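The two functions passed to secondary_yaxis are expected to be inverses of each other. A minimal sketch of a round-trip check, assuming the same 2.5 scaling as above; the fixed random seed, the sample values, and the Agg backend are assumptions added for reproducibility.

```python
import numpy as np
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend, no window needed
import matplotlib.pyplot as plt

rng = np.random.default_rng(0)  # assumption: fixed seed for reproducibility
x, y = rng.random((2, 50))

def a2b(y1):
    # primary axis value -> secondary axis value
    return y1 / 2.5

def b2a(y2):
    # secondary axis value -> primary axis value
    return 2.5 * y2

fig, ax = plt.subplots()
ax.scatter(x, y * 10, c='b')
secax = ax.secondary_yaxis('right', functions=(a2b, b2a))
secax.set_ylabel('x2_y')

# Round trip: converting to the secondary scale and back should give the input.
sample = np.linspace(0.0, 10.0, 11)
round_trip_ok = np.allclose(b2a(a2b(sample)), sample)
```

If the round trip fails, the ticks on the secondary axis will not correspond to the positions of the data on the primary axis.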

Creating multiple rows of matplotlib x labels

import matplotlib.pyplot as plt
import numpy as np
x = np.arange(-1,5)
y = 6 - np.square(x-1)
fig, ax = plt.subplots()
ax.plot(x, y, 'b')
ax.scatter(x, y, color='m', zorder=10)
ax.set_xlabel('x')
ax.set_ylabel('y')
This creates the following:
This function is increasing for all values of x < 1 and decreasing for all values of x > 1. Is there a simple way that I can put the text "Increasing" like an x label but centered below the x ticks of 0 and 1, "Decreasing" like an x label but centered below 3, and move the "x" xlabel lower such that it has a lower vertical position than "Increasing" and "Decreasing"? I'd rather not do this with ax.text() unless I absolutely have to.
Maybe use text? I have tried changing the labels but this seems cumbersome. Unfortunately you have to set the text coordinates "manually". Note that you can use newline in the ticks to move them down.
import matplotlib.pyplot as plt
import numpy as np
x = np.arange(-1,5)
y = 6 - np.square(x-1)
fig, ax = plt.subplots()
ax.text(0.5, -4.6, 'Increasing', ha="center")
ax.text(3, -4.6, 'Decreasing', ha="center")
ax.plot(x, y, 'b')
ax.scatter(x, y, color='m', zorder=10)
ax.set_xlabel('\nx')
ax.set_ylabel('y')
which produces
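The manual y coordinate (-4.6) above is in data units, so it has to change whenever the y range changes. A possible variant, sketched here under the assumption that ax.text() is acceptable after all: with the blended transform from ax.get_xaxis_transform(), x stays in data units while y is a fraction of the axes height. The offset -0.12 and the labelpad value are illustrative guesses, not values from the original answer.

```python
import numpy as np
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend, no window needed
import matplotlib.pyplot as plt

x = np.arange(-1, 5)
y = 6 - np.square(x - 1)

fig, ax = plt.subplots()
ax.plot(x, y, 'b')
ax.scatter(x, y, color='m', zorder=10)

# Blended transform: x in data coordinates, y in axes fractions (0 = bottom edge).
trans = ax.get_xaxis_transform()

# -0.12 places the text just below the tick labels (illustrative guess).
inc = ax.text(0.5, -0.12, 'Increasing', ha='center', va='top', transform=trans)
dec = ax.text(3, -0.12, 'Decreasing', ha='center', va='top', transform=trans)

# labelpad pushes the real x label below the two annotations (illustrative guess).
ax.set_xlabel('x', labelpad=20)
ax.set_ylabel('y')
```

The vertical placement now survives changes to the y limits, although it may still need tuning for a different font size or figure height.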

One legend entry when plotting several curves using one `plot` call

I am creating a grid by plotting several curves using one plot call as:
import matplotlib.pyplot as plt
import numpy as np
fig, ax = plt.subplots()
x = np.array([[0,1], [0,1], [0,1]])
y = np.array([[0,0], [1,1], [2,2]])
ax.plot([0,1],[0,2], label='foo', color='b')
ax.plot(x.T, y.T, label='bar', color='k')
ax.legend()
plt.show()
The resulting legend has as many 'bar' entries as there are curves (see below). I would like to have only one legend entry per plot call (in this case, 'bar' only once).
I want this such that I can have other plot commands (e.g. the one plotting the 'foo' curve) whose curves are automatically included in the legend if they have a label. I specifically want to avoid hand-selecting the handles when constructing the legend, but rather use matplotlib's feature to deal with this by yes/no including a label when plotting. How can I achieve this?
Here is one possible solution: you can use the fact that labels starting with an underscore do not produce legend entries. Setting all but the first label to "_" keeps those lines out of the legend.
import matplotlib.pyplot as plt
import numpy as np
fig, ax = plt.subplots()
x = np.array([[0,1], [0,1], [0,1]])
y = np.array([[0,0], [1,1], [2,2]])
ax.plot([0,1],[0,2], label='foo', color='b')
lines = ax.plot(x.T, y.T, label='bar', color='k')
plt.setp(lines[1:], label="_")
ax.legend()
plt.show()
Another way uses the existing legend handles and labels: get all the handles and labels, then pass only the first of each to the legend. This also gives you control over both the order of the entries and which ones are shown.
ax.plot(x.T, y.T, label='bar', color='k')
handles, labels = ax.get_legend_handles_labels()
ax.legend([handles[0]], [labels[0]], loc='best')
Alternative approach where the legends will only be taken from a particular plot (set of lines) -- ax1 in this case
ax1 = ax.plot(x.T, y.T, label='bar', color='k')
plt.legend(handles=[ax1[0]], loc='best')
Extending it to your problem with two plot calls
ax1 = ax.plot([0,1],[0,2], label='foo', color='b')
ax2 = ax.plot(x.T, y.T, label='bar', color='k')
plt.legend(handles=[ax1[0], ax2[1]], loc='best')
Another alternative using a for loop, as suggested by @SpghttCd
for i in range(len(x)):
    # only the first curve keeps the plain label; the rest get a leading underscore
    ax.plot(x[i], y[i], label=('' if i==0 else '_') + 'bar', color='k')
ax.legend()
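A different technique from the ones above, sketched under the assumption that a generic post-processing step is acceptable: collect the handles and labels once and keep only the first handle per label. This is label de-duplication rather than hand-selecting handles, so it keeps working when more labelled plot calls are added. The names by_label and legend_labels are illustrative.

```python
import numpy as np
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend, no window needed
import matplotlib.pyplot as plt

fig, ax = plt.subplots()
x = np.array([[0, 1], [0, 1], [0, 1]])
y = np.array([[0, 0], [1, 1], [2, 2]])
ax.plot([0, 1], [0, 2], label='foo', color='b')
ax.plot(x.T, y.T, label='bar', color='k')  # three lines, all labelled 'bar'

handles, labels = ax.get_legend_handles_labels()

# Keep the first handle seen for each label; dicts preserve insertion order.
by_label = {}
for handle, label in zip(handles, labels):
    by_label.setdefault(label, handle)

legend = ax.legend(list(by_label.values()), list(by_label.keys()))
legend_labels = [text.get_text() for text in legend.get_texts()]
```

Here legend_labels holds one entry per distinct label, in plotting order.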
Maybe not quite elegant, but the easiest and most straightforward way is to make a second plot using a single pair of elements where you prescribe the 'label' you want!
import matplotlib.pyplot as plt
import numpy as np
fig, ax = plt.subplots()
x = np.array([[0,1], [0,1], [0,1]])
y = np.array([[0,0], [1,1], [2,2]])
ax.plot([0,1],[0,2], label='foo', color='b')
ax.plot(x.T, y.T, color='k')
ax.plot(x[0].T, y[0].T, label='bar', color='k')
ax.legend()
plt.show()

Two y axes for a single plot

I'm trying to create a plot with two Y axes (left and right) for the same data, that is, one is a scaled version of the other. I would like also to preserve the tick positions and grid positions, so the grid will match the ticks at both sides.
I'm trying to do this by plotting twice the same data, one as-is and the other scaled, but they are not coincident.
import numpy as np
import matplotlib.pyplot as plt
x = np.arange(17, 27, 0.1)
y1 = 0.05 * x + 100
fig, ax1 = plt.subplots()
ax2 = ax1.twinx()
ax1.plot(x, y1, 'g-')
ax2.plot(x, y1/max(y1), 'g-')
ax1.set_xlabel('X data')
ax1.set_ylabel('Y data', color='g')
ax2.set_ylabel('Y data normalized', color='b')
plt.grid()
plt.show()
Any help will be appreciated.
Not sure if you can achieve this without getting ugly-looking numbers on your normalized axis. But if that doesn't bother you, try adding this to your code:
ax2.set_ylim([ax1.get_ylim()[0]/max(y1),ax1.get_ylim()[1]/max(y1)])
ax2.set_yticks(ax1.get_yticks()/max(y1))
Probably not the most elegant solution, but it scales your axis limits and tick positions similarly to what you do with the data itself so the grid matches both axes.
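A minimal, self-contained sketch of this idea using the question's data. One assumption differs from the two lines above: the ticks are set before the limits, because set_yticks may widen the view limits to include every tick, which would otherwise undo the matching limits. The Agg backend and the name scale are also assumptions.

```python
import numpy as np
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend, no window needed
import matplotlib.pyplot as plt

x = np.arange(17, 27, 0.1)
y1 = 0.05 * x + 100
scale = y1.max()  # normalization factor, as in the question

fig, ax1 = plt.subplots()
ax2 = ax1.twinx()
ax1.plot(x, y1, 'g-')
ax2.plot(x, y1 / scale, 'g-')

# Read the left axis state once, after plotting.
left_ticks = ax1.get_yticks()
left_limits = np.array(ax1.get_ylim())

# Ticks first, limits last: set_yticks can expand the limits to fit all ticks.
ax2.set_yticks(left_ticks / scale)
ax2.set_ylim(left_limits / scale)

ax1.set_xlabel('X data')
ax1.set_ylabel('Y data', color='g')
ax2.set_ylabel('Y data normalized', color='b')
ax1.grid()
```

With both axes sharing the same relative tick positions, the grid drawn from ax1 lines up with the ticks on the right-hand side, and the two curves coincide.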

Adding y=x to a matplotlib scatter plot if I haven't kept track of all the data points that went in

Here's some code that does scatter plot of a number of different series using matplotlib and then adds the line y=x:
import numpy as np, matplotlib.pyplot as plt, matplotlib.cm as cm, pylab
nseries = 10
colors = cm.rainbow(np.linspace(0, 1, nseries))
all_x = []
all_y = []
for i in range(nseries):
    x = np.random.random(12)+i/10.0
    y = np.random.random(12)+i/5.0
    plt.scatter(x, y, color=colors[i])
    all_x.extend(x)
    all_y.extend(y)
# Could I somehow do the next part (add identity_line) if I haven't been keeping track of all the x and y values I've seen?
identity_line = np.linspace(max(min(all_x), min(all_y)),
                            min(max(all_x), max(all_y)))
plt.plot(identity_line, identity_line, color="black", linestyle="dashed", linewidth=3.0)
plt.show()
In order to achieve this I've had to keep track of all the x and y values that went into the scatter plot so that I know where identity_line should start and end. Is there a way I can get y=x to show up even if I don't have a list of all the points that I plotted? I would think that something in matplotlib can give me a list of all the points after the fact, but I haven't been able to figure out how to get that list.
You don't need to know anything about your data per se. You can get away with what your matplotlib Axes object will tell you about the data.
See below:
import numpy as np
import matplotlib.pyplot as plt
# random data
N = 37
x = np.random.normal(loc=3.5, scale=1.25, size=N)
y = np.random.normal(loc=3.4, scale=1.5, size=N)
c = x**2 + y**2
# now sort it just to make it look like it's related
x.sort()
y.sort()
fig, ax = plt.subplots()
ax.scatter(x, y, s=25, c=c, cmap=plt.cm.coolwarm, zorder=10)
Here's the good part:
lims = [
    np.min([ax.get_xlim(), ax.get_ylim()]),  # min of both axes
    np.max([ax.get_xlim(), ax.get_ylim()]),  # max of both axes
]
# now plot both limits against each other
ax.plot(lims, lims, 'k-', alpha=0.75, zorder=0)
ax.set_aspect('equal')
ax.set_xlim(lims)
ax.set_ylim(lims)
fig.savefig('/Users/paul/Desktop/so.png', dpi=300)
Et voilà
In one line:
ax.plot([0,1],[0,1], transform=ax.transAxes)
No need to modify the xlim or ylim.
Starting with matplotlib 3.3 this has been made very simple with the axline method which only needs a point and a slope. To plot x=y:
ax.axline((0, 0), slope=1)
You don't need to look at your data to use this because the point you specify (i.e. here (0,0)) doesn't actually need to be in your data or plotting range.
If you set scalex and scaley to False, it saves a bit of bookkeeping. This is what I have been using lately to overlay y=x:
xpoints = ypoints = plt.xlim()
plt.plot(xpoints, ypoints, linestyle='--', color='k', lw=3, scalex=False, scaley=False)
or if you've got an axis:
xpoints = ypoints = ax.get_xlim()
ax.plot(xpoints, ypoints, linestyle='--', color='k', lw=3, scalex=False, scaley=False)
Of course, this won't give you a square aspect ratio. If you care about that, go with Paul H's solution.
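The limit-based approaches above can be combined into one small function. This is only a sketch: add_identity_line is a hypothetical helper name, and the fixed seed and Agg backend are assumptions. It reads the current limits of both axes, draws y=x across their combined range, and uses scalex=False and scaley=False so that the new line does not change the view.

```python
import numpy as np
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend, no window needed
import matplotlib.pyplot as plt

def add_identity_line(ax, **line_kwargs):
    """Hypothetical helper: overlay y = x without changing the axis limits."""
    xlim, ylim = ax.get_xlim(), ax.get_ylim()
    lo = min(min(xlim), min(ylim))  # smallest value visible on either axis
    hi = max(max(xlim), max(ylim))  # largest value visible on either axis
    # scalex/scaley=False: the line must not trigger autoscaling.
    (line,) = ax.plot([lo, hi], [lo, hi], scalex=False, scaley=False, **line_kwargs)
    return line

rng = np.random.default_rng(0)  # assumption: fixed seed for reproducibility
fig, ax = plt.subplots()
ax.scatter(rng.random(12), rng.random(12) + 0.5)

limits_before = (ax.get_xlim(), ax.get_ylim())
line = add_identity_line(ax, linestyle='--', color='k', lw=3)
limits_after = (ax.get_xlim(), ax.get_ylim())
```

The helper should be called after all data has been plotted, since it only spans the limits that exist at the time of the call.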
