I have an issue setting the x labels while using the twinx function. My original data is a pandas DataFrame, df, which has 3 columns: "name" (product name), "sold" (number of items sold), and "revenue". The name column is a pandas Series of strings (like "2 shampoo"), but I can't set it as the x tick labels (see pic below). How can I set the x labels to display the product names?
fig = plt.figure() # Create matplotlib figure
ax = fig.add_subplot(111) # Create matplotlib axes
ax2 = ax.twinx() # Create another axes that shares the same x-axis as ax.
width = 0.4
df.sold.plot(kind='bar', color='red', ax=ax, width=width, position=1, rot=90)
df.revenue.plot(kind='bar', color='blue', ax=ax2, width=width, position=0, rot=90)
# print(type(df['name']), "\n", df['name'])
ax.set_ylabel('Sold')
ax2.set_ylabel('Revenue')
ax.legend(['Sold'], loc='upper left')
ax2.legend(['Revenue'], loc='upper right')
plt.show()
You need to set the x-axis tick labels with set_xticklabels() to show the product names. Add this line after plotting the graph.
ax.set_xticklabels(df['name'])
and you will get the below plot.
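For reference, here is a minimal self-contained sketch of the whole thing. The three sample rows are made up; only the column names come from the question. The Agg backend line is just there so it runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; drop this line for interactive use
import matplotlib.pyplot as plt
import pandas as pd

# Made-up sample data; the column names follow the question.
df = pd.DataFrame({
    "name": ["1 soap", "2 shampoo", "3 conditioner"],
    "sold": [120, 80, 45],
    "revenue": [360.0, 400.0, 315.0],
})

fig, ax = plt.subplots()
ax2 = ax.twinx()  # second y-axis sharing the same x-axis
width = 0.4

df["sold"].plot(kind="bar", color="red", ax=ax, width=width, position=1)
df["revenue"].plot(kind="bar", color="blue", ax=ax2, width=width, position=0)

# Replace the default index labels (0, 1, 2) with the product names.
ax.set_xticklabels(df["name"])

ax.set_ylabel("Sold")
ax2.set_ylabel("Revenue")

tick_texts = [t.get_text() for t in ax.get_xticklabels()]
print(tick_texts)
```

The call has to come after both bar plots, because each df.plot call resets the tick labels to the index values.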
How can I set the labels on the extra axes?
The ticks and labels should be the same on all 4 axes. I'm doing something wrong... Thanks!
import matplotlib.pyplot as plt
plt.rcParams['text.usetex'] = True
plt.figure(figsize=(5,5))
f, ax1 = plt.subplots()
ax2 = ax1.twinx()
ax3 = ax1.twiny()
plt.show()
# create reusable ticks and labels
ticks = [0,1/2,3.14159/4,3.14159/2,1]
labels = [r"$0$", r"$\displaystyle\frac{1}{2}$", r"$\displaystyle\frac{\pi}{4}$", r"$\displaystyle\frac{\pi}{2}$", r"$1$"]
# Version 1: twinx() + xaxis.set_ticks()
plt.figure(figsize=(5,5))
f, ax1 = plt.subplots()
ax2 = ax1.twinx()
ax3 = ax1.twiny()
ax1.xaxis.set_ticks(ticks, labels=labels)
ax1.yaxis.set_ticks(ticks, labels=labels)
ax2.xaxis.set_ticks(ticks, labels=labels)
ax3.yaxis.set_ticks(ticks, labels=labels)
plt.show()
# Version 2: twinx() + set_xticklabels()
plt.figure(figsize=(5,5))
f, ax1 = plt.subplots()
ax2 = ax1.twinx()
ax3 = ax1.twiny()
ax1.set_xticks(ticks)
ax1.set_xticklabels(labels)
ax1.set_yticks(ticks)
ax1.set_yticklabels(labels)
ax2.set_xticks(ticks)
ax2.set_xticklabels(labels)
ax3.set_yticks(ticks)
ax3.set_yticklabels(labels)
plt.show()
Confused: How come ax1 has both xaxis and yaxis, while ax2, ax3 do not appear to?
An unintuitive solution based on the documentation of matplotlib.axes.Axes.twinx:
Create a new Axes with an invisible x-axis and an independent y-axis
positioned opposite to the original one (i.e. at right).
This means that, unintuitively (at least for me), you have to switch x/y at the .twin call.
Unintuitive not with respect to the general matplotlib twinx functionality, but with respect to this kind of manual tick and label assignment.
To highlight that a bit more I used ax2_x and ax3_y in the code.
Disclaimer: Not sure if that will break your plot intention when data is added.
Probably at least you have to take special care with the data assignment to those twin axes - keeping that "axis switch" in mind.
Also keep that "axis switch" in mind when assigning different ticks and labels to the x/y axes.
But for now I think that's the plot you were looking for:
Code:
import matplotlib.pyplot as plt
plt.rcParams['text.usetex'] = True
# create reusable ticks and labels
ticks = [0,1/2,3.14159/4,3.14159/2,1]
labels = [r"$0$", r"$\displaystyle\frac{1}{2}$", r"$\displaystyle\frac{\pi}{4}$", r"$\displaystyle\frac{\pi}{2}$", r"$1$"]
# pass figsize to subplots(); a separate plt.figure() call would only open an extra empty figure
f, ax1 = plt.subplots(figsize=(5,5))
ax1.xaxis.set_ticks(ticks, labels=labels)
ax1.yaxis.set_ticks(ticks, labels=labels)
ax2_x = ax1.twiny() # switch
ax3_y = ax1.twinx() # switch
ax2_x.xaxis.set_ticks(ticks, labels=labels)
ax3_y.yaxis.set_ticks(ticks, labels=labels)
plt.show()
Or switch the x/yaxis.set_ticks - with the same effect:
On second thought, I assume that's the preferred way to do it, especially when data comes into play.
ax2_x = ax1.twinx()
ax3_y = ax1.twiny()
ax2_x.yaxis.set_ticks(ticks, labels=labels) # switch
ax3_y.xaxis.set_ticks(ticks, labels=labels) # switch
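To see what the quoted docstring means in practice, here is a small check of which axis each twin keeps visible. This is only a sketch; the names ax_right and ax_top are mine, not from the question.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; drop this line for interactive use
import matplotlib.pyplot as plt

fig, ax1 = plt.subplots()
ax_right = ax1.twinx()  # shares x with ax1 and owns the right-hand y-axis
ax_top = ax1.twiny()    # shares y with ax1 and owns the top x-axis

# Each twin hides the axis it shares and keeps the independent one visible.
visible = {
    "twinx.xaxis": ax_right.xaxis.get_visible(),
    "twinx.yaxis": ax_right.yaxis.get_visible(),
    "twiny.xaxis": ax_top.xaxis.get_visible(),
    "twiny.yaxis": ax_top.yaxis.get_visible(),
}
print(visible)
```

So ticks set on the hidden axis of a twin never show up on that twin, which is why the twinx Axes needs its yaxis set and the twiny Axes needs its xaxis set.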
In case you don't intend to use the twin axis functionality (that is, having different data with different scaling assigned to those axes) but 'only' want the ticks and labels on all 4 axes for better plot readability:
Solution based on ImportanceOfBeingErnest's answer, with the same plot result:
import matplotlib.pyplot as plt
plt.rcParams['text.usetex'] = True
# create reusable ticks and labels
ticks = [0,1/2,3.14159/4,3.14159/2,1]
labels = [r"$0$", r"$\displaystyle\frac{1}{2}$", r"$\displaystyle\frac{\pi}{4}$", r"$\displaystyle\frac{\pi}{2}$", r"$1$"]
# pass figsize to subplots(); a separate plt.figure() call would only open an extra empty figure
f, ax1 = plt.subplots(figsize=(5,5))
ax1.xaxis.set_ticks(ticks, labels=labels)
ax1.yaxis.set_ticks(ticks, labels=labels)
ax1.tick_params(axis="x", bottom=True, top=True, labelbottom=True, labeltop=True)
ax1.tick_params(axis="y", left=True, right=True, labelleft=True, labelright=True)
plt.show()
ax2 = ax1.twinx() shares the x-axis with ax1.
ax3 = ax1.twiny() shares the y-axis with ax1.
As a result, the two lines where you set ax2.xaxis and ax3.yaxis's ticks and ticklabels are redundant with the changes you already applied on ax1.
import matplotlib.pyplot as plt
plt.rcParams['text.usetex'] = False # My computer doesn't have LaTeX, don't mind me.
# Create reusable ticks and labels.
ticks = [0, 1/2, 3.14159/4, 3.14159/2, 1]
labels = [r"$0$", r"$\frac{1}{2}$", r"$\frac{\pi}{4}$", r"$\frac{\pi}{2}$", r"$1$"]
# Set the ticks and ticklabels for each axis.
fig = plt.figure(figsize=(5,5))
ax1 = fig.add_subplot()
ax2 = ax1.twinx()
ax3 = ax1.twiny()
for axis in (ax1.xaxis,
             ax1.yaxis,
             ax2.yaxis,
             ax3.xaxis):
    axis.set_ticks(ticks)
    axis.set_ticklabels(labels)
fig.show()
Notice that if I comment out the work on ax2 and ax3, we get exactly what you have in your question:
for axis in (ax1.xaxis, ax1.yaxis,
             # ax2.yaxis,
             # ax3.xaxis,
             ):
    axis.set_ticks(ticks)
    axis.set_ticklabels(labels)
Now let's ruin ax1 via modifications on ax2, just to show that the link between twins works well:
ax2.xaxis.set_ticks(range(10))
ax2.xaxis.set_ticklabels(tuple("abcdefghij"))
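The sharing can also be checked without looking at a plot. A minimal sketch (the tick and limit values are arbitrary, chosen only for the demonstration):

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; drop this line for interactive use
import matplotlib.pyplot as plt

fig, ax1 = plt.subplots()
ax2 = ax1.twinx()  # x is shared with ax1, y is independent

# Ticks set on ax1's x-axis are seen by ax2, because the x-axis is shared.
ax1.set_xticks([0, 0.5, 1])
shared_ticks = list(ax2.get_xticks())

# x limits set on the twin propagate back to ax1.
ax2.set_xlim(0, 2)
shared_xlim = ax1.get_xlim()

# The y-axes are independent: changing ax2 leaves ax1 alone.
ax1.set_ylim(0, 1)
ax2.set_ylim(0, 100)
independent_ylim = ax1.get_ylim()

print(shared_ticks, shared_xlim, independent_ylim)
```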
I have a pandas dataframe with a 'frequency_mhz' variable and a 'type' variable. I want to create a dist plot using seaborn that overlays all of the frequencies but changes the colour based on the 'type'.
small_df = df[df.small_airport.isin(['Y'])]
medium_df = df[df.medium_airport.isin(['Y'])]
large_df = df[df.large_airport.isin(['Y'])]
plt.figure()
sns.distplot(small_df['frequency_mhz'], color='red')
plt.figure()
sns.distplot(medium_df['frequency_mhz'], color='green')
plt.figure()
sns.distplot(large_df['frequency_mhz'])
Is there a way I can overlay the 3 into one plot? Or is there a way I've missed to change the colour of the bars based on another variable, as you can with 'hue=' in other plots?
You can specify ax as kwarg to superimpose your plots:
small_df = df[df.small_airport.isin(['Y'])]
medium_df = df[df.medium_airport.isin(['Y'])]
large_df = df[df.large_airport.isin(['Y'])]
ax = sns.distplot(small_df['frequency_mhz'], color='red')
sns.distplot(medium_df['frequency_mhz'], color='green', ax=ax)
sns.distplot(large_df['frequency_mhz'], ax=ax)
plt.show()
I am creating two plots using matplotlib, each of them is a subplot showing two metrics on the same axis.
I'm trying to run them so they show as two charts but in one graphic, so that when I save the graphic I see both plots. At the moment, running the second plot overwrites the first in memory so I can only ever save the second.
How can I plot them together?
My code is below.
plot1 = plt.figure()
fig,ax1 = plt.subplots()
ax1.plot(dfSat['time'],dfSat['wind_at_altitude'], 'b-', label = "speed", linewidth = 5.0)
plt.title('Wind Speeds - Saturday - {}'.format(windloc))
plt.xlabel('Time of day')
plt.ylabel('Wind speed (mph)')
ax1.plot(dfSat['time'],dfSat['gust_at_altitude'], 'r-', label = "gust", linewidth = 5.0)
plt.legend(loc="upper right")
ax1.text(0.05, 0.95, calcmeassat, transform=ax1.transAxes, fontsize=30,
verticalalignment='top')
plt.ylim((0,100))
plot2 = plt.figure()
fig,ax2 = plt.subplots()
ax2.plot(dfSun['time'],dfSun['wind_at_altitude'], 'b-', label = "speed", linewidth = 5.0)
plt.title('Wind Speeds - Sunday - {}'.format(windloc))
plt.xlabel('Time of day')
plt.ylabel('Wind speed (mph)')
ax2.plot(dfSun['time'],dfSun['gust_at_altitude'], 'r-', label = "gust", linewidth = 5.0)
plt.legend(loc="upper right")
ax2.text(0.05, 0.95, calcmeassun, transform=ax2.transAxes, fontsize=30,
verticalalignment='top')
plt.ylim((0,100))
As mentioned, in your case you only need one level of subplots, e.g., nrows=1, ncols=2.
However, in matplotlib 3.4+ there is such a thing as "subplotting subplots" called subfigures, which makes it easier to implement nested layouts, e.g.:
How to create row titles for subplots
How to share colorbars within some subplots
How to share xlabels within some subplots
Subplots
For your simpler use case, create 1x2 subplots with ax1 on the left and ax2 on the right:
# create 1x2 subplots
fig, (ax1, ax2) = plt.subplots(nrows=1, ncols=2, figsize=(16, 4))
# plot saturdays on the left
dfSat.plot(ax=ax1, x='date', y='temp_min')
dfSat.plot(ax=ax1, x='date', y='temp_max')
ax1.set_ylim(-20, 50)
ax1.set_title('Saturdays')
# plot sundays on the right
dfSun.plot(ax=ax2, x='date', y='temp_min')
dfSun.plot(ax=ax2, x='date', y='temp_max')
ax2.set_ylim(-20, 50)
ax2.set_title('Sundays')
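Applied to the wind data from the question, the same 1x2 layout might look like the sketch below. The numbers are invented; only the column names and labels are taken from the question, and the figure is written to an in-memory buffer here where you would normally pass a file name to savefig.

```python
import io
import matplotlib
matplotlib.use("Agg")  # headless backend; drop this line for interactive use
import matplotlib.pyplot as plt
import pandas as pd

# Invented sample values; column names follow the question.
dfSat = pd.DataFrame({"time": [9, 12, 15, 18],
                      "wind_at_altitude": [12, 18, 22, 15],
                      "gust_at_altitude": [20, 27, 35, 24]})
dfSun = pd.DataFrame({"time": [9, 12, 15, 18],
                      "wind_at_altitude": [8, 11, 14, 10],
                      "gust_at_altitude": [15, 19, 23, 17]})

# One figure, two Axes: both days end up in the same saved image.
fig, (ax1, ax2) = plt.subplots(nrows=1, ncols=2, figsize=(16, 4), sharey=True)
for ax, frame, day in ((ax1, dfSat, "Saturday"), (ax2, dfSun, "Sunday")):
    ax.plot(frame["time"], frame["wind_at_altitude"], "b-", label="speed")
    ax.plot(frame["time"], frame["gust_at_altitude"], "r-", label="gust")
    ax.set_title(f"Wind Speeds - {day}")
    ax.set_xlabel("Time of day")
    ax.set_ylim(0, 100)
    ax.legend(loc="upper right")
ax1.set_ylabel("Wind speed (mph)")

# Save once, after both subplots are drawn.
buffer = io.BytesIO()
fig.savefig(buffer, format="png")
png_bytes = buffer.getvalue()
```

Using the ax methods (ax.set_title, ax.set_xlabel, ...) instead of plt.title and friends avoids depending on which Axes happens to be current.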
Subfigures
Say you want something more complicated like having the left side show 2012 with its own suptitle and right side to show 2015 with its own suptitle.
Create 1x2 subfigures (left subfig_l and right subfig_r) with 2x1 subplots on the left (top ax_lt and bottom ax_lb) and 2x1 subplots on the right (top ax_rt and bottom ax_rb):
# create 1x2 subfigures
fig = plt.figure(constrained_layout=True, figsize=(12, 5))
(subfig_l, subfig_r) = fig.subfigures(nrows=1, ncols=2, wspace=0.07)
# create top/bottom axes in left subfig
(ax_lt, ax_lb) = subfig_l.subplots(nrows=2, ncols=1)
# plot 2012 saturdays on left-top axes
dfSat12 = dfSat.loc[dfSat['date'].dt.year.eq(2012)]
dfSat12.plot(ax=ax_lt, x='date', y='temp_min')
dfSat12.plot(ax=ax_lt, x='date', y='temp_max')
ax_lt.set_ylim(-20, 50)
ax_lt.set_ylabel('Saturdays')
# plot 2012 sundays on left-bottom axes
dfSun12 = dfSun.loc[dfSun['date'].dt.year.eq(2012)]
dfSun12.plot(ax=ax_lb, x='date', y='temp_min')
dfSun12.plot(ax=ax_lb, x='date', y='temp_max')
ax_lb.set_ylim(-20, 50)
ax_lb.set_ylabel('Sundays')
# set suptitle for left subfig
subfig_l.suptitle('2012', size='x-large', weight='bold')
# create top/bottom axes in right subfig
(ax_rt, ax_rb) = subfig_r.subplots(nrows=2, ncols=1)
# plot 2015 saturdays on right-top axes
dfSat15 = dfSat.loc[dfSat['date'].dt.year.eq(2015)]
dfSat15.plot(ax=ax_rt, x='date', y='temp_min')
dfSat15.plot(ax=ax_rt, x='date', y='temp_max')
ax_rt.set_ylim(-20, 50)
ax_rt.set_ylabel('Saturdays')
# plot 2015 sundays on right-bottom axes
dfSun15 = dfSun.loc[dfSun['date'].dt.year.eq(2015)]
dfSun15.plot(ax=ax_rb, x='date', y='temp_min')
dfSun15.plot(ax=ax_rb, x='date', y='temp_max')
ax_rb.set_ylim(-20, 50)
ax_rb.set_ylabel('Sundays')
# set suptitle for right subfig
subfig_r.suptitle('2015', size='x-large', weight='bold')
Sample data for reference:
import pandas as pd
from vega_datasets import data
df = data.seattle_weather()
df['date'] = pd.to_datetime(df['date'])
dfSat = df.loc[df['date'].dt.weekday.eq(5)]
dfSun = df.loc[df['date'].dt.weekday.eq(6)]
It doesn't work like that. Subplots are what the name says: plots inside a main plot.
That means if you need two subplots, you need one plot (figure) containing both of them.
# figure object NOT plot object
# useful when you want only one plot NO subplots
fig = plt.figure()
# 2 subplots inside 1 plot
# 1 row, 2 columns
fig, [ax1, ax2] = plt.subplots(1, 2)
# then call plotting method on each axis object to
# create plot on that subplot
sns.histplot(...., ax=ax1)
sns.violinplot(..., ax=ax2)
# or using matplotlib like this
ax1.plot()
ax2.plot()
Learn more about subplots
I'm plotting 2 dataframes with this method:
df.plot(ax=ax, x='x', y='y', label = "first_df")
df2.plot(ax=ax, x='x', y='y', label = "second_df")
And I add some axvspan calls:
plt.axvspan(x, y, label = value)
Since I have multiple axvspan calls and some of the values are duplicated, I am using this code to display each value only once.
handles, labels = plt.gca().get_legend_handles_labels()
by_label = dict(zip(labels, handles))
plt.legend(by_label.values(), by_label.keys(),loc='upper center', bbox_to_anchor=(1.1, 0.8))
But when I display the legend, I have one legend for the dfs and another for the axvspan calls. I think it is because I use plot for the dfs and plt for axvspan, so I don't know how to merge the legends.
EDIT:
I tried this with ax1 and ax2 for my dfs:
h1, l1 = ax1.get_legend_handles_labels()
h2, l2 = ax2.get_legend_handles_labels()
ax1.legend((h1+h2), l1+l2, loc='upper center', bbox_to_anchor=(1.1, 0.8))
It's working, but I have duplicates in the legend. How can I remove them?
Instead of using df.plot, which creates a legend whenever it's called, you can use ax.plot:
ax.plot(df['x'], df['y'], label='first df')
ax.plot(df2['x'], df2['y'], label='second df')
ax.legend()
In the following example, the legend is combined by plotting everything onto the same Axes, including the axvspan with ax.axvspan, and then running ax.legend to add the axvspan entry to the existing legend:
import numpy as np # v 1.19.2
import pandas as pd # v 1.2.3
# Create sample dataset
rng = np.random.default_rng(seed=1)
size = 30
df = pd.DataFrame(dict(x=range(size), y=rng.integers(0, 100, size=size)))
df2 = pd.DataFrame(dict(x=range(size), y=rng.integers(10, 50, size=size)))
# Plot data onto a single Axes with a combined legend using multiple plotting functions
ax = df.plot(x='x', y='y', label='first_df', figsize=(8,4))
df2.plot(ax=ax, x='x', y='y', label = 'second_df')
ax.axvspan(10, 15, label='span', facecolor='black', edgecolor=None, alpha=0.2)
ax.legend(loc='upper right');
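For the duplicate labels mentioned in the question's edit, the dict(zip(labels, handles)) idiom from the question still applies once everything lives on one Axes. A minimal sketch with invented data and span labels:

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; drop this line for interactive use
import matplotlib.pyplot as plt

fig, ax = plt.subplots()
ax.plot([0, 10, 20, 30], [5, 20, 10, 30], label="first_df")
ax.plot([0, 10, 20, 30], [15, 5, 25, 10], label="second_df")

# Several spans, two of which share the label "A" (invented values).
for start, end, label in [(2, 5, "A"), (8, 12, "B"), (20, 24, "A")]:
    ax.axvspan(start, end, alpha=0.2, label=label)

# Collect every labelled artist, then keep one handle per distinct label.
handles, labels = ax.get_legend_handles_labels()
by_label = dict(zip(labels, handles))  # repeated labels collapse into one key
legend = ax.legend(by_label.values(), by_label.keys(), loc="upper right")

legend_labels = [t.get_text() for t in legend.get_texts()]
print(legend_labels)
```

With two Axes (ax1, ax2), the same dict trick works on the concatenated lists h1 + h2 and l1 + l2.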
I have been playing a bit with plt.legend() and ax.legend() and legend from seaborn itself and I think I'm missing something.
My first question is, could someone please explain to me how those go together, how they work and if I have subplots, what is superior to what? Meaning can I set a general definition (eg. have this legend in all subplots in this loc) and then overwrite this definition for specific subplots (eg by ax.legend())?
My second question is practical and shows my problems. Let's use the seaborn tips data set to illustrate them:
import seaborn as sns
import matplotlib.pyplot as plt
tips = sns.load_dataset("tips")
# define sizes for labels, ticks, text, ...
# as defined here https://stackoverflow.com/questions/3899980/how-to-change-the-font-size-on-a-matplotlib-plot
SMALL_SIZE = 10
MEDIUM_SIZE = 14
BIGGER_SIZE = 18
plt.rc('font', size=SMALL_SIZE) # controls default text sizes
plt.rc('axes', titlesize=SMALL_SIZE) # fontsize of the axes title
plt.rc('axes', labelsize=BIGGER_SIZE) # fontsize of the x and y labels
plt.rc('xtick', labelsize=MEDIUM_SIZE) # fontsize of the tick labels
plt.rc('ytick', labelsize=MEDIUM_SIZE) # fontsize of the tick labels
plt.rc('legend', fontsize=SMALL_SIZE) # legend fontsize
plt.rc('figure', titlesize=BIGGER_SIZE) # fontsize of the figure title
# create figure
fig, ((ax1, ax2), (ax3, ax4)) = plt.subplots(nrows=2, ncols=2, figsize=(16,12))
ylim = (0,1)
sns.boxplot(x= 'day', y= 'tip', hue="sex",
data=tips, palette="Set2", ax=ax1)
sns.swarmplot(x= 'day', y= 'tip', hue="sex",
data=tips, palette="Set2", ax=ax2)
ax2.legend(loc='upper right')
sns.boxplot(x= 'day', y= 'total_bill', hue="sex",
data=tips, palette="Set2", ax=ax3)
sns.swarmplot(x= 'day', y= 'total_bill', hue="sex",
data=tips, palette="Set2", ax=ax4)
plt.suptitle('Smokers')
plt.legend(loc='upper right')
plt.savefig('test.png', dpi = 150)
If I use simply seaborn, I get a legend as in Subplot 1 and 3 -- it has the 'hue' label and follows defined font size. However, I'm not able to control its location (it has some default, see the difference between 1 and 3). If I use ax.legend() as in Subplot 2, then I can modify specific subplot but I lose the seaborn 'hue' feature (notice that the "sex" disappears) and it does not follow my font definitions. If I use plt.legend(), it only affects the Subplot before it (Subplot 4 in this case).
How can I unite all this? E.g. to have one definition for all subplots, or to control the seaborn default? To make the goal clear: how can I get a legend as in Subplot 1, where the labels come automatically from the data (but I can change them), with the location, font size, ... set the same for all the subplots (e.g. upper right, font size of 10, ...)?
Thank you for help and explanation.
Seaborn legends are always created with the keyword loc="best". This is hardcoded in the source code. You could change the source code, e.g. in this line, and replace the call with ax.legend(). Then setting the rc parameter in your code like
plt.rc('legend', loc="upper right")
would give the desired output.
The only other option is to create the legend manually, like you do in the second case,
ax2.legend(loc="upper right", title="sex", title_fontsize="x-large")
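If your seaborn version is 0.11.2 or newer, another possibility is sns.move_legend, which repositions the legend seaborn already drew and keeps its title. The sketch below loops over the Axes so every subplot gets the same location. The tiny data frame is invented so the example runs without downloading the tips data set.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; drop this line for interactive use
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

# Small invented stand-in for the tips data set (same column names).
tips = pd.DataFrame({
    "day": ["Thur", "Thur", "Fri", "Fri", "Sat", "Sat", "Sun", "Sun"] * 2,
    "sex": ["Male", "Female"] * 8,
    "tip": [2.0, 3.1, 1.5, 2.2, 3.5, 4.0, 2.8, 3.3,
            2.4, 2.9, 1.9, 2.6, 3.0, 4.4, 3.1, 3.9],
    "total_bill": [15.0, 20.5, 12.0, 18.2, 25.0, 30.1, 22.3, 27.8,
                   16.4, 19.9, 13.5, 17.0, 24.2, 31.5, 21.0, 28.6],
})

fig, (ax1, ax2) = plt.subplots(ncols=2, figsize=(12, 5))
sns.boxplot(x="day", y="tip", hue="sex", data=tips, palette="Set2", ax=ax1)
sns.boxplot(x="day", y="total_bill", hue="sex", data=tips, palette="Set2", ax=ax2)

# Apply one legend location to every subplot; the "sex" title is kept.
for ax in (ax1, ax2):
    sns.move_legend(ax, "upper right")

legend_titles = [ax.get_legend().get_title().get_text() for ax in (ax1, ax2)]
print(legend_titles)
```

Font sizes are still taken from the rc settings (plt.rc('legend', fontsize=...)) or can be passed as extra keyword arguments to move_legend.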