I have created a bubble plot using seaborn and used matplotlib to draw the legend to the right of my seaborn plots. I specified the bubble sizes in my seaborn code using sizes=(1,900), but the scale in the matplotlib legend does not reflect what the plots show. The legend reads from 0 to 45, but the actual data in my plots range from 0 to 900.
import matplotlib.pyplot as plt
import seaborn as sns

# code1, code2 and code3 hold the "Min", "Max" and "Count" columns
fig, (ax1, ax2, ax3) = plt.subplots(1, 3, figsize=(11,4))
sns.scatterplot(y="Min", x="Max",
                size="Count", sizes=(1,900), alpha=0.5,
                color='r', data=code1, ax=ax1, legend=False)
sns.scatterplot(y="Min", x="Max", alpha=0.5,
                color='b', size="Count", sizes=(1,900),
                data=code2, ax=ax2, legend=False)
sns.scatterplot(y="Min", x="Max", alpha=0.5,
                color='g', size="Count", sizes=(1,900),
                data=code3, ax=ax3)
# only the rightmost plot keeps its seaborn legend, which is repositioned here
ax3.legend(loc='upper right', bbox_to_anchor=(1.7,1), labelspacing=2,
           fontsize=14, frameon=False, markerscale=1)
Here is my plot
I was unable to figure out how seaborn structures the legend output for ingestion by matplotlib. I did learn that my data (code1, code2, and code3) had different min and max values, and seaborn maps each dataset's own range onto the sizes argument, so each plot would have needed its own setting: sizes=(1,900) for code1, sizes=(1,300) for code2, and sizes=(1,45) for code3. Because I was using matplotlib to draw the legend to the right of code3's plot, the scaling was specific to the rightmost plot rather than shared by all 3 plots. In the end, I used matplotlib's legend_elements as follows:
fig, (ax1, ax2, ax3) = plt.subplots(1, 3, figsize=(12,4))
scatter = ax1.scatter(y=code1["Min"], x=code1["Max"],
                      s=code1["Count"],
                      color='r', alpha=0.5)
ax2.scatter(y=code2["Min"], x=code2["Max"],
            color='b', s=code2["Count"], alpha=0.5)
ax3.scatter(y=code3["Min"], x=code3["Max"],
            color='g', s=code3["Count"], alpha=0.5)
kw = dict(prop="sizes", num=[10,100,500,900])
legend = ax3.legend(*scatter.legend_elements(**kw), title="Count", fontsize=12,
                    loc='upper right', bbox_to_anchor=(1.5,1), labelspacing=2,
                    frameon=False)
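The approach above works because every panel takes its marker areas straight from Count, so all three panels share one scale. Here is a minimal sketch of the same idea with made-up data, for the case where the raw counts need rescaling before they are used as marker areas. The sample arrays and the SCALE factor are illustrative assumptions. The func argument of legend_elements converts marker areas back to counts so the legend is labelled in data units.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np

# Made-up stand-ins for the Count columns of code1, code2 and code3.
counts = [np.array([10, 100, 500, 900]),
          np.array([5, 150, 300]),
          np.array([1, 20, 45])]
colors = ["r", "b", "g"]

SCALE = 0.5  # hypothetical factor: marker area (points^2) per unit of Count

rng = np.random.default_rng(0)
fig, axes = plt.subplots(1, 3, figsize=(12, 4))
scatters = []
for ax, c, color in zip(axes, counts, colors):
    x, y = rng.random(len(c)), rng.random(len(c))
    # The same count-to-area mapping is used in every panel.
    scatters.append(ax.scatter(x, y, s=c * SCALE, color=color, alpha=0.5))

# Build the legend from the panel with the widest Count range;
# func maps marker areas back to counts for the labels.
handles, labels = scatters[0].legend_elements(
    prop="sizes", num=[10, 100, 500, 900], func=lambda s: s / SCALE)
axes[2].legend(handles, labels, title="Count", loc="upper left",
               bbox_to_anchor=(1.05, 1), labelspacing=2, frameon=False)
```

Only legend values that fall inside the size range of the scatter passed to legend_elements are drawn, which is why the legend is built from the panel with the largest counts.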
I have the following code for a 2x2 subplot:
figure(figsize=(10, 6), dpi=100)
plt.style.use('fivethirtyeight')
fig, ((ax1, ax2), (ax3, ax4)) = plt.subplots(2, 2)
fig.tight_layout(pad=2)
fig.suptitle('Driving Relationships', fontsize=20)
# subplot 221
ax1.scatter(dataset2["duration"]/60, dataset2["distance"]/1000, c='red', alpha=0.5)
ax1.set_title("Duration vs Distance", fontsize=12)
ax1.set_xlabel("Duration (min)",fontsize=8)
ax1.set_ylabel("Distance (km)",fontsize=8)
# subplot 222
ax2.scatter(dataset2["duration"]/60, dataset2["speed_mean"], c='red', alpha=0.5)
ax2.set_title("Duration vs Speed", fontsize=12)
ax2.set_xlabel("Duration (min)",fontsize=8)
ax2.set_ylabel("Mean Speed (m/s)",fontsize=8)
# subplot 223
ax3.scatter(dataset2["ascent_total"], dataset2["acceleration_mean"], c='red', alpha=0.5)
ax3.set_title("Ascent vs Acceleration", fontsize=12)
ax3.set_xlabel("Ascent (m)",fontsize=8)
ax3.set_ylabel("Mean Acceleration (m/s^2)",fontsize=8)
# subplot 224
ax4.scatter(dataset2["descent_total"], dataset2["acceleration_mean"], c='red', alpha=0.5)
ax4.set_title("Descent vs Acceleration", fontsize=12)
ax4.set_xlabel("Descent (m)",fontsize=8)
ax4.set_ylabel("Mean Acceleration (m/s^2)",fontsize=8)
plt.show()
Despite my attempts to improve it, many elements still overlap, as shown below:
I've tried changing the figure size, but nothing happened. I also used fig.tight_layout(), which was not a major improvement even when setting padding values. How can I fix my code to get a more presentable figure?
Try calling it after your plotting code:
plt.tight_layout()
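Apparently, the figure size has to be set differently when using subplots: plt.subplots creates its own figure, so the size given to the separate figure(...) call is not applied to it. The following code did the job: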
fig, ((ax1, ax2), (ax3, ax4)) = plt.subplots(2, 2, figsize=(8, 8))
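To illustrate both answers together, here is a minimal sketch with made-up data. It shows that a separate figure(...) call creates a different figure from the one returned by plt.subplots, and that tight_layout is called only after all titles and labels are in place. The random data and the name stray are illustrative assumptions.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np

# A standalone figure() call creates its own, empty figure ...
stray = plt.figure(figsize=(10, 6), dpi=100)

# ... while plt.subplots creates a second figure, so the size goes here.
fig, axs = plt.subplots(2, 2, figsize=(8, 8))
fig.suptitle("Driving Relationships", fontsize=20)

rng = np.random.default_rng(0)  # made-up data in place of dataset2
for ax, title in zip(axs.flat, ["Duration vs Distance", "Duration vs Speed",
                                "Ascent vs Acceleration", "Descent vs Acceleration"]):
    ax.scatter(rng.random(50), rng.random(50), c="red", alpha=0.5)
    ax.set_title(title, fontsize=12)
    ax.set_xlabel("x label", fontsize=8)
    ax.set_ylabel("y label", fontsize=8)

# Call tight_layout last, once every title and label exists.
fig.tight_layout(pad=2)
```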
I want to draw two subplots in one figure using matplotlib.
My code is as follows:
fig, axes = plt.subplots(ncols=2)
df5.plot(ax=axes[0], kind='bar' ,stacked=True)
ax[0,0].set_title("metagenome data")
plt.xticks(r1, names1)
plt.xlabel("Sample")
plt.legend(loc='upper left', bbox_to_anchor=(1,1), ncol=1)
df_b5.plot(ax=axes[1], kind='bar', stacked=True)
ax[0,1].set_title("Amplicon data")
plt.xticks(r2, names2)
plt.xlabel("Sample")
plt.legend(loc='upper left', bbox_to_anchor=(1,1), ncol=1)
but only the first one appears in the figure.
What am I doing wrong?
Any help would be great!
Thanks!
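One likely cause, assuming no variable named ax is defined elsewhere: the set_title lines refer to ax rather than axes, so the script stops with an error right after the first bar chart is drawn. With ncols=2 the axes array is also one-dimensional, so it is indexed as axes[0] and axes[1]. In addition, plt.xticks, plt.xlabel and plt.legend act on the current axes rather than on a chosen one. Here is a minimal sketch with made-up DataFrames standing in for df5 and df_b5, using the methods of each Axes instead:

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
import pandas as pd

# Made-up stand-ins for df5 and df_b5.
samples = ["S1", "S2", "S3"]
df5 = pd.DataFrame({"taxon_a": [3, 5, 2], "taxon_b": [4, 1, 6]}, index=samples)
df_b5 = pd.DataFrame({"taxon_a": [2, 2, 7], "taxon_b": [5, 3, 1]}, index=samples)

fig, axes = plt.subplots(ncols=2, figsize=(10, 4))
panels = [(df5, "Metagenome data"), (df_b5, "Amplicon data")]
for ax, (df, title) in zip(axes, panels):
    df.plot(ax=ax, kind="bar", stacked=True)  # draw into this specific Axes
    ax.set_title(title)
    ax.set_xlabel("Sample")
    ax.legend(loc="upper left", bbox_to_anchor=(1, 1), ncol=1)
fig.tight_layout()
```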
I create two scatter plots with matplotlib in Python using the code below (the data for the code is here):
import matplotlib.pyplot as plt
from matplotlib.colors import Normalize
fig = plt.figure(figsize=(20,12))
ax1 = fig.add_subplot(111)
ax3 = ax1.twinx()
norm = Normalize(vmin=0.95*min(arr), vmax=1.05*max(arr))
ax1.scatter(x, y1, s=20, c=arr, cmap='Blues_r', norm=norm, marker='x', label='bla1')
ax3.scatter(x, y2, s=(20*(1.1-arr))**3.5, c=arr, cmap='Reds_r', norm=norm, marker='^', label='bla1')
The created fig. looks like this:
So, the dot size (in ax3) and the dot colour (in ax1 and ax3) are taken from arrays containing floats with all kinds of values in the range [0,1]. My question: How do I create a legend that displays the corresponding y-values for, let's say 5 different dot sizes and 5 different colour nuances?
I would like the legend to look like in the figure below (source here), but with the colour bar and size bar put into a single legend, if possible. Thanks for suggestions and code!
# using your data in dataframe df
# create s2
df['s2'] = (20*(1.1-df.arr))**3.5
fig = plt.figure(figsize=(20,12))
ax1 = fig.add_subplot(111)
ax3 = ax1.twinx()
norm = Normalize(vmin=0.95*min(df.arr), vmax=1.05*max(df.arr))
p1 = ax1.scatter(df.x, df.y1, s=20, c=df.arr, cmap='Blues_r', norm=norm, marker='x')
fig.colorbar(p1, label='arr')
p2 = ax3.scatter(df.x, df.y2, s=df.s2, c=df.arr, cmap='Reds_r', norm=norm, marker='^')
fig.colorbar(p2, label='arr')
# create the size legend for red
for x in [15, 80, 150]:
    plt.scatter([], [], c='r', alpha=1, s=x, label=str(x), marker='^')
plt.legend(loc='upper center', bbox_to_anchor=(1.23, 1), ncol=1, fancybox=True, shadow=True, title='s2')
plt.show()
There's no legend for p1 because the size is static.
I think this would be better as two separate plots.
I used Customizing Plot Legends: Legend for Size of Points.
Separate
fig, (ax1, ax2) = plt.subplots(nrows=2, figsize=(20, 10))
norm = Normalize(vmin=0.95*min(df.arr), vmax=1.05*max(df.arr))
p1 = ax1.scatter(df.x, df.y1, s=20, c=df.arr, cmap='Blues_r', norm=norm, marker='x')
fig.colorbar(p1, ax=ax1, label='arr')
p2 = ax2.scatter(df.x, df.y2, s=df.s2, c=df.arr, cmap='Reds_r', norm=norm, marker='^')
fig.colorbar(p2, ax=ax2, label='arr')
# create the size legend for red
for x in [15, 80, 150]:
    plt.scatter([], [], c='r', alpha=1, s=x, label=str(x), marker='^')
plt.legend(loc='upper center', bbox_to_anchor=(1.2, 1), ncol=1, fancybox=True, shadow=True, title='s2')
plt.show()
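The size legends above are labelled with raw marker areas (15, 80, 150). As an alternative, here is a minimal sketch that labels the legend with the arr values the sizes were computed from, using legend_elements with a func that inverts the size formula from the question. The sample arr, x and y2 arrays and the helper names size_from_arr and arr_from_size are illustrative assumptions.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np

# Made-up stand-ins for the question's data.
arr = np.linspace(0.8, 1.0, 30)
x = np.arange(arr.size)
y2 = np.sin(x / 5.0)

def size_from_arr(a):
    """Marker area used in the question: s = (20 * (1.1 - arr)) ** 3.5."""
    return (20 * (1.1 - a)) ** 3.5

def arr_from_size(s):
    """Inverse of size_from_arr, so legend labels show arr values."""
    return 1.1 - s ** (1 / 3.5) / 20

fig, ax = plt.subplots(figsize=(10, 6))
sc = ax.scatter(x, y2, s=size_from_arr(arr), c=arr, cmap="Reds_r", marker="^")
fig.colorbar(sc, ax=ax, label="arr")  # colour scale

# Five legend entries at chosen arr values inside the data range.
handles, labels = sc.legend_elements(
    prop="sizes", num=[0.82, 0.86, 0.90, 0.94, 0.98],
    func=arr_from_size, color="r")
ax.legend(handles, labels, title="arr", loc="upper left",
          bbox_to_anchor=(1.2, 1), labelspacing=1.5)
```

The colour bar and the size legend remain two separate elements here; this sketch does not attempt to merge them into a single legend box.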
I'm plotting a simple scatter plot:
It represents my data correctly; however, there are many data points with coordinates (1.00, 1.00), and in the plot they appear under a single marker (top right corner). I'd like to change the size of every marker according to the number of points it represents. I will appreciate any help. Here's my code:
def saveScatter(figureTitle, xFeature, yFeature, xTitle, yTitle):
    ''' save a scatter plot of xFeatures vs yFeatures '''
    fig = plt.figure(figsize=(8, 6), dpi=300)
    ax = fig.add_subplot(111)
    ax.scatter(dfModuleCPositives[names[xFeature]][:], dfModuleCPositives[names[yFeature]][:], c='r', marker='x', alpha=1, label='Module C Positives')
    ax.scatter(dfModuleCNegatives[names[xFeature]][:], dfModuleCNegatives[names[yFeature]][:], c='g', alpha=0.5, label='Module C Negatives')
    ax.scatter(dfModuleDPositives[names[xFeature]][:], dfModuleDPositives[names[yFeature]][:], c='k', marker='x', alpha=1, label='Module D Positives')
    ax.scatter(dfModuleDNegatives[names[xFeature]][:], dfModuleDNegatives[names[yFeature]][:], c='b', alpha=0.5, label='Module D Negatives')
    ax.set_xlabel(xTitle, fontsize=10)
    ax.set_ylabel(yTitle, fontsize=10)
    ax.set_title(figureTitle)
    ax.grid(True)
    ax.legend(loc="lower right")
    fig.tight_layout()
    plt.show()
    return ax
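One possible way to get count-dependent markers, sketched under the assumption that overlapping points have exactly equal coordinates: collapse repeated (x, y) pairs with np.unique and scale the marker area by the number of repeats. The helper count_duplicates, the sample points and BASE_AREA are hypothetical names and values chosen for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np

def count_duplicates(x, y):
    """Collapse repeated (x, y) pairs; return unique x, unique y and their counts."""
    points = np.column_stack([x, y])
    unique_points, counts = np.unique(points, axis=0, return_counts=True)
    return unique_points[:, 0], unique_points[:, 1], counts

# Made-up data: three points share the coordinates (1.0, 1.0).
x = [1.0, 1.0, 1.0, 0.5]
y = [1.0, 1.0, 1.0, 0.2]
ux, uy, counts = count_duplicates(x, y)

BASE_AREA = 30  # hypothetical marker area (points^2) for a single point
fig, ax = plt.subplots(figsize=(8, 6))
ax.scatter(ux, uy, s=BASE_AREA * counts, c="r", alpha=0.5,
           label="Module C Positives")
ax.legend(loc="lower right")
```

In saveScatter, the same helper could be applied to each of the four DataFrame selections before its ax.scatter call. Coordinates that differ only by rounding noise would need to be rounded first to be counted together.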
I have several seaborn bar plots as below, and I would like to add horizontal lines above each set of bars. I know the y coordinates, but how can I automatically get the xmin and xmax range without needing to look them up manually?
sns.countplot(x="class", hue="who", data=titanic)
plt.hlines(y=30, xmin=-0.5, xmax=0.5, color='black', alpha=0.4)
plt.hlines(y=50, xmin=0.6, xmax=1.5, color='black', alpha=0.4)
plt.hlines(y=200, xmin=1.5, xmax=2.5, color='black', alpha=0.4)
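One way to derive the ranges, sketched here with plain matplotlib bars and made-up heights: read the bar rectangles from ax.patches, assign each bar to its nearest category tick, and take the leftmost and rightmost edges per group. The helper group_spans is a hypothetical name. With seaborn the Axes returned by countplot could be passed in the same way, on the assumption that its bars are stored in ax.patches. Zero-width patches are skipped in case extra artists are present.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np

def group_spans(ax, ticks):
    """Return {tick: (xmin, xmax)} spanning all bars closest to each tick."""
    spans = {}
    for patch in ax.patches:
        if patch.get_width() == 0:
            continue  # ignore empty placeholder patches
        left = patch.get_x()
        right = left + patch.get_width()
        center = (left + right) / 2
        tick = min(ticks, key=lambda t: abs(t - center))
        lo, hi = spans.get(tick, (left, right))
        spans[tick] = (min(lo, left), max(hi, right))
    return spans

# Made-up grouped bars: three categories at x = 0, 1, 2, three bars each.
ticks = [0, 1, 2]
heights = {"man": [20, 40, 180], "woman": [25, 30, 60], "child": [5, 10, 40]}
width = 0.25
fig, ax = plt.subplots()
for i, (name, h) in enumerate(heights.items()):
    ax.bar(np.array(ticks) + (i - 1) * width, h, width=width, label=name)

spans = group_spans(ax, ticks)
for tick, y in zip(ticks, [30, 50, 200]):
    xmin, xmax = spans[tick]
    ax.hlines(y=y, xmin=xmin, xmax=xmax, color="black", alpha=0.4)
```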