Matplotlib Scatterplot legend for points - python

I am programmatically creating a scatterplot like this:
(Ipython sample code)
%matplotlib inline
import matplotlib.pyplot as plt
fig = plt.figure()
ax = fig.add_subplot(1, 1, 1, facecolor="1.0")
d1 = list(range(1, 11))
d2 = list(range(1, 11))
dcolor = ['red','red','red','green','green','green','blue','blue','blue', 'blue']
colordict = {'red': 'monkey', 'green':'whale', 'blue':'cat'}
ax.scatter(d1, d2, alpha=0.8, c=dcolor, edgecolors='none', s=30)
I would like to add a legend entry for each different point color, so that the legend contains a point in the given color and the name from colordict. Is that possible without splitting the creation of the scatterplot into multiple calls to scatter()? Since this happens in an automated library, I would rather avoid separate calls to scatter().

I would probably do the following.
%matplotlib inline
import matplotlib.pyplot as plt
fig = plt.figure()
ax = fig.add_subplot(1, 1, 1, facecolor="1.0")
g1 = ([1,2,3], [1,2,3])
g2 = ([4,5,6], [4,5,6])
g3 = ([7,8,9,10], [7,8,9,10])
data = (g1, g2, g3)
colors = ("red", "green", "blue")
groups = ("monkey", "whale", "cat")
# One scatter call per group, so each group gets its own legend entry
for data, color, group in zip(data, colors, groups):
    x, y = data
    ax.scatter(x, y, alpha=0.8, c=color, edgecolors='none', s=30, label=group)
plt.legend(loc=2)

I like keeping the data and its symbols (color, label) even tighter than cel does. I find the code more readable and more checkable, and often I'm getting them together out of some datasource anyway:
fig = plt.figure()
ax = fig.add_subplot(1, 1, 1, facecolor="1.0")
# Each entry carries its own data, color and label
zoo = []
zoo.append(([4,5,6], [4,5,6], "blue", "ape"))
zoo.append(([1,2,3], [1,2,3], "red", "monkey"))
for x, y, c, l in zoo:
    plt.scatter(x, y, c=c, label=l)
plt.legend(loc="upper left")

Finally, I have used the following code:
fig = plt.figure()
ax = fig.add_subplot(1, 1, 1, facecolor="1.0")
d1 = list(range(1, 11))
d2 = list(range(1, 11))
dcolor = ['red','red','red','green','green','green','blue','blue','blue', 'blue']
ax.scatter(d1, d2, alpha=0.8, c=dcolor, edgecolors='none', s=30)
import matplotlib.patches as mpatches
# Proxy patches for the legend; the labels passed to fig.legend() below are the ones displayed
patch = mpatches.Patch(color='red', label='a')
patch2 = mpatches.Patch(color='red', label='a')
fig.legend([patch, patch2], ['abc', 'xyz'], loc='lower center', ncol=5, labelspacing=0.)
Here it is not yet in a loop, but that is easily doable.
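For illustration, the loop version could look roughly like the sketch below. It keeps the single scatter() call from the question and builds one proxy handle per entry of colordict. Using marker-only Line2D handles instead of Patch objects is my own substitution, so that the legend shows points rather than rectangles; the Agg backend line is only there so the sketch runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; drop this line in a notebook
import matplotlib.pyplot as plt
from matplotlib.lines import Line2D

d1 = list(range(1, 11))
d2 = list(range(1, 11))
dcolor = ['red', 'red', 'red', 'green', 'green', 'green',
          'blue', 'blue', 'blue', 'blue']
colordict = {'red': 'monkey', 'green': 'whale', 'blue': 'cat'}

fig, ax = plt.subplots()
# One scatter call for all points, as in the question.
ax.scatter(d1, d2, alpha=0.8, c=dcolor, edgecolors='none', s=30)

# One proxy handle per dictionary entry: a marker-only Line2D in that color.
handles = [
    Line2D([], [], linestyle='none', marker='o', color=color, label=name)
    for color, name in colordict.items()
]
legend = ax.legend(handles=handles, loc='upper left')
```

The proxy handles are never drawn on the plot itself; they only tell the legend which marker and label to show.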

Related

Python: Using a dictionary for matplotlib colors and hatching

I am trying to use a dictionary to control the color and hatching of a fill on a matplotlib plot using fill_betweenx().
I have had success using lists like in the example below. However, I am struggling to work out how I could use a dictionary in a similar way. The intention is that the number in the first part of the dictionary relates to a column in a dataframe and when I come to plot the data it should lookup the relevant hatch and color arguments from the dictionary.
What would be the best way to achieve this?
Here is an example dictionary that I am wanting to use in place of the lists
example_dict = {1: {'lith': 'sandstone', 'hatch': '.', 'color': 'yellow'},
                2: {'lith': 'fine sand', 'hatch': '..', 'color': 'yellow'},
                3: {'lith': 'mudstone', 'hatch': '-', 'color': 'green'},
                4: {'lith': 'laminated shale', 'hatch': '--', 'color': 'green'}}
Working code using lists.
import matplotlib.pyplot as plt
y = [0, 1]
x = [1, 1]
fig, axes = plt.subplots(ncols=4, nrows=1, sharex=True, sharey=True,
                         figsize=(10, 5), subplot_kw={'xticks': [], 'yticks': []})
colors = ['yellow', 'yellow', 'green', 'green']
hatchings = ['.', '..', '-', '--']
for ax, color, hatch in zip(axes.flat, colors, hatchings):
    ax.plot(x, y)
    ax.fill_betweenx(y, 0, 1, facecolor=color, hatch=hatch)
    ax.set_xlim(0, 0.1)
    ax.set_ylim(0, 1)
    ax.set_title(str(hatch))
plt.tight_layout()
plt.show()
This generates:
Just replace what you iterate over to be the dict keys and then access the color or hatch within the code:
import matplotlib.pyplot as plt
y = [0, 1]
x = [1, 1]
fig, axes = plt.subplots(ncols=4, nrows=1, sharex=True, sharey=True,
                         figsize=(10, 5), subplot_kw={'xticks': [], 'yticks': []})
example_dict = {1: {'lith': 'sandstone', 'hatch': '.', 'color': 'yellow'},
                2: {'lith': 'fine sand', 'hatch': '..', 'color': 'yellow'},
                3: {'lith': 'mudstone', 'hatch': '-', 'color': 'green'},
                4: {'lith': 'laminated shale', 'hatch': '--', 'color': 'green'}}
# Iterate over the dict keys and look up color and hatch for each subplot
for ax, key in zip(axes.flat, example_dict.keys()):
    ax.plot(x, y)
    ax.fill_betweenx(y, 0, 1, facecolor=example_dict[key]['color'], hatch=example_dict[key]['hatch'])
    ax.set_xlim(0, 0.1)
    ax.set_ylim(0, 1)
    ax.set_title(str(example_dict[key]['hatch']))
plt.tight_layout()
plt.show()
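Since the question mentions that the dictionary keys correspond to values in a dataframe column, here is a rough sketch of looking the style up by code rather than walking the keys in order. The lith_codes list is a hypothetical stand-in for that column (it is not in the original post), and the Agg backend line is only there so the sketch runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; drop this line in a notebook
import matplotlib.pyplot as plt

example_dict = {1: {'lith': 'sandstone', 'hatch': '.', 'color': 'yellow'},
                2: {'lith': 'fine sand', 'hatch': '..', 'color': 'yellow'},
                3: {'lith': 'mudstone', 'hatch': '-', 'color': 'green'},
                4: {'lith': 'laminated shale', 'hatch': '--', 'color': 'green'}}

# Hypothetical stand-in for the dataframe column of lithology codes.
lith_codes = [3, 1, 4, 2]

y = [0, 1]
fig, axes = plt.subplots(ncols=len(lith_codes), nrows=1, sharex=True, sharey=True,
                         figsize=(10, 5), subplot_kw={'xticks': [], 'yticks': []})
for ax, code in zip(axes.flat, lith_codes):
    style = example_dict[code]  # look up hatch and color by code
    ax.fill_betweenx(y, 0, 1, facecolor=style['color'], hatch=style['hatch'])
    ax.set_xlim(0, 0.1)
    ax.set_ylim(0, 1)
    ax.set_title(style['lith'])

titles = [ax.get_title() for ax in axes.flat]
```

With a real dataframe, the same lookup would apply to each value of the relevant column; a code missing from the dictionary raises a KeyError, which is usually the behaviour you want while the mapping is still being built.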

Putting one color bar for several subplots from different dataframes

I looked everywhere and nothing really helped.
Here is my code:
fig = plt.figure(figsize=(12, 6))
marker_colors = pca_data2['Frame']
fig.suptitle('PCA')
plt.subplot(1, 2, 1)
x = pca_data2.PC_1
y = pca_data2.PC_2
plt.scatter(x, y, c = marker_colors, cmap = "inferno")
plt.colorbar()
plt.subplot(1, 2, 2)
x1 = pca_data.PC_1
y1 = pca_data.PC_2
plt.scatter(x1, y1, c = marker_colors, cmap = "inferno")
plt.colorbar()
plt.show()
pca_data and pca_data2 are two completely different dataframes from two completely different things, but I need them side by side with a single color bar on the right side for both.
That is what the figure looks like.
When I try to remove the first plt.colorbar(), the two subplots end up with uneven sizes.
I would really appreciate the help.
None of the other answers seems to mention that you can tell the colorbar which axes it should be drawn on, so here is a simple example of how I would do it.
The benefits of this are:
it's much clearer to read
you have complete control over the size of the colorbar
you can extend this easily to any grid of subplots and any position of the colorbar
import numpy as np
import matplotlib.pyplot as plt
from matplotlib.gridspec import GridSpec
# generate some data
data, data1 = np.random.rand(10,10), np.random.rand(10,10)
x, y = np.meshgrid(np.linspace(0,1,10), np.linspace(0,1,10))
# initialize a plot-grid with 3 axes (2 plots and 1 colorbar)
gs = GridSpec(1, 3, width_ratios=[.48,.48,.04])
# set vmin and vmax explicitly to ensure that both colorbars have the same range!
vmin = np.min([np.min(data), np.min(data1)])
vmax = np.max([np.max(data), np.max(data1)])
plot_kwargs = dict(cmap = "inferno", vmin=vmin, vmax=vmax)
fig = plt.figure(figsize=(12, 6))
ax_0 = fig.add_subplot(gs[0], aspect='equal')
ax_1 = fig.add_subplot(gs[1], aspect='equal')
ax_cb = fig.add_subplot(gs[2])
s1 = ax_0.scatter(x, y, c = data, **plot_kwargs)
s2 = ax_1.scatter(x, y, c = data1, **plot_kwargs)
plt.colorbar(s1, cax=ax_cb)
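If a dedicated colorbar axes is more control than you need, a shorter variant is to pass both plot axes to the colorbar through its ax argument, which accepts a list of axes and takes the space from all of them. This is only a sketch with random stand-in data; the shrink value is an arbitrary choice, and the Agg backend line is only there so it runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; drop this line in a notebook
import numpy as np
import matplotlib.pyplot as plt

# Random stand-in data for the two unrelated datasets.
rng = np.random.default_rng(0)
data, data1 = rng.random(100), rng.random(100)
gx, gy = np.meshgrid(np.linspace(0, 1, 10), np.linspace(0, 1, 10))
x, y = gx.ravel(), gy.ravel()

# A shared color range, so one colorbar is valid for both plots.
vmin = min(data.min(), data1.min())
vmax = max(data.max(), data1.max())

fig, (ax_0, ax_1) = plt.subplots(1, 2, figsize=(12, 6))
s1 = ax_0.scatter(x, y, c=data, cmap="inferno", vmin=vmin, vmax=vmax)
s2 = ax_1.scatter(x, y, c=data1, cmap="inferno", vmin=vmin, vmax=vmax)

# One colorbar, placed to the right of both axes.
cbar = fig.colorbar(s1, ax=[ax_0, ax_1], shrink=0.8)
```

Setting vmin and vmax explicitly matters here for the same reason as in the GridSpec version: the colorbar is built from s1 only, so s2 has to use the same range for it to apply to both.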
You can use aspect to set a fixed aspect ratio on the subplots. Then append the colorbars to the right side of each axis and discard the first colorbar, to get an even layout:
import matplotlib.pyplot as plt
import numpy as np
from mpl_toolkits.axes_grid1 import make_axes_locatable
fig = plt.figure(figsize=(12, 6))
marker_colors = range(0,10)
x = x1 = np.random.randint(0,10,10)
y = y1 = np.random.randint(0,10,10)
ax1 = fig.add_subplot(1, 2, 1, aspect="equal") # or e.g. aspect=0.9
g1 = ax1.scatter(x, y, c = marker_colors, cmap = "inferno", )
ax2 = fig.add_subplot(1, 2, 2, aspect="equal") # or e.g. aspect=0.9
g2 = ax2.scatter(x1, y1, c = marker_colors, cmap = "inferno")
# put colorbars right next to axes
divider1 = make_axes_locatable(ax1)
cax1 = divider1.append_axes("right", size="5%", pad=0.05)
divider2 = make_axes_locatable(ax2)
cax2 = divider2.append_axes("right", size="5%", pad=0.05)
# reserve space for 1st colorbar, then remove
cbar1 = fig.colorbar(g1, cax=cax1)
fig.delaxes(fig.axes[2])
# 2nd colorbar
cbar2 = fig.colorbar(g2, cax=cax2)
plt.tight_layout()
plt.show()
If you want a different aspect ratio, you can modify aspect, e.g. to aspect=0.9. The result will have locked aspect ratios for the subplots, even if you resize the figure box.
Use the following code; I hope it matches your problem statement.
import numpy as np
import matplotlib.pyplot as plt
fig = plt.figure(figsize=(12, 6))
marker_colors = range(0,10)
x=x1=np.random.randint(0,10,10)
y=y1=np.random.randint(0,10,10)
plt.subplot(1, 2, 1)
g1=plt.scatter(x, y, c = marker_colors, cmap = "inferno")
plt.subplot(1, 2, 2)
g2=plt.scatter(x1, y1, c = marker_colors, cmap = "inferno")
g11=plt.colorbar(g1)
g12=plt.colorbar(g2)
g11.ax.set_title('g1')
g12.ax.set_title('g2')

How to produce nested legends in Matplotlib

I have to plot in Matplotlib a quantity which is the sum of various contributions.
I would like to highlight this fact in the legend of the plot by listing the various contribution as sub-elements of the main legend entry.
A sketch of the result I would like to obtain can be found in the picture below. Note that I do not need to necessarily achieve exactly the legend that is depicted, but just something similar.
You can try creating two separate legends to your figure. Sure, it’s a trick rather than a direct feature of the legend object, as there seems to be no implementation of what you need in matplotlib. But playing with the numbers in bbox and the fontsize you can customize it pretty nicely.
import matplotlib.pyplot as plt
import numpy as np
x = np.arange(0.0, 1, 0.01)
x1 = np.sin(2*np.pi*x)
x2 = np.sin(2*np.pi*x+1)
x3 = np.sin(2*np.pi*x+2)
fig, ax = plt.subplots()
f1, = ax.plot(x1, 'r', lw=4)
f2, = ax.plot(x2, 'k', lw=2)
f3, = ax.plot(x3, 'b', lw=2)
legend1 = plt.legend([f1], ["Main legend"], fontsize=12, loc=3, bbox_to_anchor=(0,0.1,0,0), frameon=False)
legend2 = plt.legend((f2, f3), ('sublegend 1', 'sublegend 2'), fontsize=9,
                     loc=3, bbox_to_anchor=(0.05,0,0,0), frameon=False)
plt.gca().add_artist(legend1)
plt.show()
EDIT:
Well, if we are inserting two legends anyway, why not insert a completely new figure as an inset inside the bigger figure, dedicated to the legend, in which you can draw and write whatever you like? Admittedly it’s hard work: you have to design each and every line inside it, including the precise location coordinates. But that’s the only way I could think of to do what you wanted:
import matplotlib.pyplot as plt
import numpy as np
x = np.arange(0.0, 1, 0.01)
x1 = np.sin(2*np.pi*x)
x2 = np.sin(2*np.pi*x+1)
x3 = np.sin(2*np.pi*x+2)
fig, ax = plt.subplots()
f1, = ax.plot(x1, 'r', lw=4)
f2, = ax.plot(x2, 'k', lw=2)
f3, = ax.plot(x3, 'b', lw=2)
## set all lines for inner figure
yline1 = np.array([-0.15, -0.15])
line1 = np.array([2, 10])
yline2 = np.array([3, 0])
line2 = np.array([4, 4])
yline3 = np.array([1.5, 1.5])
line3 = np.array([4, 6])
yline4 = np.array([1.5, 1.5])
line4 = np.array([7, 10])
yline5 = np.array([3, 3])
line5 = np.array([4, 6])
yline6 = np.array([3, 3])
line6 = np.array([7, 10])
## inset figure
axin1 = ax.inset_axes([2.5, -1, 30, 0.5], transform=ax.transData)
## plot all lines
axin1.plot(line1, yline1, linewidth=4, c='r')
axin1.plot(line2, yline2, 'k', lw=1)
axin1.plot(line3, yline3, 'k', lw=1)
axin1.plot(line4, yline4, 'b', lw=3)
axin1.plot(line5, yline5, 'k', lw=1)
axin1.plot(line6, yline6, 'k', lw=3)
## text
axin1.text(12, 0, 'MAIN', fontsize=12)
axin1.text(12, 1.7, 'Subtext 1', fontsize=10)
axin1.text(12, 3.2, 'Subtext 2', fontsize=10)
## adjust
axin1.set_ylim([4, -1])
axin1.set_xlim([0, 27])
axin1.set_xticklabels('')
axin1.set_yticklabels('')
I looked through the custom legend examples and could not find any way to indent an entry to a lower level; you can only line up the objects in the legend. Instead, I expressed the hierarchy through colors and markers, adapting the example from the official reference. This avoids having to annotate the legend in a special way.
import numpy as np
import matplotlib.pyplot as plt
from matplotlib.legend_handler import HandlerTuple
fig, ax1 = plt.subplots(1, 1, constrained_layout=True)
params = {'legend.fontsize': 16,
          'legend.handlelength': 3}
plt.rcParams.update(params)
x = np.linspace(0, np.pi, 25)
xx = np.linspace(0, 2*np.pi, 25)
xxx = np.linspace(0, 3*np.pi, 25)
p1, = ax1.plot(x, np.sin(x), lw=5, c='r')
# The format string sets only line style and marker; the color comes from c=
p2, = ax1.plot(x, np.sin(xx), '-d', c='g')
p3, = ax1.plot(x, np.sin(xxx), '-s', c='b')
# Assign two of the handles to the same legend entry by putting them in a tuple
# and using a generic handler map (which would be used for any additional
# tuples of handles like (p1, p2)).
l = ax1.legend([p1, (p1, p2), (p1, p3)], ['Legend entry', 'Contribution 1', 'Contribution 2'], scatterpoints=1,
               numpoints=1, markerscale=1.3, handler_map={tuple: HandlerTuple(ndivide=None, pad=1.0)})
plt.show()

My scatter plot is only showing one part of the data

import numpy as np
import matplotlib.pyplot as plt
# Create data
N = 60
g1 = (0.6 + 0.6 * np.random.rand(N), np.random.rand(N))
g2 = (0.4+0.3 * np.random.rand(N), 0.5*np.random.rand(N))
g3 = (0.3*np.random.rand(N),0.3*np.random.rand(N))
data = (g1, g2, g3)
colors = ("red", "green", "blue")
groups = ("coffee", "tea", "water")
fig = plt.figure()
ax = fig.add_subplot(1, 1, 1)
for data, color, group in zip(data, colors, groups):
x, y = data
ax.scatter(x, y, alpha=0.8, c=color, edgecolors='none', s=30, label=group)
plt.title('Matplot scatter plot')
plt.legend(loc=2)
plt.show()
None of the code before creating the pyplot figure can be changed. It seems like the for loop isn't working correctly. I'm not sure what the problem is. There aren't any errors.
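The listing above has lost its indentation, so the actual cause cannot be confirmed from the text. One common way to get this symptom without any error is for the ax.scatter(...) call to sit outside the body of the for loop: the loop then only unpacks x and y, and the single scatter call afterwards draws just the last group ("water"). The sketch below assumes that was the issue and puts both statements inside the loop body. The code before the figure is unchanged; the loop variable is renamed to group_data (my own name) so it does not shadow data, and the Agg backend line is only there so the sketch runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; drop this line in a notebook
import numpy as np
import matplotlib.pyplot as plt

# Create data (unchanged from the question)
N = 60
g1 = (0.6 + 0.6 * np.random.rand(N), np.random.rand(N))
g2 = (0.4 + 0.3 * np.random.rand(N), 0.5 * np.random.rand(N))
g3 = (0.3 * np.random.rand(N), 0.3 * np.random.rand(N))
data = (g1, g2, g3)
colors = ("red", "green", "blue")
groups = ("coffee", "tea", "water")

fig = plt.figure()
ax = fig.add_subplot(1, 1, 1)
for group_data, color, group in zip(data, colors, groups):
    x, y = group_data
    # Inside the loop body, so it runs once per group.
    ax.scatter(x, y, alpha=0.8, c=color, edgecolors='none', s=30, label=group)
plt.title('Matplot scatter plot')
legend = plt.legend(loc=2)
```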

Creating a 2x2 subplot from one dataset as different graphs

I have a large census dataset I am working with; I am taking different data from it and want to present it all as a single .png in the end. I have created the graphs individually, but when I try to add them to the subplots they get distorted or the axes get messed up.
Current code:
fig = plt.figure()
ax1 = fig.add_subplot(2, 2, 1)
ax2 = fig.add_subplot(2, 2, 2)
ax3 = fig.add_subplot(2, 2, 3)
ax4 = fig.add_subplot(2, 2, 4)
ax1.pie(df.data.value_counts(normalize=True), labels=None, startangle=240)
ax1.legend(['a','b','c','d','e'])
ax1.axis('equal')
data2=df[['A']].dropna().values
kde=df.A.plot.kde()
binss = np.logspace(0.01,7.0)
ax2=plt.hist(hincp, normed=True, bins=binss)
ax2=plt.xscale('log')
ax3 = df.replace(np.nan,0)
ax3 = (df.groupby(['G'])['R'].sum()/1000)
ax3.plot.bar(width=0.9, color='red',title='Gs').set_ylabel('Rs')
ax3.set_ylabel('Rs')
ax3.set_xlabel('# G')
t = df[['p','o','s','y']]
ax4=plt.scatter(t.o,t.p,s=t.s,c=t.y, marker = 'o', alpha = 0.2)
plt.ylim(0, 10000)
plt.xlim(0,1200000)
cbar=plt.colorbar()
plt.title("this vs that", loc = 'center')
plt.xlabel('this')
plt.ylabel('that')
All four types of graphs should be displayed and not overlap.
You create an Axes for each subplot but then you don't use them.
ax1.pie(...) looks correct, but after that you never use ax2, ax3 or ax4.
If you are going to use the DataFrame plotting methods, just call plt.subplot before each new plot, like this:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
df = pd.DataFrame(np.random.random((6,3)))
plt.subplot(3,1,1)
df.loc[:,0].plot()
plt.subplot(3,1,2)
df.loc[:,1].plot()
plt.subplot(3,1,3)
df.loc[:,2].plot()
plt.show()
plt.close()
Or use the Axes that you create.
df = pd.DataFrame(np.random.random((6,3)))
fig = plt.figure()
ax1 = fig.add_subplot(3,1,1)
ax2 = fig.add_subplot(3,1,2)
ax3 = fig.add_subplot(3,1,3)
ax1.plot(df.loc[:,0])
ax2.plot(df.loc[:,1])
ax3.plot(df.loc[:,2])
plt.show()
plt.close()
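Applied to the 2x2 layout from the question, the "use the Axes that you create" approach might look like the sketch below. All data here is synthetic stand-in data, since the census dataframe is not available, and the plots are drawn with plain matplotlib calls; the pandas plotting methods also accept an ax= argument if you prefer to keep using them. The Agg backend line is only there so the sketch runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; drop this line in a notebook
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(0)  # synthetic stand-in data, not the census set

fig, ((ax1, ax2), (ax3, ax4)) = plt.subplots(2, 2, figsize=(10, 8))

# Pie chart on the first Axes.
ax1.pie([0.4, 0.3, 0.2, 0.1], startangle=240)
ax1.legend(['a', 'b', 'c', 'd'])
ax1.axis('equal')

# Log-binned histogram on the second Axes.
values = rng.lognormal(mean=10, sigma=1, size=1000)
ax2.hist(values, bins=np.logspace(1, 7, 50), density=True)
ax2.set_xscale('log')

# Bar chart on the third Axes.
ax3.bar(['g1', 'g2', 'g3'], [3.0, 5.0, 2.0], width=0.9, color='red')
ax3.set_title('Gs')
ax3.set_xlabel('# G')
ax3.set_ylabel('Rs')

# Scatter plot with its own colorbar on the fourth Axes.
sc = ax4.scatter(rng.random(50), rng.random(50), s=40, c=rng.random(50), alpha=0.5)
fig.colorbar(sc, ax=ax4)
ax4.set_title('this vs that')
ax4.set_xlabel('this')
ax4.set_ylabel('that')

fig.tight_layout()
```

Every call goes through an explicit Axes (or the Figure for the colorbar), so nothing depends on which subplot pyplot currently considers active, and names like ax2 are never overwritten by return values as in the question.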
