Contour plot legend - Matplotlib - python

As the question says, I have a contour plot and I would like to show a legend for it.
I'm using the contour plot style that uses:
dashed lines for negative levels
solid lines for positive values
I would like to have a legend for them (dashed == negative and solid == positive).
I tried the approaches found here and here. However, as can be seen below, this doesn't show the correct result.
# Draw the scalar field level curves
div_field = plt.contour(x, y, div_scalar_field, colors='white')
rot_field = plt.contour(x, y, rot_scalar_field, colors='lightgoldenrodyellow')
labels = ['Div Neg', 'Div Pos', 'Rot Neg', 'Rot Pos']
div_field.collections[0].set_label(labels[0])
div_field.collections[-1].set_label(labels[1])
rot_field.collections[0].set_label(labels[2])
rot_field.collections[-1].set_label(labels[3])
Since the div scalar field has only positive levels, I get two labels with the same line style.
I'm wondering how I could achieve what I want properly.
Thank you in advance.

I was able to solve this by setting the legend manually (though I don't know whether it's the best approach):
div_neg = plt.Line2D((0, 1), (0, 0), color='white', linestyle='--', linewidth=2)
div_pos = plt.Line2D((0, 1), (0, 0), color='white', linestyle='-', linewidth=2)
rot_neg = plt.Line2D((0, 1), (0, 0), color='lightgoldenrodyellow', linestyle='--', linewidth=2)
rot_pos = plt.Line2D((0, 1), (0, 0), color='lightgoldenrodyellow', linestyle='-', linewidth=2)
plt.legend([rot_max, div_neg, div_pos, rot_neg, rot_pos],
           ['Rot Max Points', 'Div Neg', 'Div Pos', 'Rot Neg', 'Rot Pos'])
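As a self-contained illustration of the manual (proxy artist) legend above, here is a minimal sketch. The scalar field, the color, and the label texts are made up for the example; it relies on Matplotlib's default behaviour of drawing negative levels dashed when a single contour color is given.

```python
import matplotlib
matplotlib.use("Agg")  # off-screen backend so the sketch runs without a display
import matplotlib.pyplot as plt
from matplotlib.lines import Line2D
import numpy as np

# Hypothetical scalar field with both negative and positive values
x = np.linspace(-2, 2, 100)
y = np.linspace(-2, 2, 100)
X, Y = np.meshgrid(x, y)
Z = X * np.exp(-X**2 - Y**2)

fig, ax = plt.subplots()
# With a single color, negative levels are drawn dashed by default
ax.contour(X, Y, Z, colors="black")

# Proxy artists: lines that are never drawn on the axes, only in the legend
handles = [
    Line2D([0], [0], color="black", linestyle="--", linewidth=2),
    Line2D([0], [0], color="black", linestyle="-", linewidth=2),
]
legend = ax.legend(handles, ["Negative", "Positive"])
legend_labels = [text.get_text() for text in legend.get_texts()]
# Call plt.show() or fig.savefig(...) to inspect the result
```

Because the legend entries are built independently of the contour set, this keeps working when one of the fields has only positive (or only negative) levels.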

Something like the following works for me - this complete hack is to use a labelled dummy point, fetch its colour, apply that to the contours and then just plot the legend in the usual way:
import matplotlib.pyplot as plt
labels = ['div_field'] # etc.
dummy_position = [-1.0e3, -1.0e3] # Could automate
colors = []
for k in labels:
    # Fetch colours via a dummy point
    dummy_point = plt.plot(dummy_position[0], dummy_position[1], label=k)
    c = dummy_point[-1].get_color()
    colors.append(c)
    # This is specific to your problem, but roughly:
    div_field = plt.contour(x, y, div_scalar_field, colors=c)
    # etc.
_=plt.legend()
plt.savefig('contours.pdf')
Hope that makes sense.


Simulate CDF curve for penetration/adoption extrapolation

I'd like to be able to plot a line like the cumulative distribution function for the normal distribution, because it's useful for simulating the adoption curve:
Specifically, I'd like to be able to use initial data (percentage adoption of a product) to extrapolate what the rest of that curve would look like, to give a rough estimate of the timeline to each of the phases. So, for example, if we got to 10% penetration by 30 days and 20% penetration by 40 days, and we try to fit this curve, I'd like to know when we're going to get to 80% penetration (vs another population that may have taken 50 days to get to 10% penetration).
So, my question is, how could I go about doing this? I would ideally be able to provide initial data (time and penetration), and use python (e.g. matplotlib) to plot out the rest of the chart for me. But I don't know where to start! Can anyone point me in the right direction?
(Incidentally, I also posted this question on CrossValidated, but I wasn't sure whether it belonged there, as it's a stats question, or here, as it's a python question. Apologies for duplication!)
The cdf can be calculated via scipy.stats.norm.cdf(). Its inverse, the ppf, can be used to map the desired correspondences. scipy.interpolate.pchip can then create a function so that the transformation interpolates smoothly.
import matplotlib.pyplot as plt
from matplotlib.ticker import PercentFormatter
import numpy as np
from scipy.interpolate import pchip # monotonic cubic interpolation
from scipy.stats import norm
desired_xy = np.array([(30, 10), (40, 20)]) # (number of days, percentage adoption)
# desired_xy = np.array([(0, 1), (30, 10), (40, 20), (90, 99)])
labels = ['Innovators', 'Early\nAdopters', 'Early\nMajority', 'Late\nMajority', 'Laggards']
xmin, xmax = 0, 90 # minimum and maximum day on the x-axis
px = desired_xy[:, 0]
py = desired_xy[:, 1] / 100
# smooth function that transforms the x-values to the corresponding spots to get the desired y-values
interpfunc = pchip(px, norm.ppf(py))
fig, ax = plt.subplots(figsize=(12, 4))
# ax.scatter(px, py, color='crimson', s=50, zorder=3) # show desired correspondences
x = np.linspace(xmin, xmax, 1000)
ax.plot(x, norm.cdf(interpfunc(x)), lw=4, color='navy', clip_on=False)
label_divs = np.linspace(xmin, xmax, len(labels) + 1)
label_pos = (label_divs[:-1] + label_divs[1:]) / 2
ax.set_xticks(label_pos)
ax.set_xticklabels(labels, size=18, color='navy')
min_alpha, max_alpha = 0.1, 0.4
for p0, p1, alpha in zip(label_divs[:-1], label_divs[1:], np.linspace(min_alpha, max_alpha, len(labels))):
    ax.axvspan(p0, p1, color='navy', alpha=alpha, zorder=-1)
    ax.axvline(p0, color='white', lw=1, zorder=0)
ax.axhline(0, color='navy', lw=2, clip_on=False)
ax.axvline(0, color='navy', lw=2, clip_on=False)
ax.yaxis.set_major_formatter(PercentFormatter(1))
ax.set_xlim(xmin, xmax)
ax.set_ylim(0, 1)
ax.set_ylabel('Total Adoption', size=18, color='navy')
ax.set_title('Adoption Curve', size=24, color='navy')
for s in ax.spines:
    ax.spines[s].set_visible(False)
ax.tick_params(axis='x', length=0)
ax.tick_params(axis='y', labelcolor='navy')
plt.tight_layout()
plt.show()
With just two points in desired_xy, the curve is stretched linearly. If more points are given, a smooth transformation is applied. Here is how it looks with [(0, 1), (30, 10), (40, 20), (90, 99)]. Note that 0 % and 100 % will cause problems, as they lie at minus and plus infinity.
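To answer the "when do we reach 80 %" part of the question numerically, the two-point case can be solved directly. This is only a sketch under the assumption that adoption follows a normal CDF in time, so that the ppf of the adoption fraction is a straight line in days; the helper name days_to_reach is made up for the example.

```python
import numpy as np
from scipy.stats import norm

# Observed data from the question: 10% after 30 days, 20% after 40 days
days = np.array([30.0, 40.0])
adoption = np.array([0.10, 0.20])

# Under the normal-CDF assumption, norm.ppf(adoption) is linear in time
z = norm.ppf(adoption)
slope, intercept = np.polyfit(days, z, 1)

def days_to_reach(fraction):
    """Day at which the fitted curve reaches the given adoption fraction."""
    return (norm.ppf(fraction) - intercept) / slope

t80 = days_to_reach(0.80)
print(f"Estimated day of 80% penetration: {t80:.1f}")
```

With more than two observations, np.polyfit gives a least-squares line through the transformed points instead of an exact fit, which is a different choice from the pchip interpolation used above.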

Transform from data to figure coordinates

Similar to this post, I would like to transform my data coordinates to figure coordinates. Unfortunately, the transformation tutorial doesn't seem to talk about it. So I came up with something analogous to the answer by wilywampa, but for some reason, there is something wrong and I can't figure it out:
import numpy as np
import matplotlib.pyplot as plt
from matplotlib.patches import ConnectionPatch
t = [
    0, 6.297, 39.988, 46.288, 79.989, 86.298, 120.005, 126.314, 159.994,
    166.295, 200.012, 206.314, 240.005, 246.301, 280.05, 286.35, 320.032,
    326.336, 360.045, 366.345, 480.971, 493.146, 1080.117, 1093.154, 1681.019,
    1692.266, 2281.008, 2293.146, 2881.014, 2893.178, 3480.988, 3493.149,
    4080.077, 4092.298, 4681.007, 4693.275, 5281.003, 5293.183, 5881.023,
    5893.188, 6481.002, 6492.31
]
y = np.zeros(len(t))
fig, (axA, axB) = plt.subplots(2, 1)
fig.tight_layout()
for ax in (axA, axB):
    ax.set_frame_on(False)
    ax.axes.get_yaxis().set_visible(False)
axA.plot(t[:22], y[:22], c='black')
axA.plot(t[:22], y[:22], 'o', c='#ff4500')
axA.set_ylim((-0.05, 1))
axB.plot(t, y, c='black')
axB.plot(t, y, 'o', c='#ff4500')
axB.set_ylim((-0.05, 1))
pos1 = axB.get_position()
pos2 = [pos1.x0, pos1.y0 + 0.3, pos1.width, pos1.height]
axB.set_position(pos2)
trans = [
    # (ax.transAxes + ax.transData.inverted()).inverted().transform for ax in
    (fig.transFigure + ax.transData.inverted()).inverted().transform for ax in
    (axA, axB)
]
con1 = ConnectionPatch(
    xyA=trans[0]((0, 0)), xyB=(0, 0.1), coordsA="figure fraction",
    coordsB="data", axesA=axA, axesB=axB, color="black"
)
con2 = ConnectionPatch(
    xyA=(500, 0), xyB=(500, 0.1), coordsA="data", coordsB="data",
    axesA=axA, axesB=axB, color="black"
)
print(trans[0]((0, 0)))
axB.add_artist(con1)
axB.add_artist(con2)
plt.show()
The line on the left is supposed to go to (0, 0) of the upper axis, but it doesn't. The same happens, by the way, if I try to convert to axes coordinates, so there seems to be something fundamentally wrong.
The reason I want to use figure coords is that I don't actually want the line to end at (0, 0), but slightly below the '0' tick label. I cannot do that in data coords, so I tried to switch to figure coords.
Adapting the second example from this tutorial code, it seems no special combination of transforms is needed. You can use coordsA=axA.get_xaxis_transform() if x is in data coordinates and y in axes coordinates. Or coordsA=axA.transData if x and y are both in data coordinates. Note that when using data coordinates you are allowed to give coordinates outside the view window; by default a ConnectionPatch isn't clipped.
The following code uses z-order to put the connection lines behind the rest and adds a semi-transparent background to the tick labels of axA (avoiding that the text gets crossed out by the connection line):
import numpy as np
import matplotlib.pyplot as plt
from matplotlib.patches import ConnectionPatch
t = [0, 6.297, 39.988, 46.288, 79.989, 86.298, 120.005, 126.314, 159.994, 166.295, 200.012, 206.314, 240.005, 246.301, 280.05, 286.35, 320.032, 326.336, 360.045, 366.345, 480.971, 493.146, 1080.117, 1093.154, 1681.019, 1692.266, 2281.008, 2293.146, 2881.014, 2893.178, 3480.988, 3493.149, 4080.077, 4092.298, 4681.007, 4693.275, 5281.003, 5293.183, 5881.023, 5893.188, 6481.002, 6492.31]
y = np.zeros(len(t))
fig, (axA, axB) = plt.subplots(2, 1)
fig.tight_layout()
for ax in (axA, axB):
    ax.set_frame_on(False)
    ax.axes.get_yaxis().set_visible(False)
axA.plot(t[:22], y[:22], c='black')
axA.plot(t[:22], y[:22], 'o', c='#ff4500')
axA.set_ylim((-0.05, 1))
axB.plot(t, y, c='black')
axB.plot(t, y, 'o', c='#ff4500')
axB.set_ylim((-0.05, 1))
pos1 = axB.get_position()
pos2 = [pos1.x0, pos1.y0 + 0.3, pos1.width, pos1.height]
axB.set_position(pos2)
con1 = ConnectionPatch(xyA=(0, 0.02), coordsA=axA.get_xaxis_transform(),
                       xyB=(0, 0.05), coordsB=axB.get_xaxis_transform(),
                       # linestyle='--', color='black', zorder=-1)
                       linestyle='--', color='darkgrey', zorder=-1)
con2 = ConnectionPatch(xyA=(500, 0.02), coordsA=axA.get_xaxis_transform(),
                       xyB=(500, 0.05), coordsB=axB.get_xaxis_transform(),
                       linestyle='--', color='darkgrey', zorder=-1)
fig.add_artist(con1)
fig.add_artist(con2)
for lbl in axA.get_xticklabels():
    lbl.set_backgroundcolor((1, 1, 1, 0.8))
plt.show()
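To make the coordinate systems in this answer concrete, the following sketch checks what get_xaxis_transform() does (x taken from data coordinates, y from axes coordinates) and shows one way to convert a data point to figure-fraction coordinates by going through display (pixel) space. The axis limits and sample points are arbitrary choices for the example.

```python
import matplotlib
matplotlib.use("Agg")  # off-screen backend so the sketch runs without a display
import matplotlib.pyplot as plt

fig, ax = plt.subplots()
ax.set_xlim(0, 500)
ax.set_ylim(-0.05, 1)

# Blended transform: x is interpreted as data, y as axes fraction
blended = ax.get_xaxis_transform()
px_blend = blended.transform((250, 0.5))

# The same pixel position, assembled from the two pure transforms
px_x = ax.transData.transform((250, 0))[0]    # x from data coordinates
px_y = ax.transAxes.transform((0, 0.5))[1]    # y from axes coordinates

# Data -> display (pixels) -> figure fraction
display_xy = ax.transData.transform((250, 0.0))
figure_xy = fig.transFigure.inverted().transform(display_xy)
print(px_blend, figure_xy)
```

Note that such converted values are only valid for the layout at the moment of the call; anything that later moves the axes (tight_layout, set_position, resizing) makes them stale, which is one reason to pass a transform as coordsA rather than precomputed numbers.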
Possible answer to your last comment:
As you're dealing with figure coords, these can change depending on your screen resolution. So if your other machine has a different resolution, this could be why it's changing. You'll have to look into using axes coords instead if you don't want these changes.

Increasing the space between the plot and the title with matplotlib

I am using the following script to generate some plots. The problem is sometimes the scientific notation is overlapping with the title.
Is there a way to fix this like moving the plot a little bit down?
# init
u = {}
o = {}
# create figure
fig = plt.figure()
# x-Axis (timesteps)
i = np.array(i)
for key in urbs_values.keys():
    # y-Axis (values)
    u[key] = np.array(urbs_values[key])
    o[key] = np.array(oemof_values[key])
    # draw plots
    plt.plot(i, u[key], label='urbs_'+str(key), linestyle='None', marker='x')
    plt.ticklabel_format(axis='y', style='sci', scilimits=(0, 0))
    plt.plot(i, o[key], label='oemof_'+str(key), linestyle='None', marker='.')
    plt.ticklabel_format(axis='y', style='sci', scilimits=(0, 0))
# plot specs
plt.xlabel('Timesteps [h]')
plt.ylabel('Flow [MWh]')
plt.title(site+' '+name)
plt.grid(True)
plt.tight_layout(rect=[0,0,0.7,1])
plt.legend(bbox_to_anchor=(1.025, 1), loc=2, borderaxespad=0)
# plt.show()
Example:
You can change the position of the title by providing a value for the y parameter in plt.title(...), e.g., plt.title(site+' '+name, y=1.1).
You can edit the title position this way:
# plot specs
plt.xlabel('Timesteps [h]')
plt.ylabel('Flow [MWh]')
# plt.title returns the Text object of the title
ttl = plt.title(site+' '+name)
ttl.set_position([.5, 1.02])
plt.grid(True)
plt.tight_layout(rect=[0,0,0.7,1])
plt.legend(bbox_to_anchor=(1.025, 1), loc=2, borderaxespad=0)
# plt.show()
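Tuning the 1.02 should do the trick. (Newer Matplotlib versions position the title automatically and may override a position set afterwards; passing y= to plt.title directly avoids that.)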
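For reference, a minimal sketch of the y= approach using the object-oriented interface. The data and the title text are placeholders; the value 1.1 is in axes coordinates, so 1.0 is the top edge of the plot area.

```python
import matplotlib
matplotlib.use("Agg")  # off-screen backend so the sketch runs without a display
import matplotlib.pyplot as plt

fig, ax = plt.subplots()
ax.plot([0, 1, 2], [1e6, 2e6, 3e6], linestyle="None", marker="x")
# Scientific notation puts an offset label at the top-left of the axes
ax.ticklabel_format(axis="y", style="sci", scilimits=(0, 0))

# Raise the title above the offset label; y is in axes coordinates
title = ax.set_title("placeholder site name", y=1.1)
title_y = title.get_position()[1]

fig.tight_layout()
```

An alternative with a similar effect is the pad argument of set_title (for example pad=20), which is given in points rather than axes fractions and therefore does not scale with the axes height.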

SNS Bar plot color intensity is wrong [duplicate]

I am plotting the following type of bar plot with seaborn using the code below. I used cubehelix_palette because I want the bar color intensity to follow the values: I expect higher values to get a darker purple and lower values a lighter one. But what I get looks very different. It seems the less negative values get darker and the most positive value is ignored. Am I doing something wrong here?
x = ["A","B","C","D"]
y = [-0.086552691,0.498737914,-0.090153413,-0.075941404]
sns.axes_style('white')
sns.set_style('white')
pal=sns.cubehelix_palette(5)
ax = sns.barplot(x, y,palette=pal)
for n, (label, _y) in enumerate(zip(x, y)):
    ax.annotate(
        s='{:.3f}'.format(_y),
        xy=(n, _y),
        ha='center',va='center',
        xytext=(0,10*(1 if _y > 0 else -1)),
        textcoords='offset points',
        size = 8,
        weight='bold'
    )
    ax.annotate(
        s=label,
        xy=(n, 0),
        ha='left',va='center',
        xytext=(0,50*(-1 if _y > 0 else 1)),
        textcoords='offset points',
        rotation=90,
        size = 10,
        weight='bold'
    )
# axes formatting
#ax.set_yticks([])
ax.set_xticks([])
sns.despine(ax=ax, bottom=True, left=True)
EDITED
As per @ImportanceOfBeingErnest's suggestion, I tried the following code too. However, the intensities for the negative values are still wrong, and a distracting legend is now visible as well.
import numpy as np, matplotlib.pyplot as plt, seaborn as sns
sns.set(style="whitegrid", color_codes=True)
# the data has to be defined before it is plotted
x = ["A","B","C","D","E","F","G","H","I","J","K"]
y = [-0.086552691,
     0.498737914,
     -0.090153413,
     -0.075941404,
     -0.089105985,
     -0.05301275,
     -0.095927691,
     -0.083528335,
     0.250680624,
     -0.092506638,
     -0.082689631,
     ]
pal = sns.color_palette("Greens_d", 5)
ax = sns.barplot(x=x, y=y, palette=pal,hue=y,dodge=False)
for n, (label, _y) in enumerate(zip(x, y)):
    ax.annotate(
        s='{:.3f}'.format(_y),
        xy=(n, _y),
        ha='center',va='center',
        xytext=(0,10*(1 if _y > 0 else -1)),
        textcoords='offset points',
        size = 8,
        weight='bold'
    )
    ax.annotate(
        s=label,
        xy=(n, 0),
        ha='left',va='center',
        xytext=(0,50*(-1 if _y > 0 else 1)),
        textcoords='offset points',
        rotation=90,
        size = 10,
        weight='bold'
    )
ax.set_xticks([])
sns.despine(ax=ax, bottom=True, left=True)
plt.show()
The documentation says that your palette argument maps your colors onto the different levels of your hue argument, which you haven't provided.
So I think that you need to set the hue argument in your barplot, so that your colors are mapped specifically to your y values.
With everything else untouched except replacing ax = sns.barplot(x, y,palette=pal) with this:
ax = sns.barplot(x, y, hue=y, palette=pal, dodge=False)
# Remove the legend
ax.legend_.remove()
you get this plot, in which the higher the y, the darker the color:
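If the goal is a color intensity that is proportional to the value (rather than to its rank among the hue levels), one alternative is to skip the palette mechanism and map the values through a colormap yourself. This is a different technique from the hue-based answer above: a sketch using plain Matplotlib, with the "Purples" colormap chosen arbitrarily for the example.

```python
import matplotlib
matplotlib.use("Agg")  # off-screen backend so the sketch runs without a display
import matplotlib.pyplot as plt
from matplotlib.colors import Normalize
import numpy as np

x = ["A", "B", "C", "D"]
y = np.array([-0.086552691, 0.498737914, -0.090153413, -0.075941404])

# Scale the values linearly to [0, 1], then look up one RGBA color per bar
norm = Normalize(vmin=y.min(), vmax=y.max())
cmap = plt.get_cmap("Purples")  # light for low values, dark for high values
colors = cmap(norm(y))

fig, ax = plt.subplots()
bars = ax.bar(x, y, color=colors)
# Call plt.show() or fig.savefig(...) to inspect the result
```

Since no hue variable is involved, no legend is created and nothing has to be removed afterwards. With data like this, where one value is far from the rest, the three negative bars end up nearly the same light shade, which reflects how close those values are to each other.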

Multiple step histograms in matplotlib

Dear python/matplotlib community,
I am having an issue within matplotlib: I can't seem to plot multiple overlaid histograms in the same plot space using the following:
binsize = 0.05
min_x_data_sey, max_x_data_sey = np.min(logOII_OIII_sey), np.max(logOII_OIII_sey)
num_x_bins_sey = np.floor((max_x_data_sey - min_x_data_sey) / binsize)
min_x_data_comp, max_x_data_comp = np.min(logOII_OIII_comp), np.max(logOII_OIII_comp)
num_x_bins_comp = np.floor((max_x_data_comp - min_x_data_comp) / binsize)
min_x_data_sf, max_x_data_sf = np.min(logOII_OIII_sf), np.max(logOII_OIII_sf)
num_x_bins_sf = np.floor((max_x_data_sf - min_x_data_sf) / binsize)
axScatter_farright = fig.add_subplot(gs_right[0,0])
axScatter_farright.tick_params(axis='both', which='major', labelsize=10)
axScatter_farright.tick_params(axis='both', which='minor', labelsize=10)
axScatter_farright.set_ylabel(r'$\mathrm{N}$', fontsize='medium')
axScatter_farright.set_xlim(-1.5, 1.0)
axScatter_farright.set_xlabel(r'$\mathrm{log([OII]/[OIII])}$', fontsize='medium')
axScatter_farright.hist(logOII_OIII_sey, num_x_bins_sey, ec='0.3', fc='none', histtype='step')
axScatter_farright.hist(logOII_OIII_comp, num_x_bins_comp, ec='0.3', fc='none', histtype='step')
axScatter_farright.hist(logOII_OIII_sf, num_x_bins_sf, ec='0.3', fc='none', histtype='step')
It seems like the axes class can not handle multiple histograms? Please correct me if and/or where I have gone wrong.
My overall plot is a 1 row, 3 column plotting space. I would like to use grid spec to give the plots a good layout.
This is what my plot looks like thus far:
This is what I want the histogram portion of the figure to look like in terms of the step type histogram overlays (with legend):
I have the datasets as three different tuple-type arrays generated from a csv file, i.e., using x, y = np.genfromtxt(datafile.csv)
If anyone is able to explain how this could be done I would be very appreciative.
What you're doing should work perfectly. Is it possible that only one of the distributions is in the x-range of -1.5 to 1 that you've set a couple of lines before? (i.e. Try removing the manual set_xlim statement and see if the other distributions show up.)
As a quick, stand-alone example to demonstrate that things should work:
import numpy as np
import matplotlib.pyplot as plt
num = 1000
d1 = np.random.normal(-1, 1, num)
d2 = np.random.normal(1, 1, num)
d3 = np.random.normal(0, 3, num)
fig, ax = plt.subplots()
ax.hist(d1, 50, ec='red', fc='none', lw=1.5, histtype='step', label='Dist A')
ax.hist(d2, 50, ec='green', fc='none', lw=1.5, histtype='step', label='Dist B')
ax.hist(d3, 100, ec='blue', fc='none', lw=1.5, histtype='step', label='Dist C')
ax.legend(loc='upper left')
plt.show()
(If you want the legend to show lines instead of boxes, you'll need to use a proxy artist. I can add an example if you'd like. That's outside the scope of this question, though.)
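For the proxy artist approach mentioned above, a minimal sketch could look like the following. The random data, colors, and bin width are made up for the example; it also builds one shared array of bin edges from the bin width, which is an assumption about what the question's binsize was meant to achieve.

```python
import matplotlib
matplotlib.use("Agg")  # off-screen backend so the sketch runs without a display
import matplotlib.pyplot as plt
from matplotlib.lines import Line2D
import numpy as np

rng = np.random.default_rng(0)
data = {
    "Dist A": rng.normal(-1, 1, 1000),
    "Dist B": rng.normal(1, 1, 1000),
    "Dist C": rng.normal(0, 3, 1000),
}
colors = {"Dist A": "red", "Dist B": "green", "Dist C": "blue"}

# Shared bin edges of fixed width covering all datasets
binsize = 0.25
lo = min(values.min() for values in data.values())
hi = max(values.max() for values in data.values())
edges = np.arange(lo, hi + binsize, binsize)

fig, ax = plt.subplots()
for name, values in data.items():
    ax.hist(values, bins=edges, histtype="step", edgecolor=colors[name], lw=1.5)

# Proxy artists: plain lines used only as legend handles
handles = [Line2D([0], [0], color=colors[name], lw=1.5) for name in data]
legend = ax.legend(handles, list(data), loc="upper left")
legend_labels = [text.get_text() for text in legend.get_texts()]
```

Passing explicit bin edges also sidesteps the float bin counts that np.floor produces in the question's code, since the number of bins never has to be computed.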
