How to use scientific notation in seaborn heatmap labels?

I'm trying to get a heatmap using seaborn in python. Unfortunately it is not using scientific notation even though the numbers are very large. I was wondering if there's any simple way to convert to scientific notation or any other reasonable format. Here's a piece of code that shows the problem:
import seaborn as sns
import numpy as np
C_vals = np.logspace(3, 10, 8)
g_vals = np.logspace(-6, 2, 9)
score = np.random.rand(len(g_vals), len(C_vals))
sns.heatmap(score, xticklabels=C_vals, yticklabels=g_vals)
The resulting plot is the following

The heatmap builds its tick labels directly from the values passed to xticklabels/yticklabels. Those values are placed along the axes as text, so there is no numeric formatter whose settings could change their appearance.
One option is to format the labels before supplying them to the heatmap. To this end, a matplotlib ScalarFormatter can be (mis)used to generate a MathText string from a float automatically. For example:
import matplotlib.pyplot as plt
import matplotlib.ticker as ticker
import seaborn as sns
import numpy as np
C_vals = np.logspace(3, 10, 8)
g_vals = np.logspace(-6, 2, 9)
score = np.random.rand(len(g_vals),len(C_vals))
tick = ticker.ScalarFormatter(useOffset=False, useMathText=True)
tick.set_powerlimits((0,0))
tc = [u"${}$".format(tick.format_data(x)) for x in C_vals]
tg = [u"${}$".format(tick.format_data(x)) for x in g_vals]
sns.heatmap(score, xticklabels=tc, yticklabels=tg)
plt.show()
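If the ScalarFormatter detour feels too indirect, the labels can also be built with ordinary string formatting. The following is a minimal sketch, not part of the original answer; sci_label is a hypothetical helper name, and it assumes finite, nonzero input values.

```python
def sci_label(value, digits=0):
    """Build a MathText label such as '$1\\times10^{3}$' (hypothetical helper)."""
    # Let Python's 'e' format do the rounding, then split mantissa and exponent.
    mantissa, exponent = "{:.{d}e}".format(value, d=digits).split("e")
    return r"${}\times10^{{{}}}$".format(mantissa, int(exponent))

print(sci_label(1000.0))            # $1\times10^{3}$
print(sci_label(1e-6))              # $1\times10^{-6}$
print(sci_label(2.5e7, digits=1))   # $2.5\times10^{7}$
```

The results can be passed to the heatmap in the same way as tc and tg, for example xticklabels=[sci_label(x) for x in C_vals].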

If you can do without sns.heatmap, it is perhaps more natural to do this with pcolormesh:
import matplotlib.pyplot as plt
import matplotlib.ticker as ticker
import numpy as np
C_vals = np.logspace(3, 10, 8)
g_vals = np.logspace(-6, 2, 9)
score = np.random.rand(len(g_vals),len(C_vals))
fig, ax = plt.subplots()
ax.pcolormesh(C_vals, g_vals, score)
ax.set_yscale('log')
ax.set_xscale('log')
plt.show()
As pointed out below, pcolormesh doesn't centre the cells the same way. Further, it actually drops a level. While I have a PR in to change that behaviour, here is a workaround. I admit that at this point it's not much more elegant than messing with the heatmap output.
import matplotlib.pyplot as plt
import matplotlib.ticker as ticker
import numpy as np
C_vals = np.logspace(3, 10, 8)
g_vals = np.logspace(-6, 2, 9)
# make bracketing:
def midpointext(x):
    return np.hstack((1.5 * x[0] - 0.5 * x[1],
                      x[:-1] + 0.5 * np.diff(x),
                      1.5 * x[-1] - 0.5 * x[-2]))
newC = np.log10(C_vals)
newC = midpointext(newC)
newC = 10**newC
newg = np.log10(g_vals)
newg = midpointext(newg)
newg = 10**newg
score = np.random.rand(len(g_vals),len(C_vals))
fig, ax = plt.subplots()
ax.pcolormesh(newC, newg, score)
ax.set_yscale('log')
ax.set_xscale('log')
plt.show()
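To see what midpointext does, it may help to run it on a short array. This is an illustrative sketch with made-up input values; it shows that N cell centres become N + 1 cell edges, which is the shape pcolormesh expects with flat shading if every cell is to be drawn.

```python
import numpy as np

def midpointext(x):
    # Midpoints between neighbours, plus one extrapolated edge at each end.
    return np.hstack((1.5 * x[0] - 0.5 * x[1],
                      x[:-1] + 0.5 * np.diff(x),
                      1.5 * x[-1] - 0.5 * x[-2]))

centres = np.array([3.0, 4.0, 5.0])   # e.g. log10 of 1e3, 1e4, 1e5
edges = midpointext(centres)
print(edges)  # [2.5 3.5 4.5 5.5]
```

Because the function is applied to the log10 values, the edges sit halfway between neighbouring centres on the log axis rather than on the linear one.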

Related

limit range of colorbar on bar graph in matplotlib

I've been attempting to limit the range on the colorbar function in matplotlib. For whatever reason, I cannot use the clim function. Ideally I would like 80 and 20 to be the max values of the colorbar, and all values above or below those values to be a single dark blue/red, and the entire colorbar to be fit within the range of 20 and 80.
import requests
from bs4 import BeautifulSoup
import pandas as pd
import matplotlib.pyplot as plt
from matplotlib.cm import ScalarMappable
import matplotlib as mpl
import numpy as np
Gpercent=40
xGpercent = 60
SCFpercent = 55
CFpercent = 45
Analytics = ['GF%','xGF%','SCF%','CF%']
AnalyticsValues = [Gpercent,xGpercent,SCFpercent,CFpercent]
AnalyticsValues = [float(val) for val in AnalyticsValues]
data_height_normalized = [x / 100 for x in AnalyticsValues]
fig, ax = plt.subplots(figsize=(15, 4))
#my_cmap = plt.get_cmap('RdBu')
my_cmap = plt.get_cmap('coolwarm_r')
colors = my_cmap(data_height_normalized)
rects = ax.bar(Analytics, AnalyticsValues, color=colors)
sm = ScalarMappable(cmap=my_cmap, norm=plt.Normalize(0,100))
plt.ylim(0, 100)
cbar = plt.colorbar(sm, ax=ax)
plt.yticks(np.arange(0, 100.8, 10))
plt.title('bob' + (" On Ice 5v5 Impact"))
plt.xlabel('Analytical Metric')
plt.ylabel('%')
fig.patch.set_facecolor('xkcd:white')
plt.show()
The plot comes out as follows. I'd like the colorbar to be more defined in a shorter range, while still showing the % from 0-100

As I understand it, the intent of your question is to add an upper and a lower limit to the color bar only, with the lower limit at 20 and the upper limit at 80. I will answer on that understanding.
The gist of the code is to create a new colormap from the defined colormap using LinearSegmentedColormap with the upper and lower color range.
My answer was modified from this excellent answer to fit your assignment.
import pandas as pd
import matplotlib.pyplot as plt
from matplotlib.cm import ScalarMappable
from matplotlib.colors import LinearSegmentedColormap # add
import matplotlib as mpl
import numpy as np
Gpercent=40
xGpercent = 60
SCFpercent = 55
CFpercent = 45
Analytics = ['GF%','xGF%','SCF%','CF%']
AnalyticsValues = [Gpercent,xGpercent,SCFpercent,CFpercent]
AnalyticsValues = [float(val) for val in AnalyticsValues]
data_height_normalized = [x / 100 for x in AnalyticsValues]
fig, ax = plt.subplots(figsize=(15, 4))
#my_cmap = plt.get_cmap('RdBu')
my_cmap = plt.get_cmap('coolwarm_r')
colors = my_cmap(data_height_normalized)
rects = ax.bar(Analytics, AnalyticsValues, color=colors)
# update
vmin,vmax = 20,80
colors2 = my_cmap(np.linspace(1.-(vmax-vmin)/float(vmax), 1, my_cmap.N))
color_map = LinearSegmentedColormap.from_list('cut_coolwarm', colors2)
sm = ScalarMappable(cmap=color_map, norm=plt.Normalize(vmin, vmax))
plt.ylim(0, 100)
cbar = plt.colorbar(sm, ax=ax)
plt.yticks(np.arange(0, 100.8, 10))
plt.title('bob' + (" On Ice 5v5 Impact"))
plt.xlabel('Analytical Metric')
plt.ylabel('%')
fig.patch.set_facecolor('xkcd:white')
plt.show()
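Another way to read the question is to keep the original colormap and change only the normalization, so that 20 and 80 map to the two ends of the colormap and anything outside is clipped to the end colors. The following is a sketch under that assumption, not the method from the answer above; the Agg backend line is only there so the example runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend for this sketch; drop it for interactive use
import matplotlib.pyplot as plt
from matplotlib.cm import ScalarMappable
from matplotlib.colors import Normalize

labels = ['GF%', 'xGF%', 'SCF%', 'CF%']
values = [40.0, 60.0, 55.0, 45.0]

# One norm shared by the bars and the colorbar: 20 -> 0.0, 80 -> 1.0, clipped outside.
norm = Normalize(vmin=20, vmax=80, clip=True)
cmap = plt.get_cmap('coolwarm_r')
colors = cmap(norm(values))

fig, ax = plt.subplots(figsize=(15, 4))
ax.bar(labels, values, color=colors)
ax.set_ylim(0, 100)  # the y-axis still shows 0-100 %
fig.colorbar(ScalarMappable(norm=norm, cmap=cmap), ax=ax, extend='both')
```

Sharing one Normalize instance keeps the bar colors and the colorbar consistent, which the version with a separately rescaled colormap does not guarantee.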

Create dynamic footnote text in matplotlib

I would like to be able to add footnote text similar to the following in matplotlib:
The following code will create a plot with similar text
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
fig, ax = plt.subplots(figsize = (5, 8))
n = 10
np.random.seed(1)
_ = ax.scatter(np.random.randint(0, 10, n), np.random.randint(0, 10, n), s=500)
x = 0
y = 1
_ = ax.text(
x, y, "hello this is some text at the bottom of the plot", fontsize=15, color="#555"
)
Which looks as:
However, if the data changes then the above won't adjust, such as:
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
fig, ax = plt.subplots(figsize=(5, 8))
n = 10
np.random.seed(2)
_ = ax.scatter(np.random.randint(0, 10, n), np.random.randint(0, 10, n), s=500)
x = 0
y = 1
_ = ax.text(
x, y, "hello this is some text at the bottom of the plot", fontsize=15, color="#555"
)
I have seen this question/answer, and this just says how to plot text at a particular x,y coordinate. I specifically want to be able to set a footnote though, not plot at a particular x,y, so the solution should be dynamic.
Also, use of the OOP interface is preferred as mentioned in the docs.
Note - there seem to be issues with the current suggestion when using fig.tight_layout()

You should try plotting the text relative to the subplot, not relative to the points in the subplot, by using transform=ax.transAxes. You should also set the alignment so that the text starts at the location you want. You can play around with the point location.
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
fig, ax = plt.subplots(figsize=(5, 8))
n = 10
np.random.seed(2)
_ = ax.scatter(np.random.randint(0, 10, n), np.random.randint(0, 10, n), s=500)
x = 0
y = -.07
ax.text(x, y, "hello this is some text at the bottom of the plot", fontsize=15,
horizontalalignment='left',verticalalignment='top', transform=ax.transAxes)
plt.show()
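A variation on the same idea, offered as a sketch rather than a tested fix for the tight_layout issue mentioned in the question: ax.annotate can anchor the text to the lower-left corner of the axes and shift it down by a fixed number of points, so the gap does not scale with the axes height. The 30-point offset is an arbitrary assumption.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend for this sketch; drop it for interactive use
import matplotlib.pyplot as plt
import numpy as np

fig, ax = plt.subplots(figsize=(5, 8))
rng = np.random.default_rng(2)
ax.scatter(rng.integers(0, 10, 10), rng.integers(0, 10, 10), s=500)

footnote = ax.annotate(
    "hello this is some text at the bottom of the plot",
    xy=(0, 0), xycoords="axes fraction",          # anchor: lower-left corner of the axes
    xytext=(0, -30), textcoords="offset points",  # shift 30 points below the anchor
    ha="left", va="top", fontsize=15, color="#555",
)
```

The anchor is expressed in axes fractions, so it does not move when the data or the axis limits change.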

Show decimal places and scientific notation on the axis of a matplotlib plot

I am plotting some big numbers with matplotlib in a pyqt program using python 2.7. I have a y-axis that ranges from 1e+18 to 3e+18 (usually). I'd like to see each tick mark show values in scientific notation and with 2 decimal places. For example 2.35e+18 instead of just 2e+18 because values between 2e+18 and 3e+18 still read just 2e+18 for a few tickmarks. Here is an example of that problem.
import numpy as np
import matplotlib.pyplot as plt
fig = plt.figure()
ax = fig.add_subplot(111)
x = np.linspace(0, 300, 20)
y = np.linspace(0,300, 20)
y = y*1e16
ax.plot(x,y)
ax.get_xaxis().set_major_formatter(plt.LogFormatter(10, labelOnlyBase=False))
ax.get_yaxis().set_major_formatter(plt.LogFormatter(10, labelOnlyBase=False))
plt.show()

This is really easy to do if you use the matplotlib.ticker.FormatStrFormatter as opposed to the LogFormatter. The following code will label everything with the format '%.2e':
import numpy as np
import matplotlib.pyplot as plt
import matplotlib.ticker as mtick
fig = plt.figure()
ax = fig.add_subplot(111)
x = np.linspace(0, 300, 20)
y = np.linspace(0,300, 20)
y = y*1e16
ax.plot(x,y)
ax.yaxis.set_major_formatter(mtick.FormatStrFormatter('%.2e'))
plt.show()
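FormatStrFormatter applies an old-style % format string to each tick value, so the effect of '%.2e' can be checked without drawing anything. A small illustration with made-up tick values:

```python
tick_values = [1.0e18, 2.35e18, 3.0e18]

# The same format string that is handed to FormatStrFormatter above.
labels = ['%.2e' % v for v in tick_values]
print(labels)  # ['1.00e+18', '2.35e+18', '3.00e+18']
```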

In order to get nicely formatted labels in scientific notation one may use the formatting capabilities of a ScalarFormatter which uses MathText (Latex) and apply it to the labels.
import matplotlib.pyplot as plt
import numpy as np
import matplotlib.ticker as mticker
fig, ax = plt.subplots()
x = np.linspace(0, 300, 20)
y = np.linspace(0,300, 20)
y = y*1e16
ax.plot(x,y)
f = mticker.ScalarFormatter(useOffset=False, useMathText=True)
g = lambda x,pos : "${}$".format(f._formatSciNotation('%1.10e' % x))
plt.gca().yaxis.set_major_formatter(mticker.FuncFormatter(g))
plt.show()
While this may be useful in a lot of cases, it does not actually meet the requirements of the question. To have equal digits on all labels a more customized version can be used.
import matplotlib.pyplot as plt
import numpy as np
import matplotlib.ticker as mticker
fig, ax = plt.subplots()
x = np.linspace(0, 300, 20)
y = np.linspace(0,300, 20)
y = y*1e16
ax.plot(x,y)
class MathTextSciFormatter(mticker.Formatter):
    def __init__(self, fmt="%1.2e"):
        self.fmt = fmt
    def __call__(self, x, pos=None):
        s = self.fmt % x
        decimal_point = '.'
        positive_sign = '+'
        tup = s.split('e')
        significand = tup[0].rstrip(decimal_point)
        sign = tup[1][0].replace(positive_sign, '')
        exponent = tup[1][1:].lstrip('0')
        if exponent:
            exponent = '10^{%s%s}' % (sign, exponent)
        if significand and exponent:
            s = r'%s{\times}%s' % (significand, exponent)
        else:
            s = r'%s%s' % (significand, exponent)
        return "${}$".format(s)
# Format with 2 decimal places
plt.gca().yaxis.set_major_formatter(MathTextSciFormatter("%1.2e"))
plt.show()

matplotlib hist() autocropping range

I am trying to make a histogram over a specific range, but the matplotlib.pyplot.hist() function keeps cropping the range to the bins with entries in them. A toy example:
import numpy as np
import matplotlib.pyplot as plt
x = np.random.uniform(-100,100,1000)
nbins = 100
xmin = -500
xmax = 500
fig = plt.figure()
ax = fig.add_subplot(1, 1, 1)
ax.hist(x, bins=nbins,range=[xmin,xmax])
plt.show()
Gives a plot with a range [-100,100]. Why is the range not [-500,500] as specified?
(I am using the Enthought Canopy 1.4 and sorry but I do not have a high enough rep to post an image of the plot.)

Actually, it works if you pass range an interval narrower than [-100, 100]. For example, this works:
import numpy as np
import matplotlib.pyplot as plt
x = np.random.uniform(-100, 100, 1000)
plt.hist(x, bins=30, range=(-50, 50))
plt.show()
If you want to plot the histogram on a range larger than [x.min(), x.max()], you can change the xlim property of the plot.
import numpy as np
import matplotlib.pyplot as plt
x = np.random.uniform(-100, 100, 1000)
plt.hist(x, bins=30)
plt.xlim(-500, 500)
plt.show()
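If the bins themselves should also span the larger interval, range and the axis limits can be combined. A minimal sketch using the object-oriented interface; the seed is arbitrary and the Agg backend line is only there so the example runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend for this sketch; drop it for interactive use
import matplotlib.pyplot as plt
import numpy as np

rng = np.random.default_rng(0)
x = rng.uniform(-100, 100, 1000)
xmin, xmax, nbins = -500, 500, 100

fig, ax = plt.subplots()
# range controls where the bin edges go: 100 bins spread over [-500, 500].
counts, edges, patches = ax.hist(x, bins=nbins, range=(xmin, xmax))
# set_xlim controls the visible interval, independent of any autoscaling.
ax.set_xlim(xmin, xmax)
```

Here range decides the binning and set_xlim decides the view, so the two settings do not depend on each other.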

The following code sets the same y-axis limits on two subplots:
f, ax = plt.subplots(1, 2, figsize=(30, 13), gridspec_kw={'width_ratios': [5, 1]})
df.plot(ax=ax[0], linewidth=2.5)
ylim = [df['min_return'].min() * 1.1, df['max_return'].max() * 1.1]
ax[0].set_ylim(ylim)
# density=True replaces the normed=1 argument, which newer Matplotlib no longer accepts
ax[1].hist(data, density=True, bins=num_bin, color='yellow', alpha=1)
ax[1].set_ylim(ylim)

changing default x range in histogram matplotlib

I would like to change the default x range for the histogram plot. The range of the data is from 7 to 12. However, by default the histogram starts right at 7 and ends at 13. I want it to start at 6.5 and end at 12.5, while the ticks should go from 7 to 12. How do I do it?
import asciitable
import numpy as np
import matplotlib.pyplot as plt
import matplotlib.mlab as mlab
import pylab
from pylab import xticks
data = asciitable.read(file)
hmag = data['col8']
visits = data['col14']
origin = data['col13']
n, bins, patches = plt.hist(hmag, 30, facecolor='gray', align='mid')
xticks(range(7,13))
pylab.rc("axes", linewidth=8.0)
pylab.rc("lines", markeredgewidth=2.0)
plt.xlabel('H mag', fontsize=14)
plt.ylabel('# of targets', fontsize=14)
pylab.xticks(fontsize=15)
pylab.yticks(fontsize=15)
plt.grid(True)
plt.savefig('hmag_histogram.eps', facecolor='w', edgecolor='w', format='eps')
plt.show()

plt.hist(hmag, 30, range=[6.5, 12.5], facecolor='gray', align='mid')

import matplotlib.pyplot as plt
...
plt.xlim(6.5, 12.5)
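If the goal is one bar per magnitude with ticks at 7 to 12, explicit bin edges at the half-integers are another option. This is a sketch with synthetic data standing in for the hmag column (the original file is not available), and it swaps the 30 bins of the question for 6 unit-width bins, which is an assumption about the intent.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend for this sketch; drop it for interactive use
import matplotlib.pyplot as plt
import numpy as np

rng = np.random.default_rng(0)
hmag = rng.uniform(7, 12, 200)        # synthetic stand-in for data['col8']

edges = np.arange(6.5, 13.0, 1.0)     # 6.5, 7.5, ..., 12.5
fig, ax = plt.subplots()
ax.hist(hmag, bins=edges, facecolor='gray')
ax.set_xlim(6.5, 12.5)
ax.set_xticks(list(range(7, 13)))     # ticks at 7, 8, ..., 12
```

With these edges each integer magnitude sits in the middle of its own bar.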

The following code sets the same y-axis limits on two subplots:
f, ax = plt.subplots(1, 2, figsize=(30, 13), gridspec_kw={'width_ratios': [5, 1]})
df.plot(ax=ax[0], linewidth=2.5)
ylim = [lower_limit, upper_limit]
ax[0].set_ylim(ylim)
# density=True replaces the normed=1 argument, which newer Matplotlib no longer accepts
ax[1].hist(data, density=True, bins=num_bin, color='yellow', alpha=1)
ax[1].set_ylim(ylim)
Just a reminder: with plt.hist(range=[low, high]), the histogram auto-crops the range if the specified range is larger than the max and min of the data points. So if you want to specify the y-axis range, I prefer to use set_ylim.
