The axes labels do not align with the matrix when using matshow

I used the following code to create the attached plot:
fig, ax = plt.subplots()
fig.set_figheight(50)
fig.set_figwidth(50)
ax.matshow(power_final_for_plotting, cmap='GnBu', origin='upper')
ax.set_xticks(time_periods)
ax.set_xticklabels(time_periods)
ax.set_yticks(sig_wave_height)
ax.set_yticklabels(sig_wave_height)
for i in range(len(time_periods)):
    for j in range(len(amplitude)):
        c = round(power_final_for_plotting[j, i],3)
        ax.text(i, j, str(c), va='center', ha='center', size=27)
plt.tight_layout()
Here time_periods and sig_wave_height are lists of integers. The axis labels do not align properly in this case (check the top right of the image to see the labels as they are). How can I fix this? The labels are really small in this case:

matshow places the center of each cell at an integer index (0, 1, 2, ...), so passing the data values themselves to set_xticks puts the ticks at those numeric positions instead of one per column; that is why the labels only reach 15 and do not line up with the cells. You need one tick position per column (30 here) and one label for each position. Since no data was provided, I have created sample data as appropriate. Also, the tick labels look small because the figure is 50 inches in size.
import numpy as np
import matplotlib.pyplot as plt
power_final_for_plotting = np.random.rand(450).reshape(15,30)
time_periods = np.arange(0,15,0.5)
sig_wave_height = np.arange(0.5,8.0,0.5)
fig, ax = plt.subplots()
fig.set_figheight(6)
fig.set_figwidth(12)
ax.matshow(power_final_for_plotting, cmap='GnBu', origin='upper')
ax.set_xticks(range(len(time_periods)))
ax.set_xticklabels([str(x) for x in time_periods])
ax.set_yticks(range(len(sig_wave_height)))
ax.set_yticklabels([str(x) for x in sig_wave_height])
for i in range(len(time_periods)):
    for j in range(len(sig_wave_height)):
        c = round(power_final_for_plotting[j, i],3)
        ax.text(i, j, str(c), va='center', ha='center', size=9)
plt.tight_layout()
#print(ax.get_xticklabels())
plt.show()
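
The index-based tick placement described above can be wrapped in a small helper. This is only a sketch: the name label_matrix_axes and its step parameter are hypothetical (not part of Matplotlib), and the data is random sample data as in the answer. With step=2 only every second column is labeled, which may help when 30 labels are too crowded.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np


def label_matrix_axes(ax, col_values, row_values, step=1):
    """Put ticks at integer cell indices and label them with the data values.

    Hypothetical helper: step > 1 labels only every step-th column.
    """
    cols = np.arange(len(col_values))[::step]
    rows = np.arange(len(row_values))
    ax.set_xticks(cols)
    ax.set_xticklabels([str(col_values[i]) for i in cols])
    ax.set_yticks(rows)
    ax.set_yticklabels([str(v) for v in row_values])


data = np.random.rand(15, 30)                # sample data, 15 rows x 30 columns
time_periods = np.arange(0, 15, 0.5)         # 30 values, one per column
sig_wave_height = np.arange(0.5, 8.0, 0.5)   # 15 values, one per row

fig, ax = plt.subplots(figsize=(12, 6))
ax.matshow(data, cmap="GnBu")
label_matrix_axes(ax, time_periods, sig_wave_height, step=2)
```

The tick positions are always cell indices; only the label text comes from time_periods and sig_wave_height.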

Related

adjusting horizontal bar chart matplotlib to accommodate the bars

I am making a horizontal bar chart but struggling with adjusting ylim, or maybe another parameter, to make my labels clearer and make all the labels fit the y axis. I played around with ylim, and the text size can be bigger or smaller, but the bars do not fit the y axis. Any idea about the right approach?
My code:
import matplotlib.pyplot as plt #we load the library that contains the plotting capabilities
from operator import itemgetter
D=[]
for att, befor, after in zip(df_portion['attributes'], df_portion['2005_2011 (%)'], df_portion['2012_2015 (%)']):
    i=(att, befor, after)
    D.append(i)
Dsort = sorted(D, key=itemgetter(1), reverse=False) #sort the list in order of usage
attri = [x[0] for x in Dsort]
aft = [x[1] for x in Dsort]
bef = [x[2] for x in Dsort]
ind = np.arange(len(attri))
width=3
ax = plt.subplot(111)
ax.barh(ind, aft, width,align='center',alpha=1, color='r', label='from 2012 to 2015') #a horizontal bar chart (use .bar instead of .barh for vertical)
ax.barh(ind - width, bef, width, align='center', alpha=1, color='b', label='from 2005 to 2008') #a horizontal bar chart (use .bar instead of .barh for vertical)
ax.set(yticks=ind, yticklabels=attri,ylim=[1, len(attri)/2])
plt.xlabel('Frequency distribution (%)')
plt.title('Frequency distribution (%) of common attributes between 2005_2008 and between 2012_2015')
plt.legend()
plt.show()
This is the plot for above code

To make the labels fit, you need to set a smaller fontsize, or use a larger figsize. Changing the ylim will either just show a subset of the bars (in case ylim is set too narrow), or will show more whitespace (when ylim is larger).
The biggest problem in the code is width being too large. Twice the width needs to fit over a distance of 1.0 (the ticks are placed via ind, which is an array 0,1,2,...). As matplotlib calls the thickness of a horizontal bar plot "height", this name is used in the example code below. Using align='edge' lets you position the bars directly (align='center' will move them half their "height").
Pandas has simple functions to sort dataframes according to one or more columns.
Code to illustrate the ideas:
import matplotlib.pyplot as plt
import pandas as pd
import numpy as np
# first create some test data
df = pd.DataFrame({'attributes': ["alpha", "beta", "gamma", "delta", "epsilon", "zeta", "eta", "theta", "iota",
                                  "kappa", "lambda", "mu", "nu", "xi", "omikron", "pi", "rho", "sigma", "tau",
                                  "upsilon", "phi", "chi", "psi", "omega"]})
totals_2005_2011 = np.random.uniform(100, 10000, len(df))
totals_2012_2015 = totals_2005_2011 * np.random.uniform(0.70, 2, len(df))
df['2005_2011 (%)'] = totals_2005_2011 / totals_2005_2011.sum() * 100
df['2012_2015 (%)'] = totals_2012_2015 / totals_2012_2015.sum() * 100
# sort all rows via the '2005_2011 (%)' column, sort from large to small
df = df.sort_values('2005_2011 (%)', ascending=False)
ind = np.arange(len(df))
height = 0.3 # two times height needs to be at most 1
fig, ax = plt.subplots(figsize=(12, 6))
ax.barh(ind, df['2012_2015 (%)'], height, align='edge', alpha=1, color='crimson', label='from 2012 to 2015')
ax.barh(ind - height, df['2005_2011 (%)'], height, align='edge', alpha=1, color='dodgerblue', label='from 2005 to 2011')
ax.set_yticks(ind)
ax.set_yticklabels(df['attributes'], fontsize=10)
ax.grid(axis='x')
ax.set_xlabel('Frequency distribution (%)')
ax.set_title('Frequency distribution (%) of common attributes between 2005_2011 and between 2012_2015')
ax.legend()
ax.margins(y=0.01) # use smaller margins in the y-direction
plt.tight_layout()
plt.show()
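
The rule that the bar heights of one group must fit within a distance of 1.0 can be generalized to any number of series. The sketch below is an illustration under that assumption; bar_offsets and group_height are hypothetical names, not Matplotlib API.

```python
import numpy as np


def bar_offsets(n_series, group_height=0.8):
    """Return (height, offsets) so that n_series bars share one tick slot.

    With align='edge', bar k of a group starts at tick + offsets[k]. The group
    is centered on the tick and spans group_height; a value below 1 leaves a
    gap between neighboring groups.
    """
    height = group_height / n_series
    offsets = -group_height / 2 + height * np.arange(n_series)
    return height, offsets


height, offsets = bar_offsets(2)
print(height, offsets)
```

Each series k would then be drawn with something like ax.barh(ind + offsets[k], values_k, height, align='edge'), where values_k stands for that series' data.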
The seaborn library has some functions to create barplots with multiple bars per attribute, without the need to manually fiddle with bar positions. Seaborn prefers its data in "long form", which can be created via pandas' melt().
Example code:
import seaborn as sns
df = df.sort_values('2005_2011 (%)', ascending=True)
df_long = df.melt(id_vars='attributes', value_vars=['2005_2011 (%)', '2012_2015 (%)'],
                  var_name='period', value_name='distribution')
fig, ax = plt.subplots(figsize=(12, 6))
sns.barplot(data=df_long, y='attributes', x='distribution', hue='period', palette='turbo', ax=ax)
ax.set_xlabel('Frequency distribution (%)')
ax.set_title('Frequency distribution (%) of common attributes between 2005_2011 and between 2012_2015')
ax.grid(axis='x')
ax.tick_params(axis='y', labelsize=12)
sns.despine()
plt.tight_layout()
plt.show()
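
To see what "long form" means, here is a minimal sketch of melt() on a two-row frame. The numbers are made up for illustration; only the column names follow the example above.

```python
import pandas as pd

# made-up wide-form data: one row per attribute, one column per period
wide = pd.DataFrame({
    "attributes": ["alpha", "beta"],
    "2005_2011 (%)": [60.0, 40.0],
    "2012_2015 (%)": [55.0, 45.0],
})

# long form: one row per (attribute, period) pair
long = wide.melt(id_vars="attributes", var_name="period", value_name="distribution")
print(long)
```

Every value column becomes a set of rows, with the former column name stored in 'period'; this is the shape that the hue= argument of sns.barplot works with.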

Parasite x-axis in loglog plot

I have a graph where the x-axis is the temperature in GeV, but I also need to show the temperature in Kelvin for reference, so I thought of adding a parasite axis with the temperature in K. I tried to follow the answer to "How to add a second x-axis in matplotlib"; my code is below. I get a second axis at the top of my graph, but it does not show the temperature in K as I need.
import numpy as np
import matplotlib.pyplot as plt
tt = np.logspace(-14,10,100)
yy = np.logspace(-10,-2,100)
fig = plt.figure()
ax1 = fig.add_subplot(111)
ax2 = ax1.twiny()
ax1.loglog(tt,yy)
ax1.set_xlabel('Temperature (GeV)')
new_tick_locations = np.array([.2, .5, .9])
def tick_function(X):
    V = X*1.16e13
    return ["%.1f" % z for z in V]
ax2.set_xlim(ax1.get_xlim())
ax2.set_xticks(new_tick_locations)
ax2.set_xticklabels(tick_function(new_tick_locations))
ax2.set_xlabel('Temp (Kelvin)')
plt.show()
This is what I get when I run the code.
I need the parasite axis to be proportional to the original x-axis, so that anyone who sees the graph can easily read the temperature in Kelvin. Thanks in advance.

A general purpose solution may look as follows. Since you have a non-linear scale, the idea is to find the positions of nice ticks in Kelvin, convert to GeV, set the positions in units of GeV, but label them in units of Kelvin. This sounds complicated, but the advantage is that you do not need to find the ticks yourself, just rely on matplotlib for finding them.
What this requires, though, is the functional dependence between the two scales, i.e. the conversion between GeV and Kelvin and its inverse.
import numpy as np
import matplotlib.pyplot as plt
import matplotlib.ticker as mticker
tt = np.logspace(-14,10,100)
yy = np.logspace(-10,-2,100)
fig = plt.figure()
ax1 = fig.add_subplot(111)
ax2 = ax1.twiny()
plt.setp([ax1,ax2], xscale="log", yscale="log")
ax1.get_shared_x_axes().join(ax1, ax2)
ax1.plot(tt,yy)
ax1.set_xlabel('Temperature (GeV)')
ax2.set_xlabel('Temp (Kelvin)')
fig.canvas.draw()
# 1 GeV == 1.16 × 10^13 Kelvin
Kelvin2GeV = lambda k: k / 1.16e13
GeV2Kelvin = lambda gev: gev * 1.16e13
loc = mticker.LogLocator()
locs = loc.tick_values(*GeV2Kelvin(np.array(ax1.get_xlim())))
ax2.set_xticks(Kelvin2GeV(locs))
ax2.set_xlim(ax1.get_xlim())
f = mticker.ScalarFormatter(useOffset=False, useMathText=True)
g = lambda x,pos : "${}$".format(f._formatSciNotation('%1.10e' % GeV2Kelvin(x)))
fmt = mticker.FuncFormatter(g)
ax2.xaxis.set_major_formatter(fmt)  # fmt is already a FuncFormatter
plt.show()
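
As an alternative sketch, Matplotlib 3.1 and later provide Axes.secondary_xaxis, which takes the forward and inverse conversion functions directly. This is not the method used in the answer above; it may be useful if some of the calls in that snippet (such as the shared-axes join or the private _formatSciNotation) are not available in your Matplotlib version. The function names gev_to_kelvin and kelvin_to_gev are chosen here for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np

GEV_TO_KELVIN = 1.16e13  # conversion factor quoted in the answer above


def gev_to_kelvin(gev):
    return gev * GEV_TO_KELVIN


def kelvin_to_gev(kelvin):
    return kelvin / GEV_TO_KELVIN


tt = np.logspace(-14, 10, 100)
yy = np.logspace(-10, -2, 100)

fig, ax = plt.subplots()
ax.loglog(tt, yy)
ax.set_xlabel("Temperature (GeV)")

# the top axis follows the bottom one through the two conversion functions
secax = ax.secondary_xaxis("top", functions=(gev_to_kelvin, kelvin_to_gev))
secax.set_xlabel("Temp (Kelvin)")
fig.canvas.draw()  # the secondary axis picks up the parent's limits when drawn
```

Because the secondary axis is derived from the primary one, its limits should stay in sync when the plot is zoomed or rescaled.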

The problem appears to be the following: When you use ax2.set_xlim(ax1.get_xlim()), you are basically setting the limit of upper x-axis to be the same as that of the lower x-axis. Now if you do
print(ax1.get_xlim())
print(ax2.get_xlim())
you get for both axes the same values as
(6.309573444801943e-16, 158489319246.11108)
(6.309573444801943e-16, 158489319246.11108)
but your lower x-axis has a logarithmic scale. When you assign the limits using ax2.set_xlim(), the limits of ax2 are the same but its scale is still linear. That's why when you set the ticks at [.2, .5, .9], these values appear as ticks on the far left of the upper x-axis as in your figure.
The solution is to set the upper x-axis also to be a logarithmic scale. This is required because your new_tick_locations corresponds to the actual values on the lower x-axis. You just want to rename these values to show the ticklabels in Kelvin. It is clear from your variable names that new_tick_locations corresponds to the new tick locations. I use some modified values of new_tick_locations to highlight the problem.
I am using scientific formatting '%.0e' because 1 GeV = 1.16e13 K and so 0.5 GeV would be a very large value with many zeros.
Below is a sample answer:
import numpy as np
import matplotlib.pyplot as plt
import matplotlib.ticker as mtick
tt = np.logspace(-14,10,100)
yy = np.logspace(-10,-2,100)
fig = plt.figure()
ax1 = fig.add_subplot(111)
ax2 = ax1.twiny()
ax1.loglog(tt,yy)
ax1.set_xlabel('Temperature (GeV)')
new_tick_locations = np.array([0.000002, 0.05, 9000])
def tick_function(X):
    # convert GeV to Kelvin and format the result in scientific notation
    V = X*1.16e13
    return ["%.0e" % z for z in V]
ax2.set_xscale('log') # Setting the logarithmic scale
ax2.set_xlim(ax1.get_xlim())
ax2.set_xticks(new_tick_locations)
# no formatter is set afterwards: it would replace these labels with the raw GeV tick values
ax2.set_xticklabels(tick_function(new_tick_locations))
ax2.set_xlabel('Temp (Kelvin)')
plt.show()
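
The diagnosis above (same limits, different scale) can be checked with a short sketch like the following, which prints the x-scale of both axes before and after the fix. The variable name scale_before is introduced here only for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np

fig, ax1 = plt.subplots()
ax2 = ax1.twiny()  # shares the y-axis only; the x-axis is independent
ax1.loglog(np.logspace(-14, 10, 100), np.logspace(-10, -2, 100))

scale_before = ax2.get_xscale()
print(ax1.get_xscale(), scale_before)  # bottom axis vs. top axis

ax2.set_xscale("log")            # make the top axis logarithmic as well
ax2.set_xlim(ax1.get_xlim())     # now identical limits mean identical positions
print(ax2.get_xscale())
```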

Polar plot - Put one grid line in bold

I am trying to use the polar projection to make a radar chart. I would like to know how to make only one grid line bold (while the others remain standard).
For my specific case, I would like to highlight the gridline associated to the ytick "0".
from matplotlib import pyplot as plt
import pandas as pd
import numpy as np
#Variables
sespi = pd.read_csv("country_progress.csv")
labels = sespi.country
progress = sespi.progress
angles=np.linspace(0, 2*np.pi, len(labels), endpoint=False)
#Concatenation to close the plots
progress=np.concatenate((progress,[progress[0]]))
angles=np.concatenate((angles,[angles[0]]))
#Polar plot
fig=plt.figure()
ax = fig.add_subplot(111, polar=True)
ax.plot(angles, progress, '.--', linewidth=1, c="g")
#ax.fill(angles, progress, alpha=0.25)
ax.set_thetagrids(angles * 180/np.pi, labels)
ax.set_yticklabels([-200,-150,-100,-50,0,50,100,150,200])
#ax.set_title()
ax.grid(True)
plt.show()

The gridlines of a plot are Line2D objects. Therefore you can't make it bold. What you can do (as shown, in part, in the other answer) is to increase the linewidth and change the colour but rather than plot a new line you can do this to the specified gridline.
You first need to find the index of the y tick labels which you want to change:
y_tick_labels = [-100,-10,0,10]
ind = y_tick_labels.index(0) # find index of value 0
You can then get a list of the gridlines using gridlines = ax.yaxis.get_gridlines(). Then use the index you found previously on this list to change the properties of the correct gridline.
Using the example from the gallery as a basis, a full example is shown below:
import numpy as np
import matplotlib.pyplot as plt
r = np.arange(0, 2, 0.01)
theta = 2 * np.pi * r
ax = plt.subplot(111, projection='polar')
ax.set_rmax(2)
ax.set_rticks([0.5, 1, 1.5, 2]) # less radial ticks
ax.set_rlabel_position(-22.5) # get radial labels away from plotted line
ax.grid(True)
y_tick_labels = [-100, -10, 0, 10]
ax.set_yticklabels(y_tick_labels)
ind = y_tick_labels.index(0) # find index of value 0
gridlines = ax.yaxis.get_gridlines()
gridlines[ind].set_color("k")
gridlines[ind].set_linewidth(2.5)
plt.show()
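
As a variation on the gridline approach above, the index can be looked up from the actual tick locations instead of a separate list of labels. This is a sketch, assuming the radial ticks are set to real data values; emphasize_radial_gridline is a hypothetical helper name, and the tick values are chosen only for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np


def emphasize_radial_gridline(ax, value, color="k", linewidth=2.5):
    """Thicken the radial gridline whose tick location is closest to value."""
    ticks = np.asarray(ax.get_yticks())
    ind = int(np.argmin(np.abs(ticks - value)))
    line = ax.yaxis.get_gridlines()[ind]
    line.set_color(color)
    line.set_linewidth(linewidth)
    return ind


ax = plt.subplot(111, projection="polar")
ax.set_rlim(-200, 200)
ax.set_rticks([-200, -100, 0, 100, 200])  # illustrative radial ticks
ax.grid(True)
ind = emphasize_radial_gridline(ax, 0)
```

Setting the ticks to the real values avoids relabeling them with set_yticklabels, so the highlighted line is the one that actually sits at r = 0.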

It is just a trick, but I guess you could just plot a circle and change its linewidth and color to whatever could be bold for you.
For example:
import matplotlib.pyplot as plt
import numpy as np
Yline = 0
Npoints = 300
angles = np.linspace(0,360,Npoints)*np.pi/180
line = 0*angles + Yline
ax = plt.subplot(111, projection='polar')
plt.plot(angles, line, color = 'k', linewidth = 3)
plt.ylim([-1,1])
plt.grid(True)
plt.show()
In this piece of code, I plot a line using plt.plot between any point of the two vectors angles and line. The former is actually all the angles between 0 and 2*np.pi. The latter is constant, and equal to the 'height' you want to plot that line Yline.
I suggest you try decreasing and increasing Npoints while having a look at the documentation of np.linspace() in order to understand how it affects the roundness of the circle.

Midpoint of Color Palette [duplicate]

This question already has answers here:
Shifted colorbar matplotlib
(1 answer)
Defining the midpoint of a colormap in matplotlib
(10 answers)
Closed 5 years ago.
For my current project I need a heat map. The heat map needs a scalable color palette, because the values are interesting only in a small range. That means, even if I have values from 0 to 1, interesting is only the part between 0.6 and 0.9; so I would like to scale the heat map colors accordingly, plus show the scale next to the chart.
In Matplotlib I had no way of setting the mid point of a color palette except for overloading the original class, like shown here in the matplotlib guide.
This is exactly what I need, but without the disadvantages of the unclean data structure in Matplotlib.
So I tried Bokeh.
In five minutes I achieved more than with Matplotlib in an hour, however, I got stuck when I wanted to show the color scale next to the heatmap and when I wanted to change the scale of the color palette.
So, here are my questions:
How can I scale the color palette in Bokeh or Matplotlib?
Is there a way to display the annotated color bar next to the heatmap?
import pandas as pd
scores_df = pd.DataFrame(myScores, index=c_range, columns=gamma_range)
import bkcharts
from bokeh.palettes import Inferno256
hm = bkcharts.HeatMap(scores_df, palette=Inferno256)
# here: how to insert a color bar?
# here: how to correctly scale the inferno256 palette?
hm.ylabel = "C"
hm.xlabel = "gamma"
bkcharts.output_file('heatmap.html')
Following Aaron's tips, I have now implemented it as follows:
import numpy as np
import matplotlib.pyplot as plt
import matplotlib.colors as colors
from bokeh.palettes import Inferno256
def print_scores(scores, gamma_range, C_range):
    # load a color map
    # find other colormaps here
    # https://docs.bokeh.org/en/latest/docs/reference/palettes.html
    cmap = colors.ListedColormap(Inferno256)
    fig, ax = plt.subplots(1, 1, figsize=(6, 5))
    # adjust lower, middle and upper bound of the colormap
    cmin = np.percentile(scores, 10)
    cmid = np.percentile(scores, 75)
    cmax = np.percentile(scores, 99)
    bounds = np.append(np.linspace(cmin, cmid), np.linspace(cmid, cmax))
    norm = colors.BoundaryNorm(boundaries=bounds, ncolors=len(Inferno256))
    pcm = ax.pcolormesh(np.log10(gamma_range),
                        np.log10(C_range),
                        scores,
                        norm=norm,
                        cmap=cmap)
    fig.colorbar(pcm, ax=ax, extend='both', orientation='vertical')
    plt.show()

ImportanceOfBeingErnest correctly pointed out that my first comment wasn't entirely clear (or accurately worded).
Most plotting functions in mpl accept a norm= keyword argument. It takes an instance of a subclass of mpl.colors.Normalize that maps your data to the range [0, 1] for the purpose of looking up colors in the colormap, without changing the numerical values of the data. There are several built-in subclasses, and you can also create your own. For this application, I would probably use BoundaryNorm. This class maps N-1 evenly spaced colors to the intervals between N discrete boundaries.
I have modified the example slightly to better fit your application:
#adaptation of https://matplotlib.org/users/colormapnorms.html#discrete-bounds
import numpy as np
import matplotlib.pyplot as plt
import matplotlib.colors as colors
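# matplotlib.mlab.bivariate_normal was removed in Matplotlib 3.1; this local
# helper reproduces it for the uncorrelated case used below
def bivariate_normal(X, Y, sigmax=1.0, sigmay=1.0, mux=0.0, muy=0.0):
    z = ((X - mux) / sigmax)**2 + ((Y - muy) / sigmay)**2
    return np.exp(-z / 2) / (2 * np.pi * sigmax * sigmay)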
#example data
N = 100
X, Y = np.mgrid[-3:3:complex(0, N), -2:2:complex(0, N)]
Z1 = (bivariate_normal(X, Y, 1., 1., 1.0, 1.0))**2 \
- 0.4 * (bivariate_normal(X, Y, 1.0, 1.0, -1.0, 0.0))**2
Z1 = Z1/0.03
'''
BoundaryNorm: For this one you provide the boundaries for your colors,
and the Norm puts the first color in between the first pair, the
second color between the second pair, etc.
'''
fig, ax = plt.subplots(3, 1, figsize=(8, 8))
ax = ax.flatten()
# even bounds gives a contour-like effect
bounds = np.linspace(-1, 1)
norm = colors.BoundaryNorm(boundaries=bounds, ncolors=256)
pcm = ax[0].pcolormesh(X, Y, Z1,
norm=norm,
cmap='RdBu_r')
fig.colorbar(pcm, ax=ax[0], extend='both', orientation='vertical')
# clipped bounds emphasize particular region of data:
bounds = np.linspace(-.2, .5)
norm = colors.BoundaryNorm(boundaries=bounds, ncolors=256)
pcm = ax[1].pcolormesh(X, Y, Z1, norm=norm, cmap='RdBu_r')
fig.colorbar(pcm, ax=ax[1], extend='both', orientation='vertical')
# now if we want 0 to be white still, we must have 0 in the middle of our array
bounds = np.append(np.linspace(-.2, 0), np.linspace(0, .5))
norm = colors.BoundaryNorm(boundaries=bounds, ncolors=256)
pcm = ax[2].pcolormesh(X, Y, Z1, norm=norm, cmap='RdBu_r')
fig.colorbar(pcm, ax=ax[2], extend='both', orientation='vertical')
plt.show()
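
For the original question of placing the midpoint of a colormap, recent Matplotlib releases also ship a ready-made norm, colors.TwoSlopeNorm, which scales the ranges below and above a chosen center separately. The sketch below is an alternative to the BoundaryNorm approach, not part of the answer above; the scores array is random stand-in data and the center value 0.75 is an arbitrary choice for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
import matplotlib.colors as colors
import numpy as np

rng = np.random.default_rng(0)
scores = rng.uniform(0.0, 1.0, size=(10, 12))  # stand-in for the real score grid

# 0.75 is mapped to the middle of the colormap; [0, 0.75] and [0.75, 1]
# each get half of the colors
norm = colors.TwoSlopeNorm(vmin=0.0, vcenter=0.75, vmax=1.0)

fig, ax = plt.subplots(figsize=(6, 5))
pcm = ax.pcolormesh(scores, norm=norm, cmap="inferno")
fig.colorbar(pcm, ax=ax, orientation="vertical")
```

If only a sub-range such as 0.6 to 0.9 matters, passing vmin=0.6 and vmax=0.9 directly to pcolormesh (without a custom norm) is a simpler option, at the cost of clipping everything outside that range to the end colors.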

Rotating Matplotlib tick labels causes weird spacing issues

I have the following plot:
I would like to make the x-axis ticks more readable by rotating the ticks by ~40 degrees. So from:
plt.xticks(list(range(0, width)), list(df_100.columns), rotation='90', fontsize=16)
To:
plt.xticks(list(range(0, width)), list(df_100.columns), rotation='40', fontsize=16)
When I do this, though, I get some crazy spacing issues:
(ignore the change in color...)
What's causing this problem? How can I fix it? Here's a minimum working example:
import matplotlib.pyplot as plt
import numpy as np
# Z is your data set
N = 100
height = df_100.shape[0]
width = df_100.shape[1]
# Z = np.random.random((100, 29))
# G is a NxNx3 matrix
G = np.zeros((height,width,3))
# Where we set the RGB for each pixel
G[Z>0.5] = [1, 1, 1]
G[Z<0.5] = [0.25, 0.25, 0.25]
fig, ax = plt.subplots(figsize=(20, 10))
ax.imshow(G, interpolation='none')
ax.set_aspect('auto')
ax.grid(None)
ax.xaxis.tick_top()
plt.xticks(list(range(0, width)), list(df_100.columns), rotation='45', fontsize=16)
plt.yticks([0, df_100.shape[0] - 1], [1, df_100.shape[0]], fontsize=20)
plt.tight_layout()
plt.show()

If all the xticklabels have the same length, you won't see this kind of problem. With labels of different lengths, though, the spacing looks uneven, because by default each label is rotated about the center of its string. Try setting the horizontal alignment (ha) to one of 'right', 'center' or 'left' so that the labels are anchored consistently:
ha = 'left' # or 'right'. Experiment with it.
ax.set_xticks(x) # set tick location
ax.set_xticklabels(xlabels, rotation=40, ha=ha) # rotate the labels with proper anchoring.
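
A self-contained version of this idea might look as follows. It is a sketch with made-up labels and random image data; rotation_mode="anchor" is an additional Text property (not mentioned in the answer) that rotates each label about its anchor point rather than aligning the rotated bounding box, which tends to keep labels of different lengths attached to their ticks.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np

# made-up labels of very different lengths
labels = ["a", "medium label", "a considerably longer label", "b"]
x = np.arange(len(labels))

fig, ax = plt.subplots(figsize=(8, 4))
ax.imshow(np.random.rand(5, len(labels)), aspect="auto")
ax.xaxis.tick_top()
ax.set_xticks(x)  # one tick per column
ax.set_xticklabels(labels, rotation=40, ha="left", rotation_mode="anchor")
fig.tight_layout()
```

For ticks at the top, ha="left" makes the text start at the tick and run up and to the right; for ticks at the bottom, ha="right" is usually the matching choice.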
