With its default settings, seaborn.heatmap gives:
the x-axis starts from an origin of 0 and increases towards the right
the y-axis starts from an origin of 9 and decreases moving upward
This is odd compared to matplotlib.pyplot.pcolormesh, which gives a y-axis that starts from an origin of 0 that moves upward, like what we'd intuitively want since it only makes sense for origins to be (0,0), not (0,9)!
How to make the y-axis of heatmap also start from an origin of 0, instead of 9, moving upward? (while of course re-orienting the data correspondingly)
I tried transposing the input data, but this doesn't look right and the axes don't change. I don't think it's a flip about the y-axis that's needed, but a simple rotating of the heatmap.
You can flip the y-axis using ax.invert_yaxis():
import seaborn as sns
import numpy as np
np.random.seed(0)
sns.set_theme()
uniform_data = np.random.rand(10, 12)
ax = sns.heatmap(uniform_data)
ax.invert_yaxis()
If you want to do the rotation you describe, you have to transpose the matrix first:
import seaborn as sns
import numpy as np
np.random.seed(0)
sns.set_theme()
uniform_data = np.random.rand(10, 12)
ax = sns.heatmap(uniform_data.T)
ax.invert_yaxis()
The reason for the difference is that the two functions assume different coordinate systems. pcolormesh assumes you want to access the elements using Cartesian coordinates, i.e. [x, y], and displays them the way you would expect. heatmap assumes you want to access the elements using array coordinates, i.e. [row, col], so the heatmap it gives has the same layout as the array printed to the console.
Why do they use different coordinate systems? I'm speculating, but I think it's due to the ages of the two libraries. matplotlib, particularly its older commands, is modeled on MATLAB, so many of the assumptions are the same. seaborn was developed for Python much later, specifically for statistical visualization, and after pandas already existed. So I would guess that mwaskom chose the layout to replicate how a DataFrame looks when you print it to the screen.
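The layout difference described above can be checked directly. This is a minimal sketch, assuming a small integer array as stand-in data and the non-interactive Agg backend (both are my choices, not part of the original answer); it only inspects whether each axes has its y-axis inverted.

import matplotlib
matplotlib.use("Agg")  # assumption: headless backend so no display is needed
import matplotlib.pyplot as plt
import numpy as np
import seaborn as sns

data = np.arange(12).reshape(3, 4)  # stand-in data: row 0 holds the smallest values

fig, (ax_heat, ax_mesh) = plt.subplots(1, 2)
sns.heatmap(data, ax=ax_heat, cbar=False)  # array layout: row 0 drawn at the top
ax_mesh.pcolormesh(data)                   # Cartesian layout: row 0 drawn at the bottom

# heatmap draws on an inverted y-axis, pcolormesh does not
heatmap_inverted = ax_heat.yaxis_inverted()
mesh_inverted = ax_mesh.yaxis_inverted()
print(heatmap_inverted, mesh_inverted)

# Inverting once more puts row 0 back at the bottom, matching pcolormesh
ax_heat.invert_yaxis()
heatmap_after_flip = ax_heat.yaxis_inverted()

Because invert_yaxis() moves the data together with the tick labels, row 0 still carries the label 0 after the flip.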
You can put 0 at the lower left by passing your own tick labels through yticklabels. Note that this only relabels the ticks; it does not re-orient the data. Does this fit your question?
import seaborn as sns
import numpy as np
np.random.seed(0)
sns.set_theme()
uniform_data = np.random.rand(10, 12)
ax = sns.heatmap(uniform_data, yticklabels=[9,8,7,6,5,4,3,2,1,0])
I'm experimenting with seaborn and have a question about specifying axes properties. In my code below, I've taken two approaches to creating a heatmap of a matrix and placing the results on two sets of axes in a figure.
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
A=np.random.randn(4,4)
labels=['a','b','c','d']
fig, ax = plt.subplots(2)
sns.heatmap(ax =ax[0], data = A)
ax[0].set_xticks(range(len(labels)))
ax[0].set_xticklabels(labels,fontsize=10,rotation=45)
ax[0].set_yticks(range(len(labels)))
ax[0].set_yticklabels(labels,fontsize=10,rotation=45)
ax[1].set_xticks(range(len(labels)))
ax[1].set_xticklabels(labels,fontsize=10,rotation=45)
ax[1].set_yticks(range(len(labels)))
ax[1].set_yticklabels(labels,fontsize=10,rotation=45)
sns.heatmap(ax =ax[1], data = A,xticklabels=labels, yticklabels=labels)
plt.show()
The resulting figure looks like this:
Normally, I would always take the first approach of creating the heatmap and then specifying axis properties. However, when creating an animation (to be embedded on a tkinter canvas), which is what I'm ultimately interested in doing, I found such an ordering in my update function leads to "flickering" of axis labels. The second approach will eliminate this effect, and it also centers the tickmarks within squares along the axes.
However, the second approach does not rotate the y-axis tickmark labels as desired. Is there a simple fix to this?
I'm not sure this is what you're looking for. It looks like you create the heatmap after you set the yticklabels, so the heatmap call overwrites your yticklabels.
Below would fix your issue.
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
A=np.random.randn(4,4)
labels=['a','b','c','d']
fig, ax = plt.subplots(2)
sns.heatmap(ax =ax[0], data = A)
ax[0].set_xticks(range(len(labels)))
ax[0].set_xticklabels(labels,fontsize=10,rotation=45)
ax[0].set_yticks(range(len(labels)))
ax[0].set_yticklabels(labels,fontsize=10,rotation=45)
ax[1].set_xticks(range(len(labels)))
ax[1].set_xticklabels(labels,fontsize=10,rotation=45)
ax[1].set_yticks(range(len(labels)))
sns.heatmap(ax =ax[1], data = A,xticklabels=labels, yticklabels=labels)
ax[1].set_yticklabels(labels,fontsize=10,rotation=45)
plt.show()
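The same ordering idea can be sketched on a single axes. This is only an illustration, assuming plt.setp on the existing tick labels as a compact replacement for the set_*ticklabels calls above and the Agg backend for headless use; I have not tested it inside a tkinter animation.

import matplotlib
matplotlib.use("Agg")  # assumption: headless backend so no display is needed
import matplotlib.pyplot as plt
import numpy as np
import seaborn as sns

A = np.random.randn(4, 4)
labels = ['a', 'b', 'c', 'd']

fig, ax = plt.subplots()
# Let seaborn place the labels (centered on the squares) first ...
sns.heatmap(A, ax=ax, xticklabels=labels, yticklabels=labels)
# ... then rotate them, so the heatmap call cannot overwrite the rotation.
plt.setp(ax.get_xticklabels(), rotation=45, fontsize=10)
plt.setp(ax.get_yticklabels(), rotation=45, fontsize=10)

# Read the rotation back to confirm it was applied to every y label
y_rotations = [label.get_rotation() for label in ax.get_yticklabels()]
y_texts = [label.get_text() for label in ax.get_yticklabels()]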
I noticed a 'strange' behaviour when running the following code:
import matplotlib.pyplot as plt
import numpy as np
from matplotlib.ticker import (MultipleLocator, AutoMinorLocator)
freqs = np.logspace(2,4)
freqs_ext = np.logspace(2, 10)
fig, ax = plt.subplots(1,2)
ax[0].plot(freqs, freqs**2)
#ax[0].xaxis.set_minor_locator(AutoMinorLocator(5))
ax[0].grid(which='both')
#ax[0].minorticks_on()
ax[0].set_xscale( 'log')
ax[1].plot(freqs_ext,freqs_ext**2)
#ax[1].xaxis.set_minor_locator(AutoMinorLocator(5))
ax[1].grid(which='both')
#ax[1].minorticks_on()
ax[1].set_xscale('log')
The output is the following:
I have tried more variants than I care to report, (some are commented out in the code above), but I cannot get matplotlib to draw minor gridlines for the plot on the right side, as it does for the one on the left.
I think I have understood that the "problem" lies in where the ticks are located for the second plot, which has a much larger span. They are every two decades and I believe this might be the source of the minor grid lines not displaying.
I have played with xaxis.set_xticks and obtained ticks every decade, but still cannot get this to correctly produce the gridlines.
It is probably something stupid but I can't see it.
NOTE : I know that matplotlib doesn't turn the minor ticks on by default, and in this case this action is "triggered" by changing the scale to log (that's why axis.grid(which='both') actually only acts on the x axis)
OK, I have found this answer:
Matplotlib: strange double-decade axis ticks in log plot
which actually shows how the issue is a design choice for matplotlib starting with v2. Answer was given in 2017 so, not the newest issue :)
The following code correctly plots the minor grids as wanted:
import matplotlib.pyplot as plt
import numpy as np
from matplotlib.ticker import LogLocator
freqs = np.logspace(2,4)
freqs_ext = np.logspace(2, 10)
fig, ax = plt.subplots(1,2)
ax[0].plot(freqs , freqs**2)
ax[0].grid(which='both')
ax[0].set_xscale( 'log')
ax[1].plot(freqs_ext,freqs_ext**2)
ax[1].set_xscale('log')
ax[1].xaxis.set_major_locator(LogLocator(numticks=15))
ax[1].xaxis.set_minor_locator(LogLocator(numticks=15,subs=np.arange(2,10)))
ax[1].grid(which='both')
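The two locator calls can be wrapped in a small helper and the resulting ticks inspected. This is a sketch: use_every_decade is a hypothetical name of mine, numticks=15 is taken from the code above (it presumably has to be at least as large as the number of decades shown), and the Agg backend is an assumption for running without a display.

import matplotlib
matplotlib.use("Agg")  # assumption: headless backend so no display is needed
import matplotlib.pyplot as plt
import numpy as np
from matplotlib.ticker import LogLocator


def use_every_decade(axis, numticks=15):
    """Hypothetical helper: major ticks on every decade, minor ticks at 2..9."""
    axis.set_major_locator(LogLocator(base=10.0, numticks=numticks))
    axis.set_minor_locator(
        LogLocator(base=10.0, subs=np.arange(2, 10), numticks=numticks)
    )


freqs_ext = np.logspace(2, 10)
fig, ax = plt.subplots()
ax.plot(freqs_ext, freqs_ext**2)
ax.set_xscale('log')
use_every_decade(ax.xaxis)
ax.grid(which='both')

fig.canvas.draw()  # make sure tick positions are computed
major_ticks = ax.xaxis.get_majorticklocs()
minor_ticks = ax.xaxis.get_minorticklocs()
print(len(major_ticks), len(minor_ticks))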
I have a question about matplotlib and contourf.
I am using the latest version of matplotlib with Python 3.7. Basically, I have two matrices that I want to plot on the same contour plot, each with a different colormap. One important aspect is that, for instance, if matrixA and matrixB both have shape=(10,10), then the positions where matrixA is non-zero are exactly the positions where matrixB is zero, and vice versa.
In other words, I want to plot two different masks in different colors.
Thanks for your time.
Edited:
I add an example here
import numpy
import matplotlib.pyplot as plt
matrixA=numpy.random.randn(10,10).reshape(100,)
matrixB=numpy.random.randn(10,10).reshape(100,)
mask=numpy.random.uniform(10,10)
mask=mask.reshape(100,)
indexA=numpy.where(mask[mask>0.5])[0]
indexB=numpy.where(mask[mask<=0.5])[0]
matrixA_masked=numpy.zeros(100,)
matrixB_masked=numpy.zeros(100,)
matrixA_masked[indexA]=matrixA[indexA]
matrixB_masked[indexB]=matrixB[indexB]
matrixA_masked=matrixA_masked.reshape(100,100)
matrixB_masked=matrixB_masked.reshape(100,100)
x=numpy.linspace(0,10,1)
X,Y = numpy.meshgrid(x,x)
plt.contourf(X,Y,matrixA_masked,colormap='gray')
plt.contourf(X,Y,matrixB_masked,colormap='winter')
plt.show()
What I want is to be able to use different colormaps in the same plot. So, for instance, one part of the plot would be assigned to matrixA with one colormap (with 0 where matrixB takes over), and the other part to matrixB with a different colormap.
In other words, each part of the contourf plot corresponds to one matrix. I am plotting decision surfaces of machine learning models.
I stumbled into some errors in your code so I have created my own dataset.
To have two colormaps on one plot you need to open a figure and define the axes:
import numpy
import matplotlib.pyplot as plt
matrixA=numpy.linspace(1,20,100)
matrixA[matrixA >= 10] = numpy.nan
matrixA_2 = numpy.reshape(matrixA,[50,2])
matrixB=numpy.linspace(1,20,100)
matrixB[matrixB <= 10] = numpy.nan
matrixB_2 = numpy.reshape(matrixB,[50,2])
fig,ax = plt.subplots()
a = ax.contourf(matrixA_2,cmap='copper',alpha=0.5,zorder=0)
fig.colorbar(a,ax=ax,orientation='vertical')
b=ax.contourf(matrixB_2,cmap='cool',alpha=0.5,zorder=1)
fig.colorbar(b,ax=ax,orientation='horizontal')
plt.show()
You'll also see I've changed the alpha and zorder
I hope this helps.
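A variant closer to the question's setup, where one boolean mask decides which matrix owns each position, could use NumPy masked arrays instead of the NaN assignment above. This is a sketch under assumptions: the left/right split in region_a is invented for illustration, and the data are random stand-ins.

import matplotlib
matplotlib.use("Agg")  # assumption: headless backend so no display is needed
import matplotlib.pyplot as plt
import numpy as np

rng = np.random.default_rng(0)
x = np.linspace(0, 10, 10)
X, Y = np.meshgrid(x, x)
matrix_a = rng.standard_normal((10, 10))  # stand-in for the first model's surface
matrix_b = rng.standard_normal((10, 10))  # stand-in for the second model's surface

# Assumed split for illustration: matrix A owns the left half, matrix B the right half
region_a = X < 5

# Mask out everything that belongs to the other matrix
a_masked = np.ma.masked_where(~region_a, matrix_a)
b_masked = np.ma.masked_where(region_a, matrix_b)

fig, ax = plt.subplots()
cs_a = ax.contourf(X, Y, a_masked, cmap='gray')
cs_b = ax.contourf(X, Y, b_masked, cmap='winter')
fig.colorbar(cs_a, ax=ax, label='matrixA')
fig.colorbar(cs_b, ax=ax, label='matrixB')

Since the two masks are complementary, no alpha blending is needed. A thin unfilled strip may appear along the boundary, because contourf skips grid cells that touch a masked point.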
I have been given some data for which I need to plot a histogram. I used the pandas hist() function and plotted it using matplotlib. The code runs on a remote server, so I cannot view the plot directly and save it as an image instead. Here is what the image looks like
Here is my code below
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
df_hist = pd.DataFrame(np.array(raw_data)).hist(bins=5)  # raw_data is the data supplied to me
plt.savefig('/path/to/file.png')
plt.close()
As you can see the x axis labels are overlapping. So I used this function plt.tight_layout() like so
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
df_hist = pd.DataFrame(np.array(raw_data)).hist(bins=5)
plt.tight_layout()
plt.savefig('/path/to/file.png')
plt.close()
There is some improvement now
But still the labels are too close. Is there a way to ensure the labels do not touch each other and there is fair spacing between them? Also I want to resize the image to make it smaller.
I checked the documentation here https://matplotlib.org/api/_as_gen/matplotlib.pyplot.savefig.html but not sure which parameter to use for savefig.
Since raw_data is not already a pandas dataframe there's no need to turn it into one to do the plotting. Instead you can plot directly with matplotlib.
There are many different ways to achieve what you'd like. I'll start by setting up some data which looks similar to yours:
import matplotlib.pyplot as plt
import numpy as np
from scipy.stats import gamma
raw_data = gamma.rvs(a=1, scale=1e6, size=100)
If we go ahead and use matplotlib to create the histogram we may find the xticks too close together:
fig, ax = plt.subplots(1, 1, figsize=[5, 3])
ax.hist(raw_data, bins=5)
fig.tight_layout()
The xticks are hard to read with all the zeros, regardless of spacing. So, one thing you may wish to do would be to use scientific formatting. This makes the x-axis much easier to interpret:
ax.ticklabel_format(style='sci', axis='x', scilimits=(0,0))
Another option, without using scientific formatting would be to rotate the ticks (as mentioned in the comments):
ax.tick_params(axis='x', rotation=45)
fig.tight_layout()
Finally, you also mentioned altering the size of the image. Note that this is best done when the figure is initialised. You can set the size of the figure with the figsize argument. The following would create a figure 5" wide and 3" in height:
fig, ax = plt.subplots(1, 1, figsize=[5, 3])
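Putting the pieces together for the remote-server case might look like the sketch below. The Agg backend, the NumPy gamma sampler (used here instead of scipy to avoid the extra dependency), the dpi value, and the temporary output path are all assumptions of mine, not part of the question.

import os
import tempfile

import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend for a server without a display
import matplotlib.pyplot as plt
import numpy as np

rng = np.random.default_rng(0)
raw_data = rng.gamma(shape=1.0, scale=1e6, size=100)  # stand-in for the real data

# The figure size is fixed when the figure is created (inches)
fig, ax = plt.subplots(1, 1, figsize=[5, 3])
ax.hist(raw_data, bins=5)
# Scientific notation on the x-axis removes the long runs of zeros
ax.ticklabel_format(style='sci', axis='x', scilimits=(0, 0))
fig.tight_layout()

# Hypothetical output location; replace with the real path on the server
out_path = os.path.join(tempfile.mkdtemp(), "hist.png")
fig.savefig(out_path, dpi=100)
plt.close(fig)

The saved image size in pixels is figsize multiplied by dpi, so lowering either one makes the file smaller.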
I think the two best fixes were mentioned by Pam in the comments.
You can rotate the labels with
plt.xticks(rotation=45)
For more information, look here: Rotate axis text in python matplotlib
The real problem is too many zeros that don't provide any extra information. NumPy arrays are easy to work with, so pd.DataFrame(np.array(raw_data)/1000).hist(bins=5) should remove three zeros from the x-axis tick labels. Then just add a 'kilo' to the axis label.
To change the size of the graph use rcParams.
from matplotlib import rcParams
rcParams['figure.figsize'] = 7, 5.75 #the numbers are the dimensions
I'm having trouble scaling 3D axes in matplotlib. I have found other questions about this, but somehow the answers do not seem to work. Here is some sample code:
import matplotlib as mpl
import numpy as np
from mpl_toolkits.mplot3d import Axes3D
import matplotlib.pyplot as plt
data=np.array([[0,0,0],[10,1,1],[2,2,2]])
fig=plt.figure()
ax=Axes3D(fig)
ax.set_xlim3d(0,15)
ax.set_ylim3d(0,15)
ax.set_zlim3d(0,15)
ax.scatter(data[:,0],data[:,1],data[:,2])
plt.show()
It seems it just ignores the ax.set commands...
In my experience, you have to set your axis limits after plotting the data, otherwise it will look at your data and adjust whatever axes settings you entered before to fit it all in-frame out to the next convenient increment along the axes in question. If, for instance, you set your x-axis limits to +/-400 but your data go out to about +/-1700 and matplotlib decides to label the x-axis in increments of 500, it's going to display the data relative to an x-axis that goes out to +/-2000.
So in your case, you just want to rearrange that last block of text as:
fig=plt.figure()
ax=Axes3D(fig)
ax.scatter(data[:,0],data[:,1],data[:,2])
ax.set_xlim3d(0,15)
ax.set_ylim3d(0,15)
ax.set_zlim3d(0,15)
plt.show()
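A self-contained version of this ordering, with the limits read back to confirm they stuck, might look like the sketch below. One assumption to flag: it creates the axes with fig.add_subplot(projection='3d') rather than Axes3D(fig), because as far as I know recent matplotlib releases no longer attach an Axes3D(fig) instance to the figure automatically. The Agg backend is likewise my choice for running without a display.

import matplotlib
matplotlib.use("Agg")  # assumption: headless backend so no display is needed
import matplotlib.pyplot as plt
import numpy as np

data = np.array([[0, 0, 0], [10, 1, 1], [2, 2, 2]])

fig = plt.figure()
ax = fig.add_subplot(projection='3d')  # assumption: used in place of Axes3D(fig)

# Plot first, then set the limits
ax.scatter(data[:, 0], data[:, 1], data[:, 2])
ax.set_xlim3d(0, 15)
ax.set_ylim3d(0, 15)
ax.set_zlim3d(0, 15)

# Read the limits back to check that they were kept
x_limits = ax.get_xlim3d()
y_limits = ax.get_ylim3d()
z_limits = ax.get_zlim3d()
print(x_limits, y_limits, z_limits)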
ColorOutOfSpace's approach is good. But if you want to automate the scaling, you have to find the minimum and maximum values in the data and scale with those.
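data_min = np.amin(data)  # lowest number in the array (avoids shadowing the built-in min)
data_max = np.amax(data)  # highest number in the array (avoids shadowing the built-in max)
ax.set_xlim3d(data_min, data_max)
ax.set_ylim3d(data_min, data_max)
ax.set_zlim3d(data_min, data_max)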