I am creating a scatterplot with a colorbar
plt.scatter(X, Y, c=Z)
plt.colorbar()
plt.show()
plt.close()
where X and Y are float arrays and Z is an integer array.
Even though Z is an integer array (here 1-14), the colorbar displays floats.
How can I display a discrete colorbar 1-14?
I found something attempting to answer a similar question here, but I don't understand the answer (containing some complications to make 0 be gray) well enough to apply it.
Check out the second answer to your linked question. If you discretize your colourmap before calling scatter, it will automatically work as you want it to:
import numpy as np
import matplotlib.pyplot as plt
import matplotlib.cm as cm
n = 14
X = np.random.rand(20)
Y = np.random.rand(20)
Z = np.random.randint(low=0,high=n,size=X.shape)
plt.figure()
plt.scatter(X,Y,c=Z,cmap=cm.hot)
plt.colorbar()
plt.figure()
plt.scatter(X,Y,c=Z,cmap=plt.get_cmap('hot',n))  # plt.get_cmap also works on matplotlib >= 3.9, where cm.get_cmap was removed
plt.colorbar()
Results for comparison:
Note that the default colourmap was jet in matplotlib versions before 2.0; from version 2.0 onwards the new (and wonderful) default is viridis.
If what's bothering you is that the numbers are floating-point on the colourbar, you can set manual ticks in it, irrespective of the discretization of colours:
plt.figure()
plt.scatter(X,Y,c=Z,cmap=cm.jet)
plt.colorbar(ticks=np.unique(Z))
#or
#plt.colorbar(ticks=range(Z.min(),Z.max()+1))
Result:
Note that since I used a few random-generated points, not every number is present in Z, so unique might not be the best approach (see the missing ticks in the above figure). This is why I also added a solution based on min/max. You can tailor the limits to your needs depending on your actual application.
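If you want each integer to get exactly one colour and one centred tick, one option is to combine a discretized colourmap with a BoundaryNorm whose bin edges sit at the half-integers. This is only a sketch: the random data, the choice of "viridis" and the variable names are assumptions made for illustration, not part of the original answer.

```python
import numpy as np
import matplotlib.pyplot as plt
from matplotlib.colors import BoundaryNorm

# Illustrative data: Z holds integers 1..14 (the upper bound of integers() is exclusive).
rng = np.random.default_rng(0)
X = rng.random(50)
Y = rng.random(50)
Z = rng.integers(1, 15, size=50)

levels = np.arange(1, 15)               # the 14 integer classes
boundaries = np.arange(0.5, 15.5, 1.0)  # 15 bin edges, so one bin per integer
cmap = plt.get_cmap("viridis", len(levels))
norm = BoundaryNorm(boundaries, cmap.N)

fig, ax = plt.subplots()
sc = ax.scatter(X, Y, c=Z, cmap=cmap, norm=norm)
# Ticks at the integers land in the middle of each colour block.
cbar = fig.colorbar(sc, ax=ax, ticks=levels)
# plt.show()  # uncomment when running interactively
```

Because the ticks come from `levels` rather than from the data, every class from 1 to 14 is labelled even if some of them do not occur in Z.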
Here is my discrete colorbar for land use types. It seems similar to your case, because the Z value is also an integer array from 1-14.
My method
Create the colormap and the colorbar labels manually (learned from here).
My Code
import numpy as np
import matplotlib.pyplot as plt
from matplotlib.colors import ListedColormap

cMap = ListedColormap(['white', '#8dd3c7', '#ffffb3', '#bebada',
                       '#b2182b', '#80b1d3', '#fdb462', '#b3de69', '#6a3d9a',
                       '#b2df8a', '#1f78b4', '#ccebc5', '#ffed6f'])
## If you want to use a colormap from plt.cm instead, you can use (taking 'jet' as an example)
# cMap = plt.get_cmap("jet", lut=13)

fig, ax = plt.subplots()
### here you can put your own data in (lulc is a 2-D array of land use classes)
mesh = ax.pcolormesh(lulc, cmap=cMap, alpha=0.7)

# one text label per colour, placed by hand to the right of the colorbar
labels = [str(i) for i in range(1, 14)]
k = -0.05
for i in range(13):
    k = k + 1 / 13.0
    ax.annotate(labels[i], xycoords='axes fraction', xy=(1.12, k), fontsize=14,
                fontstyle='italic', zorder=3)

# draw the colorbar without its own ticks and hide any remaining tick labels
cbar = plt.colorbar(mesh, ticks=[])
for label in cbar.ax.yaxis.get_ticklabels()[::-1]:
    label.set_visible(False)
My result
(source: tietuku.com)
Hope it helps!
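A possibly simpler variant of the manual labelling is to let the colorbar draw the class labels itself. The following is only a sketch under a few assumptions: the 13 class names and the random land-use grid are made up for illustration, and the bin edges are placed at half-integers so that each class gets one colour.

```python
import numpy as np
import matplotlib.pyplot as plt
from matplotlib.colors import ListedColormap, BoundaryNorm

colors = ['white', '#8dd3c7', '#ffffb3', '#bebada', '#b2182b', '#80b1d3',
          '#fdb462', '#b3de69', '#6a3d9a', '#b2df8a', '#1f78b4', '#ccebc5',
          '#ffed6f']
n_classes = len(colors)  # 13 colours, so classes 1..13
cmap = ListedColormap(colors)
# Bin edges at half-integers: each integer class maps to exactly one colour.
norm = BoundaryNorm(np.arange(0.5, n_classes + 1.5), cmap.N)

# Hypothetical stand-in for the real land-use grid.
rng = np.random.default_rng(1)
lulc = rng.integers(1, n_classes + 1, size=(20, 30))

fig, ax = plt.subplots()
mesh = ax.pcolormesh(lulc, cmap=cmap, norm=norm, alpha=0.7)
cbar = fig.colorbar(mesh, ax=ax, ticks=np.arange(1, n_classes + 1))
# Hypothetical class names; replace them with the real land-use categories.
cbar.set_ticklabels([f"class {i}" for i in range(1, n_classes + 1)])
# plt.show()  # uncomment when running interactively
```

With this approach the labels move with the colorbar when the figure is resized, which hand-placed annotations do not.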
Related
I frequently find myself working in log units for my plots, for example taking np.log10(x) of data before binning it or creating contour plots. The problem is, when I then want to make the plots presentable, the axes are in ugly log units, and the tick marks are evenly spaced.
If I let matplotlib do all the conversions, i.e. by calling ax.set_xscale('log'), then I get very nice-looking axes. However, I can't do that with my data, since it is, e.g., already binned in log units. I could manually change the tick labels, but that wouldn't make the tick spacing logarithmic. I suppose I could also manually specify the position of every minor tick so that it has log spacing, but is that the only way to achieve this? That is a bit tedious, so it would be nice if there were a better way.
For concreteness, here is a plot:
I want to have the tick labels as 10^x and 10^y (so '1' becomes '10', '2' becomes '100', etc.), and I want the minor ticks to be drawn as ax.set_xscale('log') would draw them.
Edit: For further concreteness, suppose the plot is generated from an image, like this:
import matplotlib.pyplot as plt
import scipy.datasets  # scipy.misc.face() was removed in SciPy 1.12; scipy.datasets needs the pooch package
img = scipy.datasets.face()
x_range = [-5,3] # log10 units
y_range = [-55, -45] # log10 units
p = plt.imshow(img,extent=x_range+y_range)
plt.show()
and all we want to do is change the axes appearance as I have described.
Edit 2: Ok, ImportanceOfBeingErnest's answer is very clever but it is a bit more specific to images than I wanted. I have another example, of binned data this time. Perhaps their technique still works on this, though it is not clear to me if that is the case.
import numpy as np
import pandas as pd
import datashader as ds
from matplotlib import pyplot as plt
import scipy.stats as sps
v1 = sps.lognorm(loc=0, scale=3, s=0.8)
v2 = sps.lognorm(loc=0, scale=1, s=0.8)
x = np.log10(v1.rvs(100000))
y = np.log10(v2.rvs(100000))
x_range=[np.min(x),np.max(x)]
y_range=[np.min(y),np.max(y)]
df = pd.DataFrame.from_dict({"x": x, "y": y})
#------ Aggregate the data ------
cvs = ds.Canvas(plot_width=30, plot_height=30, x_range=x_range, y_range=y_range)
agg = cvs.points(df, 'x', 'y')
# Create contour plot
fig = plt.figure()
ax = fig.add_subplot(111)
ax.contourf(agg, extent=x_range+y_range)
ax.set_xlabel("x")
ax.set_ylabel("y")
plt.show()
The general answer to this question is probably given in this post:
Can I mimic a log scale of an axis in matplotlib without transforming the associated data?
However here an easy option might be to scale the content of the axes and then set the axes to a log scale.
A. image
You may plot your image on a logarithmic scale while making all pixels the same size in log units. Unfortunately, imshow does not support this kind of image (any more), but pcolormesh can be used for that purpose.
import numpy as np
import matplotlib.pyplot as plt
import scipy.datasets  # scipy.misc.face() was removed in SciPy 1.12; scipy.datasets needs the pooch package
img = scipy.datasets.face()
extx = [-5,3] # log10 units
exty = [-45, -55] # log10 units
x = np.logspace(extx[0],extx[-1],img.shape[1]+1)
y = np.logspace(exty[0],exty[-1],img.shape[0]+1)
X,Y = np.meshgrid(x,y)
c = img.reshape((img.shape[0]*img.shape[1],img.shape[2]))/255.0
m = plt.pcolormesh(X,Y,X[:-1,:-1], color=c, linewidth=0)
m.set_array(None)
plt.gca().set_xscale("log")
plt.gca().set_yscale("log")
plt.show()
B. contour
The same concept can be used for a contour plot.
import numpy as np
from matplotlib import pyplot as plt
x = np.linspace(-1.1,1.9)
y = np.linspace(-1.4,1.55)
X,Y = np.meshgrid(x,y)
agg = np.exp(-(X**2+Y**2)*2)
fig, ax = plt.subplots()
ax.set_xscale("log")
ax.set_yscale("log")
# X and Y are in log10 units; convert them back to linear units for the log axes
X_lin, Y_lin = 10.0**X, 10.0**Y
cf = ax.contourf(X_lin, Y_lin, agg)  # extent is ignored when X and Y are given, so it is dropped
ax.set_xlabel("x")
ax.set_ylabel("y")
plt.show()
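If rescaling the data is not convenient, another route is to leave the axes linear in log10 units and only change how the ticks look. The sketch below is one way to do that, not the method from the answer above: major ticks sit at every integer and are labelled as powers of ten, and minor ticks are placed at log10(2), ..., log10(9) within each decade. The helper names `log_minor_positions` and `style_as_log` and the random placeholder image are assumptions made for illustration.

```python
import numpy as np
import matplotlib.pyplot as plt
from matplotlib.ticker import FixedLocator, FuncFormatter, MultipleLocator

def log_minor_positions(lo, hi):
    """Return log10(k * 10**n) for k = 2..9 and every decade n covering [lo, hi]."""
    decades = np.arange(np.floor(lo), np.ceil(hi))
    pos = np.concatenate([d + np.log10(np.arange(2, 10)) for d in decades])
    return pos[(pos >= lo) & (pos <= hi)]

def style_as_log(axis, lo, hi):
    """Make a linear axis that holds log10 values look like a log axis."""
    axis.set_major_locator(MultipleLocator(1))  # one major tick per decade
    axis.set_major_formatter(
        FuncFormatter(lambda v, pos: rf"$10^{{{int(round(v))}}}$"))
    axis.set_minor_locator(FixedLocator(log_minor_positions(lo, hi)))

x_range = [-5, 3]     # log10 units, as in the question
y_range = [-55, -45]  # log10 units
img = np.random.default_rng(0).random((40, 60))  # placeholder image

fig, ax = plt.subplots()
ax.imshow(img, extent=x_range + y_range, aspect="auto")
style_as_log(ax.xaxis, *x_range)
style_as_log(ax.yaxis, *y_range)
# plt.show()  # uncomment when running interactively
```

The same two calls to `style_as_log` should work for a contourf plot of binned data, since only the tick locators and formatters are touched and the plotted data stay in log10 units.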
I have already binned data to plot a histogram. For this reason I'm using the plt.bar() function. I'd like to set both axes in the plot to a logarithmic scale.
If I call plt.bar(x, y, width=10, color='b', log=True), the y-axis becomes logarithmic, but I can't make the x-axis logarithmic this way.
I've tried plt.xscale('log'), but unfortunately this doesn't work properly: the x-axis ticks vanish and the bars no longer have equal widths.
I would be grateful for any help.
By default, the bars of a barplot have a width of 0.8. Therefore they appear larger for smaller x values on a logarithmic scale. If instead of specifying a constant width, one uses the distance between the bin edges and supplies this to the width argument, the bars will have the correct width. One would also need to set the align to "edge" for this to work.
import matplotlib.pyplot as plt
import numpy as np; np.random.seed(1)
x = np.logspace(0, 5, num=21)
y = (np.sin(1.e-2*(x[:-1]-20))+3)**10
fig, ax = plt.subplots()
ax.bar(x[:-1], y, width=np.diff(x), log=True,ec="k", align="edge")
ax.set_xscale("log")
plt.show()
I cannot reproduce missing ticklabels for a logarithmic scaling. This may be due to some settings in the code that are not shown in the question or due to the fact that an older matplotlib version is used. The example here works fine with matplotlib 2.0.
If the goal is to have equal width bars, assuming datapoints are not equidistant, then the most proper solution is to set width as
plt.bar(x, y, width=c*np.array(x), color='b', log=True) for a constant c appropriate for the plot. Alignment can be anything.
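A quick numerical check may make it clearer why a width proportional to x gives bars that look equally wide on a log axis. The values of `x` and `c` below are arbitrary examples, and the check assumes edge alignment, where a bar spans from x to x + width.

```python
import numpy as np

x = np.logspace(0, 5, num=11)  # example bar positions (left edges), not equidistant
c = 0.5                        # hypothetical constant; tune it for the plot

widths = c * x                 # what would be passed as width=... to plt.bar
# Apparent width of each bar on a log10 axis, measured in decades.
visual_widths = np.log10(x + widths) - np.log10(x)

print(visual_widths)           # all entries are identical
```

Each bar spans log10(x(1 + c)) - log10(x) = log10(1 + c) decades, which does not depend on x. With centred bars the span becomes log10((1 + c/2) / (1 - c/2)), which is also constant as long as c < 2.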
I know this is a very old question and you may have solved it already, but I came to this post because I had a similar problem on the y axis, and I managed to work around it just by using ax.set_ylim(df['my data'].min()+100, df['my data'].max()+100). My y axis holds data that I thought would be best shown on a log scale, but when I set the log scale I couldn't see the numbers properly (as with the x axis in this post). So I dropped the idea of using a log scale and set the limits from the min and max instead. This scales my graph in a way that looks much like a log scale. I am still looking for a way that doesn't need the +100 offset in set_ylim.
While this does not actually use pyplot.bar, I think this method could be helpful in achieving what the OP is trying to do. I found this to be easier than trying to calibrate the width as a function of the log-scale, though it's more steps. Create a line collection whose width is independent of the chart scale.
import matplotlib.pyplot as plt
import numpy as np
import matplotlib.collections as coll
#Generate data and sort into bins
a = np.random.logseries(0.5, 1000)
hist, bin_edges = np.histogram(a, bins=20, density=False)
x = bin_edges[:-1] # remove the top-end from bin_edges to match dimensions of hist
lines = []
for i in range(len(x)):
    pair=[(x[i],0), (x[i], hist[i])]
    lines.append(pair)
linecoll = coll.LineCollection(lines, linewidths=10, linestyles='solid')
fig, ax = plt.subplots()
ax.add_collection(linecoll)
ax.set_xscale("log")
ax.set_yscale("log")
ax.set_xlim(min(x)/10,max(x)*10)
ax.set_ylim(0.1,1.1*max(hist)) #since this is an unweighted histogram, the logy doesn't make much sense.
Resulting plot - no frills
One drawback is that the "bars" will be centered, but this could be changed by offsetting the x-values by half of the linewidth value ... I think it would be
x_new = x + (linewidth/2)*10**round(np.log10(x),0).
Here is my question.
When I want to use a lot of colormaps, I can use
CMAP = ["summer_r", "brg_r", "Dark2", "prism", "PuOr_r", "afmhot_r", "terrain_r", "PuBuGn_r", "RdPu", \
"gist_ncar_r", "gist_yarg_r", "Dark2_r", "YlGnBu", "RdYlBu", "hot_r"]
## value is a 3-D array; the first dimension indexes 2-D arrays containing only the values 0 and 1.
## I just plot the cells with value 1 for each value[i,:,:]
for i in range(0,len(CMAP),1):
    plt.pcolor(xx,yy,value[i,:,:], cmap = CMAP[i])
And I can get this:
http://i8.tietuku.com/cdcdcd5f539c124b.png
But I can't clearly tell what color each grid will have before generating the figure,
because some of the colormaps I added to CMAP may start with the same color. So some value[i, :, :] grids will be hard to distinguish.
My idea
Use a single colormap instead and split it into one color per value[i, :, :], so that each value grid has a different color.
For example:
## 1. cut the colormap, take "jet" for example
cMap = plt.get_cmap("jet",lut=6)
http://i4.tietuku.com/be127c44e87a03fc.png
## 2. I haven't figured this part out
## This is the fake code
CMAP = Func[one color -> colormap](cMap)
Update -2016-01-18
This is my code for setting up the different cmaps for the loop, but it is a bit rigid.
cmap1 = colors.ListedColormap(["w",'red'])
cmap2 = colors.ListedColormap(["w",'blue'])
cmap3 = colors.ListedColormap(["w",'yellow'])
CMAP = [cmap1,cmap2,cmap3]
Then I can proceed with my original attempt.
But I was wondering, is there a smart way to generate cmap1, cmap2, ...?
The hard part of this is coming up with N distinctive colors. In practice, it's usually easiest to just grab random colors as long as N is small. If you'd prefer a somewhat nicer way of getting N distinct colors, have a look at how seaborn's husl_palette and hls_palette are implemented. They choose N evenly spaced colors in HUSL/HLS space and convert them back to RGB.
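To illustrate the evenly-spaced-hue idea without any extra dependency, here is a minimal sketch using the standard library's colorsys module. It is not seaborn's implementation; the function name and the default lightness and saturation values are assumptions chosen for illustration.

```python
import colorsys

def evenly_spaced_colors(n, lightness=0.6, saturation=0.65):
    """Return n RGB tuples whose hues are evenly spaced around the colour wheel."""
    return [colorsys.hls_to_rgb(i / n, lightness, saturation) for i in range(n)]

colors = evenly_spaced_colors(6)
for rgb in colors:
    print(rgb)
```

The resulting tuples have components in the range 0 to 1, so they can be passed to matplotlib wherever an RGB color is accepted.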
At any rate, there are two parts to tying specific values to specific colors in matplotlib. One is the colormap and the other is the norm. The Normalize instance (the norm) handles transforming the data ranges into a 0-1 space for the colormap.
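As a small illustration of that division of labour, the sketch below pushes one value through a norm and then through a colormap by hand. The data range 0 to 10 and the choice of "viridis" are arbitrary assumptions.

```python
import matplotlib.pyplot as plt
from matplotlib.colors import Normalize

norm = Normalize(vmin=0, vmax=10)  # example data range, chosen arbitrarily
cmap = plt.get_cmap("viridis")

fraction = norm(5)     # position of the value 5 within [0, 10], as a number in [0, 1]
rgba = cmap(fraction)  # the colormap turns that fraction into an RGBA tuple
print(fraction, rgba)
```

Plotting functions such as imshow or scatter perform these two steps internally for every data value when they are given cmap and norm arguments.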
There's a function to make this use case easier: matplotlib.colors.from_levels_and_colors. It returns a cmap and a norm instance that you can pass to imshow/pcolormesh/scatter/etc.
As a stand-alone example, let's generate data with a random number of unique integer values. We'll use random pastel colors instead of trying to do something fancy.
import numpy as np
import matplotlib.pyplot as plt
from matplotlib.colors import from_levels_and_colors
nvals = np.random.randint(2, 20)
data = np.random.randint(0, nvals, (10, 10))
colors = np.random.random((nvals, 3))
# Make the colors pastels...
colors = colors / 2.5 + 0.55
levels = np.arange(nvals + 1) - 0.5
cmap, norm = from_levels_and_colors(levels, colors)
fig, ax = plt.subplots()
im = ax.imshow(data, interpolation='nearest', cmap=cmap, norm=norm)
fig.colorbar(im, ticks=np.arange(nvals))
plt.show()
Not the nicest looking color palette, but it's not awful. Here's another run:
Even with 17 values, we're still getting fairly distinct colors by choosing random values.
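If you would rather keep the approach from the question's update (one two-color colormap per layer), the list of colormaps can be generated in a loop instead of being written out by hand. This is a sketch of that idea, assuming six layers and colors sampled from "jet"; neither choice comes from the answer above.

```python
import matplotlib.pyplot as plt
from matplotlib.colors import ListedColormap

n_layers = 6  # hypothetical number of value[i, :, :] layers
base = plt.get_cmap("jet", n_layers)  # n_layers colors sampled from "jet"

white = (1.0, 1.0, 1.0, 1.0)
# One two-color colormap per layer: white for 0, a distinct color for 1.
CMAP = [ListedColormap([white, base(i)]) for i in range(n_layers)]

for i, cm_i in enumerate(CMAP):
    print(i, cm_i(1))  # the color used for cells equal to 1 in layer i
```

The resulting CMAP list can be used in the original loop in place of the list of colormap names.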
I am trying to plot different data sets with similar representations, but slightly different behaviours and different origins, on several figures. So the min and max of the Y axis differ between figures, and so does the scale.
E.g. here are some extracts from my batch plotting:
Is there a simple way in matplotlib to enforce the same Y step on those different figures, to allow easy visual comparison, while keeping automatically determined Y min and Y max?
In other words, I'd like to have the same metric spacing between Y ticks.
You could use a MultipleLocator from the ticker module on both axes to define the tick spacing:
import matplotlib.pyplot as plt
import matplotlib.ticker as ticker
fig=plt.figure()
ax1=fig.add_subplot(211)
ax2=fig.add_subplot(212)
ax1.set_ylim(0,100)
ax2.set_ylim(40,70)
# set ticks every 10
tickspacing = 10
ax1.yaxis.set_major_locator(ticker.MultipleLocator(base=tickspacing))
ax2.yaxis.set_major_locator(ticker.MultipleLocator(base=tickspacing))
plt.show()
EDIT:
It seems like your desired behaviour was different to how I interpreted your question. Here is a function that will change the limits of the y axes to make sure ymax-ymin is the same for both subplots, using the larger of the two ylim ranges to change the smaller one.
import matplotlib.pyplot as plt
import numpy as np
fig=plt.figure()
ax1=fig.add_subplot(211)
ax2=fig.add_subplot(212)
ax1.set_ylim(40,50)
ax2.set_ylim(40,70)
def adjust_axes_limits(ax1,ax2):
    yrange1 = np.ptp(ax1.get_ylim())
    yrange2 = np.ptp(ax2.get_ylim())
    def change_limits(ax,yr):
        new_ymin = ax.get_ylim()[0] - yr/2.
        new_ymax = ax.get_ylim()[1] + yr/2.
        ax.set_ylim(new_ymin,new_ymax)
    if yrange1 > yrange2:
        change_limits(ax2,yrange1-yrange2)
    elif yrange2 > yrange1:
        change_limits(ax1,yrange2-yrange1)
    else:
        pass
adjust_axes_limits(ax1,ax2)
plt.show()
Note that the first subplot here has expanded from (40, 50) to (30, 60), to match the y range of the second subplot
Tom's answer is quite good!
But I decided to use a simpler solution.
I define an arbitrary y range for all my plots, e.g.
yrang = 0.003
and for each plot, I do:
ymin, ymax = ax.get_ylim()
ymid = np.mean([ymin,ymax])
ax.set_ylim([ymid - yrang/2 , ymid + yrang/2])
and possibly:
ax.yaxis.set_major_locator(ticker.MultipleLocator(base=0.005))
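The recentring step can be wrapped in a small helper so that it is easy to apply to many figures. This is just a sketch; the function name `centered_limits` is made up for illustration.

```python
def centered_limits(ymin, ymax, yrange):
    """Return limits of total height yrange, centred on the midpoint of (ymin, ymax)."""
    ymid = 0.5 * (ymin + ymax)
    return ymid - yrange / 2, ymid + yrange / 2

# Example with plain numbers: limits (40, 50) widened to a range of 30.
print(centered_limits(40, 50, 30))
```

With a matplotlib axes object it would be used as `ax.set_ylim(*centered_limits(*ax.get_ylim(), yrang))`, after the data have been plotted so that the automatic limits are already in place.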
I have a set of coordinates, say [(2,3),(45,4),(3,65)].
I need to plot them as a matrix. Is there any way I can do this in matplotlib? I want it to look something like this: http://imgur.com/Q6LLhmk
Edit: My original answer used ax.scatter. There is a problem with this: If two points are side-by-side, ax.scatter may draw them with a bit of space in between, depending on the scale:
For example, with
data = np.array([(2,3),(3,3)])
Here is a zoomed-in detail:
So here is an alternative solution that fixes this problem:
import matplotlib.pyplot as plt
import numpy as np
data = np.array([(2,3),(3,3),(45,4),(3,65)])
N = data.max() + 5
# color the background white (1 is white)
arr = np.ones((N,N), dtype = 'bool')
# color the dots black (0)
arr[data[:,1], data[:,0]] = 0
fig = plt.figure()
ax = fig.add_subplot(1, 1, 1)
ax.imshow(arr, interpolation='nearest', cmap = 'gray')
ax.invert_yaxis()
# ax.axis('off')
plt.show()
No matter how much you zoom in, the adjacent squares at (2,3) and (3,3) will remain side-by-side.
Unfortunately, unlike ax.scatter, using ax.imshow requires building an N x N array, so it could be more memory-intensive than using ax.scatter. That should not be a problem unless data contains very large numbers, however.
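If the coordinates are large and the N x N array becomes a concern, one possible alternative is to draw one unit square per point with a PatchCollection, which only stores the points themselves. This is a sketch rather than a tested replacement for the imshow solution: antialiasing is switched off here to reduce the chance of a visible seam between adjacent squares, and the axis limits are set by hand.

```python
import matplotlib.pyplot as plt
from matplotlib.collections import PatchCollection
from matplotlib.patches import Rectangle

points = [(2, 3), (3, 3), (45, 4), (3, 65)]

# One unit square per point, centred on the integer coordinate.
squares = [Rectangle((x - 0.5, y - 0.5), 1, 1) for x, y in points]

fig, ax = plt.subplots()
ax.add_collection(PatchCollection(squares, facecolor="black",
                                  edgecolor="none", antialiased=False))
# add_collection is not relied on to rescale the view, so set the limits explicitly.
ax.set_xlim(-0.5, max(x for x, _ in points) + 5)
ax.set_ylim(-0.5, max(y for _, y in points) + 5)
ax.set_aspect("equal")
# plt.show()  # uncomment when running interactively
```

The squares for (2,3) and (3,3) share the edge at x = 2.5, so they stay adjacent at any zoom level, and memory use grows with the number of points rather than with the largest coordinate.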