Problem:
I have two columns of data (x and y points), and a third column with labels (values 0 or 1). I want to plot x and y on a scatter plot, and color them according to whether the label is 0 or 1, and I want a colorbar on the right of the plot.
Here is my data: https://www.dropbox.com/s/ffta3wgrl2vvcpw/data.csv?dl=0
Note: I know that since there are only two labels I will only get two colors despite using a colorbar; but this dataset is just used as an example here.
What I've done so far
import matplotlib.pyplot as plt
import csv
import matplotlib as m
#read in the data
with open('data.csv', 'rb') as infile:
    data=[]
    r = csv.reader(infile)
    for row in r:
        data.append(row)
col1, col2, col3 = [el for el in zip(*data)]
#I'd like to have a colormap going from red to green:
cdict = {
'red' : ( (0.0, 0.25, 0), (0.5, 1, 1), (1., 0.0, 1.)),
'green': ( (0.0, 0.0, 0.0), (0.5, 0.0, 0.0), (1., 1.0, 1.0)),
'blue' : ( (0.0, 0.0, 0.0), (1, 0.0, 0.0), (1., 0.0, 0.0))}
cm = m.colors.LinearSegmentedColormap('my_colormap', cdict)
# I got the following line from an example I saw; it works for me,
# but I don't really know how it works as an input to colorbar,
# and would like to know.
formatter = plt.FuncFormatter(lambda i, *args: ['0', '1'][int(i)])
plt.figure()
plt.scatter(col1, col2, c=col3)
plt.colorbar(ticks=[0, 1], format=formatter, cmap=cm)
The above code doesn't work because of the call to plt.colorbar.
How can I make it work (what is missing), and is this the best way to do it?
The documentation on what the ticks parameter is is incomprehensible to me. What is it exactly?
Documentation: http://matplotlib.org/api/figure_api.html#matplotlib.figure.Figure.colorbar
You need to pass col3 to scatter as an array of floats, not as a tuple and not as ints.
So, this should work:
import matplotlib.pyplot as plt
import csv
import matplotlib as m
import numpy as np
#read in the data
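# On Python 3, csv.reader needs a text-mode file; the original 'rb' mode only works on Python 2
with open('data.csv', newline='') as infile:
    data=[]
    r = csv.reader(infile)
    for row in r:
        data.append(row)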
col1, col2, col3 = [el for el in zip(*data)]
#I'd like to have a colormap going from red to green:
cdict = {
'red' : ( (0.0, 1.0, 1.0), (0.5, 0.0, 0.0), (1.0, 0.0, 0.0)),
'green': ( (0.0, 0.0, 0.0), (0.5, 0.0, 0.0), (1.0, 1.0, 1.0)),
'blue' : ( (0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (1.0, 0.0, 0.0))}
cm = m.colors.LinearSegmentedColormap('my_colormap', cdict)
#I got the following line from an example I saw; it works for me, but I don't really know how it works as an input to colorbar, and would like to know.
formatter = plt.FuncFormatter(lambda i, *args: ['0', '1'][int(i)])
plt.figure()
plt.scatter(col1, col2, c=np.asarray(col3,dtype=np.float32),lw=0,cmap=cm)
plt.colorbar(ticks=[0, 1], format=formatter, cmap=cm)
As for ticks, you are passing a list of where you want ticks on the colorbar. So, in your example, you have a tick at 0 and a tick at 1.
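To make the roles of ticks and format more concrete, here is a minimal sketch. It assumes a tiny made-up dataset, the built-in 'RdYlGn' colormap, and hypothetical label strings; none of these come from the original data. The formatter is simply a function that Matplotlib calls once per tick with the tick's value and its position index, and whatever string it returns is drawn next to that tick.

```python
import matplotlib.pyplot as plt
import numpy as np
from matplotlib.ticker import FuncFormatter

# Hypothetical label text for the two classes (the original code used '0' and '1').
labels = ['class 0', 'class 1']

def label_tick(value, pos):
    # Called by Matplotlib with the tick's data value and its position index.
    return labels[int(round(value))]

formatter = FuncFormatter(label_tick)

# Made-up points standing in for the three CSV columns.
x = np.array([0.1, 0.4, 0.6, 0.9])
y = np.array([0.2, 0.8, 0.3, 0.7])
c = np.array([0.0, 1.0, 0.0, 1.0])

fig, ax = plt.subplots()
sc = ax.scatter(x, y, c=c, cmap='RdYlGn')
# ticks: where to draw tick marks on the colorbar; format: how to label them.
cbar = fig.colorbar(sc, ax=ax, ticks=[0, 1], format=formatter)
```

Passing the scatter's return value to colorbar ties the colorbar to the same colormap and data range as the points, so the colormap only has to be given once.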
I've also fixed your cmap, to go from red to green. You need to tell scatter to use the cmap too.
Related
I'm trying to make a heatmap over time, but I think matplotlib is messing with the plot colours.
My code is based on the heat equation, I think the specs are not important, the main thing is that I am creating a 3D array and plotting a slice from that array (a 2D matrix), setting which slice I plot using the matplotlib widget Slider.
The important part of the code is this:
import numpy as np
import matplotlib.pyplot as plt
from matplotlib.widgets import Slider
from matplotlib.colors import LogNorm, LinearSegmentedColormap
def update(val):
    newdata = mat[:,:,int(val)]
    plot.set_data(newdata)
    plt.title(f'{val}')
    plt.draw()
def init_plot():
    global plot
    fig, ax = plt.subplots()
    flukacolours = [(1.0, 1.0, 1.0), (0.9, 0.6, 0.9), (1.0, 0.4, 1.0), (0.9, 0.0, 1.0), (0.7, 0.0, 1.0), (0.5, 0.0, 0.8), (0.0, 0.0, 0.8),
                    (0.0, 0.0, 1.0), (0.0, 0.6, 1.0), (0.0, 0.8, 1.0), (0.0, 0.7, 0.5), (0.0, 0.9, 0.2), (0.5, 1.0, 0.0), (0.8, 1.0, 0.0),
                    (1.0, 1.0, 0.0), (1.0, 0.8, 0.0), (1.0, 0.5, 0.0), (1.0, 0.0, 0.0), (0.8, 0.0, 0.0), (0.6, 0.0, 0.0), (0.0, 0.0, 0.0)]
    cmap_name = 'fluka'
    cm = LinearSegmentedColormap.from_list(cmap_name, flukacolours, N=30)
    plot = plt.imshow(mat[:,:,0], cmap=cm, norm=LogNorm(vmin=mat.min(), vmax=mat.max()), aspect='auto')
    ax = plot.axes
    cbar = plt.colorbar(plot, ax=ax)
    plt.subplots_adjust(left=0.10, bottom=0.15, right=1, top=0.9)
    axfreq = plt.axes([0.10, 0.02, 0.8, 0.03])
    # valmax is the last valid slice index; mat.shape[2] itself would be out of range
    freq_slider = Slider(ax=axfreq, label='Slice', valmin=0, valmax=mat.shape[2] - 1, valinit=0, valstep=1, orientation='horizontal')
    freq_slider.on_changed(update)
    plt.show()
if __name__ == "__main__":
    mat = crazy_function() # This function returns a 3D np.array
    init_plot()
The problem is seen in some slices of the plot, where the colours just... break. In the images below I am showing the differences between 3 consecutive slices. At this point, I thought the problem was in my crazy_function(), but then I noticed the graph value that appears in the upper right corner when you place the cursor inside the chart.
Trying to place the cursor at the same maximum point for each plot, the 36th slice is showing a green tint, which would mean a value in the order 10⁻¹⁶ (as shown in colorbar), but the cursor value shows 7x10⁻⁸, which is the right value of the array that matplotlib is not showing correctly.
I think the problem might be my custom colour scale or, more likely, the absurdly large range of the colorbar, because when I narrow vmin and vmax in plt.imshow the colour breaks become less frequent and eventually stop. That is not a problem in itself (I even prefer a shorter scale for visualizing the data), but I am really curious about the cause.
If you know the answer, I'd love to know. In case it matters, my current version of matplotlib is 3.5.1.
I would like to create my own custom colour map in Python. I looked into some of the online examples and found the following commands:
from matplotlib import cm
import matplotlib.pyplot as plt
from matplotlib.colors import LinearSegmentedColormap
cdict1 = {'red': ((0.0, 0.0, 0.0),
(0.5, 0.0, 0.1),
(1.0, 1.0, 1.0)),
'green': ((0.0, 0.0, 0.0),
(1.0, 0.0, 0.0)),
'blue': ((0.0, 0.0, 1.0),
(0.5, 0.1, 0.0),
(1.0, 0.0, 0.0))
}
blue_red1 = LinearSegmentedColormap('BlueRed1', cdict1)
plt.imshow(big,interpolation='nearest', cmap=blue_red1, aspect='auto')
plt.colorbar()
plt.show()
With the above commands I get a colour map which is (Red - Black - Blue), where red is the maximum and blue is the minimum. I would like to create a colour map which is (Black - White - Black). Could someone tell me what should be done, or suggest another method?
For what it's worth, there's also a simpler method.
The full form of LinearSegmentedColormap gives you the ability to have "hard" stops and gradients in the same colormap, so it's necessarily complex. However, there's a convenience constructor for simple cases such as what you describe.
import numpy as np
import matplotlib.pyplot as plt
from matplotlib.colors import LinearSegmentedColormap
cmap = LinearSegmentedColormap.from_list('mycmap', ['black', 'white', 'black'])
fig, ax = plt.subplots()
im = ax.imshow(np.random.random((10, 10)), cmap=cmap, interpolation='nearest')
fig.colorbar(im)
plt.show()
You want all three components to be 0 at both 0 and 1, and all three to be 1 at 0.5.
So, you have:
cdict1 = {'red': ((0.0, 0.0, 0.0), # <- at 0.0, the red component is 0
(0.5, 1.0, 1.0), # <- at 0.5, the red component is 1
(1.0, 0.0, 0.0)), # <- at 1.0, the red component is 0
'green': ((0.0, 0.0, 0.0), # <- etc.
(0.5, 1.0, 1.0),
(1.0, 0.0, 0.0)),
'blue': ((0.0, 0.0, 0.0),
(0.5, 1.0, 1.0),
(1.0, 0.0, 0.0))
}
I've also found colormap creation confusing. The LinearSegmentedColormap is nice because it is very flexible, but cdict takes some getting used to.
The first, and perhaps most important, thing when making colormaps this way is to understand RGB colors. Each of the three color channels has an intensity from 0 to 1, and higher values give more of that color. In the RGB colorspace, white is all three channels at 1, and black is all three channels at 0.
The second thing that is important to learning to make colormaps this way is this: always make the 2nd and 3rd values of each tuple the same until you get comfortable with creating simple linear colormaps. Eventually you can change those values to make discontinuities in the colormaps, but it will only confuse you as you get started.
OK, so the first value in each tuple is the 'fraction' of the way along the colormap, and these must go from 0 to 1. The second and third values are the intensity of that color just below and just above that point (they are equal unless you want a jump). So, to make a colormap that is 'black-white-black', you would do:
cdict1 = {
'red': ((0.0, 0.0, 0.0),
(0.5, 1.0, 1.0),
(1.0, 0.0, 0.0)),
'green': ((0.0, 0.0, 0.0),
(0.5, 1.0, 1.0),
(1.0, 0.0, 0.0)),
'blue': ((0.0, 0.0, 0.0),
(0.5, 1.0, 1.0),
(1.0, 0.0, 0.0)),
}
black_white_black = LinearSegmentedColormap('BlackWhiteBlack', cdict1)
For example,
plt.imshow(np.arange(100).reshape(10,10), cmap=black_white_black, aspect='auto')
plt.colorbar()
Good Luck!
Try a cdict1 of
cdict1 = {'red': ((0.0, 0.0, 0.0),
(0.5, 1.0, 1.0),
(1.0, 0.0, 0.0)),
'green': ((0.0, 0.0, 0.0),
(0.5, 1.0, 1.0),
(1.0, 0.0, 0.0)),
'blue': ((0.0, 0.0, 0.0),
(0.5, 1.0, 1.0),
(1.0, 0.0, 0.0))
}
This dictionary describes how colors are interpolated, looking at each red-green-blue component individually. For each component you give it a list of 3-tuples (x, y0, y1) that specify how to interpolate that component, and each value you want is interpolated between two points in the list.
In this case we want to start at black [RGB=(0,0,0)], increase to white [RGB=1,1,1] at the halfway point of the data range, and then decrease back to black at the end.
For each value to assign a color, the map will first convert that value to a fraction of the input range so that it has something in the range [0, 1]. To get the level of the red component, the map will scan the first element in each 3-tuple in the 'red' list and grab the largest one not exceeding your fraction. The red level assigned will be interpolated between the y1 element of that 3-tuple and the y0 element of the next 3-tuple, based on the difference in x value.
And similarly for the blue and green components.
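The lookup described above can be sketched in a few lines of plain Python. This is only an illustration of the rule for a single component, not Matplotlib's actual implementation (which precomputes a lookup table); the function name channel_level is made up for this example.

```python
def channel_level(fraction, segments):
    """Level of one color component at `fraction` (0..1), given its (x, y0, y1) tuples."""
    # Walk over consecutive pairs of breakpoints.
    for (x_left, _, y1_left), (x_right, y0_right, _) in zip(segments, segments[1:]):
        # Skip zero-width intervals; pick the interval that contains the fraction.
        if x_right > x_left and x_left <= fraction <= x_right:
            t = (fraction - x_left) / (x_right - x_left)
            # Interpolate from y1 of the left tuple to y0 of the right tuple.
            return y1_left + t * (y0_right - y1_left)
    raise ValueError("fraction must lie between the first and last x values")

# The 'red' entry of the black-white-black cdict1 above.
red = ((0.0, 0.0, 0.0),
       (0.5, 1.0, 1.0),
       (1.0, 0.0, 0.0))

for f in (0.0, 0.25, 0.5, 0.75, 1.0):
    print(f, channel_level(f, red))
```

Because green and blue use the same tuples in this cdict, every fraction gets equal red, green, and blue levels, i.e. a shade of grey that peaks at white in the middle.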
I'm attempting to plot 3D line trajectories that evolve over time, and I would like the colors to change to show that passage of time (e.g. from light blue to dark blue). However, there is a distinct lack of tutorials for using matplotlib's Line3DCollection; this is the closest I could find, but all I'm getting is a white line.
Here's my code.
import matplotlib.pyplot as plot
from mpl_toolkits.mplot3d.axes3d import Axes3D
from mpl_toolkits.mplot3d.art3d import Line3DCollection
import numpy as np
# X has shape (3, n)
c = np.linspace(0, 1., num = X.shape[1])[::-1]
a = np.ones(shape = c.shape[0])
r = zip(a, c, c, a) # an attempt to make red vary from light to dark
# r, which contains n tuples of the form (r,g,b,a), looks something like this:
# [(1.0, 1.0, 1.0, 1.0),
# (1.0, 0.99998283232330165, 0.99998283232330165, 1.0),
# (1.0, 0.9999656646466033, 0.9999656646466033, 1.0),
# (1.0, 0.99994849696990495, 0.99994849696990495, 1.0),
# ...,
# (1.0, 1.7167676698312416e-05, 1.7167676698312416e-05, 1.0),
# (1.0, 0.0, 0.0, 1.0)]
fig = plot.figure()
ax = fig.gca(projection = '3d')
points = np.array([X[0], X[1], X[2]]).T.reshape(-1, 1, 3)
segs = np.concatenate([points[:-1], points[1:]], axis = 1)
lc = Line3DCollection(segs, colors = r)
ax.add_collection3d(lc)
ax.set_xlim(-0.45, 0.45)
ax.set_ylim(-0.4, 0.5)
ax.set_zlim(-0.45, 0.45)
plot.show()
However, here's what I get:
Just a bunch of white line segments, no shift in the color. What am I doing wrong? Thanks!
Your code works just fine, here's a bit of a sample. Basically, this is your code with a custom X set.
fig = plot.figure()
# fig.gca(projection='3d') is not accepted by recent Matplotlib; add_subplot does the same job
ax = fig.add_subplot(projection = '3d')
X = [(0,0,0,1,0),(0,0,1,0,0),(0,1,0,0,0)]
points = np.array([X[0], X[1], X[2]]).T.reshape(-1, 1, 3)
r = [(1.0, 1.0, 1.0, 1.0), (1.0, 0.75, 0.75, 1.0), (1.0, 0.5, 0.5, 1.0), (1.0, 0.25, 0.25, 1.0), (1.0, 0.0, 0.0, 1.0)]
segs = np.concatenate([points[:-1], points[1:]], axis = 1)
ax.add_collection(Line3DCollection(segs,colors=list(r)))
plot.show()
And the plot looks like this:
Wow, so it turns out the problem was that X was actually not of shape (3, n), but rather something like (3, n^10), but I was only plotting n points, hence the color appeared to never change (and why r seems to have extremely small intervals...there were something like 58,000 points when I was plotting only 250).
So yes, it was a bug. Sorry about that; it works fine now.
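For anyone landing here with the same goal, an alternative to building the RGBA tuples by hand is to let a colormap do the work. The sketch below is one possible approach, not the method used in the posts above: it assumes a made-up helix as the trajectory and the built-in 'Blues' colormap, and gives each segment a scalar "time" value through set_array.

```python
import numpy as np
import matplotlib.pyplot as plt
from mpl_toolkits.mplot3d.art3d import Line3DCollection

# Hypothetical trajectory of shape (3, n): a helix standing in for X.
t = np.linspace(0, 4 * np.pi, 250)
X = np.vstack([np.cos(t), np.sin(t), t / t.max()])

# Build n-1 segments, each joining one point to the next.
points = X.T.reshape(-1, 1, 3)
segs = np.concatenate([points[:-1], points[1:]], axis=1)

# One scalar per segment; the colormap turns it into a color (light to dark blue).
lc = Line3DCollection(segs, cmap='Blues')
lc.set_array(np.linspace(0.0, 1.0, len(segs)))

fig = plt.figure()
ax = fig.add_subplot(projection='3d')
ax.add_collection3d(lc)
ax.set_xlim(-1, 1)
ax.set_ylim(-1, 1)
ax.set_zlim(0, 1)
```

Calling plt.show() afterwards displays the figure. Note that the number of color values must match the number of segments actually plotted, which is exactly the mismatch that caused the all-white line above.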
In my application I'm transitioning from R to native Python (scipy + matplotlib) where possible, and one of the biggest tasks was converting from a R heatmap to a matplotlib heatmap. This post guided me with the porting. While most of it was painless, I'm still not convinced on the colormap.
Before showing code, an explanation: in the R code I defined "breaks", i.e. a fixed number of points starting from the lowest value up to 10, and ideally centered on the median value of the data. Its equivalent here would be with numpy.linspace:
# Matrix is a DataFrame object from pandas
import numpy as np
data_min = min(matrix.min(skipna=True))
data_max = max(matrix.max(skipna=True))
median_value = np.median(matrix.median(skipna=True))
range_min = np.linspace(0, median_value, 50)
range_max = np.linspace(median_value, data_max, 50)
breaks = np.concatenate((range_min, range_max))
This gives us 100 points that will be used for coloring. However, I'm not sure on how to do the exact same thing in Python. Currently I have:
def red_black_green():
    cdict = {
        'red': ((0.0, 0.0, 0.0),
                (0.5, 0.0, 0.0),
                (1.0, 1.0, 1.0)),
        'blue': ((0.0, 0.0, 0.0),
                 (1.0, 0.0, 0.0)),
        'green': ((0.0, 0.0, 1.0),
                  (0.5, 0.0, 0.0),
                  (1.0, 0.0, 0.0))
    }
    my_cmap = mpl.colors.LinearSegmentedColormap(
        'my_colormap', cdict, 100)
    return my_cmap
And further down I do:
# Note: vmin and vmax are the minimum and the maximum of the data
# Adjust the max and min to scale these colors
if vmin > 0:
    norm = mpl.colors.Normalize(vmin=0, vmax=vmax / 1.08)
else:
    norm = mpl.colors.Normalize(vmin / 2, vmax / 2)
The numbers are totally empirical, which is why I want to change this into something more robust. How can I normalize my color map based on the median, or do I need normalization at all?
By default, matplotlib will normalise the colormap such that the maximum colormap value corresponds to the maximum of your data, and likewise for the minimum. This means that the middle of the colormap lines up with the midpoint of your data range, (min + max) / 2. That midpoint coincides with the median only when the data are distributed symmetrically around it.
Here's an example:
from numpy.random import rand
import matplotlib.pyplot as plt
import matplotlib.colors as mcolors
cdict = {'red': ((0.0, 0.0, 0.0),
(0.5, 0.0, 0.0),
(1.0, 1.0, 1.0)),
'blue': ((0.0, 0.0, 0.0),
(1.0, 0.0, 0.0)),
'green': ((0.0, 0.0, 1.0),
(0.5, 0.0, 0.0),
(1.0, 0.0, 0.0))}
cmap = mcolors.LinearSegmentedColormap(
'my_colormap', cdict, 100)
ax = plt.subplot(111)
im = ax.imshow(2*rand(20, 20) + 1.5, cmap=cmap)
plt.colorbar(im)
plt.show()
Notice that the middle of the colour bar sits at about 2.5. This is the midpoint of the data range: (min + max) / 2 ≈ (1.5 + 3.5) / 2 = 2.5.
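If the colors should instead be centered on the median of skewed data, one option is a norm with an explicit center. The following is a sketch under assumptions: it uses made-up log-normal data, a green-black-red colormap built with from_list, and Matplotlib's TwoSlopeNorm (available in Matplotlib 3.2 and later), which maps vmin to 0, vcenter to 0.5, and vmax to 1.

```python
import numpy as np
import matplotlib.pyplot as plt
from matplotlib.colors import LinearSegmentedColormap, TwoSlopeNorm

# Made-up skewed data, so that the median differs from (min + max) / 2.
rng = np.random.default_rng(0)
data = rng.lognormal(mean=0.0, sigma=1.0, size=(20, 20))

median_value = float(np.median(data))

# Piecewise-linear norm: vmin -> 0.0, vcenter -> 0.5, vmax -> 1.0.
norm = TwoSlopeNorm(vmin=data.min(), vcenter=median_value, vmax=data.max())

# Green at the low end, black at the center, red at the high end, 100 levels.
cmap = LinearSegmentedColormap.from_list(
    'green_black_red', ['green', 'black', 'red'], N=100)

fig, ax = plt.subplots()
im = ax.imshow(data, cmap=cmap, norm=norm)
fig.colorbar(im, ax=ax)
```

With this norm, black marks the median rather than the midpoint of the range, and the two halves of the colormap are stretched independently on either side of it.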
Hope this helps.
I have a 2D array that I'm plotting with imshow, and I would like to have custom colors depending on the value of each pixel of my array. I'll explain it with an example.
from pylab import *
from numpy import *
img = ones((5,5))
img[1][1] = 2
imshow(img,interpolation='nearest');colorbar()
If you ran this code you would see a red square on a blue background. The red square corresponds to the pixel [1][1] in img, while the other pixels are colored blue because they have a value of 1. What if I want the red square to be colored with a custom color?
Or, more generally, if I have a 2D array like img in the example, how can I color all pixels with the same value in a color of my choosing?
I have found this page that explains how to generate a custom colorbar but that's not useful: http://www.scipy.org/Cookbook/Matplotlib/Show_colormaps
That link you sent has the following:
But, what if I think those colormaps are ugly? Well, just make your
own using matplotlib.colors.LinearSegmentedColormap. First, create a
script that will map the range (0,1) to values in the RGB spectrum. In
this dictionary, you will have a series of tuples for each color
'red', 'green', and 'blue'. The first elements in each of these color
series needs to be ordered from 0 to 1, with arbitrary spacing
inbetween. Now, consider (0.5, 1.0, 0.7) in the 'red' series below.
This tuple says that at 0.5 in the range from (0,1) , interpolate from
below to 1.0, and above from 0.7. Often, the second two values in each
tuple will be the same, but using diferent values is helpful for
putting breaks in your colormap. This is easier understand than might
sound, as demonstrated by this simple script:
from pylab import *
cdict = {'red': ((0.0, 0.0, 0.0),
                 (0.5, 1.0, 0.7),
                 (1.0, 1.0, 1.0)),
         'green': ((0.0, 0.0, 0.0),
                   (0.5, 1.0, 0.0),
                   (1.0, 1.0, 1.0)),
         'blue': ((0.0, 0.0, 0.0),
                  (0.5, 1.0, 0.0),
                  (1.0, 0.5, 1.0))}
my_cmap = matplotlib.colors.LinearSegmentedColormap('my_colormap',cdict,256)
pcolor(rand(10,10),cmap=my_cmap)
colorbar()
Isn't this exactly what you want?
Here's an example of how to do it with the image you provided:
import matplotlib
from matplotlib import pyplot as plt
from pylab import *
img = ones((5,5))
img[1][1] = 2
cdict = {'red': ((0.0, 0.0, 0.0),
(0.5, 1.0, 0.7),
(1.0, 1.0, 1.0)),
'green': ((0.0, 0.0, 0.0),
(0.5, 1.0, 0.0),
(1.0, 1.0, 1.0)),
'blue': ((0.0, 0.0, 0.0),
(0.5, 1.0, 0.0),
(1.0, 0.5, 1.0))}
my_cmap = matplotlib.colors.LinearSegmentedColormap('my_colormap',cdict,256)
plt.pcolor(img,cmap=my_cmap)
plt.colorbar()
plt.show()
Also, if you really want to map a number to a colour, you can use discrete_cmap as specified in the example you linked to. Here's the example function the SciPy documentation provides:
import matplotlib.colors as col
from matplotlib import cm
def discrete_cmap(N=8):
    """create a colormap with N (N<15) discrete colors and register it"""
    # define individual colors as hex values
    cpool = [ '#bd2309', '#bbb12d', '#1480fa', '#14fa2f', '#000000',
              '#faf214', '#2edfea', '#ea2ec4', '#ea2e40', '#cdcdcd',
              '#577a4d', '#2e46c0', '#f59422', '#219774', '#8086d9' ]
    cmap3 = col.ListedColormap(cpool[0:N], 'indexed')
    # cm.register_cmap is gone from recent Matplotlib; there, use matplotlib.colormaps.register(cmap3)
    cm.register_cmap(cmap=cmap3)
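To pin specific values to specific colors in the img example from the question, a ListedColormap can be combined with a BoundaryNorm. This is a minimal sketch of that idea; the two color names are arbitrary choices for illustration, and the bin edges assume the array only contains the values 1 and 2.

```python
import numpy as np
import matplotlib.pyplot as plt
from matplotlib.colors import ListedColormap, BoundaryNorm

img = np.ones((5, 5))
img[1, 1] = 2

# Arbitrary illustrative choice: value 1 -> light grey, value 2 -> purple.
cmap = ListedColormap(['lightgrey', 'purple'])

# Bin edges: values in [0.5, 1.5) use color 0, values in [1.5, 2.5) use color 1.
norm = BoundaryNorm([0.5, 1.5, 2.5], cmap.N)

fig, ax = plt.subplots()
im = ax.imshow(img, cmap=cmap, norm=norm, interpolation='nearest')
fig.colorbar(im, ax=ax, ticks=[1, 2])
```

Adding a third value is a matter of appending one color to the list and one more edge to the boundaries, so the mapping from value to color stays explicit.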