I've spent too much time looking into this, some tabs still open in my browser:
Link1 Link2 Link3 Link4
I'm supposed to be working!
Anyway, my problem is: I use someone else's scripts to produce lots of heat maps which I then have to review and sort/assign:
Here's an example of one:
HM sample
I need to be able to easily distinguish a 0.03 from a zero, but as you can see they look virtually the same. The ideal solution would be: White (just zeros)-Yellow-Orange-Red or White (just zeros)-Orange-Red
The dev used 'YlOrRd' like so:
sns.heatmap(heat_map, annot=True, fmt=".2g", cmap="YlOrRd", linewidths=0.5,
linecolor='black', xticklabels=xticks, yticklabels=yticks
)
I've tried a bunch of the standard/default colour map options provided to no avail.
I don't have any real experience building colour maps and I don't want to break something that's already working. Would anyone have any ideas?
Thanks
**I'm limited in what code/samples I can post due to it being work product.
An option is to take the colors from an existing colormap, replace the first one by white and create a new colormap from those manipulated values.
import numpy as np; np.random.seed(42)
import matplotlib.pyplot as plt
import matplotlib.colors
import seaborn as sns
# some data
a = np.array([0.,0.002,.005,.0099,0.01,.0101,.02,.04,.24,.42,.62,0.95,.999,1.])
data = np.random.choice(a, size=(12,12))
# create colormap. We take 101 values equally spaced between 0 and 1
# hence the first value 0, second value 0.01
c = np.linspace(0,1,101)
# For those values we store the colors from the "YlOrRd" map in an array
colors = plt.get_cmap("YlOrRd",101)(c)
# We replace the first row of that array, by white
colors[0,:] = np.array([1,1,1,1])
# We create a new colormap with the colors
cmap = matplotlib.colors.ListedColormap(colors)
# Plot the heatmap. The format is set to 4 decimal places
# to be able to distinguish specifically the values .0099, .0100, .0101
sns.heatmap(data, annot=True, fmt=".4f", cmap=cmap, vmin=0, vmax=1,
linewidths=0.5, linecolor='black')
plt.show()
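One caveat with the 101-colour approach above: every value below roughly 0.0099 lands in the first (white) bin, not just exact zeros. If only true zeros should be white, a possible alternative is the colormap's "under" colour. This is a minimal sketch, assuming the data is non-negative; the names `zero_white`, `positive_min` and `values` are made up for the illustration.

```python
import copy

import numpy as np
import matplotlib.pyplot as plt
from matplotlib.colors import Normalize

# A few sample values, including an exact zero (illustrative data only)
values = np.array([0.0, 0.002, 0.03, 0.5, 1.0])

# Work on a copy so the shared "YlOrRd" colormap is left untouched
zero_white = copy.copy(plt.get_cmap("YlOrRd"))
zero_white.set_under("white")  # colour for anything below vmin

# Put vmin at the smallest strictly positive value, so only zeros fall below it
positive_min = values[values > 0].min()
norm = Normalize(vmin=positive_min, vmax=1.0)

# Map the values to RGBA to see which ones end up white
rgba = zero_white(norm(values))
```

With seaborn, the same idea would presumably be `sns.heatmap(data, cmap=zero_white, vmin=positive_min, vmax=1, ...)`. The colour bar will not show the white entry unless it is extended, for example with `cbar_kws={"extend": "min"}`.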
Related
I have numerous sets of seasonal data that I am looking to show in a heatmap format. I am not worried about the magnitude of the values in the dataset but more the overall direction and any patterns that i can look at in more detail later. To do this I want to create a heatmap that only shows 2 colours (red for below zero and green for zero and above).
I can create a normal heatmap with seaborn but the normal colour maps do not have only 2 colours and I am not able to create one myself. Even if I could I am unable to set the parameters to reflect the criteria of below zero = red and zero+ = green.
I managed to create this simply by styling the dataframe, but I was unable to export it as a .png because the table_conversion='matplotlib' option removes the formatting.
Below is an example of what I would like to create made from random data, could someone help or point me in the direction of a helpful Stackoverflow answer?
I have also included the code I used to style and export the dataframe.
Desired output - this is created with random data in an Excel spreadsheet
#Code to create a regular heatmap - can this be easily amended?
df_hm = pd.read_csv(filename+h)
pivot = df_hm.pivot_table(index='Year', columns='Month', values='delta', aggfunc='sum')
fig, ax = plt.subplots(figsize=(10,5))
ax.set_title('M1 '+h[:-7])
sns.heatmap(pivot, annot=True, fmt='.2f', cmap='RdYlGn')
plt.savefig(chartpath+h[:-7]+" M1.png", bbox_inches='tight')
plt.close()
#code used to export dataframe that loses format in the .png
import matplotlib.pyplot as plt
import dataframe_image as dfi
#pivot is the dataframe name
pivot = pd.DataFrame(np.random.randint(-100,100,size= (5, 12)),columns=list ('ABCDEFGHIJKL'))
styles = [dict(selector="caption", props=[("font-size", "120%"),("font-weight", "bold")])]
pivot = pivot.style.format(precision=2).highlight_between(left=-100000, right=-0.01, props='color:white;background-color:red').highlight_between(left=0, right= 100000, props='color:white;background-color:green').set_caption(title).set_table_styles(styles)
dfi.export(pivot, root+'testhm.png', table_conversion='matplotlib',chrome_path=None)
You can set the cmap argument to a list of colors. Annotations still work and show the original values, because the data itself is not converted to -1 or 1.
import numpy as np
import seaborn as sns
arr = np.random.randn(10,10)
sns.heatmap(arr,cmap=["grey",'green'],annot=True,center=0)
# center will make it dividing point
Output:
PS. If you don't want the color bar, you can pass cbar=False to sns.heatmap.
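If the split has to be exactly "below zero = red, zero and above = green", one option is to pair a two-colour ListedColormap with a BoundaryNorm whose middle boundary is 0. This is a sketch rather than the answerer's method; the outer boundaries of ±1e9 are arbitrary stand-ins for "any value", and the sample array is invented.

```python
import numpy as np
from matplotlib.colors import BoundaryNorm, ListedColormap, to_rgba

# Two colours: the first for negative values, the second for zero and above
two_color = ListedColormap(["red", "green"])

# Bins are [-1e9, 0) and [0, 1e9); the outer limits are arbitrary large numbers
norm = BoundaryNorm([-1e9, 0, 1e9], ncolors=two_color.N)

sample = np.array([-5.0, -0.01, 0.0, 3.2])
bin_index = norm(sample)       # integer bin per value
rgba = two_color(bin_index)    # one RGBA row per value
```

Both objects can then be handed to the plotting call, for example `sns.heatmap(pivot, cmap=two_color, norm=norm, annot=True, fmt='.2f')`, assuming a seaborn version that forwards `norm` to Matplotlib.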
Welcome to SO!
To achieve what you need, you just need to pass delta through the sign function, here's an example code:
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
import numpy as np
arr = np.random.randn(25,25)
sns.heatmap(np.sign(arr))
This results in a binary heatmap, albeit one with a rather ugly colormap. You can still fiddle around with Seaborn's colormaps to make it look like the Excel version.
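One detail worth checking with the sign approach: np.sign returns 0 for exact zeros, which gives three classes rather than two. Since the question wants zero grouped with the positives, np.where may be a closer fit. A small sketch with made-up values:

```python
import numpy as np

# Illustrative deltas, including an exact zero
delta = np.array([-2.5, -0.01, 0.0, 0.3, 7.0])

# np.sign yields three distinct values: -1, 0 and 1
three_way = np.sign(delta)

# np.where puts zero in the same class as the positive values
two_way = np.where(delta < 0, -1, 1)
```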
The color map in matplotlib allows marking "bad" values, i.e. NaNs, with a specific color. When we plot the color bar afterwards, this color is not included. Is there a preferred approach to have both the continuous color bar and a discrete legend for the specific color for bad values?
Edit:
Certainly, it's possible to make use of the "extend" functionality. However, this solution is not satisfactory. The function of the legend/colorbar is to clarify the meaning of colors to the user. In my opinion, this solution does not communicate that the value is a NaN.
Code example:
import numpy as np
import matplotlib.pyplot as plt
import matplotlib.cm as cm
data = np.random.rand(10, 10)
data[0:3, 0:3] = np.nan # some bad values for set_bad
colMap = cm.RdBu.copy()  # copy so the shared colormap is not modified
colMap.set_bad(color='black')
plt.figure(figsize=(10, 9))
confusion_matrix = plt.imshow(data, cmap=colMap, vmin=0, vmax=1)
plt.colorbar(confusion_matrix)
plt.show()
Which produces:
A legend element could be created and used as follows:
from matplotlib.patches import Patch
legend_elements = [Patch(facecolor=colMap(np.nan), label='Bad values')]
plt.legend(handles=legend_elements)
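Put together, the colour bar and the legend patch can live on the same axes. The following is a sketch of that combination; the legend position and the label text are arbitrary choices, not something from the question.

```python
import copy

import numpy as np
import matplotlib.pyplot as plt
from matplotlib.patches import Patch

data = np.random.rand(10, 10)
data[0:3, 0:3] = np.nan  # some bad values

# Copy the colormap before changing its "bad" colour
cmap = copy.copy(plt.get_cmap("RdBu"))
cmap.set_bad(color="black")

fig, ax = plt.subplots(figsize=(6, 5))
im = ax.imshow(data, cmap=cmap, vmin=0, vmax=1)
fig.colorbar(im, ax=ax)  # continuous scale for the valid values

# Calling the colormap with NaN returns the colour set by set_bad
bad_rgba = cmap(np.nan)
legend = ax.legend(
    handles=[Patch(facecolor=bad_rgba, edgecolor="gray", label="NaN (no data)")],
    loc="upper left",
    bbox_to_anchor=(0, -0.05),  # just below the image
)
```

Call `plt.show()` or `fig.savefig(...)` afterwards to see the result.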
You can do this using one of the approaches used for out-of-range plotting shown at https://matplotlib.org/3.1.1/tutorials/colors/colorbar_only.html#discrete-intervals-colorbar
Set the bad values to a sentinel, e.g. -999, and use the keyword extend.
Another approach is to use masked plotting as shown here.
Another way could be to use cmap.set_bad(). An example can be found here.
I am using matplotlib
In plot() or bar(), we can easily add a legend if we give them labels. But what if it is a contourf() or imshow()?
I know there is colorbar(), which can present the color range, but that is not what I need. I want a legend that has names (labels).
What I can think of is to add a label to each element in the matrix and then try legend() to see if it works. But how do I add a label to an element, such as a value?
in my case, the raw data is like:
1,2,3,3,4
2,3,4,4,5
1,1,1,2,2
for example, 1 represents 'grass', 2 represents 'sand', 3 represents 'hill'... and so on.
imshow() works perfectly with my case, but without the legend.
my question is:
Is there a function that can automatically add legend, for example, in my case, I just have to do like this: someFunction('grass','sand',...)
If there isn't, how do I add labels to each value in the matrix? For example, label all the 1s in the matrix 'grass', label all the 2s 'sand', and so on.
Thank you!
Edit:
Thanks to @dnalow, that's really smart. However, I still wonder if there is a formal solution.
I quote here a solution to a similar question, in case someone is still interested:
I suppose putting a legend for all values in a matrix only makes sense if there aren't too many of them. So let's assume you have 8 different values in your matrix. We can then create a proxy artist of the respective color for each of them and put them into a legend like this
import matplotlib.pyplot as plt
import matplotlib.patches as mpatches
import numpy as np
# create some data
data = np.random.randint(0, 8, (5,5))
# get the unique values from data
# i.e. a sorted list of all values in data
values = np.unique(data.ravel())
plt.figure(figsize=(8,4))
im = plt.imshow(data, interpolation='none')
# get the colors of the values, according to the
# colormap used by imshow
colors = [ im.cmap(im.norm(value)) for value in values]
# create a patch (proxy artist) for every color
patches = [ mpatches.Patch(color=colors[i], label="Level {l}".format(l=values[i]) ) for i in range(len(values)) ]
# put those patches as legend-handles into the legend
plt.legend(handles=patches, bbox_to_anchor=(1.05, 1), loc=2, borderaxespad=0. )
plt.grid(True)
plt.show()
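Applied to the matrix from the question, the same proxy-artist idea could look like the sketch below. The names for 1, 2 and 3 come from the question; the labels for 4 and 5 are placeholders, since the question does not say what those values mean.

```python
import numpy as np
import matplotlib.pyplot as plt
import matplotlib.patches as mpatches

# The raw data from the question
data = np.array([[1, 2, 3, 3, 4],
                 [2, 3, 4, 4, 5],
                 [1, 1, 1, 2, 2]])

# Value -> name; entries for 4 and 5 are hypothetical placeholders
names = {1: "grass", 2: "sand", 3: "hill", 4: "class 4", 5: "class 5"}

fig, ax = plt.subplots()
im = ax.imshow(data, interpolation="none")

# One patch per distinct value, coloured the way imshow coloured it
handles = [
    mpatches.Patch(color=im.cmap(im.norm(v)), label=names[int(v)])
    for v in np.unique(data)
]
ax.legend(handles=handles, bbox_to_anchor=(1.05, 1), loc="upper left",
          borderaxespad=0.0)
```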
You could use matplotlib.pylab.text to add text to your plot and customize it to look like a legend
For example:
import numpy as np
import matplotlib.cm as cm
import matplotlib.pylab as plt
raw_data = np.random.random((100, 100))
fig, ax = plt.subplots(1)
ax.imshow(raw_data, interpolation='nearest', cmap=cm.gray)
ax.text(5, 5, 'your legend', bbox={'facecolor': 'white', 'pad': 10})
plt.show()
which gives you following
You can check out the matplotlib documentation on text for more details matplotlib text examples
I am just working on the same project to draw a land use map like your problem. Here is my solution following the answers above.
import matplotlib.pyplot as plt
import matplotlib.patches as mpatches
import numpy as np
##arrayLucc is the array of land use types
arrayLucc = np.random.randint(1,4,(5,5))
## first you need to define your color map and value name as a dic
t = 1 ## alpha value
cmap = {1:[0.1,0.1,1.0,t],2:[1.0,0.1,0.1,t],3:[1.0,0.5,0.1,t]}
labels = {1:'agricultural land',2:'forest land',3:'grassland'}
arrayShow = np.array([[cmap[i] for i in j] for j in arrayLucc])
## create patches as legend
patches =[mpatches.Patch(color=cmap[i],label=labels[i]) for i in cmap]
plt.imshow(arrayShow)
plt.legend(handles=patches, loc=4, borderaxespad=0.)
plt.show()
This solution doesn't seem very elegant, but it works. I am also looking for other methods.
I guess you have to fake your legend, since it requires a line for creating the legend.
You can do something like this:
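import pylab as pl
# raw_data (2D array) and mynames (value -> label mapping) are assumed to exist
mycmap = pl.cm.jet # for example
vmin, vmax = raw_data.min(), raw_data.max()
for entry in pl.unique(raw_data):
    # scale each value to the 0..1 range a colormap expects for floats
    mycolor = mycmap((entry - vmin) / (vmax - vmin))
    # dummy line that only exists to carry a legend entry
    pl.plot(0, 0, "-", c=mycolor, label=mynames[entry])
pl.imshow(raw_data, cmap=mycmap)
pl.legend()
Of course this is not very satisfying yet. But maybe you can build something on that.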
[edit: added missing parenthesis]
I have a lot of different files (10-20) that I read in x and y data from, then plot as a line.
At the moment I have the standard colors but I would like to use a colormap instead.
I have looked at many different examples but can't get the adjustment for my code right.
I would like the colour to change between each line (rather than along the line) using a colormap such as gist_rainbow i.e. a discrete colourmap
The image below is what I can currently achieve.
This is what I have attempted:
import pylab as py
import numpy as np
import matplotlib.pyplot as plt
from matplotlib import rc, rcParams
numlines = 20
for i in np.linspace(0,1, numlines):
    color1=plt.cm.RdYlBu(1)
    color2=plt.cm.RdYlBu(2)
# Extract and plot data
data = np.genfromtxt('OUZ_QRZ_Lin_Disp_Curves')
OUZ_QRZ_per = data[:,1]
OUZ_QRZ_gvel = data[:,0]
plt.plot(OUZ_QRZ_per,OUZ_QRZ_gvel, '--', color=color1, label='OUZ-QRZ')
data = np.genfromtxt('PXZ_WCZ_Lin_Disp_Curves')
PXZ_WCZ_per = data[:,1]
PXZ_WCZ_gvel = data[:,0]
plt.plot(PXZ_WCZ_per,PXZ_WCZ_gvel, '--', color=color2, label='PXZ-WCZ')
# Lots more files will be plotted in the final code
py.grid(True)
plt.legend(loc="lower right",prop={'size':10})
plt.savefig('Test')
plt.show()
You could take a few different approaches. In your initial example you color each line specifically with a different color. That works fine if you are able to loop over the data/colors you want to plot. Manually assigning each color, like you do now, is a lot of work even for 20 lines, but imagine if you had a hundred or more. :)
Matplotlib also allows you to edit the default 'color cycle' with your own colors. Consider this example:
numlines = 10
data = np.random.randn(150, numlines).cumsum(axis=0)
plt.plot(data)
This gives the default behavior, and results in:
If you want to use a default Matplotlib colormap, you can use it to retrieve the color values.
# pick a cmap
cmap = plt.cm.RdYlBu
# get the colors
# if you pass floats to a cmap, the range is from 0 to 1,
# if you pass integer, the range is from 0 to 255
rgba_colors = cmap(np.linspace(0,1,numlines))
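# the colors need to be converted to hexadecimal format
# (requires: import matplotlib as mpl)
hex_colors = [mpl.colors.rgb2hex(item[:3]) for item in rgba_colors.tolist()]
You can then assign the list of colors to the color cycle setting from Matplotlib. The old 'axes.color_cycle' key has been deprecated since Matplotlib 1.5 and is gone from later releases; 'axes.prop_cycle' replaces it.
mpl.rcParams['axes.prop_cycle'] = mpl.cycler(color=hex_colors)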
Any plot made after this change will automatically cycle through these colors:
plt.plot(data)
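If changing the global rcParams is undesirable, the cycle can also be set on a single axes. Here is a sketch of that variant using gist_rainbow as in the question; the random-walk data is a stand-in for the 10-20 files.

```python
import numpy as np
import matplotlib as mpl
import matplotlib.pyplot as plt

numlines = 20

# Stand-in data: one random walk per column
rng = np.random.default_rng(0)
data = rng.standard_normal((150, numlines)).cumsum(axis=0)

# Sample the colormap at evenly spaced points, one colour per line
cmap = plt.get_cmap("gist_rainbow")
line_colors = [mpl.colors.to_hex(cmap(x)) for x in np.linspace(0, 1, numlines)]

fig, ax = plt.subplots()
ax.set_prop_cycle(color=line_colors)  # affects only this axes
lines = ax.plot(data)                 # each column gets the next colour
```

When the lines are plotted one file at a time in a loop, passing `color=line_colors[i]` to each `plot` call achieves the same thing without touching the cycle.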
I am attempting to use matplotlib to plot some figures for a paper I am working on. I have two sets of data in 2D numpy arrays: An ascii hillshade raster which I can happily plot and tweak using:
import matplotlib.pyplot as pp
import numpy as np
hillshade = np.genfromtxt('hs.asc', delimiter=' ', skip_header=6)[:,:-1]
pp.imshow(hillshade, vmin=0, vmax=255)
pp.gray()
pp.show()
Which gives:
And a second ascii raster which delineates properties of a river flowing across the landscape. This data can be plotted in the same manner as above, however values in the array which do not correspond to the river network are assigned a no data value of -9999. The aim is to have the no data values set to be transparent so the river values overlie the hillshade.
This is the river data, ideally every pixel represented here as 0 would be completely transparent.
Having done some research on this it seems I may be able to convert my data into an RGBA array and set the alpha values to only make the unwanted cells transparent. However, the values in the river array are floats and cannot be transformed (as the original values are the whole point of the figure) and I believe the imshow function can only take unsigned integers if using the RGBA format.
Is there any way around this limitation? I had hoped I could simply create a tuple with the pixel value and the alpha value and plot them like that, but this does not seem possible.
I have also had a play with PIL to attempt to create a PNG file of the river data with the no data value transparent, however this seems to automatically scale the pixel values to 0-255, thereby losing the values I need to preserve.
I would welcome any insight anyone has on this problem.
Just mask your "river" array.
e.g.
rivers = np.ma.masked_where(rivers == 0, rivers)
As a quick example of overlaying two plots in this manner:
import numpy as np
import matplotlib.pyplot as plt
import matplotlib.cm as cm
# Generate some data...
gray_data = np.arange(10000).reshape(100, 100)
masked_data = np.random.random((100,100))
masked_data = np.ma.masked_where(masked_data < 0.9, masked_data)
# Overlay the two images
fig, ax = plt.subplots()
ax.imshow(gray_data, cmap=cm.gray)
ax.imshow(masked_data, cmap=cm.jet, interpolation='none')
plt.show()
Also, on a side note, imshow will happily accept floats for its RGBA format. It just expects everything to be in a range between 0 and 1.
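For the -9999 no-data value mentioned in the question, the mask can target that value directly, which leaves the river values untouched as floats. This is a sketch with synthetic stand-ins for both rasters; `NODATA`, the array shapes and the single-row "river" are assumptions made for the illustration.

```python
import numpy as np
import matplotlib.pyplot as plt

NODATA = -9999.0

# Synthetic stand-ins for the hillshade and river rasters
hillshade = np.linspace(0, 255, 100 * 100).reshape(100, 100)
rivers = np.full((100, 100), NODATA)
rivers[50, :] = np.linspace(0.1, 5.0, 100)  # one row of "river" values

# Mask every no-data cell; masked cells are simply not drawn
rivers_masked = np.ma.masked_equal(rivers, NODATA)

fig, ax = plt.subplots()
ax.imshow(hillshade, cmap="gray", vmin=0, vmax=255)
im = ax.imshow(rivers_masked, cmap="viridis", interpolation="none")
fig.colorbar(im, ax=ax, label="river property")  # scale uses unmasked values only
```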
An alternate way to do this without using masked arrays is to set how the color map deals with clipping values below the minimum of clim (shamelessly using Joe Kington's example):
import numpy as np
import matplotlib.pyplot as plt
import matplotlib.cm as cm
# Generate some data...
gray_data = np.arange(10000).reshape(100, 100)
masked_data = np.random.random((100,100))
my_cmap = cm.jet.copy()  # copy so the shared jet colormap is not modified
my_cmap.set_under('k', alpha=0)
# Overlay the two images
fig, ax = plt.subplots()
ax.imshow(gray_data, cmap=cm.gray)
im = ax.imshow(masked_data, cmap=my_cmap,
interpolation='none',
clim=[0.9, 1])
plt.show()
There is also a set_over for clipping off the top and a set_bad for setting how the color map handles 'bad' values in the data.
An advantage of doing it this way is you can change your threshold by just adjusting clim with im.set_clim([bot, top])
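To see why this works, it can help to map a couple of values by hand: anything below the lower limit gets the "under" colour, whose alpha was set to 0. A small sketch follows; the explicit Normalize stands in for what imshow builds from clim, and the two sample values are arbitrary.

```python
import copy

import numpy as np
import matplotlib.pyplot as plt
from matplotlib.colors import Normalize

# Copy jet and make everything below the lower limit fully transparent
cmap = copy.copy(plt.get_cmap("jet"))
cmap.set_under("k", alpha=0)

# Equivalent of clim=[0.9, 1]
norm = Normalize(vmin=0.9, vmax=1.0)

values = np.array([0.5, 0.95])  # one below the threshold, one inside
rgba = cmap(norm(values))
alphas = rgba[:, 3]             # last column is the alpha channel
```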
Another option is to set all cells which shall remain transparent to np.nan (not sure what's more efficient here, I guess tacaswell's answer based on clim will be the fastest). Example adapting Joe Kington's answer:
import numpy as np
import matplotlib.pyplot as plt
import matplotlib.cm as cm
# Generate some data...
gray_data = np.arange(10000).reshape(100, 100)
masked_data = np.random.random((100,100))
masked_data[np.where(masked_data < 0.9)] = np.nan
# Overlay the two images
fig, ax = plt.subplots()
ax.imshow(gray_data, cmap=cm.gray)
ax.imshow(masked_data, cmap=cm.jet, interpolation='none')
plt.show()
Note that for arrays of dtype=bool you should not follow your IDE's advice to compare masked_data is True for the sake of PEP 8 (E712) but stick with masked_data == True for element-wise comparison, otherwise the masking will fail: