Set seaborn diverging cmap palette's center/middle parameter to 0 - python

I am trying to create a diverging red-green formatting for my pandas dataframe. Specifically, I want any value below 0 to be red, and any value above 0 to be green (& the higher the value, the darker green the cell ... and vice versa).
Currently, I cannot find a way to center this around 0. See below for a quick example of my code and a screenshot of my current output.
Thanks!
rdgn = sns.diverging_palette(h_neg=0, h_pos=112, s=68, l=50, sep=10,n=9, as_cmap=True)
s = test.style.background_gradient(rdgn).set_precision(2)
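One way a gradient is commonly centered on 0 is to give the color scale symmetric limits, so that 0 sits halfway between the minimum and the maximum. The sketch below only demonstrates that idea with matplotlib's Normalize; the sample values are hypothetical stand-ins for the `test` dataframe.

```python
import numpy as np
from matplotlib.colors import Normalize

# Hypothetical sample values standing in for the `test` dataframe.
values = np.array([[-2.0, 0.5, 3.0], [1.5, -0.25, 4.0]])

# Symmetric limits: use the largest absolute value on both sides of zero.
limit = float(np.abs(values).max())
norm = Normalize(vmin=-limit, vmax=limit)

# With symmetric limits, zero maps to the midpoint of the colormap.
midpoint = float(norm(0.0))
print(limit, midpoint)
```

Assuming a pandas version where background_gradient accepts vmin and vmax, the same limits could be passed along, e.g. test.style.background_gradient(cmap=rdgn, vmin=-limit, vmax=limit, axis=None).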

Related

plotly histogram facet row animation frame

Here is a sample of my data:
Time,Value,Name,Type
0,6.9,A,start
40,6.9,A,start
60,6.9,A,start
0,0.01,B,start
40,0.01,B,start
60,0.01,B,start
0,1.0,C,start
40,1.0,C,start
60,1.0,C,start
0,0.08,D,start
40,0.08,D,start
60,0.08,D,start
0,0.000131,E,End
40,0.00032,E,End
60,0.99209,E,End
0,0.002754,F,End
40,0.00392,F,End
60,0.01857,F,End
0,0.003,G,End
40,0.00516,G,End
60,0.00746,G,End
0,0.00426,H,End
40,0.0043,H,End
60,0.0095,H,End
0,0,I,End
40,0.0017,I,End
60,0.0183,I,End
And my code below:
import plotly.express as px
import pandas as pd
df=pd.read_csv('tohistogram.csv')
fig_bar = px.histogram(df,x='Name',y='Value',animation_frame='Time',color='Name',facet_row='Type')
fig_bar.update_layout(yaxis_title="value")
fig_bar.update_xaxes(matches=None)
fig_bar.for_each_xaxis(lambda xaxis: xaxis.update(showticklabels=True))
fig_bar.show()
Fig1:
Fig2:
With the data listed above, I want two histograms in one figure, separated by Type (start, End), sharing a single animation_frame.
With the code above I only partially achieved this, as the images show:
1. In Fig1, the second histogram also shows A, B, C and D, where I expected just E to I.
2. Fig2 is what I get after pressing the play button and autoscaling: A to D are gone and only E to I remain.
Fig2 is what I wanted in the first place: before the animation runs, the two histograms should already be split by 'Type'.
A. I tried a couple of things, such as removing color:
fig_bar = px.histogram(df,x='Name',y='Value',animation_frame='Time',facet_row='Type')
The histograms are then split by 'Type', without color of course, but the second x-axis has no labels.
B. I also tried removing animation_frame:
fig_bar = px.histogram(df,x='Name',y='Value',color='Name',facet_row='Type')
The histograms are split by 'Type', but there is no animation.
Is what I am trying to do possible? I need two histograms in the same figure, split by 'Type', with both color and animation_frame.
C. If it is possible, how can I change the y-axis label of the first histogram from "sum of Value" to a user-defined name, and give it its own axis range?
D. I haven't come across any example of this, but can hovering over a histogram bar show another simple line graph image instead of text or a value?
Thank you

plt.imshow() shows only one color

I am trying to plot a heatmap from a 2000x2000 NumPy array. I have tried every solution from this post and many others. I have tried many cmaps and interpolation combinations.
This is the code that prepares the data:
def parse_cords(cord: float):
    cord = str(cord).split(".")
    h_map[int(cord[0])][int(cord[1])] += 1
df["coordinate"] is a pandas Series of floats, each encoding an x,y coordinate as x.y. Both x and y range from 0 to 1999.
I have decided to modify the array so that values will range from 0 to 1, but I have tested the code also without changing the range.
h_map = np.zeros((2000, 2000), dtype='int')
cords = df["coordinate"].map(lambda cord: parse_cords(cord))
maximum = float(np.max(h_map))
precent = lambda x: x/maximum
h_map = precent(h_map)
h_map looks like this:
[[0.58396242 0.08840799 0.03153833 ... 0.00285187 0.00419393 0.06324442]
[0.09075658 0.11172622 0.01476262 ... 0.00134206 0.00687804 0.0082201 ]
[0.02986076 0.01862104 0.03959067 ... 0.00100654 0.00134206 0.00251636]
...
[0.00301963 0.00134206 0.00134206 ... 0.00100654 0.00150981 0.00553598]
[0.00419393 0.00268411 0.00100654 ... 0.00201309 0.00402617 0.01342057]
[0.05183694 0.00251636 0.00184533 ... 0.00301963 0.00838785 0.1016608 ]]
Now the plot:
fig, ax = plt.subplots(figsize=figsize)
ax = plt.imshow(h_map)
And result:
final plot
The result is always a heatmap with only a single color depending on the cmap used. Is my array just too big to be plotted like this or am I doing something wrong?
EDIT:
I have added plt.colorbar() and removed scaling from 0 to 1. The plot knows the range of data (0 to 5500) but assumes that every value is equal to 0.
I think that is because you only provide one color channel, so plt.imshow() interprets the data as a black-and-white image. You could either add more channels or use a different function, e.g. sns.heatmap().
import seaborn as sns
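Another possibility worth checking, which is an assumption and not something the post confirms: the edit mentions a range of 0 to 5500, and if only a few cells are that large, a linear color scale pushes almost every other cell to the bottom of the colormap. A minimal sketch with synthetic data (not the poster's array) and a logarithmic norm:

```python
import numpy as np
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
from matplotlib.colors import LogNorm

# Synthetic stand-in for h_map: small counts everywhere, one extreme cell.
rng = np.random.default_rng(0)
h_map = rng.poisson(2.0, size=(200, 200)).astype(float)
h_map[0, 0] = 5500.0

# On a linear scale, the typical cell sits at the very bottom of the range.
typical_linear = float(np.median(h_map / h_map.max()))

# LogNorm cannot represent zeros, so values are clipped to at least 1 here
# (an assumption that suits count data).
norm = LogNorm(vmin=1.0, vmax=h_map.max())
fig, ax = plt.subplots()
im = ax.imshow(np.clip(h_map, 1.0, None), norm=norm, cmap="viridis")
fig.colorbar(im, ax=ax)
```

Setting vmax to a high percentile of the data, for example np.percentile(h_map, 99), is a simpler alternative when a logarithmic scale is not wanted.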

How to edit a color scheme?

I am using Altair for Python and my current heatmap code uses a redyellowblue color scheme (A) that uses yellow as the middle color. I am trying to edit this color scheme in order to achieve the scheme on (B), where the only difference is that yellow is replaced with white as the middle color. Does anyone have any idea how to achieve that in Altair?
The color scheme on (B) was created in R by taking the RdYlBu color palette (the one with 11 colors) and overwriting the middle (6th) color with white. The number of colors in the palette was then increased to 99 to make the fade look more fluid.
My current code (A):
color=alt.Color('Spline_WW_Diff_Trend:Q', scale=alt.Scale(scheme='redyellowblue',reverse=True, domain=[-3.57,2.270], domainMid=0, clamp=True), legend=alt.Legend(title="Trend"))
I have tried manually setting up the colors using range but got an odd result. I've also used a condition to override the color for the value 0, but it wasn't satisfactory because the numbers neighboring 0 should have a white(ish) color.
You probably want interpolate='rgb' when defining your own range. The interpolate property of the color scale accepts any of the interpolation methods defined by d3-interpolate: https://github.com/d3/d3-interpolate#color-spaces.
The default value for interpolate is hcl, which is not always what you want. Observe the changes in color interpolation once you change the interpolation methods with a fixed range/domain:
import altair as alt
import pandas as pd
import numpy as np
df = pd.DataFrame({'x': np.arange(-10, 10)})
def charter(method):
    return alt.Chart(df, title=method).mark_rect().encode(
        x=alt.X('x:O', title=None),
        color=alt.Color('x:Q',
            scale=alt.Scale(
                domain=[-10, -5, 0, 5, 9],
                range=['red', 'orange', 'white', 'lightblue', 'darkblue'],
                interpolate=method
            ),
            legend=alt.Legend(direction='horizontal', orient='top', title=None)
        )
    )
methods = ['hcl', 'rgb', 'hsl', 'hsl-long', 'lab', 'hcl-long', 'cubehelix', 'cubehelix-long']
alt.vconcat(*[charter(method) for method in methods]).resolve_scale(color='independent')
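To get closer to the R recipe from the question (11-class RdYlBu with the middle color replaced by white), the same domain/range pairing could be built programmatically. This is only a sketch: the hex values are the ColorBrewer 11-class RdYlBu colors quoted from memory and should be verified, the names rdylbu, palette and domain are hypothetical, and the domain limits are taken from the question's code.

```python
# 11-class RdYlBu from ColorBrewer, red to blue (quoted from memory; verify).
rdylbu = [
    "#a50026", "#d73027", "#f46d43", "#fdae61", "#fee090", "#ffffbf",
    "#e0f3f8", "#abd9e9", "#74add1", "#4575b4", "#313695",
]

palette = list(rdylbu)
palette[5] = "#ffffff"   # overwrite the middle (6th) color with white
palette.reverse()        # mirrors reverse=True in the question's scale

# One domain breakpoint per color, with 0 pinned to the white entry.
lo, hi = -3.57, 2.27     # domain limits from the question
half = len(palette) // 2
domain = (
    [lo * (half - i) / half for i in range(half)]
    + [0.0]
    + [hi * i / half for i in range(1, half + 1)]
)
print(list(zip(domain, palette)))
```

The two lists could then be passed as alt.Scale(domain=domain, range=palette, interpolate='rgb', clamp=True), in the same way the example above pairs a five-value domain with five colors.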

Reduce color list to more common colors - Python

I have a list of 1723 colors in hex codes (I could also turn them to RGB but the issue is not the format of the color), like this: cols = ['#A62E2E', '#D99036', '#D9C27E', '#D9AB9A', '#592C22'].
I'm trying to reduce the amount of colors in that list to 1/10th of what it already is by grouping similar colors. So in my example, the 1723 colors will be mapped to 172.
I have already checked these posts: post1, post2, post3, post4, post5 but they are not exactly what I want. Basically I want to create the groups dynamically from the color list I have and not a preexisting one.
It would also be very beneficial if I could keep as much variety as possible, so preferably I'd like the groups to be as different from each other as possible.
What I've already tried:
The only solution I've found so far is a function from a Stack Overflow post:
from math import sqrt
def closest_color(rgb):
    r, g, b = rgb
    color_diffs = []
    for color in nodes['Rgb']:
        cr, cg, cb = color
        color_diff = sqrt(abs(r - cr)**2 + abs(g - cg)**2 + abs(b - cb)**2)
        color_diffs.append((color_diff, color))
    return min(color_diffs)[1]
This gives you the closest color to your argument from a list, but it requires a preexisting list of colors to map onto.
The way I'm thinking it could be done is to iterate over my list, leave the current element out, call this function on the rest of the list and group the two colors, then repeat until I have only 172 colors. However, I'm not sure whether that will give me enough distinct colors, or how to group the two colors I get, for that matter.
I don't know enough about colors to figure out a way of doing that without messing my color range.
Here is a method that clusters the colors using scipy. Please note that clustering in RGB is not recommended: you would need to transform your data to a uniform color space before clustering. The transformation in the example is included only as a placeholder; it is not really useful, since YIQ is not a uniform color space. There are different modules that can be used to perform the transformation.
The final list is in rgb_clusters.
import colorsys
from scipy.cluster.vq import kmeans2
from numpy.random import random_sample
n_colors = 1723
n_seek = n_colors // 10
rgb_data = random_sample((n_colors, 3))
# Let's change from RGB to another colour space. YIQ is not a
# good choice: a uniform colour space should be
# used (see https://en.wikipedia.org/wiki/Color_appearance_model).
# in this example I use YIQ for its simplicity.
yiq_data = [colorsys.rgb_to_yiq(*rgb) for rgb in rgb_data]
yiq_clusters, mapping = kmeans2(yiq_data, n_seek, minit='++')
rgb_clusters = [colorsys.yiq_to_rgb(*yiq) for yiq in yiq_clusters]
# Let's see the results graphically
from matplotlib import pyplot as plt
fig = plt.figure()
ax = fig.add_subplot(projection='3d')
ax.scatter(*zip(*rgb_clusters), c=rgb_clusters, s=20, label="centroids")
ax.scatter(*zip(*rgb_data), c=rgb_data, s=6, label="data")
for o_data, idx_cluster in zip(rgb_data, mapping):
    cluster = rgb_clusters[idx_cluster]
    ax.plot(*zip(o_data, cluster), c=cluster)
ax.set_xlabel("R")
ax.set_ylabel("G")
ax.set_zlabel("B")
plt.legend()
Result:
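The example above clusters random data. To feed in a list of hex codes like cols from the question, and to turn the resulting centroids back into hex codes, a conversion along these lines could be used. The helper names hex_to_rgb and rgb_to_hex are hypothetical and not part of the answer.

```python
def hex_to_rgb(code):
    """Convert '#RRGGBB' to an (r, g, b) tuple of floats in [0, 1]."""
    code = code.lstrip("#")
    return tuple(int(code[i:i + 2], 16) / 255 for i in (0, 2, 4))


def rgb_to_hex(rgb):
    """Convert floats in [0, 1] back to '#RRGGBB', clamping out-of-range values."""
    channels = [min(255, max(0, round(c * 255))) for c in rgb]
    return "#{:02X}{:02X}{:02X}".format(*channels)


# Sample list from the question.
cols = ['#A62E2E', '#D99036', '#D9C27E', '#D9AB9A', '#592C22']
rgb_data = [hex_to_rgb(c) for c in cols]            # input for the clustering step
round_trip = [rgb_to_hex(rgb) for rgb in rgb_data]  # back to hex codes
print(round_trip)
```

Here rgb_data would replace the random_sample call, and rgb_to_hex could be applied to each entry of rgb_clusters at the end.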

how to not color nan values when making a contour plot with matplotlib.collections.PolyCollection

I am trying to plot a tri/quad mesh along with results on that mesh. I am plotting results of a CFD simulation.
I am using matplotlib.collections.PolyCollection to plot because it handles non-tri elements, where other methods only support tri elements.
My current code works fine, but when I try to plot results where some cells have no water (currently set to np.nan), the plotting fails and the contour colors are wrong.
My current code is:
ax = plt.subplot(111)
cmap = matplotlib.cm.jet
polys = element_coords  # list of Nx2 np.arrays containing the coordinates of each element polygon
facecolors = element_values  # np array of values at each element, same length as polys
pc = matplotlib.collections.PolyCollection(polys, cmap=cmap)
pc.set_array(facecolors)
ax.add_collection(pc)
ax.plot()
When element_values does not contain any nan values, it works fine and looks something like this:
However, when element_values does contain nan values, it crashes and I get this error:
C:\Users\deden\AppData\Local\Continuum\anaconda3\envs\test\lib\site-packages\matplotlib\colors.py:527: RuntimeWarning: invalid value encountered in less
xa[xa < 0] = -1
I played around with element_values and can confirm this only happens when nan values are present.
I initially tried to ignore the nan values by doing this just to make them clear:
pc.cmap.set_bad(color='white',alpha=0)
But I still get the same error.
So... I tried setting all the nan values to -999 then trying to cut off the colormap like this:
vmin = np.nanmin(facecolors)
vmax = np.nanmax(facecolors)
facecolors[np.isnan(facecolors)] = -999
pc.cmap.set_under(color='white',alpha=0)
then tried to set the limits of the colormap based on other stack questions I've seen..like:
pc.cmap.set_clim(vmin,vmax)
but then I get:
AttributeError: 'ListedColormap' object has no attribute 'set_clim'
I'm out of ideas here...can anyone help me? I just want to NOT COLOR any element where the value is nan.
To reproduce my error..you can try using this dummy data:
polys = [np.array([[ 223769.2075899 , 1445713.24572239],
[ 223769.48419606, 1445717.09102757],
[ 223764.48282055, 1445714.84782264]]),
np.array([[ 223757.9584215 , 1445716.57576502],
[ 223764.48282055, 1445714.84782264],
[ 223762.05868674, 1445720.48031478]])]
facecolors = np.array([np.nan, 1]) #will work if you replace np.nan with a number
SIDE NOTE - if anyone knows how I can plot this mesh+data without polycollections that'd be great..it includes 3 and 4 sided mesh elements
Matplotlib's colormapping mechanics come from a time when numpy.nan wasn't around. Instead it works with masked arrays.
facecolors = np.ma.array(facecolors, mask=np.isnan(facecolors))
Concerning the other error you get, note that set_clim is a method of the mappable (here the PolyCollection, so pc.set_clim(vmin, vmax)), not of the colormap.
Finally, if your mesh contained only 3-sided elements, you could use tripcolor, but that won't work with 4-sided meshes.
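Putting the masked-array suggestion together with the dummy data from the question might look like the sketch below. The color limits of 0 and 2 are hypothetical, chosen only because the dummy data holds a single valid value, and the colormap is copied before set_bad is called so that the shared jet colormap is not modified.

```python
import copy

import numpy as np
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
from matplotlib.collections import PolyCollection

# Dummy data from the question: two triangles, the first with no value.
polys = [np.array([[223769.2075899, 1445713.24572239],
                   [223769.48419606, 1445717.09102757],
                   [223764.48282055, 1445714.84782264]]),
         np.array([[223757.9584215, 1445716.57576502],
                   [223764.48282055, 1445714.84782264],
                   [223762.05868674, 1445720.48031478]])]
values = np.array([np.nan, 1.0])

# Mask the nan entries so the colormapping treats them as "bad" values.
masked = np.ma.array(values, mask=np.isnan(values))

# Work on a copy of the colormap and make masked cells fully transparent.
cmap = copy.copy(plt.get_cmap("jet"))
cmap.set_bad(color="white", alpha=0)

fig, ax = plt.subplots()
pc = PolyCollection(polys, cmap=cmap)
pc.set_array(masked)
pc.set_clim(0.0, 2.0)  # hypothetical limits; set on the collection, not the colormap
ax.add_collection(pc)
ax.autoscale_view()
```

With real data, np.nanmin and np.nanmax of the values would be the natural choice for the limits.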
