How to edit a color scheme? - python

I am using Altair for Python, and my current heatmap code uses the redyellowblue color scheme (A), which has yellow as the middle color. I am trying to edit this color scheme to get the scheme in (B), where the only difference is that white replaces yellow as the middle color. Does anyone have any idea how to achieve that in Altair?
The color scheme in (B) was created in R by taking the RdYlBu color palette (the one with 11 colors) and overwriting the middle (6th) color with white. The number of colors in the palette was then increased to 99 to make the fade look smoother.
My current code (A):
color=alt.Color('Spline_WW_Diff_Trend:Q', scale=alt.Scale(scheme='redyellowblue',reverse=True, domain=[-3.57,2.270], domainMid=0, clamp=True), legend=alt.Legend(title="Trend"))
I have tried manually setting up the colors using range but got an odd result. I've also used a condition to override the color for the value 0, but it wasn't satisfactory because the numbers neighboring 0 should have a white(ish) color.

You probably want interpolate='rgb' when defining your own range. Using the interpolate property of the color scale, you can select one of the interpolation methods defined by d3-interpolate: https://github.com/d3/d3-interpolate#color-spaces.
The default value for interpolate is hcl, which is not always what you want. Observe the changes in color interpolation once you change the interpolation methods with a fixed range/domain:
import altair as alt
import pandas as pd
import numpy as np

df = pd.DataFrame({'x': np.arange(-10, 10)})

def charter(method):
    return alt.Chart(df, title=method).mark_rect().encode(
        x=alt.X('x:O', title=None),
        color=alt.Color('x:Q',
            scale=alt.Scale(
                domain=[-10, -5, 0, 5, 9],
                range=['red', 'orange', 'white', 'lightblue', 'darkblue'],
                interpolate=method
            ),
            legend=alt.Legend(direction='horizontal', orient='top', title=None)
        )
    )

methods = ['hcl', 'rgb', 'hsl', 'hsl-long', 'lab', 'hcl-long', 'cubehelix', 'cubehelix-long']
alt.vconcat(*[charter(method) for method in methods]).resolve_scale(color='independent')
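Applied to the original question, the same idea might look like the sketch below. It assumes the standard ColorBrewer 11-class RdYlBu hex codes, overwrites the middle entry with white (as was done in R), and spreads the 11 stops so that the white stop sits at 0. The helper names white_centered, piecewise_domain and trend_color are made up for this illustration, and the domain limits are the ones from the question.

```python
# ColorBrewer 11-class RdYlBu, listed from red to blue.
RDYLBU_11 = [
    "#a50026", "#d73027", "#f46d43", "#fdae61", "#fee090", "#ffffbf",
    "#e0f3f8", "#abd9e9", "#74add1", "#4575b4", "#313695",
]


def white_centered(colors, reverse=False):
    """Return a copy of an odd-length palette with its middle color set to white."""
    if len(colors) % 2 == 0:
        raise ValueError("palette needs an odd number of colors")
    out = list(colors)
    out[len(out) // 2] = "#ffffff"
    return out[::-1] if reverse else out


def piecewise_domain(lo, mid, hi, n_colors):
    """Spread n_colors stops over [lo, hi] so that the middle stop lands on mid."""
    half = n_colors // 2
    lower = [lo + (mid - lo) * (i / half) for i in range(half)]
    upper = [mid + (hi - mid) * (i / half) for i in range(half + 1)]
    return lower + upper


def trend_color(field="Spline_WW_Diff_Trend:Q", lo=-3.57, hi=2.27):
    """Build the Altair color encoding (altair is imported only when called)."""
    import altair as alt

    # reverse=True mirrors the question: blue for low values, red for high.
    palette = white_centered(RDYLBU_11, reverse=True)
    return alt.Color(
        field,
        scale=alt.Scale(
            domain=piecewise_domain(lo, 0, hi, len(palette)),
            range=palette,
            interpolate="rgb",
            clamp=True,
        ),
        legend=alt.Legend(title="Trend"),
    )
```

Passing color=trend_color() to encode() would then replace the scheme-based color definition. Because the scale interpolates continuously between the stops, there should be no need to expand the palette to 99 colors.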

Related

Set seaborn diverging cmap palette's center/middle parameter to 0

I am trying to create a diverging red-green formatting for my pandas dataframe. Specifically, I want any value below 0 to be red, and any value above 0 to be green (& the higher the value, the darker green the cell ... and vice versa).
Currently, I cannot find a way to center this around 0. See below for a quick example of my code and a screenshot of my current output.
Thanks!
rdgn = sns.diverging_palette(h_neg=0, h_pos=112, s=68, l=50, sep=10,n=9, as_cmap=True)
s = test.style.background_gradient(rdgn).set_precision(2)
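One possible way to center the gradient on 0 is to give background_gradient symmetric vmin and vmax limits, so that 0 falls exactly in the middle of the colormap. The sketch below assumes an all-numeric dataframe and a pandas version whose background_gradient accepts vmin and vmax; symmetric_limits and style_centered are hypothetical helper names.

```python
def symmetric_limits(values, center=0.0):
    """Return (vmin, vmax) placed symmetrically around center."""
    values = list(values)
    spread = max(abs(v - center) for v in values)
    return center - spread, center + spread


def style_centered(frame, cmap):
    """Apply a background gradient whose midpoint sits at 0.

    frame is assumed to be an all-numeric pandas DataFrame and cmap a
    matplotlib colormap, such as the rdgn palette from the question.
    """
    vmin, vmax = symmetric_limits(frame.to_numpy().ravel())
    return frame.style.background_gradient(cmap=cmap, vmin=vmin, vmax=vmax)
```

With the names from the question, this would be called as style_centered(test, rdgn).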

Understanding the interaction between mark_line point overlay and legend

I have found some unintuitive behavior in the interaction between the point property of mark_line and the appearance of the color legend for Altair/Vega-Lite. I ran into this when attempting to create a line with very large and mostly-transparent points in order to increase the area that would trigger the line's tooltip, but was unable to preserve a visible type=gradient legend.
The following code is an MRE for this problem, showing 6 cases: the use of [False, True, and a custom OverlayMarkDef] for the point property and the use of plain and customized color encoding.
import pandas as pd
import altair as alt
# create data
df = pd.DataFrame()
df['x_data'] = [0, 1, 2] * 3
df['y2'] = [0] * 3 + [1] * 3 + [2] * 3
# initialize
base = alt.Chart(df)
markdef = alt.OverlayMarkDef(size=1000, opacity=.001)
color_encode = alt.Color(shorthand='y2', legend=alt.Legend(title='custom legend', type='gradient'))
marks = [False, True, markdef]
encodes = ['y2', color_encode]
plots = []
for i, m in enumerate(marks):
    for j, c in enumerate(encodes):
        plot = base.mark_line(point=m).\
            encode(x='x_data', y='y2', color=c, tooltip=['x_data','y2']).\
            properties(title=', '.join([['False', 'True', 'markdef'][i], ['plain encoding', 'custom encoding'][j]]))
        plots.append(plot)
combined = alt.vconcat(
    alt.hconcat(*plots[:2]).resolve_scale(color='independent'),
    alt.hconcat(*plots[2:4]).resolve_scale(color='independent'),
    alt.hconcat(*plots[4:]).resolve_scale(color='independent')
).resolve_scale(color='independent')
The resulting plot (the interactive tooltips work as expected):
The color data is the same for each of these plots, and yet the color legend is all over the place. In my real case, the gradient is preferred (the data is quantitative and continuous).
With no point on the mark_line, the legend is correct.
Adding point=True converts the legend to a symbol type - I'm not sure why this is the case since the default legend type is gradient for quantitative data (as seen in the first row) and this is the same data - but can be forced back to gradient by the custom encoding.
Attempting to make a custom point via OverlayMarkDef however renders the forced gradient colorbar invisible - matching the opacity of the OverlayMarkDef. But it is not simply a matter of the legend always inheriting the properties of the point, because the symbol legend does not attempt to reflect the opacity.
I would like to have the normal gradient colorbar available for the custom OverlayMarkDef, but I would also love to build up some intuition for what is going on here.
The transparency issue with the bottom-right plot has been fixed since Altair 4.2.0, so now every case that includes a point on the line changes the legend to 'Ordinal' instead of 'Quantitative'.
I believe the reason the legend is converted to a symbol instead of a gradient is that you are adding filled points, and the fill channel is not set to a quantitative field, so it defaults to either ordinal or nominal with a sort:
plot = base.mark_line().encode(
    x='x_data',
    y='y2',
    color='y2',
)
plot + plot.mark_circle(opacity=1)
mark_point gives a gradient legend since it has no fill, and if we set the fill for mark_circle explicitly we also get a gradient legend (one for fill and one for color):
plot = base.mark_line().encode(
    x='x_data',
    y='y2',
    color='y2',
    fill='y2'
)
plot + plot.mark_circle(opacity=1)
I agree with you that this is a bit unexpected, and it would be more convenient if the encoding type of point=True were set to the same type as that used for the lines. You might suggest this as an enhancement in Vega-Lite, together with reporting the apparent bug that you can't override the legend type via type='gradient'.

Reduce color list to more common colors - Python

I have a list of 1723 colors in hex codes (I could also turn them to RGB but the issue is not the format of the color), like this: cols = ['#A62E2E', '#D99036', '#D9C27E', '#D9AB9A', '#592C22'].
I'm trying to reduce the number of colors in that list to 1/10th of what it is now by grouping similar colors. So in my example, the 1723 colors will be mapped to 172.
I have already checked these posts: post1, post2, post3, post4, post5 but they are not exactly what I want. Basically I want to create the groups dynamically from the color list I have and not a preexisting one.
It would also be very beneficial for me to keep as much variety as possible, so preferably I'd like the groups to be as different from each other as possible.
What I've already tried:
The only solution I've found so far is a function from a stackoverflow post
from math import sqrt

def closest_color(rgb):
    r, g, b = rgb
    color_diffs = []
    # nodes['Rgb'] is the preexisting list of (r, g, b) colors to match against
    for color in nodes['Rgb']:
        cr, cg, cb = color
        color_diff = sqrt(abs(r - cr)**2 + abs(g - cg)**2 + abs(b - cb)**2)
        color_diffs.append((color_diff, color))
    return min(color_diffs)[1]
This basically gives you the color in a list that is closest to your argument, but it requires a preexisting list of colors to map onto.
The way I'm thinking it could be done is to iterate over my list, leaving the current element out, call this function on the rest of the list and group the 2 colors, then repeat that until I have only 172 colors. However, I'm not sure if that will give me enough distinct colors, or how to merge the 2 colors I get, for that matter.
I don't know enough about colors to figure out a way of doing that without messing up my color range.
Here is a method that clusters the colors using scipy. Please note that clustering directly in RGB is not recommended, and you would need to transform your data to a uniform color space before clustering. The transformation in the example is included as a placeholder: it is not really useful since YIQ is not a uniform color space. There are different modules that can be used to perform the transformation.
The final list is in rgb_clusters.
import colorsys
from scipy.cluster.vq import kmeans2
from numpy.random import random_sample
n_colors = 1723  # number of colors in the question
n_seek = n_colors // 10
rgb_data = random_sample((n_colors, 3))
# Let's change from RGB to another colour space. YIQ is not a
# good choice: a uniform colour space should be
# used (see https://en.wikipedia.org/wiki/Color_appearance_model).
# in this example I use YIQ for its simplicity.
yiq_data = [colorsys.rgb_to_yiq(*rgb) for rgb in rgb_data]
yiq_clusters, mapping = kmeans2(yiq_data, n_seek, minit='++')
rgb_clusters = [colorsys.yiq_to_rgb(*yiq) for yiq in yiq_clusters]
# Let's see the results graphically
from matplotlib import pyplot as plt
fig = plt.figure()
ax = fig.add_subplot(projection='3d')
ax.scatter(*zip(*rgb_clusters), c=rgb_clusters, s=20, label="centroids")
ax.scatter(*zip(*rgb_data), c=rgb_data, s=6, label="data")
for o_data, idx_cluster in zip(rgb_data, mapping):
    cluster = rgb_clusters[idx_cluster]
    ax.plot(*zip(o_data, cluster), c=cluster)
ax.set_xlabel("R")
ax.set_ylabel("G")
ax.set_zlabel("B")
plt.legend()
Result: (3D scatter plot of the data points, each joined by a line to its cluster centroid.)
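To feed the hex codes from the question into the clustering, and to replace the YIQ placeholder with a more uniform space, something like the sketch below could be used. It converts hex codes to sRGB floats and then to CIELAB with a D65 white point using the standard conversion formulas. CIELAB is only approximately uniform, and the function names are made up for this illustration; a dedicated color library may be a better choice in practice.

```python
def hex_to_rgb(code):
    """'#A62E2E' -> (r, g, b) floats in [0, 1]."""
    code = code.lstrip("#")
    return tuple(int(code[i:i + 2], 16) / 255 for i in (0, 2, 4))


def rgb_to_hex(rgb):
    """(r, g, b) floats in [0, 1] -> '#rrggbb', clipping out-of-range values."""
    channels = [min(255, max(0, int(round(c * 255)))) for c in rgb]
    return "#{:02x}{:02x}{:02x}".format(*channels)


def _linear(c):
    """Undo the sRGB transfer curve for one channel."""
    return c / 12.92 if c <= 0.04045 else ((c + 0.055) / 1.055) ** 2.4


def _f(t):
    """Piecewise cube-root helper used by the CIELAB definition."""
    delta = 6 / 29
    return t ** (1 / 3) if t > delta ** 3 else t / (3 * delta ** 2) + 4 / 29


def rgb_to_lab(rgb):
    """sRGB floats in [0, 1] -> CIELAB (L*, a*, b*), D65 white point."""
    r, g, b = (_linear(c) for c in rgb)
    # Linear sRGB -> CIE XYZ
    x = 0.4124564 * r + 0.3575761 * g + 0.1804375 * b
    y = 0.2126729 * r + 0.7151522 * g + 0.0721750 * b
    z = 0.0193339 * r + 0.1191920 * g + 0.9503041 * b
    # Normalize by the D65 reference white, then apply the Lab formulas
    fx, fy, fz = _f(x / 0.95047), _f(y / 1.0), _f(z / 1.08883)
    return 116 * fy - 16, 500 * (fx - fy), 200 * (fy - fz)


cols = ['#A62E2E', '#D99036', '#D9C27E', '#D9AB9A', '#592C22']
lab_data = [rgb_to_lab(hex_to_rgb(c)) for c in cols]
```

lab_data can then be passed to kmeans2 in place of yiq_data. Since this sketch has no inverse transform, one option is to represent each cluster by the original color closest to its centroid, which also guarantees that every reduced color comes from the input list.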

Pandas styling: Conditionally change background color of column by absolute value

I have a pandas dataframe, and I would like to apply a background color gradient based on absolute value. Say my desired value is 6 in column A of the dataframe. As a number moves away from the desired value, the background gradient should change in the same way in both directions, by absolute distance (regardless of whether the direction is positive or negative).
The following post comes close to what I want, but in that case the color does not consider the absolute value: pandas style background gradient both rows and columns. See also the pandas documentation: http://pandas.pydata.org/pandas-docs/stable/user_guide/style.html
I have created minimum code:
import pandas as pd
colum1 = [-1,0,1,2,3,4,5,6,7,8,9,10,11,12]
df = pd.DataFrame(data=colum1,columns=["A"])
I have created the expected output as an image in Excel, and I would like to get similar output from code: background gradient change by absolute value
I faced a similar task, and after a little research I found a solution.
from matplotlib import colors
import seaborn as sns

def b_g(s):
    cm = sns.light_palette("red", as_cmap=True)
    max_val = max(s.max(), abs(s.min()))
    norm = colors.Normalize(0, max_val)
    normed = norm(abs(s.values))
    # cm is already a colormap, so it can be called directly on the normalized values
    c = [colors.rgb2hex(x) for x in cm(normed)]
    return ['background-color: %s' % color for color in c]
The b_g function generates a custom color map based on your dataframe data, and cm=sns.light_palette("red", as_cmap=True) lets you customize the colors.
Applying this function to a dataframe looks like:
dataframe.style.apply(b_g)
The trick here is to use the abs() function when calculating the normalized values for the data in the dataframe.
Example result
Links on the documentation:
https://seaborn.pydata.org/generated/seaborn.light_palette.html
https://pandas.pydata.org/pandas-docs/version/0.18/style.html
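The function above shades by distance from 0. To shade by distance from a desired value such as 6, as in the question, the same abs() trick can be applied to value minus target. The following is a minimal sketch that avoids matplotlib by blending from white to a fixed color; distance_fraction and shade_by_distance are hypothetical names, and the default red is an arbitrary choice.

```python
def distance_fraction(values, target):
    """Scale |value - target| to [0, 1]; 0 means the value equals target."""
    dists = [abs(v - target) for v in values]
    largest = max(dists) or 1  # avoid dividing by zero when all values match
    return [d / largest for d in dists]


def shade_by_distance(s, target=6, rgb=(255, 0, 0)):
    """Styler callback: the further a value is from target, the stronger the color.

    Blends from white (at target) to rgb (at the largest distance).
    """
    styles = []
    for frac in distance_fraction(s, target):
        r, g, b = (int(round(255 + (c - 255) * frac)) for c in rgb)
        styles.append(f"background-color: #{r:02x}{g:02x}{b:02x}")
    return styles
```

With the dataframe from the question this could be applied as df.style.apply(shade_by_distance, target=6), since Styler.apply forwards extra keyword arguments to the function.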

Visualizing speed of movement with color scale for location in Python

I have 3 main values (longitude, latitude and speed). Using the Folium library I can map the location from the longitude and latitude. Now I also want to show the speed with a color scale. For example, if the speed is between 0-20 that part of the line is red, if the speed is between 20-60 it is yellow, and if the speed is higher than 60 the line is green. Is it possible to do this in Python? Can anybody help me with this? My current code is:
my_map = folium.Map(location=[ave_lat, ave_long], zoom_start=14)
folium.PolyLine(points, color="blue", weight=2.5, opacity=1).add_to(my_map)
my_map
"points" here is a list of lon and lat pairs, but I also have a speed column in my CSV. My output is like this. Can anybody help me with this? Thanks!
But I want to add the speed column to the visualization to get something like this
I thought I might as well add my own answer, because the one from @GlobalTraveler involves drawing many lines, which I think is a bit dirty.
It seems that there is indeed no option in folium to do this, but you can draw multiple markers instead and color them individually:
from matplotlib import cm
import folium
# rgb tuple (floats in [0, 1]) to hexadecimal conversion
def rgb2hex(rgb):
    r, g, b = [int(round(255 * x)) for x in rgb]
    # zero-pad each channel to two hex digits
    return f"#{r:02x}{g:02x}{b:02x}"
# Defines the color mapping from speeds to rgba
color_mapper = cm.ScalarMappable(cmap=cm.cividis)
rgb_values = [c[:3] for c in color_mapper.to_rgba(speeds)] # keep rgb and drop the "a" column
colors = [rgb2hex(rgb) for rgb in rgb_values]
my_map = folium.Map(location=[ave_lat, ave_long], zoom_start=14)
for point, color, speed in zip(points, colors, speeds):
    folium.CircleMarker(location=point,
                        radius=1.25,
                        popup=str(speed),
                        color=color).add_to(my_map)
my_map
For this to work you will need an array points with 2 columns and an array speeds with as many rows as points.
Note that you can change cm.cividis to whatever suits your needs (see the reference here)
You can add rgba values to the color keyword for each point.
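If the three fixed bands from the question (red, yellow, green) are preferred over a continuous colormap, the mapping could be sketched as below. The handling of the boundary values 20 and 60 is an assumption, since the question leaves it open, and the helper names are hypothetical. The last function draws one short PolyLine per segment, which is the many-lines approach mentioned above, so it is shown only as an alternative.

```python
def speed_color(speed):
    """Bands from the question: below 20 red, 20 to 60 yellow, above 60 green."""
    if speed < 20:
        return "red"
    if speed <= 60:
        return "yellow"
    return "green"


def colored_segments(points, speeds):
    """Pair consecutive points into segments, colored by the speed at the segment start."""
    return [
        ([start, end], speed_color(speed))
        for start, end, speed in zip(points, points[1:], speeds)
    ]


def add_speed_line(my_map, points, speeds):
    """Draw each segment as its own PolyLine on an existing folium map."""
    import folium  # imported here so the helpers above work without folium

    for segment, color in colored_segments(points, speeds):
        folium.PolyLine(segment, color=color, weight=2.5, opacity=1).add_to(my_map)
    return my_map
```

The same speed_color function can also be used with the CircleMarker loop above by passing color=speed_color(speed).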
