For any given chart, I'd like to specify the first n colors for alt.Scale(range=), and then, if there are _n+1_ or more data values, have Altair fall back on a color scheme, e.g. 'category10'.
In the following example, if there are 6 name values, Altair will render them in red and green in a cyclical sequence:
chart = alt.Chart(df).mark_bar().encode(x='name', y='amount',
    fill=alt.Color('name',
        scale=alt.Scale(range=['red', 'green'])))
However, what I would like to happen is for name values 3 through 6 to be, say, the first 4 colors of a scheme, like category10. Pretend the Altair API recognized this kind of call (it doesn't, obviously; I'm just trying to explain in code):
chart = alt.Chart(df).mark_bar().encode(x='name', y='amount',
    fill=alt.Color('name',
        scale=alt.Scale(range=['red', 'green'], scheme='category10')))
I guess another way I could ask my question is, is there a way to access a colorscheme object, and then manually tweak its color sequence? Here's another pseudocode explanation of what I mean:
mycolors = alt.Scale(range=['red', 'green']) + alt.Scale(scheme='category10')
chart = alt.Chart(df).mark_bar().encode(x='name', y='amount',
    fill=alt.Color('name', scale=mycolors))
According to this answer from jakevdp, "the color palette details are not available via Altair from the Python package itself", which makes sense. But is there a way to essentially designate/customize a new scheme, using existing specified schemes?
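One possible workaround, sketched here under assumptions rather than as an Altair feature: build the full range list yourself by concatenating your own colors with a hard-coded copy of the scheme. The CATEGORY10 list and the build_range helper below are hypothetical names introduced for illustration; the hex values are the standard category10 palette, typed in by hand because the Python package does not expose them. The helper assumes the fallback list is non-empty.

```python
# category10 hex values, hard-coded as an assumption because the palette
# is not available from the Altair Python package itself.
CATEGORY10 = [
    "#1f77b4", "#ff7f0e", "#2ca02c", "#d62728", "#9467bd",
    "#8c564b", "#e377c2", "#7f7f7f", "#bcbd22", "#17becf",
]


def build_range(custom, fallback, n_values):
    """Return n_values colours: custom ones first, then fallback (cycled if needed)."""
    colours = list(custom[:n_values])
    i = 0
    while len(colours) < n_values:
        colours.append(fallback[i % len(fallback)])
        i += 1
    return colours


# Six name values: two custom colours, then the first four of category10.
mycolors = build_range(["red", "green"], CATEGORY10, 6)
```

The resulting list could then be passed as alt.Scale(range=mycolors) in the fill encoding shown earlier.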
First of all: I'm completely new to Python.
I'm trying to visualize some measured data. Each entry has a quadrant, number and sector. The original data lies in a .xlsx file. I've managed to use a .pivot_table to sort the data according to its sector. Due to overlapping, number and quadrant also have to be indexed. Now I want to plot it as a bar chart, where the bars are grouped by sector and the colors represent the quadrant.
But because number also has to be indexed, it shows up in the bar chart as a separate group. There should only be three groups, 0, i and a.
MWE:
import pandas as pd
import matplotlib.pyplot as plt
import numpy as np
d = {'quadrant': ["0","0","0","0","0","0","I","I","I","I","I","I","I","I","I","I","I","I","II","II","II","II","II","II","II","II","II","II","II","II","III","III","III","III","III","III","III","III","III","III","III","III","IV","IV","IV","IV","IV","IV","IV","IV","IV","IV","IV","IV"], 'sector': [0,"0","0","0","0","0","a","a","a","a","a","a","i","i","i","i","i","i","a","a","a","a","a","a","i","i","i","i","i","i","a","a","a","a","a","a","i","i","i","i","i","i","a","a","a","a","a","a","i","i","i","i","i","i"], 'number': [1,2,3,4,5,6,1,2,3,4,5,6,1,2,3,4,5,6,1,2,3,4,5,6,1,2,3,4,5,6,1,2,3,4,5,6,1,2,3,4,5,6,1,2,3,4,5,6,1,2,3,4,5,6], 'Rz_m': [67.90,44.17,44.30,63.43,49.87,39.33,61.17,69.37,66.20,44.20,64.77,39.93,44.33,50.97,55.90,51.33,58.23,44.53,50.03,47.40,58.67,71.57,57.60,70.77,63.93,47.37,46.90,34.73,41.27,48.23,58.30,47.07,50.53,51.20,32.67,50.37,37.50,55.50,41.20,48.07,56.80,49.77,40.87,44.43,44.00,60.03,63.73,72.80,51.60,45.53,60.27,71.00,59.63,48.70]}
df = pd.DataFrame(data=d)
B = df.pivot_table(index=['sector','number', 'quadrant'])
B.unstack().plot.bar(y='Rz_m')
The data viz ecosystem in Python is pretty diverse and there are multiple libraries you can use to produce the same chart. Matplotlib is a very powerful library, but it's also quite low-level, meaning you often have to do a lot of preparatory work before getting to the chart. So you'll usually find people use seaborn for static visualisations, especially if there is a scientific element to them (it has built-in support for things like error bars, etc.).
Out of the box, it has a lot of chart types to support exploratory data analysis and is built on top of matplotlib. For your example, if I understood it right, it would be as simple as:
import seaborn as sns
sns.catplot(x="sector", y="Rz_m", hue="quadrant", data=df, ci=None,
            height=6, kind="bar", palette="muted")
And the output would look like this:
Note that in your example you left the quotes off one of the zeroes (0 instead of "0"), so 0 and "0" are plotted as separate columns. If you're using seaborn, you don't need to pivot the data; just feed it the df as you've defined it.
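If you'd rather not hunt for the stray value by hand, one option is to cast the column to strings before plotting. This is a minimal sketch on a small stand-in frame (not your full data), assuming every sector label should be treated as text:

```python
import pandas as pd

# Small stand-in for the question's data: note the bare 0 next to "0".
df = pd.DataFrame({
    "sector": [0, "0", "a", "i"],
    "Rz_m": [67.90, 44.17, 61.17, 44.33],
})

# The integer 0 and the string "0" count as two different categories.
before = df["sector"].nunique()

# Casting to str merges them into a single "0" group.
df["sector"] = df["sector"].astype(str)
after = df["sector"].nunique()

print(before, after)
```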
For interactive visualisations (with tooltips, zoom, pan, etc.), you can also check out bokeh.
There is an interesting wrinkle to this: how to center the nested bars on the label. By default the bars are drawn with center alignment, which works fine for an odd number of columns. However, for an even number, you'd want them to be centered on the right edge. You can make a small alteration in the source code of categorical.py, starting at line 1642, like so:
# Draw the bars
offpos = barpos + self.hue_offsets[j]
barfunc(offpos, self.statistic[:, j], -self.nested_width,
        color=self.colors[j], align="edge",
        label=hue_level, **kws)
Save the .png and then change it back, but it's not ideal. Probably worth flagging up to the library maintainers.
As shown in the figure, Greek letters appear smaller than normal characters at the same font size. I want them to look the same size. How can I achieve this?
The code of the graph is as follows:
import numpy as np
import math
import matplotlib.pyplot as plt
alpha=np.arange(0,1,0.01)
gamma=np.sin(2*np.pi*alpha)
x=alpha
y=np.cos(2*np.pi*x)
plt.plot(x,y,label=r'cosine function')
plt.plot(alpha,gamma,label=r'$\gamma=\sin(\alpha)$')
plt.legend(loc=0,fontsize=20)
plt.show()
There's a little bit of a trick to this. Scroll down to the end if you're just interested in the solution.
plt.legend returns a Legend object with methods that allow you to modify the appearance of the legend. So first we'll save the Legend object:
legend = plt.legend(loc=0, fontsize=20)
The method we are looking for is Legend.get_texts(). This will return a list of Text objects whose methods control the size, color, font, etc. of the legend text. We only want the second Text object:
text = legend.get_texts()[1]
The Text object has a method called Text.set_fontsize. So let's try that. Altogether, the end of your code should look like:
legend = plt.legend(loc=0,fontsize=20)
text = legend.get_texts()[1]
text.set_fontsize(40)
And this is what we get:
Hm. It looks like both of the legend entries have been made bigger. This certainly isn't what we want. What is going on here, and how do we fix it?
The short of it is that the size, color, etc. of each of the legend entries are managed by an instance of the FontProperties class. The problem is that the two entries share the same instance. So setting the size on one entry also changes the size of the other.
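A quick way to see whether this applies to your setup is to compare the two entries' font properties objects by identity. This is only a diagnostic sketch with made-up data; the result may differ between matplotlib versions, since some releases give each Text its own copy, so no particular outcome is assumed here.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs headless
import matplotlib.pyplot as plt

fig, ax = plt.subplots()
ax.plot([0, 1], [0, 1], label="cosine function")
ax.plot([0, 1], [1, 0], label=r"$\gamma=\sin(\alpha)$")
legend = ax.legend(loc=0, fontsize=20)

first, second = legend.get_texts()
# True means both entries are backed by one FontProperties object,
# so resizing one would also resize the other.
shared = first.get_font_properties() is second.get_font_properties()
print("entries share one FontProperties instance:", shared)
```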
The workaround is to create a new, independent instance of the font properties, as follows. First, we get our text, just as before:
text = legend.get_texts()[1]
Now, instead of setting the size immediately, we get the font properties object, but then make sure to copy it:
props = text.get_font_properties().copy()
Now we make this new, independent font properties instance our text's properties:
text.set_fontproperties(props)
And we can now try setting this legend entry's size:
text.set_size(40)
Solution
The end of your code should now look like:
legend = plt.legend(loc=0,fontsize=20)
text = legend.get_texts()[1]
props = text.get_font_properties().copy()
text.set_fontproperties(props)
text.set_size(40)
Producing a plot looking like
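If you need this for more than one entry, the same steps can be wrapped in a small function. The helper name set_legend_entry_size is hypothetical, and the plotted data is a stand-in; the copy is made unconditionally so the sketch should behave the same whether or not the entries share their font properties.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs headless
import matplotlib.pyplot as plt


def set_legend_entry_size(legend, index, size):
    """Resize a single legend entry without touching the others."""
    text = legend.get_texts()[index]
    # Give this entry its own FontProperties before changing the size.
    text.set_fontproperties(text.get_font_properties().copy())
    text.set_size(size)
    return text


fig, ax = plt.subplots()
ax.plot([0, 1], [0, 1], label="cosine function")
ax.plot([0, 1], [1, 0], label=r"$\gamma=\sin(\alpha)$")
legend = ax.legend(loc=0, fontsize=20)

set_legend_entry_size(legend, 1, 40)
sizes = [t.get_size() for t in legend.get_texts()]
print(sizes)
```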
In matplotlib I wish to know the cleanest and most robust means of overlaying labels onto an axis. This is probably best demonstrated with an example:
While normal axis labels/ticks are placed every 5.00 units, additional labels without ticks have been overlaid onto the axis (this can be seen at 1113.75, which partially covers 1114.00, and at 1105.00, which is covered entirely). The labels also have the same font and size as their normal, ticked counterparts, with the background (if any) going right up to the axis (as a tick mark would).
What is the simplest way of obtaining this effect in matplotlib?
Edit
Following on from @Ken's suggestion I have managed to obtain the effect for an existing tick/label by using ax.yaxis.get_ticklines and ax.yaxis.get_ticklabels to both remove the tick marker and change the background/font/zorder of a label. However, I am unsure how best to add a new tick/label to an axis.
In other words I am looking for a function add_tick(ax.yaxis, loc) that adds a tick at location loc and returns the tickline and ticklabel objects for me to operate on.
I haven't ever tried to do that, but I think that the Artist tutorial might be helpful for you. In particular, the last section has the following code:
for line in ax1.yaxis.get_ticklines():
    # line is a Line2D instance
    line.set_color('green')
    line.set_markersize(25)
    line.set_markeredgewidth(3)
I think that using something like line.set_markersize(0) might make the markers have size zero. The difficult part might be finding the ones that need that done. It is possible that the arrays returned by line.get_xdata() or line.get_ydata() contain enough information to isolate the ones you need. Of course, if you are manually adding the tick marks, it is possible that the instance gets returned as you do that, so you can just modify them as you create them.
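To make that last idea concrete, here is a rough sketch of what an add_tick helper could look like. It is not an existing matplotlib function; it assumes that a location appended to the tick list ends up as the last major tick, which is how fixed tick locations appear to behave, and the limits and colours are made up for the example.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs headless
import matplotlib.pyplot as plt
import numpy as np


def add_tick(axis, loc):
    """Append a major tick at loc and return its Tick object (hypothetical helper)."""
    axis.set_ticks(np.append(axis.get_ticklocs(), loc))
    # Assumption: the appended location corresponds to the last major tick.
    return axis.get_major_ticks()[-1]


fig, ax = plt.subplots()
ax.set_ylim(1100, 1120)

tick = add_tick(ax.yaxis, 1113.75)
# Hide the tick markers and gridline, keeping only a highlighted label.
tick.tick1line.set_visible(False)
tick.tick2line.set_visible(False)
tick.gridline.set_visible(False)
tick.label1.set_color("white")
tick.label1.set_backgroundcolor("red")
```

The returned Tick exposes the marker lines (tick1line, tick2line), the gridline and the label (label1), so each can be styled individually.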
The best solution I have been able to devise:
import numpy as np

# main: axis; olocs: locations list; ocols: location colours
def overlay_labels(main, olocs, ocols):
    # Append the overlay labels as ticks
    main.yaxis.set_ticks(np.append(main.yaxis.get_ticklocs(), olocs))
    # Perform generic formatting to /all/ ticks
    # [...]
    # The overlay ticks were appended last, so walk each list from the end
    labels = reversed(main.yaxis.get_ticklabels())
    markers = reversed(main.yaxis.get_ticklines()[1::2]) # RHS ticks only
    glines = reversed(main.yaxis.get_gridlines())
    rocols = reversed(ocols)
    # Suitably format each overlay tick (colours and lines);
    # zip stops after len(ocols) items, so the regular ticks are left alone
    for label, marker, grid, colour in zip(labels, markers, glines, rocols):
        label.set_color('white')
        label.set_backgroundcolor(colour)
        marker.set_visible(False)
        grid.set_visible(False)
It is not particularly elegant but does appear to work.