Sorry for the noob question, but how can I add a shaded area/color between the upper and lower lines in a seaborn chart?
The code I've been working on is the following:
plt.figure(figsize=(18,10))
sns.set(style="darkgrid")
palette = sns.color_palette("mako_r", 3)
sns.lineplot(x="Date", y="Value", hue='Std_Type', style='Value_Type', sizes=(.25, 2.5), palette = palette, data=tbl4)
The idea is to get an effect like the one below (the example from the seaborn website):
But I could not replicate the effect, although my data is structured in much the same way as fmri (the seaborn example).
From the seaborn link:
import seaborn as sns
sns.set(style="darkgrid")
# Load an example dataset with long-form data
fmri = sns.load_dataset("fmri")
# Plot the responses for different events and regions
sns.lineplot(x="timepoint", y="signal",
hue="region", style="event",
data=fmri)
Do you have some ideas?
I tried to change the chart style, but if I go to a distplot or relplot, for example, the x_axis cannot show the timeframe...
Check this code:
# import
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
import pandas as pd
sns.set(style = 'darkgrid')
# data generation
time = pd.date_range(start = '2006-01-01', end = '2020-01-01', freq = 'M')
tbl4 = pd.DataFrame({'Date': time,
'down': 1 - 0.5*np.random.randn(len(time)),
'up': 4 + 0.5*np.random.randn(len(time))})
tbl4 = tbl4.melt(id_vars = 'Date',
value_vars = ['down', 'up'],
var_name = 'Std_Type',
value_name = 'Value')
# figure plot
fig, ax = plt.subplots(figsize=(18,10))
sns.lineplot(ax = ax,
x = 'Date',
y = 'Value',
hue = 'Std_Type',
data = tbl4)
# fill area
plt.fill_between(x = tbl4[tbl4['Std_Type'] == 'down']['Date'],
y1 = tbl4[tbl4['Std_Type'] == 'down']['Value'],
y2 = tbl4[tbl4['Std_Type'] == 'up']['Value'],
alpha = 0.3,
facecolor = 'green')
plt.show()
which gives me this plot:
Since I do not have access to your data, I generated random ones. Replace them with yours.
The shaded area is drawn with plt.fill_between, where you specify the x array (common to both curves), the two limits of the area as y1 and y2 (here the lower and upper curves) and, optionally, a color and its transparency with the facecolor and alpha parameters respectively.
You cannot do it through the ci parameter, since that is used to show the confidence interval of your data.
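A variant of the fill step, as a sketch: pivoting the long table back to wide form lines up `down` and `up` by date, so the fill does not depend on both groups being stored in the same row order. The column names follow the generated data above; the fixed seed, the shorter date range and the `wide` name are assumptions made only for illustration.

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

# Same shape of data as above (random values, fixed seed for repeatability)
rng = np.random.default_rng(0)
time = pd.date_range(start="2006-01-01", periods=24, freq="MS")
tbl4 = pd.DataFrame({"Date": time,
                     "down": 1 - 0.5 * rng.standard_normal(len(time)),
                     "up": 4 + 0.5 * rng.standard_normal(len(time))})
tbl4 = tbl4.melt(id_vars="Date", value_vars=["down", "up"],
                 var_name="Std_Type", value_name="Value")

# Wide form: one row per date, one column per Std_Type
wide = tbl4.pivot(index="Date", columns="Std_Type", values="Value")

fig, ax = plt.subplots(figsize=(18, 10))
ax.plot(wide.index, wide["down"], label="down")
ax.plot(wide.index, wide["up"], label="up")
# Shade the band between the two curves
ax.fill_between(wide.index, wide["down"], wide["up"],
                alpha=0.3, facecolor="green")
ax.legend()
```

Call plt.show() afterwards to display the figure.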
I am trying to plot two columns of a pandas dataframe against each other, grouped by the values in a third column. The color of each line should be determined by that third column, i.e. one color per group.
For example:
import pandas as pd
from matplotlib import pyplot as plt
fig, ax = plt.subplots()
df = pd.DataFrame({'x': [0.1,0.2,0.3,0.1,0.2,0.3,0.1,0.2,0.3],'y':[1,2,3,2,3,4,4,3,2], 'colors':[0.3,0.3,0.3,0.7,0.7,0.7,1.3,1.3,1.3]})
df.groupby('colors').plot('x','y',ax=ax)
If I do it this way, I end up with three different lines plotting x against y, with each line a different color. I now want to determine the color by the values in 'colors'. How do I do this using a gradient colormap?
It looks like seaborn applies the color intensity automatically based on the value in hue.
import pandas as pd
from matplotlib import pyplot as plt
df = pd.DataFrame({'x': [0.1,0.2,0.3,0.1,0.2,0.3,0.1,0.2,0.3,0.1,0.2,0.3],'y':[1,2,3,2,3,4,4,3,2,3,4,2], 'colors':[0.3,0.3,0.3,0.7,0.7,0.7,1.3,1.3,1.3,1.5,1.5,1.5]})
import seaborn as sns
sns.lineplot(data = df, x = 'x', y = 'y', hue = 'colors')
Gives:
You can change the colors by adding the palette argument, as below:
import seaborn as sns
sns.lineplot(data = df, x = 'x', y = 'y', hue = 'colors', palette = 'mako')
#more combinations : viridis, mako, flare, etc.
gives:
Edit (for colormap):
based on answers at Make seaborn show a colorbar instead of a legend when using hue in a bar plot?
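import seaborn as sns
# sns.lineplot returns a matplotlib Axes, not a Figure
ax = sns.lineplot(data = df, x = 'x', y = 'y', hue = 'colors', palette = 'mako')
norm = plt.Normalize(vmin = df['colors'].min(), vmax = df['colors'].max())
sm = plt.cm.ScalarMappable(cmap="mako", norm = norm)
sm.set_array([])  # only needed on older matplotlib versions
ax.figure.colorbar(sm, ax = ax)  # recent matplotlib needs the target axes passed explicitly
ax.get_legend().remove()
plt.show()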
gives..
Hope that helps..
Complementing Prateek's very good answer: once you have assigned the colors based on the intensity of the palette you chose (for example, mako):
plots = sns.lineplot(data = df, x = 'x', y = 'y', hue = 'colors',palette='mako')
You can add a colorbar with matplotlib's function plt.colorbar() and assign the palette you used:
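norm = plt.Normalize(vmin=df['colors'].min(), vmax=df['colors'].max())
sm = plt.cm.ScalarMappable(cmap='mako', norm=norm)  # the norm makes the colorbar span the data range
plt.colorbar(sm, ax=plots)  # recent matplotlib needs the target axes passed explicitly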
After plt.show(), we get the combined output:
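For the original groupby approach without seaborn, one possible sketch: map each group's value in 'colors' through a normalizer and a colormap, and pass the resulting color to plot. The viridis colormap is an arbitrary choice here, not something taken from the question.

```python
import pandas as pd
import matplotlib.pyplot as plt
from matplotlib.cm import ScalarMappable
from matplotlib.colors import Normalize

df = pd.DataFrame({'x': [0.1, 0.2, 0.3, 0.1, 0.2, 0.3, 0.1, 0.2, 0.3],
                   'y': [1, 2, 3, 2, 3, 4, 4, 3, 2],
                   'colors': [0.3, 0.3, 0.3, 0.7, 0.7, 0.7, 1.3, 1.3, 1.3]})

fig, ax = plt.subplots()
# Scale the 'colors' values to the 0..1 range that a colormap expects
norm = Normalize(vmin=df['colors'].min(), vmax=df['colors'].max())
cmap = plt.get_cmap('viridis')  # assumed colormap, any gradient map works

# One line per group, colored by the group's own value
for value, grp in df.groupby('colors'):
    ax.plot(grp['x'], grp['y'], color=cmap(norm(value)))

# A colorbar that uses the same norm and colormap as the lines
sm = ScalarMappable(norm=norm, cmap=cmap)
sm.set_array([])  # only needed on older matplotlib versions
fig.colorbar(sm, ax=ax, label='colors')
```

Because the line colors and the colorbar share one norm, values between the existing groups would fall at the matching point of the gradient.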
Let's assume I have the following data:
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
data = pd.DataFrame(dict(x=[1,2,3]*4,
y=list(range(12)),
row_separator = ["a"]*6 + ["b"]*6,
col_separator = (["a"]*3 + ["b"]*3) * 2
)
)
data
I'd like to have a simple line plot of x and y for each combination of row_separator and col_separator, and I want to do this using sns.relplot(). y-scales are different, which is fine, but I want to have the same lower bound 0 for all sub-plots.
With plt.subplots, it's like this:
fig, axs = plt.subplots(2,2)
for i in [0,1]:
    for j in [0,1]:
        ind = (data.row_separator == data.row_separator.unique()[i]) &\
              (data.col_separator == data.col_separator.unique()[j])
        axs[i,j].plot(data.loc[ind,"x"], data.loc[ind,"y"])
        axs[i,j].set_ylim(bottom=0)
With seaborn, I'm struggling to do this. My starting point would be
g = sns.relplot(data=data,
x="x",
y="y",
col="row_separator",
row="col_separator",
kind="line"
)
I would expect facet_kws={"ylim":(0, None)} to do what I want, but it doesn't (instead it limits all plots to the range 0 to 1).
I'm fairly new to Python and I'm struggling with annotating plots at the moment.
I've come from R so I'm used to the ease of being able to annotate scatterplot points with minimum code.
Code:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import matplotlib as mpl
url = ('https://fbref.com/en/share/nXtrf')
df = pd.read_html(url)[0]
df = df[['Unnamed: 1_level_0', 'Unnamed: 2_level_0', 'Play', 'Perf']].copy()
df.columns = df.columns.droplevel()
df = df[['Player','Squad','Min','SoTA','Saves']]
df = df.drop([25])
df['Min'] = pd.to_numeric(df['Min'])
df['SoTA'] = pd.to_numeric(df['SoTA'])
df['Saves'] = pd.to_numeric(df['Saves'])
df['Min'] = df[df['Min'] > 1600]['Min']
df = df.dropna()
df.plot(x = 'Saves', y = 'SoTA', kind = "scatter")
I've tried numerous ways to annotate this plot. I'd like the points to be annotated with the corresponding data from the 'Player' column.
I've tried using a label_point function that I found while looking for a workaround, but I keep getting KeyError: 0 with most of the approaches I try.
Any assistance would be great. Thanks.
You could loop through both columns and add a text for each entry. Note that you need to save the ax returned by df.plot(...).
ax = df.plot(x='Saves', y='SoTA', kind="scatter")
for x, y, player in zip(df['Saves'], df['SoTA'], df['Player']):
    ax.text(x, y, f'{player}', ha='left', va='bottom')
xmin, xmax = ax.get_xlim()
ax.set_xlim(xmin, xmax + 0.15 * (xmax - xmin)) # some more margin to fit the texts
An alternative is to use the mplcursors library to show an annotation while hovering (or after a click):
import mplcursors
mplcursors.cursor(hover=True)
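The same loop can also be written with ax.annotate, which lets the label offset be given in points rather than data units. This is a sketch on a small made-up table (the player names and numbers are placeholders, not the fbref data), with the index deliberately left non-contiguous, as it would be after drop and dropna:

```python
import pandas as pd
import matplotlib.pyplot as plt

# Placeholder data standing in for the scraped table (hypothetical values)
df = pd.DataFrame({'Player': ['Keeper A', 'Keeper B', 'Keeper C'],
                   'SoTA': [120, 95, 140],
                   'Saves': [85, 70, 98]},
                  index=[1, 4, 7])  # gaps in the index, and no label 0

ax = df.plot(x='Saves', y='SoTA', kind='scatter')

# itertuples walks the rows in order, so missing index labels do not matter
for row in df.itertuples():
    ax.annotate(row.Player,
                xy=(row.Saves, row.SoTA),    # the point being labelled
                xytext=(4, 4),               # label offset from the point
                textcoords='offset points')  # offset measured in points
```

Iterating over the row values, instead of looking rows up by integer position, may also be what avoids the KeyError: 0, since label 0 need not exist in the index once rows have been dropped.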
I want to overlay the 95th percentile values on a seaborn boxplot. I could not figure out how to overlay text, or whether seaborn has a capability for that. How would I modify the following code to overlay the 95th percentile values on the plot?
import pandas as pd
import numpy as np
import seaborn as sns
df = pd.DataFrame(np.random.randn(200, 4), columns=list('ABCD'))*100
alphabet = list('AB')
df['Gr'] = np.random.choice(np.array(alphabet, dtype="|S1"), df.shape[0])
df_long = pd.melt(df, id_vars=['Gr'], value_vars = ['A','B','C','D'])
sns.boxplot(x = "variable", y="value", hue = 'Gr', data=df_long, whis = [5,95])
Consider calling text on the Axes that seaborn returns, borrowing from @bernie's answer (also a healthy +1 for including a sample dataset). The only challenge is adjusting the alignment, because of the grouping in the hue field, so that the labels sit over each boxplot series. The labels can even be color coded according to series.
import pandas as pd
import numpy as np
import seaborn as sns
import matplotlib.pyplot as plt
np.random.seed(61518)
# ... same as OP
# keep the Axes returned by the OP's boxplot call so that text can be added to it
hplot = sns.boxplot(x = "variable", y="value", hue = 'Gr', data=df_long, whis = [5,95])
# 95TH PERCENTILE SERIES
pctl95 = df_long.groupby(['variable', 'Gr'])['value'].quantile(0.95)
pctl95_labels = [str(np.round(s, 2)) for s in pctl95]
# GROUP INDEX TUPLES
grps = [(i, 2*i, 2*i+1) for i in range(4)]
# [(0,0,1), (1,2,3), (2,4,5), (3,6,7)]
pos = range(len(pctl95))
# ADJUST HORIZONTAL ALIGNMENT WITH MORE SERIES
for tick, label in zip(grps, hplot.get_xticklabels()):
    # .iloc selects by position; pctl95 is indexed by (variable, Gr) pairs
    hplot.text(tick[0]-0.1, pctl95.iloc[tick[1]] + 0.95, pctl95_labels[tick[1]],
               ha='center', size='x-small', color='b', weight='semibold')
    hplot.text(tick[0]+0.1, pctl95.iloc[tick[2]] + 0.95, pctl95_labels[tick[2]],
               ha='center', size='x-small', color='g', weight='semibold')
plt.show()  # sns.plt is no longer available in current seaborn
I'm drawing several point plots in seaborn on the same graph. The x-axis is ordinal, not numerical; the ordinal values are the same for each point plot. I would like to shift each plot a bit to the side, the way pointplot(dodge=...) parameter does within multiple lines within a single plot, but in this case for multiple different plots drawn on top of each other. How can I do that?
Ideally, I'd like a technique that works for any matplotlib plot, not just seaborn specifically. Adding an offset to the data won't work easily, since the data is not numerical.
Example that shows the plots overlapping and making them hard to read (dodge within each plot works okay)
import pandas as pd
import seaborn as sns
df1 = pd.DataFrame({'x':list('ffffssss'), 'y':[1,2,3,4,5,6,7,8], 'h':list('abababab')})
df2 = df1.copy()
df2['y'] = df2['y']+0.5
sns.pointplot(data=df1, x='x', y='y', hue='h', ci='sd', errwidth=2, capsize=0.05, dodge=0.1, markers='<')
sns.pointplot(data=df2, x='x', y='y', hue='h', ci='sd', errwidth=2, capsize=0.05, dodge=0.1, markers='>')
I could use something other than seaborn, but the automatic confidence / error bars are very convenient so I'd prefer to stick with seaborn here.
Answering this for the most general case first.
A dodge can be implemented by shifting the artists in the figure by some amount. It might be useful to use points as units of that shift. E.g. you may want to shift your markers on the plot by 5 points.
This shift can be accomplished by adding a translation to the data transform of the artist. Here I propose a ScaledTranslation.
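As a minimal sketch of that idea: ScaledTranslation(dx, dy, fig.dpi_scale_trans) interprets dx and dy in inches, so a shift given in points is divided by 72. The 10-point value and the variable names below are chosen only for illustration.

```python
import matplotlib.pyplot as plt
from matplotlib import transforms

fig, ax = plt.subplots()

dodge_pt = 10  # horizontal shift in points (1 point = 1/72 inch)
offset = transforms.ScaledTranslation(dodge_pt / 72., 0, fig.dpi_scale_trans)
shifted = ax.transData + offset  # data coordinates first, then the fixed shift

# Same data twice: once with the default transform, once shifted sideways
ax.plot([0, 1], [0, 1], marker="o", label="original")
ax.plot([0, 1], [0, 1], marker="o", transform=shifted, label="shifted 10 pt")
ax.legend()

# The shift measured in display pixels, independent of the data limits
base_px = ax.transData.transform((0, 0))
shift_px = shifted.transform((0, 0)) - base_px
```

Because the translation is applied after the data transform, the offset stays the same on screen when the axes are zoomed or rescaled.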
Now to keep this most general, one may write a function which takes the plotting method, the axes and the data as input, and in addition some dodge to apply, e.g.
draw_dodge(ax.errorbar, X, y, yerr =y/4., ax=ax, dodge=d, marker="d" )
The full functional code:
import matplotlib.pyplot as plt
from matplotlib import transforms
import numpy as np
import pandas as pd
def draw_dodge(*args, **kwargs):
    func = args[0]
    dodge = kwargs.pop("dodge", 0)
    ax = kwargs.pop("ax", plt.gca())
    trans = ax.transData + transforms.ScaledTranslation(dodge/72., 0,
                                       ax.figure.dpi_scale_trans)
    artist = func(*args[1:], **kwargs)
    def iterate(artist):
        if hasattr(artist, '__iter__'):
            for obj in artist:
                iterate(obj)
        else:
            artist.set_transform(trans)
    iterate(artist)
    return artist
X = ["a", "b"]
Y = np.array([[1,2],[2,2],[3,2],[1,4]])
Dodge = np.arange(len(Y),dtype=float)*10
Dodge -= Dodge.mean()
fig, ax = plt.subplots()
for y,d in zip(Y,Dodge):
    draw_dodge(ax.errorbar, X, y, yerr =y/4., ax=ax, dodge=d, marker="d" )
ax.margins(x=0.4)
plt.show()
You may use this with ax.plot, ax.scatter etc. However not with any of the seaborn functions, because they don't return any useful artist to work with.
Now for the case in question, the remaining problem is to get the data in a useful format. One option would be the following.
df1 = pd.DataFrame({'x':list('ffffssss'),
'y':[1,2,3,4,5,6,7,8],
'h':list('abababab')})
df2 = df1.copy()
df2['y'] = df2['y']+0.5
N = len(np.unique(df1["x"].values))*len([df1,df2])
Dodge = np.linspace(-N,N,N)/N*10
fig, ax = plt.subplots()
k = 0
for df in [df1,df2]:
    for (n, grp) in df.groupby("h"):
        # select "y" so that the non-numeric "h" column is not aggregated
        x = grp.groupby("x")["y"].mean()
        std = grp.groupby("x")["y"].std()
        draw_dodge(ax.errorbar, x.index, x.values,
                   yerr =std.values, ax=ax,
                   dodge=Dodge[k], marker="o", label=n)
        k+=1
ax.legend()
ax.margins(x=0.4)
plt.show()
You can use linspace to shift your graphs easily to where you want them to start and end. The function also makes it easy to scale the graphs so that they have the same visual width.
import numpy as np
import matplotlib.pyplot as plt
start_offset = 3
end_offset = start_offset
y1 = np.random.randint(0, 10, 20) ## y1 has 20 random ints from 0 to 9 (the upper bound 10 is exclusive)
y2 = np.random.randint(0, 10, 10) ## y2 has 10 random ints from 0 to 9
x1 = np.linspace(0, 20, y1.size) ## y1.size evenly spaced x values from 0 to 20, both ends included
x2 = np.linspace(0, 20, y2.size) ## same x range, so both curves span the same width
plt.plot(x1, y1)
plt.plot(x2, y2)
plt.show()
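The snippet above defines start_offset and end_offset but never uses them. One way they might be applied, as a guess at the intent rather than the answer's confirmed method, is to pass them as the endpoints of the second linspace so that the second curve starts later and ends earlier:

```python
import numpy as np
import matplotlib.pyplot as plt

start_offset = 3
end_offset = start_offset

rng = np.random.default_rng(0)  # fixed seed, only for repeatability
y1 = rng.integers(0, 10, 20)    # 20 random ints from 0 to 9
y2 = rng.integers(0, 10, 10)    # 10 random ints from 0 to 9

# The first curve spans the full range 0..20
x1 = np.linspace(0, 20, y1.size)
# The second curve is squeezed into start_offset..(20 - end_offset)
x2 = np.linspace(start_offset, 20 - end_offset, y2.size)

fig, ax = plt.subplots()
ax.plot(x1, y1)
ax.plot(x2, y2)
```

Since linspace includes both endpoints, the second curve begins exactly at x = 3 and ends at x = 17 regardless of how many points it has.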