I can't seem to get the labels on the x-axis to rotate 90 degrees.
Example df:
import pandas as pd
import matplotlib.pyplot as plt
d = ({
'A' : ['1','1','2','2','3','3','3'],
'B' : ['A','B','C','C','D','B','C'],
'C' : ['Foo','Bar','Foo','Bar','Cat','Bar','Cat'],
})
df = pd.DataFrame(data=d)
fig,ax = plt.subplots(figsize = (9,4))
df.assign(A=df.A.astype(int)).pivot_table(index="C", columns="B", values="A",aggfunc='count').rename_axis(None).rename_axis(None,axis=1).plot(kind='bar')
plt.show()
I have tried the basic:
plt.xticks(rotation = 90)
I also tried this, but it raises an AttributeError:
df.assign(A=df.A.astype(int)).pivot_table(index="C", columns="B", values="A",aggfunc='count').rename_axis(None).rename_axis(None,axis=1).plot(kind='bar', rotation = 90)
I have got the labels to rotate through this:
xticklabels = df.C.unique()
ax.set_xticklabels(xticklabels, rotation = 0)
But the labels come out in the wrong order: this just takes the values in the order they appear in df.C rather than matching each label to its bar.
I ran the code below to produce the labels at an angle of 0. I didn't understand why two plots were generated, so I commented out the line fig,ax = plt.subplots()
import pandas as pd
import matplotlib.pyplot as plt
d = ({
'A' : ['1','1','2','2','3','3','3'],
'B' : ['A','B','C','C','D','B','C'],
'C' : ['Foo','Bar','Foo','Bar','Cat','Bar','Cat'],
})
df = pd.DataFrame(data=d)
#fig,ax = plt.subplots()
df.assign(A=df.A.astype(int)).pivot_table(index="C", columns="B",
    values="A",aggfunc='count').rename_axis(None).rename_axis(None,axis=1).plot(kind='bar')
plt.xticks(rotation = 0)
plt.show()
You can control the x tick labels by creating a subplot, passing its axes to the plot, and setting the labels on that axes, like this:
import pandas as pd
import matplotlib.pyplot as plt
fig = plt.figure()
ax = fig.add_subplot(1, 1, 1)
d = ({
'A' : ['1','1','2','2','3','3','3'],
'B' : ['A','B','C','C','D','B','C'],
'C' : ['Foo','Bar','Foo','Bar','Cat','Bar','Cat'],
})
df = pd.DataFrame(data=d)
udf = (df.assign(A=df.A.astype(int))
       .pivot_table(index="C", columns="B", values="A",aggfunc='count')
       .rename_axis(None)
       .rename_axis(None, axis=1))  # axis must be passed by keyword in pandas 2.x
udf.plot(kind='bar', ax=ax)
labels = ax.set_xticklabels(udf.index.values, rotation=0, fontsize=14)
The output would be:
One more thing: I think you need a rotation of 0 degrees, since pandas bar plots rotate the x tick labels by 90 degrees by default.
PS: Long chaining in pandas operations really eats away the readability.
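As a side note on the AttributeError in the question: the pandas plotting wrapper takes the tick rotation through its rot keyword rather than rotation. The following is only a minimal sketch on the same example data; the name counts is illustrative and not from the original posts. Passing ax=ax also keeps pandas from opening a second figure.

```python
import pandas as pd
import matplotlib.pyplot as plt

d = {
    'A': ['1', '1', '2', '2', '3', '3', '3'],
    'B': ['A', 'B', 'C', 'C', 'D', 'B', 'C'],
    'C': ['Foo', 'Bar', 'Foo', 'Bar', 'Cat', 'Bar', 'Cat'],
}
df = pd.DataFrame(data=d)

# `counts` is an illustrative name for the pivoted table
counts = (df.assign(A=df.A.astype(int))
            .pivot_table(index="C", columns="B", values="A", aggfunc='count')
            .rename_axis(None)
            .rename_axis(None, axis=1))

fig, ax = plt.subplots(figsize=(9, 4))
# rot=0 keeps the x tick labels horizontal; ax=ax draws into the existing axes
counts.plot(kind='bar', ax=ax, rot=0)
```

Call plt.show() afterwards to display the figure. Because the labels come from the index of the pivoted table, they stay matched to their bars.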
Related
I am trying to plot two columns of a pandas dataframe against each other, grouped by the values in a third column. The color of each line should be determined by that third column, i.e. one color per group.
For example:
import pandas as pd
from matplotlib import pyplot as plt
fig, ax = plt.subplots()
df = pd.DataFrame({'x': [0.1,0.2,0.3,0.1,0.2,0.3,0.1,0.2,0.3],'y':[1,2,3,2,3,4,4,3,2], 'colors':[0.3,0.3,0.3,0.7,0.7,0.7,1.3,1.3,1.3]})
df.groupby('colors').plot('x','y',ax=ax)
If I do it this way, I end up with three different lines plotting x against y, with each line a different color. I now want to determine the color by the values in 'colors'. How do I do this using a gradient colormap?
Looks like seaborn applies the color intensity automatically based on the value in hue.
import pandas as pd
from matplotlib import pyplot as plt
df = pd.DataFrame({'x': [0.1,0.2,0.3,0.1,0.2,0.3,0.1,0.2,0.3,0.1,0.2,0.3],'y':[1,2,3,2,3,4,4,3,2,3,4,2], 'colors':[0.3,0.3,0.3,0.7,0.7,0.7,1.3,1.3,1.3,1.5,1.5,1.5]})
import seaborn as sns
sns.lineplot(data = df, x = 'x', y = 'y', hue = 'colors')
Gives:
You can change the colors by adding the palette argument, as below:
import seaborn as sns
sns.lineplot(data = df, x = 'x', y = 'y', hue = 'colors', palette = 'mako')
#more combinations : viridis, mako, flare, etc.
gives:
Edit (for a colorbar):
Based on the answers to "Make seaborn show a colorbar instead of a legend when using hue in a bar plot?":
import seaborn as sns
ax = sns.lineplot(data = df, x = 'x', y = 'y', hue = 'colors', palette = 'mako')  # lineplot returns an Axes
norm = plt.Normalize(vmin = df['colors'].min(), vmax = df['colors'].max())
sm = plt.cm.ScalarMappable(cmap="mako", norm = norm)
ax.figure.colorbar(sm, ax = ax)  # pass ax so matplotlib knows which axes to attach the colorbar to
ax.get_legend().remove()
plt.show()
Gives:
Hope that helps.
To complement Prateek's very good answer: once you have assigned the colors based on the intensity of the palette you chose (for example mako):
plots = sns.lineplot(data = df, x = 'x', y = 'y', hue = 'colors',palette='mako')
You can add a colorbar with matplotlib's function plt.colorbar() and assign the palette you used:
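norm = plt.Normalize(df['colors'].min(), df['colors'].max())  # match the colorbar range to the data
sm = plt.cm.ScalarMappable(cmap='mako', norm=norm)
plt.colorbar(sm, ax=plots)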
After plt.show(), we get the combined output:
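For comparison, the same idea can be sketched without seaborn, which is closer to the groupby approach in the question. This is only a sketch under assumptions: it swaps in matplotlib's built-in viridis colormap instead of mako, and it draws one line per group with the color looked up from the group's value.

```python
import pandas as pd
import matplotlib.pyplot as plt

df = pd.DataFrame({'x': [0.1, 0.2, 0.3, 0.1, 0.2, 0.3, 0.1, 0.2, 0.3],
                   'y': [1, 2, 3, 2, 3, 4, 4, 3, 2],
                   'colors': [0.3, 0.3, 0.3, 0.7, 0.7, 0.7, 1.3, 1.3, 1.3]})

# Map the range of the 'colors' column onto [0, 1] for the colormap
norm = plt.Normalize(vmin=df['colors'].min(), vmax=df['colors'].max())
cmap = plt.get_cmap('viridis')  # assumption: any matplotlib colormap name works here

fig, ax = plt.subplots()
for value, group in df.groupby('colors'):
    # One line per group, colored by the group's value
    ax.plot(group['x'], group['y'], color=cmap(norm(value)), label=str(value))

# The colorbar shares the same norm and colormap as the lines
sm = plt.cm.ScalarMappable(cmap=cmap, norm=norm)
fig.colorbar(sm, ax=ax, label='colors')
```

Call plt.show() afterwards to display the figure. Sharing one norm between the lines and the colorbar is what keeps the colorbar scale consistent with the line colors.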
Sorry for my noob question, but how can I add a shadow area/color between the upper and lower lines in a seaborn chart?
The primary code I've been working on is the following:
plt.figure(figsize=(18,10))
sns.set(style="darkgrid")
palette = sns.color_palette("mako_r", 3)
sns.lineplot(x="Date", y="Value", hue='Std_Type', style='Value_Type', sizes=(.25, 2.5), palette = palette, data=tbl4)
The idea is to get some effect like below (the example from seaborn website):
But I could not replicate the effect, although my data is structured in pretty much the same way as fmri (the seaborn example).
From the seaborn link:
import seaborn as sns
sns.set(style="darkgrid")
# Load an example dataset with long-form data
fmri = sns.load_dataset("fmri")
# Plot the responses for different events and regions
sns.lineplot(x="timepoint", y="signal",
hue="region", style="event",
data=fmri)
Do you have some ideas?
I tried to change the chart style, but if I go to a distplot or relplot, for example, the x_axis cannot show the timeframe...
Check this code:
# import
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
import pandas as pd
sns.set(style = 'darkgrid')
# data generation
time = pd.date_range(start = '2006-01-01', end = '2020-01-01', freq = 'M')
tbl4 = pd.DataFrame({'Date': time,
'down': 1 - 0.5*np.random.randn(len(time)),
'up': 4 + 0.5*np.random.randn(len(time))})
tbl4 = tbl4.melt(id_vars = 'Date',
value_vars = ['down', 'up'],
var_name = 'Std_Type',
value_name = 'Value')
# figure plot
fig, ax = plt.subplots(figsize=(18,10))
sns.lineplot(ax = ax,
x = 'Date',
y = 'Value',
hue = 'Std_Type',
data = tbl4)
# fill area
plt.fill_between(x = tbl4[tbl4['Std_Type'] == 'down']['Date'],
y1 = tbl4[tbl4['Std_Type'] == 'down']['Value'],
y2 = tbl4[tbl4['Std_Type'] == 'up']['Value'],
alpha = 0.3,
facecolor = 'green')
plt.show()
which gives me this plot:
Since I do not have access to your data, I generated random data. Replace it with yours.
The shadow area is drawn with plt.fill_between (documentation here), where you specify the x array (common to both curves), the two limits of the area as y1 and y2 and, optionally, a color and its transparency with the facecolor and alpha parameters respectively.
You cannot do this through the ci parameter, since that is used to show the confidence interval of your data. seaborn only draws that band when there are several observations per x value to aggregate, as in the fmri example.
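One detail worth checking with fill_between is that y1 and y2 really refer to the same dates in the same order. A possible way to make that explicit, sketched here on random placeholder data in the same shape as tbl4, is to pivot the long-form table back to wide form first; the name wide is illustrative and the seaborn lines are replaced by plain matplotlib lines to keep the sketch short.

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

# Synthetic long-form data in the same shape as tbl4 (values are random placeholders)
rng = np.random.default_rng(0)
dates = pd.date_range(start='2006-01-01', periods=24, freq='MS')
tbl4 = pd.DataFrame({'Date': dates,
                     'down': 1 - 0.5 * rng.standard_normal(len(dates)),
                     'up': 4 + 0.5 * rng.standard_normal(len(dates))})
tbl4 = tbl4.melt(id_vars='Date', value_vars=['down', 'up'],
                 var_name='Std_Type', value_name='Value')

# Back to wide form: one row per Date, one column per Std_Type,
# so 'down' and 'up' are guaranteed to line up on the same dates
wide = tbl4.pivot(index='Date', columns='Std_Type', values='Value')

fig, ax = plt.subplots(figsize=(18, 10))
ax.plot(wide.index, wide['down'], label='down')
ax.plot(wide.index, wide['up'], label='up')
ax.fill_between(wide.index, wide['down'], wide['up'], alpha=0.3, facecolor='green')
ax.legend()
```

Call plt.show() afterwards to display the figure.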
I was trying to plot multiple lmplots in the same figure. But I am getting too many unwanted subplots.
I found another SO link How to plot 2 seaborn lmplots side-by-side? but that also did not help me.
In this example I want 1 row 2 columns.
MWE
# imports
import numpy as np
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
# data
df = sns.load_dataset('titanic')
# plot
m,n = 1,2
figsize=(12,8)
cols1 = ['age','fare']
cols2 = ['fare','age']
target = 'survived'
fontsize = 12
fig, ax = plt.subplots(m,n,figsize=figsize)
for i, (col1,col2) in enumerate(zip(cols1,cols2)):
    plt.subplot(m,n,i+1)
    sns.lmplot(x=col1,y=col2,data=df,
               hue=target, palette='Set1',
               scatter_kws={'alpha':0.3})
    plt.xlabel(col1,fontsize=fontsize)
    plt.ylabel(col2,fontsize=fontsize)
    plt.tick_params(axis='both', which='major', labelsize=fontsize)
plt.tight_layout()
for i in range(m*n-len(cols1)):
    ax.flat[-(i+1)].set_visible(False)
My attempt so far:
df = pd.DataFrame({'x0':[10,20,30,40],
'y0': [100,200,300,400],
'x1':[0.1,0.2,0.3,0.1],
'y1':[0.01,0.02,0.03,0.01],
'target': [0,1,1,1]
})
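df1 = pd.concat([df, df], ignore_index=True)  # DataFrame.append was removed in pandas 2.0
df1 = df1.astype({'x0': float, 'y0': float})  # the second half will hold floats
df1.loc[len(df):, 'x0'] = df['x1'].to_numpy()
df1.loc[len(df):, 'y0'] = df['y1'].to_numpy()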
df1['col'] = ['c0']* len(df) + ['c1'] * len(df)
df1 = df1.drop(['x1','y1'],axis=1)
df1 = df1.rename(columns={'x0':'x','y0':'y'})
sns.lmplot(x='x',y='y',hue='target',data=df1,col='col')
Output:
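The extra subplots in the first attempt come from lmplot being a figure-level function: each call builds its own figure and ignores the axes created by plt.subplots. One possible workaround, sketched below, is the axes-level sns.regplot, which accepts an ax argument. The data here is a random stand-in for the titanic columns, and since regplot has no hue parameter the sketch loops over the target values itself.

```python
import numpy as np
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt

# Synthetic stand-in for the titanic columns used above (values are random placeholders)
rng = np.random.default_rng(0)
df = pd.DataFrame({'age': rng.uniform(1, 80, 200),
                   'fare': rng.uniform(5, 300, 200),
                   'survived': rng.integers(0, 2, 200)})

cols1 = ['age', 'fare']
cols2 = ['fare', 'age']
target = 'survived'

fig, axes = plt.subplots(1, 2, figsize=(12, 8))
for ax, col1, col2 in zip(axes, cols1, cols2):
    # regplot has no hue parameter, so draw one regression per target value
    for value, group in df.groupby(target):
        sns.regplot(x=col1, y=col2, data=group, ax=ax,
                    scatter_kws={'alpha': 0.3}, label=f'{target}={value}')
    ax.legend()
fig.tight_layout()
```

Call plt.show() afterwards to display the figure. This gives 1 row and 2 columns in a single figure, at the cost of handling the hue grouping by hand.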
I have a pandas dataframe df with columns x (categorical), y, and z (both floats).
Here is my bar plot.
sns.barplot(data=df, x="x", y="y")
How can I set a color palette for the bars based on the values of the z column? I would like to set a Matplotlib style palette like magma or RdYlBu. Basically, like setting the hue argument, but with a scalar variable.
Thanks in advance!
I'm not sure if there is a way to do this in seaborn. But usually using matplotlib directly works as well.
import matplotlib.pyplot as plt
import pandas as pd
df = pd.DataFrame({"x" : list("ABCDEFGH"),
"y" : [3,4,5,2,1,6,3,4],
"z" : [4,5,7,1,4,5,3,4]})
norm = plt.Normalize(df.z.min(), df.z.max())
cmap = plt.get_cmap("magma")
plt.bar(x="x", height="y", data=df, color=cmap(norm(df.z.values)))
plt.show()
If your "categorical" column contains pandas categories, instead of simple strings, you would first need to convert it, df["x"] = df["x"].astype(str).
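To make the z scale readable, a colorbar can be attached to the same plot. The following is a sketch on the example data above, assuming an explicit figure and axes are acceptable; it reuses the norm and colormap that colored the bars.

```python
import pandas as pd
import matplotlib.pyplot as plt

df = pd.DataFrame({"x": list("ABCDEFGH"),
                   "y": [3, 4, 5, 2, 1, 6, 3, 4],
                   "z": [4, 5, 7, 1, 4, 5, 3, 4]})

norm = plt.Normalize(df.z.min(), df.z.max())
cmap = plt.get_cmap("magma")

fig, ax = plt.subplots()
# One color per bar, looked up from the normalized z value
bars = ax.bar(df["x"], df["y"], color=cmap(norm(df["z"].to_numpy())))

# Colorbar built from the same norm and colormap that colored the bars
sm = plt.cm.ScalarMappable(cmap=cmap, norm=norm)
fig.colorbar(sm, ax=ax, label="z")
```

Call plt.show() afterwards to display the figure.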
Simply use the palette argument that corresponds to the hue variable:
sns.barplot(data=df, x="x", y="y", hue="z", palette='magma')
To demonstrate with random data:
import numpy as np
import pandas as pd
import time
from datetime import datetime
import matplotlib.pyplot as plt
import seaborn as sns
data_tools = ['sas', 'stata', 'spss', 'python', 'r', 'julia']
np.random.seed(11212018)
rand_df = pd.DataFrame({'GROUP': np.random.choice(data_tools, 500),
'INT': np.random.randint(1, 10, 500),
'NUM': np.random.randn(500),
})
fig, ax = plt.subplots(figsize=(15,5))
sns.barplot(data=rand_df, x='GROUP', y='NUM', hue='INT', palette='magma', ax=ax, ci=None)
plt.legend(bbox_to_anchor=(1,0.5), loc="center right",)
plt.show()
I am trying to write a loop that will make a figure with 25 subplots, 1 for each country. My code makes a figure with 25 subplots, but the plots are empty. What can I change to make the data appear in the graphs?
fig = plt.figure()
for c,num in zip(countries, xrange(1,26)):
    df0=df[df['Country']==c]
    ax = fig.add_subplot(5,5,num)
    ax.plot(x=df0['Date'], y=df0[['y1','y2','y3','y4']], title=c)
fig.show()
You got confused between the matplotlib plotting function and the pandas plotting wrapper.
The problem you have is that ax.plot does not have any x or y argument.
Use ax.plot
In that case, call it like ax.plot(df0['Date'], df0[['y1','y2']]), without x, y and title. Possibly set the title separately.
Example:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
countries = np.random.choice(list("ABCDE"),size=25)
df = pd.DataFrame({"Date" : range(200),
'Country' : np.repeat(countries,8),
'y1' : np.random.rand(200),
'y2' : np.random.rand(200)})
fig = plt.figure()
for c,num in zip(countries, range(1,26)):  # range works on both Python 2 and 3
    df0=df[df['Country']==c]
    ax = fig.add_subplot(5,5,num)
    ax.plot(df0['Date'], df0[['y1','y2']])
    ax.set_title(c)
plt.tight_layout()
plt.show()
Use the pandas plotting wrapper
In this case plot your data via df0.plot(x="Date",y =['y1','y2']).
Example:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
countries = np.random.choice(list("ABCDE"),size=25)
df = pd.DataFrame({"Date" : range(200),
'Country' : np.repeat(countries,8),
'y1' : np.random.rand(200),
'y2' : np.random.rand(200)})
fig = plt.figure()
for c,num in zip(countries, range(1,26)):  # range works on both Python 2 and 3
    df0=df[df['Country']==c]
    ax = fig.add_subplot(5,5,num)
    df0.plot(x="Date",y =['y1','y2'], title=c, ax=ax, legend=False)
plt.tight_layout()
plt.show()
I don't remember very well how the original subplot system works, but you seem to be overwriting the plot. In any case, you should take a look at gridspec. Check the following example:
import matplotlib.pyplot as plt
import matplotlib.gridspec as gridspec
fig = plt.figure()
gs1 = gridspec.GridSpec(5, 5)
countries = ["Country " + str(i) for i in range(1, 26)]
axs = []
for c, num in zip(countries, range(1,26)):
    axs.append(fig.add_subplot(gs1[num - 1]))
    axs[-1].plot([1, 2, 3], [1, 2, 3])
plt.show()
Which results in this:
Just replace the example with your data and it should work fine.
NOTE: I've noticed you are using xrange. I've used range because my version of Python is 3.x. Adapt to your version.
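Another option, sketched here on synthetic data with 25 hypothetical country names, is to let plt.subplots create the whole 5x5 grid in one call and iterate over axes.flat. The sharex/sharey and figsize settings are assumptions for readability, not part of the original question.

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

# Synthetic data: 25 hypothetical countries with 8 dates each
countries = ["Country " + str(i) for i in range(1, 26)]
rng = np.random.default_rng(0)
df = pd.DataFrame({"Date": np.tile(np.arange(8), 25),
                   "Country": np.repeat(countries, 8),
                   "y1": rng.random(200),
                   "y2": rng.random(200)})

# One call creates the figure and a 5x5 array of axes
fig, axes = plt.subplots(5, 5, sharex=True, sharey=True, figsize=(12, 10))
for c, ax in zip(countries, axes.flat):
    df0 = df[df["Country"] == c]
    # Positional x and y; a 2D y array draws one line per column
    ax.plot(df0["Date"].to_numpy(), df0[["y1", "y2"]].to_numpy())
    ax.set_title(c)
fig.tight_layout()
```

Call plt.show() afterwards to display the figure.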