I am plotting time series whose y values differ by orders of magnitude.
I am using seaborn.lmplot and expected to find a normalise keyword, but could not find one.
I tried a log scale, but this failed (see diagram).
This is my best attempt so far:
import matplotlib.pyplot as plt
import seaborn as sns
import pandas as pd
gbp_stats = pd.read_csv('price_data.csv')
sns.lmplot(data=gbp_stats, x='numeric_time', y='last trade price', col='symbol')
plt.yscale('log')
plt.show()
Which gave me this:
As you can see, the y-axis needs to be scaled or normalized for each plot. I could do the normalization in pandas, but wanted to avoid that if possible.
So my question is this: does seaborn have a normalize feature that makes the y-axes easier to compare than what I have achieved?
I am posting this answer, which is derived directly from mwaskom's comment suggesting sharey=False, with a small tweak: that form was deprecated in seaborn, and sharey=False now goes into a dict.
The fix is to add the facet_kws keyword, which takes a dict like this: facet_kws={'sharey':False}
So the answer becomes this:
import matplotlib.pyplot as plt
import seaborn as sns
import pandas as pd
gbp_stats = pd.read_csv('price_data.csv')
sns.lmplot(data=gbp_stats, x='numeric_time', y='last trade price',
col='symbol', hue='symbol', facet_kws={'sharey':False})
plt.yscale('log') # this is optional now.
plt.show()
And the result is this:
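If a shared y-axis matters and you do decide to normalise in pandas after all, a per-symbol min-max rescaling is one option. This is only a sketch: the data below is made up to stand in for price_data.csv, and min_max and the 'normalised price' column are names introduced here for illustration.

```python
import pandas as pd

# Made-up stand-in for price_data.csv; the real file's contents are not known.
gbp_stats = pd.DataFrame({
    'symbol': ['AAA', 'AAA', 'AAA', 'BBB', 'BBB', 'BBB'],
    'numeric_time': [0, 1, 2, 0, 1, 2],
    'last trade price': [1.0, 1.5, 2.0, 1000.0, 1200.0, 1100.0],
})


def min_max(prices):
    """Rescale one symbol's prices to the range [0, 1] (hypothetical helper)."""
    return (prices - prices.min()) / (prices.max() - prices.min())


# transform() applies min_max within each symbol and keeps the original row order.
gbp_stats['normalised price'] = (
    gbp_stats.groupby('symbol')['last trade price'].transform(min_max)
)
print(gbp_stats)
```

The new column can then be passed as y to sns.lmplot in place of 'last trade price'.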
Related
Is there a way to add a mean and a mode to a violinplot? I have categorical data in one of my columns and the corresponding values in the next column. I looked into matplotlib's violin plot, as it technically offers the functionality I am looking for, but it does not allow me to specify a categorical variable on the x-axis. This is crucial, as I am looking at the distribution of the data per category. I have added a small table illustrating the shape of the data.
plt.figure(figsize=(10,15))
ax=sns.violinplot(x='category',y='value',data=df)
First we calculate the modes and means:
import seaborn as sns
import pandas as pd
from matplotlib import pyplot as plt
df = pd.DataFrame({'Category':[1,2,5,1,2,4,3,4,2],
'Value':[1.5,1.2,2.2,2.6,2.3,2.7,5,3,0]})
Means = df.groupby('Category')['Value'].mean()
Modes = df.groupby('Category')['Value'].agg(lambda x: pd.Series.mode(x)[0])
You can use seaborn to make the basic plot. Below I remove the inner boxplot using the inner= argument, so that the modes and means are visible:
fig, ax = plt.subplots()
sns.violinplot(x='Category',y='Value',data=df,inner=None)
plt.setp(ax.collections, alpha=.3)
plt.scatter(x=range(len(Means)),y=Means,c="k")
plt.scatter(x=range(len(Modes)),y=Modes)
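To check the numbers before plotting, the means and modes computed above can also be collected into one table. A minimal sketch, assuming that ties between modes should be resolved by taking the smallest value (which is what pd.Series.mode(x)[0] does, since mode() returns its results sorted); first_mode is a helper name introduced here for illustration.

```python
import pandas as pd

df = pd.DataFrame({'Category': [1, 2, 5, 1, 2, 4, 3, 4, 2],
                   'Value': [1.5, 1.2, 2.2, 2.6, 2.3, 2.7, 5, 3, 0]})


def first_mode(values):
    """Return the smallest mode of a Series (hypothetical helper)."""
    return values.mode().iloc[0]


# One row per category, with the mean and the (first) mode side by side.
summary = df.groupby('Category')['Value'].agg(mean='mean', mode=first_mode)
print(summary)
```

With data this small, every value in a category occurs once, so the "mode" is simply the smallest value in that category.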
If I use the following code, I end up with an overcrowded x-axis. I would like to show only every 10th number on the x-axis, i.e. [0,10,...].
Any idea how to do this?
import pandas as pd
import numpy as np
import seaborn as sns
from matplotlib import pyplot as plt
a = pd.DataFrame({'y':np.random.randn(100)})
a['time']=a.index
ax = sns.pointplot(x='time', y="y", data=a)
plt.show()
You may decide not to use a pointplot at all. A usual lineplot seems to suffice.
import pandas as pd
import numpy as np
from matplotlib import pyplot as plt
a = pd.DataFrame({'y':np.random.randn(100)})
plt.plot(a.index, a.y)
plt.show()
This gives ticks at steps of 20. The easiest option here would be to use
plt.xticks(range(0,101,10))
to get steps of 10. Alternatively, use
plt.gca().locator_params(nbins=11)
to divide the axis into 11 bins.
Of course, using an appropriate locator would work equally well.
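As a sketch of the locator approach, matplotlib.ticker.MultipleLocator places a tick at every multiple of a given base. The example reuses the random data from above; the plt.show() call is left commented out so that the snippet also runs without a display.

```python
import numpy as np
import pandas as pd
from matplotlib import pyplot as plt
from matplotlib.ticker import MultipleLocator

a = pd.DataFrame({'y': np.random.randn(100)})

fig, ax = plt.subplots()
ax.plot(a.index, a.y)

# Put a major tick at every multiple of 10 on the x-axis.
ax.xaxis.set_major_locator(MultipleLocator(10))

# plt.show()  # uncomment when running interactively
```

Unlike a fixed list passed to plt.xticks, the locator keeps producing ticks at steps of 10 when the axis limits change.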
I am trying to plot group-wise median values using seaborn's pointplot on top of a swarmplot. Even though I call pointplot second, the point plot ends up behind the swarmplot. How can I change the 'layer order' such that the point plot is in front of the swarmplot?
datDf=pd.DataFrame({'values':np.random.randint(0,100,100)})
datDf['group']=np.random.randint(0,5,100)
sns.swarmplot(data=datDf,x='group',y='values')
sns.pointplot(data=datDf,x='group',y='values',estimator=np.median,join=False)
Use the zorder property to set the drawing order.
import numpy as np
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
datDf=pd.DataFrame({'values':np.random.randint(0,100,100)})
datDf['group']=np.random.randint(0,5,100)
sns.swarmplot(data=datDf,x='group',y='values',zorder=1)
sns.pointplot(data=datDf,x='group',y='values',estimator=np.median,join=False, zorder=100)
plt.show()
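The effect of zorder can also be seen in plain matplotlib, without seaborn. The following is only an illustration of the idea, not of what seaborn does internally: artists with a higher zorder are drawn later and therefore end up on top. The random data and variable names are made up for the example.

```python
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(0)
group = rng.integers(0, 5, 100)
values = rng.integers(0, 100, 100)

fig, ax = plt.subplots()

# Raw points, drawn with a low zorder so they stay in the background.
points = ax.scatter(group, values, zorder=1)

# Group-wise medians, drawn with a high zorder so they appear on top.
medians = [np.median(values[group == g]) for g in range(5)]
(median_markers,) = ax.plot(range(5), medians, 'o', color='k',
                            markersize=12, zorder=100)

# plt.show()  # uncomment when running interactively
```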
I need to make a plot of the following data, with the year_week on x-axis, the test_duration on the y-axis, and each operator as a different series. There may be multiple data points for the same operator in one week. I need to show standard deviation bands around each series.
data = pd.DataFrame({'year_week':[1601,1602,1603,1604,1604,1604],
'operator':['jones','jack','john','jones','jones','jack'],
'test_duration':[10,12,43,7,23,9]})
prints as:
I have looked at seaborn, matplotlib, and pandas, but I cannot find a solution.
It could be that you are looking for seaborn pointplot.
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
data = pd.DataFrame({'year_week':[1601,1602,1603,1604,1604,1604],
'operator':['jones','jack','john','jones','jones','jack'],
'test_duration':[10,12,43,7,23,9]})
sns.pointplot(x="year_week", y="test_duration", hue="operator", data=data)
plt.show()
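Note that pointplot draws confidence intervals by default rather than standard deviations. If explicit standard deviation bands are wanted, one option is to aggregate in pandas and draw the bands with matplotlib's fill_between. A rough sketch, assuming that a week with a single measurement should get a band of zero width (its sample standard deviation is undefined):

```python
import pandas as pd
import matplotlib.pyplot as plt

data = pd.DataFrame({'year_week': [1601, 1602, 1603, 1604, 1604, 1604],
                     'operator': ['jones', 'jack', 'john', 'jones', 'jones', 'jack'],
                     'test_duration': [10, 12, 43, 7, 23, 9]})

# Mean and sample standard deviation per operator and week.
# Groups with one observation have std = NaN; replaced by 0 here (assumption).
band_stats = (data.groupby(['operator', 'year_week'])['test_duration']
                  .agg(['mean', 'std'])
                  .fillna({'std': 0}))

fig, ax = plt.subplots()
for operator, grp in band_stats.groupby(level='operator'):
    weeks = grp.index.get_level_values('year_week')
    ax.plot(weeks, grp['mean'], marker='o', label=operator)
    # Shaded band of one standard deviation around the mean.
    ax.fill_between(weeks, grp['mean'] - grp['std'], grp['mean'] + grp['std'],
                    alpha=0.2)

ax.set_xticks(sorted(data['year_week'].unique()))
ax.set_xlabel('year_week')
ax.set_ylabel('test_duration')
ax.legend()

# plt.show()  # uncomment when running interactively
```

With the sample data only jones in week 1604 has more than one measurement, so that is the only point with a non-zero spread.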
I am having some trouble with a seaborn pointplot.
I want to plot the growth rate against temperature for four kinds of bacteria, so that each type has its own graph, but all four are in the same plot. The problem is that I cannot connect the individual points; I only get the points themselves.
My code:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
from scipy import stats, integrate
import matplotlib.ticker as ticker
import seaborn as sns
dataSorted=data.sort_values(['Temperature','Growth_rate'],ascending=[True,True])
plt.subplots()
ax2=sns.pointplot(x='Temperature',y='Growth_rate', hue='Bacteria' ,data=dataSorted,scale=0.7,join=True)
axes2=ax2.axes
axes2.set_xlim(10,60)
axes2.set_ylim(0,1.5)
axes2.set_xticks(np.arange(1,7)*10)
axes2.set_xticklabels(np.arange(1,7)*10)
The output is exactly as specified, apart from the lines between points:
My plot - without lines
I have no idea how to fix this. I have even set the "join" parameter explicitly, even though it is True by default.