I want to create ridgeline plots of the distribution of a property, Rg, as it changes with temperature. I also have an attribute Z that changes, so I want the distribution of Rg at each condition for both Z1 and Z2. I want the ridgeline plots to be side by side.
This is what I have so far:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import joypy as j
from joypy import joyplot
import seaborn as sns
df_iso = pd.DataFrame(data=d_iso)
df_atac = pd.DataFrame(data=d_atac)
plt.figure()
joyplot(data=df_iso[['temperature', 'Rg']], by='temperature', column='Rg', figsize=(12, 8))
joyplot(data=df_atac[['temperature', 'Rg']], by='temperature', column='Rg', figsize=(12, 8))
plt.title('Ridgeline plot of Rg histograms')
plt.show()
My plots look like this:
I want them to be on the same plot, with different colors and legends for each color.
How can I go about this? Any advice you have would be appreciated.
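One possible approach, sketched with plain matplotlib rather than joypy: draw one density curve per dataset on each temperature row, give each dataset its own color, and label only the first row so the legend has one entry per color. Everything below is an assumption for illustration. The data are made-up stand-ins for df_iso and df_atac, the labels "iso" and "atac" are placeholders, and gaussian_kde_1d is a hand-rolled helper, not joypy's density code.

```python
import numpy as np
import pandas as pd
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; remove this line to get a window
import matplotlib.pyplot as plt


def gaussian_kde_1d(samples, grid):
    """Plain-NumPy Gaussian KDE with a normal-reference rule-of-thumb bandwidth."""
    samples = np.asarray(samples, dtype=float)
    bw = 1.06 * samples.std(ddof=1) * samples.size ** (-1 / 5)
    z = (grid[:, None] - samples[None, :]) / bw
    return np.exp(-0.5 * z ** 2).sum(axis=1) / (samples.size * bw * np.sqrt(2 * np.pi))


rng = np.random.default_rng(0)
temperatures = [300, 320, 340]


def make_fake(shift):
    """Made-up data with the same columns as the question's DataFrames."""
    rg = [rng.normal(10 + shift + 0.02 * (t - 300), 0.5, 200) for t in temperatures]
    return pd.DataFrame({"temperature": np.repeat(temperatures, 200),
                         "Rg": np.concatenate(rg)})


df_iso = make_fake(0.0)
df_atac = make_fake(1.0)

grid = np.linspace(7, 15, 300)
spacing = 1.0  # vertical offset between ridges (arbitrary choice)
fig, ax = plt.subplots(figsize=(12, 8))
for row, t in enumerate(temperatures):
    base = row * spacing
    for df, color, label in ((df_iso, "tab:blue", "iso"), (df_atac, "tab:orange", "atac")):
        dens = gaussian_kde_1d(df.loc[df["temperature"] == t, "Rg"], grid)
        # label only the first row so each color appears once in the legend
        ax.fill_between(grid, base, base + dens, color=color, alpha=0.5,
                        label=label if row == 0 else "_nolegend_")

ax.set_yticks([row * spacing for row in range(len(temperatures))])
ax.set_yticklabels([str(t) for t in temperatures])
ax.set_xlabel("Rg")
ax.set_ylabel("temperature")
ax.set_title("Ridgeline plot of Rg distributions")
ax.legend(title="dataset")
```

Call fig.savefig(...) or plt.show() at the end to see the result. As far as I know, joypy can also overlay several columns per group if you pass a list to column and set legend=True, but that needs the two datasets merged into one DataFrame first.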
I have two DataFrames for two different datasets, each containing the columns RA, Dec, and Vel. I need to plot them on the same scatter plot and show one colorbar instead of two. There's a similar question using pure matplotlib here, but I need to do it using the scatter plot function from pandas. Here's my experiment so far:
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
data1 = pd.DataFrame({'RA':np.random.randint(-100,100,5),
'Dec':np.random.randint(-100,100,5),'Vel':np.random.randint(-20,10,5)})
data2 = pd.DataFrame({'RA':np.random.randint(-100,100,5),
'Dec':np.random.randint(-100,100,5),'Vel':np.random.randint(-10,20,5)})
fig, ax = plt.subplots(figsize=(12, 10))
data1.plot.scatter(x='RA',y='Dec',c='Vel',cmap='rainbow',
marker='^',ax=ax,label='Methanol',vmin=-20, vmax=20)
data2.plot.scatter(x='RA',y='Dec',c='Vel',cmap='rainbow',
marker='o',ax=ax,label='Water',vmin=-20, vmax=20)
ax.set_xlabel(r'$\Delta$RA (arcsec.)')
ax.set_ylabel(r'$\Delta$Dec. (arcsec.)')
ax.set_title('Maser Spot')
ax.invert_xaxis()
ax.legend(loc=2)
Using this code, I managed to plot the two DataFrames in one scatter plot, but it shows two colorbars, as you can see here:
Test Case.
Any help is appreciated.
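You can just pass colorbar=False in the first plot call, so that only the second call draws a colorbar. Both calls already use the same cmap, vmin, and vmax, so the remaining colorbar is valid for both datasets.
The final code will be: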
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
data1 = pd.DataFrame({'RA':np.random.randint(-100,100,5),
'Dec':np.random.randint(-100,100,5),'Vel':np.random.randint(-20,10,5)})
data2 = pd.DataFrame({'RA':np.random.randint(-100,100,5),
'Dec':np.random.randint(-100,100,5),'Vel':np.random.randint(-10,20,5)})
fig, ax = plt.subplots(figsize=(12, 10))
data1.plot.scatter(x='RA',y='Dec',c='Vel',cmap='rainbow',
marker='^',ax=ax,label='Methanol',vmin=-20, vmax=20,
colorbar=False)
data2.plot.scatter(x='RA',y='Dec',c='Vel',cmap='rainbow',
marker='o',ax=ax,label='Water',vmin=-20, vmax=20)
ax.set_xlabel(r'$\Delta$RA (arcsec.)')
ax.set_ylabel(r'$\Delta$Dec. (arcsec.)')
ax.set_title('Maser Spot')
ax.invert_xaxis()
ax.legend(loc=2)
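If you want more control over the single colorbar (its label, for instance), a variation that still uses the pandas scatter function is to switch the automatic colorbar off in both calls and add one yourself with fig.colorbar. This is only a sketch: the random data are made up, and the shared dict is just a convenience name introduced here.

```python
import numpy as np
import pandas as pd
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; remove this line to get a window
import matplotlib.pyplot as plt

rng = np.random.default_rng(0)
# Made-up data with the same columns and ranges as in the question
data1 = pd.DataFrame({'RA': rng.integers(-100, 100, 5),
                      'Dec': rng.integers(-100, 100, 5),
                      'Vel': rng.integers(-20, 10, 5)})
data2 = pd.DataFrame({'RA': rng.integers(-100, 100, 5),
                      'Dec': rng.integers(-100, 100, 5),
                      'Vel': rng.integers(-10, 20, 5)})

fig, ax = plt.subplots(figsize=(12, 10))
# Keyword arguments common to both calls; colorbar=False suppresses the automatic colorbars
shared = dict(x='RA', y='Dec', c='Vel', cmap='rainbow',
              vmin=-20, vmax=20, colorbar=False, ax=ax)
data1.plot.scatter(marker='^', label='Methanol', **shared)
data2.plot.scatter(marker='o', label='Water', **shared)

# Both scatter collections share cmap and limits, so either one can drive the colorbar
cbar = fig.colorbar(ax.collections[0], ax=ax, label='Vel')
ax.invert_xaxis()
ax.legend(loc=2)
```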
I've run multiple regressions and stored the coefficients and standard errors into a data frame like this:
I wanted to make a graph that shows how the coefficient changes for each group over time, like so:
import matplotlib.pyplot as plt
import seaborn as sns
plt.figure(figsize=(14,8))
sns.set(style= "whitegrid")
sns.lineplot(x="time", y="coef",
hue="group",
data=eventstudy)
plt.axhline(y=0 , color='r', linestyle='--')
plt.legend(bbox_to_anchor=(1, 1), loc=2)
# save before showing: once the window from plt.show() is closed the figure is gone
plt.savefig('eventstudygraph.png')
plt.show()
Which produces:
But I would like to include error bars using the 'stderr' data from my main data set.
I think I can do it using 'plt.errorbar', but I can't figure out how to make it work. So far I've tried adding a 'plt.errorbar' line and experimenting with different variations:
import matplotlib.pyplot as plt
import seaborn as sns
plt.figure(figsize=(14,8))
sns.set(style= "whitegrid")
sns.lineplot(x="time", y="coef",
hue="group",
data=eventstudy)
plt.axhline(y=0 , color='r', linestyle='--')
plt.errorbar("time", "coef", xerr="stderr", data=eventstudy)
plt.legend(bbox_to_anchor=(1, 1), loc=2)
# save before showing: once the window from plt.show() is closed the figure is gone
plt.savefig('eventstudygraph.png')
plt.show()
As you can see, it seems to be creating its own group/line in the graph. I think I would know how to use 'plt.errorbar' if I had just one group, but I don't have a clue how to make it work for 3 groups. Is there some way of making 3 versions of 'plt.errorbar' so I can create the error bars for each group separately? Or is there something simpler?
You need to iterate through the groups and plot the error bars for each one separately. What you have above plots all the error bars in one go:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
np.random.seed(111)
df = pd.DataFrame({"time":[1,2,3,4,5]*3,"coef":np.random.uniform(-0.5,0.5,15),
"stderr":np.random.uniform(0.05,0.1,15),
"group":np.repeat(['Monthly','3 Monthly','6 Monthly'],5)})
fig,ax = plt.subplots(figsize=(14,8))
sns.set(style= "whitegrid")
lvls = df.group.unique()
for i in lvls:
    ax.errorbar(x=df[df['group']==i]["time"],
                y=df[df['group']==i]["coef"],
                yerr=df[df['group']==i]["stderr"], label=i)
ax.axhline(y=0 , color='r', linestyle='--')
ax.legend()
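The same loop can be written a bit more compactly with groupby, which hands you each group's rows directly instead of filtering the frame three times per group. A minimal sketch on the same made-up data as above; capsize and marker are cosmetic choices of mine, not something the question asked for.

```python
import numpy as np
import pandas as pd
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; remove this line to get a window
import matplotlib.pyplot as plt

np.random.seed(111)
df = pd.DataFrame({"time": [1, 2, 3, 4, 5] * 3,
                   "coef": np.random.uniform(-0.5, 0.5, 15),
                   "stderr": np.random.uniform(0.05, 0.1, 15),
                   "group": np.repeat(['Monthly', '3 Monthly', '6 Monthly'], 5)})

fig, ax = plt.subplots(figsize=(14, 8))
# sort=False keeps the groups in order of first appearance
for name, grp in df.groupby("group", sort=False):
    # one errorbar call per group: a line through the coefficients plus vertical bars
    ax.errorbar(grp["time"], grp["coef"], yerr=grp["stderr"],
                capsize=3, marker="o", label=name)
ax.axhline(y=0, color="r", linestyle="--")
ax.legend(bbox_to_anchor=(1, 1), loc=2)
```

Note that the error bars go in yerr (vertical), whereas the attempt in the question passed them as xerr.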
The code below takes a DataFrame, filters it by a string in one column, and then plots the values of another column.
I plot the values as a histogram, and that worked fine until I added the mean, median, and standard deviation. Now I just get an empty graph, whereas all of the quantities mentioned below should be plotted together in one graph with their labels.
import pandas as pd
import matplotlib.pyplot as plt
import numpy as np
df = pd.read_csv(r'C:/Users/output.csv', delimiter=";", encoding='unicode_escape')
df['Plot_column'] = df['Plot_column'].str.split(',').str[0]
df['Plot_column'] = df['Plot_column'].astype('int64', copy=False)
X=df[df['goal_colum']=='start running']['Plot_column'].values
dev_x= X
mean_=np.mean(dev_x)
median_=np.median(dev_x)
standard_=np.std(dev_x)
plt.hist(dev_x, bins=5)
plt.plot(mean_, label='Mean')
plt.plot(median_, label='Median')
plt.plot(standard_, label='Std Deviation')
plt.title('Data')
https://matplotlib.org/3.1.1/gallery/statistics/histogram_features.html
There are two main ways to plot in matplotlib: the pyplot interface (the easy way) and the object-oriented Axes interface (the harder way). The Axes interface lets you customize your plot more, so it is worth moving towards it. Try something like the following:
num_bins = 50
fig, ax = plt.subplots()
# the histogram of the data
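n, bins, patches = ax.hist(dev_x, num_bins, density=True)
# a scalar passed to ax.plot draws nothing visible, so mark each statistic with a vertical line
mean_, std_ = np.mean(dev_x), np.std(dev_x)
ax.axvline(mean_, color='r', label='Mean')
ax.axvline(np.median(dev_x), color='g', label='Median')
# the standard deviation is a spread, not a position, so draw it as mean +/- one std
ax.axvline(mean_ - std_, color='k', linestyle='--', label='Mean +/- Std Deviation')
ax.axvline(mean_ + std_, color='k', linestyle='--')
ax.legend()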
# Tweak spacing to prevent clipping of ylabel
fig.tight_layout()
plt.show()
I am creating a joyplot using joypy.
All my data is in the range [0, 1].
But I get a big range of negative values in the graph:
import joypy
import pandas as pd
from matplotlib import pyplot as plt
from matplotlib import cm
import matplotlib.ticker as ticker
import matplotlib
matplotlib.use('TkAgg')
iris = pd.read_csv("1_5.csv")
fig, axes = joypy.joyplot(iris)
x = [0,0.25,0.5,0.75,1]
plt.xticks(x)
plt.show()
It isn't clear that your xticks are tied to the actual joyplot in any way (i.e., you've created arbitrary x-ticks and placed them on the plot).
Are tick marks not shown on the plot originally? Similar plots I've seen all have them by default.
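A possible explanation for the negative range, offered as a guess since the data file isn't shown: joypy draws kernel density estimates by default, and the tails of a KDE extend past the minimum and maximum of the data. The sketch below uses a hand-rolled Gaussian KDE in NumPy to show the effect. The sample data and the bandwidth rule are my assumptions, and this is not joypy's own code.

```python
import numpy as np

rng = np.random.default_rng(0)
data = rng.uniform(0.0, 1.0, 500)  # made-up sample, every value inside [0, 1]

# normal-reference rule-of-thumb bandwidth (an assumption, not necessarily joypy's choice)
bw = 1.06 * data.std(ddof=1) * data.size ** (-1 / 5)


def kde(x):
    """Gaussian kernel density estimate of `data`, evaluated at the points x."""
    z = (np.asarray(x, dtype=float)[:, None] - data[None, :]) / bw
    return np.exp(-0.5 * z ** 2).sum(axis=1) / (data.size * bw * np.sqrt(2 * np.pi))


grid = np.array([-0.2, -0.1, 0.0, 0.5, 1.0, 1.1])
density = kde(grid)
for x, d in zip(grid, density):
    # the estimate is still positive at x < 0 and x > 1, where there is no data
    print(f"x = {x:+.1f}  density = {d:.3f}")
```

If that is what is happening, restricting the evaluated range should help. If I remember the signature correctly, joyplot takes an x_range argument, so something like joypy.joyplot(iris, x_range=[0, 1]) would be worth trying.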
Basically, I'm doing scalability analysis, so I'm working with numbers like 2, 4, 8, 16, 32, etc., and the only way the graphs look reasonable is on a log scale.
But instead of the usual 10^1, 10^2, etc. labelling, I want to have these data points (2, 4, 8, ...) indicated on the axes.
Any ideas?
There's more than one way to do it, depending on how flexible/fancy you want to be.
The simplest way is just to do something like this:
import numpy as np
import matplotlib.pyplot as plt
import matplotlib as mpl
x = np.exp2(np.arange(10))
plt.semilogy(x)
plt.yticks(x, x)
# Turn y-axis minor ticks off
plt.gca().yaxis.set_minor_locator(mpl.ticker.NullLocator())
plt.show()
If you want to do it in a more flexible manner, then perhaps you might use something like this:
import numpy as np
import matplotlib.pyplot as plt
import matplotlib as mpl
x = np.exp2(np.arange(10))
fig = plt.figure()
ax = fig.add_subplot(111)
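# base=2 puts the major ticks at powers of 2 and labels them 2^0, 2^1, ...
# (the base keyword needs matplotlib >= 3.3; older releases spelled it basey=2,
# and the old .base()/.subs() setter calls are not available in recent releases)
ax.semilogy(x, base=2)
# This will place 1 minor tick halfway (in linear space) between major ticks
# (in general, use np.linspace(1, 2.0001, numticks-2))
ax.yaxis.set_minor_locator(mpl.ticker.LogLocator(base=2, subs=[1.5]))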
plt.show()
Or something like this:
import numpy as np
import matplotlib.pyplot as plt
import matplotlib as mpl
x = np.exp2(np.arange(10))
fig = plt.figure()
ax = fig.add_subplot(111)
# base-2 log axis (the base keyword needs matplotlib >= 3.3; older releases used basey=2)
ax.semilogy(x, base=2)
ax.yaxis.set_minor_locator(mpl.ticker.LogLocator(base=2, subs=[1.5]))
# This is the only difference from the last snippet, uses "regular" numbers.
ax.yaxis.set_major_formatter(mpl.ticker.ScalarFormatter())
plt.show()
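For a scalability plot specifically, the values 2, 4, 8, ... usually sit on the x-axis, and the same ideas carry over: set a base-2 log scale, pin the major ticks to the data points, and switch to a plain-number formatter. A minimal sketch, assuming matplotlib >= 3.3 for the base keyword; the worker counts and speedups are made-up numbers.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; remove this line to get a window
import matplotlib.pyplot as plt
from matplotlib.ticker import NullLocator, ScalarFormatter

# Hypothetical scalability measurements (made-up numbers)
workers = [2, 4, 8, 16, 32, 64]
speedup = [1.9, 3.6, 6.8, 12.1, 20.5, 31.0]

fig, ax = plt.subplots()
ax.plot(workers, speedup, marker="o")
# Set the scale first: changing the scale installs its own locators and formatters
ax.set_xscale("log", base=2)
# Then pin the major ticks to the data points and label them with plain numbers
ax.set_xticks(workers)
ax.xaxis.set_major_formatter(ScalarFormatter())
# Turn x-axis minor ticks off
ax.xaxis.set_minor_locator(NullLocator())
ax.set_xlabel("workers")
ax.set_ylabel("speedup")
```

The order matters here. If set_xscale were called after set_xticks, the scale change would replace the fixed ticks and the formatter.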