I have two plots overlaid on each other generated by the following code:
import matplotlib.pyplot as plt
import pandas as pd
width=.5
t=pd.DataFrame({'bars':[3.4,3.1,5.1,5.1,3.8,4.2,5.2,4.0,3.6],'lines':[2.4,2.2,2.4,2.1,2.0,2.1,1.9,1.8,1.9]})
t['bars'].plot(kind='bar',width=width)
t['lines'].plot(secondary_y=True, color='red')
ax=plt.gca()
plt.xlim([-width,len(t['bars'])-width])
ax.set_xticklabels(('1','2','3','4','5','6','7','8','9'))
plt.show()
I want to be able to scale the range of the second y axis to go from 0.0 to 2.5 (instead of 1.8 to 2.4) in steps of .5. How can I define this without changing the bar chart at all?
Pandas returns the Axes object it draws on when you call the plot function. Save both returned Axes and set the limits on the secondary one using the object-oriented interface; the bar chart's own y-axis is left untouched.
import matplotlib.pyplot as plt
import pandas as pd
width=.5
t=pd.DataFrame({'bars':[3.4,3.1,5.1,5.1,3.8,4.2,5.2,4.0,3.6],'lines':[2.4,2.2,2.4,2.1,2.0,2.1,1.9,1.8,1.9]})
ax1 = t['bars'].plot(kind='bar',width=width)
ax2 = t['lines'].plot(secondary_y=True, color='red')
ax2.set_ylim(0, 2.5)
ax1.set_xlim([-width,len(t['bars'])-width])
ax1.set_xticklabels(('1','2','3','4','5','6','7','8','9'))
plt.show()
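The question also asks for steps of .5, which set_ylim alone does not guarantee. A minimal sketch, assuming the same data as above and using np.arange to build the tick positions (the Agg backend line is only there so the snippet runs without a display):

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless run; drop this line for interactive use
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

width = 0.5
t = pd.DataFrame({'bars': [3.4, 3.1, 5.1, 5.1, 3.8, 4.2, 5.2, 4.0, 3.6],
                  'lines': [2.4, 2.2, 2.4, 2.1, 2.0, 2.1, 1.9, 1.8, 1.9]})

ax1 = t['bars'].plot(kind='bar', width=width)
ax2 = t['lines'].plot(secondary_y=True, color='red')

# Remember the bar axis limits so we can confirm they are not affected
bars_before = ax1.get_ylim()

# Range 0.0 to 2.5 on the secondary axis, with a tick every 0.5
ax2.set_ylim(0, 2.5)
ax2.set_yticks(np.arange(0, 3.0, 0.5))

bars_after = ax1.get_ylim()
```

Only ax2 is modified here, so the left-hand axis keeps whatever limits the bar plot chose.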
I have two DataFrames for two different datasets, each containing the columns RA, Dec, and Vel. I need to plot them on the same scatter plot and show one colorbar instead of two. There's a similar question using pure matplotlib here, but I need to do it using the scatter plot function from pandas. Here's my experiment so far:
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
data1 = pd.DataFrame({'RA':np.random.randint(-100,100,5),
'Dec':np.random.randint(-100,100,5),'Vel':np.random.randint(-20,10,5)})
data2 = pd.DataFrame({'RA':np.random.randint(-100,100,5),
'Dec':np.random.randint(-100,100,5),'Vel':np.random.randint(-10,20,5)})
fig, ax = plt.subplots(figsize=(12, 10))
data1.plot.scatter(x='RA',y='Dec',c='Vel',cmap='rainbow',
marker='^',ax=ax,label='Methanol',vmin=-20, vmax=20)
data2.plot.scatter(x='RA',y='Dec',c='Vel',cmap='rainbow',
marker='o',ax=ax,label='Water',vmin=-20, vmax=20)
ax.set_xlabel(r'$\Delta$RA (arcsec.)')  # raw string avoids the invalid '\D' escape
ax.set_ylabel(r'$\Delta$Dec. (arcsec.)')
ax.set_title('Maser Spot')
ax.invert_xaxis()
ax.legend(loc=2)
Using this code, I managed to plot both DataFrames on one scatter plot, but it shows two colorbars.
Any help is appreciated.
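You can just pass colorbar=False to the first plot call. Because both plots use the same cmap, vmin, and vmax, the one remaining colorbar is valid for both datasets.
The final code is: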
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
data1 = pd.DataFrame({'RA':np.random.randint(-100,100,5),
'Dec':np.random.randint(-100,100,5),'Vel':np.random.randint(-20,10,5)})
data2 = pd.DataFrame({'RA':np.random.randint(-100,100,5),
'Dec':np.random.randint(-100,100,5),'Vel':np.random.randint(-10,20,5)})
fig, ax = plt.subplots(figsize=(12, 10))
data1.plot.scatter(x='RA',y='Dec',c='Vel',cmap='rainbow',
marker='^',ax=ax,label='Methanol',vmin=-20, vmax=20,
colorbar=False)
data2.plot.scatter(x='RA',y='Dec',c='Vel',cmap='rainbow',
marker='o',ax=ax,label='Water',vmin=-20, vmax=20)
ax.set_xlabel(r'$\Delta$RA (arcsec.)')  # raw string avoids the invalid '\D' escape
ax.set_ylabel(r'$\Delta$Dec. (arcsec.)')
ax.set_title('Maser Spot')
ax.invert_xaxis()
ax.legend(loc=2)
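An alternative, sketched here under the assumption that you would rather control the colorbar yourself: switch off the pandas colorbar on both calls and add one with fig.colorbar. The seeded random generator and the `shared` dictionary are illustrative choices, not part of the original code.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless run; drop this line for interactive use
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)  # seeded only to make the sketch repeatable
data1 = pd.DataFrame({'RA': rng.integers(-100, 100, 5),
                      'Dec': rng.integers(-100, 100, 5),
                      'Vel': rng.integers(-20, 10, 5)})
data2 = pd.DataFrame({'RA': rng.integers(-100, 100, 5),
                      'Dec': rng.integers(-100, 100, 5),
                      'Vel': rng.integers(-10, 20, 5)})

fig, ax = plt.subplots(figsize=(12, 10))

# Same colormap and colour limits for both datasets, no automatic colorbar
shared = dict(x='RA', y='Dec', c='Vel', cmap='rainbow',
              vmin=-20, vmax=20, colorbar=False, ax=ax)
data1.plot.scatter(marker='^', label='Methanol', **shared)
data2.plot.scatter(marker='o', label='Water', **shared)

# Either scatter collection can drive the colorbar, since both share limits
mappable = ax.collections[0]
cbar = fig.colorbar(mappable, ax=ax, label='Vel')
ax.legend(loc=2)
```

The figure then holds exactly two Axes: the scatter plot and the single colorbar.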
I cannot work out how to change the scale of the y-axis. My code is:
grid = sns.catplot(x='Nationality', y='count',
row='Age', col='Gender',
hue='Type',
data=dfNorthumbria2, kind='bar', ci='No')
I want the y-axis ticks to go up in whole numbers rather than in steps of .5.
Update
I just found this tutorial; probably the easiest solution is the following:
grid.set(yticks=list(range(5)))
From the help of grid.set
Help on method set in module seaborn.axisgrid:
set(**kwargs) method of seaborn.axisgrid.FacetGrid instance
Set attributes on each subplot Axes.
Since seaborn is built on top of matplotlib, you can also use yticks from plt:
import matplotlib.pyplot as plt
plt.yticks(range(5))
However, this changed only the y ticks of one row (the upper one) in my mockup example, because plt.yticks acts on a single Axes.
For this reason you probably want to set the y ticks on each Axes with ax.set_yticks(). grid.axes is an array with one row of Axes per facet row, so you can use a list comprehension as follows:
[ax[0].set_yticks(range(0,150,5) )for ax in grid.axes]
A fully reproducible example would look like this (adapted from here):
import seaborn as sns
import matplotlib.pyplot as plt
sns.set(style="ticks")
exercise = sns.load_dataset("exercise")
grid = sns.catplot(x="time", y="pulse", hue="kind",
row="diet", data=exercise)
# plt.yticks(range(0,150,5)) # Changed only one y-axis
# Changed y-ticks to steps of 20
[ax[0].set_yticks(range(0,150,20) )for ax in grid.axes]
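If the goal is simply whole-number ticks on every facet, a tick locator avoids hard-coding the range. The sketch below is plain matplotlib: the 2x2 array from plt.subplots stands in for grid.axes (an assumption made so the example runs without seaborn or a dataset download), and the bar heights are made up.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless run; drop this line for interactive use
import matplotlib.pyplot as plt
from matplotlib.ticker import MultipleLocator

# Stand-in for grid.axes: a 2D array of Axes, one per row/column facet
fig, axes = plt.subplots(nrows=2, ncols=2, squeeze=False)

for ax in axes.flat:                 # .flat visits every Axes, not just column 0
    ax.bar([0, 1], [1.5, 3.0])       # made-up bar heights
    ax.yaxis.set_major_locator(MultipleLocator(1))  # one tick per whole number

ticks = axes[1, 1].get_yticks()
```

With a real FacetGrid the same loop would run over grid.axes.flat, which also covers grids that have more than one column.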
The code below takes a DataFrame, filters it by a string in one column, and then plots the values of another column.
I plot the values as a histogram, and that worked fine until I added the mean, median, and standard deviation. Now I just get an empty graph, whereas all of the variables mentioned below should be plotted together in one graph with their labels.
import pandas as pd
import matplotlib.pyplot as plt
import numpy as np
df = pd.read_csv(r'C:/Users/output.csv', delimiter=";", encoding='unicode_escape')
df['Plot_column'] = df['Plot_column'].str.split(',').str[0]
df['Plot_column'] = df['Plot_column'].astype('int64', copy=False)
X=df[df['goal_colum']=='start running']['Plot_column'].values
dev_x= X
mean_=np.mean(dev_x)
median_=np.median(dev_x)
standard_=np.std(dev_x)
plt.hist(dev_x, bins=5)
plt.plot(mean_, label='Mean')
plt.plot(median_, label='Median')
plt.plot(standard_, label='Std Deviation')
plt.title('Data')
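See the matplotlib histogram example: https://matplotlib.org/3.1.1/gallery/statistics/histogram_features.html
There are two main ways to plot in matplotlib: the pyplot interface (the easy way) and the Axes interface (the hard way). The Axes interface lets you customize your plot more, and it is worth moving towards it. Note also that passing a single number to plot() draws one point with no visible line, so the statistics are better marked with vertical lines. Try something like the following: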
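num_bins = 50
fig, ax = plt.subplots()
# the histogram of the data
n, bins, patches = ax.hist(dev_x, num_bins, density=True)
# mark the mean and median with vertical lines at their x positions
ax.axvline(np.mean(dev_x), color='red', label='Mean')
ax.axvline(np.median(dev_x), color='green', label='Median')
# the standard deviation is a spread, not a position: shade mean +/- one std
ax.axvspan(np.mean(dev_x) - np.std(dev_x), np.mean(dev_x) + np.std(dev_x),
           alpha=0.2, color='gray', label='Mean +/- Std Deviation')
ax.set_title('Data')
ax.legend()
# Tweak spacing to prevent clipping of ylabel
fig.tight_layout()
plt.show()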
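One likely reason the statistics never showed up in the question's version: a scalar handed to plot() becomes a single point at x = 0 with no marker, so nothing visible is drawn. A small sketch with made-up data that compares this with axvline:

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless run; drop this line for interactive use
import matplotlib.pyplot as plt
import numpy as np

dev_x = np.array([1, 2, 3, 4, 10])  # made-up sample, mean is 4.0

fig, ax = plt.subplots()
ax.hist(dev_x, bins=5)

# Scalar input: one data point at x=0, y=mean, which a markerless line cannot show
(point,) = ax.plot(np.mean(dev_x))
point_xy = (list(point.get_xdata()), list(point.get_ydata()))

# axvline: a vertical line at x=mean spanning the full height of the Axes
mean_line = ax.axvline(np.mean(dev_x), color='red', label='Mean')
line_x = list(mean_line.get_xdata())

ax.legend()
```

Here point_xy holds a single coordinate pair, while line_x holds the mean twice, once for each end of the vertical line.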
I am plotting with pandas plot() functions as follows:
In:
from matplotlib.pyplot import *
from datetime import date
import matplotlib.dates as mdates
import matplotlib.pyplot as plt
fig, ax = subplots()
df['session_duration_seconds'].sort_index().value_counts().plot(figsize=(25,10), fontsize=24)
ax.legend(['session_duration_seconds'],fontsize=22)
ax.set_xlabel("Title", fontsize=22)
ax.set_ylabel("Title", fontsize=22)
ax.grid()
However, my plot looks very zoomed out. I would like to expand the plot to show the following section of the figure in more detail:
Out:
Thus, my question is: how can I zoom in on that portion of the image with pandas plot?
Just an example to show how this could work:
df = pd.DataFrame({'Values': [1000, 1, 2, 3 , 4 , 2, 5]})
df.plot()
Now let's restrict the y-range:
import matplotlib.pyplot as plt
df.plot()
plt.ylim(0, 10)
and we see the details of the curve.
Note that the curve is so steep near 0 due to the huge slope induced by the first y-value of 1000.
You can also scale the y-axis directly from within the pandas plot functions, for example with a logarithmic axis:
df.plot(logy=True)
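The same zoom can be done without pyplot state, either through the ylim argument of the pandas plot call or on the Axes it returns. A minimal sketch using the example data above; the x and y windows chosen for ax_zoom are arbitrary illustrative values:

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless run; drop this line for interactive use
import matplotlib.pyplot as plt
import pandas as pd

df = pd.DataFrame({'Values': [1000, 1, 2, 3, 4, 2, 5]})

# Option 1: pass the y-range straight to pandas
ax = df.plot(ylim=(0, 10))

# Option 2: plot first, then zoom on the returned Axes (both directions)
ax_zoom = df.plot()
ax_zoom.set_xlim(1, 6)   # skip the outlier at index 0
ax_zoom.set_ylim(0, 6)
```

Working on the returned Axes is the safer choice when several figures are open, since plt.ylim always acts on the current one.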
Using a complicated script that nests, among other things, pandas.DataFrame.plot() and GridSpec in a subplot setting, I have the following problem:
When I create a 2-col, 1-row gridspec, the tick labels are all correct. When I create a 1-col, 2-row gridspec, however, as soon as I plot onto the first (upper row) axes using pandas.DataFrame.plot(), the x tick labels for the top row disappear (the ticks remain).
The top ticks do not change when I draw something on the lower axes, so sharex does not appear to be the issue.
However, my x-labels are still stored:
axes[0].get_xaxis().get_ticklabels()
Out[59]:
<a list of 9 Text major ticklabel objects>
It's just that they're not displayed. I suspected a NullFormatter, but that's not the case either:
axes[0].get_xaxis().get_major_formatter()
Out[57]:
<matplotlib.ticker.ScalarFormatter at 0x7f7414330710>
I get both ticks and labels on the top of the first axes when I do
axes[0].get_xaxis().tick_top()
However, when I then go back to tick_bottom(), I only have ticks on bottom, not the labels.
What can cause my stored labels not to be displayed despite a "normal" formatter?
Here's a simple example:
import pandas as pd
import matplotlib.pyplot as plt
import numpy as np
from matplotlib import gridspec
df = pd.DataFrame(np.random.rand(100,2), columns=['A', 'B'])
figure = plt.figure()
GridSpec = gridspec.GridSpec(nrows=2, ncols=1)
[plt.subplot(gsSpec) for gsSpec in GridSpec]
axes = figure.axes
df.plot(secondary_y=['B'], ax=axes[0], sharex=False)
It's the secondary_y=['B'] that causes the x tick labels to disappear. I'm not sure why it does that.
Fortunately, you can use plt.setp(ax.get_xticklabels(), visible=True) to turn them back on manually:
import pandas as pd
import matplotlib.pyplot as plt
import numpy as np
from matplotlib import gridspec
df = pd.DataFrame(np.random.rand(100,2), columns=['A', 'B'])
figure = plt.figure()
GridSpec = gridspec.GridSpec(nrows=2, ncols=1)
axes = [plt.subplot(gsSpec) for gsSpec in GridSpec]
ax = axes[0]
df.plot(secondary_y=['B'], ax=ax, sharex=True)
plt.setp(ax.get_xticklabels(), visible=True)