I have a pandas dataframe which I would like to slice, and plot each slice in a separate subplot. I would like to use the sharey='all' and have matplotlib decide on some reasonable y-axis limits, rather than having to search the dataframe for the min and max and add offsets.
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
df = pd.DataFrame(np.arange(50).reshape((5,10))).transpose()
fig, axes = plt.subplots(nrows=0,ncols=0, sharey='all', tight_layout=True)
for i in range(1, len(df.columns) + 1):
ax = fig.add_subplot(2,3,i)
iC = df.iloc[:, i-1]
iC.plot(ax=ax)
Which gives the following plot:
In fact, it gives that irrespective of what I specify sharey to be ('all','col','row',True, or False). What I sought after using sharey='all' would be something like:
Can somebody perhaps explain me what I'm doing wrong here?
The following version would only add those axes you need for your df-columns and share their y-scales:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
df = pd.DataFrame(np.arange(50).reshape((5,10))).transpose()
fig = plt.figure(tight_layout=True)
ref_ax = None
for i in range(len(df.columns)):
ax = fig.add_subplot(2, 3, i+1, sharey=ref_ax)
ref_ax=ax
iC = df.iloc[:, i]
iC.plot(ax=ax)
plt.show()
The grid-layout Parameters, which are explicitly given as ...add_subplot(2, 3, ... here can of course be calculated with respect to len(df.columns).
Your plots are not shared. You create a subplot grid with 0 rows and 0 columns, i.e. no subplots at all, but those nonexisting subplots have their y axes shared. Then you create some other (existing) subplots, which are not shared. Those are the ones that are plotted to.
Instead you need to set nrows and ncols to some useful values and plot to those hence created axes.
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
df = pd.DataFrame(np.arange(50).reshape((5,10))).transpose()
fig, axes = plt.subplots(nrows=2,ncols=3, sharey='all', tight_layout=True)
for i, ax in zip(range(len(df.columns)), axes.flat):
iC = df.iloc[:, i]
iC.plot(ax=ax)
for j in range(len(df.columns),len(axes.flat)):
axes.flatten()[j].axis("off")
plt.show()
Related
I'm going insane here ... this should be a simple exercise but I'm stuck:
I have a Jupyter notebook and am using the ruptures Python package. All I want to do is, take the figure or AxesSubplot(s) that the display() function returns and add it to a figure of my own, so I can share the x-axis, have a single image, etc.:
import pandas as pd
import matplotlib.pyplot as plt
myfigure = plt.figure()
l = len(df.columns)
for index, series in enumerate(df):
data = series.to_numpy().astype(int)
algo = rpt.KernelCPD(kernel='rbf', min_size=4).fit(data)
result = algo.predict(pen=3)
myfigure.add_subplot(l, 1, index+1)
rpt.display(data, result)
plt.title(series.name)
plt.show()
What I get is a figure with the desired number of subplots (all empty) and n separate figures from ruptures:
When instead I want want the subplots to be filled with the figures ...
I basically had to recreate the plot that ruptures.display(data,result) produces, to get my desired figure:
import pandas as pd
import numpy as np
import ruptures as rpt
import matplotlib.pyplot as plt
from matplotlib.ticker import EngFormatter
fig, axs = plt.subplots(len(df.columns), figsize=(22,20), dpi=300)
for index, series in enumerate(df):
resampled = df[series].dropna().resample('6H').mean().pad()
data = resampled.to_numpy().astype(int)
algo = rpt.KernelCPD(kernel='rbf', min_size=4).fit(data)
result = algo.predict(pen=3)
# Create ndarray of tuples from the result
result = np.insert(result, 0, 0) # Insert 0 as first result
tuples = np.array([ result[i:i+2] for i in range(len(result)-1) ])
ax = axs[index]
# Fill area beween results alternating blue/red
for i, tup in enumerate(tuples):
if i%2==0:
ax.axvspan(tup[0], tup[1], lw=0, alpha=.25)
else:
ax.axvspan(tup[0], tup[1], lw=0, alpha=.25, color='red')
ax.plot(data)
ax.set_title(series)
ax.yaxis.set_major_formatter(EngFormatter())
plt.subplots_adjust(hspace=.3)
plt.show()
I've wasted more time on this than I can justify, but it's pretty now and I can sleep well tonight :D
I am trying to create a grid of subplots. each subplot will look like the one that is on this site.
https://python-graph-gallery.com/24-histogram-with-a-boxplot-on-top-seaborn/
If I have 10 different sets of this style of plot I want to make them into a 5x2 for example.
I have read through the documentation of Matplotlib and cannot seem to figure out how do it. I can loop the subplots and have each output but I cannot make it into the rows and columns
import pandas as pd
import numpy as np
import seaborn as sns
df = pd.DataFrame(np.random.randint(0,100,size=(100, 10)),columns=list('ABCDEFGHIJ'))
for c in df :
# Cut the window in 2 parts
f, (ax_box,
ax_hist) = plt.subplots(2,
sharex=True,
gridspec_kw={"height_ratios":(.15, .85)},
figsize = (10, 10))
# Add a graph in each part
sns.boxplot(df[c], ax=ax_box)
ax_hist.hist(df[c])
# Remove x axis name for the boxplot
plt.show()
the results would just take this loop and put them into a set of rows and columns in this case 5x2
You have 10 columns, each of which creates 2 subplots: a box plot and a histogram. So you need a total of 20 figures. You can do this by creating a grid of 2 rows and 10 columns
Complete answer: (Adjust the figsize and height_ratios as per taste)
import pandas as pd
import seaborn as sns
import numpy as np
import matplotlib.pyplot as plt
f, axes = plt.subplots(2, 10, sharex=True, gridspec_kw={"height_ratios":(.35, .35)},
figsize = (12, 5))
df = pd.DataFrame(np.random.randint(0,100,size=(100, 10)),columns=list('ABCDEFGHIJ'))
for i, c in enumerate(df):
sns.boxplot(df[c], ax=axes[0,i])
axes[1,i].hist(df[c])
plt.tight_layout()
plt.show()
I am trying to write a loop that will make a figure with 25 subplots, 1 for each country. My code makes a figure with 25 subplots, but the plots are empty. What can I change to make the data appear in the graphs?
fig = plt.figure()
for c,num in zip(countries, xrange(1,26)):
df0=df[df['Country']==c]
ax = fig.add_subplot(5,5,num)
ax.plot(x=df0['Date'], y=df0[['y1','y2','y3','y4']], title=c)
fig.show()
You got confused between the matplotlib plotting function and the pandas plotting wrapper.
The problem you have is that ax.plot does not have any x or y argument.
Use ax.plot
In that case, call it like ax.plot(df0['Date'], df0[['y1','y2']]), without x, y and title. Possibly set the title separately.
Example:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
countries = np.random.choice(list("ABCDE"),size=25)
df = pd.DataFrame({"Date" : range(200),
'Country' : np.repeat(countries,8),
'y1' : np.random.rand(200),
'y2' : np.random.rand(200)})
fig = plt.figure()
for c,num in zip(countries, xrange(1,26)):
df0=df[df['Country']==c]
ax = fig.add_subplot(5,5,num)
ax.plot(df0['Date'], df0[['y1','y2']])
ax.set_title(c)
plt.tight_layout()
plt.show()
Use the pandas plotting wrapper
In this case plot your data via df0.plot(x="Date",y =['y1','y2']).
Example:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
countries = np.random.choice(list("ABCDE"),size=25)
df = pd.DataFrame({"Date" : range(200),
'Country' : np.repeat(countries,8),
'y1' : np.random.rand(200),
'y2' : np.random.rand(200)})
fig = plt.figure()
for c,num in zip(countries, xrange(1,26)):
df0=df[df['Country']==c]
ax = fig.add_subplot(5,5,num)
df0.plot(x="Date",y =['y1','y2'], title=c, ax=ax, legend=False)
plt.tight_layout()
plt.show()
I don't remember that well how to use original subplot system but you seem to be rewriting the plot. In any case you should take a look at gridspec. Check the following example:
import matplotlib.pyplot as plt
import matplotlib.gridspec as gridspec
fig = plt.figure()
gs1 = gridspec.GridSpec(5, 5)
countries = ["Country " + str(i) for i in range(1, 26)]
axs = []
for c, num in zip(countries, range(1,26)):
axs.append(fig.add_subplot(gs1[num - 1]))
axs[-1].plot([1, 2, 3], [1, 2, 3])
plt.show()
Which results in this:
Just replace the example with your data and it should work fine.
NOTE: I've noticed you are using xrange. I've used range because my version of Python is 3.x. Adapt to your version.
I have a dataframe as shown below:
PNO,SID,SIZE,N3IC,S4IC,N4TC,KPAC,NAAC,ECTC
SJV026,VIDAIC,FINE,0.0926,0.0446,0.0333,0.0185,0.005,0.0516
SJV028,CHCRUC,FINE,0,0.1472,0.0076,0.0001,0.0025,0.0301
SJV051,AMSUL,FINE,0,0.727,0.273,0,0,0
SJV035,MOVES1,FINE,0.02,0.04092,0,0,0,0.45404
I am looking to plot this using matplotlib or seaborn where there will be 'n' number of subplots for each row of data (that is one bar plot for each row of data).
import matplotlib.pyplot as plt
import pandas as pd
from tkinter.filedialog import askopenfilename
Inp_Filename = askopenfilename()
df = pd.read_csv(Inp_Filename)
rows, columns = df.shape
fig, axes = plt.subplots(rows, 1, figsize=(15, 20))
count = 0
for each in df.iterrows():
row = df.iloc[count,3:]
row.plot(kind='bar')
count = count + 1
plt.show()
The above code output is not what I am looking for. Is there a way to plot each row of the data in the 'fig' and 'axes' above?
In principle the approach is correct. There are just a couple of errors in your code, which, when corrected, give the desired result.
import io
import matplotlib.pyplot as plt
import pandas as pd
u = u"""PNO,SID,SIZE,N3IC,S4IC,N4TC,KPAC,NAAC,ECTC
SJV026,VIDAIC,FINE,0.0926,0.0446,0.0333,0.0185,0.005,0.0516
SJV028,CHCRUC,FINE,0,0.1472,0.0076,0.0001,0.0025,0.0301
SJV051,AMSUL,FINE,0,0.727,0.273,0,0,0
SJV035,MOVES1,FINE,0.02,0.04092,0,0,0,0.45404"""
df = pd.read_csv(io.StringIO(u))
rows, columns = df.shape
fig, axes = plt.subplots(rows, 1, sharex=True, sharey=True)
for i, r in df.iterrows():
row = df.iloc[i,3:]
row.plot(kind='bar', ax=axes[i])
plt.show()
i am try to plot subplot in matplotlib with pandas but there are issue i am facing. when i am plot subplot not show the date of stock...there is my program
import pandas as pd
import datetime
import matplotlib.pyplot as plt
import pandas.io.data
df = pd.io.data.get_data_yahoo('goog', start=datetime.datetime(2008,1,1),end=datetime.datetime(2014,10,23))
fig = plt.figure()
r = fig.patch
r.set_facecolor('#0070BB')
ax1 = fig.add_subplot(2,1,1, axisbg='#0070BB')
ax1.grid(True)
ax1.plot(df['Close'])
ax2 = fig.add_subplot(2,1,2, axisbg='#0070BB')
ax2.plot(df['Volume'])
plt.show()
run this program own your self and solve date issue.....
When you're calling matplotlib's plot(), you are only giving it one array (e.g. df['Close'] in the first case). When there's only one array, matplotlib doesn't know what to use for the x axis data, so it just uses the index of the array. This is why your x axis shows the numbers 0 to 160: there are presumably 160 items in your array.
Use ax1.plot(df.index, df['Close']) instead, since df.index should hold the date values in your pandas dataframe.
import pandas as pd
import datetime
import matplotlib.pyplot as plt
import pandas.io.data
df = pd.io.data.get_data_yahoo('goog', start=datetime.datetime(2008,1,1),end=datetime.datetime(2014,10,23))
fig = plt.figure()
r = fig.patch
r.set_facecolor('#0070BB')
ax1 = fig.add_subplot(2,1,1, axisbg='#0070BB')
ax1.grid(True)
ax1.plot(df.index, df['Close'])
ax2 = fig.add_subplot(2,1,2, axisbg='#0070BB')
ax2.plot(df.index, df['Volume'])
plt.show()