matplotlib loop make subplot for each category - python

I am trying to write a loop that will make a figure with 25 subplots, 1 for each country. My code makes a figure with 25 subplots, but the plots are empty. What can I change to make the data appear in the graphs?
fig = plt.figure()
for c,num in zip(countries, xrange(1,26)):
    df0=df[df['Country']==c]
    ax = fig.add_subplot(5,5,num)
    ax.plot(x=df0['Date'], y=df0[['y1','y2','y3','y4']], title=c)
fig.show()

You are mixing up the matplotlib plotting function and the pandas plotting wrapper.
The problem is that ax.plot does not take x or y keyword arguments; the data must be passed positionally.
Use ax.plot
In that case, call it like ax.plot(df0['Date'], df0[['y1','y2']]), without x, y and title. Possibly set the title separately.
Example:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
countries = np.random.choice(list("ABCDE"),size=25)
df = pd.DataFrame({"Date" : range(200),
                   'Country' : np.repeat(countries,8),
                   'y1' : np.random.rand(200),
                   'y2' : np.random.rand(200)})
fig = plt.figure()
# xrange is Python 2 only; range works in both Python 2 and 3
for c,num in zip(countries, range(1,26)):
    df0=df[df['Country']==c]
    ax = fig.add_subplot(5,5,num)
    ax.plot(df0['Date'], df0[['y1','y2']])
    ax.set_title(c)
plt.tight_layout()
plt.show()
Use the pandas plotting wrapper
In this case plot your data via df0.plot(x="Date",y =['y1','y2']).
Example:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
countries = np.random.choice(list("ABCDE"),size=25)
df = pd.DataFrame({"Date" : range(200),
                   'Country' : np.repeat(countries,8),
                   'y1' : np.random.rand(200),
                   'y2' : np.random.rand(200)})
fig = plt.figure()
# xrange is Python 2 only; range works in both Python 2 and 3
for c,num in zip(countries, range(1,26)):
    df0=df[df['Country']==c]
    ax = fig.add_subplot(5,5,num)
    df0.plot(x="Date",y =['y1','y2'], title=c, ax=ax, legend=False)
plt.tight_layout()
plt.show()
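If the goal is one subplot per distinct category, a variant of the same idea is to create the whole grid up front with plt.subplots and loop over df.groupby. The following is only a sketch: the toy data, the helper name plot_by_category, and the 3-column layout are assumptions for illustration, not part of the question.

```python
import math

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt


def plot_by_category(df, category, x, ys, ncols=3):
    """Draw one subplot per distinct value of df[category] (hypothetical helper)."""
    groups = list(df.groupby(category))
    nrows = math.ceil(len(groups) / ncols)
    # squeeze=False guarantees a 2D array of axes, even for a single row
    fig, axes = plt.subplots(nrows, ncols, squeeze=False)
    flat_axes = axes.ravel()
    for ax, (name, group) in zip(flat_axes, groups):
        group.plot(x=x, y=ys, ax=ax, legend=False, title=str(name))
    # hide grid cells that have no category assigned
    for ax in flat_axes[len(groups):]:
        ax.axis("off")
    fig.tight_layout()
    return fig, axes


# toy data (assumed): 5 countries with 40 rows each
rng = np.random.default_rng(0)
df = pd.DataFrame({
    "Date": np.tile(np.arange(40), 5),
    "Country": np.repeat(list("ABCDE"), 40),
    "y1": rng.random(200),
    "y2": rng.random(200),
})
fig, axes = plot_by_category(df, "Country", "Date", ["y1", "y2"])
```

Calling plt.show() afterwards displays the figure. Grouping avoids filtering the frame once per category and yields each category exactly once, even if it appears many times in the data.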

I don't remember the original subplot system well, but you seem to be overwriting the plot. In any case, take a look at gridspec. See the following example:
import matplotlib.pyplot as plt
import matplotlib.gridspec as gridspec
fig = plt.figure()
gs1 = gridspec.GridSpec(5, 5)
countries = ["Country " + str(i) for i in range(1, 26)]
axs = []
for c, num in zip(countries, range(1,26)):
    axs.append(fig.add_subplot(gs1[num - 1]))
    axs[-1].plot([1, 2, 3], [1, 2, 3])
plt.show()
This results in a 5x5 grid of subplots, each showing the same line.
Just replace the example data with your own and it should work fine.
NOTE: I've noticed you are using xrange. I've used range because my version of Python is 3.x. Adapt to your version.

Related

sharey='all' argument in plt.subplots() not passed to df.plot()?

I have a pandas dataframe which I would like to slice, and plot each slice in a separate subplot. I would like to use the sharey='all' and have matplotlib decide on some reasonable y-axis limits, rather than having to search the dataframe for the min and max and add offsets.
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
df = pd.DataFrame(np.arange(50).reshape((5,10))).transpose()
fig, axes = plt.subplots(nrows=0,ncols=0, sharey='all', tight_layout=True)
for i in range(1, len(df.columns) + 1):
    ax = fig.add_subplot(2,3,i)
    iC = df.iloc[:, i-1]
    iC.plot(ax=ax)
Which gives the following plot:
In fact, it gives that plot irrespective of what I set sharey to ('all', 'col', 'row', True, or False). What I was after with sharey='all' is something like:
Can somebody explain what I'm doing wrong here?
The following version would only add those axes you need for your df-columns and share their y-scales:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
df = pd.DataFrame(np.arange(50).reshape((5,10))).transpose()
fig = plt.figure(tight_layout=True)
ref_ax = None
for i in range(len(df.columns)):
    ax = fig.add_subplot(2, 3, i+1, sharey=ref_ax)
    ref_ax=ax
    iC = df.iloc[:, i]
    iC.plot(ax=ax)
plt.show()
The grid layout parameters, given explicitly here as add_subplot(2, 3, ...), can of course be calculated from len(df.columns).
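As a rough sketch of that calculation, the number of rows can be derived from the column count and a chosen number of grid columns. The helper name grid_shape and its default of 3 columns are assumptions for illustration.

```python
import math

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt


def grid_shape(n_plots, ncols=3):
    """Return (nrows, ncols) large enough for n_plots subplots (hypothetical helper)."""
    ncols = max(1, min(ncols, n_plots))
    nrows = math.ceil(n_plots / ncols)
    return nrows, ncols


df = pd.DataFrame(np.arange(50).reshape((5, 10))).transpose()
nrows, ncols = grid_shape(len(df.columns))

fig = plt.figure(tight_layout=True)
ref_ax = None
for i in range(len(df.columns)):
    # every axes after the first shares its y-axis with the first one
    ax = fig.add_subplot(nrows, ncols, i + 1, sharey=ref_ax)
    if ref_ax is None:
        ref_ax = ax
    df.iloc[:, i].plot(ax=ax)
```

For the five columns used here this gives a 2x3 grid, the same layout as the hard-coded version; plt.show() displays the result.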
Your plots are not shared. You create a subplot grid with 0 rows and 0 columns, i.e. no subplots at all, but those nonexisting subplots have their y axes shared. Then you create some other (existing) subplots, which are not shared. Those are the ones that are plotted to.
Instead, set nrows and ncols to useful values and plot to the axes created that way.
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
df = pd.DataFrame(np.arange(50).reshape((5,10))).transpose()
fig, axes = plt.subplots(nrows=2,ncols=3, sharey='all', tight_layout=True)
for i, ax in zip(range(len(df.columns)), axes.flat):
    iC = df.iloc[:, i]
    iC.plot(ax=ax)
for j in range(len(df.columns),len(axes.flat)):
    axes.flatten()[j].axis("off")
plt.show()

Format x-axis tick labels to look like the default pandas plot

I'm trying to make my plot's x-ticks look like the default pandas DataFrame format.
I've tried setting them with the set_xticklabels function, but did not succeed.
fig, axarr = plt.subplots(len(stations), 2, figsize=(10,11))
plt.subplots_adjust(bottom=0.05)
hPc3.plot(use_index=True, subplots=True, ax=axarr[0:len(stations),0],
for i in range(0,len(axarr)):
    axarr[i,0].set_ylabel('$nT$')
axarr[len(stations)-1,0].set_xlabel('$(UT)$')
for i in range(0,len(axarr)):
    plot4 = axarr[i,1].pcolormesh(tti, wPc3_period[i], np.log10(abs(wPc3_power[i])), cmap = 'jet')
    axarr[i,1].set_yscale('log', basey=2, subsy=None)
    axarr[i,1].set_xlabel('$(UT)$')
    axarr[i,1].set_ylabel('$Period$ $(s)$')
    axarr[i,1].set_ylim([np.min(wPc3_period[i]), np.max(wPc3_period[i])])
    axarr[i,1].invert_yaxis()
    axarr[i,1].plot(tti, te_coi3, 'w')
    cbar_coord = replace_at_index1(make_axes_locatable(axarr[i,1]).get_position(), [0,2], [0.92, 0.01])
    cbar_ax = fig.add_axes(cbar_coord)
    cbar = plt.colorbar(plot4, cax=cbar_ax, boundaries=np.linspace(-10, 10, 512),
                        ticks=[-10, -5, 0, 5, 10], label='$log_{2}$')
    cbar.set_clim([-10,5])
The left panel shows the default tick labels of the pandas DataFrame plot. The right panel shows my current formatting.
The matplotlib dates API provides plenty of convenience functions and classes to represent and convert date and time data.
You can reproduce pandas style using a simple combination of DateFormatter, DayLocator and HourLocator. Here's an example on a dummy dataset given you didn't provide complete working code, but it shouldn't be hard to adapt to your use case.
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import matplotlib.dates as mdates
# create toy dataset
index = pd.date_range("2018-05-25 00:00:00", "2018-05-26 00:00:00", freq = "1min")
series = pd.Series(np.random.random(len(index)), index=index)
x = index.to_pydatetime()
y = series
# plot
fig = plt.figure(figsize=(5,1))
ax = fig.gca()
ax.xaxis.set_minor_formatter(mdates.DateFormatter("%H:%M"))
ax.xaxis.set_minor_locator(mdates.HourLocator(interval=3))
ax.tick_params(which='minor', labelrotation=30)
ax.xaxis.set_major_formatter(mdates.DateFormatter("%d-%b"))
ax.xaxis.set_major_locator(mdates.DayLocator())
ax.tick_params(which='major', pad=10, labelrotation=30)
ax.set_xlim(x.min(), x.max())
ax.plot(x, y)
plt.show()

How to plot heat map in matplotlib with label at both side right and left

UPDATED
I have written the code given below.
import pandas as pd
import matplotlib.pyplot as plt
import numpy as np
df = pd.read_csv("data_1.csv",index_col="Group")
print df
fig,ax = plt.subplots(1)
heatmap = ax.pcolor(df)########
ax.pcolor(df,edgecolors='k')
cbar = plt.colorbar(heatmap)##########
plt.ylim([0,12])
ax.invert_yaxis()
locs_y, labels_y = plt.yticks(np.arange(0.5, len(df.index), 1), df.index)
locs_x, labels_x = plt.xticks(np.arange(0.5, len(df.columns), 1), df.columns)
ax.set_xticklabels(labels_x, rotation=10)
ax.set_yticklabels(labels_y,fontsize=10)
plt.show()
It takes input like the data below and plots a heat map with labels on two sides, left and bottom.
GP1,c1,c2,c3,c4,c5
S1,21,21,20,69,30
S2,28,20,20,39,25
S3,20,21,21,44,21
I also want to add labels on the right side, as in the data below, and plot a heat map with labels on three sides: right, left, and bottom.
GP1,c1,c2,c3,c4,c5
S1,21,21,20,69,30,V1
S2,28,20,20,39,25,V2
S3,20,21,21,44,21,V3
What changes should I make to the code?
Please help.
You can create a second axes on the right of the plot with twinx. Then adjust this axes the same way you already did with the first one.
u = u"""GP1,c1,c2,c3,c4,c5
S1,21,21,20,69,30
S2,28,20,20,39,25
S3,20,21,21,44,21"""
import io
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
df= pd.read_csv(io.StringIO(u),index_col="GP1")
fig,ax = plt.subplots(1)
heatmap = ax.pcolor(df, edgecolors='k')
cbar = plt.colorbar(heatmap, pad=0.1)
bx = ax.twinx()
ax.set_yticks(np.arange(0.5, len(df.index), 1))
ax.set_xticks(np.arange(0.5, len(df.columns), 1), )
ax.set_xticklabels(df.columns, rotation=10)
ax.set_yticklabels(df.index,fontsize=10)
bx.set_yticks(np.arange(0.5, len(df.index), 1))
bx.set_yticklabels(["V1","V2","V3"],fontsize=10)
ax.set_ylim([0,12])
bx.set_ylim([0,12])
ax.invert_yaxis()
bx.invert_yaxis()
plt.show()
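If the right-hand labels are stored in the data rather than hard-coded, they can be split off before plotting. The sketch below assumes the extra column has a header (named Label here, which is not in the original file) so that pandas can parse it. The y-limits are also taken from the number of rows instead of the fixed value 12.

```python
import io

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

# assumed input: the data from the question plus a header for the label column
u = u"""GP1,c1,c2,c3,c4,c5,Label
S1,21,21,20,69,30,V1
S2,28,20,20,39,25,V2
S3,20,21,21,44,21,V3"""

df = pd.read_csv(io.StringIO(u), index_col="GP1")
right_labels = list(df.pop("Label"))  # remove the text column, keep numeric data

fig, ax = plt.subplots(1)
heatmap = ax.pcolor(df.values, edgecolors="k")
fig.colorbar(heatmap, ax=ax, pad=0.1)
bx = ax.twinx()

row_ticks = np.arange(0.5, len(df.index), 1)
ax.set_xticks(np.arange(0.5, len(df.columns), 1))
ax.set_xticklabels(list(df.columns), rotation=10)
ax.set_yticks(row_ticks)
ax.set_yticklabels(list(df.index), fontsize=10)
bx.set_yticks(row_ticks)
bx.set_yticklabels(right_labels, fontsize=10)

# identical, inverted limits keep left and right labels aligned row by row
ax.set_ylim(len(df.index), 0)
bx.set_ylim(len(df.index), 0)
```

As in the answer above, plt.show() displays the figure. Setting the same limits on both axes is what keeps each right-hand label on the same row as its left-hand counterpart.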

Seaborn boxplot with 2 y-axes

How can I create a seaborn boxplot with 2 y-axes? I need this because the data have different scales. My current code overwrites the first box in the boxplot, i.e. the first position is populated by both the first data item from the first axes and the first item from the second axes.
import pandas as pd
import numpy as np
import matplotlib
import matplotlib.pyplot as plt
matplotlib.style.use('ggplot')
import seaborn as sns
df = pd.DataFrame({'A': pd.Series(np.random.uniform(0,1,size=10)),
'B': pd.Series(np.random.uniform(10,20,size=10)),
'C': pd.Series(np.random.uniform(10,20,size=10))})
fig = plt.figure()
# 2/3 of A4
fig.set_size_inches(7.8, 5.51)
plt.ylim(0.0, 1.1)
ax1 = fig.add_subplot(111)
ax1 = sns.boxplot(ax=ax1, data=df[['A']])
ax2 = ax1.twinx()
boxplot = sns.boxplot(ax=ax2, data=df[['B','C']])
fig = boxplot.get_figure()
fig
How do I prevent the first item getting overwritten?
EDIT:
If I add the positions argument:
boxplot = sns.boxplot(ax=ax2, data=df[['B','C']], positions=[2,3])
I get an exception:
TypeError: boxplot() got multiple values for keyword argument 'positions'
Probably because seaborn already sets that argument internally.
It may not make much sense to use seaborn here. Plain matplotlib boxplots let you use the positions argument as expected.
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
plt.style.use('ggplot')
df = pd.DataFrame({'A': pd.Series(np.random.uniform(0,1,size=10)),
'B': pd.Series(np.random.uniform(10,20,size=10)),
'C': pd.Series(np.random.uniform(10,20,size=10))})
fig, ax1 = plt.subplots(figsize=(7.8, 5.51))
props = dict(widths=0.7,patch_artist=True, medianprops=dict(color="gold"))
box1=ax1.boxplot(df['A'].values, positions=[0], **props)
ax2 = ax1.twinx()
box2=ax2.boxplot(df[['B','C']].values,positions=[1,2], **props)
ax1.set_xlim(-0.5,2.5)
ax1.set_xticks(range(len(df.columns)))
ax1.set_xticklabels(df.columns)
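# take the face colors from the active style's color cycle (public API)
colors = plt.rcParams["axes.prop_cycle"].by_key()["color"]
for b, color in zip(box1["boxes"]+box2["boxes"], colors):
    b.set_facecolor(color)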
plt.show()

iPython/Jupyter Notebook and Pandas, how to plot multiple graphs in a for loop?

Consider the following code running in iPython/Jupyter Notebook:
from pandas import *
%matplotlib inline
ys = [[0,1,2,3,4],[4,3,2,1,0]]
x_ax = [0,1,2,3,4]
for y_ax in ys:
    ts = Series(y_ax,index=x_ax)
    ts.plot(kind='bar', figsize=(15,5))
I would expect to have 2 separate plots as output, instead, I get the two series merged in one single plot.
Why is that? How can I get two separate plots keeping the for loop?
Just add the call to plt.show() after you plot the graph (you might want to import matplotlib.pyplot to do that), like this:
from pandas import Series
import matplotlib.pyplot as plt
%matplotlib inline
ys = [[0,1,2,3,4],[4,3,2,1,0]]
x_ax = [0,1,2,3,4]
for y_ax in ys:
    ts = Series(y_ax,index=x_ax)
    ts.plot(kind='bar', figsize=(15,5))
    plt.show()
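A variant that does not rely on plt.show() to separate the plots is to give every series its own figure and axes explicitly. This is only a sketch; collecting the figures in a list named figures is an assumption for illustration, and the %matplotlib inline magic is left out so the snippet is plain Python.

```python
import pandas as pd
import matplotlib.pyplot as plt

ys = [[0, 1, 2, 3, 4], [4, 3, 2, 1, 0]]
x_ax = [0, 1, 2, 3, 4]

figures = []
for y_ax in ys:
    # a new figure per iteration, so two series never share an axes
    fig, ax = plt.subplots(figsize=(15, 5))
    pd.Series(y_ax, index=x_ax).plot(kind="bar", ax=ax)
    figures.append(fig)
```

Passing ax explicitly makes it clear which figure each series is drawn on, instead of depending on whichever axes happens to be current.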
In the IPython notebook the best way to do this is often with subplots. You create multiple axes on the same figure and then render the figure in the notebook. For example:
import pandas as pd
import matplotlib.pyplot as plt
%matplotlib inline
ys = [[0,1,2,3,4],[4,3,2,1,0]]
x_ax = [0,1,2,3,4]
fig, axs = plt.subplots(ncols=2, figsize=(10, 4))
for i, y_ax in enumerate(ys):
    pd.Series(y_ax, index=x_ax).plot(kind='bar', ax=axs[i])
    axs[i].set_title('Plot number {}'.format(i+1))
This generates two bar charts side by side.
