How to plot multiple dataframes in subplots - python

I have a few Pandas DataFrames sharing the same value scale, but having different columns and indices. When invoking df.plot(), I get separate plot images. What I really want is to have them all in the same plot as subplots, but I can't work out how to do it and would appreciate some help.

You can manually create the subplots with matplotlib, and then plot each dataframe on a specific subplot using the ax keyword. For example, for 4 subplots (2x2):
import matplotlib.pyplot as plt
fig, axes = plt.subplots(nrows=2, ncols=2)
df1.plot(ax=axes[0,0])
df2.plot(ax=axes[0,1])
...
Here axes is an array which holds the different subplot axes, and you can access one just by indexing axes.
If you want a shared x-axis, then you can provide sharex=True to plt.subplots.
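As a fuller sketch of the same idea, the snippet below builds four made-up dataframes (df1 to df4 and their column names are placeholders, not the asker's data) and draws each on its own subplot. Because the question says the frames share a value scale, it passes sharey=True, the y-axis counterpart of sharex.

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

# Hypothetical sample frames with different columns and index lengths
rng = np.random.default_rng(0)
df1 = pd.DataFrame(rng.random((10, 2)), columns=["a", "b"])
df2 = pd.DataFrame(rng.random((15, 3)), columns=["c", "d", "e"])
df3 = pd.DataFrame(rng.random((20, 1)), columns=["f"])
df4 = pd.DataFrame(rng.random((5, 2)), columns=["g", "h"])

# sharey=True links the y-limits of all four subplots
fig, axes = plt.subplots(nrows=2, ncols=2, sharey=True, figsize=(8, 6))
df1.plot(ax=axes[0, 0], title="df1")
df2.plot(ax=axes[0, 1], title="df2")
df3.plot(ax=axes[1, 0], title="df3")
df4.plot(ax=axes[1, 1], title="df4")
fig.tight_layout()
```

Call plt.show() or fig.savefig(...) afterwards to render the figure.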

You can see examples in the documentation that demonstrate joris's answer. Also from the documentation: you can set subplots=True and layout=(rows, cols) within the pandas plot function:
df.plot(subplots=True, layout=(1,2))
You could also use fig.add_subplot() which takes subplot grid parameters such as 221, 222, 223, 224, etc. as described in the post here. Nice examples of plot on pandas data frame, including subplots, can be seen in this ipython notebook.
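A small sketch of the subplots=True route, with a made-up four-column dataframe. Note that this splits the columns of one dataframe across subplots; it does not combine several dataframes.

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

# Hypothetical single dataframe with four columns
rng = np.random.default_rng(0)
df = pd.DataFrame(rng.random((10, 4)), columns=list("ABCD"))

# One subplot per column, arranged in a 2x2 grid;
# the return value is an array of Axes shaped like the layout
axes = df.plot(subplots=True, layout=(2, 2), figsize=(8, 6))
axes[0, 0].set_ylabel("value")
```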

You can plot multiple pandas data frames as subplots with matplotlib by putting all the data frames in a list and then plotting them in a for loop.
Working code:
import matplotlib.pyplot as plt
import pandas as pd
import numpy as np
# dataframe sample data
df1 = pd.DataFrame(np.random.rand(10,2)*100, columns=['A', 'B'])
df2 = pd.DataFrame(np.random.rand(10,2)*100, columns=['A', 'B'])
df3 = pd.DataFrame(np.random.rand(10,2)*100, columns=['A', 'B'])
df4 = pd.DataFrame(np.random.rand(10,2)*100, columns=['A', 'B'])
df5 = pd.DataFrame(np.random.rand(10,2)*100, columns=['A', 'B'])
df6 = pd.DataFrame(np.random.rand(10,2)*100, columns=['A', 'B'])
#define number of rows and columns for subplots
nrow=3
ncol=2
# make a list of all dataframes
df_list = [df1 ,df2, df3, df4, df5, df6]
fig, axes = plt.subplots(nrow, ncol)
# plot counter
count=0
for r in range(nrow):
    for c in range(ncol):
        df_list[count].plot(ax=axes[r,c])
        count+=1
With this code you can plot subplots in any configuration: define the number of rows nrow and the number of columns ncol, and put the data frames you want to plot in the list df_list.
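The nested loop above assumes the grid has exactly as many cells as there are data frames. A hedged variation for the case where it doesn't (here five made-up frames in a 3x2 grid; the frames dict and its names are illustrative only) is to iterate over the flattened axes and hide whatever is left over:

```python
import math
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

# Hypothetical input: five frames, which do not fill a 3x2 grid
rng = np.random.default_rng(0)
frames = {f"df{i}": pd.DataFrame(rng.random((10, 2)) * 100, columns=["A", "B"])
          for i in range(1, 6)}

ncol = 2
nrow = math.ceil(len(frames) / ncol)  # enough rows to hold every frame

# squeeze=False keeps axes 2D even when nrow or ncol is 1
fig, axes = plt.subplots(nrow, ncol, figsize=(8, 9), squeeze=False)
flat_axes = axes.ravel()

# pair each frame with the next free subplot
for ax, (name, frame) in zip(flat_axes, frames.items()):
    frame.plot(ax=ax, title=name)

# hide the subplots that received no data
for ax in flat_axes[len(frames):]:
    ax.set_visible(False)

fig.tight_layout()
```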

You can use the familiar Matplotlib style of calling a figure and subplot; you simply need to specify the current axis using plt.gca(). An example:
plt.figure(1)
plt.subplot(2,2,1)
df.A.plot() #no need to specify for first axis
plt.subplot(2,2,2)
df.B.plot(ax=plt.gca())
plt.subplot(2,2,3)
df.C.plot(ax=plt.gca())
etc...

You can use this:
fig = plt.figure()
ax = fig.add_subplot(221)  # 2x2 grid, first subplot
ax.plot(x, y)  # x, y and z are your own data arrays
ax = fig.add_subplot(222)  # 2x2 grid, second subplot
ax.plot(x, z)
...
plt.show()

You may not need to use Pandas at all. Here's a matplotlib plot of cat frequencies:
import numpy as np
import matplotlib.pyplot as plt
x = np.linspace(0, 2*np.pi, 400)
y = np.sin(x**2)
f, axes = plt.subplots(2, 1)
for ax in axes:  # axes is a 1D array holding the two subplots
    ax.plot(x, y)
    ax.set_title('cats')
plt.tight_layout()

Option 1: Create subplots from a dictionary of dataframes with long (tidy) data
Assumptions:
There is a dictionary of multiple dataframes of tidy data that are either:
Created by reading in from files
Created by separating a single dataframe into multiple dataframes
The categories, cat, may be overlapping, but all dataframes don't necessarily contain all values of cat
hue='cat'
This example uses a dict of dataframes, but a list of dataframes would be similar.
If the dataframes are wide, use pandas.DataFrame.melt to convert them to long form.
Because dataframes are being iterated through, there's no guarantee that colors will be mapped the same for each plot
A custom color map needs to be created from the unique 'cat' values for all the dataframes
Since the colors will be the same, place one legend to the side of the plots, instead of a legend in every plot
Tested in python 3.10, pandas 1.4.3, matplotlib 3.5.1, seaborn 0.11.2
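For the wide-to-long conversion mentioned in the list above, a minimal sketch with pandas.DataFrame.melt could look like this. The wide frame and its column names are made up for illustration; the output columns are named to match the 'cat', 'x', 'y' layout used in this answer.

```python
import pandas as pd

# Hypothetical wide data: one column per category, shared x values
wide = pd.DataFrame({"x": [0.1, 0.2, 0.3],
                     "A": [1.0, 2.0, 3.0],
                     "B": [4.0, 5.0, 6.0]})

# Long (tidy) form: one row per (x, category) pair
long = wide.melt(id_vars="x", var_name="cat", value_name="y")
print(long)
```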
Imports and Test Data
import pandas as pd
import numpy as np # used for random data
import matplotlib.pyplot as plt
from matplotlib.patches import Patch # for custom legend - square patches
from matplotlib.lines import Line2D # for custom legend - round markers
import seaborn as sns
import math # math.ceil determines the correct number of subplot rows
# synthetic data
df_dict = dict()
for i in range(1, 7):
    np.random.seed(i) # for repeatable sample data
    data_length = 100
    data = {'cat': np.random.choice(['A', 'B', 'C'], size=data_length),
            'x': np.random.rand(data_length), 'y': np.random.rand(data_length)}
    df_dict[i] = pd.DataFrame(data)
# display(df_dict[1].head())
  cat         x         y
0   B  0.944595  0.606329
1   A  0.586555  0.568851
2   A  0.903402  0.317362
3   B  0.137475  0.988616
4   B  0.139276  0.579745
# display(df_dict[6].tail())
   cat         x         y
95   B  0.881222  0.263168
96   A  0.193668  0.636758
97   A  0.824001  0.638832
98   C  0.323998  0.505060
99   C  0.693124  0.737582
Create color mappings and plot
# create color mapping based on all unique values of cat
unique_cat = {cat for v in df_dict.values() for cat in v.cat.unique()} # get unique cats
colors = sns.color_palette('tab10', n_colors=len(unique_cat)) # get a number of colors
cmap = dict(zip(unique_cat, colors)) # zip values to colors
col_nums = 3 # how many plots per row
row_nums = math.ceil(len(df_dict) / col_nums) # how many rows of plots
# create the figure and axes
fig, axes = plt.subplots(row_nums, col_nums, figsize=(9, 6), sharex=True, sharey=True)
# convert to 1D array for easy iteration
axes = axes.flat
# iterate through dictionary and plot
for ax, (k, v) in zip(axes, df_dict.items()):
    sns.scatterplot(data=v, x='x', y='y', hue='cat', palette=cmap, ax=ax)
    sns.despine(top=True, right=True)
    ax.legend_.remove() # remove the individual plot legends
    ax.set_title(f'dataset = {k}', fontsize=11)
fig.tight_layout()
# create legend from cmap
# patches = [Patch(color=v, label=k) for k, v in cmap.items()] # square patches
patches = [Line2D([0], [0], marker='o', color='w', markerfacecolor=v, label=k, markersize=8) for k, v in cmap.items()] # round markers
# place legend outside of plot; change the right bbox value to move the legend up or down
plt.legend(title='cat', handles=patches, bbox_to_anchor=(1.06, 1.2), loc='center left', borderaxespad=0, frameon=False)
plt.show()
Option 2: Create subplots from a single dataframe with multiple separate datasets
The dataframes must be in a long form with the same column names.
This option uses pd.concat to combine multiple dataframes into a single dataframe, and .assign to add a new column.
See Import multiple csv files into pandas and concatenate into one DataFrame for creating a single dataframe from a list of files.
This option is easier because it doesn't require manually mapping colors to 'cat'
Combine DataFrames
# using df_dict, with dataframes as values, from the top
# combine all the dataframes in df_dict to a single dataframe with an identifier column
df = pd.concat((v.assign(dataset=k) for k, v in df_dict.items()), ignore_index=True)
# display(df.head())
  cat         x         y  dataset
0   B  0.944595  0.606329        1
1   A  0.586555  0.568851        1
2   A  0.903402  0.317362        1
3   B  0.137475  0.988616        1
4   B  0.139276  0.579745        1
# display(df.tail())
    cat         x         y  dataset
595   B  0.881222  0.263168        6
596   A  0.193668  0.636758        6
597   A  0.824001  0.638832        6
598   C  0.323998  0.505060        6
599   C  0.693124  0.737582        6
Plot a FacetGrid with seaborn.relplot
sns.relplot(kind='scatter', data=df, x='x', y='y', hue='cat', col='dataset', col_wrap=3, height=3)
Both options create the same result; however, it's less complicated to combine all the dataframes and plot a figure-level plot with sns.relplot.

Building on @joris's answer above, if you have already established a reference to the subplot, you can pass that reference as well. For example:
ax1 = plt.subplot2grid((50,100), (0, 0), colspan=20, rowspan=10)
...
df.plot.barh(ax=ax1, stacked=True)

Here is a working pandas subplot example, where modes is the list of column names of the dataframe. It assumes pivot_df, my_colors and name are already defined.
dpi=200
figure_size=(20, 10)
fig, ax = plt.subplots(len(modes), 1, sharex="all", sharey="all", dpi=dpi)
for i in range(len(modes)):
    ax[i] = pivot_df.loc[:, modes[i]].plot.bar(figsize=(figure_size[0], figure_size[1]*len(modes)),
                                               ax=ax[i], title=modes[i], color=my_colors[i])
    ax[i].legend()
fig.suptitle(name)

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
fig, axes = plt.subplots(2,2)
df = pd.DataFrame({'A':np.random.randint(1,100,10),
                   'B': np.random.randint(100,1000,10),
                   'C':np.random.randint(100,200,10)})
# plot the same dataframe on every subplot
for ax in axes.flatten():
    df.plot(ax=ax)

Related

Pointplot and Scatterplot in one figure but X axis is shifting

Hi I'm trying to plot a pointplot and scatterplot on one graph with the same dataset so I can see the individual points that make up the pointplot.
Here is the code I am using:
xlPath = r'path to data here'
df = pd.concat(pd.read_excel(xlPath, sheet_name=None),ignore_index=True)
sns.pointplot(data=df, x='ID', y='HM (N/mm2)', palette='bright', capsize=0.15, alpha=0.5, ci=95, join=True, hue='Layer')
sns.scatterplot(data=df, x='ID', y='HM (N/mm2)')
plt.show()
When I plot, for some reason the points from the scatterplot are offsetting one ID spot right on the x-axis. When I plot the scatter or the point plot separately, they each are in the correct ID spot. Why would plotting them on the same plot cause the scatterplot to offset one right?
Edit: Tried to make the ID column categorical, but that didn't work either.
Seaborn's pointplot creates a categorical x-axis while here the scatterplot uses a numerical x-axis.
Explicitly making the x-values categorical: df['ID'] = pd.Categorical(df['ID']), isn't sufficient, as the scatterplot still sees numbers. Changing the values to strings does the trick. To get them in the correct order, sorting might be necessary.
import matplotlib.pyplot as plt
import seaborn as sns
import pandas as pd
import numpy as np
# first create some test data
df = pd.DataFrame({'ID': np.random.choice(np.arange(1, 49), 500),
'HM (N/mm2)': np.random.uniform(1, 10, 500)})
df['Layer'] = ((df['ID'] - 1) // 6) % 4 + 1
df['HM (N/mm2)'] += df['Layer'] * 8
df['Layer'] = df['Layer'].map(lambda s: f'Layer {s}')
# sort the values and convert the 'ID's to strings
df = df.sort_values('ID')
df['ID'] = df['ID'].astype(str)
fig, ax = plt.subplots(figsize=(12, 4))
sns.pointplot(data=df, x='ID', y='HM (N/mm2)', palette='bright',
capsize=0.15, alpha=0.5, ci=95, join=True, hue='Layer', ax=ax)
sns.scatterplot(data=df, x='ID', y='HM (N/mm2)', color='purple', ax=ax)
ax.margins(x=0.02)
plt.tight_layout()
plt.show()

Trying to plot a bar chart with age categories issue. Seaborn and Pandas df

Hi all, I have the following groups of data:
sumcosts = df.groupby('AgeGroup').Costs.sum()
print(sumcosts)
AgeGroup
18-25 536295.37
25-35 1784085.88
35-45 2395250.62
45-55 5483060.33
55-65 11652094.30
65-75 9633490.63
75+ 5186867.32
Name: Costs, dtype: float64
countoftrips = df.groupby('AgeGroup').Booking.nunique()
print(countoftrips)
AgeGroup
18-25 139
25-35 398
35-45 379
45-55 738
55-65 1417
65-75 995
75+ 545
Name: Booking, dtype: int64
When trying to plot these, I used the following:
import matplotlib.pyplot as plt
import matplotlib.ticker as ticker
import seaborn as sns
sns.set()
fig, ax1 = plt.subplots()
sns.barplot(data=sumcosts, palette="rocket", ax=ax1)
ax2 = ax1.twinx()
sns.lineplot(data=countoftrips, palette="rocket", ax=ax2)
plt.show()
The output is this:
The line section looks correct, but the bar chart has obviously stopped in the first age bracket. Any ideas on how to correct this? I tried to define x='Agegroup' and y='Costs' but then got errors, and this is the most progress I could make. Thanks very much!
Your barplot appears to be showing the sum of all costs, not just those of the 18-25 age group. The bar only appears under the x-axis label for the 18-25 group because of the positioning of your axis for the line plot, which makes it confusing.
I created a dummy data set of 1000 rows in a .csv to graph this example, but my values are different, so the plots will look visually different; everything else will work the same for you.
Jupyter Notebook Setup:
(images added to reflect outputs)
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sb
%matplotlib inline
# Read in dataset 'df', showing the header
df = pd.read_csv('./data-raw.csv')
df.head()
This assumes you have no NaN values in your data; otherwise you can use dropna() to remove them.
# Check if there are any NaN values in the dataframe
print('Number of NaN values in the columns of our DataFrame:\n', df.isnull().sum())
# Remove any rows that contain NaN values using dropna (as applicable)
df.dropna(axis=0, inplace=True)
Your sumcosts and countoftrips are not required for creating your plots, and I believe they are the cause of your plotting error for the bar graph. I've included them here, but am not using them when creating the plot.
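If you would rather keep the pre-aggregated Series, one possible sketch is to plot them directly with pandas and matplotlib instead of seaborn (that swap is my assumption, not part of the approach below). The numbers are copied from the question's printed output; the Series are rebuilt by hand here because the original df isn't available.

```python
import pandas as pd
import matplotlib.pyplot as plt

# Values copied from the question's printed output
age_groups = pd.Index(["18-25", "25-35", "35-45", "45-55", "55-65", "65-75", "75+"],
                      name="AgeGroup")
sumcosts = pd.Series([536295.37, 1784085.88, 2395250.62, 5483060.33,
                      11652094.30, 9633490.63, 5186867.32],
                     index=age_groups, name="Costs")
countoftrips = pd.Series([139, 398, 379, 738, 1417, 995, 545],
                         index=age_groups, name="Booking")

fig, ax1 = plt.subplots()
# pandas draws one bar per index label at positions 0, 1, 2, ...
sumcosts.plot.bar(ax=ax1, color="C0")
ax1.set_ylabel("Costs")

# second y-axis for the counts; use the same 0..n-1 positions so points sit on the bars
ax2 = ax1.twinx()
ax2.plot(range(len(countoftrips)), countoftrips.to_numpy(), color="C1", marker="o")
ax2.set_ylabel("Bookings")
fig.tight_layout()
```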
Plot Type:
It is also important to keep in mind that a bar plot shows only the mean (or another estimator, e.g. std), but in many cases it may be more informative to show the distribution of values at each level of the categorical variable. In that case, other approaches such as a box or violin plot may be more appropriate.
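As a rough illustration of the box-plot alternative, here is a sketch using pandas' own DataFrame.boxplot rather than seaborn. The data is synthetic and only reuses the AgeGroup and Costs column names from the question, so the shapes will not match your real data.

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

# Synthetic stand-in data: three age groups with skewed costs
rng = np.random.default_rng(1)
df = pd.DataFrame({
    "AgeGroup": rng.choice(["18-25", "25-35", "35-45"], size=300),
    "Costs": rng.gamma(2.0, 500.0, size=300),
})

fig, ax = plt.subplots(figsize=(6, 4))
# one box per age group, showing the spread of Costs rather than a single estimate
df.boxplot(column="Costs", by="AgeGroup", ax=ax)
ax.set_xlabel("Age Group Ranges")
ax.set_ylabel("Costs")
fig.suptitle("")  # drop the automatic "Boxplot grouped by ..." title
```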
Solution:
This is assuming you want to have the line and bar plot layered over each other, as in your example:
# This plot has both graphs on the axis you outlined in your code,
# I used the ci = None parameter to remove the confidence intervals to
# make the combined plot easier to read (optional)
fig, ax1 = plt.subplots()
sb.barplot(data = df, x = 'AgeGroup', y = 'Costs', ci = None,
ax = ax1, palette = 'rocket', order = ['18-25',
'25-35','35-45','45-55','55-65', '65-75', '75+']);
ax2 = ax1.twinx()
sb.lineplot(data = df, x = 'AgeGroup', y = 'Booking', ax = ax2, ci = None);
plt.xlabel('Age Group Ranges');
plt.show()
Here is an alternative you could try, also using subplot, but separating the two plots.
# Adjusting the plot size just to make it easier to read here:
plt.figure(figsize = [14, 4])
#Bar Chart on Left
plt.subplot(1, 2, 1) # 1 row, 2 cols, subplot 1
sb.barplot(data = df, x = 'AgeGroup', y = 'Costs', palette = 'rocket',
ci = 'sd', order = ['18-25', '25-35', '35-45',
'45-55','55-65', '65-75', '75+']);
plt.xlabel('Age Group Ranges')
plt.ylabel('Costs')
# Line Chart on Right
plt.subplot(1, 2, 2) # 1 row, 2 cols, subplot 2
sb.lineplot(data = df, x = 'AgeGroup', y = 'Booking', ci = None)
plt.xlabel('Age Group Ranges')
plt.ylabel('Bookings');
Hope you find this helpful!

Python. Use two y axis for line and bar plots on Seaborn Facetgrid

Updated question and code!
The tips dataset is probably not the best example to use; however, my issue is reproduced in it, i.e. we see that both the point and bar plots share the same Y axis.
I need to combine line and bar plots on one chart. To do this I used seaborn and the following code:
tips = sns.load_dataset('tips')
g = sns.FacetGrid(tips, hue='sex', col='sex', size=4, aspect=2.1, sharey=False, sharex=False)
g = g.map(sns.pointplot, 'day', 'tip', ci=0)
g = g.map(sns.barplot, 'day', 'total_bill', ci=0)
g.set_xticklabels(rotation=45, fontsize=9)
plt.show()
Here is the result:
Everything is okay except the fact that one Y axis is used for both bars and lines on each facetgrid object. I am new to seaborn and currently cannot find a solution. Tried to add "sharey=False" to this line of code
g.map(sns.pointplot, 'date', 'worthusdcount')
however it didn't help.
Any solutions on how to add second Y axis would be appreciated
Here's an example where you apply a custom mapping function to the dataframe of interest. Within the function, you can call plt.gca() to get the current axis at the facet being currently plotted in FacetGrid. Once you have the axis, twinx() can be called just like you would in plain old matplotlib plotting.
import matplotlib.pyplot as plt
import pandas as pd
import numpy as np
import seaborn as sns
def facetgrid_two_axes(*args, **kwargs):
    data = kwargs.pop('data')
    dual_axis = kwargs.pop('dual_axis')
    alpha = kwargs.pop('alpha', 0.2)
    kwargs.pop('color')  # discard the color FacetGrid passes in; colors are set explicitly below
    ax = plt.gca()  # axes of the facet currently being drawn
    if dual_axis:
        ax2 = ax.twinx()
        ax2.set_ylabel('Second Axis!')
    ax.plot(data['x'],data['y1'], **kwargs, color='red',alpha=alpha)
    if dual_axis:
        # use the facet's own data, not the global df
        ax2.bar(data['x'],data['y2'], **kwargs, color='blue',alpha=alpha)
df = pd.DataFrame()
df['x'] = np.arange(1,5,1)
df['y1'] = 1 / df['x']
df['y2'] = df['x'] * 100
df['facet'] = 'foo'
df2 = df.copy()
df2['facet'] = 'bar'
df3 = pd.concat([df,df2])
win_plot = sns.FacetGrid(df3, col='facet', height=6)  # 'size' was renamed to 'height' in seaborn 0.9
(win_plot.map_dataframe(facetgrid_two_axes, dual_axis=True)
         .set_axis_labels("X", "First Y-axis"))
plt.show()
This isn't the prettiest plot, as you might want to adjust the presence of the second y-axis label, the spacing between plots, etc., but the code suffices to show how to plot two series of differing magnitudes within FacetGrids.

Pandas and Matplotlib plotting df as subplots with 2 y-axes

I'm trying to plot a dataframe to a few subplots using pandas and matplotlib.pyplot. But I want to have the two columns use different y axes and have those shared between all subplots.
Currently my code is:
import pandas as pd
import matplotlib.pyplot as plt
df = pd.DataFrame({'Area':['A', 'A', 'A', 'B', 'B', 'C','C','C','D','D','D','D'],
'Rank':[1,2,3,1,2,1,2,3,1,2,3,4],
'Count':[156,65,152,70,114,110,195,92,44,179,129,76],
'Value':[630,426,312,191,374,109,194,708,236,806,168,812]}
)
df = df.set_index(['Area', 'Rank'])
fig = plt.figure(figsize=(6,4))
for i, l in enumerate(['A','B','C','D']):
    if i == 0:
        sub1 = fig.add_subplot(141+i)
    else:
        sub1 = fig.add_subplot(141+i, sharey=sub1)
    df.loc[l].plot(kind='bar', ax=sub1)
This produces:
This plots the 4 graphs side by side, which is what I want, but both columns use the same y-axis. I'd like the 'Count' column to use a common y-axis on the left and the 'Value' column to use a common secondary y-axis on the right.
Can anybody suggest a way to do this? My attempts thus far have led to each graph having its own independent y-axis.
To create a secondary y axis, you can use twinax = ax.twinx(). One can then join those twin axes via the join method of an axes Grouper, twinax.get_shared_y_axes().join(twinax1, twinax2). See this question for more details.
The next problem is then to get the two different barplots next to each other. Since I don't think there is a way to do this using the pandas plotting wrappers, one can use a matplotlib bar plot, which allows specifying the bar position quantitatively. The positions of the left bars would then be shifted by the bar width.
import pandas as pd
import matplotlib.pyplot as plt
df = pd.DataFrame({'Area':['A', 'A', 'A', 'B', 'B', 'C','C','C','D','D','D','D'],
'Rank':[1,2,3,1,2,1,2,3,1,2,3,4],
'Count':[156,65,152,70,114,110,195,92,44,179,129,76],
'Value':[630,426,312,191,374,109,194,708,236,806,168,812]}
)
df = df.set_index(['Area', 'Rank'])
fig, axes = plt.subplots(ncols=len(df.index.levels[0]), figsize=(6,4), sharey=True)
twinaxes = []
for i, l in enumerate(df.index.levels[0]):
    axes[i].bar(df["Count"].loc[l].index.values-0.4,df["Count"].loc[l], width=0.4, align="edge" )
    ax2 = axes[i].twinx()
    twinaxes.append(ax2)
    ax2.bar(df["Value"].loc[l].index.values,df["Value"].loc[l], width=0.4, align="edge", color="C3" )
    ax2.set_xticks(df["Value"].loc[l].index.values)
    ax2.set_xlabel("Rank")
[twinaxes[0].get_shared_y_axes().join(twinaxes[0], ax) for ax in twinaxes[1:]]
[ax.tick_params(labelright=False) for ax in twinaxes[:-1]]
axes[0].set_ylabel("Count")
axes[0].yaxis.label.set_color('C0')
axes[0].tick_params(axis='y', colors='C0')
twinaxes[-1].set_ylabel("Value")
twinaxes[-1].yaxis.label.set_color('C3')
twinaxes[-1].tick_params(axis='y', colors='C3')
twinaxes[0].relim()
twinaxes[0].autoscale_view()
plt.show()
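As far as I know, newer Matplotlib releases deprecated modifying the grouper returned by get_shared_y_axes(), so the join call above may warn or fail there. A hedged sketch of the same linking with Axes.sharey instead (assuming Matplotlib 3.3 or newer; the three panels and their numbers are made up for illustration):

```python
import matplotlib.pyplot as plt

# Hypothetical data: three panels whose secondary values differ in magnitude
positions = [0, 1, 2]
primary = [1, 2, 3]
secondary = [[1, 2, 3], [10, 20, 30], [100, 200, 300]]

# left-hand y-axes are shared through plt.subplots
fig, axes = plt.subplots(ncols=3, sharey=True, figsize=(8, 3))

# one right-hand twin per panel, then tie every twin's y-axis to the first twin
twins = [ax.twinx() for ax in axes]
for tw in twins[1:]:
    tw.sharey(twins[0])

for ax, tw, vals in zip(axes, twins, secondary):
    ax.bar(positions, primary, width=0.4)
    tw.plot(positions, vals, color="C3", marker="o")

# keep the right-hand tick labels on the last panel only
for tw in twins[:-1]:
    tw.tick_params(labelright=False)
twins[-1].set_ylabel("Value")
```

Note that sharey can only be called once per axes, so each twin is linked to the first one rather than chained.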

Pandas groupby results on the same plot

I am dealing with the following data frame (only for illustration, actual df is quite large):
   seq      x1      y1
0    2  0.7725  0.2105
1    2  0.8098  0.3456
2    2  0.7457  0.5436
3    2  0.4168  0.7610
4    2  0.3181  0.8790
5    3  0.2092  0.5498
6    3  0.0591  0.6357
7    5  0.9937  0.5364
8    5  0.3756  0.7635
9    5  0.1661  0.8364
I'm trying to plot multiple line graphs for the above coordinates (x as "x1" against y as "y1").
Rows with the same "seq" form one path and have to be plotted as one separate line; for example, all the x, y coordinates corresponding to seq = 2 belong to one line, and so on.
I am able to plot them, but on separate graphs. I want all the lines on the same graph. I tried using subplots, but am not getting it right.
import matplotlib as mpl
import matplotlib.pyplot as plt
%matplotlib notebook
df.groupby("seq").plot(kind = "line", x = "x1", y = "y1")
This creates hundreds of graphs (one per unique seq value). Please suggest a way to get all the lines on the same graph.
UPDATE
To resolve the above problem, I implemented the following code:
fig, ax = plt.subplots(figsize=(12,8))
df.groupby('seq').plot(kind='line', x = "x1", y = "y1", ax = ax)
plt.title("abc")
plt.show()
Now, I want a way to plot the lines with specific colors. I am clustering path from seq = 2 and 5 in cluster 1; and path from seq = 3 in another cluster.
So, there are two lines under cluster 1 which I want in red and 1 line under cluster 2 which can be green.
How should I proceed with this?
You need to initialize the axis before plotting, like in this example:
import pandas as pd
import matplotlib.pylab as plt
import numpy as np
# random df
df = pd.DataFrame(np.random.randint(0,10,size=(25, 3)), columns=['ProjID','Xcoord','Ycoord'])
# plot groupby results on the same canvas
fig, ax = plt.subplots(figsize=(8,6))
df.groupby('ProjID').plot(kind='line', x = "Xcoord", y = "Ycoord", ax=ax)
plt.show()
Consider the dataframe df
df = pd.DataFrame(dict(
ProjID=np.repeat(range(10), 10),
Xcoord=np.random.rand(100),
Ycoord=np.random.rand(100),
))
Then we create abstract art like this
df.set_index('Xcoord').groupby('ProjID').Ycoord.plot()
Another way:
for k,g in df.groupby('ProjID'):
    plt.plot(g['Xcoord'],g['Ycoord'])
plt.show()
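The for-loop form also gives a straightforward way to handle the color part of the updated question. A minimal sketch, using the sample rows from the question and two lookup dicts (cluster_of and color_of are names I made up; in practice they would come from your clustering step):

```python
import pandas as pd
import matplotlib.pyplot as plt

# Sample rows from the question
df = pd.DataFrame({
    "seq": [2, 2, 2, 2, 2, 3, 3, 5, 5, 5],
    "x1": [0.7725, 0.8098, 0.7457, 0.4168, 0.3181, 0.2092, 0.0591, 0.9937, 0.3756, 0.1661],
    "y1": [0.2105, 0.3456, 0.5436, 0.7610, 0.8790, 0.5498, 0.6357, 0.5364, 0.7635, 0.8364],
})

# Hypothetical lookups: which cluster each seq belongs to, and each cluster's color
cluster_of = {2: 1, 5: 1, 3: 2}
color_of = {1: "red", 2: "green"}

fig, ax = plt.subplots(figsize=(12, 8))
# one line per seq, colored by the cluster that seq belongs to
for seq, group in df.groupby("seq"):
    cluster = cluster_of[seq]
    ax.plot(group["x1"], group["y1"], color=color_of[cluster],
            label=f"seq {seq} (cluster {cluster})")
ax.legend()
```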
Here is a working example including the ability to adjust legend names.
fig, ax = plt.subplots()  # axes that all groups are drawn on
grp = df.groupby('groupCol')
legendNames = grp.apply(lambda x: x.name) #Get group names using the name attribute.
#legendNames = list(grp.groups.keys()) #Alternative way to get group names. Someone else might be able to speak on speed. This might iterate through the grouper and find keys which could be slower? Not sure
plots = grp.plot('x1','y1',legend=True, ax=ax)
for txt, name in zip(ax.legend_.texts, legendNames):
    txt.set_text(name)
Explanation:
Legend entries are stored in the attribute ax.legend_, which in turn contains a list of Text objects, one per group; the Text class is part of the matplotlib.text API. To set a text object's value, you can use the setter method set_text(self, s).
As a side note, the Text class has a number of set_X() methods that allow you to change the font sizes, fonts, colors, etc. I haven't used those, so I don't know for sure they work, but can't see why not.
Based on Serenity's answer, I made the legend better.
import pandas as pd
import matplotlib.pylab as plt
import numpy as np
# random df
df = pd.DataFrame(np.random.randint(0,10,size=(25, 3)), columns=['ProjID','Xcoord','Ycoord'])
# plot groupby results on the same canvas
grouped = df.groupby('ProjID')
fig, ax = plt.subplots(figsize=(8,6))
grouped.plot(kind='line', x = "Xcoord", y = "Ycoord", ax=ax)
ax.legend(labels=grouped.groups.keys()) ## better legend
plt.show()
You can also do it like this:
grouped = df.groupby('ProjID')
fig, ax = plt.subplots(figsize=(8,6))
g_plot = lambda x:x.plot(x = "Xcoord", y = "Ycoord", ax=ax, label=x.name)
grouped.apply(g_plot)
plt.show()
