This is my DataFrame:
age weight score height name
0 12 100 12 501 aa
1 23 120 12 502 bb
2 34 121 13 499 bb
3 32 134 10 499 cc
4 23 133 11 498 cc
5 12 112 19 503 aa
I need to draw four scatter plots, one each for the columns 'age', 'weight', 'score' and 'height'. My code:
fig,axes = plt.subplots(2,2,figsize=(12,8))
property = ['age','weight','score','height']
indexes = df.index.tolist()
for counter in range(0,4):
    i = counter % 2
    j = math.floor(counter / 2)
    scatter = axes[i,j].scatter(indexes,df[property[counter]],c=y)
    axes[i,j].set_title(property[counter])
    legend = axes[i,j].legend(*scatter.legend_elements())
    axes[i,j].add_artist(legend)
As a result, I get the labels '1', '2', '3'.
How can I get the labels 'aa', 'bb', 'cc', each with a different color?
Seaborn could create the legends automatically:
from matplotlib import pyplot as plt
import seaborn as sns
import pandas as pd
from io import StringIO
data_str = ''' age weight score height name
0 12 100 12 501 aa
1 23 120 12 502 bb
2 34 121 13 499 bb
3 32 134 10 499 cc
4 23 133 11 498 cc
5 12 112 19 503 aa'''
df = pd.read_csv(StringIO(data_str), sep=r'\s+')  # delim_whitespace is deprecated in recent pandas
fig, axes = plt.subplots(2, 2, figsize=(12, 8))
property = ['age', 'weight', 'score', 'height']
indexes = df.index.tolist()
for ax, prop in zip(axes.ravel(), property):
    scatter = sns.scatterplot(x=indexes, y=prop, hue='name', data=df, ax=ax)
    ax.set_title(prop)
    ax.set_ylabel('') # remove default y label
plt.tight_layout()
plt.show()
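If you would rather stay with plain matplotlib, one possible approach is to encode the names as integers and then relabel the legend handles. The following is only a minimal sketch, assuming the same six-row DataFrame as above; the use of pd.factorize and the variable names codes and uniques are my own choices, not part of the question.

```python
import matplotlib.pyplot as plt
import pandas as pd

# Same sample data as in the question, typed in directly.
df = pd.DataFrame({
    'age': [12, 23, 34, 32, 23, 12],
    'weight': [100, 120, 121, 134, 133, 112],
    'score': [12, 12, 13, 10, 11, 19],
    'height': [501, 502, 499, 499, 498, 503],
    'name': ['aa', 'bb', 'bb', 'cc', 'cc', 'aa'],
})

# Encode each name as an integer code; uniques keeps the names in code order.
codes, uniques = pd.factorize(df['name'])

fig, axes = plt.subplots(2, 2, figsize=(12, 8))
for ax, prop in zip(axes.ravel(), ['age', 'weight', 'score', 'height']):
    scatter = ax.scatter(df.index, df[prop], c=codes, cmap='viridis')
    # legend_elements() yields one handle per distinct code, sorted by code,
    # so the handles line up with the names stored in uniques.
    handles, _ = scatter.legend_elements()
    ax.legend(handles, list(uniques), title='name')
    ax.set_title(prop)
fig.tight_layout()
```

Call plt.show() afterwards to display the figure. The colors come from the chosen colormap, so a different cmap gives a different palette.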
Below is the data used to create the histogram subplot charts with Plotly graph objects.
The code below creates the histogram subplot charts with Plotly graph objects.
import plotly.express as px
import plotly.graph_objects as go
from plotly.subplots import make_subplots
specs = [[{'type':'histogram'}, {'type':'histogram'},{'type':'histogram'}]]
fig = make_subplots(rows=1, cols=3, specs=specs, subplot_titles=['<b> Millenials </b>',
                                                                  '<b> Generation X </b>',
                                                                  '<b> Boomers </b>'])
fig.add_trace(go.Histogram(
    x=df[df['Generation']=='Millenials']['NumCompaniesWorked'],
    opacity = 0.5,
    marker_color = ['#455f66'] * 15
),1,1)
fig.add_trace(go.Histogram(
    x=df[df['Generation']=='Generation X']['NumCompaniesWorked'],
    opacity = 0.5,
    marker_color = ['#455f66'] * 15
),1,2)
fig.add_trace(go.Histogram(
    x=df[df['Generation']=='Boomers']['NumCompaniesWorked'],
    opacity = 0.5,
    marker_color = ['#455f66'] * 15
),1,3)
fig.update_layout(
    showlegend=False,
    title=dict(text="<b> Histogram - <br> <span style='color: #f55142'> How to add the box plot and mean vertical line on each diagram </span></b> ",
               font=dict(
                   family="Arial",
                   size=20,
                   color='#283747')
    ))
fig.show()
Below is the output I get from the above code.
How can I add a vertical line for the mean (average) to each histogram? The mean values are:
Millenials = 2.2
Generation X = 3.4
Boomers = 4.1
I would also like a box plot above each of the three histograms.
The result should look like the diagram shown below for all three histograms.
import pandas as pd
import numpy as np
import plotly.express as px
#original df
df = pd.DataFrame({'NumCompaniesWorked':list(range(10)),
                   'Millenials':[139,407,54,57,55,32,35,28,17,24],
                   'Generation X':[53,108,83,90,70,27,32,40,26,24],
                   'Boomers':[5,6,9,12,14,4,3,6,6,4]})
#reorganizing df
dfs = []
for col in ['Millenials', 'Generation X', 'Boomers']:
    dfs.append(df[['NumCompaniesWorked', col]].rename(columns={col:'count'}).assign(Generation=col))
df = pd.concat(dfs)
#output
NumCompaniesWorked count Generation
0 0 139 Millenials
1 1 407 Millenials
2 2 54 Millenials
3 3 57 Millenials
4 4 55 Millenials
5 5 32 Millenials
6 6 35 Millenials
7 7 28 Millenials
8 8 17 Millenials
9 9 24 Millenials
0 0 53 Generation X
1 1 108 Generation X
2 2 83 Generation X
3 3 90 Generation X
4 4 70 Generation X
5 5 27 Generation X
6 6 32 Generation X
7 7 40 Generation X
8 8 26 Generation X
9 9 24 Generation X
0 0 5 Boomers
1 1 6 Boomers
2 2 9 Boomers
3 3 12 Boomers
4 4 14 Boomers
5 5 4 Boomers
6 6 3 Boomers
7 7 6 Boomers
8 8 6 Boomers
9 9 4 Boomers
fig = px.histogram(df,
                   x='NumCompaniesWorked',
                   y='count',
                   marginal='box',
                   facet_col='Generation')
fig.add_vline(x=2.2, line_width=1, line_dash='dash', line_color='gray', col=1)
fig.add_vline(x=3.4, line_width=1, line_dash='dash', line_color='gray', col=2)
fig.add_vline(x=4.1, line_width=1, line_dash='dash', line_color='gray', col=3)
fig.show()
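The three mean values are hard-coded in the add_vline calls above. They could also be derived from the counts as weighted averages. Here is a small sketch, assuming the original wide table from this answer; the names counts, generations and means are my own.

```python
import numpy as np
import pandas as pd

# The wide table from the answer: one count column per generation.
counts = pd.DataFrame({'NumCompaniesWorked': list(range(10)),
                       'Millenials': [139, 407, 54, 57, 55, 32, 35, 28, 17, 24],
                       'Generation X': [53, 108, 83, 90, 70, 27, 32, 40, 26, 24],
                       'Boomers': [5, 6, 9, 12, 14, 4, 3, 6, 6, 4]})

generations = ['Millenials', 'Generation X', 'Boomers']

# Mean number of companies worked, weighted by how many people fall in each bin.
means = {
    gen: float(np.average(counts['NumCompaniesWorked'], weights=counts[gen]))
    for gen in generations
}

for gen, mean in means.items():
    print(f"{gen}: {mean:.1f}")
```

Rounded to one decimal place, these reproduce the values given in the question (2.2, 3.4 and 4.1). Each mean could then be passed to fig.add_vline(x=mean, col=i) for i = 1, 2, 3, assuming the facet columns appear in the same order as the generations list.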
I'm saving the daily stock prices for several stocks in a pandas DataFrame. I'm using Python and a Jupyter notebook.
Once saved, I use matplotlib to graph the prices to check the data.
The idea is to graph 9 stocks at a time in a 3 x 3 subplot grid.
When I want to check other stock tickers, I have to manually change each ticker in each subplot, which takes a long time and seems inefficient.
Is there a way to do this with some sort of list and a for loop?
Here is my current code. It works, but it seems too long and hard to update. (The stock tickers are only examples from a Vanguard model portfolio.)
x = price_df.index
a = price_df["P_VOO"]
b = price_df["P_VGK"]
c = price_df["P_VPL"]
d = price_df["P_IEMG"]
e = price_df["P_MCHI"]
f = price_df["P_VNQ"]
g = price_df["P_GDX"]
h = price_df["P_BND"]
i = price_df["P_BNDX"]
# Plot a figure with various axes scales
fig = plt.figure(figsize=(15,10))
# Subplot 1
plt.subplot(331)
plt.plot(x, a)
plt.title("VOO")
plt.ylim([0,550])
plt.grid(True)
plt.subplot(332)
plt.plot(x, b)
plt.title("VGK")
plt.ylim([0,400])
plt.grid(True)
plt.subplot(333)
plt.plot(x, c)
plt.title('VPL')
plt.ylim([0,110])
plt.grid(True)
plt.subplot(334)
plt.plot(x, d)
plt.title('IEMG')
plt.ylim([0,250])
plt.grid(True)
plt.subplot(335)
plt.plot(x, e)
plt.title('MCHI')
plt.ylim([0,75])
plt.grid(True)
plt.subplot(336)
plt.plot(x, f)
plt.title('P_VNQ')
plt.ylim([0,55])
plt.grid(True)
plt.subplot(337)
plt.plot(x, g)
plt.title('P_GDX')
plt.ylim([0,8])
plt.grid(True)
plt.subplot(338)
plt.plot(x, h)
plt.title('P_BND')
plt.ylim([0,200])
plt.grid(True)
plt.subplot(339)
plt.plot(x, i)
plt.title('P_BNDX')
plt.ylim([0,350])
plt.grid(True)
plt.tight_layout()
Try with DataFrame.plot and enable subplots, set the layout and figsize:
axes = df.plot(subplots=True, title=df.columns.tolist(),
               grid=True, layout=(3, 3), figsize=(15, 10))
plt.tight_layout()
plt.show()
Or use plt.subplots to set the layout then plot on those axes with DataFrame.plot:
# setup subplots
fig, axes = plt.subplots(nrows=3, ncols=3, figsize=(15, 10))
# Plot DataFrame on axes
df.plot(subplots=True, ax=axes, title=df.columns.tolist(), grid=True)
plt.tight_layout()
plt.show()
Sample Data and imports:
import numpy as np
import pandas as pd
from matplotlib import pyplot as plt
np.random.seed(5)
df = pd.DataFrame(np.random.randint(10, 100, (10, 9)),
                  columns=list("ABCDEFGHI"))
df:
A B C D E F G H I
0 88 71 26 83 18 72 37 40 90
1 17 86 25 63 90 37 54 87 85
2 75 57 40 94 96 28 19 51 72
3 11 92 26 88 15 68 10 90 14
4 46 61 37 41 12 78 48 93 29
5 28 17 40 72 21 77 75 65 13
6 88 37 39 43 99 95 17 26 24
7 41 19 48 57 26 15 44 55 69
8 34 23 41 42 86 54 15 24 57
9 92 10 17 96 26 74 18 54 47
Would an implementation like this work in your case?
x = price_df.index
cols = ["P_VOO","P_VGK",...] #Populate before running
ylims = [[0,550],...] #Populate before running
# Plot a figure with various axes scales
fig = plt.figure(figsize=(15,10))
# One subplot per column
for i, (col, ylim) in enumerate(zip(cols, ylims)):
    plt.subplot(3, 3, i + 1)
    plt.plot(x, price_df[col])
    plt.title(col.split('_')[1])
    plt.ylim(ylim)
    plt.grid(True)
I haven't run this code locally, so it may contain minor bugs, but it should convey the general idea.
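For reference, here is a runnable variant of the list-and-loop idea using plt.subplots and the axes objects directly. It is only a sketch: the prices are random placeholder data made up for illustration, and the upper y limit is taken from each series instead of being typed in by hand, which is my own assumption about what is wanted.

```python
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

tickers = ["VOO", "VGK", "VPL", "IEMG", "MCHI", "VNQ", "GDX", "BND", "BNDX"]

# Placeholder data (NOT real prices): random walks starting near 100.
rng = np.random.default_rng(0)
dates = pd.date_range("2021-01-01", periods=60, freq="D")
price_df = pd.DataFrame(
    100 + rng.normal(0, 1, (len(dates), len(tickers))).cumsum(axis=0),
    index=dates,
    columns=[f"P_{t}" for t in tickers],
)

fig, axes = plt.subplots(3, 3, figsize=(15, 10), sharex=True)
# Pair each axes object with one ticker; changing the list changes the plots.
for ax, ticker in zip(axes.flat, tickers):
    series = price_df[f"P_{ticker}"]
    ax.plot(series.index, series)
    ax.set_title(ticker)
    ax.set_ylim(0, series.max() * 1.1)  # y range derived from the data
    ax.grid(True)
fig.tight_layout()
```

To look at a different set of stocks, only the tickers list needs to change, provided price_df has a matching "P_" column for each entry.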
I have a table like this:
data = {'Category':["Toys","Toys","Toys","Toys","Food","Food","Food","Food","Food","Food","Food","Food","Furniture","Furniture","Furniture"],
'Product':["AA","BB","CC","DD","SSS","DDD","FFF","RRR","EEE","WWW","LLLLL","PPPPPP","LPO","NHY","MKO"],
'QTY':[100,200,300,50,20,800,300,450,150,320,400,1000,150,900,1150]}
df = pd.DataFrame(data)
df
Out:
Category Product QTY
0 Toys AA 100
1 Toys BB 200
2 Toys CC 300
3 Toys DD 50
4 Food SSS 20
5 Food DDD 800
6 Food FFF 300
7 Food RRR 450
8 Food EEE 150
9 Food WWW 320
10 Food LLLLL 400
11 Food PPPPPP 1000
12 Furniture LPO 150
13 Furniture NHY 900
14 Furniture MKO 1150
So, I need to make bar subplots like this (sum of products in each category):
My problem is that I can't figure out how to combine categories, series, and aggregation.
I managed to split them into 3 subplots (one always stays blank), but I cannot combine them...
import matplotlib.pyplot as plt
fig, axarr = plt.subplots(2, 2, figsize=(12, 8))
df['Category'].value_counts().plot.bar(
    ax=axarr[0][0], fontsize=12, color='b'
)
axarr[0][0].set_title("Category", fontsize=18)
df['Product'].value_counts().plot.bar(
    ax=axarr[1][0], fontsize=12, color='b'
)
axarr[1][0].set_title("Product", fontsize=18)
df['QTY'].value_counts().plot.bar(
    ax=axarr[1][1], fontsize=12, color='b'
)
axarr[1][1].set_title("QTY", fontsize=18)
plt.subplots_adjust(hspace=.3)
plt.show()
Out
What do I need to add to combine them?
This would be a lot easier with seaborn and FacetGrid:
import pandas as pd
import seaborn as sns
data = {'Category':["Toys","Toys","Toys","Toys","Food","Food","Food","Food","Food","Food","Food","Food","Furniture","Furniture","Furniture"],
'Product':["AA","BB","CC","DD","SSS","DDD","FFF","RRR","EEE","WWW","LLLLL","PPPPPP","LPO","NHY","MKO"],
'QTY':[100,200,300,50,20,800,300,450,150,320,400,1000,150,900,1150]}
df = pd.DataFrame(data)
g = sns.FacetGrid(df, col='Category', sharex=False, sharey=False, col_wrap=2, height=3, aspect=1.5)
g.map_dataframe(sns.barplot, x='Product', y='QTY')
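If seaborn is not an option, roughly the same layout can be built with matplotlib alone by looping over the categories. This is a sketch under the assumption that one subplot per category, with the category total in the title, is what is wanted; the names categories, flat_axes and subset are my own.

```python
import matplotlib.pyplot as plt
import pandas as pd

data = {'Category': ["Toys", "Toys", "Toys", "Toys", "Food", "Food", "Food", "Food",
                     "Food", "Food", "Food", "Food", "Furniture", "Furniture", "Furniture"],
        'Product': ["AA", "BB", "CC", "DD", "SSS", "DDD", "FFF", "RRR", "EEE", "WWW",
                    "LLLLL", "PPPPPP", "LPO", "NHY", "MKO"],
        'QTY': [100, 200, 300, 50, 20, 800, 300, 450, 150, 320, 400, 1000, 150, 900, 1150]}
df = pd.DataFrame(data)

categories = df['Category'].unique()  # order of first appearance
fig, axes = plt.subplots(2, 2, figsize=(12, 8))
flat_axes = axes.ravel()

# One bar chart per category: products on x, quantities on y.
for ax, category in zip(flat_axes, categories):
    subset = df[df['Category'] == category]
    ax.bar(subset['Product'], subset['QTY'], color='b')
    ax.set_title(f"{category} (total QTY: {subset['QTY'].sum()})", fontsize=18)

# Hide any axes left over when there are fewer categories than grid cells.
for ax in flat_axes[len(categories):]:
    ax.set_visible(False)
fig.tight_layout()
```

Hiding the unused axes removes the blank fourth panel mentioned in the question.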
I would like to know how to plot a bar chart, using pandas, seaborn or matplotlib, in which the lower and upper limits of the bins are given by the values in the age_classes column of the DataFrame shown below. A sample of the DataFrame looks like this:
age_classes total_cases male_cases female_cases
0 0-9 693 381 307
1 10-19 931 475 454
2 20-29 4530 1919 2531
3 30-39 7466 3505 3885
4 40-49 13701 6480 7130
5 50-59 20975 11149 9706
6 60-69 18089 11761 6254
7 70-79 19238 12281 6868
8 80-89 16252 8553 7644
9 >90 4356 1374 2973
10 Unknown 168 84 81
If you want a chart like this:
then you can make it with sns.barplot, setting age_classes as x and one of the columns (in my case total_cases) as y, as in this code:
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
df = pd.read_csv('data.csv')
fig, ax = plt.subplots()
sns.barplot(ax = ax,
            data = df,
            x = 'age_classes',
            y = 'total_cases')
plt.show()
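The code above reads from data.csv, which is not included here. As a self-contained variation, the sample rows can be typed in directly, and pandas' own bar plot can show male and female cases side by side for each age class. This is only a sketch; plotting the two columns together is my own assumption about what might be useful, not something the question asked for.

```python
import matplotlib.pyplot as plt
import pandas as pd

# Sample rows copied from the question.
df = pd.DataFrame({
    'age_classes': ['0-9', '10-19', '20-29', '30-39', '40-49', '50-59',
                    '60-69', '70-79', '80-89', '>90', 'Unknown'],
    'total_cases': [693, 931, 4530, 7466, 13701, 20975, 18089, 19238, 16252, 4356, 168],
    'male_cases': [381, 475, 1919, 3505, 6480, 11149, 11761, 12281, 8553, 1374, 84],
    'female_cases': [307, 454, 2531, 3885, 7130, 9706, 6254, 6868, 7644, 2973, 81],
})

# age_classes is treated as a categorical x axis, so each label
# ('0-9', '10-19', ...) names one group of bars.
ax = df.plot.bar(x='age_classes', y=['male_cases', 'female_cases'],
                 rot=45, figsize=(10, 5))
ax.set_xlabel('Age class')
ax.set_ylabel('Cases')
ax.figure.tight_layout()
```

Because the age classes are strings, the bars keep the row order of the DataFrame, and no numeric bin edges need to be parsed.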
I currently have a dataframe that has as an index the years from 1990 to 2014 (25 rows). I want my plot to have the X axis with all the years showing. I'm using add_subplot as I plan to have 4 plots in this figure (all of them with the same X axis).
To create the dataframe:
import pandas as pd
import numpy as np
index = np.arange(1990,2015,1)
columns = ['Total Population','Urban Population']
pop_plot = pd.DataFrame(index=index, columns=columns)
pop_plot = pop_plot.fillna(0)
pop_plot['Total Population'] = np.arange(150,175,1)
pop_plot['Urban Population'] = np.arange(50,125,3)
Total Population Urban Population
1990 150 50
1991 151 53
1992 152 56
1993 153 59
1994 154 62
1995 155 65
1996 156 68
1997 157 71
1998 158 74
1999 159 77
2000 160 80
2001 161 83
2002 162 86
2003 163 89
2004 164 92
2005 165 95
2006 166 98
2007 167 101
2008 168 104
2009 169 107
2010 170 110
2011 171 113
2012 172 116
2013 173 119
2014 174 122
The code that I currently have:
fig = plt.figure(figsize=(10,5))
ax1 = fig.add_subplot(2,2,1, xticklabels=pop_plot.index)
plt.subplot(2, 2, 1)
plt.plot(pop_plot)
legend = plt.legend(pop_plot, bbox_to_anchor=(0.1, 1, 0.8, .45), loc=3, ncol=1, mode='expand')
legend.get_frame().set_alpha(0)
ax1.set_xticks(range(len(pop_plot.index)))
This is the plot that I get:
When I comment out the set_xticks call, I get the following plot:
#ax1.set_xticks(range(len(pop_plot.index)))
I've tried a couple of answers that I found here, but I didn't have much success.
It's not clear what ax1.set_xticks(range(len(pop_plot.index))) is meant to do. It sets the ticks to the numbers 0, 1, 2, 3, etc., while the data in your plot ranges from 1990 to 2014.
Instead, set the ticks to the x values of your data:
ax1.set_xticks(pop_plot.index)
Complete corrected example:
import matplotlib.pyplot as plt
import pandas as pd
import numpy as np
index = np.arange(1990,2015,1)
columns = ['Total Population','Urban Population']
pop_plot = pd.DataFrame(index=index, columns=columns)
pop_plot['Total Population'] = np.arange(150,175,1)
pop_plot['Urban Population'] = np.arange(50,125,3)
fig = plt.figure(figsize=(10,5))
ax1 = fig.add_subplot(2,2,1)
ax1.plot(pop_plot)
legend = ax1.legend(pop_plot, bbox_to_anchor=(0.1, 1, 0.8, .45), loc=3, ncol=1, mode='expand')
legend.get_frame().set_alpha(0)
ax1.set_xticks(pop_plot.index)
plt.show()
The easiest option is to use the xticks parameter of pandas.DataFrame.plot.
Pass the dataframe index to xticks: xticks=pop_plot.index
# given the dataframe in the OP
ax = pop_plot.plot(xticks=pop_plot.index, figsize=(15, 5))
# move the legend
ax.legend(bbox_to_anchor=(0.1, 1, 0.8, .45), loc=3, ncol=1, mode='expand', frameon=False)
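Since the question mentions four plots sharing the same X axis, the same idea extends to a 2 x 2 grid. Below is a minimal sketch, assuming the same pop_plot data; every panel plots the same two columns purely as a placeholder, and the rotated tick labels are my own addition to keep 25 year labels readable.

```python
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

years = np.arange(1990, 2015)
pop_plot = pd.DataFrame({'Total Population': np.arange(150, 175),
                         'Urban Population': np.arange(50, 125, 3)},
                        index=years)

# sharex=True links the x axes of all four panels.
fig, axes = plt.subplots(2, 2, figsize=(14, 8), sharex=True)
for ax in axes.ravel():
    # Placeholder: each panel shows the same two series.
    ax.plot(pop_plot.index, pop_plot['Total Population'], label='Total Population')
    ax.plot(pop_plot.index, pop_plot['Urban Population'], label='Urban Population')
    ax.set_xticks(pop_plot.index)               # one tick per year
    ax.tick_params(axis='x', labelrotation=90)  # avoid overlapping labels
axes[0, 0].legend(frameon=False)
fig.tight_layout()
```

With sharex=True, only the bottom row shows tick labels by default, which keeps the figure less cluttered.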