Using the code below,
import pandas as pd
import matplotlib.pyplot as plt
df = pd.read_csv("population.csv")
df.head()
df["MonthYear"] = df["Month"].map(str) + " " + df["Year"].map(str)
df["MonthYear"] = pd.to_datetime(df["MonthYear"], format="%b %Y")
x = df["MonthYear"]
y = df["Population"]
fig, axs = plt.subplots(nrows=9, ncols=2, figsize = (9,19))
for col, ax in zip(df.columns, axs.flatten()):
    ax.plot(x,y)
fig.tight_layout()
plt.show()
Can someone please help me figure out how to fix this? I've been working on it for days and still can't figure it out.
Below:
create a datetime column and set it as index
split your dataset according to different possible values for "Region"
-> there is one subplot per Region
EDIT: with real dataset
EDIT: the author of the question has removed key information from their question and deleted their comments. So, to fully understand this answer:
the dataset is from here
to remove the last (empty) subplot, add fig.delaxes(axs.flat[-1])
import pandas as pd
import matplotlib.pyplot as plt
df = pd.read_csv('denguecases.csv')
df['Date'] = pd.to_datetime(df.apply(lambda row: row.Month + ' ' + str(row.Year), axis=1))
df.set_index('Date', inplace=True)
fig, axs = plt.subplots(nrows=9, ncols=2, figsize = (9,19))
for region, ax in zip(df.Region.unique(), axs.flat):
    # @region refers to the local Python variable inside the query string
    ax.plot(df.query('Region == @region').Dengue_Cases)
    ax.tick_params(axis='x', labelrotation = 45)
    ax.set_title(region)
fig.tight_layout()
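The same idea can be sketched without the real CSV. The data below is made up (three hypothetical regions, twelve months), and `groupby` is swapped in for `query` as one possible way to split by region. The leftover axes are removed with `fig.delaxes`, generalized to any number of empty subplots:

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

# Synthetic stand-in for the real CSV: 3 regions x 12 months (values are made up).
dates = pd.date_range("2008-01-01", periods=12, freq="MS")
rng = np.random.default_rng(0)
df = pd.DataFrame({
    "Date": np.tile(dates, 3),
    "Region": np.repeat(["Region.I", "Region.II", "Region.III"], len(dates)),
    "Dengue_Cases": rng.random(3 * len(dates)),
})

fig, axs = plt.subplots(nrows=2, ncols=2, figsize=(9, 6))

# groupby yields (region name, sub-frame) pairs, one per subplot.
for (region, group), ax in zip(df.groupby("Region"), axs.flat):
    ax.plot(group["Date"], group["Dengue_Cases"])
    ax.tick_params(axis="x", labelrotation=45)
    ax.set_title(region)

# Remove every axes that did not receive a region (here: the fourth one).
n_regions = df["Region"].nunique()
for ax in axs.flat[n_regions:]:
    fig.delaxes(ax)

fig.tight_layout()
```

Call `plt.show()` or `fig.savefig(...)` afterwards to view the result.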
Try this instead:
for ax in axs.flatten():
    ax.plot(x,y)
But this of course will plot the same plot in all the subplots. I am not sure if you have data for each subplot or you are expecting the same data for all plots.
Update:
Let's say you have n columns and you want to make n subplots:
x = df["MonthYear"]
column_names = df.columns
n = len(column_names)
fig, axs = plt.subplots(nrows=9, ncols=2, figsize = (9,19))
for i in range(n):
    y = df[column_names[i]]
    axs.flatten()[i].plot(x,y)
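A variant of that loop, assuming only the numeric columns should be plotted and the grid should be sized from the column count rather than hard-coded. The frame and its column names `A`, `B`, `C` are hypothetical:

```python
import math
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

# Made-up frame: a date column plus three numeric columns (names are hypothetical).
df = pd.DataFrame({
    "MonthYear": pd.date_range("2008-01-01", periods=6, freq="MS"),
    "A": np.arange(6.0),
    "B": np.arange(6.0) ** 2,
    "C": np.arange(6.0)[::-1],
})

x = df["MonthYear"]
# Keep only numeric columns so the date column is not plotted against itself.
numeric_cols = df.select_dtypes("number").columns
ncols = 2
nrows = math.ceil(len(numeric_cols) / ncols)

# squeeze=False guarantees a 2-D array of axes even for a single row.
fig, axs = plt.subplots(nrows=nrows, ncols=ncols, figsize=(9, 3 * nrows), squeeze=False)
for col, ax in zip(numeric_cols, axs.flat):
    ax.plot(x, df[col])
    ax.set_title(col)
fig.tight_layout()
```

Because `zip` stops at the shorter sequence, this cannot index past the last subplot the way `range(n)` can when there are more columns than axes.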
Related
I am trying to put a y-axis on the right and left side of a graph. I am using pandas where I have a data frame take a certain range in an excel sheet and graph it out. The code is able to plot out the three columns that I want vs y however I'm confused on how to get the PM3 scatter plot (ax2) on the right side while keeping the PM1 and AFS scatter plot (ax1 and ax3) on the left. I tried using twinx() and other commands but it doesn't work how I want it. Any suggestions?
import pandas as pd
import matplotlib.pyplot as plt
import numpy as np
testproject = r"C:\Users\223070186\Documents\PleaseWork.xlsx"
var = pd.read_excel(testproject, sheet_name ="Test1")
df = pd.DataFrame(var, columns = ["Time", "PM1", "PM3", "AFS"])
df2 = df.iloc[1108:1142, 0:4]
ax1 = df2.plot(kind = "scatter", x = "Time", y = "PM1", color = "r")
ax2 = df2.plot(kind = "scatter", x="Time", y = "PM3", color = "purple", ax =ax1)
ax3 = df2.plot(kind = "scatter", x = "Time", y= "AFS", color = "orange", ax = ax2)
plt.xlabel("Time")
plt.ylabel("PM1, PM3, AFS")
plt.title("Time vs PM1, PM3, AFS splits")
plt.show(ax1 == ax2 == ax3)
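One possible way to get the left/right layout described above is to create a second y-axis with `twinx()` and plot directly on the two axes objects. The sketch below uses made-up numbers in place of the Excel range; only the column names come from the question:

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

# Made-up data standing in for the Excel range; only the column names are from the question.
rng = np.random.default_rng(1)
df2 = pd.DataFrame({
    "Time": np.arange(34),
    "PM1": rng.random(34),
    "PM3": rng.random(34) * 100,
    "AFS": rng.random(34),
})

fig, ax_left = plt.subplots()
ax_right = ax_left.twinx()  # second y-axis sharing the same x-axis

# PM1 and AFS share the left axis.
ax_left.scatter(df2["Time"], df2["PM1"], color="r", label="PM1")
ax_left.scatter(df2["Time"], df2["AFS"], color="orange", label="AFS")
# PM3 goes on the right axis.
ax_right.scatter(df2["Time"], df2["PM3"], color="purple", label="PM3")

ax_left.set_xlabel("Time")
ax_left.set_ylabel("PM1, AFS")
ax_right.set_ylabel("PM3")
ax_left.set_title("Time vs PM1, PM3, AFS splits")

# Merge the legend entries from both axes into one legend.
handles_l, labels_l = ax_left.get_legend_handles_labels()
handles_r, labels_r = ax_right.get_legend_handles_labels()
ax_left.legend(handles_l + handles_r, labels_l + labels_r)
```

Call `plt.show()` with no arguments to display the figure.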
import pandas as pd
import matplotlib.pyplot as plt
df = pd.read_csv("population.csv")
fig, axs = plt.subplots(nrows=2, ncols=2)
for col, ax in zip(df.columns, axs.flatten()):
    ax.plot(x,y)
    ax.set_title(col)
plt.subplots_adjust(wspace=.5, hspace=.5)
fig.tight_layout()
plt.show()
The code above results in this:
https://i.stack.imgur.com/vbCnI.png
You need to change the subplot grid: fig, axs = plt.subplots(nrows=1, ncols=2)
Your problem is not the axis iteration. You are drawing a continuous line through points that are not ordered along the x-axis, so the line keeps jumping left and right, which adds a lot of noise to the visualization.
Try:
fig, axs = plt.subplots(nrows=2, ncols=2)
for col, ax in zip(df.columns, axs.flatten()):
    x_order = x.argsort()
    ax.plot(x[x_order],y[x_order])
    ax.set_title(col)
plt.subplots_adjust(wspace=.5, hspace=.5)
It seems to work in my environment when reproducing it on your sample:
import matplotlib.pyplot as plt
import pandas as pd
s = """Month,Year,Region,Population
Jan,2008,Region.V,2.953926
Feb,2008,Region.V,2.183336
Jan,2009,Region.V,5.23598
Feb,2009,Region.V,3.719351
Jan,2008,Region.VI,3.232928
Feb,2008,Region.VI,2.297784
Jan,2009,Region.VI,6.231395
Feb,2009,Region.VI,7.493449"""
data = [l.split(',') for l in s.splitlines() if l]
df = pd.DataFrame(data[1:], columns=data[0])
df['Population'] = df['Population'].astype(float)
df["MonthYear"] = df["Month"].map(str) + " " + df["Year"].map(str)
df["MonthYear"] = pd.to_datetime(df["MonthYear"], format="%b %Y")
x = df["MonthYear"]
y = df['Population']
fig, axs = plt.subplots(nrows=2, ncols=2)
for col, ax in zip(df.columns, axs.flatten()):
    x_order = x.argsort()
    ax.plot(x[x_order],y[x_order])
    ax.set_title(col)
plt.subplots_adjust(wspace=.5, hspace=.5)
fig.tight_layout()
plt.show()
which produces
I want to plot normalized count grouped values with seaborn. At first, I tried doing the following:
fig, ax = plt.subplots(figsize=(10, 6))
ax = sns.histplot(
    data = df,
    x = 'age_bins',
    hue = 'Showup',
    multiple="dodge",
    stat = 'count',
    shrink = 0.4,
)
[Figure: original count]
Now I want to normalize each bar relative to the overall 'bin' count. The only way I succeeded in doing so was this:
fig, ax = plt.subplots(figsize=(10, 6))
ax = sns.histplot(
    data = df,
    x = 'age_bins',
    hue = 'Showup',
    multiple="fill",
    stat = 'count',
    shrink = 0.4,
)
[Figure: multiple = 'fill']
This gives me the values I wanted, but is there any way to plot the same results with the bars dodged beside each other instead of stacked on top of each other?
You can group by ages and "showup", count them, then change "showup" to individual columns. Then divide each row by the row total and create a bar plot via pandas:
import matplotlib.pyplot as plt
from matplotlib.ticker import PercentFormatter
import seaborn as sns
import pandas as pd
import numpy as np
ages = ['<10', '<20', '<30', '<40', '<50', '<60', '<70', '++70']
df = pd.DataFrame({'age_bins': np.random.choice(ages, 10000),
                   'Showup': np.random.choice([True, False], 10000, p=[0.88, 0.12])})
df_counts = df.groupby(['age_bins', 'Showup']).size().unstack().reindex(ages)
df_percentages = df_counts.div(df_counts.sum(axis=1), axis=0) * 100
sns.set() # set default seaborn style
fig, ax = plt.subplots(figsize=(10, 6))
df_percentages.plot.bar(rot=0, ax=ax)
ax.set_xlabel('')
ax.set_ylabel('Percentage per age group')
ax.yaxis.set_major_formatter(PercentFormatter(100))
plt.tight_layout()
plt.show()
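As a possible shortcut for the group-count-divide steps, `pd.crosstab` with `normalize='index'` computes the per-row shares in one call. This is a sketch of an alternative, not the method used above, and it reuses the same made-up data:

```python
import numpy as np
import pandas as pd

ages = ['<10', '<20', '<30', '<40', '<50', '<60', '<70', '++70']
rng = np.random.default_rng(0)
df = pd.DataFrame({
    'age_bins': rng.choice(ages, 10000),
    'Showup': rng.choice([True, False], 10000, p=[0.88, 0.12]),
})

# normalize='index' divides each row by its row total, so every age bin sums to 1.
df_percentages = (
    pd.crosstab(df['age_bins'], df['Showup'], normalize='index')
    .reindex(ages)  # restore the intended bin order
    * 100
)
print(df_percentages.round(1))
```

The resulting frame can be passed to `df_percentages.plot.bar(rot=0, ax=ax)` in the same way as the one built with `groupby`.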
I'm trying to create multiple charts and save them as one image. I managed to combine multiple charts, but a couple of things are going wrong. For some reason I could not set titles for all the charts, only for the last one. Also, the numbers are not shown in full as they are on the last chart. I also want to change the colors of the line (white), labels (white), and background (black), and rotate the dates so they are easier to read.
dataSet = {"info":[{"title":{"Value":[list of data]}},{"title":{"Value":[list of data]}},
...]}
fig, ax = plt.subplots(2, 3, sharex=False, sharey=False, figsize=(22, 10), dpi=70,
                       linewidth=0.5)
ax = np.array(ax).flatten()
for i, data in enumerate(dataSet['info']):
    for key in data:
        df: DataFrame = pd.DataFrame.from_dict(data[key]).fillna(method="backfill")
        df['Date'] = pd.to_datetime(df['Date'], unit='ms')
        df.index = pd.DatetimeIndex(df['Date'])
        x = df['Date']
        y = df['Value']
        ax[i].plot(x, y)
        current_values = plt.gca().get_yticks()
        plt.gca().set_yticklabels(['{:,.0f}'.format(x) for x in current_values])
        plt.title(key)
plt.show()
Your figure consists of several axes objects. To set the title for each plot, you need to use the corresponding axes object, which provides the methods for changing that plot's appearance.
See for example:
import matplotlib.pyplot as plt
import numpy as np
fig, axarr = plt.subplots(2, 2)
titles = list("abcd")
for ax, title in zip(axarr.ravel(), titles):
    x = np.arange(10)
    y = np.random.random(10)
    ax.plot(x, y, color='white')
    ax.set_title(title)
    ax.set_facecolor((0, 0, 0))
fig.tight_layout()
To change the labels, show the legend, or change the background, I would recommend reading the documentation.
For the dates, you can rotate the labels or use fig.autofmt_xdate().
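Putting those pieces together, a rough sketch of the styling asked about might look like the following. The six series and their titles are hypothetical stand-ins for `dataSet`, and `StrMethodFormatter` is one possible way to show full numbers with thousands separators:

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from matplotlib.ticker import StrMethodFormatter

# Hypothetical stand-in data: six titled series of large values indexed by date.
rng = np.random.default_rng(0)
dates = pd.date_range("2021-01-01", periods=30, freq="D")
series = {f"chart {i}": rng.integers(1_000_000, 5_000_000, len(dates)) for i in range(6)}

fig, axarr = plt.subplots(2, 3, figsize=(22, 10), facecolor="black")

for ax, (title, values) in zip(axarr.ravel(), series.items()):
    ax.plot(dates, values, color="white")
    ax.set_title(title, color="white")  # set the title on this axes object
    ax.set_facecolor("black")
    ax.tick_params(colors="white")      # tick marks and tick labels
    # Rotate the date labels per axes so every row keeps its own labels.
    ax.tick_params(axis="x", labelrotation=45)
    for spine in ax.spines.values():
        spine.set_edgecolor("white")
    # Full numbers with thousands separators on the y-axis.
    ax.yaxis.set_major_formatter(StrMethodFormatter("{x:,.0f}"))

fig.tight_layout()
```

Setting the formatter on each axes avoids `plt.gca()`, which only refers to the current (usually last) axes.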
Could someone give me a tip on how to do multiple Y axis plots?
Below is some made-up data. How could I put Temperature on its own Y axis, Pressure on its own Y axis, and then have both Value1 and Value2 on the same Y axis? I am going for the same look and feel as this SO post answer. Thanks for any tips. I don't understand the ax3 = ax.twinx() process: do I need to define an ax.twinx() for each separate Y axis I need?
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
rows,cols = 8760,4
data = np.random.rand(rows,cols)
tidx = pd.date_range('2019-01-01', periods=rows, freq='H')
df = pd.DataFrame(data, columns=['Temperature','Value1','Pressure','Value2'], index=tidx)
# using subplots() function
fig, ax = plt.subplots(figsize=(25,8))
plt.title('Multy Y Plot')
ax2 = ax.twinx()
ax3 = ax.twinx()
ax4 = ax.twinx()
plot1, = ax.plot(df.index, df.Temperature)
plot2, = ax2.plot(df.index, df.Value1, color = 'r')
plot3, = ax3.plot(df.index, df.Pressure, color = 'g')
plot4, = ax4.plot(df.index, df.Value2, color = 'b')
ax.set_xlabel('Date')
ax.set_ylabel('Temperature')
ax2.set_ylabel('Value1')
ax3.set_ylabel('Pressure')
ax4.set_ylabel('Value2')
plt.legend([plot1,plot2,plot3,plot4],list(df.columns))
# defining display layout
plt.tight_layout()
# show plot
plt.show()
This will output everything jumbled up on the same side without separate Y axis for Pressure, Value1, and Value2.
You are adding 4 different plots in one, which is not helpful. I would recommend breaking it into 2 plots w/ shared x-axis "Date":
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
rows,cols = 8760,4
data = np.random.rand(rows,cols)
tidx = pd.date_range('2019-01-01', periods=rows, freq='H')
df = pd.DataFrame(data, columns=['Temperature','Value1','Pressure','Value2'], index=tidx)
fig, (ax1, ax2) = plt.subplots(2, 1, figsize=(25,8))
plt.title('Multy Y Plot')
ax1b = ax1.twinx()
plot1a, = ax1.plot(df.index, df.Temperature)
plot1b, = ax1b.plot(df.index, df.Pressure, color='r')
ax1.set_ylabel('Temperature')
ax1b.set_ylabel('Pressure')
ax2b = ax2.twinx()
plot2a, = ax2.plot(df.index, df.Value1, color='k')
plot2b, = ax2b.plot(df.index, df.Value2, color='g')
ax2.set_xlabel('Date')
ax2.set_ylabel('Value1')
ax2b.set_ylabel('Value2')
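# labels listed explicitly so they match the handle order (df.columns is in a different order)
plt.legend([plot1a, plot1b, plot2a, plot2b], ['Temperature', 'Pressure', 'Value1', 'Value2'])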
# defining display layout
plt.tight_layout()
# show plot
plt.show()
Here I have put Temperature and Pressure in the first plot (on the top) and Value 1 and Value 2 in the second plot (on the bottom). Normally, we put things in the same plot when it makes sense to compare them on the same x-axis. Pressure and Temperature are a valid combination, which is why I combined those two. But you can do as you wish.
The answer below, which uses mpatches for the legend, shows how to make the subplot with Value1 and Value2 on the same axis. The solution for this post puts Value1 and Value2 on different axes. Thanks for the help @tzinie!
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import matplotlib.patches as mpatches
rows,cols = 8760,4
data = np.random.rand(rows,cols)
tidx = pd.date_range('2019-01-01', periods=rows, freq='H')
df = pd.DataFrame(data, columns=['Temperature','Value1','Pressure','Value2'], index=tidx)
fig, (ax1, ax2) = plt.subplots(2, 1, figsize=(25,8))
plt.title('Multy Y Plot')
ax1b = ax1.twinx()
plot1a, = ax1.plot(df.index, df.Temperature, color='r') # red
plot1b, = ax1b.plot(df.index, df.Pressure, color='b') # blue
ax1.set_ylabel('Temperature')
ax1b.set_ylabel('Pressure')
ax2.plot(df.index, df.Value1, color='k') # black
ax2.plot(df.index, df.Value2, color='g') # green
ax2.set_xlabel('Date')
ax2.set_ylabel('Value1 & Value2')
red_patch = mpatches.Patch(color='red', label='Temperature')
blue_patch = mpatches.Patch(color='blue', label='Pressure')
green_patch = mpatches.Patch(color='green', label='Value2')
black_patch = mpatches.Patch(color='black', label='Value1')
plt.legend(handles=[red_patch,blue_patch,green_patch,black_patch])
# defining display layout
#plt.tight_layout()
# show plot
plt.show()
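For the single-plot layout asked about in the question (Temperature and Pressure each on their own y-axis, Value1 and Value2 sharing a third), one common pattern is to call `twinx()` once per extra axis and move the third axis's right spine outward. A rough sketch, assuming a shorter daily index than the question's hourly one and an arbitrary offset of 1.12:

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

# Shorter, daily made-up data (an assumption; the question uses 8760 hourly rows).
rows, cols = 240, 4
data = np.random.rand(rows, cols)
tidx = pd.date_range('2019-01-01', periods=rows, freq='D')
df = pd.DataFrame(data, columns=['Temperature', 'Value1', 'Pressure', 'Value2'], index=tidx)

fig, ax = plt.subplots(figsize=(12, 5))
fig.subplots_adjust(right=0.8)  # leave room for the extra axis on the right

ax_pressure = ax.twinx()  # one twinx() per additional y-axis
ax_values = ax.twinx()
# Shift the third axis outward so it does not sit on top of the second one.
ax_values.spines['right'].set_position(('axes', 1.12))

p1, = ax.plot(df.index, df.Temperature, color='r', label='Temperature')
p2, = ax_pressure.plot(df.index, df.Pressure, color='b', label='Pressure')
# Value1 and Value2 share the same (third) y-axis.
p3, = ax_values.plot(df.index, df.Value1, color='k', label='Value1')
p4, = ax_values.plot(df.index, df.Value2, color='g', label='Value2')

ax.set_xlabel('Date')
ax.set_ylabel('Temperature')
ax_pressure.set_ylabel('Pressure')
ax_values.set_ylabel('Value1 & Value2')
# Line handles carry their own labels, so no separate patches are needed.
ax.legend(handles=[p1, p2, p3, p4])
```

So a `twinx()` call is needed only for each additional y-axis, not for each series: two series that belong together are simply plotted on the same axes object.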