Could not plot multiple horizontal bars side by side - python

I would like to plot multiple horizontal bar charts sharing the same y-axis. To elaborate, I have 4 dataframes, each representing a bar chart. I want to use these dataframes to plot 2 horizontal bar charts on the left and another 2 on the right. Right now, I am only able to display one horizontal bar chart on each side. Below are my desired output, code, and error.
import pandas as pd
import matplotlib.pyplot as plt
data1 = {
'age': ['20-24 Years', '25-29 Years', '30-34 Years', '35-39 Years', '40-44 Years', '45-49 Years'],
'single_value': [97, 75, 35, 19, 15, 13]
}
data2 = {
'age': ['20-24 Years', '25-29 Years', '30-34 Years', '35-39 Years', '40-44 Years', '45-49 Years'],
'single_value': [98, 79, 38, 16, 15, 13]
}
data3 = {
'age': ['20-24 Years', '25-29 Years', '30-34 Years', '35-39 Years', '40-44 Years', '45-49 Years'],
'single_value': [89, 52, 22, 16, 12, 13]
}
data4 = {
'age': ['20-24 Years', '25-29 Years', '30-34 Years', '35-39 Years', '40-44 Years', '45-49 Years'],
'single_value': [95, 64, 27, 18, 15, 13]
}
df_male_1 = pd.DataFrame(data1)
df_male_2 = pd.DataFrame(data2)
df_female_1 = pd.DataFrame(data3)
df_female_2 = pd.DataFrame(data4)
fig, axes = plt.subplots(ncols=2, sharey=True, figsize=(12,12))
axes[0].barh(df_male_1['age'], df_male_1['single_value'], align='center',
color='red', zorder=10)
axes[0].barh(df_male_2['age'], df_male_2['single_value'], align='center',
color='blue', zorder=10)
axes[0].set(title='Age Group (Male)')
axes[1].barh(df_female_1['age'], df_female_1['single_value'],
align='center', color='pink', zorder=10)
axes[1].barh(df_female_2['age'], df_female_2['single_value'],
align='center', color='purple', zorder=10)
axes[1].set(title='Age Group (Female)')
axes[0].invert_xaxis()
axes[0].set(yticks=df_male_1['age'])
axes[0].yaxis.tick_right()
for ax in axes.flat:
    ax.margins(0.09)
    ax.grid(True)
fig.tight_layout()
fig.subplots_adjust(wspace=0.09)
plt.show()
Error output

The problem is that your bars currently overlap each other because they are center-aligned by default. To get the desired figure, align them at their edges instead. To place them adjacent to each other, give one series a positive height and the other a negative height (for barh, height is the thickness of each bar along the y-axis), so that one bar extends above the tick position and the other below it. You can choose the height value to suit your needs.
The modified code follows (only the relevant part is shown):
fig, axes = plt.subplots(ncols=2, sharey=True, figsize=(12,12))
axes[0].barh(df_male_1['age'], df_male_1['single_value'], align='edge', height=0.3,
color='red', zorder=10)
axes[0].barh(df_male_2['age'], df_male_2['single_value'], align='edge', height=-0.3,
color='blue', zorder=10)
axes[0].set(title='Age Group (Male)')
axes[1].barh(df_female_1['age'], df_female_1['single_value'], align='edge',height=0.3,
color='pink', zorder=10)
axes[1].barh(df_female_2['age'], df_female_2['single_value'], align='edge', height=-0.3,
color='purple', zorder=10)
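The same side-by-side layout can also be sketched with numeric y positions and explicit offsets instead of align='edge'. This is an alternative technique, not the code from the answer above: the values are the question's data, while the names y and bar_h, the bar thickness of 0.3, the legend labels, and the headless Agg backend are assumptions made for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend so the sketch runs anywhere
import matplotlib.pyplot as plt
import numpy as np

ages = ['20-24 Years', '25-29 Years', '30-34 Years',
        '35-39 Years', '40-44 Years', '45-49 Years']
male_1 = [97, 75, 35, 19, 15, 13]
male_2 = [98, 79, 38, 16, 15, 13]
female_1 = [89, 52, 22, 16, 12, 13]
female_2 = [95, 64, 27, 18, 15, 13]

y = np.arange(len(ages))  # one numeric slot per age group
bar_h = 0.3               # bar thickness (arbitrary choice)

fig, axes = plt.subplots(ncols=2, sharey=True, figsize=(12, 6))

# Shift each series half a bar above or below the tick so the pair straddles it.
axes[0].barh(y + bar_h / 2, male_1, height=bar_h, color='red', label='male 1')
axes[0].barh(y - bar_h / 2, male_2, height=bar_h, color='blue', label='male 2')
axes[0].set(title='Age Group (Male)')
axes[0].invert_xaxis()    # male bars grow towards the left
axes[0].legend()

axes[1].barh(y + bar_h / 2, female_1, height=bar_h, color='pink', label='female 1')
axes[1].barh(y - bar_h / 2, female_2, height=bar_h, color='purple', label='female 2')
axes[1].set(title='Age Group (Female)')
axes[1].legend()

# The y-axis is shared, so labelling the ticks once is enough.
axes[0].set_yticks(y)
axes[0].set_yticklabels(ages)

fig.tight_layout()
# Call plt.show() here when using an interactive backend.
```

With numeric positions the offsets are explicit, which makes it easier to add a third or fourth series per group later.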

Related

When I run the code, graph axes appear under the table

Consider:
import matplotlib as mpl
import matplotlib.patches as patches
from matplotlib import pyplot as plt
import datetime
# First, we'll create a new figure and axis object
fig, ax = plt.subplots(figsize=(8, 6))
# Set the number of rows and cols for our table
rows = 10
cols = 6
# Create a coordinate system based on the number of rows/columns
# Adding a bit of padding on bottom (-1), top (1), right (0.5)
ax.set_ylim(-1, rows + 1)
ax.set_xlim(0, cols + .5)
x = datetime.datetime.now()
e = datetime.datetime.now()
# Sample data
data = [
{'id': 'player10', 'Price %': 125.658, 'Vol %': 255.489, 'goals': 125.859},
{'id': 'player9', 'Price %': 2, 'Vol %': 72, 'goals': 0},
{'id': 'player8', 'Price %': 3, 'Vol %': 47, 'goals': 0},
{'id': 'player7', 'Price %': 4, 'Vol %': 99, 'goals': 0},
{'id': 'player6', 'Price %': 5, 'Vol %': 84, 'goals': 1},
{'id': 'player5', 'Price %': 6, 'Vol %': 56, 'goals': 2},
{'id': 'player4', 'Price %': 7, 'Vol %': 67, 'goals': 0},
{'id': 'player3', 'Price %': 8, 'Vol %': 91, 'goals': 1},
{'id': 'player2', 'Price %': 9, 'Vol %': 75, 'goals': 3},
{'id': 'player1', 'Price %': 10, 'Vol %': 70, 'goals': 4}
]
for row in range(rows):
    d = data[row]
    ax.text(x=.5, y=row, s=d['id'], va='center', ha='left')
    ax.text(x=2.5, y=row, s=d['Price %'], va='center', ha='right')
    ax.text(x=3.5, y=row, s=d['Vol %'], va='center', ha='right')
    ax.text(x=4.5, y=row, s=d['goals'], va='center', ha='right')
ax.text(.5, 9.75, '', weight='bold', ha='left')
ax.text(2.5, 9.75, 'Price %', weight='bold', ha='right')
ax.text(3.5, 9.75, 'Vol %', weight='bold', ha='right')
ax.text(4.5, 9.75, 'Goals', weight='bold', ha='right')
for row in range(rows):
    ax.plot(
        [0, cols + 1],
        [row -.5, row - .5],
        ls=':',
        lw='.5',
        c='grey'
    )
ax.plot([0, cols + 1], [9.5, 9.5], lw='.5', c='black')
ax.set_title(
" test1 %s/%s/%s" % (e.day, e.month, e.year),
loc='left',
fontsize=15,
weight='bold',
color='r'
)
plt.show()
When I run the Python code, the chart axes appear under the table. How can I remove them?
I expected the axes not to be drawn. How can I fix this?
I don't think there is an error anywhere, but I suspect I have left something out. I have not run into this problem with other plots, and I do not know what the fix is for this code.
Try this (matplotlib.pyplot.axis with the 'off' option):
plt.axis('off')
plt.show()
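When the Axes object is already at hand, as with ax in the question, the same result can be requested on that object directly. Below is a minimal sketch with a much smaller stand-in table; the two text cells and the headless Agg backend are assumptions for illustration only.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend so the sketch runs anywhere
import matplotlib.pyplot as plt

fig, ax = plt.subplots(figsize=(4, 2))
ax.set_xlim(0, 3)
ax.set_ylim(-1, 2)

# A one-row stand-in for the table drawn in the question.
ax.text(0.5, 1, 'player1', va='center', ha='left')
ax.text(2.5, 1, '10', va='center', ha='right')
ax.plot([0, 3], [0.5, 0.5], ls=':', lw=0.5, c='grey')

# Hide the frame, ticks and tick labels of this Axes only.
ax.set_axis_off()
# Call plt.show() here when using an interactive backend.
```

Working on ax rather than through plt keeps the change limited to one Axes when a figure holds several.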

How to stack only selected columns in pandas barh plot

I am trying to plot a bar chart where I would like to have two bars, one stacked and another one not stacked by the side of the stacked one.
I have the first plot which is a stacked plot:
And another plot, with the same lines and columns:
I want to plot it side by side to the columns of the last plot, and not stack it:
This is a code snippet to replicate my problem:
import pandas as pd
import matplotlib.pyplot as plt
d = pd.DataFrame({'DC': {'col0': 257334.0,
'col1': 0.0,
'col2': 0.0,
'col3': 186146.0,
'col4': 0.0,
'col5': 366431.0,
'col6': 461.0,
'col7': 0.0,
'col8': 0.0},
'DC - IDC': {'col0': 32665.0,
'col1': 0.0,
'col2': 156598.0,
'col3': 0.0,
'col4': 176170.0,
'col5': 0.0,
'col6': 0.0,
'col7': 0.0,
'col8': 0.0},
'No Address': {'col0': 292442.0,
'col1': 227.0,
'col2': 298513.0,
'col3': 117167.0,
'col4': 249.0,
'col5': 747753.0,
'col6': 271976.0,
'col7': 9640.0,
'col8': 211410.0}})
d[['DC', 'DC - IDC']].plot.barh(stacked=True)
d[['No Address']].plot.barh( stacked=False, color='red')
Use position parameter to draw 2 columns on the same index:
fig, ax = plt.subplots()
d[['DC', 'DC - IDC']].plot.barh(width=0.4, position=0, stacked=True, ax=ax)
d[['No Address']].plot.barh(width=0.4, position=1, stacked=True, ax=ax, color='red')
plt.show()
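In the pandas documentation, position is described as the relative alignment of the bars, from 0 (left/bottom end) to 1 (right/top end), with 0.5 (center) as the default. The sketch below prints where the first bar of each column ends up so the effect can be inspected. It uses only three rows of the question's data to keep the output short, and the headless Agg backend is an assumption.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend so the sketch runs anywhere
import matplotlib.pyplot as plt
import pandas as pd

# Three rows taken from the question's DataFrame.
d = pd.DataFrame({
    'DC': [257334.0, 0.0, 186146.0],
    'DC - IDC': [32665.0, 156598.0, 0.0],
    'No Address': [292442.0, 298513.0, 117167.0],
}, index=['col0', 'col2', 'col3'])

fig, ax = plt.subplots()
d[['DC', 'DC - IDC']].plot.barh(width=0.4, position=0, stacked=True, ax=ax)
d[['No Address']].plot.barh(width=0.4, position=1, stacked=True, ax=ax, color='red')

# Print the vertical extent of the first bar in each column.
for container in ax.containers:
    first = container[0]
    bottom = first.get_y()
    top = bottom + first.get_height()
    print(f"{container.get_label()}: y from {bottom:.2f} to {top:.2f}")
# Call plt.show() here when using an interactive backend.
```

The two stacked columns should report the same vertical extent, and the red column a different one next to it.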
You can also achieve this using only the matplotlib.pyplot library. First, import NumPy and matplotlib.
import matplotlib.pyplot as plt
import numpy as np
Then,
plt.figure(figsize=(15,8))
plt.barh(d.index, d['DC'], 0.4, label='DC', align='edge')
# left=d['DC'] starts each 'DC - IDC' bar where the 'DC' bar ends, so the two stack
plt.barh(d.index, d['DC - IDC'], 0.4, left=d['DC'], label='DC - IDC', align='edge')
plt.barh(np.arange(len(d.index))-0.4, d['No Address'], 0.4, color='red', label='No Address', align='edge')
plt.legend();
Here is what I did:
Increase the figure size (optional)
Create a BarContainer for each column
Decrease the width of each bar to 0.4 to make them fit
Align the bottom edges of the bars with the y positions (align='edge')
At this point all bars would sit at the same y positions. To put the red bars to the side, subtract the bar width (0.4) from each y coordinate: np.arange(len(d.index))-0.4
Finally, add a legend
It should look like this:

How to set x-axis ticks to show for each datetime value in plotly.py express scatter plot?

In the docs for plotly.py tick formatting here, it states that you can set the tickmode to array and just specify the tickvals and ticktext e.g.
import plotly.graph_objects as go
fig = go.Figure(go.Scatter(
x = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12],
y = [28.8, 28.5, 37, 56.8, 69.7, 79.7, 78.5, 77.8, 74.1, 62.6, 45.3, 39.9]
))
fig.update_layout(
xaxis = dict(
tickmode = 'array',
tickvals = [1, 3, 5, 7, 9, 11],
ticktext = ['One', 'Three', 'Five', 'Seven', 'Nine', 'Eleven']
)
)
fig.show()
But this does not seem to work when tickvals is a list of datetime objects.
What I want to do is show an x-axis tick for each point in my scatter plot, where the x values are all datetime objects. No error is thrown, and the graph is rendered as if I had not tried to update the x ticks. My code for this is below:
# lambda expression to convert datetime object to string of desired format
date_to_string_lambda = lambda x: x.strftime("%e %b")
fig.update_layout(
xaxis = dict(
tickmode = 'array',
# all points should have a corresponding tick
tickvals = list(fig.data[0].x),
# datetime value represented as %e %b string i.e. space padded day and abbreviated month.
ticktext = list(map(date_to_string_lambda, list(fig.data[0].x))),
)
)
Instead of showing a tick for each value, it falls back to the default tick mode and shows ticks at regular intervals, i.e.
Image of graph produced
The values for layout when print(fig) is run after the above code are below, where the xaxis dict is important. Note that the tickvals are no longer of type datetime.
'layout': {'hovermode': 'x',
'legend': {'title': {'text': ''}, 'tracegroupgap': 0, 'x': 0.01, 'y': 0.98},
'margin': {'b': 0, 'l': 0, 'r': 0, 't': 0},
'template': '...',
'title': {'text': ''},
'xaxis': {'anchor': 'y',
'domain': [0.0, 1.0],
'fixedrange': True,
'tickmode': 'array',
'ticktext': [27 Apr, 3 May, 9 May, 13 May, 20 May],
'tickvals': [2020-04-27 00:00:00, 2020-05-03 00:00:00,
2020-05-09 00:00:00, 2020-05-13 00:00:00,
2020-05-20 00:00:00],
'title': {'text': 'Date'}},
'yaxis': {'anchor': 'x', 'domain': [0.0, 1.0], 'fixedrange': True, 'title': {'text': 'Total Tests'}}}
This seems to be a bug with plotly.py, so is there a workaround for this?

How to create an Area plot

Is there any way to create an area plot in Seaborn? I checked the documentation but couldn't find one.
Here is the data that I want to plot.
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
data = {'launch_year': [1957, 1958, 1959, 1960, 1961, 1957, 1958, 1959, 1960, 1961, 1957, 1958, 1959,
1960, 1961, 1957, 1958, 1959, 1960, 1961, 1957, 1958, 1959, 1960, 1961],
'state_code': ['China', 'China', 'China', 'China', 'China', 'France', 'France', 'France', 'France',
'France', 'Japan', 'Japan', 'Japan', 'Japan', 'Japan', 'Russia', 'Russia', 'Russia',
'Russia', 'Russia', 'United States', 'United States', 'United States', 'United States', 'United States'],
'value': [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 2, 5, 4, 8, 9, 1, 22, 18, 29, 41]}
# create a long format DataFrame
df = pd.DataFrame(data)
# pivot the DataFrame to a wide format
year_countries = df.pivot(index='launch_year', columns='state_code', values='value')
# display(year_countries)
state_code   China  France  Japan  Russia  United States
launch_year
1957             0       0      0       2              1
1958             0       0      0       5             22
1959             0       0      0       4             18
1960             0       0      0       8             29
1961             0       0      0       9             41
I created a line plot using this code:
sns.relplot(data=year_countries, kind='line',
height=7, aspect=1.3,linestyle='solid')
plt.xlabel('Launch Year', fontsize=15)
plt.ylabel('Number of Launches', fontsize=15)
plt.title('Space Launches By Country',fontsize=17)
plt.show()
but the plot isn't very clear as a line chart.
I also can't make the lines solid or sort the legend entries by value in descending order.
How about using pandas.DataFrame.plot with kind='area'?
Setting a seaborn style with plt.style.use('seaborn') is deprecated.
In addition, you need to manually sort the legend, as shown here. However, changing the legend order does not change the plot order.
xticks=range(1957, 1962) can be used to specify the xticks, otherwise the 'launch_year' is treated as floats on the x-axis
Tested in python 3.11, pandas 1.5.2, matplotlib 3.6.2
ax = year_countries.plot(kind='area', figsize=(9, 6), xticks=range(1957, 1962))
ax.set_xlabel('Launch Year', fontsize=15)
ax.set_ylabel('Number of Launches', fontsize=15)
ax.set_title('Space Launches By Country', fontsize=17)
handles, labels = ax.get_legend_handles_labels()
labels, handles = zip(*sorted(zip(labels, handles), key=lambda t: t[0], reverse=True))
ax.legend(handles, labels)
plt.show()
Alternatively, use pandas.Categorical to set the order of the columns in df, prior to pivoting. This will ensure the plot order and legend order are the same (e.g. the first group in the legend is the first group in the plot stack).
# set the order of the column in df
df.state_code = pd.Categorical(df.state_code, sorted(df.state_code.unique())[::-1], ordered=True)
# now pivot df
year_countries = df.pivot(index='launch_year', columns='state_code', values='value')
# plot
ax = year_countries.plot(kind='area', figsize=(9, 6), xticks=range(1957, 1962))
ax.set_xlabel('Launch Year', fontsize=15)
ax.set_ylabel('Number of Launches', fontsize=15)
ax.set_title('Space Launches By Country', fontsize=17)
# move the legend
ax.legend(title='Countries', bbox_to_anchor=(1, 1.02), loc='upper left', frameon=False)
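The question asks for the legend to be sorted by value, whereas both snippets above sort by country name. One possible variation, sketched below, reorders the columns by their total number of launches before plotting, so the plot order and the legend order follow the values together. The wide table is rebuilt directly from the values shown in the question; the names totals and ordered and the headless Agg backend are assumptions for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend so the sketch runs anywhere
import matplotlib.pyplot as plt
import pandas as pd

# Wide-format table with the same values as year_countries in the question.
year_countries = pd.DataFrame(
    {'China': [0, 0, 0, 0, 0],
     'France': [0, 0, 0, 0, 0],
     'Japan': [0, 0, 0, 0, 0],
     'Russia': [2, 5, 4, 8, 9],
     'United States': [1, 22, 18, 29, 41]},
    index=pd.Index(range(1957, 1962), name='launch_year'))

# Order the columns by total launches, largest first.
totals = year_countries.sum().sort_values(ascending=False)
ordered = year_countries[totals.index]

ax = ordered.plot(kind='area', figsize=(9, 6), xticks=range(1957, 1962))
ax.set_xlabel('Launch Year', fontsize=15)
ax.set_ylabel('Number of Launches', fontsize=15)
ax.set_title('Space Launches By Country', fontsize=17)

handles, labels = ax.get_legend_handles_labels()
ax.legend(handles, labels, title='Countries')
# Call plt.show() here when using an interactive backend.
```

Countries with equal totals (here the three with no launches) may appear in any relative order.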

seaborn plot from total

I have the following data frame:
df = pd.DataFrame({'group': ['Red', 'Red', 'Red', 'Blue', 'Blue', 'Blue'],
'valueA_found': [10, 40, 50, 20, 50, 70],
'valueA_total': [100,200, 210, 100, 200, 210],
'date': ['2017-01-01', '2017-02-01', '2017-03-01', '2017-01-01', '2017-02-01', '2017-03-01']})
and can create a plot:
fig, ax = plt.subplots(figsize=(15,8))
sns.set_style("whitegrid")
g = sns.barplot(x="date", y="valueA_found", hue="group", data=df)
# g.set_yscale('log')
g.tick_params(axis='x', rotation=45)
g.set(xlabel='date', ylabel='value from total')
But I would rather see the following for each point in time:
For each group, valueA_found is plotted as a bar, and the total is plotted as a single bar.
It was initially suggested that the total could be plotted as a line, but as outlined in the comments it is probably better to plot it as a bar as well. The total (valueA_total) should be the same for each group in a given month.
An option might be to plot the total values in a desaturated/more transparent bar plot behind the first dataset.
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns
df = pd.DataFrame({'group': ['Red', 'Red', 'Red', 'Blue', 'Blue', 'Blue'],
'valueA': [10, 40, 50, 20, 50, 70],
'valueB': [100,200, 210, 100, 200, 210],
'date': ['2017-01-01', '2017-02-01', '2017-03-01',
'2017-01-01', '2017-02-01', '2017-03-01']})
fig, ax = plt.subplots(figsize=(6,4))
sns.barplot(x="date", y="valueB", hue="group", data=df,
ax=ax, palette={"Red":"#f3c4c4","Blue":"#c5d6f2" }, alpha=0.6)
sns.barplot(x="date", y="valueA", hue="group", data=df,
ax=ax, palette={"Red":"#d40000","Blue":"#0044aa" })
ax.tick_params(axis='x', rotation=45)
ax.set(xlabel='date', ylabel='value from total')
plt.show()
Or just put one bar plot in the background, assuming that the totals of each group are always the same:
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns
df = pd.DataFrame({'group': ['Red', 'Red', 'Red', 'Blue', 'Blue', 'Blue'],
'valueA': [10, 40, 50, 20, 50, 70],
'valueB': [100,200, 210, 100, 200, 210],
'date': ['2017-01-01', '2017-02-01', '2017-03-01',
'2017-01-01', '2017-02-01', '2017-03-01']})
fig, ax = plt.subplots(figsize=(6,4))
sns.barplot(x="date", y="valueB", data=df[df.group=="Red"],
ax=ax, color="#e7e2e8", label="total")
sns.barplot(x="date", y="valueA", hue="group", data=df,
ax=ax, palette={"Red":"#d40000","Blue":"#0044aa" })
ax.tick_params(axis='x', rotation=45)
ax.set(xlabel='date', ylabel='value from total')
plt.show()
