Build a bar chart with a small step in matplotlib - python

I have a dataset of about 100 time measurements, from 4.5 to 5.5 seconds. I need to build a chart with a bar for every 0.01 of a second (a bar for 4.5, a bar for 4.51, a bar for 4.52, etc. to 5.5), with the height of the bar representing the number of times that value has been recorded. Some of the values have not been recorded, so there should be a gap in the chart in those places instead of a bar.
My current plotting code is:
plt.xticks(np.arange(4.50, 5.50, step=0.01))
plt.yticks(np.arange(0, 5, step=1))
plt.xlim(4.50,5.50)
plt.ylim(0,5)
plt.bar(t, m)
plt.show()
It does not do what I want at all: the height of the bars is incorrect, and it does not leave gaps for the values that were not recorded.
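A minimal sketch of one way to get there, assuming the raw measurements are available as a flat list. The `samples` values below are made up, and `count_per_step` and `plot_counts` are hypothetical helper names, not matplotlib functions. The idea is to count occurrences on an integer grid first (which avoids float-equality problems with values like 4.52), then draw one bar per 0.01 s with a bar width smaller than the step:

```python
import numpy as np
import matplotlib.pyplot as plt


def count_per_step(samples, start=4.50, stop=5.50, step=0.01):
    """Count how often each grid value start, start+step, ..., stop occurs."""
    n_steps = int(round((stop - start) / step))
    # Map every sample to the index of its nearest grid value.
    idx = np.rint((np.asarray(samples, dtype=float) - start) / step).astype(int)
    idx = idx[(idx >= 0) & (idx <= n_steps)]      # drop out-of-range samples
    counts = np.bincount(idx, minlength=n_steps + 1)
    grid = start + step * np.arange(n_steps + 1)
    return grid, counts


def plot_counts(grid, counts, step=0.01):
    fig, ax = plt.subplots(figsize=(12, 4))
    # Bars narrower than the step keep neighbours apart; a count of 0
    # produces a zero-height bar, which shows up as a gap.
    ax.bar(grid, counts, width=0.8 * step, align="center")
    ax.set_xlim(grid[0] - step, grid[-1] + step)
    ax.set_xticks(grid[::10])                     # label every 0.1 s only
    ax.set_yticks(np.arange(0, counts.max() + 2))
    ax.set_xlabel("time (s)")
    ax.set_ylabel("count")
    return ax


# Hypothetical data standing in for the ~100 measurements.
samples = [4.50, 4.52, 4.52, 4.97, 5.10, 5.10, 5.10, 5.50]
grid, counts = count_per_step(samples)
ax = plot_counts(grid, counts)
```

Calling `plt.show()` afterwards displays the figure. A likely cause of the original problem is the default bar width of 0.8, which is far wider than the 0.01 step, so passing `width` explicitly is the important part.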

Related

How to center the histogram bars around tick marks using seaborn displot? Stacking bars is essential

I have searched for ways of centering histogram bars around tick marks, but have not been able to find a solution that works with seaborn displot. The function displot lets me stack the histogram according to a column in the dataframe, so I would prefer a solution using displot, or something else that allows stacking based on a dataframe column with color-coding as with palette.
Even after setting the tick values, I am not able to get the bars to center around the tick marks.
Example code
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
# Center the histogram on the tick marks
tips = sns.load_dataset('tips')
sns.displot(x="total_bill",
            hue="day", multiple='stack', data=tips)
plt.xticks(np.arange(0, 50, 5))
I would also like to plot a histogram of a variable that takes a single value and choose the bin width of the resulting histogram in such a way that it is centered around the value. (0.5 in this example.)
I can get the center point by choosing the number of bins equal to the number of tick marks, but the resulting bar is very thin. How can I increase the bin size in this case, where there is only one bar but I still want to display all the other possible points? When all the tick marks are displayed, the bar is very narrow.
I want the same centering of the bar at the 0.5 tick mark but make it wider as it is the only value for which counts are displayed.
Any solutions?
tips['single'] = 0.5
sns.displot(x='single',
hue="day", multiple = 'stack', data=tips, bins = 10)
plt.xticks(np.arange(0, 1, 0.1))
Edit:
Would it be possible to have more control over the tick marks in the second case? I do not want the labels rounded to 1 decimal place; I want to choose which tick marks to display. Is it possible to display just one value as a tick mark and have the bar centered around it?
Do min_val and max_val in this case refer to the value of the variable, which would be 0 here? Then the x-axis would extend to negative values even though there are none, and I don't want to display them.
For your first problem, you may want to work out a few properties of the data that you're plotting, for example its range. You may also want to choose the number of bins to display beforehand.
tips = sns.load_dataset('tips')
min_val = tips.total_bill.min()
max_val = tips.total_bill.max()
val_width = max_val - min_val
n_bins = 10
bin_width = val_width/n_bins
sns.histplot(x="total_bill",
hue="day", multiple = 'stack', data=tips,
bins=n_bins, binrange=(min_val, max_val),
palette='Paired')
plt.xlim(0, 55) # Define x-axis limits
Another thing to remember is that the width of a bar in a histogram identifies the bounds of its range. So a bar spanning [2,5] on the x-axis implies that the values represented by that bar belong to that range.
With this in mind, it is easy to formulate a solution. Suppose we first want ticks at the bounds of each bar in the original plot; one solution may look like
plt.xticks(np.arange(min_val-bin_width, max_val+bin_width, bin_width))
Now, if we offset the ticks by half a bin-width, we will get to the centers of the bars.
plt.xticks(np.arange(min_val-bin_width/2, max_val+bin_width/2, bin_width))
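To make the edge/center relationship concrete, here is a small NumPy-only sketch. The `values` list is a made-up stand-in for `tips.total_bill`, and `bin_edges_and_centers` is a hypothetical helper name. It uses `np.linspace` instead of `np.arange`, which is an assumption of this sketch rather than what the answer above does:

```python
import numpy as np


def bin_edges_and_centers(values, n_bins):
    """Return the n_bins + 1 bin edges and the n_bins bin centers."""
    values = np.asarray(values, dtype=float)
    edges = np.linspace(values.min(), values.max(), n_bins + 1)
    # A center is the midpoint of two neighbouring edges,
    # i.e. the left edge shifted by half a bin width.
    centers = (edges[:-1] + edges[1:]) / 2
    return edges, centers


values = [3.0, 10.0, 18.0, 25.0, 43.0]   # hypothetical data
edges, centers = bin_edges_and_centers(values, n_bins=10)
counts, _ = np.histogram(values, bins=edges)
```

Passing `centers` to `plt.xticks` puts one tick in the middle of each bar, while passing `edges` puts ticks on the bar boundaries.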
For your single-value plot, the idea remains the same: control the bin width, the x-axis range and the ticks. The bin width has to be set explicitly, since the automatically inferred bin width will probably be 1 unit wide, which on the plot will have no thickness. Histogram bars always indicate a range, even when there is just a single value. This is illustrated in the following example and figure.
from matplotlib.ticker import FormatStrFormatter
single_val = 23.5
tips['single'] = single_val
bin_width = 4
fig, axs = plt.subplots(1, 2, sharey=True, figsize=(12,4)) # Get 2 subplots
# Case 1 - With the single value as x-tick label on subplot 0
sns.histplot(x='single',
hue="day", multiple = 'stack', data=tips,
binwidth=bin_width, binrange=(single_val-bin_width, single_val+bin_width),
palette='rocket',
ax=axs[0])
ticks = [single_val, single_val+bin_width] # 2 ticks - given value and given_value + width
axs[0].set(
title='Given value as tick-label starts the bin on x-axis',
xticks=ticks,
xlim=(0, int(single_val*2)+bin_width)) # x-range such that bar is at middle of x-axis
axs[0].xaxis.set_major_formatter(FormatStrFormatter('%.1f'))
# Case 2 - With centering on the bin starting at single-value on subplot 1
sns.histplot(x='single',
hue="day", multiple = 'stack', data=tips,
binwidth=bin_width, binrange=(single_val-bin_width, single_val+bin_width),
palette='rocket',
ax=axs[1])
ticks = [single_val+bin_width/2] # Just the bin center
axs[1].set(
title='Bin centre is offset from single_value by bin_width/2',
xticks=ticks,
xlim=(0, int(single_val*2)+bin_width) ) # x-range such that bar is at middle of x-axis
axs[1].xaxis.set_major_formatter(FormatStrFormatter('%.1f'))
Output:
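If the bar should be centered on the value itself rather than start at it, one option is to choose the bin edges symmetrically around the value. Below is a rough sketch with plain matplotlib. The value 0.5 is taken from the question, while the bin width of 0.2 and the 50 data points are arbitrary assumptions:

```python
import numpy as np
import matplotlib.pyplot as plt

single_val = 0.5
bin_width = 0.2                       # assumed; pick whatever looks right
# One bin whose edges sit half a bin width on either side of the value.
edges = [single_val - bin_width / 2, single_val + bin_width / 2]

data = np.full(50, single_val)        # hypothetical data: 50 identical values

fig, ax = plt.subplots()
counts, _, _ = ax.hist(data, bins=edges)
ax.set_xlim(0, 1)                     # show the whole range of possible values
ax.set_xticks([single_val])           # a single tick, at the bar center
```

seaborn's `histplot` also accepts a sequence of edges through `bins`, so the same `edges` list should carry over to the stacked version, although that is not tested here.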
From your description, I think what you really mean by a bar graph is a categorical bar graph. The centering is then automatic, because each bar is no longer a range but a discrete category. Given the numeric and continuous nature of the variable in the example data, I would not recommend such an approach. Pandas supports categorical bar plots through DataFrame.plot.bar. For our example, one way to do this is as follows:
n_colors = len(tips['day'].unique()) # Get number of unique categories
agg_df = tips[['single', 'day']].groupby(['day']).agg(
val_count=('single', 'count'),
val=('single','max')
).reset_index() # Get aggregated information along the categories
agg_df.pivot(columns='day', values='val_count', index='val').plot.bar(
stacked=True,
color=sns.color_palette("Paired", n_colors), # Choose "number of days" colors from palette
width=0.05 # Set bar width
)
plt.show()
This yields:

Change distance between bar groups in grouped bar chart (plotting with Pandas)

I have a Dataframe with 14 rows and 7 columns where the columns represent groups and the rows represent months. I am trying to create a grouped bar plot such that at each month (on the x-axis) I will have the values for each of the groups as bars. The code is simply
ax = df.plot.bar(width=1,color=['b','g','r','c','orange','purple','y']);
ax.legend(loc='center left', bbox_to_anchor=(1.0, 0.5))
ax.set_xticklabels(months2,rotation=45)
Which produces the following result:
I would like to make the individual bars in each group wider but without them overlapping and I would also like to increase the distance between each group of bars so that there is enough space in the plot.
It might be worth mentioning that the index of the dataframe is 0,...,13.
Help would be greatly appreciated!
If you want to pack 10 apples in a box and want the apples to have more space between them you have two options: (1) take a larger box, or (2) use smaller apples.
(1) Make the figure larger; see "How do you change the size of figures drawn with matplotlib?"
(2) Make the bars narrower by changing the width argument.
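Both options can be combined in one call. Here is a minimal sketch, assuming a 14 x 7 DataFrame like the one in the question; the random data and the column names `g1`..`g7` are invented for illustration:

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

# Hypothetical stand-in for the 14 x 7 DataFrame in the question.
rng = np.random.default_rng(0)
df = pd.DataFrame(rng.integers(1, 10, size=(14, 7)),
                  columns=[f"g{i}" for i in range(1, 8)])

# Option 1 ("larger box"): a wider figure via figsize.
# Option 2 ("smaller apples"): width < 1 leaves a gap of 1 - width
# between neighbouring groups; width=1 makes the groups touch.
ax = df.plot.bar(width=0.8, figsize=(16, 5))
ax.legend(loc="center left", bbox_to_anchor=(1.0, 0.5))
```

Here `width` is the total width of one group, so each of the 7 bars gets `0.8 / 7` of an x-unit. Making the bars look wider on screen therefore comes down to increasing `figsize`.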

Pandas / matplotlib: stacked bar charts should overlap

I am plotting a pandas dataframe. I have a bar chart, with stacked=True. The problem is, it puts the results on top of each other, instead of overlapping. If value a is 20, and value b is 40, the height of the bar will be 60. What I really want is a from 0-20 from the axis, and b from 20-40 (so that values are overlapped). Is this possible?
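One possible approach, sketched here with made-up data, is to skip `stacked=True` and draw every column from zero on the same axes, tallest column first, so that the shorter bars are painted in front:

```python
import pandas as pd
import matplotlib.pyplot as plt

# Hypothetical data: column b is larger than column a in every row.
df = pd.DataFrame({"a": [20, 15, 30], "b": [40, 35, 50]},
                  index=["x", "y", "z"])

fig, ax = plt.subplots()
# Columns sorted by their maximum, largest first; artists added later
# are drawn on top, so the shorter bars stay visible.
order = df.max().sort_values(ascending=False).index
for col in order:
    ax.bar(df.index, df[col], width=0.6, label=col)
ax.legend()
```

This only gives a clean picture when one column is consistently larger than the other. If the ordering changes from row to row, a partially transparent `alpha` may be a reasonable fallback.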

Overlapping bars in horizontal Bar graph

I am generating horizontal bar graphs using matplotlib, based on the SO question "How to plot multiple horizontal bars in one chart with matplotlib". The problem is that when I have more than 2 horizontal bars, the bars overlap. I am not sure what I am doing wrong.
Here is the graph code:
import pandas
import matplotlib.pyplot as plt
import numpy as np
df = pandas.DataFrame(dict(graph=['q1','q2','q3' , 'q4','q5',' q6'],
n=[3, 5, 2,3 ,5 , 2], m=[6, 1, 3, 6 , 1 , 3]))
ind = np.arange(len(df))
width = 0.4
opacity = 0.4
fig, ax = plt.subplots()
ax.barh(ind, df.n, width, alpha=opacity, color='r', label='Existing')
ax.barh(ind + width, df.m, width, alpha=opacity,color='b', label='Community')
ax.barh(ind + 2* width, df.m, width, alpha=opacity,color='g', label='Robust')
ax.set(yticks=ind + width , yticklabels=df.graph, ylim=[2*width - 1, len(df)])
ax.legend()
#plt.xlabel('Queries')
plt.xlabel('Precision')
plt.title('Precision for these queries')
plt.show()
Currently, the graph looks like this
You set the width of the bars to 0.4, but you have three bars in each group. That means the width of each group is 1.2. But you set the ticks only 1 unit apart, so your bars don't fit into the spaces.
Since you are using pandas, you don't really need to do all that. Just do df.plot(kind='barh') and you will get a horizontal bar chart of the dataframe data. You can tweak the display colors, etc., by using various parameters to plot that you can find in the documentation. (If you want the "graph" column to be used as y-axis labels, set it as the index: df.set_index('graph').plot(kind='barh'))
(Using df.plot will give a barplot with only two bars per group, since your DataFrame has only two numeric columns. In your example, you plotted column m twice, which doesn't seem very useful. If you really want to do that, you could add a duplicate column into the DataFrame.)
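If you would rather keep the manual `barh` calls, the arithmetic above suggests deriving the bar thickness from the number of bars per group instead of hard-coding 0.4. A sketch under that assumption, where `group_height` is a name introduced here and 0.8 is an arbitrary choice below 1:

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

df = pd.DataFrame(dict(graph=["q1", "q2", "q3", "q4", "q5", "q6"],
                       n=[3, 5, 2, 3, 5, 2], m=[6, 1, 3, 6, 1, 3]))

# Column m is plotted twice, as in the question.
series = [("Existing", df["n"], "r"),
          ("Community", df["m"], "b"),
          ("Robust", df["m"], "g")]
group_height = 0.8                        # total thickness of one group, < 1
bar_height = group_height / len(series)   # so three bars fit in one unit
ind = np.arange(len(df))

fig, ax = plt.subplots()
for i, (label, values, color) in enumerate(series):
    ax.barh(ind + i * bar_height, values, bar_height,
            alpha=0.4, color=color, label=label)
# Put each tick at the middle of its group of bars.
ax.set(yticks=ind + bar_height * (len(series) - 1) / 2,
       yticklabels=list(df["graph"]))
ax.legend()
```

With three bars of thickness 0.8 / 3 each, a group occupies 0.8 of the 1-unit spacing between ticks, leaving a gap of 0.2 between neighbouring groups.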
ax=df.plot.barh(width=0.85,figsize=(15,15))
Adjust width (it should be less than 1, otherwise the groups overlap) and figsize until the bars are clearly readable. A bigger figure gives every bar more room, which is the ultimate goal.

Python - Matplotlib - Subplot - x-axis overlap

My issue is that my x-axis tick labels are overlapping, as such I have a similar issue to the picture shown in this question:
matplotlib: how to prevent x-axis labels from overlapping each other
The difference is that my labels are not rotated as they are in that picture. And I'm not using plt but instead:
ax=pylab.figure().add_subplot(111)
I am creating a bar chart and I can set the width of the bars. Can I similarly set the width of the xticklabels?
E.g., so that the first tick label appears on the x-axis between 0 and 0.5, the second appears between 0.5 and 1.0, etc.
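As far as I know, matplotlib tick labels are anchored at a tick position rather than given a width, so one way to approximate this is to put each tick at the midpoint of its slot and center the label there. A minimal sketch; the labels, heights and the 0.5 slot size are assumptions for illustration:

```python
import numpy as np
import matplotlib.pyplot as plt

labels = ["first", "second", "third", "fourth"]   # hypothetical labels
heights = [3, 1, 4, 2]
slot = 0.5                                        # each bar and label gets 0.5 units
left_edges = slot * np.arange(len(labels))        # 0, 0.5, 1.0, 1.5

fig, ax = plt.subplots()
# align="edge" makes each bar span exactly [left_edge, left_edge + slot].
ax.bar(left_edges, heights, width=slot, align="edge", edgecolor="black")
ax.set_xticks(left_edges + slot / 2)              # slot midpoints: 0.25, 0.75, ...
ax.set_xticklabels(labels, ha="center")
```

If the labels are still too long for their slots, a smaller `fontsize` or a wider figure is probably needed, since the label text itself is not clipped to the slot.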
