xticks and bins won't match each other - matplotlib.hist - python

I'm trying to create a simple histogram using plt.hist. I ran into a strange problem: as you can see in the figure, the bins overlap each other.
Here is my code:
intervals = np.arange(0.5,max(data)+0.5, .5)
data = np.array(data)
# build labels list
labels = [ str(intervals[i])+'-'+str(intervals[i+1]) for i in range(len(intervals)-1) ]
labels.append(str(max(intervals))+'-'+str(max(intervals)+.5))
# plot
fig, ax = plt.subplots(figsize=(12, 9))
plt.hist(x=data, bins=intervals, width=1)
ax.set_xticks(range(len(intervals)))
ax.set_xticklabels(labels=labels, rotation=45, fontsize=12)
ax.set_title("Max wind speed ~24hr before dust emission ("+loc+")", fontsize=18)
ax.set_xlabel('Wind Speed [m/s]', fontsize=14)
ax.set_ylabel('No. of Events', fontsize=14)
plt.grid(True)
plt.tight_layout()
plt.savefig('th_hist'+loc+'.png')
plt.show()
I tried to change the axis size, and also played with the width value.
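A likely cause, judging only from the code above: the bin edges are 0.5 apart, but width=1 forces every bar to be 1 unit wide, so neighbouring bars overlap. The ticks are also placed at 0, 1, 2, ... (range(len(intervals))) rather than at the bin positions. Below is a minimal sketch of one way to line things up. It assumes the goal is one label per bin, and it uses made-up wind speeds because the real data is not shown.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch also runs without a display
import matplotlib.pyplot as plt
import numpy as np

# Hypothetical wind speeds in m/s; the real `data` is not shown in the question
data = np.array([0.7, 1.2, 1.4, 1.9, 2.1, 2.2, 2.6, 3.3])

step = 0.5
edges = np.arange(0.5, data.max() + step, step)  # bin edges, 0.5 apart

# One label and one tick position (the bin centre) per bin
labels = [f"{lo:g}-{hi:g}" for lo, hi in zip(edges[:-1], edges[1:])]
centers = (edges[:-1] + edges[1:]) / 2

fig, ax = plt.subplots(figsize=(12, 9))
# No width= argument: each bar then spans exactly its own bin
counts, _, _ = ax.hist(data, bins=edges, edgecolor="black")
ax.set_xticks(centers)
ax.set_xticklabels(labels, rotation=45, fontsize=12)
ax.set_xlabel("Wind Speed [m/s]", fontsize=14)
ax.set_ylabel("No. of Events", fontsize=14)
fig.tight_layout()
```

Call plt.show() or fig.savefig(...) afterwards to see the result. Placing the ticks at the bin centres keeps the number of labels equal to the number of bins.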

Related

How do I plot percentage labels for a horizontal bar graph in Python?

Can someone please help me plot the x-axis labels as percentages for the horizontal bar chart produced by the code below?
I am finding this difficult because I want a simpler chart, without x-axis labels and ticks.
[Horizontal Bar Chart][1]
# Plot the figure size
plt.figure(figsize= (8,6))
# Plot the normalized value counts of the question column as a horizontal bar chart.
ax1 = df[q1].value_counts(normalize=True).sort_values().plot(kind="barh", color='#fd6a02', width=0.75, zorder=2)
# Draw faint vertical lines at the x ticks and send them to the back
vals = ax1.get_xticks()
for tick in vals:
    ax1.axvline(x=tick, linestyle='dashed', alpha=0.4, color = '#d3d3d3', zorder=1)
# Totals to produce a composition ratio
total_percent = df[q1].value_counts(normalize=True) *100
# Remove borders
ax1.spines['right'].set_visible(False)
ax1.spines['top'].set_visible(False)
ax1.spines['left'].set_visible(False)
ax1.spines['bottom'].set_visible(False)
# Set the title of the graph inline with the Y axis labels.
ax1.set_title(q1, weight='bold', size=14, loc = 'left', pad=20, x = -0.16)
# ax.text(x,y,text,color)
for i,val in enumerate(total):
    ax1.text(val - 1.5, i, str("{:.2%}".format(total_percent), color="w", fontsize=10, zorder=3)
# Create axis labels
plt.xlabel("Ratio of Responses", labelpad=20, weight='bold', size=12)
Each time I get a EOF error. Can someone help?
This is not based on your code; instead, I have adapted the horizontal bar chart example from the official reference.
The labels are drawn with ax.text(), called once per bar inside a loop.
import matplotlib.pyplot as plt
import numpy as np
# Fixing random state for reproducibility
np.random.seed(19680801)
plt.rcdefaults()
fig, ax = plt.subplots()
# Example data
people = ('Tom', 'Dick', 'Harry', 'Slim', 'Jim')
y_pos = np.arange(len(people))
performance = 3 + 10 * np.random.rand(len(people))
ax.barh(y_pos, performance, align='center')
ax.set_yticks(y_pos)
ax.set_yticklabels(people)
ax.invert_yaxis() # labels read top-to-bottom
ax.set_xlabel('Performance')
ax.set_title('How fast do you want to go today?')
# Totals to produce a composition ratio
total = sum(performance)
# ax.text(x,y,text,color)
for i,val in enumerate(performance):
    ax.text(val - 1.5, i, str("{:.2%}".format(val/total)), color="w", fontsize=10)
plt.show()
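As for the EOF error in the question: the ax1.text(...) line opens str( but never closes it, so Python reaches the end of the file while still looking for a closing parenthesis. The loop also iterates over total, which is never defined, and formats the whole total_percent Series instead of a single value. Here is a rough sketch of the same ax.text() loop applied to value_counts(normalize=True). The DataFrame contents and the column name q1 are made up for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch also runs without a display
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical survey answers standing in for df[q1] from the question
df = pd.DataFrame({"q1": ["Yes", "No", "Yes", "Maybe", "Yes", "No"]})
q1 = "q1"

# Fractions between 0 and 1, smallest first so the largest bar ends up on top
ratios = df[q1].value_counts(normalize=True).sort_values()

fig, ax1 = plt.subplots(figsize=(8, 6))
ratios.plot(kind="barh", color="#fd6a02", width=0.75, zorder=2, ax=ax1)

labels = []
for i, val in enumerate(ratios):
    label = "{:.2%}".format(val)  # format one value at a time, e.g. 0.5 -> '50.00%'
    labels.append(label)
    # Right-align the text just inside the end of the bar
    ax1.text(val - 0.01, i, label, ha="right", va="center",
             color="w", fontsize=10, zorder=3)

ax1.set_xlabel("Ratio of Responses", labelpad=20, weight="bold", size=12)
```

Because the values are already fractions, the "{:.2%}" format does the multiplication by 100 itself, so no separate total_percent is needed.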

Align twinx with second axis with non linear scale

I am having trouble aligning the ticks of two different y-axes: the first has a linear range and the second a non-linear one, as shown in the following picture.
HS, TMN = np.meshgrid(hs, period)
r = function(HS, TMN)
cax = plt.contourf(HS, TMN, np.log10(HS), cmap=plt.cm.RdYlGn_r)
ax = plt.gca()
ax2 = ax.twinx()
ticks2 = get_y2values(ax.get_yticks()) # Non linear function
ax2.yaxis.set_major_locator(mpl.ticker.FixedLocator(ticks2))
ax2.set_ylim([0, 700])
ax.grid()
ax.set_ylabel('Y1', fontsize=14)
ax2.set_ylabel('Y2', fontsize=14)
plt.show()
More precisely, the right axis requires a different scale from the one on the left. The desired outcome is to have the tick values on the left aligned with the corresponding tick values on the right (related through the non-linear function depicted below). E.g.: the value 8.08 on Y1 aligned with 101.5, 16.07 aligned with 309.5, and so on.
The new scale is needed so that further plots can be drawn against it.
As suggested in the comments the definition of a new scale works perfectly.
Referring to the SegmentedScale defined at the following link, the code that worked for me is the following:
hs = np.linspace(0.1, 15, 1000) # [meters]
period = np.linspace(0.1, 35, 1000) # [seconds]
HS, TMN = np.meshgrid(hs, period)
cax = plt.contourf(HS, TMN, np.log10(HS), cmap=plt.cm.RdYlGn_r)
ax1 = plt.gca()
ax2 = ax1.twinx()
ticks = get_y2values(ax1.get_yticks()) # Non linear function
ax2.set_yscale('segmented', points=ticks)
ax1.grid()
ax1.set_yticks(ax1.get_yticks())
ax2.set_yticks(ticks)
ax1.set_ylabel('Y1', fontsize=14)
ax2.set_ylabel('Y2', fontsize=14)
plt.show()
If you need to add new plots on the ax2 axis, draw them before applying the new custom scale.
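For readers who do not have the SegmentedScale class at hand, a different technique that can give a similar alignment is Matplotlib's built-in 'function' scale (available since Matplotlib 3.1). This is not the method used above, only a sketch under assumptions: y1_to_y2 is a made-up monotonic mapping standing in for get_y2values, and it must be invertible for this to work.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch also runs without a display
import matplotlib.pyplot as plt
import numpy as np


def y1_to_y2(y1):
    """Hypothetical monotonic mapping standing in for get_y2values."""
    return 0.5 * np.asarray(y1, dtype=float) ** 2


def y2_to_y1(y2):
    """Inverse of y1_to_y2, valid for y2 >= 0."""
    return np.sqrt(np.asarray(y2, dtype=float) / 0.5)


fig, ax1 = plt.subplots()
ax1.set_ylim(0, 35)
y1_ticks = np.arange(0, 36, 5)
ax1.set_yticks(y1_ticks)

ax2 = ax1.twinx()
# forward maps Y2 values into the linear Y1 space, inverse maps them back
ax2.set_yscale("function", functions=(y2_to_y1, y1_to_y2))
# Limits must correspond to the Y1 limits for the two axes to line up
ax2.set_ylim(y1_to_y2(0), y1_to_y2(35))
y2_ticks = y1_to_y2(y1_ticks)
ax2.set_yticks(y2_ticks)

ax1.grid()
ax1.set_ylabel("Y1", fontsize=14)
ax2.set_ylabel("Y2", fontsize=14)
```

With this setup each Y2 tick sits at the same height as the Y1 tick it was computed from, and anything plotted on ax2 afterwards is drawn in Y2 units.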

Modifying y-axis in histogram in Pandas matplotlib

I have 33960 zeros and 144 ones in data_train['fk_action_code_id'].
When I plot a histogram, the bar for 1 is so small that it is not visible. Is there any way to modify the y-axis so that the bar for 1 becomes visible?
I tried this, but it doesn't work:
b=[0,145, 35000]
plt.yticks(b)
plt.hist(data_train['fk_action_code_id'], histtype='bar', rwidth=0.8)
A few suggestions: you could
1.) create two y axes, one for the zeros and the other for the ones
2.) multiply one of the bars by a numerical factor, so that they are of the same order of magnitude (you should explain this in the plot legend then)
3.) draw a logarithmic histogram with the option log=True in the plt.hist() command.
The following will produce plots for these three options:
import numpy as np
import matplotlib.pyplot as plt
zeros = np.zeros([35000])
# Option 2: multiply the number of ones by a factor so both bars are comparable
modifier = 100
ones = np.ones([145*modifier])
arr = np.hstack((zeros, ones))
bins = np.asarray([-0.5, 0.5, 1.5])
plt.hist(arr, bins=bins, facecolor='green', alpha=0.75, log=False)
plt.xticks([0,1])
plt.title('Multiplied with a factor')
plt.savefig('multiplied.png')
plt.show()
plt.clf()
# Option 3: logarithmic y-axis, using the true number of ones
modifier = 1
ones = np.ones([145*modifier])
arr = np.hstack((zeros, ones))
plt.hist(arr, bins=bins, facecolor='green', alpha=0.75, log=True)
plt.xticks([0,1])
plt.title('Logarithmic')
plt.savefig('log.png')
plt.show()
plt.clf()
# Option 1: two y-axes, one for the zeros (left) and one for the ones (right)
ax1 = plt.gca()
ax2 = ax1.twinx()
ax1.set_yticks([0, 35000, 40000])
ax1.set_ylim(0, 40000)
ax2.set_yticks([0, 145, 200])
ax2.set_ylim(0, 200)
ax1.hist(arr, bins=[bins[0], bins[1]], facecolor='green', alpha=0.75, log=False)#, histtype='bar')#, rwidth=1.0)
ax2.hist(arr, bins=[bins[1], bins[2]], facecolor='green', alpha=0.75, log=False)#, histtype='bar')#, rwidth=1.0)
plt.xticks([0,1])
plt.title('Two y axes')
plt.savefig('two_axes.png')
plt.show()
plt.clf()

How to plot a superimposed bar chart using matplotlib in python?

I want to plot a bar chart or a histogram using matplotlib. I don't want a stacked bar plot, but a superimposed bar plot of two lists of data. For instance, I have the following two lists of data:
Some code to begin with:
import matplotlib.pyplot as plt
from numpy.random import normal, uniform
highPower = [1184.53,1523.48,1521.05,1517.88,1519.88,1414.98,1419.34,
             1415.13,1182.70,1165.17]
lowPower = [1000.95,1233.37, 1198.97,1198.01,1214.29,1130.86,1138.70,
            1104.12,1012.95,1000.36]
# density=True replaces the old normed=True, which newer Matplotlib no longer accepts
plt.hist(highPower, bins=10, histtype='stepfilled', density=True,
         color='b', label='Max Power in mW')
plt.hist(lowPower, bins=10, histtype='stepfilled', density=True,
         color='r', alpha=0.5, label='Min Power in mW')
I want to plot these two lists against the number of values in the two lists such that I am able to see the variation per reading.
You can produce a superimposed bar chart using plt.bar() with the alpha keyword as shown below.
The alpha controls the transparency of the bar.
N.B. when you have two overlapping bars, one with an alpha < 1, you will get a mixture of colours. As such the bar will appear purple even though the legend shows it as a light red. To alleviate this I have modified the width of one of the bars, this way even if your powers should change you will still be able to see both bars.
plt.xticks can be used to set the location and format of the x-ticks in your graph.
import matplotlib.pyplot as plt
import numpy as np
width = 0.8
highPower = [1184.53,1523.48,1521.05,1517.88,1519.88,1414.98,
1419.34,1415.13,1182.70,1165.17]
lowPower = [1000.95,1233.37, 1198.97,1198.01,1214.29,1130.86,
1138.70,1104.12,1012.95,1000.36]
indices = np.arange(len(highPower))
plt.bar(indices, highPower, width=width,
        color='b', label='Max Power in mW')
# Bars are centred on their x positions (the default since Matplotlib 2.0),
# so the narrower bar needs no offset to sit inside the wider one.
plt.bar(indices, lowPower,
        width=0.5*width, color='r', alpha=0.5, label='Min Power in mW')
plt.xticks(indices,
           ['T{}'.format(i) for i in range(len(highPower))])
plt.legend()
plt.show()
Building on @Ffisegydd's answer, if your data is in a Pandas DataFrame, this should work nicely:
import matplotlib.pyplot as plt
import numpy as np

def overlapped_bar(df, show=False, width=0.9, alpha=.5,
                   title='', xlabel='', ylabel='', **plot_kwargs):
    """Like a stacked bar chart except bars on top of each other with transparency"""
    xlabel = xlabel or df.index.name
    N = len(df)
    M = len(df.columns)
    indices = np.arange(N)
    # 'darksage' is not a named colour in current Matplotlib, so 'darkseagreen' is used instead
    colors = ['steelblue', 'firebrick', 'darkseagreen', 'goldenrod', 'gray'] * int(M / 5. + 1)
    for i, label, color in zip(range(M), df.columns, colors):
        kwargs = plot_kwargs
        kwargs.update({'color': color, 'label': label})
        # The first column is drawn opaque, the later ones with transparency
        plt.bar(indices, df[label], width=width, alpha=alpha if i else 1, **kwargs)
    # Bars are centred on their x positions by default, so the ticks go at the indices
    plt.xticks(indices,
               ['{}'.format(idx) for idx in df.index.values])
    plt.legend()
    plt.title(title)
    plt.xlabel(xlabel)
    plt.ylabel(ylabel)
    if show:
        plt.show()
    return plt.gcf()
And then in a python command line:
import numpy as np
import pandas as pd
low = [1000.95, 1233.37, 1198.97, 1198.01, 1214.29, 1130.86, 1138.70, 1104.12, 1012.95, 1000.36]
high = [1184.53, 1523.48, 1521.05, 1517.88, 1519.88, 1414.98, 1419.34, 1415.13, 1182.70, 1165.17]
df = pd.DataFrame(np.array([high, low]).T, columns=['High', 'Low'],
                  index=pd.Index(['T%s' %i for i in range(len(high))],
                                 name='Index'))
overlapped_bar(df, show=False)
It is actually simpler than the answers all over the internet make it appear.
a = range(1,10)
b = range(4,13)
ind = np.arange(len(a))
fig = plt.figure()
ax = fig.add_subplot(111)
ax.bar(x=ind, height=a, width=0.35,align='center')
ax.bar(x=ind, height=b, width=0.35/3, align='center')
plt.xticks(ind, a)
plt.tight_layout()
plt.show()

Axis labelling with matplotlib too sparse

Matplotlib tries to label the ticks on this x-axis intelligently, but it is a little too sparse. There should be a label for 0 and maybe one for 10 and 100.
This is the code which produces the figure. How can I make the labelling on the x-axis more verbose?
def timePlot2(data, saveto=None, leg=None, annotate=False, limits=timeLimits, labelColumn="# Threads", valueColumn="Average (s)", size=screenMedium):
    labels = list(data[labelColumn])
    figure(figsize=size)
    ax = gca()
    ax.grid(True)
    xi = range(len(labels))
    rts = data[valueColumn] # running time in seconds
    ax.scatter(rts, xi, color='r')
    if annotate:
        for i,j in zip(rts, xi):
            ax.annotate("%0.2f" % i, xy=(i,j), xytext=(7,0), textcoords="offset points")
    ax.set_yticks(range(len(labels)))
    ax.set_yticklabels(labels)
    ax.set_xscale('log')
    plt.xlim(limits)
    if leg:
        legend(leg, loc="upper left", fontsize=10)
    else:
        legend([r"$t$"], fontsize=10)
    plt.draw()
    if saveto:
        plt.savefig(saveto, transparent=True, bbox_inches="tight")
You can define your own x-axis ticks, together with their labels, using ax.set_xticks(). In your example,
ax.set_xticks((10,100,1000))
should do the trick.
If you would like to keep the 10^x labels, you can set the labels explicitly:
ax.set_xticks((10,100,1000))
ax.set_xticklabels(('$10^1$','$10^2$','$10^3$'))
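To make the suggestion concrete, here is a small self-contained sketch with explicit ticks on a log-scaled x-axis. The thread counts and running times are invented for illustration. Note that a log axis cannot show 0 itself; the leftmost decade here is 10^0 = 1.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch also runs without a display
import matplotlib.pyplot as plt

# Hypothetical running times in seconds, one per thread count
threads = ["1", "2", "4", "8"]
times = [120.0, 64.0, 35.0, 21.0]

fig, ax = plt.subplots()
ax.grid(True)
ax.scatter(times, range(len(threads)), color="r")
ax.set_yticks(range(len(threads)))
ax.set_yticklabels(threads)

# Set the scale first: changing the scale afterwards would reset the tick locator
ax.set_xscale("log")
ax.set_xlim(1, 1000)

ticks = [1, 10, 100, 1000]
ax.set_xticks(ticks)
ax.set_xticklabels([r"$10^0$", r"$10^1$", r"$10^2$", r"$10^3$"])
```

The order matters in this sketch: the explicit ticks are applied after set_xscale('log') so that they are not replaced by the default log locator.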
