How to write a function for a bar chart in Python?

I am using bar charts for my exploratory data analysis. I have generated around 18 bar charts in the entire analysis with a similar piece of code, so I don't want to write the same code every time for every bar chart. The code I have used for the bar chart is:
y = textranges_freq['smstext']
xlabels = textranges_freq['buckets']
bar_width = 0.50
x = np.arange(len(y))
fig, ax = plt.subplots()
ax.bar(x, y, width=bar_width)
ax.set_xticks(x+(bar_width/2.0))
ax.set_xticklabels(xlabels)
ax.set_title('Avg text Frequency by range')
ax.set_xlabel('buckets')
ax.set_ylabel('Avg text messages')
plt.show()
I have used the same code around 18 times in my analysis because I need to
change y, xlabels and the arguments of ax.set_title, ax.set_xlabel and ax.set_ylabel each time.
So how can I write a function for this that I can reuse?
In the above code, textranges_freq is my DataFrame, and smstext and buckets are its columns.
Please help me with this. I am new to Python.

Just wrap the whole thing in a function:
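def makebar(y, xlabels, xlabel, ylabel, title):
    bar_width = 0.50
    x = np.arange(len(y))
    fig, ax = plt.subplots()
    ax.bar(x, y, width=bar_width)
    # This offset suits edge-aligned bars; with center-aligned bars
    # (the default since matplotlib 2.0) use ax.set_xticks(x) instead.
    ax.set_xticks(x+(bar_width/2.0))
    ax.set_xticklabels(xlabels)
    ax.set_title(title)
    ax.set_xlabel(xlabel)
    ax.set_ylabel(ylabel)
    plt.show()

y = textranges_freq['smstext']
xlabels = textranges_freq['buckets']
makebar(y, xlabels, 'buckets', 'Avg text messages', 'Avg text Frequency by range')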
However, an even easier approach would be to plot from the DataFrame directly:
ax = textranges_freq.plot(x='buckets',y='smstext',kind='bar',title='Avg text Frequency by range', width=0.5, legend=False)
ax.set_xlabel('buckets')
ax.set_ylabel('Avg text messages')
plt.show()
This isn't much more work than just calling the function directly, but if you really wanted you could wrap it in a function, too:
def df_bar(df, xcol, ycol, xlabel=None, ylabel=None, title=None):
    # Fall back to the column names when no labels are given
    if xlabel is None:
        xlabel = xcol
    if ylabel is None:
        ylabel = ycol
    # Plot from the DataFrame that was passed in, not a global one
    ax = df.plot(x=xcol, y=ycol, kind='bar', title=title, width=0.5, legend=False)
    ax.set_xlabel(xlabel)
    ax.set_ylabel(ylabel)
    plt.show()
This also has the advantage that if the x or y label is the same as the column name (as is the case for the xlabel in the example), you can just skip the corresponding label and it will use the column name instead. You also can leave the title blank.
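To cover the roughly 18 charts from the question, a helper like this can be called in a loop. The following is only a sketch under assumptions: the name bar_from_df, the chart_specs list and the sample data are hypothetical, and the helper returns the Axes instead of calling plt.show() so that the caller decides when to display or save.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs anywhere; drop for interactive use
import matplotlib.pyplot as plt
import pandas as pd


def bar_from_df(df, xcol, ycol, xlabel=None, ylabel=None, title=None):
    """Draw one bar chart from two DataFrame columns and return its Axes."""
    ax = df.plot(x=xcol, y=ycol, kind="bar", title=title, width=0.5, legend=False)
    ax.set_xlabel(xcol if xlabel is None else xlabel)
    ax.set_ylabel(ycol if ylabel is None else ylabel)
    return ax


# Hypothetical sample data standing in for textranges_freq.
textranges_freq = pd.DataFrame(
    {"buckets": ["0-10", "10-20", "20-30"], "smstext": [4.0, 7.5, 3.2]}
)

# One entry per chart: (x column, y column, y label, title).
chart_specs = [
    ("buckets", "smstext", "Avg text messages", "Avg text Frequency by range"),
    ("buckets", "smstext", None, None),  # labels fall back to the column names
]

# Each call creates its own figure because no existing Axes is passed to df.plot.
axes = [
    bar_from_df(textranges_freq, xcol, ycol, ylabel=ylabel, title=title)
    for xcol, ycol, ylabel, title in chart_specs
]
```

Adding a chart then only means adding one tuple to chart_specs; call plt.show() once at the end to display all figures.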

I would structure your data into lists.
for example:
yn = [[1,2,3],[2,3,4], [3,4,5],[4,5,6], ...]
x = [[1,2,3],[2,3,4], [3,4,5],[4,5,6], ...]
labels = ['label1', 'label2', 'label3', ...]
and then:
fig = plt.figure(figsize=(11.69, 8.27), dpi=100)
for i, y in enumerate(yn):
    # new subplot
    ax = fig.add_subplot(18, 1, i+1)
    # plot
    ax.plot(x[i], y, 'bo-')
    # y labels
    ax.set_ylabel(labels[i])
    # grid
    ax.grid(True)
plt.show()
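The same list-based idea can be applied to bar charts arranged in a grid. This is a sketch with hypothetical data and names (heights, tick_labels, ylabels); it uses plt.subplots to create all the axes at once rather than calling add_subplot inside the loop.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs anywhere; drop for interactive use
import matplotlib.pyplot as plt
import numpy as np

# Hypothetical data: one list of bar heights per chart.
heights = [[1, 2, 3], [2, 3, 4], [3, 4, 5], [4, 5, 6]]
tick_labels = ["a", "b", "c"]
ylabels = ["label1", "label2", "label3", "label4"]

# A 2x2 grid holds the four charts; axes is a 2-D array of Axes objects.
fig, axes = plt.subplots(nrows=2, ncols=2, figsize=(8, 6))
for ax, y, ylabel in zip(axes.flat, heights, ylabels):
    x = np.arange(len(y))
    ax.bar(x, y, width=0.5)
    ax.set_xticks(x)
    ax.set_xticklabels(tick_labels)
    ax.set_ylabel(ylabel)
    ax.grid(True)
fig.tight_layout()
```

For 18 charts, a grid such as nrows=6, ncols=3 would keep each panel readable, which a single column of 18 rows usually does not.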

Related

Pyplot combine two subplot axes

I am currently trying to have two plots in one figure. I am stuck on this for a while now and I don't have any idea why it wouldn't work like I want it to. I have two functions, which return similar axes. The data comes from a csv file, where I get the frequency (y-axis) according to the size of specific objects (x-axis). I expect to have one figure displaying the plots on top of each other. However my plot only contains the legend to axs[1] and the data also only contains axs[1].
My code:
fig, axs = plt.subplots(2)
axs[0].plot(ax=return_some_ax())
axs[1].plot(ax=return_similar_ax())
plt.savefig('plot.png')
I hope that you can help me out :)
Thank you!
This is how you do it in general. You can substitute your functions in place of x and y in here if they return a list of those values:
x_data = list(range(10))
y = [x**2 for x in x_data]
y1 = [x+5 for x in x_data]
fig, [ax1, ax2] = plt.subplots(2)
ax1.plot(x_data, y, label = "quadratic", color = 'red')
ax2.plot(x_data, y1, label = "linear", color = 'blue')
ax1.legend()
ax2.legend()
plt.savefig('plot.png')  # save before show(); once the window is closed the saved figure may be empty
plt.show()
Here is the same code using two functions:
def function1(x):
    return x**2

def function2(x):
    return x+5
x_data = list(range(10))
fig, [ax1, ax2] = plt.subplots(2)
ax1.plot(x_data, [function1(x) for x in x_data], label = "quadratic", color = 'red')
ax2.plot(x_data, [function2(x) for x in x_data], label = "linear", color = 'blue')
ax1.legend()
ax2.legend()
plt.savefig('plot.png')  # save before show(); once the window is closed the saved figure may be empty
plt.show()
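If the plotting logic should stay inside functions, as in the question, a common pattern is to let each function draw onto an Axes that the caller passes in, instead of returning a new Axes. The sketch below assumes that pattern; plot_quadratic and plot_linear are hypothetical stand-ins for return_some_ax and return_similar_ax.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs anywhere; drop for interactive use
import matplotlib.pyplot as plt


def plot_quadratic(ax, x_data):
    """Draw x**2 on the Axes supplied by the caller."""
    ax.plot(x_data, [x**2 for x in x_data], label="quadratic", color="red")
    ax.legend()
    return ax


def plot_linear(ax, x_data):
    """Draw x + 5 on the Axes supplied by the caller."""
    ax.plot(x_data, [x + 5 for x in x_data], label="linear", color="blue")
    ax.legend()
    return ax


x_data = list(range(10))
fig, (ax_top, ax_bottom) = plt.subplots(2)
plot_quadratic(ax_top, x_data)
plot_linear(ax_bottom, x_data)
```

To overlay both curves in a single plot instead of stacking them, pass the same Axes to both functions. Saving works with fig.savefig('plot.png').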

How do I plot percentage labels for a horizontal bar graph in Python?

Can someone please help me plot x axis labels in percentages given the following code of my horizontal bar chart?
Finding it difficult to find as I want a more simplistic chart without x axis labels and ticks.
[Horizontal Bar Chart][1]
# Plot the figure size
plt.figure(figsize= (8,6))
# Plot the normalized value counts of the question column as a horizontal bar chart.
ax1 = df[q1].value_counts(normalize=True).sort_values().plot(kind="barh", color='#fd6a02', width=0.75, zorder=2)
# Draw vague vertical axis lines and set lines to the back of the order
vals = ax1.get_xticks()
for tick in vals:
    ax1.axvline(x=tick, linestyle='dashed', alpha=0.4, color = '#d3d3d3', zorder=1)
# Totals to produce a composition ratio
total_percent = df[q1].value_counts(normalize=True) *100
# Remove borders
ax1.spines['right'].set_visible(False)
ax1.spines['top'].set_visible(False)
ax1.spines['left'].set_visible(False)
ax1.spines['bottom'].set_visible(False)
# Set the title of the graph inline with the Y axis labels.
ax1.set_title(q1, weight='bold', size=14, loc = 'left', pad=20, x = -0.16)
# ax.text(x,y,text,color)
for i,val in enumerate(total):
    ax1.text(val - 1.5, i, str("{:.2%}".format(total_percent), color="w", fontsize=10, zorder=3)
# Create axis labels
plt.xlabel("Ratio of Responses", labelpad=20, weight='bold', size=12)
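Each time I get an EOF error. Can someone help?
The EOF error comes from the ax1.text(...) line in your loop: its parentheses are not balanced, so Python reaches the end of the file while still waiting for a closing parenthesis. The example below is not based on your code; it is adapted from the official reference.
The labels are drawn with ax.text(), called once per bar inside a loop.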
import matplotlib.pyplot as plt
import numpy as np
# Fixing random state for reproducibility
np.random.seed(19680801)
plt.rcdefaults()
fig, ax = plt.subplots()
# Example data
people = ('Tom', 'Dick', 'Harry', 'Slim', 'Jim')
y_pos = np.arange(len(people))
performance = 3 + 10 * np.random.rand(len(people))
ax.barh(y_pos, performance, align='center')
ax.set_yticks(y_pos)
ax.set_yticklabels(people)
ax.invert_yaxis() # labels read top-to-bottom
ax.set_xlabel('Performance')
ax.set_title('How fast do you want to go today?')
# Totals to produce a composition ratio
total = sum(performance)
# ax.text(x,y,text,color)
for i,val in enumerate(performance):
    ax.text(val - 1.5, i, str("{:.2%}".format(val/total)), color="w", fontsize=10)
plt.show()
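Applied to normalized value counts as in the question, the same loop might look like the sketch below. The answers Series is hypothetical sample data standing in for df[q1], and using PercentFormatter for the x axis ticks is a suggestion of this sketch, not something taken from the original answer.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs anywhere; drop for interactive use
import matplotlib.pyplot as plt
import pandas as pd
from matplotlib.ticker import PercentFormatter

# Hypothetical survey answers standing in for df[q1].
answers = pd.Series(["yes", "yes", "no", "yes"])
shares = answers.value_counts(normalize=True).sort_values()  # fractions between 0 and 1

fig, ax = plt.subplots(figsize=(8, 6))
ax.barh(shares.index, shares.values, color="#fd6a02", height=0.75)

# Show the x axis ticks as percentages; xmax=1.0 means a value of 1.0 is 100%.
ax.xaxis.set_major_formatter(PercentFormatter(xmax=1.0))

# One label per bar, centered inside the bar.
for i, share in enumerate(shares.values):
    ax.text(share / 2, i, "{:.1%}".format(share),
            color="w", ha="center", va="center", fontsize=10)
```

Each call to format receives a single number here, which avoids passing the whole Series as the question's loop did.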

rainbowtext() function and y axis label

Hey, I'm using the rainbow_text function, which can be found here,
in order to make the y axis label use colors that match the closest colors of the console names on the y axis.
So far I've come up with this code:
fig, ax= plt.subplots(figsize=(5,6)) #used to take care of the size
sns.barplot(x=gbyplat,y=gbyplat.index, palette='husl') #creating barplot
ax.set_ylabel('Publisher', color='deepskyblue', size=15, alpha=0.8) #setting labels
ax.set_xlabel('Number of titles published', color='slateblue', size=15, alpha=0.7)
ax.set_title('Titles per platform ranking', color='deeppink', size=17, alpha=0.6)
ax.set_xlim(0,2350) #setting limit for the plot
ax.set_xticks(np.arange(0, max(gbyplat), 250)) #ticks frequency
ax.annotate('newest', size=12, xy=(390, 13), xytext=(700, 13.3),
            arrowprops=dict(arrowstyle="fancy")) #annotations on plot
ax.annotate('max', size=9, xy=(2230,0.3), bbox=dict(boxstyle="round", fc="w", alpha=0.5))
ax.plot(2161,0, 'o', color='cyan') #creating the circle highlight for PS2 max
p = sns.color_palette("husl", len(gbyplat))
for i, label in enumerate(ax.get_yticklabels()):
    label.set_color(p[i])
rainbow_text(0,5, "Pub lis her".split(),
             [p[10],p[11],p[12]],
             size=10)
However, the issue is that I have to manually set the coordinates for the newly produced 'Publisher' label. According to the function's code, I can pass an ax argument, which would automatically fit the label to the y axis (if I understood correctly). So how can I do that? And a second question: is there a way to access the ylabel coordinates (of the current y axis label 'Publisher')?
Thanks
One can set the text at the position where the ylabel would normally reside by first drawing the ylabel, obtaining its coordinates and then setting it to an empty string. One can then adapt the example rainbow text function to use the obtained coordinates.
It will still be very tricky to select the colors and coordinates such that the text has exactly the color of the bars next to it. This probably involves a lot of trial and error.
import numpy as np
import matplotlib.pyplot as plt
from matplotlib import transforms
import seaborn as sns
l =list("ABCDEFGHIJK")
x = np.arange(1,len(l)+1)[::-1]
f, ax=plt.subplots(figsize=(7,4.5))
sns.barplot(x=x,y=l, palette='husl', ax=ax)
plt.xlabel('Number of titles published', color='slateblue', size=15, alpha=0.7)
p = sns.color_palette("husl", len(l))
for i, label in enumerate(ax.get_yticklabels()):
    label.set_color(p[i])

def rainbow_text(x, y, strings, colors, ax=None, **kw):
    if ax is None:
        ax = plt.gca()
    canvas = ax.figure.canvas
    lab = ax.set_ylabel("".join(strings))
    canvas.draw()
    labex = lab.get_window_extent()
    t = ax.transAxes
    labex_data = t.inverted().transform((labex.x0, labex.y0- labex.height/2.))
    ax.set_ylabel("")
    for s, c in zip(strings, colors):
        text = ax.text(labex_data[0]+x, labex_data[1]+y, s, color=c, transform=t,
                       rotation=90, va='bottom', ha='center', **kw)
        text.draw(canvas.get_renderer())
        ex = text.get_window_extent()
        t = transforms.offset_copy(text._transform, y=ex.height, units='dots')

rainbow_text(0, 0.06, ["Pub", "lish", "er"],[p[6], p[5],p[4]],size=15)
plt.show()
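The step that answers the second question, reading the position of the existing y label, can be isolated as follows. This is a minimal sketch of the same approach used inside rainbow_text above (draw, read the window extent, convert to axes coordinates); the variable names are hypothetical.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs anywhere; drop for interactive use
import matplotlib.pyplot as plt

fig, ax = plt.subplots()
label = ax.set_ylabel("Publisher")

# Text positions are only final after the figure has been drawn once.
fig.canvas.draw()
bbox = label.get_window_extent()  # bounding box in display (pixel) coordinates

# Convert the center of the box from pixels to axes coordinates,
# where (0, 0) is the lower-left and (1, 1) the upper-right corner of the axes.
to_axes = ax.transAxes.inverted()
x_center, y_center = to_axes.transform(
    ((bbox.x0 + bbox.x1) / 2, (bbox.y0 + bbox.y1) / 2)
)
```

A negative x_center means the label sits to the left of the axes, which is where replacement text would need to be placed with transform=ax.transAxes.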

Two Y axis Bar plot: custom xticks

I am trying to add custom xticks to a relatively complicated bar graph plot and I am stuck.
I am plotting from two data frames, merged_90 and merged_15:
merged_15
Volume y_err_x Area_2D y_err_y
TripDate
2015-09-22 1663.016032 199.507503 1581.591701 163.473202
merged_90
Volume y_err_x Area_2D y_err_y
TripDate
1990-06-10 1096.530711 197.377497 1531.651913 205.197493
I want to create a bar graph with two axes (i.e. Area_2D and Volume) where the Area_2D and Volume bars are grouped based on their respective data frame. An example script would look like:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import scipy
fig = plt.figure()
ax1 = fig.add_subplot(111)
merged_90.Volume.plot(ax=ax1, color='orange', kind='bar',position=2.5, yerr=merged_90['y_err_x'] ,use_index=False , width=0.1)
merged_15.Volume.plot(ax=ax1, color='red', kind='bar',position=0.9, yerr=merged_15['y_err_x'] ,use_index=False, width=0.1)
ax2 = ax1.twinx()
merged_90.Area_2D.plot(ax=ax2,color='green', kind='bar',position=3.5, yerr=merged_90['y_err_y'],use_index=False, width=0.1)
merged_15.Area_2D.plot(ax=ax2,color='blue', kind='bar',position=0, yerr=merged_15['y_err_y'],use_index=False, width=0.1)
ax1.set_xlim(-0.5,0.2)
x = scipy.arange(1)
ax2.set_xticks(x)
ax2.set_xticklabels(['2015'])
plt.tight_layout()
plt.show()
The resulting plot is:
One would think I could change:
x = scipy.arange(1)
ax2.set_xticks(x)
ax2.set_xticklabels(['2015'])
to
x = scipy.arange(2)
ax2.set_xticks(x)
ax2.set_xticklabels(['1990','2015'])
but that results in:
I would like to see the ticks ordered in chronological order (i.e. 1990,2015)
Thanks!
Have you considered dropping the second axis and plotting them as follows:
ind = np.array([0,0.3])
width = 0.1
fig, ax = plt.subplots()
# Each data frame has a single row, so take its scalar value with .iloc[0]
Rects1 = ax.bar(ind, [merged_90.Volume.iloc[0], merged_15.Volume.iloc[0]], color=['orange', 'red'], width=width)
Rects2 = ax.bar(ind + width, [merged_90.Area_2D.iloc[0], merged_15.Area_2D.iloc[0]], color=['green', 'blue'], width=width)
ax.set_xticks([.1,.4])
ax.set_xticklabels(('1990','2015'))
This produces:
I omitted the error bars, but you can easily add them. That would produce a readable graph given your test data. As you mentioned in the comments, you would still rather have two axes, presumably for different data with proper scales. To do this you could do:
fig = plt.figure()
ax1 = fig.add_subplot(111)
# Volume bars go on the left-hand axis
merged_90.Volume.plot(ax=ax1, color='orange', kind='bar', position=2.5, use_index=False, width=0.1)
merged_15.Volume.plot(ax=ax1, color='red', kind='bar', position=1.0, use_index=False, width=0.1)
ax2 = ax1.twinx()
# Area_2D bars go on the right-hand (twin) axis
merged_90.Area_2D.plot(ax=ax2, color='green', kind='bar', position=3.5, use_index=False, width=0.1)
merged_15.Area_2D.plot(ax=ax2, color='blue', kind='bar', position=0, use_index=False, width=0.1)
# Set the same x limits on both axes
ax1.set_xlim([-.45, .2])
ax2.set_xlim([-.45, .2])
ax1.set_xticks([-.35, 0])
ax1.set_xticklabels([1990, 2015])
This produces:
Your problem was that you reset the limits on just one axis and not the other. The two axes are created as twins, but they do not necessarily follow changes made to one another.
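As an alternative to the position argument of the pandas plot call, the grouped two-axis chart can also be built with plain matplotlib bars, which makes the tick positions explicit. The following is a sketch, not the answer's method: the numbers are copied from the two data frames in the question, and the list names and the bar width are assumptions.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs anywhere; drop for interactive use
import matplotlib.pyplot as plt
import numpy as np

# Values copied from merged_90 and merged_15 in the question, in chronological order.
years = ["1990", "2015"]
volume = [1096.530711, 1663.016032]
volume_err = [197.377497, 199.507503]
area = [1531.651913, 1581.591701]
area_err = [205.197493, 163.473202]

x = np.arange(len(years))  # one group per year at x = 0 and x = 1
width = 0.35

fig, ax1 = plt.subplots()
ax2 = ax1.twinx()  # second y axis that shares the x axis

# Volume on the left axis, Area_2D on the right axis, side by side within each group.
ax1.bar(x - width / 2, volume, width, yerr=volume_err, color="orange")
ax2.bar(x + width / 2, area, width, yerr=area_err, color="green")

ax1.set_xticks(x)
ax1.set_xticklabels(years)
ax1.set_ylabel("Volume")
ax2.set_ylabel("Area_2D")
fig.tight_layout()
```

Because each group is centered on an integer position, the ticks land in chronological order without any manual x limits.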

How to display all label values in matplotlib

I have two lists. When I plot them with the following code, the x axis only shows values up to 12 (the max is 15). How can I show all of the values in the x list on the x axis? Thanks in advance.
x = [4,5,6,7,8,9,10,11,12,13,14,15,0,1,2,3]
y = [10,20,30,40,50,60,70,80,90,100,110,120,130,140,150,160]
fig = plt.figure()
ax1 = fig.add_subplot(111)
ax1.plot(np.arange(len(x)), y, 'o')
ax1.set_xticklabels(x)
plt.show()
If I set minor=True in the set_xticklabels function, it shows me x=2,4,6,8,...,16, but I want ALL values.
P.S. My x list is not sorted; the labels should be displayed in the order given.
The issue here is that the number of ticks, which is set automatically, isn't the same as the number of points in your plot.
To resolve this, set the tick positions explicitly:
ax1.set_xticks(np.arange(len(x)))
before the ax1.set_xticklabels(x) call.
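Put together with the lists from the question, the fix might look like the sketch below. The positions name is an assumption, and the labels are converted to strings explicitly, which is a choice of this sketch.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs anywhere; drop for interactive use
import matplotlib.pyplot as plt
import numpy as np

x = [4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 0, 1, 2, 3]
y = [10, 20, 30, 40, 50, 60, 70, 80, 90, 100, 110, 120, 130, 140, 150, 160]

positions = np.arange(len(x))  # 0..15, one position per point

fig, ax1 = plt.subplots()
ax1.plot(positions, y, "o")

# First fix one tick per point, then label the ticks in the given (unsorted) order.
ax1.set_xticks(positions)
ax1.set_xticklabels([str(v) for v in x])
```

With one tick per position, the sixteen labels line up one-to-one with the points, including the unsorted values at the end.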
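Or, better, place a major tick at every integer with a locator (taken from other answers on SO). Note that the axes object is called ax here:
ax.xaxis.set_major_locator(ticker.MultipleLocator(1))
ax.yaxis.set_major_locator(ticker.MultipleLocator(1))
For example: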
from matplotlib import ticker
import matplotlib.pyplot as plt
import numpy as np
labels = [
    "tench",
    "English springer",
    "cassette player",
    "chain saw",
    "church",
    "French horn",
    "garbage truck",
    "gas pump",
    "golf ball",
    "parachute",
]
fig = plt.figure()
ax = fig.add_subplot(111)
plt.title('Confusion Matrix', fontsize=18)
data = np.random.random((10,10))
ax.matshow(data, cmap=plt.cm.Blues, alpha=0.7)
# The leading '' accounts for the extra tick placed before the first cell
ax.set_xticklabels([''] + labels,rotation=90)
ax.set_yticklabels([''] + labels)
ax.xaxis.set_major_locator(ticker.MultipleLocator(1))
ax.yaxis.set_major_locator(ticker.MultipleLocator(1))
for i in range(data.shape[0]):
    for j in range(data.shape[1]):
        # Random values in [0, 1) truncate to 0 here; real counts would show as integers
        ax.text(x=j, y=i,s=int(data[i, j]), va='center', ha='center', size='xx-small')
plt.xlabel('Predicted')
plt.ylabel('True')
plt.show()
