Data Visualization Problem With Bar Plot and Its Axes


I have a problem, which I will illustrate with a table and a plot.
0
MGROS 4.983566
SOKM 4.983566
BIMAS 4.983566
POLHO 4.043808
VESBE 2.722698
ARCLK 2.722698
VESTL 2.722698
HURGZ 2.125138
YATAS 2.030432
SELEC 1.986755
My dataframe looks like the one above, and the graph looks like the one below.
# creating the bar plot
br = plt.bar(df.index, df.values.squeeze(), color =colorLIST,
width = 0.9)
#for rect in br:
# height = rect.get_height()
# plt.text(rect.get_x() + rect.get_width() / 2.0, height, "%"+f'{height:.2f}', ha='center', va='bottom', color = "#003C5F", fontsize = 5.5)
plt.xlabel("Sembol", color = "#7F2A3C", fontsize = 20)
plt.ylabel("Getiri", color = "#7F2A3C", fontsize = 20)
plt.xticks(rotation = 90, fontsize = 20 )
labels = plt.gca().get_xticklabels()
for i in range(len(labels)):
    labels[i].set_color(colorLIST[i])
plt.title("Global Sektörler", color = "#7F2A3C", fontsize = 20)
ax.spines['top'].set_color('#C2B280')
ax.spines['right'].set_color('none')
ax.spines['left'].set_smart_bounds(True)
ax.spines['bottom'].set_smart_bounds(True)
ax.yaxis.set_major_formatter(mtick.PercentFormatter())
#ax.grid(zorder=0)
#ax.xaxis.grid()
minor_locator = AutoMinorLocator(2)
plt.gca().xaxis.set_minor_locator(minor_locator)
plt.grid(which='minor')
plt.savefig("peers_company.png", dpi = 100)
plt.show()
The code is above. I want tickers that share the same value to be drawn as a single bar. For example, MGROS, SOKM and BIMAS have the same value. How can I show them as one bar with a single xtick that lists all three names one under the other?

I solved this problem by rearranging my dataframe. Here is the code:
df['marker'] = (df[0] != df[0].shift()).cumsum()
df["TICKERS"] = df.index.tolist()
df = df.groupby('marker').agg({ 0: "first", "TICKERS": lambda x: list(x)})
df["VALUES"] = df[0].values
df.TICKERS = df.TICKERS.apply(lambda x : " ".join(x))
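To make the regrouping step easier to follow, here is a minimal, self-contained sketch of the same idea. The sample rows are a subset of the table in the question, and joining with "\n" instead of " " is an assumption made here so that the names stack one under the other in the tick label:

```python
import pandas as pd

# Sample data mirroring the question: one unnamed column (0), tickers as the index.
df = pd.DataFrame(
    {0: [4.983566, 4.983566, 4.983566, 4.043808, 2.722698, 2.722698]},
    index=["MGROS", "SOKM", "BIMAS", "POLHO", "VESBE", "ARCLK"],
)

# A new group starts whenever the value differs from the previous row.
df["marker"] = (df[0] != df[0].shift()).cumsum()
df["TICKERS"] = df.index

# Keep one value per group and join the tickers with newlines.
grouped = df.groupby("marker").agg({0: "first", "TICKERS": lambda s: "\n".join(s)})

tick_labels = grouped["TICKERS"].tolist()
bar_heights = grouped[0].tolist()
print(tick_labels)
print(bar_heights)
```

The resulting tick_labels and bar_heights can then be passed to plt.bar in place of df.index and df.values.squeeze().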

Related

Show dates in xticks only where value exist in plot chart of multiple dataframes

I have a matplotlib question about xticks. I want to hide all tick values that do not occur in the data. I managed to do it, but only for the second set of values (the red chart). I found how to hide ticks for a single dataframe, but not for two or more.
This is my code:
plt.subplots(figsize=(2, 1), dpi=400)
width = 0.005
xlim = np.arange(0, 1, 0.01)
ylim = np.arange(0, 0.1, 0.001)
plt.xticks(density_2.index.unique(), rotation=90, fontsize=1.5)
plt.yticks(density_2.unique(), fontsize=2)
plt.bar(density_1.index, density_1, width, color='Green', label=condition_1,alpha=0.5)
plt.bar(density_2.index, density_2, width, color='Red', label=condition_2,alpha=0.5)
plt.legend(loc="upper right", fontsize=2)
plt.show()
Link where I saw the solution: show dates in xticks only where value exist in plot chart and hide unnecessary interpolated xtick labels
Thank you very much in advance!

You need to find the intersection of the tick lists of density_1 and density_2, as reported here.
Working example:
import pandas as pd
import matplotlib.pyplot as plt
import numpy as np
N = 150
values_1 = np.random.randint(low = 5, high = 75, size = N)/100
density_1 = pd.DataFrame({'density_1': values_1})
density_1 = density_1.value_counts().sort_index(ascending = True)
density_1.index = sorted(list(set(values_1)), reverse = False)
values_2 = np.random.randint(low = 35, high = 100, size = N)/100
density_2 = pd.DataFrame({'density_2': values_2})
density_2 = density_2.value_counts().sort_index(ascending = True)
density_2.index = sorted(list(set(values_2)), reverse = False)
width = 0.005
condition_1 = 'Adele'
condition_2 = 'Extremoduro'
fig, ax = plt.subplots(figsize = (10, 5))
ax.bar(density_1.index, density_1, width, color = 'Green', label = condition_1, alpha = 0.5)
ax.bar(density_2.index, density_2, width, color = 'Red', label = condition_2, alpha = 0.5)
ax.legend(loc = 'upper right')
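ax.set_xticks(list(set(density_1.index.unique()) & set(density_2.index.unique())))
# set_xticks does not accept a rotation unless labels are passed, so rotate the labels separately
ax.tick_params(axis = 'x', labelrotation = 90)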
plt.show()
The expression
list(set(density_1.index.unique()) & set(density_2.index.unique()))
selects the ticks that belong to both density_1 and density_2.

Matplotlib Stacked Barchart - Adding Labels for the Total Value on each of my bars [duplicate]

I have a grouped bar chart and each bar is stacked.
I have annotated each section of the stack with its individual value and now I would like to sum those values and annotate the total value(height) of each bar. I would like this annotation to be on top of each bar.
This is one of the two dataframes I am working from:
df_title = pd.DataFrame(index=['F','M'],
data={'<10':[2.064897, 1.573255], '10-12':[3.933137, 4.326450], '13-17':[9.242871, 16.715831],
'18-24':[10.226155, 12.487709], '25-34':[8.161259, 10.717797], '35-44':[5.801377, 4.916421],
'45-54':[3.539823, 2.851524], '55+':[1.671583, 1.769912]})
I convert both dataframes (df_title and df_comps) into numpy arrays before plotting.
df_title_concat = np.concatenate((np.zeros((len,1)), df_title.T.values), axis=1)
Here is the full code:
df_title
df_comps
len = df_title.shape[1]
df_title_concat = np.concatenate((np.zeros((len,1)), df_title.T.values), axis=1)
df_comps_concat = np.concatenate((np.zeros((len,1)), df_comps.T.values), axis=1)
fig = plt.figure(figsize=(20,10))
ax = plt.subplot()
title_colors = ['skyblue', 'royalblue']
comps_colors = ['lightgoldenrodyellow', 'orange']
for i in range(1,3):
    for j in list(range(0, df_title.shape[1]-1)):
        j += 1
        ax_1 = ax.bar(j, df_title_concat[j,i], width=-0.4, bottom=np.sum(df_title_concat[j,:i]), color = title_colors[i-1],
                      edgecolor='black', linewidth=3, align='edge')
        for p in ax_1.patches:
            width, height = p.get_width(), p.get_height()
            x, y = p.get_xy()
            if height > 2:
                ax.annotate('{:.2f}%'.format(height), (p.get_x()+0.875*width, p.get_y()+.4*height),
                            fontsize=16, fontweight='bold', color='black')
        ax_2 = ax.bar(j, df_comps_concat[j,i], width=0.4, bottom=np.sum(df_comps_concat[j,:i]), color = comps_colors[i-1],
                      edgecolor='black', linewidth=3, align='edge')
        for p in ax_2.patches:
            width, height = p.get_width(), p.get_height()
            x, y = p.get_xy()
            if height > 2:
                ax.annotate('{:.2f}%'.format(height), (p.get_x()+0.15*width, p.get_y()+.4*height),
                            fontsize=16, fontweight='bold', color='black')

Here is a solution:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
# the second '18-24' key in the question is assumed to be a typo for '25-34'
df_title = pd.DataFrame(index=['F','M'],
                        data={'<10':[2.064897, 1.573255], '10-12':[3.933137, 4.326450], '13-17':[9.242871, 16.715831],
                              '18-24':[10.226155, 12.487709], '25-34':[8.161259, 10.717797], '35-44':[5.801377, 4.916421],
                              '45-54':[3.539823, 2.851524], '55+':[1.671583, 1.769912]})
# one row of zeros per age group, so the shapes match df_title.T
df_title_concat = np.concatenate((np.zeros((df_title.shape[1],1)), df_title.T.values), axis=1)
fig = plt.figure(figsize=(12,8))
ax = plt.subplot()
title_colors = ['skyblue', 'royalblue']
for i in range(1,3):
    for j in list(range(0, df_title.shape[1]-1)):
        j += 1
        bottom=np.sum(df_title_concat[j,:i])
        ax_1 = ax.bar(j, df_title_concat[j,i], width=-0.4, bottom=bottom, color = title_colors[i-1],
                      edgecolor='black', linewidth=3, align='edge')
        for p in ax_1.patches:
            width, height = p.get_width(), p.get_height()
            # only the top segment (non-zero bottom) gets the total label
            if bottom != 0:
                ax.annotate('{:.2f}%'.format(height+bottom), (p.get_x()+0.875*width, (height+bottom)+0.3),
                            fontsize=16, fontweight='bold', color='black')
However, I would suggest rethinking the whole approach and changing the plot to something like:
plt.bar(df_title.columns,df_title.loc['M'])
plt.bar(df_title.columns,df_title.loc['F'],bottom=df_title.loc['M'])
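Building on the simpler two-call approach, the totals can be labelled from the column sums. The following is only a sketch: it uses three of the age groups from the question, rounded for brevity, and the label offset and colors are arbitrary choices:

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
import pandas as pd

# Three of the age groups from the question, rounded for brevity.
df_title = pd.DataFrame(
    index=["F", "M"],
    data={"<10": [2.06, 1.57], "10-12": [3.93, 4.33], "13-17": [9.24, 16.72]},
)

fig, ax = plt.subplots(figsize=(8, 5))
ax.bar(df_title.columns, df_title.loc["M"], label="M", color="royalblue")
ax.bar(df_title.columns, df_title.loc["F"], bottom=df_title.loc["M"],
       label="F", color="skyblue")

# The total height of each stack is the column sum.
totals = df_title.sum(axis=0)
for x_pos, total in enumerate(totals):
    # Categorical bars sit at x = 0, 1, 2, ...; place the label just above each stack.
    ax.annotate(f"{total:.2f}%", (x_pos, total), ha="center", va="bottom",
                xytext=(0, 3), textcoords="offset points")

ax.legend()
```

Call plt.show() or fig.savefig(...) afterwards to view the result.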

How to Rotate Count Plot In Seaborn?

plt.figure(figsize = (12, 8))
sns.set(style = 'dark', palette = 'colorblind', color_codes = True)
ax = sns.countplot('Position', data = data, color = 'orange')
ax.set_xlabel(xlabel = 'Different Positions in Football', fontsize = 16)
ax.set_ylabel(ylabel = 'Number of of Players', fontsize = 16)
ax.set_title(label = 'Comparison of Positions and Players', fontsize = 20)
plt.show()
After executing this code, the x-axis labels overlap.
Is there any way to rotate the plot to prevent the overlap?

Instead of using
ax = sns.countplot('Position', data = data, color = 'orange')
where 'Position' is passed as x, try passing it as y, like this:
ax = sns.countplot(y='Position', data = data, color = 'orange')
The rest of the code remains the same.
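If the bars need to stay vertical, rotating the tick labels is another option. This is a minimal sketch using plain matplotlib rather than seaborn; the data frame below is a made-up stand-in for the question's dataset:

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical stand-in for the question's `data` frame.
data = pd.DataFrame({"Position": ["GK", "ST", "ST", "CB", "CB", "CB", "LW"]})

# Count how many players occupy each position.
counts = data["Position"].value_counts()

fig, ax = plt.subplots(figsize=(12, 8))
ax.bar(counts.index, counts.values, color="orange")

# Rotate the x tick labels so long names do not run into each other.
ax.tick_params(axis="x", labelrotation=90)
ax.set_xlabel("Different Positions in Football", fontsize=16)
ax.set_ylabel("Number of Players", fontsize=16)
fig.tight_layout()
```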

Using Hlines ruins legends in Matplotlib

I'm struggling to fix my plot legend after adding an axhline/hline at the 100 level of the graph (screenshot added).
Is there a way to do this correctly so that no information is lost in the legend, and perhaps to add another hline and include it in the legend as well?
I'm adding the code here; maybe I'm not writing it properly.
fig, ax1 = plt.subplots(figsize = (9,6),sharex=True)
BundleFc_Outcome['Spend'].plot(kind = 'bar',color = 'blue',width = 0.4, ax = ax1,position = 1)
#
# Make the y-axis label, ticks and tick labels match the line color.
ax1.set_ylabel('SPEND', color='b', size = 18)
ax1.set_xlabel('Bundle FC',color='w',size = 18)
ax2 = ax1.twinx()
ax2.set_ylabel('ROAS', color='r',size = 18)
ax1.tick_params(axis='x', colors='w',size = 20)
ax2.tick_params(axis = 'y', colors='w',size = 20)
ax1.tick_params(axis = 'y', colors='w',size = 20)
#ax1.text()
#
ax2.axhline(100)
BundleFc_Outcome['ROAS'].plot(kind = 'bar',color = 'red',width = 0.4, ax = ax2,position = 0.25)
plt.grid()
#ax2.set_ylim(0, 4000)
ax2.set_ylim(0,300)
plt.title('ROAS & SPEND By Bundle FC',color = 'w',size= 20)
plt.legend([ax2,ax1],labels = ['SPEND','ROAS'],loc = 0)
The code gives me the following picture:
After implementing the suggestion in the comments, the picture looks like this (does not solve the problem):

You can use the bbox_to_anchor argument to set the legend location manually.
ax1.legend([ax1],labels = ['SPEND'],loc='upper right', bbox_to_anchor=(1.25,0.70))
plt.legend([ax2,ax1],labels = ['SPEND','ROAS'],loc='upper right', bbox_to_anchor=(1.25,0.70))
https://matplotlib.org/users/legend_guide.html#legend-location

I finally figured it out; it turned out to be simpler than I expected.
I even managed to add another threshold at level 2 for minimum spend.
fig, ax1 = plt.subplots(figsize = (9,6),sharex=True)
BundleFc_Outcome['Spend'].plot(kind = 'bar',color = 'blue',width = 0.4, ax = ax1,position = 1)
#
# Make the y-axis label, ticks and tick labels match the line color.
ax1.set_ylabel('SPEND', color='b', size = 18)
ax1.set_xlabel('Region',color='w',size = 18)
ax2 = ax1.twinx()
ax2.set_ylabel('ROAS', color='r',size = 18)
ax1.tick_params(axis='x', colors='w',size = 20)
ax2.tick_params(axis = 'y', colors='w',size = 20)
ax1.tick_params(axis = 'y', colors='w',size = 20)
#ax1.text()
#
BundleFc_Outcome['ROAS'].plot(kind = 'bar',color = 'red',width = 0.4, ax = ax2,position = 0.25)
plt.grid()
#ax2.set_ylim(0, 4000)
ax2.set_ylim(0,300)
plt.title('ROAS & SPEND By Region',color = 'w',size= 20)
fig.legend([ax2,ax1],labels = ['SPEND','ROAS'],loc = 0)
plt.hlines([100,20],xmin = 0,xmax = 8,color= ['r','b'])

I don't recommend using the built-in plotting functions of pandas for more complex plots. Also, when asking a question, it is common courtesy to provide a minimal and verifiable example (see here). I took the liberty of simulating your problem.
Because the bars live on two different axes, we need to generate our own legend, which can be achieved with:
import matplotlib.pyplot as plt, pandas as pd, numpy as np
from matplotlib.lines import Line2D
# generate dummy data.
X = np.random.rand(10, 2)
X[:,1] *= 1000
x = np.arange(X.shape[0]) * 2 # xticks
df = pd.DataFrame(X, columns = 'Spend Roast'.split())
# end dummy data
fig, ax1 = plt.subplots(figsize = (9,6),sharex=True)
ax2 = ax1.twinx()
# tmp axes
axes = [ax1, ax2] # setup axes
colors = plt.cm.tab20(x)
width = .5 # bar width
# generate dummy legend
elements = []
# plot data
for idx, col in enumerate(df.columns):
    tax = axes[idx]
    tax.bar(x + idx * width, df[col], label = col, width = width, color = colors[idx])
    # Line2D lives in matplotlib.lines, not on the Axes object
    element = Line2D([0], [0], color = colors[idx], label = col) # setup dummy label
    elements.append(element)
# desired hline (tax is now the second axis)
tax.axhline(200, color = 'red')
tax.set(xlabel = 'Bundle FC', ylabel = 'ROAST')
axes[0].set_ylabel('SPEND')
tax.legend(handles = elements)
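An alternative to building dummy handles is to give every artist a label and merge the handles that each axis reports. This is a sketch only; the numbers below are invented stand-ins for BundleFc_Outcome, and the threshold label is an arbitrary name:

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np

# Made-up numbers standing in for BundleFc_Outcome.
x = np.arange(5)
spend = np.array([120, 80, 150, 60, 90])
roas = np.array([140, 95, 210, 70, 110])

fig, ax1 = plt.subplots(figsize=(9, 6))
ax2 = ax1.twinx()

width = 0.4
ax1.bar(x - width / 2, spend, width, color="blue", label="SPEND")
ax2.bar(x + width / 2, roas, width, color="red", label="ROAS")
ax2.axhline(100, color="black", linestyle="--", label="ROAS threshold")

# Each Axes only knows its own artists, so collect handles from both
# and hand the combined lists to a single legend.
h1, l1 = ax1.get_legend_handles_labels()
h2, l2 = ax2.get_legend_handles_labels()
ax2.legend(h1 + h2, l1 + l2, loc="upper right")
```

Because the horizontal line carries its own label, it shows up in the legend without replacing the bar entries.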

Bar graph doesn't fill the Axis

I'm trying to make a stacked bar chart of a list of a variable number of "Accumulators", which have a person's name and three percentages which always add up to 100. But when I have a large number of entries in the list, all the bars are crowded to the left side of the graph.
Here's the code:
per_unreviewed = np.array([p.accum_per_unreviewed for p in accumulators])
per_reviewed = np.array([p.accum_per_reviewed for p in accumulators])
per_signed_off = np.array([p.accum_per_signed_off for p in accumulators])
fig = Figure(facecolor="w", figsize=(15, 7))
ax = fig.add_subplot(111)
ind = np.arange(len(accumulators))
logger.debug("len(acc) = %d, ind = %s", len(accumulators), ind)
width = 0.45
p1 = ax.bar(ind, per_signed_off, width, color="g")
p2 = ax.bar(ind, per_reviewed, width, color="b", bottom=per_signed_off)
p3 = ax.bar(ind, per_unreviewed, width, color="r",
bottom=per_signed_off + per_reviewed)
ax.set_title(title)
ax.set_ylabel("Percent by status")
ax.set_yticks(np.arange(0, 101, 20))
ax.set_xticks(ind + width / 2.0)
ax.set_xticklabels(
[p.person for p in accumulators],
rotation='vertical', clip_on=False)
fig.tight_layout()
box = ax.get_position()
ax.set_position([box.x0, box.y0, box.width * 0.7, box.height])
if (len(p1) > 0 or len(p2) > 0 or len(p3) > 0):
    ax.legend(
        (p1[0], p2[0], p3[0]),
        ('Signed Off', 'Reviewed', 'Unreviewed'),
        loc="upper left", bbox_to_anchor=(1.05, 1), borderaxespad=0
    )
canvas = FigureCanvas(fig)
outstr = StringIO.StringIO()
canvas.print_png(outstr)
And the result

Have you tried playing with the x axis range? You have the ticks and a figure size, but nothing that tells the plot the range of x.
I don't use subplots myself, but is there something like ax.set_xlim([]) or ax.xlim() that does this?
Update from Paul Tomblin: I tried those suggestions and they didn't help, but they did point me to the right idea:
ax.set_xbound(lower=0, upper=len(accumulators))
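A small, self-contained sketch of that fix follows. The percentages are made up, the count of 40 accumulators is hypothetical, and the half-unit padding assumes a recent matplotlib in which bars are centered on their x positions (the original code appears to predate that default):

```python
import numpy as np
from matplotlib.figure import Figure
from matplotlib.backends.backend_agg import FigureCanvasAgg

n = 40  # hypothetical number of accumulators
ind = np.arange(n)
signed_off = np.full(n, 50.0)
reviewed = np.full(n, 30.0)
unreviewed = np.full(n, 20.0)

fig = Figure(facecolor="w", figsize=(15, 7))
FigureCanvasAgg(fig)  # attach a canvas so the figure can be rendered off-screen
ax = fig.add_subplot(111)

width = 0.45
ax.bar(ind, signed_off, width, color="g")
ax.bar(ind, reviewed, width, color="b", bottom=signed_off)
ax.bar(ind, unreviewed, width, color="r", bottom=signed_off + reviewed)

# Pin the x range to the data so the bars span the whole Axes.
ax.set_xbound(lower=-0.5, upper=n - 0.5)
```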
