Overflow x label MatPlotLib - python

I have this bar graph, but the long x labels keep overflowing into the neighboring labels. Is there a way to create more space or force a line break when this happens?
Below is the part of the code that produces the graph:
import pandas as pd
import matplotlib.pyplot as plt
ax = tweets_df.plot(kind='bar', x='name', y='tweet_volume', fontsize=7, width=.5)
ax.set_xlabel('Hashtag')
ax.set_ylabel('Tweets w/ Hashtag')
plt.xticks(rotation='horizontal')
plt.show()

IMHO you can use rotation=90 instead of rotation='horizontal'. Alternatively, if you want to keep the labels horizontal, you can truncate each label to its first N characters:
import pandas as pd
import matplotlib.pyplot as plt
N = 5
ax = tweets_df.plot(kind='bar', x='name', y='tweet_volume', fontsize=7, width=.5)
ax.set_xlabel('Hashtag')
ax.set_ylabel('Tweets w/ Hashtag')
plt.xticks(rotation='horizontal')
labels = [item.get_text() for item in ax.get_xticklabels()]
ax.set_xticklabels([label[:N] for label in labels])
plt.show()
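Since the question also asks about line breaks, here is a minimal sketch of wrapping the labels instead of truncating them. It uses textwrap.fill from the standard library; the names and volumes lists are made-up stand-ins for tweets_df['name'] and tweets_df['tweet_volume'], and wrap_labels is a hypothetical helper, not part of matplotlib.

```python
import textwrap

import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs anywhere; omit in a notebook
import matplotlib.pyplot as plt

# Made-up data standing in for tweets_df['name'] and tweets_df['tweet_volume']
names = ["#ShortTag", "#AVeryLongTrendingHashtagName", "two word", "#AnotherQuiteLongHashtag"]
volumes = [120, 340, 90, 210]


def wrap_labels(labels, width=12):
    """Insert line breaks so that no line of a label exceeds `width` characters."""
    # textwrap.fill breaks at spaces where it can and splits over-long words otherwise
    return [textwrap.fill(label, width=width) for label in labels]


wrapped = wrap_labels(names, width=12)

fig, ax = plt.subplots()
positions = list(range(len(names)))
ax.bar(positions, volumes, width=0.5)
ax.set_xticks(positions)
ax.set_xticklabels(wrapped, rotation="horizontal", fontsize=7)
ax.set_xlabel("Hashtag")
ax.set_ylabel("Tweets w/ Hashtag")
fig.tight_layout()  # make room for the extra label lines
```

The width value is a guess; it probably needs tuning against the bar width and font size of the real plot.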

Related

How to align text with ylabel in matplotlib?

I want to add a text "a)", "b)" and "c)" in subfigures and align it with the yaxis label.
Let's say we have a simple plot whose y-axis tick labels vary in width ("0.8", "0.08", "0.016"). The wider the tick labels, the larger the distance between x=0 in ax[i].transAxes coordinates and the ylabel.
import matplotlib.pyplot as plt
import numpy as np
x=np.arange(0,1,0.01)
y=np.sin(x)
fig,ax=plt.subplots(1,3,figsize=(6,4))
ax[0].plot(x,y)
ax[1].plot(x,y*0.1)
ax[2].plot(x,y*0.02)
for i in range(3):
    ax[i].spines['top'].set_visible(False)
    ax[i].spines['right'].set_visible(False)
    ax[i].set_ylabel('Sine')
    ax[i].set_xlabel('x')
plt.tight_layout()
abc='abc'
for i in range(3):
    ax[i].text(0,1,abc[i]+')',transform=ax[i].transAxes)
plt.show()
Currently I am trying to find the (x,y) position by trial and error to right align "a)", "b)" or "c)" with the ylabel. Is there a better way to do this?
If I try to get the position of ylabel using,
ylbl=ax[i].set_ylabel('Sine')
print(ylbl.get_position())
I get (0,0.5), which is not really helpful.
Even more bizarre is what happens when I call tight_layout:
ylbls=[]
for i in range(3):
    ylbls.append(ax[i].set_ylabel('Sine'))
plt.tight_layout()
for i in range(3):
    print(ylbls[i].get_position())
I get the values (37.7,0.5), (201.8,0.5), (365.9,0.5). I don't know what 37.7, 201.8, and 365.9 mean, or whether I can use them to align my text with the ylabel.
First, get the bounding box of the y-label text, and then use a blended transform: the x coordinate comes from the y-label's bounding box (in display pixels) and the y coordinate is in ax[i].transAxes coordinates.
import matplotlib.pyplot as plt
import numpy as np
from matplotlib.transforms import IdentityTransform
import matplotlib.transforms as transforms
x=np.arange(0,1,0.01)
y=np.sin(x)
fig,ax=plt.subplots(1,3,figsize=(7,5))
ax[0].plot(x,y)
ax[1].plot(x,y*0.1)
ax[2].plot(x,y*0.02)
for i in range(3):
    ax[i].spines['top'].set_visible(False)
    ax[i].spines['right'].set_visible(False)
    ax[i].set_ylabel('Sine')
    ax[i].set_xlabel('x')
plt.tight_layout()
# Draw once so the label positions are final before measuring them
fig.canvas.draw()
abc = r'abc'
for i in range(3):
    iax = ax[i]
    # x is interpreted in display pixels (identity), y in axes coordinates
    trans = transforms.blended_transform_factory(IdentityTransform(), iax.transAxes)
    bb = iax.yaxis.label.get_window_extent()
    iax.text(bb.x0-3,1.01,abc[i]+')',ha='left',fontsize=14,transform=trans)
plt.savefig('output_text.png',dpi=300)
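On the question of what 37.7, 201.8, and 365.9 mean: the x coordinate of the y-label uses an identity transform, so after a layout pass those numbers are display (pixel) coordinates. A variation on the answer above, sketched here under the assumption that a single subplot is enough to show the idea, converts the label's bounding box from display to axes coordinates so the text can be placed with the ordinary ax.transAxes transform.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs anywhere; omit in a notebook
import matplotlib.pyplot as plt

fig, ax = plt.subplots(figsize=(4, 3))
ax.plot([0, 1], [0, 0.02])
ylabel = ax.set_ylabel("Sine")
fig.tight_layout()
fig.canvas.draw()  # label positions are only final after a draw

# Bounding box of the y-label in display (pixel) coordinates
bbox_display = ylabel.get_window_extent()

# Same box expressed in axes coordinates, where (0, 0) is the lower-left
# corner of the axes and (1, 1) the upper-right corner
bbox_axes = bbox_display.transformed(ax.transAxes.inverted())

# Left-align the panel letter with the left edge of the y-label
ax.text(bbox_axes.x0, 1.02, "a)", ha="left", va="bottom", transform=ax.transAxes)
```

Because the resulting position is relative to the axes, it should not depend on the dpi passed to savefig, whereas a raw pixel offset measured at one dpi may end up misplaced at another. The box still needs to be recomputed if the layout or tick labels change afterwards.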

How can I change the labels in this pie chart [plotly]?

I want to change the labels [2,3,4,5] from my pie chart and instead have them say [Boomer, Gen X, Gen Y, Gen Z] respectively. I can't seem to find a direct way of doing this without changing the dataframe. Is there any way to do this by working through the code I have?
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
%matplotlib inline
data = df.groupby("Q10_Ans")["Q4_Agree"].count()
pie, ax = plt.subplots(figsize=[10,6])
labels = data.keys()
plt.pie(x=data, autopct="%.1f%%", explode=[0.05]*4, labels=labels, pctdistance=0.5)
plt.title("Generations that agree data visualization will help with job prospects", fontsize=14);
pie.savefig("DeliveryPieChart.png")
How about changing the line
labels = data.keys()
to
labels = ['Boomer','Gen X','Gen Y','Gen Z']
I don't know the structure of your data, so I made some sample data and created a pie chart from it. Please adapt your code accordingly.
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
%matplotlib inline
# data = df.groupby("Q10_Ans")["Q4_Agree"].count()
data = pd.DataFrame({'Q10_Ans':['Boomer','Gen X','Gen Y','Gen Z'],'Q4_Agree':[2,3,4,5]})
fig, ax = plt.subplots(figsize=[10,6])
labels = data['Q10_Ans']
ax.pie(x=data['Q4_Agree'], autopct="%.1f%%", explode=[0.05]*4, labels=labels, pctdistance=0.5)
ax.set_title("Generations that agree data visualization will help with job prospects", fontsize=14);
plt.savefig("DeliveryPieChart.png")

How to make horizontal linechart with categorical variables and timeseries?

I want to replicate plots from this paper: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5000555/pdf/nihms774453.pdf I'm particularly interested in plot on page 16, right panel. I tried to do this in matplotlib but it seems to me that there is no way to access lines in linecollection.
I don't know how to change the color along each line according to the value at every index. I'd eventually like to get something like this: https://matplotlib.org/3.1.1/gallery/lines_bars_and_markers/multicolored_line.html but for every line, colored according to the data.
This is what I tried:
the data in numpy array: https://pastebin.com/B1wJu9Nd
import pandas as pd, numpy as np
import matplotlib.pyplot as plt
from matplotlib.collections import LineCollection
from matplotlib import colors as mcolors
%matplotlib inline
base_range = np.arange(qq.index.max()+1)
fig, ax = plt.subplots(figsize=(12,8))
ax.set_xlim(qq.index.min(), qq.index.max())
# ax.set_ylim(qq.columns[0], qq.columns[-1])
ax.set_ylim(-5, len(qq.columns) +5)
line_segments = LineCollection([np.column_stack([base_range, [y]*len(qq.index)]) for y in range(len(qq.columns))],
                               cmap='viridis',
                               linewidths=(5),
                               linestyles='solid',
                               )
line_segments.set_array(base_range)
ax.add_collection(line_segments)
axcb = fig.colorbar(line_segments)
plt.show()
my result:
what I want to achieve:
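One possible approach, sketched with synthetic data because the pastebin array is not reproduced here: a LineCollection gives each line it contains a single color, so to vary the color along a row, each row has to be split into short segments, one per time step, with one data value per segment. The values array and the row_segments helper below are hypothetical stand-ins, with values playing the role of qq.values (rows are time steps, columns are categories).

```python
import numpy as np
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs anywhere; omit in a notebook
import matplotlib.pyplot as plt
from matplotlib.collections import LineCollection

rng = np.random.default_rng(0)
n_time, n_series = 50, 6
values = rng.random((n_time, n_series))  # synthetic stand-in for qq.values


def row_segments(values):
    """Split every horizontal row into per-time-step segments.

    Returns an array of shape (n_series * (n_time - 1), 2, 2) holding the
    segment end points, and a flat array with one color value per segment.
    """
    n_time, n_series = values.shape
    t = np.arange(n_time)
    segments, colors = [], []
    for row in range(n_series):
        # Points (t, row) along a horizontal line at height `row`
        points = np.column_stack([t, np.full(n_time, row)])
        # Pair each point with its successor to form short segments
        segments.append(np.stack([points[:-1], points[1:]], axis=1))
        # Color each segment by the value at its starting time step
        colors.append(values[:-1, row])
    return np.concatenate(segments), np.concatenate(colors)


segments, colors = row_segments(values)

fig, ax = plt.subplots(figsize=(12, 8))
line_segments = LineCollection(segments, cmap="viridis", linewidths=5)
line_segments.set_array(colors)  # one value per segment, mapped through the colormap
ax.add_collection(line_segments)
ax.set_xlim(0, n_time - 1)
ax.set_ylim(-1, n_series)
fig.colorbar(line_segments, ax=ax)
```

Coloring by the value at the start of each segment is an arbitrary choice; averaging the two end-point values would work just as well.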

how to reduce y-axis in matplot with same distance

I want this plot's y-axis to be centered at 38, and the y-axis scaled such that the 'humps' disappear. How do I accomplish this?
import matplotlib
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
s=['05/02/2019', '06/02/2019', '07/02/2019', '08/02/2019',
   '09/02/2019', '10/02/2019', '11/02/2019', '12/02/2019',
   '13/02/2019', '20/02/2019', '21/02/2019', '22/02/2019',
   '23/02/2019', '24/02/2019', '25/02/2019']
df[0]=['38.02', '33.79', '34.73', '36.47', '35.03', '33.45',
       '33.82', '33.38', '34.68', '36.93', '33.44', '33.55',
       '33.18', '33.07', '33.17']
# Data for plotting
fig, ax = plt.subplots(figsize=(17, 2))
for i,j in zip(s,df[0]):
    ax.annotate(str(j),xy=(i,j+0.8))
ax.plot(s, df[0])
ax.set(xlabel='Dates', ylabel='Latency',
       title='Hongkong to sing')
ax.grid()
#plt.yticks(np.arange(min(df[p]), max(df[p])+1, 2))
fig.savefig("test.png")
plt.show()
I'm not entirely certain this is what you're looking for, but you can set the y-limits explicitly to change the scale, e.g.
ax.set_ylim([ax.get_ylim()[0], 42])
This sets only the upper bound and leaves the lower limit unchanged. You can supply any values you find appropriate, e.g.
ax.set_ylim([22, 52])
With a wider range like this, the variation in the data takes up less of the plot height. Note that the values must be numeric for the limits to behave as expected, which is why the complete code converts the strings with pd.to_numeric.
Also note that the tick labels and general appearance of your plot may differ from mine.
Edit - Here is the complete code as requested:
import matplotlib
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
df = pd.DataFrame()
s=['05/02/2019', '06/02/2019', '07/02/2019', '08/02/2019',
   '09/02/2019', '10/02/2019', '11/02/2019', '12/02/2019',
   '13/02/2019', '20/02/2019', '21/02/2019', '22/02/2019',
   '23/02/2019', '24/02/2019', '25/02/2019']
df[0]=['38.02','33.79','34.73','36.47','35.03','33.45',
       '33.82','33.38','34.68','36.93','33.44','33.55',
       '33.18','33.07','33.17']
# Data for plotting
fig, ax = plt.subplots(figsize=(17, 3))
#for i,j in zip(s,df[0]):
#    ax.annotate(str(j),xy=(i,j+0.8))
# Convert the strings to numbers so the y-axis is numeric rather than categorical
ax.plot(s, pd.to_numeric(df[0]))
ax.set(xlabel='Dates', ylabel='Latency',
       title='Hongkong to sing')
# The dates are day/month/year, so state the format explicitly when parsing
ax.set_xticklabels(pd.to_datetime(s, format='%d/%m/%Y').strftime('%m.%d'), rotation=45)
ax.set_ylim([22, 52])
plt.show()
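To center the axis on 38 rather than picking limits by hand, the limits can be computed symmetrically around the chosen center. This is only a sketch: centered_ylim is a hypothetical helper and the pad value is arbitrary; the latency numbers are the ones from the question.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs anywhere; omit in a notebook
import matplotlib.pyplot as plt
import pandas as pd

latency = pd.to_numeric(pd.Series(
    ['38.02', '33.79', '34.73', '36.47', '35.03', '33.45',
     '33.82', '33.38', '34.68', '36.93', '33.44', '33.55',
     '33.18', '33.07', '33.17']))


def centered_ylim(values, center, pad=1.0):
    """Return (lower, upper) limits symmetric around `center` that contain all values."""
    # Largest distance of any value from the center, plus some padding
    half_range = max(abs(values.max() - center), abs(values.min() - center)) + pad
    return center - half_range, center + half_range


lower, upper = centered_ylim(latency, center=38, pad=1.0)

fig, ax = plt.subplots(figsize=(17, 3))
ax.plot(range(len(latency)), latency)
ax.set_ylim(lower, upper)
```

A larger pad flattens the curve further, since the same variation then occupies a smaller share of the axis.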

How to customize a scatter matrix to see all titles?

I'm running this code to build a scatter matrix. The problem is that the plot looks like a mess, because it's impossible to see the names of variables (see image below). Is there any way to change the orientation of titles and switch off the ticks with numbers?
import pandas as pd
import matplotlib.pyplot as plt
train = pd.read_csv('data/train.csv', parse_dates=[0])
plt.figure()
a = pd.scatter_matrix(train, alpha=0.05, figsize=(10,10), diagonal='hist')
plt.show()
As a minimal scatter_matrix example to switch off axis ticks and rotate the labels,
import numpy as np
import matplotlib.pyplot as plt
import pandas as pd
try:
    from pandas.tools.plotting import scatter_matrix
except ImportError:
    # Fix suggested by @Raimundo Jimenez, as pandas.tools is deprecated
    from pandas.plotting import scatter_matrix
df = pd.DataFrame(np.random.randn(1000, 4), columns=['long label', 'testing', 'another label', 'something else'])
sm = scatter_matrix(df, alpha=0.2, figsize=(6, 6), diagonal='kde')
#Change label rotation
[s.xaxis.label.set_rotation(45) for s in sm.reshape(-1)]
[s.yaxis.label.set_rotation(0) for s in sm.reshape(-1)]
#May need to offset label when rotating to prevent overlap of figure
[s.get_yaxis().set_label_coords(-0.3,0.5) for s in sm.reshape(-1)]
#Hide all ticks
[s.set_xticks(()) for s in sm.reshape(-1)]
[s.set_yticks(()) for s in sm.reshape(-1)]
plt.show()
Similarly, you can adjust labels, resize, etc. using any of the axes objects in the array returned by scatter_matrix.
pandas.tools.plotting.scatter_matrix is now deprecated. Use pandas.plotting.scatter_matrix instead.
Updated code from the one proposed by Ed Smith (@ed-smith):
import numpy as np
import matplotlib.pyplot as plt
import pandas as pd
df = pd.DataFrame(np.random.randn(1000, 4), columns=['long label', 'testing', 'another label', 'something else'])
sm = pd.plotting.scatter_matrix(df, alpha=0.2, figsize=(6, 6), diagonal='kde')
#Change label rotation
[s.xaxis.label.set_rotation(45) for s in sm.reshape(-1)]
[s.yaxis.label.set_rotation(0) for s in sm.reshape(-1)]
#May need to offset label when rotating to prevent overlap of figure
[s.get_yaxis().set_label_coords(-0.3,0.5) for s in sm.reshape(-1)]
#Hide all ticks
[s.set_xticks(()) for s in sm.reshape(-1)]
[s.set_yticks(()) for s in sm.reshape(-1)]
plt.show()
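The same adjustments can also be written as a plain loop, which avoids building throwaway lists with comprehensions. The sketch below makes two assumptions that differ from the answers above: it uses diagonal='hist' so that SciPy is not required, and it right-aligns the rotated labels instead of calling set_label_coords, which may or may not look better depending on the label lengths.

```python
import numpy as np
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs anywhere; omit in a notebook
import matplotlib.pyplot as plt
import pandas as pd
from pandas.plotting import scatter_matrix

rng = np.random.default_rng(0)
df = pd.DataFrame(rng.standard_normal((200, 4)),
                  columns=['long label', 'testing', 'another label', 'something else'])

# scatter_matrix returns a 2D NumPy array of Axes objects
axes = scatter_matrix(df, alpha=0.2, figsize=(6, 6), diagonal='hist')

for ax in axes.ravel():
    # Rotate the variable names and anchor them at their right edge,
    # so the text extends away from the plot instead of over it
    ax.xaxis.label.set_rotation(45)
    ax.xaxis.label.set_ha('right')
    ax.yaxis.label.set_rotation(0)
    ax.yaxis.label.set_ha('right')
    # Hide all ticks (and with them the tick numbers)
    ax.set_xticks([])
    ax.set_yticks([])
```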
