I'm plotting a Seaborn heatmap and I want to center the y-axis tick labels, but can't find a way to do this. 'va' text property doesn't seem to be available on yticks().
Considering the following image
I'd like to align the days of the week to the center of the row of squares
Code to generate this graph:
import pandas as pd
import seaborn as sns
import numpy as np
import matplotlib.pyplot as plt
#Generate dummy data
startDate = '2017-11-25'
dateList = pd.date_range(startDate, periods=365).tolist()
df = pd.DataFrame({'Date': dateList,
'Distance': np.random.normal(loc=15, scale=15, size=(365,))
})
#set week and day
df['Week'] = [x.isocalendar()[1] for x in df['Date']]
df['Day'] = [x.isocalendar()[2] for x in df['Date']]
#create dataset for heatmap
#group by axis to plot
df = df.groupby(['Week','Day'])['Distance'].sum().reset_index()
#restructure for heatmap
data = df.pivot(index="Day", columns="Week", values="Distance")
#configure the heatmap plot
sns.set()
fig, ax = plt.subplots(figsize=(15,6))
ax=sns.heatmap(data,xticklabels=1,ax = ax, robust=True, square=True,cmap='RdBu_r',cbar_kws={"shrink":.3, "label": "Distance (KM)"})
ax.set_title('Running distance', fontsize=16, fontdict={})
#configure the x and y ticks
plt.xticks(fontsize="9")
plt.yticks(np.arange(7),('Mon','Tue','Wed','Thu','Fri','Sat','Sun'), rotation=0, fontsize="10", va="center")
#set labelsize of the colorbar
cbar = ax.collections[0].colorbar
cbar.ax.tick_params(labelsize=10)
plt.show()
Adding +0.5 to np.arange(7) in the plt.yticks() worked for me
plt.yticks(np.arange(7)+0.5,('Mon','Tue','Wed','Thu','Fri','Sat','Sun'),
rotation=0, fontsize="10", va="center")
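To see why the 0.5 offset is needed: the cells are drawn on a grid where row i spans the interval from i to i + 1, so the middle of each row sits at i + 0.5. Here is a minimal sketch of that idea. It uses plain matplotlib pcolormesh on random data as a stand-in for the seaborn heatmap (that substitution is my assumption, made only to keep the snippet self-contained); the day labels are the ones from the question.

```python
import matplotlib.pyplot as plt
import numpy as np

days = ['Mon', 'Tue', 'Wed', 'Thu', 'Fri', 'Sat', 'Sun']
# Dummy 7 x 53 grid standing in for the Day x Week pivot table
data = np.random.default_rng(0).normal(15, 15, size=(7, 53))

fig, ax = plt.subplots(figsize=(15, 6))
ax.pcolormesh(data, cmap='RdBu_r')  # row i covers y in [i, i + 1]
ax.invert_yaxis()                   # first row on top, like a heatmap

centers = np.arange(len(days)) + 0.5  # middle of each row
ax.set_yticks(centers)
ax.set_yticklabels(days, rotation=0, fontsize=10, va='center')
fig.canvas.draw()
```

If the labels still look off, printing ax.get_yticks() is a quick check that the ticks really sit at the row centers.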
onno's solution works for this specific case (matrix-type plots typically have labels in the middle of the patches), but also consider these more general ways to help you out:
a) find out where the ticks are first
pos, textvals = plt.yticks()
print(pos)
>>> [0.5 1.5 2.5 3.5 4.5 5.5 6.5]
and of course you can use these positions directly during the update:
plt.yticks(pos,('Mon','Tue','Wed','Thu','Fri','Sat','Sun'),
rotation=0, fontsize="10", va="center")
b) use the object-based API to adjust only the text
The pyplot commands xticks and yticks update both the positions and the text at once, but the Axes object has independent methods for the positions (ax.set_yticks(pos)) and for the text (ax.set_yticklabels(labels)).
So long as you know how many labels to produce (and their order), you need not even think about their positions to update the text.
ax.set_yticklabels(('Mon','Tue','Wed','Thu','Fri','Sat','Sun'),
rotation=0, fontsize="10", va="center")
This is an old question, but I recently had this issue and found this worked for me:
g = sns.heatmap(df)
g.set_yticklabels(labels=g.get_yticklabels(), va='center')
and of course you can just define labels=myLabelList also, as done in the OP
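A variation on the same idea, in case you would rather not pass the labels again: the tick labels are ordinary Text objects, so their alignment can be changed in place. This is only a sketch on a dummy pcolormesh grid (my assumption, standing in for the heatmap), not something taken from the answers above.

```python
import matplotlib.pyplot as plt
import numpy as np

fig, ax = plt.subplots()
ax.pcolormesh(np.random.default_rng(1).random((7, 10)))  # dummy 7-row grid
ax.set_yticks(np.arange(7) + 0.5)
ax.set_yticklabels(['Mon', 'Tue', 'Wed', 'Thu', 'Fri', 'Sat', 'Sun'])

# Adjust only the alignment; tick positions and label text stay untouched
for label in ax.get_yticklabels():
    label.set_verticalalignment('center')
    label.set_rotation(0)

alignments = [label.get_verticalalignment() for label in ax.get_yticklabels()]
```

The same loop should work on the Axes returned by sns.heatmap, since that is a regular matplotlib Axes.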
Here's how I plot this figure:
plt.figure(1, figsize = (20,8))
ax = sns.lineplot(data=df, x=df['timestamp'], y=df['speed'])
plt.xticks(rotation=90)
plt.title('Trip 543365 timeline', fontsize=22)
plt.ylabel('GPS speed', fontsize=18)
plt.xlabel('Timestamp', fontsize=16,)
plt.savefig('trip537685', dpi=600)
The x-axis is not readable despite setting plt.xticks(rotation=90). How do I change the scale so that it is readable?
As you have not provided the data, I have taken some random data of ~1500 rows with dates in DD-MM-YYYY format. First, as the dates are stored as text, convert them to datetime using to_datetime(), then plot. That should, as @JohanC said, give you a fairly good result. If you still need to adjust it, use set_major_locator() and set_major_formatter(). I have shown an interval of 3 months here, but you can adjust it as you see fit. Hope this helps.
df=pd.read_csv('austin_weather.csv')
df.rename(columns={'Date': 'timestamp'}, inplace=True)
df.rename(columns={'TempHighF': 'speed'}, inplace=True)
df['timestamp']=pd.to_datetime(df['timestamp'],format="%d-%m-%Y")
plt.figure(1, figsize = (20,8))
ax = sns.lineplot(data=df, x=df['timestamp'], y=df['speed'])
import matplotlib.dates as mdates
ax.xaxis.set_major_locator(mdates.MonthLocator(interval=3))
ax.xaxis.set_major_formatter(mdates.DateFormatter('%b-%Y'))
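Since austin_weather.csv is not included here, the following is a self-contained sketch of the same steps on synthetic data. The generated values and the use of plain ax.plot instead of sns.lineplot are my assumptions; the to_datetime, MonthLocator and DateFormatter calls are the ones described above.

```python
import matplotlib.dates as mdates
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Synthetic stand-in for the CSV: 1500 daily readings stored as DD-MM-YYYY text
dates = pd.date_range('2014-01-01', periods=1500, freq='D')
df = pd.DataFrame({
    'timestamp': dates.strftime('%d-%m-%Y'),
    'speed': np.random.default_rng(0).integers(40, 100, size=len(dates)),
})

# Parse the text into real datetimes so matplotlib builds a date axis
df['timestamp'] = pd.to_datetime(df['timestamp'], format='%d-%m-%Y')

fig, ax = plt.subplots(figsize=(20, 8))
ax.plot(df['timestamp'], df['speed'])
ax.xaxis.set_major_locator(mdates.MonthLocator(interval=3))  # one tick every 3 months
ax.xaxis.set_major_formatter(mdates.DateFormatter('%b-%Y'))  # abbreviated month and year
fig.autofmt_xdate(rotation=90)
fig.canvas.draw()

tick_labels = [t.get_text() for t in ax.get_xticklabels()]
```

Leaving the timestamps as strings makes matplotlib treat every row as its own category, which is what produces the unreadable axis in the first place.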
It seems you have so many data points plotted that the x-tick labels overlap at the current font size.
If you don't need every single x-tick displayed, you can set the tick locations with xticks, passing an array that keeps only every nth tick.
Data preparation:
Just strings for x-axis labels as an example.
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
import random
import string
def random_string():
    return ''.join(random.choices(string.ascii_lowercase +
                                  string.digits, k=7))
size=1000
x_list = []
for i in range(size):
    x_list.append(random_string())
y = np.random.randint(low=0, high=50, size=size)
df = pd.DataFrame(list(zip(x_list, y)),
                  columns =['timestamp', 'speed'])
Plot with a lot of datapoints for reference:
plt.figure(1, figsize = (20,8))
ax = sns.lineplot(data=df, x=df['timestamp'], y=df['speed'])
plt.xticks(rotation=90)
plt.title('Trip 543365 timeline', fontsize=22)
plt.ylabel('GPS speed', fontsize=18)
plt.xlabel('Timestamp', fontsize=16,)
plt.show()
Plot with reduced xticks:
plt.figure(1, figsize = (20,8))
ax = sns.lineplot(data=df, x=df['timestamp'], y=df['speed'])
plt.xticks(rotation=90)
plt.title('Trip 543365 timeline', fontsize=22)
plt.ylabel('GPS speed', fontsize=18)
plt.xlabel('Timestamp', fontsize=16,)
every_nth_xtick = 50
plt.xticks(np.arange(0, len(x_list)+1, every_nth_xtick))
plt.show()
To cross check you can add:
print(x_list[0])
print(x_list[50])
print(x_list[100])
Just make sure it's within the same random call.
I have a data frame containing dates, station ids and rainfall in mm/day, and I am trying to generate a bar plot using matplotlib subplots. When I run the code below, it generates a bar chart (shown below) with messy dates on the x-axis. I am analyzing data from 2017-04-16 to 2017-08-16, and I want the x-axis to show months such as April 2017, May 2017 and so on. Can anyone please help me? Thanks in advance.
fig = plt.figure(dpi= 136, figsize=(16,8))
ax1 = fig.add_subplot(221)
ax2 = fig.add_subplot(222)
ax3 = fig.add_subplot(223)
ax4 = fig.add_subplot(224)
df1[216].plot(ax = ax1, kind='bar', stacked=True)
df1[2947].plot(ax = ax2, kind='bar')
df1[5468].plot(ax = ax3, kind='bar')
df1[1300].plot(ax = ax4, kind='bar')
plt.show()
Here is the output I am getting:
This is the dataframe I have:
Bar plots in pandas are designed to compare categories rather than to display time-series or other types of continuous variables, as stated in the docstring:
A bar plot shows comparisons among discrete categories. One axis of
the plot shows the specific categories being compared, and the other
axis represents a measured value.
This is why the scale of the x-axis of pandas bar plots is made of integers starting from zero and each bar has a tick and a tick label by default, regardless of the data type of the x variable.
You have two options: either plot the data with plt.bar and use the date tick locators and formatters from the matplotlib.dates module or stick to pandas and apply custom ticks and tick labels based on the datetime index and formatted using appropriate format codes like in this example:
import numpy as np # v 1.19.2
import pandas as pd # v 1.2.3
import matplotlib.pyplot as plt # v 3.3.4
# Create sample dataset
rng = np.random.default_rng(seed=1234)
date = pd.date_range('2017-04-16', '2017-06-16', freq='D')
df = pd.DataFrame(rng.exponential(scale=7, size=(date.size, 4)), index=date,
columns=['216','2947','5468','1300'])
# Generate plots
axs = df.plot.bar(subplots=True, layout=(2,2), figsize=(10,7),
sharex=False, legend=False, color='tab:blue')
# Create lists of ticks and tick labels
ticks = [idx for idx, timestamp in enumerate(df.index)
if (timestamp.month != df.index[idx-1].month) | (idx == 0)]
labels = [tick.strftime('%d-%b\n%Y') if (df.index[ticks[idx]].year
!= df.index[ticks[idx-1]].year) | (idx == 0) else tick.strftime('%d-%b')
for idx, tick in enumerate(df.index[ticks])]
# Set ticks and tick labels for each plot, edit titles
for ax in axs.flat:
    ax.set_title('Station '+ax.get_title())
    ax.set_xticks(ticks)
    ax.set_xticklabels(labels, rotation=0, ha='center')
ax.figure.subplots_adjust(hspace=0.4)
plt.show()
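For completeness, the first option mentioned above (plt.bar with the matplotlib.dates locators and formatters) could look roughly like this. It is a sketch on the same kind of made-up data; the '%B %Y' format is my guess at the "April 2017" style asked for in the question.

```python
import matplotlib.dates as mdates
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Made-up rainfall data for the four station ids from the question
rng = np.random.default_rng(seed=1234)
date = pd.date_range('2017-04-16', '2017-08-16', freq='D')
df = pd.DataFrame(rng.exponential(scale=7, size=(date.size, 4)), index=date,
                  columns=['216', '2947', '5468', '1300'])

fig, axs = plt.subplots(2, 2, figsize=(16, 8))
for ax, station in zip(axs.flat, df.columns):
    ax.bar(df.index, df[station], width=0.8)  # x values are real dates here
    ax.set_title(f'Station {station}')
    ax.xaxis.set_major_locator(mdates.MonthLocator())            # tick on the 1st of each month
    ax.xaxis.set_major_formatter(mdates.DateFormatter('%B %Y'))  # full month name and year
fig.tight_layout()
fig.canvas.draw()
```

Because matplotlib keeps a continuous date axis in this case, the bar width is measured in days and the locator can place ticks at month boundaries instead of one tick per bar.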
I created a seaborn heatmap to summarize Theil's U coefficients. The data is displayed horizontally in the heatmap. Now I would like to rotate the data and the legend. I know that you can rotate the x-axis and y-axis labels in a plot, but how can I rotate the data and the legend?
This is my code:
#create pandas dataframe to hold the values
theilu = pd.DataFrame(index=['Y'],columns=matrix.columns)
#store column names in variable columns
columns = matrix.columns
#iterate through each variable
for j in range(0,len(columns)):
    #call theil_u function on "zipped" independent and dependent variable -> respectively x & y in the functions section
    u = theil_u(matrix['Y'].tolist(),matrix[columns[j]].tolist())
    #select respective columns needed for output
    theilu.loc[:,columns[j]] = u
#handle nans if any
theilu.fillna(value=np.nan,inplace=True)
#plot correlation between fraud reported (y) and all other variables (x)
plt.figure(figsize=(20,1))
sns.heatmap(theilu,annot=True,fmt='.2f')
plt.show()
Here is an image of what I am looking for:
Please let me know if you need any sample data or the theil_u function to recreate the problem. Thank you.
The parameters of the annotation can be changed via annot_kws. One of them is the rotation.
Some parameters of the colorbar can be changed via the cbar_kws dict, but unfortunately the orientation of the labels isn't one of them. Therefore, you need a handle to the colorbar's ax. One way is to create an ax beforehand and pass it to sns.heatmap(..., cbar_ax=ax). An easier way is to get the colorbar afterwards, cbar = heatmap.collections[0].colorbar, and use cbar.ax.
With this ax handle, you can change more properties of the colorbar, such as the orientation of its labels. Also, their vertical alignment can be changed to get them centered.
import numpy as np
import seaborn as sns
import matplotlib.pyplot as plt
data = np.random.rand(1, 12)
fig, ax = plt.subplots(figsize=(10,2))
heatmap = sns.heatmap(data, cbar=True, ax=ax,
annot=True, fmt='.2f', annot_kws={'rotation': 90})
cbar = heatmap.collections[0].colorbar
# heatmap.set_yticklabels(heatmap.get_yticklabels(), rotation=90)
heatmap.set_xticklabels(heatmap.get_xticklabels(), rotation=90)
cbar.ax.set_yticklabels(cbar.ax.get_yticklabels(), rotation=90, va='center')
plt.tight_layout()
plt.show()
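The other route mentioned above, creating the colorbar's ax beforehand and passing it as cbar_ax, might look like the sketch below. The two-column layout and the width ratio are arbitrary choices of mine, not part of the answer.

```python
import matplotlib.pyplot as plt
import numpy as np
import seaborn as sns

data = np.random.rand(1, 12)

# Reserve a narrow axes on the right for the colorbar
fig, (ax, cax) = plt.subplots(1, 2, figsize=(10, 2),
                              gridspec_kw={'width_ratios': [30, 1]})
sns.heatmap(data, ax=ax, cbar_ax=cax,
            annot=True, fmt='.2f', annot_kws={'rotation': 90})

# cax is the colorbar's axes, so its tick labels can be styled directly
cax.tick_params(axis='y', labelrotation=90)
for label in cax.get_yticklabels():
    label.set_verticalalignment('center')
fig.canvas.draw()
```

With this layout the colorbar size is controlled by the gridspec, so no extra axes is carved out of the heatmap.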
You can pass arguments to ax.text() (which is used to write the annotations) using the annot_kws= argument.
Therefore:
flights = sns.load_dataset("flights")
flights = flights.pivot(index="month", columns="year", values="passengers")
fig, ax = plt.subplots(figsize=(8,8))
ax = sns.heatmap(flights, annot=True, fmt='d', annot_kws={'rotation':90})
I took data from excel and plotted it. The first column is date, while the next two columns are prices of different indexes.
I managed to plot them, but they are on separate graphs. I need them plotted against each other with one x-axis (date) and two y-axes.
Also, I can't figure out how to make the line dotted for one and a diamond marker for the other.
import matplotlib.pyplot as plt
import pandas as pd
excel_data = pd.read_excel('Python_assignment_InputData.xlsx', '^GSPTSE')
excel_data.plot(kind='line', x = 'Date', y = 'Bitcoin CAD (BTC-CAD)', color = 'green')
excel_data.plot(kind='line', x = 'Date', y = 'S&P/TSX Composite index (^GSPTSE)', color = 'blue')
plt.show()
I expect Bitcoin and S&P prices to be on one y axis, with dates on the x axis.
I am providing a sample answer using the iris DataFrame from seaborn. You can modify it to your needs. What you need is a single x axis and two y-axes.
import seaborn as sns
import matplotlib.pyplot as plt
fig, ax = plt.subplots()
iris = sns.load_dataset("iris")
iris.plot(x='sepal_length', y='sepal_width', linestyle=':', ax=ax)
iris.plot(x='petal_length', y='petal_width', marker='d',
linestyle='None', secondary_y=True, ax=ax)
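If you prefer plain matplotlib over the pandas secondary_y shortcut, ax.twinx() gives the same single x axis with two y-axes. This is a sketch on invented price series; the column names 'btc' and 'tsx' are placeholders of mine and not the headers of the Excel sheet.

```python
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Hypothetical stand-in for the spreadsheet; column names are placeholders
dates = pd.date_range('2019-01-01', periods=60, freq='D')
rng = np.random.default_rng(0)
prices = pd.DataFrame({
    'Date': dates,
    'btc': 5000 + rng.normal(0, 50, 60).cumsum(),
    'tsx': 15000 + rng.normal(0, 30, 60).cumsum(),
})

fig, ax_left = plt.subplots()
ax_right = ax_left.twinx()  # second y-axis sharing the same x-axis

# Dotted line on the left axis, diamond markers only on the right axis
(line_btc,) = ax_left.plot(prices['Date'], prices['btc'],
                           color='green', linestyle=':', label='Bitcoin')
(line_tsx,) = ax_right.plot(prices['Date'], prices['tsx'],
                            color='blue', marker='D', markersize=3,
                            linestyle='None', label='S&P/TSX')

ax_left.set_ylabel('Bitcoin (CAD)')
ax_right.set_ylabel('S&P/TSX Composite')
ax_left.legend(handles=[line_btc, line_tsx], loc='upper left')
fig.autofmt_xdate()
```

Collecting both line handles in one legend call avoids ending up with two separate legends, one per axis.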
I am trying to draw polar plots using Seaborn's FacetGrid, similar to what is detailed in seaborn's gallery.
I am using the following code:
sns.set(context='notebook', style='darkgrid', palette='deep', font='sans-serif', font_scale=1.25)
# Set up a grid of axes with a polar projection
g = sns.FacetGrid(df_total, col="Construct", hue="Run", col_wrap=5, subplot_kws=dict(projection='polar'), height=5, sharex=False, sharey=False, despine=False)
# Draw a scatterplot onto each axes in the grid
g.map(plt.plot, 'Rad', 'y axis label', marker=".", ms=3, ls='None').set_titles("{col_name}")
plt.savefig('./image.pdf')
Which with my data gives the following:
I want to keep this organisation of 5 plots per line.
The problem is that the title of each subplot overlap with the values of the ticks, same for the y axis label.
Is there a way to prevent this behaviour? Can I somehow shift the titles slightly above their current position and can I shift the y axis labels slightly on the left of their current position?
Many thanks in advance!
EDIT:
This is not a duplicate of this SO question, where the problem was that the title of one subplot overlapped with the axis label of another subplot.
Here my problem is that the title of one subplot overlaps with the tick labels of the same subplot, and similarly the axis label overlaps with the tick labels of the same subplot.
I would also like to add that I do not care whether they overlap in my Jupyter notebook (where the figure was created). However, I want the final saved image to have no overlap, so perhaps there is something I need to do to save the image in a slightly different format to avoid that, but I don't know what (I am only using plt.savefig to save it).
EDIT 2: If someone would like to reproduce the problem here is a minimal example:
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import seaborn as sns
sns.set()
sns.set(context='notebook', style='darkgrid', palette='deep', font='sans-serif', font_scale=1.5)
# Generate an example radial dataset
r = np.linspace(0, 10000, num=100)
df = pd.DataFrame({'label': r, 'slow': r, 'medium-slow': 1 * r, 'medium': 2 * r, 'medium-fast': 3 * r, 'fast': 4 * r})
# Convert the dataframe to long-form or "tidy" format
df = pd.melt(df, id_vars=['label'], var_name='speed', value_name='theta')
# Set up a grid of axes with a polar projection
g = sns.FacetGrid(df, col="speed", hue="speed",
subplot_kws=dict(projection='polar'), height=4.5, col_wrap=5,
sharex=False, sharey=False, despine=False)
# Draw a scatterplot onto each axes in the grid
g.map(plt.scatter, "theta", "label")
plt.savefig('./image.png')
plt.show()
This gives the following image, in which the titles are not as bad as in my original problem (though there is still some overlap) and the label on the left-hand side overlaps completely.
In order to move the title a bit higher, you can set a new position:
ax.title.set_position([.5, 1.1])
In order to move the ylabel a little further left, you can add some padding
ax.yaxis.labelpad = 25
To do this for the axes of the facetgrid, you'd do:
for ax in g.axes:
    ax.title.set_position([.5, 1.1])
    ax.yaxis.labelpad = 25
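As a possible alternative for the title, set_title also takes a pad argument (an offset in points above the axes). Here is a small sketch on plain matplotlib polar subplots, which I am using in place of the FacetGrid so that it runs on its own; the same loop body should apply to each ax in g.axes. Saving with bbox_inches='tight' is my suggestion for keeping the shifted text inside the saved image.

```python
import io

import matplotlib.pyplot as plt
import numpy as np

theta = np.linspace(0, 2 * np.pi, 100)
fig, axes = plt.subplots(1, 3, figsize=(12, 4.5),
                         subplot_kw={'projection': 'polar'})

for ax, name in zip(axes.flat, ['slow', 'medium', 'fast']):
    ax.scatter(theta, theta)
    ax.set_ylabel('label')
    ax.set_title(f'speed = {name}', pad=25)  # lift the title above the tick labels
    ax.yaxis.labelpad = 25                   # push the y label away from the ticks

# bbox_inches='tight' enlarges the saved canvas so shifted text is not cut off;
# an in-memory buffer is used here, swap in a filename such as './image.png'
buffer = io.BytesIO()
fig.savefig(buffer, format='png', bbox_inches='tight')
```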
The answer provided by ImportanceOfBeingErnest in this SO question may help.