Here's how I plot this figure:
plt.figure(1, figsize = (20,8))
ax = sns.lineplot(data=df, x=df['timestamp'], y=df['speed'])
plt.xticks(rotation=90)
plt.title('Trip 543365 timeline', fontsize=22)
plt.ylabel('GPS speed', fontsize=18)
plt.xlabel('Timestamp', fontsize=16,)
plt.savefig('trip537685', dpi=600)
The x-axis is not readable despite setting plt.xticks(rotation=90). How do I change the scale so that it is readable?
As you have not provided the data, I have used some random data of ~1500 rows with dates in DD-MM-YYYY format. First, because the dates are stored as text, convert them to datetime with to_datetime(), then plot. That should, as @JohanC said, give you a fairly good result. If you still need to adjust it, use set_major_locator() and set_major_formatter(). I have shown this with an interval of 3 months, but you can adjust it as you see fit. Hope this helps.
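import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
import matplotlib.dates as mdates
# load the sample data and rename its columns to match the question
df=pd.read_csv('austin_weather.csv')
df.rename(columns={'Date': 'timestamp'}, inplace=True)
df.rename(columns={'TempHighF': 'speed'}, inplace=True)
# parse the DD-MM-YYYY text dates into datetimes
df['timestamp']=pd.to_datetime(df['timestamp'],format="%d-%m-%Y")
plt.figure(1, figsize = (20,8))
ax = sns.lineplot(data=df, x=df['timestamp'], y=df['speed'])
# one major tick every 3 months, labelled like "Jan-2015"
ax.xaxis.set_major_locator(mdates.MonthLocator(interval=3))
ax.xaxis.set_major_formatter(mdates.DateFormatter('%b-%Y'))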
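To see why the to_datetime() conversion matters, here is a minimal sketch that plots the same values twice: once against date strings and once against parsed datetimes. The data is synthetic and the names (date_strings, values, ax_text, ax_dates) are made up for illustration; plain matplotlib is used instead of seaborn to keep it short.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; drop this line for interactive use
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Synthetic stand-in for the real data: 300 daily readings with DD-MM-YYYY text dates
dates = pd.date_range("2020-01-01", periods=300, freq="D")
date_strings = dates.strftime("%d-%m-%Y")
values = np.linspace(0.0, 50.0, len(dates))

fig, (ax_text, ax_dates) = plt.subplots(2, 1, figsize=(12, 6))

# Text x-values are treated as categories: one tick per row
ax_text.plot(list(date_strings), values)

# Parsed datetimes get a real date axis with automatically spaced ticks
parsed = pd.to_datetime(date_strings, format="%d-%m-%Y")
ax_dates.plot(parsed, values)

n_text_ticks = len(ax_text.get_xticks())
n_date_ticks = len(ax_dates.get_xticks())
print(n_text_ticks, n_date_ticks)
```

The first count equals the number of rows, which is what makes the labels unreadable; the second is a small number chosen by matplotlib's automatic date locator.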
It seems you have so many data points that the x-tick labels overlap at the current label font size.
If you don't need every single x-tick displayed, you can pass xticks an array of label locations so that only every nth tick is shown.
Data preparation:
Just strings for x-axis labels as an example.
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
import random
import string
def random_string():
    return ''.join(random.choices(string.ascii_lowercase +
                                  string.digits, k=7))
size=1000
x_list = []
for i in range(size):
    x_list.append(random_string())
y = np.random.randint(low=0, high=50, size=size)
df = pd.DataFrame(list(zip(x_list, y)),
                  columns =['timestamp', 'speed'])
Plot with a lot of datapoints for reference:
plt.figure(1, figsize = (20,8))
ax = sns.lineplot(data=df, x=df['timestamp'], y=df['speed'])
plt.xticks(rotation=90)
plt.title('Trip 543365 timeline', fontsize=22)
plt.ylabel('GPS speed', fontsize=18)
plt.xlabel('Timestamp', fontsize=16,)
plt.show()
Plot with reduced xticks:
plt.figure(1, figsize = (20,8))
ax = sns.lineplot(data=df, x=df['timestamp'], y=df['speed'])
plt.xticks(rotation=90)
plt.title('Trip 543365 timeline', fontsize=22)
plt.ylabel('GPS speed', fontsize=18)
plt.xlabel('Timestamp', fontsize=16,)
every_nth_xtick = 50
plt.xticks(np.arange(0, len(x_list)+1, every_nth_xtick))
plt.show()
To cross check you can add:
print(x_list[0])
print(x_list[50])
print(x_list[100])
Just make sure you run this in the same session as the plot, so that x_list still holds the same random strings that were plotted.
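The same idea can be wrapped in a small helper that sets both the positions and the matching label text explicitly. This is only a sketch: thin_xticks is a hypothetical name, the labels are made up, and plain matplotlib is used with integer x positions rather than seaborn.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; drop this line for interactive use
import matplotlib.pyplot as plt
import numpy as np


def thin_xticks(ax, labels, step):
    """Label only every `step`-th position on the x-axis of `ax`."""
    positions = np.arange(0, len(labels), step)
    ax.set_xticks(positions)
    ax.set_xticklabels([labels[i] for i in positions], rotation=90)
    return positions


# Made-up labels standing in for the timestamp strings
labels = [f"point-{i:04d}" for i in range(1000)]
y = np.random.default_rng(0).integers(0, 50, size=len(labels))

fig, ax = plt.subplots(figsize=(20, 8))
ax.plot(np.arange(len(labels)), y)
positions = thin_xticks(ax, labels, step=50)
```

Setting the label text together with the positions keeps each label tied to its own data point, which is what the cross-check above verifies by hand.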
This question already has answers here:
Barplot and line plot in seaborn/matplotlib
(1 answer)
How to line plot timeseries data on a bar plot
(1 answer)
pandas bar plot combined with line plot shows the time axis beginning at 1970
(2 answers)
Closed 4 months ago.
I have a bar plot and a line plot that share the same x-axis, and I want to plot them together. Here's the picture:
I want the plot to keep "average_daily_price" as the y-axis and disregard "num_sales" as a y-axis. Here's the result I want to achieve:
I've tried the following:
fig, ax1 = plt.subplots()
sns.lineplot(filtered_df, x='date', y='average_daily_price', ax=ax1)
sns.barplot(filtered_df, x="date", y="num_sales", alpha=0.5, ax=ax1)
But it gives a weird result. I've also tried twinx() but couldn't make it work; besides, it creates a second y-axis, which I don't want.
Edit: running rafael's code results in this plot:
I'd like to add that date is in datetime64[ns] format.
Edit 2: This post has been closed as a duplicate. I've already seen the posts in the duplicate list and tried the solutions listed, but they do not work in my case. I don't know why, and that is what I'm trying to figure out by opening a new question. I'm guessing it has to do with my x variable being a datetime object.
The seaborn "barplot" is dedicated to plotting categorical variables. As such, it treats each date as a unique category and plots the corresponding values sequentially at evenly spaced positions.
This breaks the behavior of the dates on the x-axis.
A workaround is to use matplotlib's ax.bar directly:
# imports
import numpy as np
import seaborn as sns
import matplotlib.pyplot as plt
import matplotlib as mpl
import pandas as pd
# generate dummy data
rng = np.random.default_rng()
size=100
vals = rng.normal(loc=0.02,size=size).cumsum() + 50
drange = pd.date_range("2014-01", periods=size, freq="D")
num_sales = rng.binomial(size=size,n=50,p=0.4)
# store data in a pandas DF
df = pd.DataFrame({'date': drange,
                   'average_daily_price': vals,
                   'num_sales': num_sales})
# setup axes
fig, ax1 = plt.subplots(figsize=(12,3))
# double y-axis is necessary due to the difference in the range of both variables
ax2 = ax1.twinx()
# plot the number of sales as a series of vertical bars
ax2.bar(df['date'], df['num_sales'], color='grey', alpha=0.5, label='Number of sales')
# plot the price as a time-series line plot
sns.lineplot(data=df, x='date', y='average_daily_price', ax=ax1)
# format the x-axis ticks as dates in weekly intervals
# the format is datetime64[ns]
ax1.xaxis.set_major_locator(mpl.dates.WeekdayLocator(interval=1, byweekday=1)) #weekly
ax1.xaxis.set_major_formatter(mpl.dates.DateFormatter('%Y-%m-%d'))
# rotate the x-axis tick labels for readability
ax1.tick_params(axis='x', rotation=50)
and the output is
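Since the question asks for no visible second y-axis, one possible variation is to keep the twin axis for scaling but hide its ticks and labels. The sketch below uses synthetic data and made-up names (ax_price, ax_sales), and the factor of 3 that keeps the bars low is an arbitrary choice.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; drop this line for interactive use
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Synthetic data with the same roles as in the answer above
rng = np.random.default_rng(0)
dates = pd.date_range("2014-01-01", periods=30, freq="D")
price = 50 + rng.normal(size=len(dates)).cumsum()
num_sales = rng.integers(5, 30, size=len(dates))

fig, ax_price = plt.subplots(figsize=(12, 3))
ax_sales = ax_price.twinx()

ax_sales.bar(dates, num_sales, color="grey", alpha=0.5)
ax_price.plot(dates, price)

# Hide the scale of the bar axis so only the price axis is shown
ax_sales.yaxis.set_visible(False)
# Leave headroom so the bars stay in the lower third of the figure
ax_sales.set_ylim(0, num_sales.max() * 3)
ax_price.set_ylabel("average_daily_price")
```

The bars are still drawn on their own scale, so their heights are only comparable to each other, not to the price axis.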
I have a data frame containing dates, station IDs, and rainfall in mm/day, and I am trying to generate bar plots using matplotlib subplots. When I run the code below, it generates a bar chart (shown below) with messy dates on the x-axis. I am analyzing the data from 2017-04-16 to 2017-08-16 and want to show months like April 2017, May 2017, and so on. Can anyone please help me? Thanks in advance.
fig = plt.figure(dpi= 136, figsize=(16,8))
ax1 = fig.add_subplot(221)
ax2 = fig.add_subplot(222)
ax3 = fig.add_subplot(223)
ax4 = fig.add_subplot(224)
df1[216].plot(ax = ax1, kind='bar', stacked=True)
df1[2947].plot(ax = ax2, kind='bar')
df1[5468].plot(ax = ax3, kind='bar')
df1[1300].plot(ax = ax4, kind='bar')
plt.show()
Here is the output I am getting:
This is the dataframe I have:
Bar plots in pandas are designed to compare categories rather than to display time-series or other types of continuous variables, as stated in the docstring:
A bar plot shows comparisons among discrete categories. One axis of
the plot shows the specific categories being compared, and the other
axis represents a measured value.
This is why the scale of the x-axis of pandas bar plots is made of integers starting from zero and each bar has a tick and a tick label by default, regardless of the data type of the x variable.
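A quick way to see this integer scale is to inspect the ticks of a pandas bar plot directly. The following is a small sketch with a made-up ten-day series (rain is an illustrative name):

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; drop this line for interactive use
import matplotlib.pyplot as plt
import pandas as pd

# Ten made-up daily values indexed by date
dates = pd.date_range("2017-04-16", periods=10, freq="D")
rain = pd.Series(range(10), index=dates, name="rain_mm")

fig, ax = plt.subplots()
rain.plot(ax=ax, kind="bar")

# The bars sit at 0, 1, 2, ... and the dates are only used as label text
tick_positions = [int(t) for t in ax.get_xticks()]
tick_labels = [t.get_text() for t in ax.get_xticklabels()]
print(tick_positions)
print(tick_labels[0])
```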
You have two options: either plot the data with plt.bar and use the date tick locators and formatters from the matplotlib.dates module, or stick to pandas and apply custom ticks and tick labels based on the datetime index, formatted with appropriate format codes as in this example:
import numpy as np # v 1.19.2
import pandas as pd # v 1.2.3
import matplotlib.pyplot as plt # v 3.3.4
# Create sample dataset
rng = np.random.default_rng(seed=1234)
date = pd.date_range('2017-04-16', '2017-06-16', freq='D')
df = pd.DataFrame(rng.exponential(scale=7, size=(date.size, 4)), index=date,
                  columns=['216','2947','5468','1300'])
# Generate plots
axs = df.plot.bar(subplots=True, layout=(2,2), figsize=(10,7),
                  sharex=False, legend=False, color='tab:blue')
# Create lists of ticks and tick labels
ticks = [idx for idx, timestamp in enumerate(df.index)
         if (timestamp.month != df.index[idx-1].month) | (idx == 0)]
labels = [tick.strftime('%d-%b\n%Y') if (df.index[ticks[idx]].year
          != df.index[ticks[idx-1]].year) | (idx == 0) else tick.strftime('%d-%b')
          for idx, tick in enumerate(df.index[ticks])]
# Set ticks and tick labels for each plot, edit titles
for ax in axs.flat:
    ax.set_title('Station '+ax.get_title())
    ax.set_xticks(ticks)
    ax.set_xticklabels(labels, rotation=0, ha='center')
ax.figure.subplots_adjust(hspace=0.4)
plt.show()
plt.show()
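The first option mentioned above (matplotlib's own bar plus matplotlib.dates) is not shown in the answer. A minimal sketch of it for a single station could look like this, assuming synthetic rainfall values and the "April 2017" label style requested in the question:

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; drop this line for interactive use
import matplotlib.dates as mdates
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Synthetic rainfall for the period given in the question
rng = np.random.default_rng(seed=1234)
dates = pd.date_range("2017-04-16", "2017-08-16", freq="D")
rain = rng.exponential(scale=7, size=dates.size)

fig, ax = plt.subplots(figsize=(10, 4))
ax.bar(dates, rain, width=0.8)  # on a date axis the width is measured in days

# One tick on the first day of each month, labelled like "May 2017"
ax.xaxis.set_major_locator(mdates.MonthLocator())
ax.xaxis.set_major_formatter(mdates.DateFormatter("%B %Y"))
ax.set_ylabel("Rainfall (mm/day)")

month_ticks = [mdates.num2date(t) for t in ax.xaxis.get_majorticklocs()]
```

Because the x-axis is a true date axis here, the ticks fall on calendar boundaries rather than on bar indices.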
I'm plotting a Seaborn heatmap and I want to center the y-axis tick labels, but can't find a way to do this. 'va' text property doesn't seem to be available on yticks().
Considering the following image
I'd like to align the days of the week to the center of the row of squares
Code to generate this graph:
import pandas as pd
import seaborn as sns
import numpy as np
import matplotlib.pyplot as plt
#Generate dummy data
startDate = '2017-11-25'
dateList = pd.date_range(startDate, periods=365).tolist()
df = pd.DataFrame({'Date': dateList,
                   'Distance': np.random.normal(loc=15, scale=15, size=(365,))
                   })
#set week and day
df['Week'] = [x.isocalendar()[1] for x in df['Date']]
df['Day'] = [x.isocalendar()[2] for x in df['Date']]
#create dataset for heatmap
#group by axis to plot (sum only the Distance column; dates cannot be summed)
df = df.groupby(['Week','Day'])['Distance'].sum().reset_index()
#restructure for heatmap (keyword arguments are required by recent pandas)
data = df.pivot(index="Day", columns="Week", values="Distance")
#configure the heatmap plot
sns.set()
fig, ax = plt.subplots(figsize=(15,6))
ax=sns.heatmap(data,xticklabels=1,ax = ax, robust=True, square=True,cmap='RdBu_r',cbar_kws={"shrink":.3, "label": "Distance (KM)"})
ax.set_title('Running distance', fontsize=16, fontdict={})
#configure the x and y ticks
plt.xticks(fontsize="9")
plt.yticks(np.arange(7),('Mon','Tue','Wed','Thu','Fri','Sat','Sun'), rotation=0, fontsize="10", va="center")
#set labelsize of the colorbar
cbar = ax.collections[0].colorbar
cbar.ax.tick_params(labelsize=10)
plt.show()
Adding 0.5 to np.arange(7) in the plt.yticks() call worked for me:
plt.yticks(np.arange(7)+0.5,('Mon','Tue','Wed','Thu','Fri','Sat','Sun'),
           rotation=0, fontsize="10", va="center")
onno's solution works for this specific case (matrix-type plots typically have labels in the middle of the patches), but also consider these more general ways to help you out:
a) find out where the ticks are first
pos, textvals = plt.yticks()
print(pos)
>>> [0.5 1.5 2.5 3.5 4.5 5.5 6.5]
You can then use these positions directly during the update:
plt.yticks(pos,('Mon','Tue','Wed','Thu','Fri','Sat','Sun'),
           rotation=0, fontsize="10", va="center")
b) use the object-based API to adjust only the text
The pyplot commands xticks and yticks update both the positions and the text at once, but the axes object has independent methods for the positions (ax.set_yticks(pos)) and for the text (ax.set_yticklabels(labels)).
So long as you know how many labels to produce (and their order), you need not even think about their positions to update the text.
ax.set_yticklabels(('Mon','Tue','Wed','Thu','Fri','Sat','Sun'),
                   rotation=0, fontsize="10", va="center")
This is an old question, but I recently had this issue and found this worked for me:
g = sns.heatmap(df)
g.set_yticklabels(labels=g.get_yticklabels(), va='center')
Of course, you can also pass your own list as labels=myLabelList, as done in the OP.
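The reason the 0.5 offset works is that row i of the grid spans y = i to i + 1, so its center is at i + 0.5. Here is a minimal sketch of that geometry, using matplotlib's pcolormesh as a stand-in for sns.heatmap and random data in place of the running distances:

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; drop this line for interactive use
import matplotlib.pyplot as plt
import numpy as np

day_names = ["Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun"]
data = np.random.default_rng(0).normal(loc=15, scale=15, size=(7, 52))

fig, ax = plt.subplots(figsize=(15, 4))
# pcolormesh stands in for sns.heatmap here: row i spans y = i .. i + 1
ax.pcolormesh(data, cmap="RdBu_r")
ax.invert_yaxis()  # put Monday at the top, as a heatmap does

# Place one tick in the middle of each row and center the label text on it
centers = np.arange(len(day_names)) + 0.5
ax.set_yticks(centers)
ax.set_yticklabels(day_names, rotation=0, va="center")
```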
This is my code:
from matplotlib.ticker import FuncFormatter
import pandas as pd
import numpy as np
from datetime import datetime
from matplotlib import pyplot as plt
dates = pd.date_range('01/01/2016', datetime.today(), freq = 'M')
X = pd.DataFrame(index = dates)
X['values'] = np.random.rand(len(X)) * 300
fig, ax = plt.subplots()
phi = (1 + 5 ** 0.5) / 2  # golden ratio; assumed, since phi is not defined in the original snippet
fig.set_size_inches(8 * phi, 8 )
X['values'].plot(ax = ax)
ax.yaxis.set_major_formatter(FuncFormatter(lambda x, pos: '$ {:,.0f}'.format(x)))
plt.show()
I've been trying for half an hour now and I really need some help with this.
What is the cleanest, simplest way to show the other months as minor tick labels on the x-axis, instead of what it does by default for some reason, which is to label only the months that start with a J?
Note: I do have seaborn installed.
First, in order to be able to use matplotlib tickers on pandas date plots, you need to set the compatibility option x_compat=True.
X.plot(ax = ax, x_compat=True)
Next, in order to format the x-axis, you need to use ax.xaxis. In order to set the minor tick labels, you need to use set_minor_formatter.
In order to place ticks at certain positions you need a Locator, not a Formatter.
Now it seems you want to have full control over the output plot, hence you need to set the major and minor locators and formatters.
Note that labeling every month will almost certainly make the labels overlap, so a larger figure or a smaller font size would be needed.
import matplotlib.dates as mdates
fig, ax = plt.subplots(figsize=(12,3))
X.plot(ax = ax, x_compat=True)
ax.xaxis.set_major_locator(mdates.YearLocator())
ax.xaxis.set_minor_locator(mdates.MonthLocator())
ax.xaxis.set_major_formatter(mdates.DateFormatter("\n%Y"))
ax.xaxis.set_minor_formatter(mdates.DateFormatter("%b"))
plt.setp(ax.get_xticklabels(), rotation=0, ha="center")
plt.show()
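If the monthly labels overlap and enlarging the figure is not an option, another possibility is to label fewer months through the bymonth argument of MonthLocator. The sketch below uses a made-up monthly series and plain matplotlib (no pandas plotting, so x_compat is not needed); labelling only quarter starts is an arbitrary choice.

```python
import datetime as dt

import matplotlib
matplotlib.use("Agg")  # headless backend; drop this line for interactive use
import matplotlib.dates as mdates
import matplotlib.pyplot as plt
import numpy as np

# Made-up monthly series, three years long
months = [dt.datetime(2016 + i // 12, i % 12 + 1, 1) for i in range(36)]
values = np.linspace(0, 300, len(months))

fig, ax = plt.subplots(figsize=(8, 3))
ax.plot(months, values)

ax.xaxis.set_major_locator(mdates.YearLocator())
ax.xaxis.set_major_formatter(mdates.DateFormatter("\n%Y"))
# Minor ticks only at the start of each quarter instead of every month
ax.xaxis.set_minor_locator(mdates.MonthLocator(bymonth=(1, 4, 7, 10)))
ax.xaxis.set_minor_formatter(mdates.DateFormatter("%b"))

minor_months = sorted({mdates.num2date(t).month
                       for t in ax.xaxis.get_minorticklocs()})
print(minor_months)
```

Note that recent matplotlib versions skip, by default, a minor tick that coincides with a major one, so the January minor label may not be drawn next to the year.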
I'm making a candlestick chart with two data sets: [open, high, low, close] and volume. I'm trying to overlay the volumes at the bottom of the chart like this:
I'm calling volume_overlay3 but instead of bars it fills the whole plot area. What am I doing wrong?
My other option is to use .bar(), which doesn't have the up and down colors but would work if I could get the scale right:
fig = plt.figure()
ax = fig.add_subplot(1,1,1)
candlestick(ax, candlesticks)
ax2 = ax.twinx()
volume_overlay3(ax2, quotes)
ax2.xaxis_date()
ax2.set_xlim(candlesticks[0][0], candlesticks[-1][0])
ax.yaxis.set_label_position("right")
ax.yaxis.tick_right()
ax2.yaxis.set_label_position("left")
ax2.yaxis.tick_left()
volume_overlay3 did not work for me, so I tried your idea of adding a bar plot to the candlestick plot.
After creating a twin axis for the volume, reposition this axis (make it short) and modify the y-range of the candlestick plot to avoid collisions.
import numpy as np
import matplotlib
import matplotlib.pyplot as plt
# from matplotlib.finance import candlestick
# from matplotlib.finance import volume_overlay3
# finance module is no longer part of matplotlib
# see: https://github.com/matplotlib/mpl_finance
from mpl_finance import candlestick_ochl as candlestick
from mpl_finance import volume_overlay3
from matplotlib.dates import num2date
from matplotlib.dates import date2num
import matplotlib.mlab as mlab
import datetime
datafile = 'data.csv'
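# note: mlab.csv2rec was removed in matplotlib 3.1; this line needs an older release
r = mlab.csv2rec(datafile, delimiter=';')
# the dates in my example file-set are very sparse (and annoying) change the dates to be sequential
for i in range(len(r)-1):
    r['date'][i+1] = r['date'][i] + datetime.timedelta(days=1)
# materialise the zip so the data can be iterated more than once (Python 3)
candlesticks = list(zip(date2num(r['date']),r['open'],r['close'],r['max'],r['min'],r['volume']))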
fig = plt.figure()
ax = fig.add_subplot(1,1,1)
ax.set_ylabel('Quote ($)', size=20)
candlestick(ax, candlesticks,width=1,colorup='g', colordown='r')
# shift y-limits of the candlestick plot so that there is space at the bottom for the volume bar chart
pad = 0.25
yl = ax.get_ylim()
ax.set_ylim(yl[0]-(yl[1]-yl[0])*pad,yl[1])
# create the second axis for the volume bar-plot
ax2 = ax.twinx()
# set the position of ax2 so that it is short (y2=0.32) but otherwise the same size as ax
ax2.set_position(matplotlib.transforms.Bbox([[0.125,0.1],[0.9,0.32]]))
# get data from candlesticks for a bar plot
dates = [x[0] for x in candlesticks]
dates = np.asarray(dates)
volume = [x[5] for x in candlesticks]
volume = np.asarray(volume)
# make bar plots and color differently depending on up/down for the day
pos = r['open']-r['close']<0
neg = r['open']-r['close']>0
ax2.bar(dates[pos],volume[pos],color='green',width=1,align='center')
ax2.bar(dates[neg],volume[neg],color='red',width=1,align='center')
#scale the x-axis tight
ax2.set_xlim(min(dates),max(dates))
# the y-ticks for the bar were too dense, keep only every third one
yticks = ax2.get_yticks()
ax2.set_yticks(yticks[::3])
ax2.yaxis.set_label_position("right")
ax2.set_ylabel('Volume', size=20)
# format the x-ticks with a human-readable date.
xt = ax.get_xticks()
new_xticks = [datetime.date.isoformat(num2date(d)) for d in xt]
ax.set_xticklabels(new_xticks,rotation=45, horizontalalignment='right')
plt.ion()
plt.show()
data.csv is up here:
http://pastebin.com/5dwzUM6e
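A different way to keep the volume bars in the bottom part of the chart, without repositioning the twin axis, is to stretch the volume y-limits. This is a swapped-in technique rather than the method above; the sketch uses synthetic prices, a plain line as a stand-in for the candlesticks, and an arbitrary fraction of 0.25.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; drop this line for interactive use
import matplotlib.pyplot as plt
import numpy as np

# Synthetic quotes: 60 days of open/close prices and traded volume
rng = np.random.default_rng(0)
n_days = 60
x = np.arange(n_days)
open_ = 100 + rng.normal(size=n_days).cumsum()
close = open_ + rng.normal(size=n_days)
volume = rng.integers(1_000, 10_000, size=n_days)

fig, ax = plt.subplots()
ax.plot(x, close, color="black")  # stand-in for the candlesticks
ax.set_ylabel("Quote ($)")

ax2 = ax.twinx()
up = close >= open_
# Green bars for days that closed at or above the open, red for the rest
ax2.bar(x[up], volume[up], color="green", width=1, align="center")
ax2.bar(x[~up], volume[~up], color="red", width=1, align="center")

# Stretch the volume scale so the tallest bar reaches a quarter of the height
volume_fraction = 0.25
ax2.set_ylim(0, volume.max() / volume_fraction)
ax2.set_ylabel("Volume")
```

The price limits may still need padding at the bottom, as done in the answer above, so that the quotes do not run into the bars.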
See the answer here. Apparently this is a bug and it is going to be fixed.
For now you need to assign the collection returned by the volume_overlay3 call to a variable and then add it to the chart.
vc = volume_overlay3(ax2, quotes)
ax2.add_collection(vc)
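If you want to stack graphs on top of one another (i.e. plot them on the same axes), in older matplotlib versions use the following; plt.hold was removed in matplotlib 3.0, where repeated plot calls on the same axes already accumulate: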
plt.hold(True)
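In current matplotlib, repeated plotting calls on the same Axes accumulate by default, so no hold call is needed. A minimal sketch with made-up curves:

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; drop this line for interactive use
import matplotlib.pyplot as plt
import numpy as np

x = np.linspace(0, 2 * np.pi, 100)

fig, ax = plt.subplots()
ax.plot(x, np.sin(x), label="sin")
ax.plot(x, np.cos(x), label="cos")  # added to the same Axes, nothing is cleared
ax.legend()
```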