Formatting timedelta on x-axis to HMS if only 1 day - python

I have x-axis labels of dtype timedelta64[ns] coming out as:
0 days 12:01:13.165040
How can I achieve the following format if there's only 1 day?:
12:01:13
If there's more than one day, I need the following format:
2018.11.27
I've been successful in modifying the labels by writing a function and then registering it with:
ax.xaxis.set_major_formatter(plt.FuncFormatter(xaxisFormat))
But I don't know exactly how to go about formatting them.

You could choose the formatter depending on the limits of the plot. This could look as follows.
import numpy as np
import datetime as dt
import matplotlib.pyplot as plt
from matplotlib.dates import DateFormatter, HourLocator, DayLocator
def plot_something(h, ax=None):
    # h hours of hourly samples, starting at 2018-11-27
    td = np.arange(0, h, np.timedelta64(1, "h"))
    y = np.sin(np.linspace(0, h, len(td)))
    t = np.datetime64("2018-11-27") + td
    (ax or plt.gca()).plot(t, y)
fig, axes = plt.subplots(nrows=4)
plot_something(16, ax=axes[0])
plot_something(24, ax=axes[1])
plot_something(40, ax=axes[2])
plot_something(72, ax=axes[3])
def ticking(ax):
    # width of the x-axis in days (Matplotlib date units)
    d = np.diff(ax.get_xlim())[0]
    if d <= 1:
        ax.xaxis.set_major_formatter(DateFormatter("%H:%M:%S"))
    elif d <= 2:
        ax.xaxis.set_major_locator(HourLocator(byhour=(0,6,12,18)))
        ax.xaxis.set_major_formatter(DateFormatter("%H:%M:%S"))
    else:
        ax.xaxis.set_major_locator(DayLocator())
        ax.xaxis.set_major_formatter(DateFormatter("%Y.%m.%d"))
for ax in axes.flat:
    ticking(ax)
fig.tight_layout()
plt.show()
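If the tick values really are raw timedeltas rather than dates, the FuncFormatter route from the question can also work. The following is only a minimal sketch: it assumes the tick values arrive as nanoseconds elapsed since a known start time, and the names make_elapsed_formatter, start and total_span are made up for illustration. It does not handle negative tick values.

```python
import datetime as dt


def make_elapsed_formatter(start, total_span):
    """Build a tick-label function f(x, pos) suitable for FuncFormatter.

    Assumption: tick values x are nanoseconds elapsed since `start`.
    Adjust the unit conversion if your axis uses something else.
    """
    one_day = dt.timedelta(days=1)

    def format_tick(x, pos=None):
        elapsed = dt.timedelta(microseconds=x / 1000.0)
        if total_span <= one_day:
            # Short range: show elapsed time as HH:MM:SS, dropping fractions.
            seconds = int(elapsed.total_seconds())
            hours, rem = divmod(seconds, 3600)
            minutes, secs = divmod(rem, 60)
            return f"{hours:02d}:{minutes:02d}:{secs:02d}"
        # Longer range: show the calendar date the tick falls on.
        return (start + elapsed).strftime("%Y.%m.%d")

    return format_tick


start = dt.datetime(2018, 11, 27)
short = make_elapsed_formatter(start, dt.timedelta(hours=16))
long_ = make_elapsed_formatter(start, dt.timedelta(days=3))

# 0 days 12:01:13.165040 expressed in nanoseconds
print(short(43_273_165_040_000))   # 12:01:13
# exactly one day after the start
print(long_(86_400_000_000_000))   # 2018.11.28
```

The returned function can be registered the same way as in the question, e.g. ax.xaxis.set_major_formatter(plt.FuncFormatter(short)).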

Related

Matplotlib display only years instead of each 1st January in x axis containing dates

I have this dataframe :
import pandas as pd
import datetime
from sklearn.utils import check_random_state
import math
start = datetime.datetime.strptime("21-06-2014", "%d-%m-%Y")
end = datetime.datetime.strptime("17-03-2017", "%d-%m-%Y")
date_generated = [start + datetime.timedelta(days=x) for x in range(0, (end-start).days)]
X = [d.strftime('%d-%m-%Y') for d in date_generated] # I need this format for my real dataframe
Y = [math.cos(i) for i in range(1000)]
df = pd.DataFrame(dict(date=X,value=Y))
df.head(3)
         date     value
0  21-06-2014  1.000000
1  22-06-2014  0.540302
2  23-06-2014 -0.416147
df.tail(3)
           date     value
997  14-03-2017 -0.440062
998  15-03-2017  0.517847
999  16-03-2017  0.999650
When I plot the two columns of my dataframe in the following way, the x-axis is unreadable:
from matplotlib import pyplot as plt
plt.figure(figsize=(20, 5))
plt.plot(df["date"].values,df["value"].values)
plt.show()
How could I display only the years, once each, instead of a label for each 1st of January?
In this case, I would like only 2015, 2016 and 2017 to be displayed on the x-axis.
You can use a locator and a formatter from matplotlib.dates to directly format the datetime objects that you put on the x-axis:
import pandas as pd
import datetime
import math
from matplotlib import pyplot as plt
import matplotlib.dates as mdates
start = datetime.datetime.strptime("21-06-2014", "%d-%m-%Y")
end = datetime.datetime.strptime("17-03-2017", "%d-%m-%Y")
date_generated = [start + datetime.timedelta(days=x) for x in range(0, (end-start).days)]
Y = [math.cos(i) for i in range(1000)]
formatter = mdates.DateFormatter("%Y") ### formatter of the date
locator = mdates.YearLocator() ### where to put the labels
fig = plt.figure(figsize=(20, 5))
ax = plt.gca()
ax.xaxis.set_major_formatter(formatter) ## calling the formatter for the x-axis
ax.xaxis.set_major_locator(locator) ## calling the locator for the x-axis
plt.plot(date_generated, Y)
# fig.autofmt_xdate() # optional if you want to tilt the date labels - just try it
plt.tight_layout()
plt.show()
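Note that the answer plots date_generated (datetime objects), whereas the dataframe in the question stores the dates as '%d-%m-%Y' strings. A rough sketch of the missing conversion step, using only the standard library, might look like this; parse_dates and january_firsts are hypothetical helper names, and january_firsts merely lists the positions a yearly locator would be expected to label for this range.

```python
import datetime


def parse_dates(strings, fmt="%d-%m-%Y"):
    """Convert date strings such as '21-06-2014' into datetime objects."""
    return [datetime.datetime.strptime(s, fmt) for s in strings]


def january_firsts(dates):
    """Return every 1st of January that lies inside the range of `dates`."""
    first, last = min(dates), max(dates)
    candidates = (datetime.datetime(y, 1, 1)
                  for y in range(first.year, last.year + 1))
    return [c for c in candidates if first <= c <= last]


dates = parse_dates(["21-06-2014", "22-06-2014", "16-03-2017"])
print([d.year for d in january_firsts(dates)])  # [2015, 2016, 2017]
```

With pandas, pd.to_datetime(df["date"], format="%d-%m-%Y") should perform the same conversion on the whole column before plotting.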

questions about matplotlib.dates.DateFormatter() and xticks() [duplicate]

I am trying to plot information against dates. I have a list of dates in the format "01/02/1991".
I converted them by doing the following:
x = parser.parse(date).strftime('%Y%m%d')
which gives 19910102
Then I tried to use num2date
import matplotlib.dates as dates
new_x = dates.num2date(x)
Plotting:
plt.plot_date(new_x, other_data, fmt="bo", tz=None, xdate=True)
But I get an error. It says "ValueError: year is out of range". Any solutions?
You can do this more simply using plot() instead of plot_date().
First, convert your strings to instances of Python datetime.date:
import datetime as dt
dates = ['01/02/1991','01/03/1991','01/04/1991']
x = [dt.datetime.strptime(d,'%m/%d/%Y').date() for d in dates]
y = range(len(x)) # many thanks to Kyss Tao for setting me straight here
Then plot:
import matplotlib.pyplot as plt
import matplotlib.dates as mdates
plt.gca().xaxis.set_major_formatter(mdates.DateFormatter('%m/%d/%Y'))
plt.gca().xaxis.set_major_locator(mdates.DayLocator())
plt.plot(x,y)
plt.gcf().autofmt_xdate()
I have too little reputation to comment on @bernie's answer in reply to @user1506145, but I have run into the same issue.
The fix is the interval parameter of the locator:
import matplotlib.pyplot as plt
import matplotlib.dates as mdates
import numpy as np
import datetime as dt
np.random.seed(1)
N = 100
y = np.random.rand(N)
now = dt.datetime.now()
then = now + dt.timedelta(days=100)
days = mdates.drange(now,then,dt.timedelta(days=1))
plt.gca().xaxis.set_major_formatter(mdates.DateFormatter('%Y-%m-%d'))
plt.gca().xaxis.set_major_locator(mdates.DayLocator(interval=5))
plt.plot(days,y)
plt.gcf().autofmt_xdate()
plt.show()
As @KyssTao has been saying, help(dates.num2date) says that x has to be a float giving the number of days since 0001-01-01 plus one. Hence, 19910102 is not 2/Jan/1991, because if you counted 19910102 days from 0001-01-01 you'd get something in the year 54513 or similar (divide by 365.25, the approximate number of days in a year).
Use datestr2num instead (see help(dates.datestr2num)):
new_x = dates.datestr2num(date) # where date is '01/02/1991'
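The "year is out of range" error can be reproduced without Matplotlib, since Python's own date type uses the same kind of day count (days since 0001-01-01). The sketch below is only an illustration with a made-up helper name, parse_us_date. It also assumes the day-count convention described above; newer Matplotlib releases (3.3 and later) use 1970-01-01 as the default epoch, so the exact numbers differ there.

```python
import datetime as dt


def parse_us_date(text):
    """Parse a 'MM/DD/YYYY' string into a datetime.date."""
    return dt.datetime.strptime(text, "%m/%d/%Y").date()


# Parsing the string directly gives the intended date.
d = parse_us_date("01/02/1991")
print(d.isoformat())  # 1991-01-02

# Interpreting the digits 19910102 as a day count overflows the
# supported range of years (1 to 9999), hence the ValueError.
try:
    dt.date.fromordinal(19910102)
    overflowed = False
except ValueError as exc:
    overflowed = True
    print("ValueError:", exc)
```

A date object obtained this way can be passed to plot() directly, as in the first answer.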
Adapting @Jacek Szałęga's answer for the use of a figure fig and corresponding axes object ax:
import matplotlib.pyplot as plt
import matplotlib.dates as mdates
import numpy as np
import datetime as dt
np.random.seed(1)
N = 100
y = np.random.rand(N)
now = dt.datetime.now()
then = now + dt.timedelta(days=100)
days = mdates.drange(now,then,dt.timedelta(days=1))
fig = plt.figure()
ax = fig.add_subplot(111)
ax.plot(days,y)
ax.xaxis.set_major_formatter(mdates.DateFormatter('%Y-%m-%d'))
ax.xaxis.set_major_locator(mdates.DayLocator(interval=5))
ax.tick_params(axis='x', labelrotation=45)
plt.show()

Trying to plot timedelta, but the yaxis is in 1e14, want format HH:MM:SS

I am plotting datetimes on the x-axis (actual dates) and timedeltas on the y-axis (time spans, i.e. amounts of time). Originally I used datetime for the y-axis, but I hit a use case where the time values went over 24 hours, which broke the code. So I switched to timedelta to accommodate these values. But when I plot it using plot_date, the y-axis with the timedelta values comes out funny.
I have my information stored in a dataframe originally, and then change the values to a timedelta. This is the code I use to produce the graph:
import datetime as dt
import matplotlib.dates as mdates
import matplotlib
import numpy as np
import matplotlib.pyplot as plt
import pandas as pd
import matplotlib as mpl
from matplotlib.backends.backend_pdf import PdfPages
plt.close('all')
#put data into dataframe
location='D:\CAT'
csvpath=location+('\metrics_summaryTEST.csv')
print csvpath
df=pd.read_csv(csvpath)
#setup plot/figure
media = set(df.mediaNumber.values)
num_plots = len(media)
ax = plt.gca()
pdfpath=location+('\metrics_graphs.pdf')
pp = PdfPages(pdfpath)
#declaring some variables
publishTimevals=np.zeros(len(df.publishTime.values),dtype="S20")
xdates=np.zeros(len(df.publishTime.values),dtype="S20")
ytimes=np.zeros(len(df.totalProcessTime.values),dtype="S8")
for f in sorted(media):
    name = f
    plt.figure(f)
    plt.clf()
    color = next(ax._get_lines.color_cycle)
    #PROCESS PUBLISHTIME
    publishTimevals= df.loc[df['mediaNumber']==f,['publishTime']]
    xdates = map(lambda x: mpl.dates.date2num(dt.datetime.strptime(x, '%Y-%m-%d %H:%M')),publishTimevals.publishTime)
    #PROCESS TOTALPROCESSTIME
    totalProcessTimevals= df.loc[df['mediaNumber']==f,['totalProcessTime']]
    ytimes = pd.to_timedelta(totalProcessTimevals.totalProcessTime)
    plt.plot_date(xdates,ytimes,'o-',label='totalProcessTime',color=color)
    print ytimes
    plt.show()
    #format the plot
    plt.gcf().autofmt_xdate()
    plt.xlabel('publishTime')
    plt.ylabel('ProcessTime HH:MM:SS')
    plt.legend(loc=8, bbox_to_anchor=(0.5,-0.3),ncol=3,prop={'size':9})
    ax.grid('on')
    plt.title('%s Processing Time' % (f))
    plt.margins(0.05)
    #plt.grid('on')
    plt.minorticks_on()
    plt.grid(which = 'minor', alpha = 0.3)
    plt.grid(which = 'major', alpha = 0.7)
    plt.show()
Could anyone point out what's going on here?
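A likely explanation for the 1e14 scale is that the timedelta values are being plotted as raw nanosecond counts (about 1e14 ns is a bit under 28 hours). One possible workaround, sketched below under that assumption, is to plot the durations as plain seconds and label the ticks yourself. The names format_hms and use_hms_yaxis are made up for this illustration.

```python
def format_hms(seconds, pos=None):
    """Format a duration in seconds as HH:MM:SS; hours may exceed 24."""
    sign = "-" if seconds < 0 else ""
    total = int(round(abs(seconds)))
    hours, rem = divmod(total, 3600)
    minutes, secs = divmod(rem, 60)
    return f"{sign}{hours:02d}:{minutes:02d}:{secs:02d}"


def use_hms_yaxis(ax):
    """Attach format_hms to the y-axis of a Matplotlib Axes."""
    # Imported here so the formatting helper stays usable without Matplotlib.
    from matplotlib.ticker import FuncFormatter
    ax.yaxis.set_major_formatter(FuncFormatter(format_hms))


print(format_hms(3661))        # 01:01:01
print(format_hms(90000))       # 25:00:00
print(format_hms(1e14 / 1e9))  # 27:46:40
```

In the code above this would mean plotting ytimes.dt.total_seconds() instead of ytimes and then calling use_hms_yaxis(plt.gca()).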

Floating Bar Chart

I'm trying to make a plot where the x-axis is time and the y-axis is a bar chart that will have the bars covering a certain time period like this:
______________
|_____________|
_____________________
|___________________|
----------------------------------------------------->
time
I have 2 lists of datetime values for the start and end of these times I'd like to have covered. So far I have
x = np.array([dt.datetime(2010, 1, 8, i,0) for i in range(24)])
to cover a 24-hour period. My question is then how do I set and plot my y-values to look like this?
You could use plt.barh:
import datetime as DT
import numpy as np
import matplotlib.pyplot as plt
import matplotlib.dates as mdates
start = [DT.datetime(2000,1,1)+DT.timedelta(days=i) for i in (2,0,3)]
end = [s+DT.timedelta(days=i) for s,i in zip(start, [15,7,10])]
start = mdates.date2num(start)
end = mdates.date2num(end)
yval = [1,2,3]
width = end-start
fig, ax = plt.subplots()
ax.barh(y=yval, width=width, left=start, height=0.3)  # 'y' replaced the older 'bottom' keyword in Matplotlib 2.0
xfmt = mdates.DateFormatter('%Y-%m-%d')
ax.xaxis.set_major_formatter(xfmt)
# autorotate the dates
fig.autofmt_xdate()
plt.show()
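The key step in the answer is that each bar's width is the difference between end and start, measured in days, because date2num converts datetimes to day counts. For the 24-hour case in the question, the same arithmetic can be sketched with the standard library alone; bar_widths_in_days and the sample times are hypothetical.

```python
import datetime as dt


def bar_widths_in_days(starts, ends):
    """Return the length of each start/end span as a fraction of a day."""
    return [(e - s).total_seconds() / 86400.0 for s, e in zip(starts, ends)]


# Two made-up periods on 2010-01-08: 02:00 to 06:00 and 09:00 to 21:00.
starts = [dt.datetime(2010, 1, 8, 2), dt.datetime(2010, 1, 8, 9)]
ends = [dt.datetime(2010, 1, 8, 6), dt.datetime(2010, 1, 8, 21)]

widths = bar_widths_in_days(starts, ends)
print(widths)
```

These widths correspond to end - start in the answer, and they would be combined with left=mdates.date2num(starts) in the barh call.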

Not write out all dates on an axis, Matplotlib

Take a look at this example:
import datetime as dt
from matplotlib import pyplot as plt
import matplotlib.dates as mdates
x = []
d = dt.datetime(2013, 7, 4)
for i in range(30):
    d = d+dt.timedelta(days=1)
    x.append(d)
y = range(len(x))
plt.gca().xaxis.set_major_formatter(mdates.DateFormatter('%d-%m-%Y'))
plt.gca().xaxis.set_major_locator(mdates.DayLocator())
plt.gcf().autofmt_xdate()
plt.bar(x,y)
plt.show()
The code writes out the dates on the x-axis of the plot. The problem is that the date labels overlap and become unreadable. How can I make matplotlib write out only every fifth or every tenth date?
You can pass an interval argument to the DayLocator, as in the following. With e.g. interval=5 the locator places a tick at every 5th date. Also, call autofmt_xdate() after the bar method to get the desired output.
import datetime as dt
from matplotlib import pyplot as plt
import matplotlib.dates as mdates
x = []
d = dt.datetime(2013, 7, 4)
for i in range(30):
    d = d+dt.timedelta(days=1)
    x.append(d)
y = range(len(x))
plt.gca().xaxis.set_major_formatter(mdates.DateFormatter('%d-%m-%Y'))
plt.gca().xaxis.set_major_locator(mdates.DayLocator(interval=5))
plt.bar(x, y, align='center') # center the bars on their x-values
plt.title('DayLocator with interval=5')
plt.gcf().autofmt_xdate()
plt.show()
With interval=3 you will get a tick for every 3rd date.
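As an alternative to a locator, the tick positions can also be chosen by hand by taking every nth element of the plotted dates. This is not part of the answer above, just a small sketch of the idea; every_nth is a made-up helper name.

```python
import datetime as dt


def every_nth(dates, n):
    """Return every nth date, starting with the first one."""
    return list(dates)[::n]


# The same 30 dates as in the example: 5 July 2013 to 3 August 2013.
x = [dt.datetime(2013, 7, 4) + dt.timedelta(days=i) for i in range(1, 31)]

ticks = every_nth(x, 5)
print([d.strftime("%d-%m-%Y") for d in ticks])
```

After plt.bar(x, y), passing ticks to plt.xticks(ticks) should then label only those six dates.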
