When I plot the complete data, it works fine and displays the date on the x-axis.
When I zoom into a particular portion of the plot, it shows only the time rather than the date. I understand that with fewer points it can't display several different dates, but how can I show the date, or set the date format, even when the graph is zoomed?
dataToPlot = pd.read_csv(fileName, names=['time','1','2','3','4','plotValue','6','7','8','9','10','11','12','13','14','15','16'],
                         sep=',', index_col=0, parse_dates=True, dayfirst=True)
# drop() returns a new frame, so assign the result back
dataToPlot = dataToPlot.drop(dataToPlot.index[0])
startTime = dataToPlot.head(1).index[0]
endTime = dataToPlot.tail(1).index[0]
# mar is the rolling-window size; pd.rolling_mean was removed from pandas, so use .rolling().mean()
ax = dataToPlot[startTime:endTime][['plotValue']].rolling(mar).mean().plot(linestyle='-', linewidth=3, markersize=9, color='#FECB00')
Thanks in advance!
I have a solution that makes the labels look consistent, though bear in mind that it will also include the time on the "larger scale" time plot.
The code below uses the matplotlib.dates functionality to choose a date format for the x-axis. Note that because we're using matplotlib formatting, you can't simply use df.plot; you must instead use plt.plot_date and convert your index to the correct format.
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from matplotlib import dates
# Generate some random data and plot it
time = pd.date_range('07/11/2014', periods=1000, freq='5min')
# pd.np and Series.data were removed from pandas; use numpy and .values instead
ts = pd.Series(np.random.randn(len(time)), index=time)
fig, ax = plt.subplots()
ax.plot_date(ts.index.to_pydatetime(), ts.values)
# Create your formatter object and change the xaxis formatting.
date_fmt = '%d/%m/%y %H:%M:%S'
formatter = dates.DateFormatter(date_fmt)
ax.xaxis.set_major_formatter(formatter)
plt.gcf().autofmt_xdate()
plt.show()
An example showing the fully zoomed out plot
An example showing the plot zoomed in.
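As an alternative to a fixed format string, here is a minimal sketch using matplotlib's ConciseDateFormatter (available in matplotlib 3.1 and later, as far as I know). It keeps the tick labels short and puts the date part in the axis offset text, so the date should stay visible when zoomed in. The data and the zoom window are made up for illustration, and the Agg backend is only there so the sketch runs without a display.

```python
import datetime as dt

import matplotlib
matplotlib.use("Agg")  # headless backend; drop this line for interactive use
import matplotlib.dates as mdates
import matplotlib.pyplot as plt
import numpy as np

# Made-up data: 1000 samples at 5-minute spacing, like the example above
start = dt.datetime(2014, 7, 11)
times = [start + dt.timedelta(minutes=5 * i) for i in range(1000)]
values = np.sin(np.linspace(0, 20, len(times)))

fig, ax = plt.subplots()
ax.plot(times, values)

# The formatter needs a locator so it knows the tick spacing
locator = mdates.AutoDateLocator()
formatter = mdates.ConciseDateFormatter(locator)
ax.xaxis.set_major_locator(locator)
ax.xaxis.set_major_formatter(formatter)

# Simulate zooming in to a three-hour window
ax.set_xlim(dt.datetime(2014, 7, 12, 3, 0), dt.datetime(2014, 7, 12, 6, 0))
fig.canvas.draw()

# The date now lives in the offset text next to the axis
offset = formatter.get_offset()
print(offset)
```

The trade-off is that the date appears once at the axis edge instead of on every tick, which may or may not be what you want.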
I am trying to create a heat map from a pandas DataFrame using the seaborn library. Here is the code:
# pd.DatetimeIndex(start=..., end=...) was removed from pandas; pd.date_range is the replacement
test_df = pd.DataFrame(np.random.randn(367, 5),
                       index=pd.date_range(start='01-01-2000', end='01-01-2001', freq='1D'))
ax = sns.heatmap(test_df.T)
ax.xaxis.set_major_locator(mdates.MonthLocator())
ax.xaxis.set_minor_locator(mdates.DayLocator())
ax.xaxis.set_major_formatter(mdates.DateFormatter('%b'))
ax.xaxis.set_minor_formatter(mdates.DateFormatter('%d'))
However, I am getting a figure with nothing printed on the x-axis.
Seaborn heatmap is a categorical plot. It scales from 0 to number of columns - 1, in this case from 0 to 366. The datetime locators and formatters expect values as dates (or more precisely, numbers that correspond to dates). For the year in question that would be numbers between 730120 (= 01-01-2000) and 730486 (= 01-01-2001).
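To make the "numbers that correspond to dates" concrete, here is a small sketch using only the standard library. It assumes the older matplotlib convention, where the date number equals the proleptic Gregorian ordinal, and also shows the values under the 1970-01-01 epoch that, to my knowledge, matplotlib 3.3 and later use by default.

```python
from datetime import date

start, stop = date(2000, 1, 1), date(2001, 1, 1)

# Older matplotlib: days since 0001-01-01, plus 1 (same as date.toordinal())
legacy_numbers = (start.toordinal(), stop.toordinal())

# Newer default epoch (assumed here to be 1970-01-01): days since that date
unix_epoch = date(1970, 1, 1)
modern_numbers = ((start - unix_epoch).days, (stop - unix_epoch).days)

print(legacy_numbers)  # (730120, 730486)
print(modern_numbers)  # (10957, 11323)
```

Either way, these values are nowhere near the 0 to 366 range of the categorical heatmap axis, which is why the date locators find nothing to label.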
So in order to use matplotlib.dates formatters and locators, you would first need to convert your dataframe index to datetime objects. You then cannot use a heatmap, but need a plot that allows for numerical axes, e.g. an imshow plot. You can then set the extent of that imshow plot to correspond to the date range you want to show.
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import matplotlib.dates as mdates
# pd.DatetimeIndex(start=..., end=...) was removed from pandas; pd.date_range is the replacement
df = pd.DataFrame(np.random.randn(367, 5),
                  index=pd.date_range(start='01-01-2000', end='01-01-2001', freq='1D'))
dates = df.index.to_pydatetime()
dnum = mdates.date2num(dates)
start = dnum[0] - (dnum[1]-dnum[0])/2.
stop = dnum[-1] + (dnum[1]-dnum[0])/2.
extent = [start, stop, -0.5, len(df.columns)-0.5]
fig, ax = plt.subplots()
im = ax.imshow(df.T.values, extent=extent, aspect="auto")
ax.xaxis.set_major_locator(mdates.MonthLocator())
ax.xaxis.set_minor_locator(mdates.DayLocator())
ax.xaxis.set_major_formatter(mdates.DateFormatter('%b'))
fig.colorbar(im)
plt.show()
I found this question when trying to do a similar thing and you can hack together a solution but it's not very pretty.
For example I get the current labels, loop over them to find the ones for January and set those to just the year, setting the rest to be blank.
This gives me year labels in the correct position.
xticklabels = ax.get_xticklabels()
for label in xticklabels:
    text = label.get_text()
    if text[5:7] == '01':
        label.set_text(text[0:4])
    else:
        label.set_text('')
ax.set_xticklabels(xticklabels)
Hopefully from that you can figure out what you want to do.
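A slightly less string-dependent variant of the same idea is to compute the tick positions from the index itself rather than parsing the label text. The sketch below is only an illustration: it uses plain pcolormesh as a stand-in for the seaborn heatmap and assumes each cell spans one integer step, so the cell centre sits at position + 0.5. The names month_starts and month_labels are mine.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; drop this line for interactive use
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

idx = pd.date_range("2000-01-01", "2001-01-01", freq="D")
df = pd.DataFrame(np.random.randn(len(idx), 5), index=idx)

fig, ax = plt.subplots()
mesh = ax.pcolormesh(df.T.values)  # column i covers x in [i, i + 1]

# Positions (row numbers) of the first day of each month, and their labels
month_starts = [i for i, day in enumerate(df.index) if day.day == 1]
month_labels = [df.index[i].strftime("%b") for i in month_starts]

# Put one tick at the centre of each month-start cell
ax.set_xticks([i + 0.5 for i in month_starts])
ax.set_xticklabels(month_labels)
fig.colorbar(mesh)
```

With a real seaborn heatmap you would apply the same set_xticks and set_xticklabels calls to the Axes it returns.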
In the timeline plot I'm making, I want the date ticks to show only specified dates. (In my example I show ticks for events 'a', but it can be any list of ticks.) I found how to do it when the x-axis data is numeric (upper subplot in my example), but this doesn't work with the timestamp data type (bottom plot).
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import matplotlib.dates as mdates
import matplotlib.ticker as ticker
myData = pd.DataFrame({'date':['2019-01-15','2019-02-10','2019-03-20','2019-04-17','2019-05-23','2019-06-11'],'cnt':range(6),'event':['a','b','a','b','a','b']})
myData['date'] = [pd.Timestamp(j) for j in myData['date']]
start = pd.Timestamp('2019-01-01')
stop = pd.Timestamp('2019-07-01')
inxa = myData.loc[myData['event'] == 'a'].index
inxb = myData.loc[myData['event'] == 'b'].index
# create two plots, one with 'cnt' as x-axis, the other 'dates' on x-axis.
fig, ax = plt.subplots(2,1,figsize=(16,9))
ax[0].plot((0,6),(0,0), 'k')
ax[1].plot((start, stop),(0,0))
for g in inxa:
    ax[0].plot((myData.loc[g,'cnt'],myData.loc[g,'cnt']),(0,1),c='r')
    ax[1].plot((myData.loc[g,'date'],myData.loc[g,'date']),(0,1),c='r')
for g in inxb:
    ax[0].plot((myData.loc[g,'cnt'],myData.loc[g,'cnt']),(0,2),c='b')
    ax[1].plot((myData.loc[g,'date'],myData.loc[g,'date']),(0,2),c='b')
xlist0 = myData.loc[myData['event']=='a','cnt']
xlist1 = myData.loc[myData['event']=='a','date']
ax[0].xaxis.set_major_locator(ticker.FixedLocator(xlist0))
# ax[1].xaxis.set_major_locator(**???**)
Couldn't find a sufficient duplicate, maybe I didn't look hard enough. There are a number of ways to do this:
Convert to numbers first, or use the underlying values of a pandas datetime Series:
xticks = [mdates.date2num(z) for z in xlist1]
# or
xticks = xlist1.values
and there are at least a couple of ways to use them:
ax[1].xaxis.set_major_locator(ticker.FixedLocator(xticks))
ax[1].xaxis.set_ticks(xticks)
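Putting the pieces together, a self-contained sketch of the date2num plus FixedLocator route might look like this. The event dates are taken from the 'a' events in the question; the variable names and the label format are my own choices.

```python
import datetime as dt

import matplotlib
matplotlib.use("Agg")  # headless backend; drop this line for interactive use
import matplotlib.dates as mdates
import matplotlib.pyplot as plt
import matplotlib.ticker as ticker

# Dates of the 'a' events from the question
event_dates = [dt.datetime(2019, 1, 15), dt.datetime(2019, 3, 20), dt.datetime(2019, 5, 23)]

fig, ax = plt.subplots()
ax.plot([dt.datetime(2019, 1, 1), dt.datetime(2019, 7, 1)], [0, 0], "k")
for d in event_dates:
    ax.plot([d, d], [0, 1], c="r")

# Convert the dates to matplotlib date numbers and pin the ticks there
tick_positions = mdates.date2num(event_dates)
ax.xaxis.set_major_locator(ticker.FixedLocator(tick_positions))
ax.xaxis.set_major_formatter(mdates.DateFormatter("%Y-%m-%d"))

fig.canvas.draw()
tick_labels = [t.get_text() for t in ax.get_xticklabels()]
print(tick_labels)
```

Setting an explicit DateFormatter is optional, but it makes the label text predictable once the locator is no longer a date locator.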
Date tick labels
How to set the xticklabels for date in matplotlib
how to get ticks every hour?
...
Compare the following code:
test = pd.DataFrame({'date':['20170527','20170526','20170525'],'ratio1':[1,0.98,0.97]})
test['date'] = pd.to_datetime(test['date'])
test = test.set_index('date')
ax = test.plot()
I added DateFormatter in the end:
test = pd.DataFrame({'date':['20170527','20170526','20170525'],'ratio1':[1,0.98,0.97]})
test['date'] = pd.to_datetime(test['date'])
test = test.set_index('date')
ax = test.plot()
ax.xaxis.set_minor_formatter(dates.DateFormatter('%d\n\n%a')) ## Added this line
The issue with the second graph is that it starts on 5-24 instead of 5-25. Also, 5-25 of 2017 is a Thursday, not a Monday. What is causing the issue? Is this timezone related? (I also don't understand why the date numbers are stacked on top of each other.)
In general the datetime utilities of pandas and matplotlib are incompatible. So trying to use a matplotlib.dates object on a date axis created with pandas will in most cases fail.
One reason can be seen, for example, in the documentation:
datetime objects are converted to floating point numbers which represent time in days since 0001-01-01 UTC, plus 1. For example, 0001-01-01, 06:00 is 1.25, not 0.25.
However, this is not the only difference and it is thus advisable not to mix pandas and matplotlib when it comes to datetime objects.
There is, however, the option to tell pandas not to use its own datetime format. In that case, using the matplotlib.dates tickers is possible. This can be controlled via:
df.plot(x_compat=True)
Since pandas does not provide sophisticated formatting capabilities for dates, one can use matplotlib for plotting and formatting.
import pandas as pd
import matplotlib.pyplot as plt
import matplotlib.dates as dates
df = pd.DataFrame({'date':['20170527','20170526','20170525'],'ratio1':[1,0.98,0.97]})
df['date'] = pd.to_datetime(df['date'])
usePandas=True
#Either use pandas
if usePandas:
    df = df.set_index('date')
    df.plot(x_compat=True)
    plt.gca().xaxis.set_major_locator(dates.DayLocator())
    plt.gca().xaxis.set_major_formatter(dates.DateFormatter('%d\n\n%a'))
    plt.gca().invert_xaxis()
    plt.gcf().autofmt_xdate(rotation=0, ha="center")
# or use matplotlib
else:
    plt.plot(df["date"], df["ratio1"])
    plt.gca().xaxis.set_major_locator(dates.DayLocator())
    plt.gca().xaxis.set_major_formatter(dates.DateFormatter('%d\n\n%a'))
    plt.gca().invert_xaxis()
plt.show()
Updated using the matplotlib object oriented API
usePandas=True
#Either use pandas
if usePandas:
    df = df.set_index('date')
    ax = df.plot(x_compat=True, figsize=(6, 4))
    ax.xaxis.set_major_locator(dates.DayLocator())
    ax.xaxis.set_major_formatter(dates.DateFormatter('%d\n\n%a'))
    ax.invert_xaxis()
    ax.get_figure().autofmt_xdate(rotation=0, ha="center")
# or use matplotlib
else:
    fig, ax = plt.subplots(figsize=(6, 4))
    ax.plot('date', 'ratio1', data=df)
    ax.xaxis.set_major_locator(dates.DayLocator())
    ax.xaxis.set_major_formatter(dates.DateFormatter('%d\n\n%a'))
    # invert_xaxis is an Axes method, not a Figure method
    ax.invert_xaxis()
plt.show()
I have the following code to plot a chart with matplotlib
#!/usr/bin/env python
import matplotlib.pyplot as plt
import urllib2
import json
req = urllib2.urlopen("http://localhost:17668/retrieval/data/getData.json?pv=LNLS:ANEL:corrente&donotchunk")
data = json.load(req)
secs = [x['secs'] for x in data[0]['data']]
vals = [x['val'] for x in data[0]['data']]
plt.plot(secs, vals)
plt.show()
The secs array is epoch time.
What I want is to plot the data in the x axis (secs) as a date (DD-MM-YYYY HH:MM:SS).
How can I do that?
To plot date-based data in matplotlib you must convert the data to the correct format.
One way is to first convert your data to datetime objects; for an epoch timestamp you should use datetime.datetime.fromtimestamp().
You must then convert the datetime objects to the right format for matplotlib; this can be handled using matplotlib.dates.date2num.
Alternatively you can use matplotlib.dates.epoch2num and skip converting your data to datetime objects in the first place. (While this will suit your use case better initially, I would recommend keeping date-based data in datetime objects as much as you can; it will save you a headache in the long run. Note also that epoch2num has been deprecated and removed in newer matplotlib releases.)
Once you have your data in the correct format you can plot it using plot_date.
Finally, to format your x-axis as you wish, you can use a matplotlib.dates.DateFormatter object to choose how your ticks will look.
import matplotlib.pyplot as plt
import matplotlib.dates as mdate
import numpy as np
# Generate some random data.
N = 40
now = 1398432160
raw = np.array([now + i*1000 for i in range(N)])
vals = np.sin(np.linspace(0,10,N))
# Convert to the correct format for matplotlib.
# mdate.epoch2num did this in older matplotlib releases but has since been removed;
# casting the epoch seconds to numpy datetime64 (UTC) works in current versions.
secs = raw.astype('datetime64[s]')
fig, ax = plt.subplots()
# Plot the date using plot_date rather than plot
ax.plot_date(secs, vals)
# Choose your xtick format string
date_fmt = '%d-%m-%y %H:%M:%S'
# Use a DateFormatter to set the data to the correct format.
date_formatter = mdate.DateFormatter(date_fmt)
ax.xaxis.set_major_formatter(date_formatter)
# Sets the tick labels diagonal so they fit easier.
fig.autofmt_xdate()
plt.show()
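For the first route described above (going through datetime objects), a minimal sketch could look like the following. It assumes the epoch values should be read as UTC; drop the tz argument if you want local time instead. Recent matplotlib versions accept datetime objects directly in plot, so an explicit date2num call is not needed here. The variable names are mine.

```python
import datetime as dt
import math

import matplotlib
matplotlib.use("Agg")  # headless backend; drop this line for interactive use
import matplotlib.dates as mdates
import matplotlib.pyplot as plt

# Made-up epoch timestamps (seconds), 1000 s apart, as in the example above
epoch_secs = [1398432160 + i * 1000 for i in range(40)]
vals = [math.sin(10 * i / 39) for i in range(40)]

# Convert each epoch value to a timezone-aware datetime (UTC assumed)
stamps = [dt.datetime.fromtimestamp(s, tz=dt.timezone.utc) for s in epoch_secs]

fig, ax = plt.subplots()
ax.plot(stamps, vals, "o")
ax.xaxis.set_major_formatter(mdates.DateFormatter("%d-%m-%Y %H:%M:%S"))
fig.autofmt_xdate()  # slant the labels so they fit

print(stamps[0].strftime("%d-%m-%Y %H:%M:%S"))  # 25-04-2014 13:22:40
```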
You can change the ticks locations and formats on your plot:
import matplotlib.pyplot as plt
import matplotlib.ticker as mtick
import time
secs = [10928389,102928123,383827312,1238248395]
vals = [12,8,4,12]
plt.plot(secs,vals)
plt.gcf().autofmt_xdate()
plt.gca().xaxis.set_major_locator(mtick.FixedLocator(secs))
plt.gca().xaxis.set_major_formatter(
    mtick.FuncFormatter(lambda pos,_: time.strftime("%d-%m-%Y %H:%M:%S",time.localtime(pos)))
)
plt.tight_layout()
plt.show()
I want to plot some timestamps (Year-month-day Hour-Minute-Second format). I am using the following code, however it doesn't show any hour-minute-second information, it shows them as 00-00-00. I double checked my date array, and as you can see from the snippet below, they are not zero.
Do you have any idea about why I am getting 00-00-00's?
import matplotlib.pyplot as plt
import matplotlib.dates as md
import dateutil
dates = [dateutil.parser.parse(s) for s in datestrings]
# datestrings = ['2012-02-21 11:28:17.980000', '2012-02-21 12:15:32.453000', '2012-02-21 23:26:23.734000', '2012-02-26 17:42:15.804000']
plt.subplots_adjust(bottom=0.2)
plt.xticks( rotation= 80 )
ax=plt.gca()
xfmt = md.DateFormatter('%Y-%m-%d %H:%M:%S')
ax.xaxis.set_major_formatter(xfmt)
plt.plot(dates[0:10],plt_data[0:10], "o-")
plt.show()
Try zooming in on your graph; you will see the datetimes expand as your x-axis scale changes.
plotting unix timestamps in matplotlib
I had a similarly annoying problem when trying to plot heatmaps of positive selection on chromosomes. If I zoomed out too far things would disappear entirely!
edit: This code plots your dates exactly as you give them, but doesn't add ticks in between.
import matplotlib.pyplot as plt
import matplotlib.dates as md
import dateutil
datestrings = ['2012-02-21 11:28:17.980000', '2012-02-21 12:15:32.453000', '2012-02-21 23:26:23.734000', '2012-02-26 17:42:15.804000']
dates = [dateutil.parser.parse(s) for s in datestrings]
plt_data = range(5,9)
plt.subplots_adjust(bottom=0.2)
plt.xticks( rotation=25 )
ax=plt.gca()
ax.set_xticks(dates)
xfmt = md.DateFormatter('%Y-%m-%d %H:%M:%S')
ax.xaxis.set_major_formatter(xfmt)
plt.plot(dates,plt_data, "o-")
plt.show()
I can tell you why it shows the 00:00:00. It's because that's the start time of that particular day. For example, one tick is at 2012-02-22 00:00:00 (12 midnight of 2012-02-22) and another is at 2012-02-23 00:00:00 (12 midnight of 2012-02-23).
Ticks for the timestamps in between these two times are not shown.
I myself am trying to figure out how to show ticks for in between these times.
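One way to get ticks between the midnights is to set an explicit locator. The sketch below uses matplotlib.dates.HourLocator with ticks at 00:00 and 12:00; the choice of a 12-hour spacing is arbitrary, and the dates are the ones from the question with the fractional seconds dropped.

```python
import datetime as dt

import matplotlib
matplotlib.use("Agg")  # headless backend; drop this line for interactive use
import matplotlib.dates as md
import matplotlib.pyplot as plt

dates = [
    dt.datetime(2012, 2, 21, 11, 28, 17),
    dt.datetime(2012, 2, 21, 12, 15, 32),
    dt.datetime(2012, 2, 21, 23, 26, 23),
    dt.datetime(2012, 2, 26, 17, 42, 15),
]
plt_data = [5, 6, 7, 8]

fig, ax = plt.subplots()
ax.plot(dates, plt_data, "o-")

# Place a major tick at midnight and at noon of every day
ax.xaxis.set_major_locator(md.HourLocator(byhour=[0, 12]))
ax.xaxis.set_major_formatter(md.DateFormatter("%Y-%m-%d %H:%M:%S"))
fig.autofmt_xdate(rotation=25)

fig.canvas.draw()
labels = [t.get_text() for t in ax.get_xticklabels()]
print(labels)
```

For a denser or sparser axis, change the byhour list, or use MinuteLocator or DayLocator from the same module.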