I'm trying to make a plot where the x-axis is time and the y-axis is a bar chart that will have the bars covering a certain time period like this:
______________
|_____________|
_____________________
|___________________|
----------------------------------------------------->
time
I have 2 lists of datetime values for the start and end of these times I'd like to have covered. So far I have
x = np.array([dt.datetime(2010, 1, 8, i,0) for i in range(24)])
to cover a 24-hour period. My question is then how do I set and plot my y-values to look like this?
You could use plt.barh:
import datetime as DT
import numpy as np
import matplotlib.pyplot as plt
import matplotlib.dates as mdates
start = [DT.datetime(2000,1,1)+DT.timedelta(days=i) for i in (2,0,3)]
end = [s+DT.timedelta(days=i) for s,i in zip(start, [15,7,10])]
start = mdates.date2num(start)
end = mdates.date2num(end)
yval = [1,2,3]
width = end-start
fig, ax = plt.subplots()
ax.barh(yval, width=width, left=start, height=0.3)
xfmt = mdates.DateFormatter('%Y-%m-%d')
ax.xaxis.set_major_formatter(xfmt)
# autorotate the dates
fig.autofmt_xdate()
plt.show()
yields
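For the 24-hour case in the question, the same idea can be sketched with hour-level labels. The interval values below are made up for illustration, and `to_bars` is a hypothetical helper name, not part of matplotlib.

```python
import datetime as dt
import matplotlib.pyplot as plt
import matplotlib.dates as mdates

# Hypothetical intervals within a single day (not taken from the question)
starts = [dt.datetime(2010, 1, 8, 2, 0), dt.datetime(2010, 1, 8, 9, 30)]
ends = [dt.datetime(2010, 1, 8, 7, 0), dt.datetime(2010, 1, 8, 18, 0)]

def to_bars(starts, ends):
    """Return (left, width) in matplotlib date units, i.e. days as floats."""
    left = mdates.date2num(starts)
    width = mdates.date2num(ends) - left
    return left, width

left, width = to_bars(starts, ends)

fig, ax = plt.subplots()
# One horizontal bar per interval, stacked at y = 0, 1, ...
ax.barh(range(len(left)), width, left=left, height=0.3)
ax.xaxis_date()
ax.xaxis.set_major_formatter(mdates.DateFormatter('%H:%M'))
# Show the full day on the x-axis
ax.set_xlim(mdates.date2num(dt.datetime(2010, 1, 8)),
            mdates.date2num(dt.datetime(2010, 1, 9)))
```

Call plt.show() afterwards to display the figure.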
I have a dataframe and I want to show them on graph. When I start my code, the x and y axis are non-sequential. How can I solve it? Also I give a example graph on picture. First image is mine, the second one is what I want.
This is my code:
from datetime import timedelta, date
import datetime as dt #date analyse
import matplotlib.pyplot as plt
import pandas as pd #read file
def daterange(date1, date2):
    for n in range(int ((date2 - date1).days)+1):
        yield date1 + timedelta(n)
tarih="01-01-2021"
tarih2="20-06-2021"
start=dt.datetime.strptime(tarih, '%d-%m-%Y')
end=dt.datetime.strptime(tarih2, '%d-%m-%Y')
fg=pd.DataFrame()
liste=[]
tarih=[]
for dt in daterange(start, end):
    dates=dt.strftime("%d-%m-%Y")
    with open("fng_value.txt", "r") as filestream:
        for line in filestream:
            date = line.split(",")[0]
            if dates == date:
                fng_value=line.split(",")[1]
                liste.append(fng_value)
                tarih.append(dates)
fg['date']=tarih
fg['fg_value']=liste
print(fg.head())
plt.subplots(figsize=(20, 10))
plt.plot(fg.date,fg.fg_value)
plt.title('Fear&Greed Index')
plt.ylabel('Fear&Greed Data')
plt.xlabel('Date')
plt.show()
This is my graph:
This is the graph that I want:
Line plot with datetime x axis
So it appears this code is opening a text file, adding values to either a list of dates or a list of values, and then making a pandas dataframe with those lists. Finally, it plots the date vs values with a line plot.
A few changes should help your graph look a lot better. A lot of this is very basic, and I'd recommend reviewing some matplotlib tutorials. The Real Python tutorial is a good starting place in my opinion.
Fix the y-axis limits. These calls are methods of an Axes object (for example ax from fig, ax = plt.subplots()), not of plt:
ax.set_ylim(0, 100)
Use an x-axis locator from mdates to get better-spaced tick labels. The right choice depends on your time range; I made some data and used a day locator.
import matplotlib.dates as mdates
ax.xaxis.set_major_locator(mdates.DayLocator())
Use a scatter plot to add data points, as on the linked graph:
ax.scatter(x, y ... )
Add a grid:
ax.grid(axis='both', color='gray', alpha=0.5)
Rotate the x tick labels:
ax.tick_params(axis='x', rotation=45)
I simulated some data and plotted it to look like the plot you linked, this may be helpful for you to work from.
import matplotlib.pyplot as plt
import pandas as pd
import numpy as np
import matplotlib.dates as mdates
fig, ax = plt.subplots(figsize=(15,5))
x = pd.date_range(start='june 26th 2021', end='july 25th 2021')
rng = np.random.default_rng()
y = rng.integers(low=15, high=25, size=len(x))
ax.plot(x, y, color='gray', linewidth=2)
ax.scatter(x, y, color='gray')
ax.set_ylim(0,100)
ax.grid(axis='both', color='gray', alpha=0.5)
ax.set_yticks(np.arange(0,101, 10))
ax.xaxis.set_major_locator(mdates.DayLocator())
ax.tick_params(axis='x', rotation=45)
ax.set_xlim(min(x), max(x))
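One more thing worth checking in the question's code: everything read from the text file with line.split(",") is a string, and matplotlib plots strings as categories in order of appearance, which can make both axes look non-sequential. A minimal sketch of converting the columns before plotting, assuming the file has "dd-mm-YYYY,value" rows (the sample rows here are invented):

```python
import pandas as pd

# Hypothetical rows in the same "date,value" shape as fng_value.txt
rows = ["03-01-2021,94", "01-01-2021,9", "02-01-2021,93"]

fg = pd.DataFrame([r.strip().split(",") for r in rows],
                  columns=["date", "fg_value"])

# Strings -> real datetimes and numbers, so the axes become continuous
fg["date"] = pd.to_datetime(fg["date"], format="%d-%m-%Y")
fg["fg_value"] = pd.to_numeric(fg["fg_value"])
fg = fg.sort_values("date").reset_index(drop=True)
```

After this, ax.plot(fg["date"], fg["fg_value"]) should give a date axis and a numeric y-axis that the locator and limit settings above can act on.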
I have an x-axis labels of timedelta64[ns] coming out as:
0 days 12:01:13.165040
How can I achieve the following format if there's only 1 day?:
12:01:13
If there's more than one day, I need the following format:
2018.11.27
I've been successful in modifying the labels by making a function and then calling it with:
ax.xaxis.set_major_formatter(plt.FuncFormatter(xaxisFormat))
But I don't know how to exactly go about formatting them.
You could set the respective formatter in dependence of the limits of the plot. This could look as follows.
import numpy as np
import datetime as dt
import matplotlib.pyplot as plt
from matplotlib.dates import DateFormatter, HourLocator, DayLocator
def plot_something(h, ax=None):
    td = np.arange(0,h, np.timedelta64(1, "h"))
    y = np.sin(np.linspace(0,h,len(td)))
    t = np.datetime64("2018-11-27") + td
    (ax or plt.gca()).plot(t, y)
fig, axes = plt.subplots(nrows=4)
plot_something(16, ax=axes[0])
plot_something(24, ax=axes[1])
plot_something(40, ax=axes[2])
plot_something(72, ax=axes[3])
def ticking(ax):
    d = np.diff(ax.get_xlim())
    if d <= 1:
        ax.xaxis.set_major_formatter(DateFormatter("%H:%M:%S"))
    elif d <= 2:
        ax.xaxis.set_major_locator(HourLocator(byhour=(0,6,12,18)))
        ax.xaxis.set_major_formatter(DateFormatter("%H:%M:%S"))
    else:
        ax.xaxis.set_major_locator(DayLocator())
        ax.xaxis.set_major_formatter(DateFormatter("%Y.%m.%d"))
for ax in axes.flat:
    ticking(ax)
fig.tight_layout()
plt.show()
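If the exact label formats are negotiable, a different technique is to let matplotlib pick them: ConciseDateFormatter paired with AutoDateLocator adapts the labels to the visible range. This is a sketch of that alternative, not the approach above, and its default formats will not exactly match the "12:01:13" / "2018.11.27" patterns requested. The 40-hour data is made up.

```python
import numpy as np
import matplotlib.pyplot as plt
import matplotlib.dates as mdates

# Hypothetical hourly data spanning 40 hours
t = np.datetime64("2018-11-27T00") + np.arange(40) * np.timedelta64(1, "h")
y = np.sin(np.linspace(0, 40, len(t)))

fig, ax = plt.subplots()
ax.plot(t, y)

# The locator chooses tick positions; the formatter adapts labels to them
locator = mdates.AutoDateLocator()
ax.xaxis.set_major_locator(locator)
ax.xaxis.set_major_formatter(mdates.ConciseDateFormatter(locator))
```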
I am plotting date time on the xaxis (which is actual dates) and then timedelta on the yaxis, which is actually time spans, or amount of time. Originally I was using date time for the yaxis, but I came across the usecase where the time values went over 24 hours, and then it broke the code. So instead I had to use timedelta in order to accommodate these values. But when I try to plot it using plot_date, the yaxis with the timedelta values comes out funny.
I have my information stored in a dataframe originally, and then change the values to a timedelta. This is the code I have to output this graph
import datetime as dt
import matplotlib.dates as mdates
import matplotlib
import numpy as np
import matplotlib.pyplot as plt
import pandas as pd
import matplotlib as mpl
from matplotlib.backends.backend_pdf import PdfPages
plt.close('all')
#put data into dataframe
location=r'D:\CAT'
csvpath=location+(r'\metrics_summaryTEST.csv')
print(csvpath)
df=pd.read_csv(csvpath)
#setup plot/figure
media = set(df.mediaNumber.values)
num_plots = len(media)
ax = plt.gca()
pdfpath=location+(r'\metrics_graphs.pdf')
pp = PdfPages(pdfpath)
#declaring some variables
publishTimevals=np.zeros(len(df.publishTime.values),dtype="S20")
xdates=np.zeros(len(df.publishTime.values),dtype="S20")
ytimes=np.zeros(len(df.totalProcessTime.values),dtype="S8")
for f in sorted(media):
    name = f
    plt.figure(f)
    plt.clf()
    color = next(ax._get_lines.color_cycle)
    #PROCESS PUBLISHTIME
    publishTimevals= df.loc[df['mediaNumber']==f,['publishTime']]
    xdates = map(lambda x: mpl.dates.date2num(dt.datetime.strptime(x, '%Y-%m-%d %H:%M')),publishTimevals.publishTime)
    #PROCESS TOTALPROCESSTIME
    totalProcessTimevals= df.loc[df['mediaNumber']==f,['totalProcessTime']]
    ytimes = pd.to_timedelta(totalProcessTimevals.totalProcessTime)
    plt.plot_date(xdates,ytimes,'o-',label='totalProcessTime',color=color)
    print(ytimes)
    plt.show()
    #format the plot
    plt.gcf().autofmt_xdate()
    plt.xlabel('publishTime')
    plt.ylabel('ProcessTime HH:MM:SS')
    plt.legend(loc=8, bbox_to_anchor=(0.5,-0.3),ncol=3,prop={'size':9})
    ax.grid('on')
    plt.title('%s Processing Time' % (f))
    plt.margins(0.05)
    #plt.grid('on')
    plt.minorticks_on()
    plt.grid(which = 'minor', alpha = 0.3)
    plt.grid(which = 'major', alpha = 0.7)
    plt.show()
Could anyone point out what's going on here?
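A possible workaround, offered as a sketch rather than a tested fix for the asker's file: as far as I know matplotlib has no built-in handling for timedelta values, so they tend to be plotted as raw numbers. Converting them to seconds first and formatting the ticks with a FuncFormatter keeps durations above 24 hours readable. The data and the helper name format_hms are hypothetical.

```python
import matplotlib.pyplot as plt
import pandas as pd
from matplotlib.ticker import FuncFormatter

def format_hms(seconds, pos=None):
    """Format a number of seconds as HH:MM:SS; hours may exceed 24."""
    sign = "-" if seconds < 0 else ""
    seconds = abs(int(round(seconds)))
    hours, rem = divmod(seconds, 3600)
    minutes, secs = divmod(rem, 60)
    return f"{sign}{hours:02d}:{minutes:02d}:{secs:02d}"

# Hypothetical data: publish times and processing durations in seconds
publish = pd.to_datetime(["2015-06-01 08:00", "2015-06-02 09:30",
                          "2015-06-03 10:15"])
process = pd.to_timedelta([8100, 96010, 39930], unit="s")

fig, ax = plt.subplots()
# Plot plain seconds on the y-axis, then relabel the ticks as HH:MM:SS
ax.plot(publish, process.total_seconds(), "o-", label="totalProcessTime")
ax.yaxis.set_major_formatter(FuncFormatter(format_hms))
ax.set_ylabel("ProcessTime HH:MM:SS")
fig.autofmt_xdate()
```

The second sample value is 26 hours 40 minutes 10 seconds, which format_hms renders as 26:40:10 instead of wrapping past midnight.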
I wrote a simple script below to generate a graph with matplotlib. I would like to increase the x-tick frequency from monthly to weekly and rotate the labels. I'm not sure where to start with the x-axis frequency. My rotation line yields an error: TypeError: set_xticks() got an unexpected keyword argument 'rotation'. For the rotation, I'd prefer not to use plt.xticks(rotation=70) as I may eventually build in multiple subplots, some of which should have a rotated axis and some which should not.
import datetime
import matplotlib
import matplotlib.pyplot as plt
from datetime import date, datetime, timedelta
def date_increments(start, end, delta):
    curr = start
    while curr <= end:
        yield curr
        curr += delta
x_values = [[res] for res in date_increments(date(2014, 1, 1), date(2014, 12, 31), timedelta(days=1))]
print(len(x_values))
y_values = [x**2 for x in range(len(x_values))]
print(len(y_values))
fig = plt.figure()
ax = fig.add_subplot(111)
ax.plot(x_values, y_values)
ax.set_xticks(rotation=70)
plt.show()
Have a look at matplotlib.dates, particularly at this example.
Tick frequency
You will probably want to do something like this:
from matplotlib.dates import DateFormatter, DayLocator, MonthLocator
days = DayLocator()
months = MonthLocator()
months_f = DateFormatter('%m')
ax.xaxis.set_major_locator(months)
ax.xaxis.set_minor_locator(days)
ax.xaxis.set_major_formatter(months_f)
ax.xaxis_date()
This will plot days as minor ticks and months as major ticks, labelled with the month number.
Rotation of the labels
You can use plt.setp() to change axes individually:
plt.setp(ax.get_xticklabels(), rotation=70, horizontalalignment='right')
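For the weekly ticks asked about in the question, a minimal sketch using WeekdayLocator (one major tick per Monday; the choice of Monday and the date format are assumptions) together with a per-axes rotation:

```python
import datetime as dt
import matplotlib.pyplot as plt
import matplotlib.dates as mdates

# Daily data for 2014, similar in shape to the question's example
x = [dt.date(2014, 1, 1) + dt.timedelta(days=i) for i in range(365)]
y = [i ** 2 for i in range(len(x))]

fig, ax = plt.subplots()
ax.plot(x, y)

# One major tick per week, placed on Mondays
ax.xaxis.set_major_locator(mdates.WeekdayLocator(byweekday=mdates.MO))
ax.xaxis.set_major_formatter(mdates.DateFormatter('%Y-%m-%d'))

# Rotate labels on this Axes only; other subplots are unaffected
ax.tick_params(axis='x', labelrotation=70)
```

Because the rotation is set through ax rather than plt, it can be applied to some subplots and not others.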
Hope this helps.
Take a look at this example:
import datetime as dt
from matplotlib import pyplot as plt
import matplotlib.dates as mdates
x = []
d = dt.datetime(2013, 7, 4)
for i in range(30):
    d = d+dt.timedelta(days=1)
    x.append(d)
y = range(len(x))
plt.gca().xaxis.set_major_formatter(mdates.DateFormatter('%d-%m-%Y'))
plt.gca().xaxis.set_major_locator(mdates.DayLocator())
plt.gcf().autofmt_xdate()
plt.bar(x,y)
plt.show()
The code writes out dates on the x-axis in the plot, see the picture below. The problem is that the dates get clogged up, as seen in the picture. How to make matplotlib to only write out every fifth or every tenth coordinate?
You can pass an interval argument to the DayLocator, as in the following. With e.g. interval=5, the locator places a tick at every 5th date. Also, call autofmt_xdate() after the bar method to get the desired output.
import datetime as dt
from matplotlib import pyplot as plt
import matplotlib.dates as mdates
x = []
d = dt.datetime(2013, 7, 4)
for i in range(30):
    d = d+dt.timedelta(days=1)
    x.append(d)
y = range(len(x))
plt.gca().xaxis.set_major_formatter(mdates.DateFormatter('%d-%m-%Y'))
plt.gca().xaxis.set_major_locator(mdates.DayLocator(interval=5))
plt.bar(x, y, align='center') # center the bars on their x-values
plt.title('DayLocator with interval=5')
plt.gcf().autofmt_xdate()
plt.show()
With interval=3 you will get a tick for every 3rd date: