I'm building a time schedule restricted to business hours (8am to 5pm) using pd.offsets.CustomBusinessHour, and I'm trying to plot it as a Gantt chart (horizontal bar chart) with matplotlib.
I want to cut out the stretch of the x-axis that falls outside business hours, since it is unnecessary. Right now the axis still shows the gap between 5pm on one day and 8am on the next.
I looked through the parameters of BusinessHour and searched for tick settings using keywords such as 'interval' and 'spacing', but I couldn't find a suitable solution.
I also tried other plotting approaches using the matplotlib.dates module, without success.
Here is my Python code.
import pandas as pd
from datetime import datetime, date, timedelta, time
import matplotlib.pyplot as plt
import matplotlib.ticker as ticker
import matplotlib.dates as mdates
num = 6
start_time = datetime(2021, 7, 7, 13, 5, 16, 268902)
int_to_time = pd.offsets.CustomBusinessHour(start="08:00", end="17:00", weekmask="1111111")
duration = num * int_to_time
horizon = [start_time + (i+1) * int_to_time for i in range(num+1)]
horizon = [i.replace(microsecond=0) for i in horizon]
fig, gnt = plt.subplots(figsize=(12,3))
gnt.barh(y=1, width=duration, left=start_time, color="cyan", height=0.2)
gnt.set_xticks(horizon)
gnt.set_xticklabels(horizon, rotation=90)
gnt.tick_params(bottom=False, labelbottom=False, top=True, labeltop=True)
plt.show()
You are building a Gantt chart and having trouble with the spacing of the x-axis labels. Your x-axis represents Timestamps, and you want the ticks evenly spaced, one per business hour.
Tick locations are determined by tick Locators and tick labels by tick Formatters. The default locator for datetimes is AutoDateLocator, which here is likely delegating to HourLocator. It places ticks along a continuous 24-hour datetime axis, so the overnight gap between 5pm and 8am still takes up space.
One solution is to use LinearLocator or FixedLocator along with a FixedFormatter. This gives you direct control over the tick locations and labels. Note that FixedFormatter assigns labels to ticks by position, so the labels are no longer tied to the underlying datetime values.
I must add that there are many tutorials and posts about how to make a Gantt chart with matplotlib or plotly that are easily searchable. I recommend reviewing some of those as you develop your plots.
The solution is implemented below in the context of your code.
import pandas as pd
from datetime import datetime, date, timedelta, time
import matplotlib.pyplot as plt
import matplotlib.ticker as ticker
import matplotlib.dates as mdates
num = 6
start_time = datetime(2021, 7, 7, 13, 5, 16, 268902)
int_to_time = pd.offsets.CustomBusinessHour(start="08:00", end="17:00", weekmask="1111111")
duration = num * int_to_time
horizon = [start_time + (i+1) * int_to_time for i in range(num+1)]
horizon = [i.replace(microsecond=0) for i in horizon]
fig, gnt = plt.subplots(figsize=(12,3))
gnt.barh(y=1, width=duration, left=start_time, color="cyan", height=0.2)
gnt.xaxis.set_major_locator(ticker.LinearLocator(7))
gnt.xaxis.set_major_formatter(ticker.FixedFormatter(horizon))
gnt.tick_params(bottom=False, labelbottom=False, top=True, labeltop=True, rotation=90)
plt.show()
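If the overnight gap itself has to disappear (rather than just relabelling the ticks), one option is to plot against elapsed business hours instead of real timestamps and use the timestamps only as labels. This is a minimal sketch under that assumption; the names bh, positions and labels are mine, not from the question.

```python
from datetime import datetime

import matplotlib.pyplot as plt
import matplotlib.ticker as ticker
import pandas as pd

num = 6
start_time = datetime(2021, 7, 7, 13, 5, 16)
bh = pd.offsets.CustomBusinessHour(start="08:00", end="17:00", weekmask="1111111")

# x is measured in business hours elapsed since start_time, so the
# stretch between 17:00 and 08:00 takes up no room on the axis.
positions = list(range(num + 1))

# Real timestamps are only used as tick labels: tick k is labelled with
# start_time advanced by k business hours.
stamps = [pd.Timestamp(start_time)] + [start_time + k * bh for k in positions[1:]]
labels = [ts.strftime("%m-%d %H:%M") for ts in stamps]

fig, ax = plt.subplots(figsize=(12, 3))
ax.barh(y=1, width=num, left=0, color="cyan", height=0.2)

# FixedLocator pins the ticks to the integer positions and
# FixedFormatter supplies the matching label for each of them.
ax.xaxis.set_major_locator(ticker.FixedLocator(positions))
ax.xaxis.set_major_formatter(ticker.FixedFormatter(labels))
ax.tick_params(axis="x", bottom=False, labelbottom=False,
               top=True, labeltop=True, labelrotation=90)
```

Call plt.show() to display the figure. The trade-off is that the x-axis is no longer a real date axis, so any other data drawn on it has to be converted to business-hour positions in the same way.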
Related
I am trying to create a plot with an amount (int) on the y-axis and days on the x-axis.
I want the plot to always show the whole month on the x-axis, even though I don't have data for every day.
This is the code I tried:
import matplotlib.pyplot as plt
import numpy as np
import matplotlib.dates as mdates
import datetime as dt
df = get_pandas_data(datab)  # Load data from the database into a pandas DataFrame
fig = plt.figure(figsize=(10, 10))  # Initialize plot
ax1 = fig.add_subplot(1, 1, 1)
dates = [dt.datetime.strptime(d, '%Y-%m-%d').date() for d in df['date']]
dates = list(set(dates))  # Collect the dates from the DataFrame; set() drops repeated dates
s = df.resample('D', on='date')['amount'].sum()  # Total amount per date
ax1.bar(dates, s)  # Bar plot of amount per date
ax1.set(xlabel="Date",
        ylabel="Balance (€)",
        title="Total Monthly balance")  # Plot information
ax1.xaxis.set_major_formatter(mdates.DateFormatter('%d-%m-%Y'))
# This is supposed to show every day of the month on the x-axis
ax1.xaxis.set_major_locator(mdates.DayLocator(interval=1))
fig.autofmt_xdate()
plt.show()
The result I get is a plot showing only the days that have data.
How can I make the plot show all days in the month, with bars only on the days that have data?
This works fine with bare datetimes and matplotlib, so something is probably going wrong with your data during the pandas manipulations. We can't really help because we don't have your dataframe. It's always preferable to create a standalone example with dummy data and as little code as possible to reproduce the issue: a) 90% of the time you will spot the problem yourself, and b) if not, we can help...
import numpy as np
import matplotlib.pyplot as plt
import datetime
x = np.array([1, 3, 7, 8, 10])
y = x * 2
dates = [datetime.datetime(2000, 2, xx) for xx in x]
fig, ax = plt.subplots()
ax.bar(dates, y)
fig.autofmt_xdate()
plt.show()
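Building on that, here is a minimal sketch of one way to get every day of the month on the axis: aggregate per day, then reindex onto a full-month date range so that missing days become zero-height bars. The DataFrame is made-up stand-in data (the real get_pandas_data output is not shown in the question); the column names date and amount are taken from the question's code.

```python
import matplotlib.dates as mdates
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical stand-in for the asker's data: a few dated amounts,
# with one date repeated and most days of the month missing.
df = pd.DataFrame({
    "date": pd.to_datetime(["2000-02-01", "2000-02-03", "2000-02-03", "2000-02-10"]),
    "amount": [2, 6, 1, 20],
})

# Total amount per date that actually occurs in the data.
daily = df.groupby("date")["amount"].sum()

# Reindex onto every day of the month; days without data get 0.
full_month = pd.date_range("2000-02-01", "2000-02-29", freq="D")
daily = daily.reindex(full_month, fill_value=0)

fig, ax = plt.subplots(figsize=(10, 4))
ax.bar(daily.index, daily.values)
ax.xaxis.set_major_locator(mdates.DayLocator(interval=1))
ax.xaxis.set_major_formatter(mdates.DateFormatter("%d-%m-%Y"))
fig.autofmt_xdate()
```

Because the x values and the heights come from the same Series, they always have matching length and order, which is not guaranteed when the dates are collected separately through set().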
I have pulled in a dataset that I want to use, with columns named Date and Adjusted. Adjusted is just the adjusted percentage growth on the base month.
The code I currently have is:
x = data['Date']
y = data['Adjusted']
fig = plt.figure(dpi=128, figsize=(7,3))
plt.plot(x,y)
plt.title("FTSE 100 Growth", fontsize=25)
plt.xlabel("Date", fontsize=14)
plt.ylabel("Adjusted %", fontsize=14)
plt.show()
However, when I run it I get essentially a solid black line across the bottom where all of the dates overlap each other. It is trying to show every single date, when I only want to show the major ones. The dates are in the format Apr-19, and the data runs from Oct-03 to May-20.
How do I limit the number of date ticks and labels to one per year, or any amount I choose? If you do have a solution, if you could respond with the edits made to the code itself that would be great. I've tried other solutions I've found on here but I haven't been able to get it to work.
The dates module of matplotlib will do the job. You can control the tick interval through the MonthLocator (currently set to 6 months). Here's how:
import pandas as pd
from datetime import date, datetime, timedelta
import matplotlib.pyplot as plt
import matplotlib.dates as md
import numpy as np
import matplotlib.ticker as ticker
x = data['Date']
y = data['Adjusted']
#converts differently formatted date to a datetime object
def convert_date(df):
    return datetime.strptime(df['Date'], '%b-%y')
data['Formatted_Date'] = data.apply(convert_date, axis=1)
# plot
fig, ax = plt.subplots(1, 1)
ax.plot(data['Formatted_Date'], y,'ok')
## Set time format and the interval of ticks (every 6 months)
xformatter = md.DateFormatter('%Y-%m') # format as year, month
xlocator = md.MonthLocator(interval = 6)
## Set xtick labels to appear every 6 months
ax.xaxis.set_major_locator(xlocator)
## Format xtick labels as YYYY-mm
plt.gcf().axes[0].xaxis.set_major_formatter(xformatter)
plt.title("FTSE 100 Growth", fontsize=25)
plt.xlabel("Date", fontsize=14)
plt.ylabel("Adjusted %", fontsize=14)
plt.show()
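If one tick per year is what you're after, YearLocator can take the place of MonthLocator. This is a hedged sketch with made-up stand-in data (the real dataset isn't shown); it also parses the Apr-19 style strings in a single pd.to_datetime call instead of a row-wise apply.

```python
import matplotlib.dates as md
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical stand-in for the asker's data: monthly rows from Oct-03
# to May-20, with 'Date' stored as strings such as 'Apr-19'.
months = pd.date_range("2003-10-01", "2020-05-01", freq="MS")
data = pd.DataFrame({"Date": months.strftime("%b-%y")})
data["Adjusted"] = range(len(data))  # placeholder values, not real FTSE data

# Parse the 'Mon-YY' strings into real datetimes in one vectorised call.
data["Formatted_Date"] = pd.to_datetime(data["Date"], format="%b-%y")

fig, ax = plt.subplots(dpi=128, figsize=(7, 3))
ax.plot(data["Formatted_Date"], data["Adjusted"])

# One major tick on 1 January of each year, labelled with the year only.
ax.xaxis.set_major_locator(md.YearLocator())
ax.xaxis.set_major_formatter(md.DateFormatter("%Y"))

ax.set_title("FTSE 100 Growth", fontsize=25)
ax.set_xlabel("Date", fontsize=14)
ax.set_ylabel("Adjusted %", fontsize=14)
fig.autofmt_xdate()
```

YearLocator also takes a base argument, so md.YearLocator(base=2) would give a tick every second year.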
I'm trying to make a plot where the x-axis is time and each horizontal bar covers a certain time period, like this:
______________
|_____________|
_____________________
|___________________|
----------------------------------------------------->
time
I have 2 lists of datetime values for the start and end of these times I'd like to have covered. So far I have
x = np.array([dt.datetime(2010, 1, 8, i,0) for i in range(24)])
to cover a 24-hour period. My question is then how do I set and plot my y-values to look like this?
You could use plt.barh:
import datetime as DT
import numpy as np
import matplotlib.pyplot as plt
import matplotlib.dates as mdates
start = [DT.datetime(2000,1,1)+DT.timedelta(days=i) for i in (2,0,3)]
end = [s+DT.timedelta(days=i) for s,i in zip(start, [15,7,10])]
start = mdates.date2num(start)
end = mdates.date2num(end)
yval = [1,2,3]
width = end-start
fig, ax = plt.subplots()
ax.barh(y=yval, width=width, left=start, height=0.3)
xfmt = mdates.DateFormatter('%Y-%m-%d')
ax.xaxis.set_major_formatter(xfmt)
# autorotate the dates
fig.autofmt_xdate()
plt.show()
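The same idea adapted to the 24-hour window from the question might look like the sketch below. The start and end hours are invented for illustration; the real lists of datetimes would go in their place.

```python
import datetime as dt

import matplotlib.dates as mdates
import matplotlib.pyplot as plt

day = dt.datetime(2010, 1, 8)

# Hypothetical start/end times within the day (hours are made up).
starts = [day + dt.timedelta(hours=h) for h in (2, 9, 15)]
ends = [day + dt.timedelta(hours=h) for h in (6, 12, 22)]

# date2num returns floats measured in days, so the widths are
# fractions of a day.
left = mdates.date2num(starts)
width = mdates.date2num(ends) - left

fig, ax = plt.subplots()
ax.barh(y=[1, 2, 3], width=width, left=left, height=0.3)

# Show the full 24 hours with a labelled tick every two hours.
ax.set_xlim(mdates.date2num(day), mdates.date2num(day + dt.timedelta(days=1)))
ax.xaxis.set_major_locator(mdates.HourLocator(interval=2))
ax.xaxis.set_major_formatter(mdates.DateFormatter("%H:%M"))
fig.autofmt_xdate()
```

Call plt.show() to display it. Each bar starts at its start time and extends for end minus start, which is what the left and width arguments express.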
I am new to matplotlib (1.3.1-2) and I cannot find a decent place to start.
I want to plot the distribution of points over time in a histogram with matplotlib.
Basically I want to plot the cumulative sum of the occurrence of a date.
date
2011-12-13
2011-12-13
2013-11-01
2013-11-01
2013-06-04
2013-06-04
2014-01-01
...
That would make
2011-12-13 -> 2 times
2013-11-01 -> 3 times
2013-06-04 -> 2 times
2014-01-01 -> once
Since there will be many points over many years, I want to set the start and end dates on my x-axis, mark ticks at n-time steps (e.g. 1-year steps), and finally decide how many bins there will be.
How would I achieve that?
Matplotlib uses its own numeric format for dates/times (a float number of days), and the dates module provides simple functions to convert to and from it. It also provides various Locators and Formatters that take care of placing the ticks on the axis and formatting the corresponding labels. This should get you started:
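import random
from datetime import datetime, timezone
import matplotlib.pyplot as plt
import matplotlib.dates as mdates
# generate some random data (approximately over 5 years)
data = [float(random.randint(1271517521, 1429197513)) for _ in range(1000)]
# convert the epoch format to matplotlib date format
# (older Matplotlib offered mdates.epoch2num for this; it is deprecated/removed in recent releases)
mpl_data = mdates.date2num([datetime.fromtimestamp(t, tz=timezone.utc) for t in data])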
# plot it
fig, ax = plt.subplots(1,1)
ax.hist(mpl_data, bins=50, color='lightblue')
ax.xaxis.set_major_locator(mdates.YearLocator())
ax.xaxis.set_major_formatter(mdates.DateFormatter('%d.%m.%y'))
plt.show()
To add to hitzg's answer, you can use AutoDateLocator and AutoDateFormatter to have matplotlib do the location and formatting for you:
locator = mdates.AutoDateLocator()
ax.xaxis.set_major_locator(locator)
ax.xaxis.set_major_formatter(mdates.AutoDateFormatter(locator))
Here is a more modern solution for matplotlib version 3.5.3.
Also, it explicitly specifies the min/max date instead of relying on min/max values derived from the data.
import random
from datetime import datetime, timedelta
import matplotlib.pyplot as plt
days = 365*3
start_date = datetime.now()
random_dates = [
    start_date + timedelta(days=int(random.random()*days))
    for _ in range(100)
]
end_date = start_date + timedelta(days=days)
fig, ax = plt.subplots(figsize=(5,3))
n, bins, patches = ax.hist(random_dates, bins=52, range=(start_date, end_date))
fig.autofmt_xdate()
plt.show()
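Since the question asks for the cumulative count over time, here is a small sketch that combines explicit yearly bin edges with hist's cumulative=True option. The dates are the ones listed in the question; the choice of one bin per calendar year (2011 through 2014) is my assumption.

```python
import datetime as dt

import matplotlib.dates as mdates
import matplotlib.pyplot as plt

# The sample dates listed in the question.
dates = (
    [dt.datetime(2011, 12, 13)] * 2
    + [dt.datetime(2013, 11, 1)] * 2
    + [dt.datetime(2013, 6, 4)] * 2
    + [dt.datetime(2014, 1, 1)]
)
values = mdates.date2num(dates)

# Bin edges at 1 January of each year: 5 edges give 4 yearly bins.
# These also fix the start and end of the histogram range.
edges = mdates.date2num([dt.datetime(year, 1, 1) for year in range(2011, 2016)])

fig, ax = plt.subplots()
# cumulative=True makes each bar the running total up to that bin.
counts, _, _ = ax.hist(values, bins=edges, cumulative=True, color="lightblue")

ax.xaxis.set_major_locator(mdates.YearLocator())
ax.xaxis.set_major_formatter(mdates.DateFormatter("%Y"))
```

Changing the step in the edges list (for example one edge per month) changes the number of bins without touching the tick spacing, so the two can be chosen independently.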
I wrote a simple script below to generate a graph with matplotlib. I would like to increase the x-tick frequency from monthly to weekly and rotate the labels. I'm not sure where to start with the x-axis frequency. My rotation line yields an error: TypeError: set_xticks() got an unexpected keyword argument 'rotation'. For the rotation, I'd prefer not to use plt.xticks(rotation=70) as I may eventually build in multiple subplots, some of which should have a rotated axis and some which should not.
import datetime
import matplotlib
import matplotlib.pyplot as plt
from datetime import date, datetime, timedelta
def date_increments(start, end, delta):
    curr = start
    while curr <= end:
        yield curr
        curr += delta
x_values = [[res] for res in date_increments(date(2014, 1, 1), date(2014, 12, 31), timedelta(days=1))]
print len(x_values)
y_values = [x**2 for x in range(len(x_values))]
print len(y_values)
fig = plt.figure()
ax = fig.add_subplot(111)
ax.plot(x_values, y_values)
ax.set_xticks(rotation=70)
plt.show()
Have a look at matplotlib.dates, particularly at this example.
Tick frequency
You will probably want to do something like this:
from matplotlib.dates import DateFormatter, DayLocator, MonthLocator
days = DayLocator()
months = MonthLocator()
months_f = DateFormatter('%m')
ax.xaxis.set_major_locator(months)
ax.xaxis.set_minor_locator(days)
ax.xaxis.set_major_formatter(months_f)
ax.xaxis_date()
This will plot days as minor ticks and months as major ticks, labelled with the month number.
Rotation of the labels
You can use plt.setp() to change axes individually:
plt.setp(ax.get_xticklabels(), rotation=70, horizontalalignment='right')
Hope this helps.
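For the weekly ticks specifically, a sketch along the same lines could use WeekdayLocator. Placing the ticks on Mondays and using the '%d-%b' label format are my choices for illustration, not something from the question. Rotation is set per axes with tick_params, so other subplots stay unaffected.

```python
from datetime import date, timedelta

import matplotlib.dates as mdates
import matplotlib.pyplot as plt

start, end = date(2014, 1, 1), date(2014, 12, 31)

# A flat list of dates (one per day); each date is not wrapped in its own list.
x_values = [start + timedelta(days=i) for i in range((end - start).days + 1)]
y_values = [i ** 2 for i in range(len(x_values))]

fig, ax = plt.subplots()
ax.plot(x_values, y_values)

# One major tick per week, placed on Mondays.
ax.xaxis.set_major_locator(mdates.WeekdayLocator(byweekday=mdates.MO))
ax.xaxis.set_major_formatter(mdates.DateFormatter("%d-%b"))

# Rotate the labels on this axes only.
ax.tick_params(axis="x", labelrotation=70)
```

Call plt.show() to display it. With a full year of weekly ticks the labels get crowded, so WeekdayLocator(byweekday=mdates.MO, interval=2) is worth trying if they overlap.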