Frequency and rotation of x-axis labels in Matplotlib - python

I wrote a simple script below to generate a graph with matplotlib. I would like to increase the x-tick frequency from monthly to weekly and rotate the labels. I'm not sure where to start with the x-axis frequency. My rotation line yields an error: TypeError: set_xticks() got an unexpected keyword argument 'rotation'. For the rotation, I'd prefer not to use plt.xticks(rotation=70) as I may eventually build in multiple subplots, some of which should have a rotated axis and some which should not.
import datetime
import matplotlib
import matplotlib.pyplot as plt
from datetime import date, datetime, timedelta
def date_increments(start, end, delta):
    curr = start
    while curr <= end:
        yield curr
        curr += delta
x_values = [[res] for res in date_increments(date(2014, 1, 1), date(2014, 12, 31), timedelta(days=1))]
print(len(x_values))
y_values = [x**2 for x in range(len(x_values))]
print(len(y_values))
fig = plt.figure()
ax = fig.add_subplot(111)
ax.plot(x_values, y_values)
ax.set_xticks(rotation=70)
plt.show()

Have a look at matplotlib.dates, particularly at this example.
Tick frequency
You will probably want to do something like this:
from matplotlib.dates import DateFormatter, DayLocator, MonthLocator
days = DayLocator()
months = MonthLocator()
months_f = DateFormatter('%m')
ax.xaxis.set_major_locator(months)
ax.xaxis.set_minor_locator(days)
ax.xaxis.set_major_formatter(months_f)
ax.xaxis_date()
This will plot days as minor ticks and months as major ticks, labelled with the month number.
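Since the question asks for weekly rather than monthly ticks, the same locator/formatter pattern can be pointed at weeks. The following is only a minimal sketch: placing the ticks on Mondays and using the '%d %b' label format are illustrative assumptions, not part of the answer. WeekdayLocator and MO are provided by matplotlib.dates.

```python
import datetime as dt

import matplotlib
matplotlib.use("Agg")  # non-interactive backend; drop this line for interactive use
import matplotlib.pyplot as plt
from matplotlib.dates import MO, DateFormatter, WeekdayLocator, num2date

# One value per day of 2014, as in the question
x_values = [dt.date(2014, 1, 1) + dt.timedelta(days=i) for i in range(365)]
y_values = [i ** 2 for i in range(len(x_values))]

fig, ax = plt.subplots(figsize=(14, 4))
ax.plot(x_values, y_values)

# Assumption: one major tick per week, placed on Mondays
ax.xaxis.set_major_locator(WeekdayLocator(byweekday=MO))
ax.xaxis.set_major_formatter(DateFormatter('%d %b'))
ax.tick_params(axis='x', labelrotation=70)  # rotates labels on this axes only

# Inspect where the ticks landed
tick_dates = [num2date(t) for t in ax.get_xticks()]
```

Changing byweekday, or passing interval=2 to WeekdayLocator, should give other weekly spacings.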
Rotation of the labels
You can use plt.setp() to change axes individually:
plt.setp(ax.get_xticklabels(), rotation=70, horizontalalignment='right')
Hope this helps.
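To illustrate the per-axes behaviour with more than one subplot, here is a small sketch with two stacked subplots where only the bottom one gets rotated labels. The names ax_top and ax_bottom and the plotted numbers are made up for the example.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; drop this line for interactive use
import matplotlib.pyplot as plt

fig, (ax_top, ax_bottom) = plt.subplots(nrows=2, ncols=1)
ax_top.plot([1, 2, 3], [1, 4, 9])
ax_bottom.plot([1, 2, 3], [9, 4, 1])

# Rotate only the bottom subplot's labels; the top subplot keeps the default
plt.setp(ax_bottom.get_xticklabels(), rotation=70, horizontalalignment='right')

# Collect the rotation angles of each subplot's labels to compare them
top_rotations = {label.get_rotation() for label in ax_top.get_xticklabels()}
bottom_rotations = {label.get_rotation() for label in ax_bottom.get_xticklabels()}
```

An alternative that should behave the same way is ax_bottom.tick_params(axis='x', labelrotation=70), which also acts on a single axes.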

Related

How can I list sequentially the x and y axis on chart?

I have a dataframe that I want to plot. When I run my code, the values on the x and y axes are not in sequential order. How can I fix this? I have attached two images: the first is my current graph, the second is the graph I want.
This is my code:
from datetime import timedelta, date
import datetime as dt #date analyse
import matplotlib.pyplot as plt
import pandas as pd #read file
def daterange(date1, date2):
    for n in range(int((date2 - date1).days)+1):
        yield date1 + timedelta(n)
tarih="01-01-2021"
tarih2="20-06-2021"
start=dt.datetime.strptime(tarih, '%d-%m-%Y')
end=dt.datetime.strptime(tarih2, '%d-%m-%Y')
fg=pd.DataFrame()
liste=[]
tarih=[]
for dt in daterange(start, end):
    dates=dt.strftime("%d-%m-%Y")
    with open("fng_value.txt", "r") as filestream:
        for line in filestream:
            date = line.split(",")[0]
            if dates == date:
                fng_value=line.split(",")[1]
                liste.append(fng_value)
                tarih.append(dates)
fg['date']=tarih
fg['fg_value']=liste
print(fg.head())
plt.subplots(figsize=(20, 10))
plt.plot(fg.date,fg.fg_value)
plt.title('Fear&Greed Index')
plt.ylabel('Fear&Greed Data')
plt.xlabel('Date')
plt.show()
This is my graph:
This is the graph that I want:
Line plot with datetime x axis
So it appears this code is opening a text file, adding values to either a list of dates or a list of values, and then making a pandas dataframe with those lists. Finally, it plots the date vs values with a line plot.
A few changes should help your graph look a lot better. A lot of this is very basic, and I'd recommend reviewing some matplotlib tutorials. The Real Python tutorial is a good starting place in my opinion.
Fix the y axis limit:
plt.ylim(0, 100)
Use an x-axis locator from mdates to get better-spaced x label locations. The right choice depends on your time range; I made some data and used the day locator.
import matplotlib.dates as mdates
plt.gca().xaxis.set_major_locator(mdates.DayLocator())
Use a scatter plot to add data points as on the linked graph
plt.scatter(x, y ... )
Add a grid
plt.grid(axis='both', color='gray', alpha=0.5)
Rotate the x tick labels
plt.tick_params(axis='x', rotation=45)
I simulated some data and plotted it to look like the plot you linked. This may be helpful for you to work from.
import matplotlib.pyplot as plt
import pandas as pd
import numpy as np
import matplotlib.dates as mdates
fig, ax = plt.subplots(figsize=(15,5))
x = pd.date_range(start='june 26th 2021', end='july 25th 2021')
rng = np.random.default_rng()
y = rng.integers(low=15, high=25, size=len(x))
ax.plot(x, y, color='gray', linewidth=2)
ax.scatter(x, y, color='gray')
ax.set_ylim(0,100)
ax.grid(axis='both', color='gray', alpha=0.5)
ax.set_yticks(np.arange(0,101, 10))
ax.xaxis.set_major_locator(mdates.DayLocator())
ax.tick_params(axis='x', rotation=45)
ax.set_xlim(min(x), max(x))
plt.show()
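One likely cause of the non-sequential axes in the question is that both columns are built from strings read out of the text file, and matplotlib treats strings as categories, placing them in order of appearance rather than by value. A minimal sketch of converting both columns before plotting follows. The inline sample data is a hypothetical stand-in for fng_value.txt, assumed to hold "dd-mm-YYYY,value" lines.

```python
import io

import pandas as pd

# Hypothetical stand-in for fng_value.txt (made-up values, deliberately unsorted)
raw = io.StringIO("03-01-2021,94\n01-01-2021,91\n02-01-2021,9\n")

# Read everything as text first, which mirrors what line.split(",") produces
fg = pd.read_csv(raw, header=None, names=["date", "fg_value"], dtype=str)

# Convert the text columns to real dates and numbers
fg["date"] = pd.to_datetime(fg["date"], format="%d-%m-%Y")
fg["fg_value"] = pd.to_numeric(fg["fg_value"])

# Sort chronologically so the line is drawn from left to right
fg = fg.sort_values("date").reset_index(drop=True)
```

With typed columns, plt.plot(fg["date"], fg["fg_value"]) should produce ordered date and value axes, and the locator and limit settings from the answer can then be applied on top.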

Plotting a Time Schedule with Business Hour

I'm implementing a time schedule restricted to business hours (8am to 5pm) using pd.offsets.CustomBusinessHour, and I'm trying to plot it as a Gantt chart (horizontal bar chart) with matplotlib.
I want to remove the stretch of the x-axis that falls outside business hours, since it is unnecessary. At the moment the axis shows a gap between 5pm on one day and 8am on the next.
I looked through the parameters of the business hour offset and at tick settings such as 'interval' and 'spacing', but I couldn't find a suitable solution.
I also tried other plotting approaches using the matplotlib.dates module, without success.
This is my Python code:
import pandas as pd
from datetime import datetime, date, timedelta, time
import matplotlib.pyplot as plt
import matplotlib.ticker as ticker
import matplotlib.dates as mdates
num = 6
start_time = datetime(2021, 7, 7, 13, 5, 16, 268902)
int_to_time = pd.offsets.CustomBusinessHour(start="08:00", end="17:00", weekmask="1111111")
duration = num * int_to_time
horizon = [start_time + (i+1) * int_to_time for i in range(num+1)]
horizon = [i.replace(microsecond=0) for i in horizon]
fig, gnt = plt.subplots(figsize=(12,3))
gnt.barh(y=1, width=duration, left=start_time, color="cyan", height=0.2)
gnt.set_xticks(horizon)
gnt.set_xticklabels(horizon, rotation=90)
gnt.tick_params(bottom=False, labelbottom=False, top=True, labeltop=True)
plt.show()
You are trying to develop a Gantt chart and are having issues with spacing of the x axis labels. Your x-axis is representing Timestamps and you want them evenly spaced out (hourly).
Axis tick locations are determined by tick locators and the labels are determined by tick formatters. The default tick locator for datetimes is AutoDateLocator, which for a time range like this one is likely delegating to HourLocator. The ticks it returns follow a 24-hour datetime axis, so the hours outside business hours still take up space.
One solution to your problem is to simply use LinearLocator or FixedLocator along with a FixedFormatter. This puts you in very direct control over the tick locations and labels.
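A variation on this idea, sketched here under stated assumptions, is to plot against the business-hour count instead of clock time, so that the overnight gap takes up no space at all, and to use FixedLocator and FixedFormatter to label each position with its timestamp. The names business_hour, positions and labels, and the "%m-%d %H:%M" label format, are illustrative choices rather than anything from the question.

```python
from datetime import datetime

import matplotlib
matplotlib.use("Agg")  # non-interactive backend; drop this line for interactive use
import matplotlib.pyplot as plt
import matplotlib.ticker as ticker
import pandas as pd

num = 6
start_time = pd.Timestamp(datetime(2021, 7, 7, 13, 5, 16))
business_hour = pd.offsets.CustomBusinessHour(start="08:00", end="17:00", weekmask="1111111")

# Timestamps 0, 1, ..., num business hours after the start
horizon = [start_time] + [start_time + i * business_hour for i in range(1, num + 1)]

# Assumption: the x-axis counts business hours, so tick i sits at x = i
positions = list(range(len(horizon)))
labels = [ts.strftime("%m-%d %H:%M") for ts in horizon]

fig, gnt = plt.subplots(figsize=(12, 3))
gnt.barh(y=1, width=num, left=0, color="cyan", height=0.2)
gnt.xaxis.set_major_locator(ticker.FixedLocator(positions))
gnt.xaxis.set_major_formatter(ticker.FixedFormatter(labels))
gnt.tick_params(axis="x", labelrotation=90)
```

The trade-off is that the axis no longer holds real datetimes, so any further bars need their start and width expressed in business hours as well.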
I must add that there are many tutorials and posts about how to make a Gantt chart with matplotlib or plotly that are easily searchable. I recommend reviewing some of those as you develop your plots.
The solution is implemented below in the context of your code.
import pandas as pd
from datetime import datetime, date, timedelta, time
import matplotlib.pyplot as plt
import matplotlib.ticker as ticker
import matplotlib.dates as mdates
num = 6
start_time = datetime(2021, 7, 7, 13, 5, 16, 268902)
int_to_time = pd.offsets.CustomBusinessHour(start="08:00", end="17:00", weekmask="1111111")
duration = num * int_to_time
horizon = [start_time + (i+1) * int_to_time for i in range(num+1)]
horizon = [i.replace(microsecond=0) for i in horizon]
fig, gnt = plt.subplots(figsize=(12,3))
gnt.barh(y=1, width=duration, left=start_time, color="cyan", height=0.2)
gnt.xaxis.set_major_locator(ticker.LinearLocator(7))
gnt.xaxis.set_major_formatter(ticker.FixedFormatter(horizon))
gnt.tick_params(bottom=False, labelbottom=False, top=True, labeltop=True, rotation=90)
plt.show()

How to set axis date ticks given a starting date in matplotlib

When making a plot with two vectors, for example:
plt.plot([1,2,3],[2,4,6])
I would like to change my x-axis to date ticks with a given starting date, for example "2019-2-28", so that my x-axis ticks are
["2019-2-28","2019-3-1","2019-3-2"]
import matplotlib.pyplot as plt
import pandas as pd
x = [1, 2, 3]
y = [2, 4, 6]
start_date = pd.to_datetime('2019-2-28')
# one label per x value, starting at start_date
labels = pd.date_range(start=start_date, periods=len(x))
plt.figure()
plt.plot(x, y)
plt.xticks(x, [str(i.date()) for i in labels])
plt.show()
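Another option, shown here only as a sketch, is to use real dates as the x values instead of relabelling the integer positions, so that matplotlib's date locators and formatters handle the ticks. The "%Y-%m-%d" label format and the use of DayLocator are assumptions made for this example.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; drop this line for interactive use
import matplotlib.dates as mdates
import matplotlib.pyplot as plt
import pandas as pd

y = [2, 4, 6]

# One date per y value, starting at the requested date
dates = pd.date_range(start="2019-02-28", periods=len(y), freq="D")

fig, ax = plt.subplots()
ax.plot(dates, y)

# One tick per day, formatted as year-month-day
ax.xaxis.set_major_locator(mdates.DayLocator())
ax.xaxis.set_major_formatter(mdates.DateFormatter("%Y-%m-%d"))
fig.autofmt_xdate()  # slant the date labels so they do not overlap
```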

how to highlight weekends for time series line plot in python

I am trying to analyse a bike share dataset. Part of the analysis involves showing weekend demand in a date-wise plot.
The last 5 rows of my pandas dataframe look like this.
Here is my code for the date vs. total rides plot.
import seaborn as sns
sns.set_style("darkgrid")
plt.plot(d17_day_count)
plt.show()
I want to highlight weekends in the plot so that it looks similar to this plot.
I am using Python with matplotlib and seaborn library.
You can easily highlight areas by using axvspan. To find the areas to highlight, you can run through the index of your dataframe and search for the weekend days. I've also added an example for highlighting 'occupied hours' during a working week (hopefully that doesn't confuse things).
I've created dummy data for a dataframe based on days and another one for hours.
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
# dummy data (Days)
dates_d = pd.date_range('2017-01-01', '2017-02-01', freq='D')
df = pd.DataFrame(np.random.randint(1, 20, (dates_d.shape[0], 1)))
df.index = dates_d
# dummy data (Hours)
dates_h = pd.date_range('2017-01-01', '2017-02-01', freq='H')
df_h = pd.DataFrame(np.random.randint(1, 20, (dates_h.shape[0], 1)))
df_h.index = dates_h
#two graphs
fig, axes = plt.subplots(nrows=2, ncols=1, sharex=True)
#plot lines
dfs = [df, df_h]
for i, df in enumerate(dfs):
    for v in df.columns.tolist():
        axes[i].plot(df[v], label=v, color='black', alpha=.5)
def find_weekend_indices(datetime_array):
    indices = []
    for i in range(len(datetime_array)):
        if datetime_array[i].weekday() >= 5:
            indices.append(i)
    return indices
def find_occupied_hours(datetime_array):
    indices = []
    for i in range(len(datetime_array)):
        if datetime_array[i].weekday() < 5:
            if datetime_array[i].hour >= 7 and datetime_array[i].hour <= 19:
                indices.append(i)
    return indices
def highlight_datetimes(indices, ax):
    i = 0
    while i < len(indices)-1:
        ax.axvspan(df.index[indices[i]], df.index[indices[i] + 1], facecolor='green', edgecolor='none', alpha=.5)
        i += 1
#find to be highlighted areas, see functions
#note: the plotting loop above rebinds df, so df.index is the hourly index from here on
weekend_indices = find_weekend_indices(df.index)
occupied_indices = find_occupied_hours(df_h.index)
#highlight areas
highlight_datetimes(weekend_indices, axes[0])
highlight_datetimes(occupied_indices, axes[1])
#formatting..
axes[0].xaxis.grid(True, which='major', color='black', linestyle='--', alpha=1) #add xaxis gridlines
axes[1].xaxis.grid(True, which='major', color='black', linestyle='--', alpha=1) #add xaxis gridlines
axes[0].set_xlim(min(dates_d), max(dates_d))
axes[0].set_title('Weekend days', fontsize=10)
axes[1].set_title('Occupied hours', fontsize=10)
plt.show()
I tried using the code in the accepted answer but the way the indices are used, the last weekend in the time series does not get highlighted entirely, despite what the image currently shown suggests (this is noticeable mainly with a frequency of 6 hours or more). Also, it does not work if the frequency of the data is higher than daily. This is why I share here a solution that uses the x-axis units so that weekends (or any other recurring time period) can be highlighted without any problem related to the index.
This solution takes only 6 lines of code and it works with any frequency. In the example below, it highlights full weekend days which makes it more efficient than the accepted answer where small frequencies (e.g. 30 minutes) will produce many polygons to cover the whole weekend.
The x-axis limits are used to compute the range of time covered by the plot in terms of days, which is the unit used for matplotlib dates. Then a weekends mask is computed and passed to the where argument of the fill_between plotting function. The masks are processed as right-exclusive so in this case, they must contain Mondays for the highlights to be drawn up to Mondays 00:00. Because plotting these highlights can alter the x-axis limits when weekends occur near the limits, the x-axis limits are set back to the original values after plotting.
Note that contrary to axvspan, the fill_between function needs the y1 and y2 arguments. For some reason, using the default y-axis limits leaves a small gap between the plot frame and the tops and bottoms of the weekend highlights. This issue is solved by running ax.set_ylim(*ax.get_ylim()) just after creating the plot.
import numpy as np # v 1.19.2
import pandas as pd # v 1.1.3
import matplotlib.pyplot as plt # v 3.3.2
import matplotlib.dates as mdates
# Create sample dataset
rng = np.random.default_rng(seed=1234) # random number generator
dti = pd.date_range('2017-01-01', '2017-05-15', freq='D')
counts = 5000 + np.cumsum(rng.integers(-1000, 1000, size=dti.size))
df = pd.DataFrame(dict(Counts=counts), index=dti)
# Draw pandas plot: x_compat=True converts the pandas x-axis units to matplotlib
# date units (not strictly necessary when using a daily frequency like here)
ax = df.plot(x_compat=True, figsize=(10, 5), legend=None, ylabel='Counts')
ax.set_ylim(*ax.get_ylim()) # reset y limits to display highlights without gaps
# Highlight weekends based on the x-axis units
xmin, xmax = ax.get_xlim()
days = np.arange(np.floor(xmin), np.ceil(xmax)+2)
weekends = [(dt.weekday()>=5)|(dt.weekday()==0) for dt in mdates.num2date(days)]
ax.fill_between(days, *ax.get_ylim(), where=weekends, facecolor='k', alpha=.1)
ax.set_xlim(xmin, xmax) # set limits back to default values
# Create appropriate ticks using matplotlib date tick locators and formatters
ax.xaxis.set_major_locator(mdates.MonthLocator())
ax.xaxis.set_minor_locator(mdates.MonthLocator(bymonthday=np.arange(5, 31, step=7)))
ax.xaxis.set_major_formatter(mdates.DateFormatter('\n%b'))
ax.xaxis.set_minor_formatter(mdates.DateFormatter('%d'))
# Additional formatting
ax.figure.autofmt_xdate(rotation=0, ha='center')
title = 'Daily count of trips with weekends highlighted from SAT 00:00 to MON 00:00'
ax.set_title(title, pad=20, fontsize=14);
As you can see, the weekends are always highlighted to the full extent, regardless of where the data starts and ends.
You can find more examples of this solution in the answers I have posted here and here.
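A variation on the same idea is to compute the Saturday 00:00 to Monday 00:00 spans directly with the standard library and draw one axvspan per weekend. This is a sketch of a different technique from the fill_between approach described here; the helper name weekend_spans is hypothetical.

```python
import datetime as dt


def weekend_spans(start, end):
    """Return (span_start, span_end) pairs covering each weekend inside [start, end].

    A weekend runs from Saturday 00:00 to Monday 00:00; spans that cross
    the range boundaries are clipped to the range.
    """
    spans = []
    # Step back to the Saturday on or before `start` (weekday(): Mon=0 ... Sat=5)
    days_since_saturday = (start.weekday() - 5) % 7
    saturday = dt.datetime.combine(start.date(), dt.time()) - dt.timedelta(days=days_since_saturday)
    while saturday < end:
        monday = saturday + dt.timedelta(days=2)
        if monday > start:
            spans.append((max(saturday, start), min(monday, end)))
        saturday += dt.timedelta(days=7)
    return spans


# Example range: 2017-01-01 is a Sunday, 2017-01-15 is the Sunday two weeks later
spans = weekend_spans(dt.datetime(2017, 1, 1), dt.datetime(2017, 1, 15))
```

Each pair can then be passed to ax.axvspan(span_start, span_end, facecolor='k', alpha=.1) on an axes whose x-axis holds dates, which draws one polygon per weekend regardless of the data frequency.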
I have another suggestion to make in this regard, which takes inspiration from previous posts by other contributors. The code is as follows:
import datetime
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
rng = np.random.default_rng(seed=42) # random number generator
dti = pd.date_range('2021-08-01', '2021-08-31', freq='D')
counts = 5000 + np.cumsum(rng.integers(-1000, 1000, size=dti.size))
df = pd.DataFrame(dict(Counts=counts), index=dti)
weekends = [d for d in df.index if d.isoweekday() in [6,7]]
weekend_list = []
for weekendday in weekends:
    d1 = weekendday
    d2 = weekendday + datetime.timedelta(days=1)
    weekend_list.append((d1, d2))
weekend_df = pd.DataFrame(weekend_list)
sns.set()
plt.figure(figsize=(15, 10), dpi=100)
df.plot()
plt.legend(bbox_to_anchor=(1.02, 0), loc="lower left", borderaxespad=0)
plt.ylabel("Counts")
plt.xlabel("Date of visit")
plt.xticks(rotation = 0)
plt.title("Daily counts of shop visits with weekends highlighted in green")
ax = plt.gca()
for d in weekend_df.index:
    print(weekend_df[0][d], weekend_df[1][d])
    ax.axvspan(weekend_df[0][d], weekend_df[1][d], facecolor="g", edgecolor="none", alpha=0.5)
ax.relim()
ax.autoscale_view()
plt.savefig("junk.png", dpi=100, bbox_inches='tight', pad_inches=0.2)
The result would be something like the following diagram:

Floating Bar Chart

I'm trying to make a plot where the x-axis is time and the y-axis is a bar chart that will have the bars covering a certain time period like this:
          ______________
          |_____________|
     _____________________
     |___________________|
----------------------------------------------------->
                       time
I have 2 lists of datetime values for the start and end of these times I'd like to have covered. So far I have
x = np.array([dt.datetime(2010, 1, 8, i,0) for i in range(24)])
to cover a 24-hour period. My question is then how do I set and plot my y-values to look like this?
You could use plt.barh:
import datetime as DT
import numpy as np
import matplotlib.pyplot as plt
import matplotlib.dates as mdates
start = [DT.datetime(2000,1,1)+DT.timedelta(days=i) for i in (2,0,3)]
end = [s+DT.timedelta(days=i) for s,i in zip(start, [15,7,10])]
start = mdates.date2num(start)
end = mdates.date2num(end)
yval = [1,2,3]
width = end-start
fig, ax = plt.subplots()
ax.barh(y=yval, width=width, left=start, height=0.3)
xfmt = mdates.DateFormatter('%Y-%m-%d')
ax.xaxis.set_major_formatter(xfmt)
# autorotate the dates
fig.autofmt_xdate()
plt.show()
yields
