I'm using matplotlib.pyplot (plt) to plot a graph of temperature against time. I am wanting the xticks to be at 12am and 12pm only, matplot auto picks where the xticks go.
How can I pick exactly 12am and 12pm for each date as the data I'm using doesn't have those data points? Basically I'm wanting an xtick on a point between two data points / on the line between two data points.
Below is a snippet of my data and my function.
Temp UNIX Time Time
5.04 1490562000 2017-03-26 22:00:00
3.21 1490572800 2017-03-27 01:00:00
2.15 1490583600 2017-03-27 04:00:00
1.66 1490594400 2017-03-27 07:00:00
6.92 1490605200 2017-03-27 10:00:00
11.73 1490616000 2017-03-27 13:00:00
13.77 1490626800 2017-03-27 16:00:00
def ploting_graph(self):
ax=plt.gca()
xfmt = md.DateFormatter('%d/%m/%y %p')
ax.xaxis.set_major_formatter(locator)
plt.plot(self.df['Time'],self.df['Temp'], 'k--^')
plt.xticks(rotation=10)
plt.grid(axis='both',color='r')
plt.ylabel("Temp (DegC)")
plt.xlim(min(self.df['Time']),max(self.df['Time']))
plt.ylim(min(self.df['Temp'])-1,max(self.df['Temp'])+1)
plt.title("Forecast Temperature {}".format(self.location))
plt.savefig('{}_day_forecast_{}.png'.format(self.num_of_days,self.location),dpi=300)
As hopefully you can see I'm looking for an xtick on the 2017-03-27 00:00:00 and 12:00:00.
Any help would be greatly appreciated and even a push in the right direction would be fantastic! Thank you very much in advanced!
John.
You need a locator and a formatter. The locator determines that you only want ticks every 12 hour while the formatter determines how the datetime should look like.
import pandas as pd
import matplotlib.pyplot as plt
import matplotlib.dates
dates = ["2017-03-26 22:00:00","2017-03-27 01:00:00","2017-03-27 04:00:00","2017-03-27 07:00:00",
"2017-03-27 10:00:00","2017-03-27 13:00:00","2017-03-27 16:00:00"]
temps = [5.04, 3.21, 2.15,1.66, 6.92, 11.73, 13.77 ]
df = pd.DataFrame({"Time":dates, "Temp":temps})
df["Time"] = pd.to_datetime(df["Time"], format="%Y-%m-%d %H:%M:%S")
fig, ax = plt.subplots()
xfmt = matplotlib.dates.DateFormatter('%d/%m/%y 12%p')
ax.xaxis.set_major_formatter(xfmt)
locator = matplotlib.dates.HourLocator(byhour=[0,12])
ax.xaxis.set_major_locator(locator)
plt.plot(df['Time'],df['Temp'], 'k--^')
plt.xticks(rotation=10)
plt.grid(axis='both',color='r')
plt.ylabel("Temp (DegC)")
plt.xlim(min(df['Time']),max(df['Time']))
plt.ylim(min(df['Temp'])-1,max(df['Temp'])+1)
plt.title("Forecast Temperature")
plt.show()
Try playing with the xticks function on plot. Below is a quick example I tried.
import matplotlib.pyplot as plt
x = [2,3,4,5,6,14,16,18,24,22,20]
y = [1, 4, 9, 6,5,6,7,8,9,5,7]
labels = ['12Am', '12Pm']
plt.plot(x, y, 'ro')
'12,24 are location on x-axis for labels'
plt.xticks((12,24),('12Am', '12Pm'))
plt.margins(0.2)
plt.show()
Related
I have a Kaggle dataset (link).
I read the dataset, and I set the Date to be index column:
museum_data = pd.read_csv("museum_visitors.csv", index_col = "Date", parse_dates = True)
Then, the museum_data be like:
Date
Avila Adobe
Firehouse Museum
Chinese American Museum
America Tropical Interpretive Center
2014-01-01
24778
4486
1581
6602
2014-02-01
18976
4172
1785
5029
...
...
...
...
...
2018-10-01
19280
4622
2364
3775
2018-11-01
17163
4082
2385
4562
Here is the code I use to plot the lineplot in seaborn:
plt.figure(figsize = (20,8))
sns.lineplot(data = museum_data)
plt.show()
And, this is what the result looks like:
What I want to know is that, how I can show multiple (not all, for example, first month of each season) months per year in x-axis.
Thank you all for your time, in advance.
You can use MonthLocator and perhaps ConciseDateFormatter to add minor ticks with a few months showing, something like the following:
import matplotlib.dates as mdates
...
fig, ax = plt.subplots(figsize = (20,8))
sns.lineplot(data = museum_data, ax=ax)
locator = mdates.MonthLocator(bymonth=[4,7,10])
ax.xaxis.set_minor_locator(locator)
ax.xaxis.set_minor_formatter(mdates.ConciseDateFormatter(locator))
Output:
Edit (closer): you can add the following to show January as well:
ax.xaxis.set_major_locator(mdates.YearLocator())
ax.xaxis.set_major_formatter(mdates.DateFormatter('%b\n%Y'))
Output:
Edit 2 (there's probably a better way but I'm rusty):
length = plt.rcParams["xtick.minor.size"]
pad = plt.rcParams['xtick.minor.pad']
ax.tick_params('x', length=length, pad=pad)
I am trying to draw a stock market graph
timeseries vs closing price and timeseries vs volume.
Somehow the x-axis shows the time in 1970
the following is the graph and the code
The code is:
import pandas as pd
import matplotlib.pyplot as plt
import matplotlib.dates as mdates
pd_data = pd.DataFrame(data, columns=['id', 'symbol', 'volume', 'high', 'low', 'open', 'datetime','close','datetime_utc','created_at'])
pd_data['DOB'] = pd.to_datetime(pd_data['datetime_utc']).dt.strftime('%Y-%m-%d')
pd_data.set_index('DOB')
print(pd_data)
print(pd_data.dtypes)
ax=pd_data.plot(x='DOB',y='close',kind = 'line')
ax.set_ylabel("price")
#ax.pd_data['volume'].plot(secondary_y=True, kind='bar')
ax1=pd_data.plot(y='volume',secondary_y=True, ax=ax,kind='bar')
ax1.set_ylabel('Volumne')
# Choose your xtick format string
date_fmt = '%d-%m-%y'
date_formatter = mdates.DateFormatter(date_fmt)
ax1.xaxis.set_major_formatter(date_formatter)
# set monthly locator
ax1.xaxis.set_major_locator(mdates.MonthLocator(interval=1))
# set font and rotation for date tick labels
plt.gcf().autofmt_xdate()
plt.show()
Also tried the two graphs independently without ax=ax
ax=pd_data.plot(x='DOB',y='close',kind = 'line')
ax.set_ylabel("price")
ax1=pd_data.plot(y='volume',secondary_y=True,kind='bar')
ax1.set_ylabel('Volumne')
then price graph shows years properly whereas volumen graph shows 1970
And if i swap them
ax1=pd_data.plot(y='volume',secondary_y=True,kind='bar')
ax1.set_ylabel('Volumne')
ax=pd_data.plot(x='DOB',y='close',kind = 'line')
ax.set_ylabel("price")
Now the volume graph shows years properly whereas the price graph shows the years as 1970
I tried removing secondary_y and also changing bar to line. BUt no luck
Somehow pandas Data after first graph is changing the year.
I do not advise plotting a bar plot with such a numerous amount of bars.
This answer explains why there is an issue with the xtick labels, and how to resolve the issue.
Plotting with pandas.DataFrame.plot works without issue with .set_major_locator
Tested in python 3.8.11, pandas 1.3.2, matplotlib 3.4.2
import pandas as pd
import matplotlib.pyplot as plt
import matplotlib.dates as mdates
import pandas_datareader as web # conda install -c anaconda pandas-datareader or pip install pandas-datareader
# download data
df = web.DataReader('amzn', data_source='yahoo', start='2015-02-21', end='2021-04-27')
# plot
ax = df.plot(y='Close', color='magenta', ls='-.', figsize=(10, 6), ylabel='Price ($)')
ax1 = df.plot(y='Volume', secondary_y=True, ax=ax, alpha=0.5, rot=0, lw=0.5)
ax1.set(ylabel='Volume')
# format
date_fmt = '%d-%m-%y'
years = mdates.YearLocator() # every year
yearsFmt = mdates.DateFormatter(date_fmt)
ax.xaxis.set_major_locator(years)
ax.xaxis.set_major_formatter(yearsFmt)
plt.setp(ax.get_xticklabels(), ha="center")
plt.show()
Why are the OP x-tick labels starting from 1970?
Bar plots locations are being 0 indexed (with pandas), and 0 corresponds to 1970
See Pandas bar plot changes date format
Most solutions with bar plots simply reformat the label to the appropriate datetime, however this is cosmetic and will not align the locations between the line plot and bar plot
Solution 2 of this answer shows how to change the tick locators, but is really not worth the extra code, when plt.bar can be used.
print(pd.to_datetime(ax1.get_xticks()))
DatetimeIndex([ '1970-01-01 00:00:00',
'1970-01-01 00:00:00.000000001',
'1970-01-01 00:00:00.000000002',
'1970-01-01 00:00:00.000000003',
...
'1970-01-01 00:00:00.000001552',
'1970-01-01 00:00:00.000001553',
'1970-01-01 00:00:00.000001554',
'1970-01-01 00:00:00.000001555'],
dtype='datetime64[ns]', length=1556, freq=None)
ax = df.plot(y='Close', color='magenta', ls='-.', figsize=(10, 6), ylabel='Price ($)')
print(ax.get_xticks())
ax1 = df.plot(y='Volume', secondary_y=True, ax=ax, kind='bar')
print(ax1.get_xticks())
ax1.set_xlim(0, 18628.)
date_fmt = '%d-%m-%y'
years = mdates.YearLocator() # every year
yearsFmt = mdates.DateFormatter(date_fmt)
ax.xaxis.set_major_locator(years)
ax.xaxis.set_major_formatter(yearsFmt)
[out]:
[16071. 16436. 16801. 17167. 17532. 17897. 18262. 18628.] ← ax tick locations
[ 0 1 2 ... 1553 1554 1555] ← ax1 tick locations
With plt.bar the bar plot locations are indexed based on the datetime
ax = df.plot(y='Close', color='magenta', ls='-.', figsize=(10, 6), ylabel='Price ($)', rot=0)
plt.setp(ax.get_xticklabels(), ha="center")
print(ax.get_xticks())
ax1 = ax.twinx()
ax1.bar(df.index, df.Volume)
print(ax1.get_xticks())
date_fmt = '%d-%m-%y'
years = mdates.YearLocator() # every year
yearsFmt = mdates.DateFormatter(date_fmt)
ax.xaxis.set_major_locator(years)
ax.xaxis.set_major_formatter(yearsFmt)
[out]:
[16071. 16436. 16801. 17167. 17532. 17897. 18262. 18628.]
[16071. 16436. 16801. 17167. 17532. 17897. 18262. 18628.]
sns.barplot(x=df.index, y=df.Volume, ax=ax1) has xtick locations as [ 0 1 2 ... 1553 1554 1555], so the bar plot and line plot did not align.
I could not find the reason for 1970, but rather use matplotlib.pyplot to plot instead of indirectly using pandas and also pass the datatime array instead of pandas
So the following code worked
import matplotlib.pyplot as plt
import matplotlib.dates as mdates
import pandas as pd
import datetime as dt
import numpy as np
pd_data = pd.read_csv("/home/stockdata.csv",sep='\t')
pd_data['DOB'] = pd.to_datetime(pd_data['datetime2']).dt.strftime('%Y-%m-%d')
dates=[dt.datetime.strptime(d,'%Y-%m-%d').date() for d in pd_data['DOB']]
plt.gca().xaxis.set_major_formatter(mdates.DateFormatter('%m/%d/%Y'))
plt.gca().xaxis.set_major_locator(mdates.MonthLocator(interval=2))
plt.bar(dates,pd_data['close'],align='center')
plt.gca().xaxis.set_minor_locator(plt.MultipleLocator(1))
plt.gcf().autofmt_xdate()
plt.show()
I have created a dates array in the datetime format. If i make graph using that then the dates are no more shown as 1970
open high low close volume datetime datetime2
35.12 35.68 34.79 35.58 1432995 1244385200000 2012-6-15 10:30:00
35.69 36.02 35.37 35.78 1754319 1244371600000 2012-6-16 10:30:00
35.69 36.23 35.59 36.23 3685845 1245330800000 2012-6-19 10:30:00
36.11 36.52 36.03 36.32 2635777 1245317200000 2012-6-20 10:30:00
36.54 36.6 35.8 35.9 2886412 1245303600000 2012-6-21 10:30:00
36.03 36.95 36.0 36.09 3696278 1245390000000 2012-6-22 10:30:00
36.5 37.27 36.18 37.11 2732645 1245376400000 2012-6-23 10:30:00
36.98 37.11 36.686 36.83 1948411 1245335600000 2012-6-26 10:30:00
36.67 37.06 36.465 37.05 2557172 1245322000000 2012-6-27 10:30:00
37.06 37.61 36.77 37.52 1780126 1246308400000 2012-6-28 10:30:00
37.47 37.77 37.28 37.7 1352267 1246394800000 2012-6-29 10:30:00
37.72 38.1 37.68 37.76 2194619 1246381200000 2012-6-30 10:30:00
The plot i get is
I have tried unsuccessfully to create a bar plot of a time series dataset. I have tried converting the dates to Pandas Datetime objects, Timestamp Objects, primitive strings, floats, and ints. No matter what I do, I get the following error: TypeError: float() argument must be a string or a number, not 'Timestamp' Here are a few minimal examples that produce the error:
Here, the 'Date' object is of type <class 'pandas._libs.tslibs.timestamps.Timestamp'>, so I know why this doesn't work:
import matplotlib.pylab as plt
import matplotlib.dates as mdates
import seaborn as sns
def main():
path = 'Data/AQ+RX Counts.csv'
df = pd.read_csv(path, parse_dates=['Date'], index_col=['Date'])
weekly_df = df.resample('W').mean().reset_index()
weekly_df['count'] = df['count'].resample('W').sum().reset_index()
sns.barplot(x = 'Date', y='count', data = weekly_df)
plt.show()
main()
I then tried making the dates floats, intending to format them back to dates after, but this still doesnt work:
dates = mdates.datestr2num(weekly_df.Date.astype(str))
weekly_df['n_dates'] = dates
sns.barplot(x = 'n_dates', y='count', data = weekly_df)
plt.show()
I also tried making them integers, to no avail:
dates = mdates.datestr2num(weekly_df.Date.astype(str))
dates = dates.astype(int)
dates = pd.Series(dates)
weekly_df['n_dates'] = dates
sns.barplot(x = 'n_dates', y='count', data = weekly_df)
plt.show()
I've tried many other variations, and all produce the same error. I've even compared it to other code and verified that all the types are identical, and the comparison code works fine. I am at a complete loss of where to go from here.
Dataframe:
Date,WSA,WSV,WDV,WSM,SGT,T2M,T10M,DELTA_T,PBAR,SRAD,RH,PM25,AQI,count
2015-01-01,1.0708333333333335,0.8750000000000001,132.95833333333334,3.4708333333333337,35.39166666666667,30.72916666666667,30.625,-0.11666666666666667,738.8249999999998,72.66666666666667,99.75416666666666,24.80833333333333,73.30793131580873,0.0
2015-01-02,1.1086956521739129,0.9391304347826086,148.47826086956522,3.734782608695653,32.46521739130434,34.39130434782609,34.27826086956521,-0.11739130434782602,738.3478260869565,61.39130434782609,100.01304347826084,23.500000000000004,64.15072523318715,4.0
2015-01-03,1.0173913043478258,0.7173913043478259,168.04347826086956,3.773913043478261,42.71739130434783,36.24782608695652,36.160869565217396,-0.09565217391304348,739.4434782608695,49.60869565217392,100.76956521739132,20.460869565217394,55.65271063058384,0.0
2015-01-04,1.0,0.6,159.95833333333334,3.85,49.15,38.8875,38.66666666666666,-0.225,741.5000000000001,31.54166666666667,101.47916666666669,13.012499999999998,46.835258118800965,0.0
2015-01-05,1.0333333333333334,0.4416666666666667,137.0,4.0,57.56666666666666,42.99583333333333,42.94583333333333,-0.04999999999999995,742.5333333333333,44.58333333333334,101.00416666666666,16.654166666666665,52.420271225456766,4.0
2015-01-06,0.7818181818181817,0.5590909090909091,114.72727272727272,3.654545454545455,42.86818181818182,40.7409090909091,41.09545454545454,0.36818181818181817,740.9045454545453,48.27272727272727,100.57727272727274,21.954545454545453,67.31833852518514,6.0
2015-01-07,0.9739130434782608,0.8304347826086954,110.82608695652172,3.956521739130436,30.817391304347833,40.36521739130435,40.59565217391304,0.22173913043478266,739.8652173913043,60.04347826086956,100.19565217391305,24.456521739130434,72.3472505968891,6.0
2015-01-08,0.9833333333333336,0.8250000000000001,156.5,4.208333333333333,32.67083333333333,41.520833333333336,41.36666666666667,-0.12916666666666668,736.35,69.58333333333333,99.95833333333331,22.274999999999995,65.77072473472253,10.0
2015-01-09,0.9583333333333331,0.7291666666666669,133.70833333333334,3.3791666666666664,39.645833333333336,42.279166666666654,42.15833333333333,-0.11666666666666665,735.2041666666665,60.41666666666666,100.04166666666669,19.370833333333334,59.08512936837911,10.0
2015-01-10,0.9666666666666668,0.7583333333333336,164.5,3.675,37.34583333333333,42.96250000000001,42.775,-0.2,734.2875,41.5,100.12083333333337,14.658333333333335,49.31465266245389,0.0
The transformation in the function is converting 'count' from floats to a datetime dtype.
Using the posted sample data
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
path = 'data/test.csv'
df = pd.read_csv(path, parse_dates=['Date'], index_col=['Date'])
# display(df)
WSA WSV WDV WSM SGT T2M T10M DELTA_T PBAR SRAD RH PM25 AQI count
Date
2015-01-01 1.070833 0.875000 132.958333 3.470833 35.391667 30.729167 30.625000 -0.116667 738.825000 72.666667 99.754167 24.808333 73.307931 0.0
2015-01-02 1.108696 0.939130 148.478261 3.734783 32.465217 34.391304 34.278261 -0.117391 738.347826 61.391304 100.013043 23.500000 64.150725 4.0
2015-01-03 1.017391 0.717391 168.043478 3.773913 42.717391 36.247826 36.160870 -0.095652 739.443478 49.608696 100.769565 20.460870 55.652711 0.0
2015-01-04 1.000000 0.600000 159.958333 3.850000 49.150000 38.887500 38.666667 -0.225000 741.500000 31.541667 101.479167 13.012500 46.835258 0.0
2015-01-05 1.033333 0.441667 137.000000 4.000000 57.566667 42.995833 42.945833 -0.050000 742.533333 44.583333 101.004167 16.654167 52.420271 4.0
2015-01-06 0.781818 0.559091 114.727273 3.654545 42.868182 40.740909 41.095455 0.368182 740.904545 48.272727 100.577273 21.954545 67.318339 6.0
2015-01-07 0.973913 0.830435 110.826087 3.956522 30.817391 40.365217 40.595652 0.221739 739.865217 60.043478 100.195652 24.456522 72.347251 6.0
2015-01-08 0.983333 0.825000 156.500000 4.208333 32.670833 41.520833 41.366667 -0.129167 736.350000 69.583333 99.958333 22.275000 65.770725 10.0
2015-01-09 0.958333 0.729167 133.708333 3.379167 39.645833 42.279167 42.158333 -0.116667 735.204167 60.416667 100.041667 19.370833 59.085129 10.0
2015-01-10 0.966667 0.758333 164.500000 3.675000 37.345833 42.962500 42.775000 -0.200000 734.287500 41.500000 100.120833 14.658333 49.314653 0.0
# resample mean
dfr = df.resample('W').mean()
# add the resampled sum to dfr
dfr['mean'] = df['count'].resample('W').sum()
# reset index
dfr = dfr.reset_index()
# display(dfr)
Date WSA WSV WDV WSM SGT T2M T10M DELTA_T PBAR SRAD RH PM25 AQI count mean
0 2015-01-04 1.049230 0.782880 152.359601 3.707382 39.931069 35.063949 34.932699 -0.138678 739.529076 53.802083 100.503986 20.445426 59.986656 1.0 4.0
1 2015-01-11 0.949566 0.690615 136.210282 3.812261 40.152457 41.810743 41.822823 0.015681 738.190794 54.066590 100.316321 19.894900 61.042728 6.0 36.0
# plot dfr
fig, ax = plt.subplots(figsize=(16, 10))
fig = sns.barplot(x='Date', y='count', data=dfr)
# configure the xaxis ticks from datetime to date
x_dates = dfr.Date.dt.strftime('%Y-%m-%d').sort_values().unique()
ax.set_xticklabels(labels=x_dates, rotation=90, ha='right')
plt.show()
Despite trying some solutions available on SO and at Matplotlib's documentation, I'm still unable to disable Matplotlib's creation of weekend dates on the x-axis.
As you can see see below, it adds dates to the x-axis that are not in the original Pandas column.
I'm plotting my data using (commented lines are unsuccessful in achieving my goal):
fig, ax1 = plt.subplots()
x_axis = df.index.values
ax1.plot(x_axis, df['MP'], color='k')
ax2 = ax1.twinx()
ax2.plot(x_axis, df['R'], color='r')
# plt.xticks(np.arange(len(x_axis)), x_axis)
# fig.autofmt_xdate()
# ax1.fmt_xdata = mdates.DateFormatter('%Y-%m-%d')
fig.tight_layout()
plt.show()
An example of my Pandas dataframe is below, with dates as index:
2019-01-09 1.007042 2585.898714 4.052480e+09 19.980000 12.07 1
2019-01-10 1.007465 2581.828491 3.704500e+09 19.500000 19.74 1
2019-01-11 1.007154 2588.605258 3.434490e+09 18.190001 18.68 1
2019-01-14 1.008560 2582.151225 3.664450e+09 19.070000 14.27 1
Some suggestions I've found include a custom ticker here and here however although I don't get errors the plot is missing my second series.
Any suggestions on how to disable date interpolation in matplotlib?
The matplotlib site recommends creating a custom formatter class. This class will contain logic that tells the axis label not to display anything if the date is a weekend. Here's an example using a dataframe I constructed from the 2018 data that was in the image you'd attached:
df = pd.DataFrame(
data = {
"Col 1" : [1.000325, 1.000807, 1.001207, 1.000355, 1.001512, 1.003237, 1.000979],
"MP": [2743.002071, 2754.011543, 2746.121450, 2760.169848, 2780.756857, 2793.953050, 2792.675162],
"Col 3": [3.242650e+09, 3.453480e+09, 3.576350e+09, 3.641320e+09, 3.573970e+09, 3.573970e+09, 4.325970e+09],
"Col 4": [9.520000, 10.080000, 9.820000, 9.880000, 10.160000, 10.160000, 11.660000],
"Col 5": [5.04, 5.62, 5.29, 6.58, 8.32, 9.57, 9.53],
"R": [0,0,0,0,0,1,1]
},
index=['2018-01-08', '2018-01-09', '2018-01-10', '2018-01-11',
'2018-01-12', '2018-01-15', '2018-01-16'])
Move the dates from the index to their own column:
df = df.reset_index().rename({'index': 'Date'}, axis=1, copy=False)
df['Date'] = pd.to_datetime(df['Date'])
Create the custom formatter class:
import numpy as np
import matplotlib.pyplot as plt
from matplotlib.ticker import Formatter
%config InlineBackend.figure_format = 'retina' # Get nicer looking graphs for retina displays
class CustomFormatter(Formatter):
def __init__(self, dates, fmt='%Y-%m-%d'):
self.dates = dates
self.fmt = fmt
def __call__(self, x, pos=0):
'Return the label for time x at position pos'
ind = int(np.round(x))
if ind >= len(self.dates) or ind < 0:
return ''
return self.dates[ind].strftime(self.fmt)
Now let's plot the MP and R series. Pay attention to the line where we call the custom formatter:
formatter = CustomFormatter(df['Date'])
fig, ax1 = plt.subplots()
ax1.xaxis.set_major_formatter(formatter)
ax1.plot(np.arange(len(df)), df['MP'], color='k')
ax2 = ax1.twinx()
ax2.plot(np.arange(len(df)), df['R'], color='r')
fig.autofmt_xdate()
fig.tight_layout()
plt.show()
The above code outputs this graph:
Now, no weekend dates, such as 2018-01-13, are displayed on the x-axis.
If you would like to simply not show the weekends, but for the graph to still scale correctly matplotlib has a built-in functionality for this in matplotlib.mdates. Specifically, the WeekdayLocator pretty much solves this problem singlehandedly. It's a one line solution (the rest just fabricates data for testing). Note that this works whether or not the data includes weekends:
import matplotlib.pyplot as plt
import datetime
import numpy as np
import matplotlib.dates as mdates
from matplotlib.dates import MO, TU, WE, TH, FR, SA, SU
DT_FORMAT="%Y-%m-%d"
if __name__ == "__main__":
N = 14
#Fake data
x = list(zip([2018]*N, [5]*N, list(range(1,N+1))))
x = [datetime.datetime(*y) for y in x]
x = [y for y in x if y.weekday() < 5]
random_walk_steps = 2 * np.random.randint(0, 6, len(x)) - 3
random_walk = np.cumsum(random_walk_steps)
y = np.arange(len(x)) + random_walk
# Make a figure and plot everything
fig, ax = plt.subplots()
ax.plot(x, y)
### HERE IS THE BIT THAT ANSWERS THE QUESTION
ax.xaxis.set_major_locator(mdates.WeekdayLocator(byweekday=(MO, TU, WE, TH, FR)))
ax.xaxis.set_major_formatter(mdates.DateFormatter(DT_FORMAT))
# plot stuff
fig.autofmt_xdate()
plt.tight_layout()
plt.show()
If you are trying to avoid the fact that matplotlib is interpolating between each point of your dataset, you can exploit the fact that matplotlib will plot a new line segment each time a np.NaN is encountered. Pandas makes it easy to insert np.NaN for the days that are not in your dataset with pd.Dataframe.asfreq():
df = pd.DataFrame(data = {
"Col 1" : [1.000325, 1.000807, 1.001207, 1.000355, 1.001512, 1.003237, 1.000979],
"MP": [2743.002071, 2754.011543, 2746.121450, 2760.169848, 2780.756857, 2793.953050, 2792.675162],
"Col 3": [3.242650e+09, 3.453480e+09, 3.576350e+09, 3.641320e+09, 3.573970e+09, 3.573970e+09, 4.325970e+09],
"Col 4": [9.520000, 10.080000, 9.820000, 9.880000, 10.160000, 10.160000, 11.660000],
"Col 5": [5.04, 5.62, 5.29, 6.58, 8.32, 9.57, 9.53],
"R": [0,0,0,0,0,1,1]
},
index=['2018-01-08', '2018-01-09', '2018-01-10', '2018-01-11', '2018-01-12', '2018-01-15', '2018-01-16'])
df.index = pd.to_datetime(df.index)
#rescale R so I don't need to worry about twinax
df.loc[df.R==0, 'R'] = df.loc[df.R==0, 'R'] + df.MP.min()
df.loc[df.R==1, 'R'] = df.loc[df.R==1, 'R'] * df.MP.max()
df = df.asfreq('D')
df
Col 1 MP Col 3 Col 4 Col 5 R
2018-01-08 1.000325 2743.002071 3.242650e+09 9.52 5.04 2743.002071
2018-01-09 1.000807 2754.011543 3.453480e+09 10.08 5.62 2743.002071
2018-01-10 1.001207 2746.121450 3.576350e+09 9.82 5.29 2743.002071
2018-01-11 1.000355 2760.169848 3.641320e+09 9.88 6.58 2743.002071
2018-01-12 1.001512 2780.756857 3.573970e+09 10.16 8.32 2743.002071
2018-01-13 NaN NaN NaN NaN NaN NaN
2018-01-14 NaN NaN NaN NaN NaN NaN
2018-01-15 1.003237 2793.953050 3.573970e+09 10.16 9.57 2793.953050
2018-01-16 1.000979 2792.675162 4.325970e+09 11.66 9.53 2793.953050
df[['MP', 'R']].plot(); plt.show()
I plotted my wind data (direction and speed) with the windrose module https://windrose.readthedocs.io/en/latest/index.html.
The results look nice but I cannot export them as a figure (png, eps, or anything to start with) because the result is a special object type that does not have a 'savefig' attribute, or I don't find it.
I have two pandas.core.series.Series: ff, dd
print(ff)
result:
TIMESTAMP
2016-08-01 00:00:00 1.643
2016-08-01 01:00:00 2.702
2016-08-01 02:00:00 1.681
2016-08-01 03:00:00 2.208
....
print(dd)
result:
TIMESTAMP
2016-08-01 00:00:00 328.80
2016-08-01 01:00:00 299.60
2016-08-01 02:00:00 306.90
2016-08-01 03:00:00 288.60
...
My code looks like:
from windrose import WindroseAxes
ax2 = WindroseAxes.from_ax()
ax2.bar(dd, ff, normed=True, opening=0.8, edgecolor='white', bins = [0,4,11,17])
ax2.set_legend()
ax2.tick_params(labelsize=18)
ax2.set_legend(loc='center', bbox_to_anchor=(0.05, 0.005), fontsize = 18)
ax2.savefig('./figures/windrose.eps')
ax2.savefig('./figures/windrose.png')
But the result is:
AttributeError: 'WindroseAxes' object has no attribute 'savefig'
Do you know how to create a figure out of the result so I can use it in my work?
We can use pyplot.savefig() from matplotlib.
import pandas as pd
import numpy as np
from windrose import WindroseAxes
from matplotlib import pyplot as plt
from IPython.display import Image
df_ws = pd.read_csv('WindData.csv')
# df_ws has `Wind Direction` and `Wind Speed`
ax = WindroseAxes.from_ax()
ax.bar(df_ws['Wind Direction'], df_ws['Wind Speed'])
ax.set_legend()
# savefig() supports eps, jpeg, jpg, pdf, pgf, png, ps, raw, rgba, svg, svgz, tif, tiff
plt.savefig('WindRose.jpg')
plt.close()
Image(filename='WindRose.jpg')
assuming that you are using box type of windrose module. Then the following code converts your wind rose into an image:
ax = WindroseAxes.from_ax()
ax.box(direction=wd, var=ws, bins=bins)
buff = io.BytesIO()
plt.savefig(buff, format="jpeg")
pixmap = QtGui.QPixmap()
pixmap.loadFromData(buff.getvalue())
dialog.ui.windrose_label.setScaledContents(True)
dialog.ui.windrose_label.setPixmap(pixmap)
The error is occuring because you are trying to save the subplot instead of the figure. Try:
fig,ax2 = plt.subplots(1,1) # Or whatever you need.
# The windrose code you showed
fig.savefig('./figures/windrose.png')