I'm working on subclassing the datetime.datetime class in an attempt to add some calendar-based operations. In particular, I wish to be able to add or subtract days, weeks, months, or years from an anchor date while adjusting for weekends and bank holidays.
Here's a snippet that should cover the method:
import datetime
import dateutil.relativedelta

class DateTime(datetime.datetime):
    def add_workdays(self, n=1):
        current = self
        x = 1
        while x <= n:
            current += dateutil.relativedelta.relativedelta(days=1)
            if current.weekday() <= 4:  # Monday to Friday count as workdays
                x += 1
        return DateTime(current.year, current.month, current.day, current.hour, current.minute,
                        current.second, current.microsecond)
This method trivially adds n business days (only accounting for weekends) to the current date and returns the result.
In order to achieve the bank holiday corrections, I could easily just pass an array containing bank holidays (via a setter method or via overriding the __new__ method) to the method and adjust accordingly. However, the problem occurs when considering, for instance, time-series data. In this case, each datetime object would contain a copy of said array, which, I would suspect, could make memory usage quite high for long time-series data.
So my question is this: how would the sophisticated Python programmer deal with this? I've been looking at the way the bizdays package achieves this, but it seems to suffer from the same "shortcomings".
In other languages I would have been able to simply point to a single instance of a holiday array, but to my (admittedly sparse) knowledge of Python, this is not possible. Is it simply more correct to store dates as strings and convert to DateTime only when needed?
You could simply add a variable inside the class definition:
import datetime

class DateTime(datetime.datetime):
    holidays = [datetime.date(2017, 7, 4), datetime.date(2017, 12, 25)]  # ....

print(DateTime.holidays)
# [datetime.date(2017, 7, 4), datetime.date(2017, 12, 25)]
It will be available everywhere, including in any DateTime instance. There will only be one single copy for all your instances:
import datetime

class DateTime(datetime.datetime):
    holidays = [datetime.date(2017, 7, 4), datetime.date(2017, 12, 25)]  # ....

    def test_only_one_copy_of_holidays(self):  # Only defined for testing purposes
        return DateTime.holidays

holidays1 = DateTime(2017, 7, 21).test_only_one_copy_of_holidays()
holidays2 = DateTime(2017, 7, 30).test_only_one_copy_of_holidays()
print(holidays1 is holidays2)  # Are they the same object?
# True
For multiple countries, you could use a dict of lists:
>>> import datetime
>>> holidays = {'France': [datetime.date(2017,7,14), datetime.date(2017,12,25)], 'USA': [datetime.date(2017,7,4), datetime.date(2017,12,25)]}
>>> holidays['USA']
[datetime.date(2017, 7, 4), datetime.date(2017, 12, 25)]
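As a rough sketch of how a class-level holiday collection could feed into the question's add_workdays method: the version below is an assumption about the intended behaviour, not code from the original posts. It uses a frozenset instead of a list (for fast membership tests) and the standard library's timedelta in place of dateutil.

```python
import datetime


class DateTime(datetime.datetime):
    # One shared set for the whole class; instances do not hold their own copy.
    holidays = frozenset({datetime.date(2017, 7, 4), datetime.date(2017, 12, 25)})

    def add_workdays(self, n=1):
        """Return a new DateTime n business days after self."""
        current = self
        remaining = n
        while remaining > 0:
            current = current + datetime.timedelta(days=1)
            # Count the day only if it is Monday-Friday and not a holiday.
            if current.weekday() < 5 and current.date() not in self.holidays:
                remaining -= 1
        return DateTime(current.year, current.month, current.day, current.hour,
                        current.minute, current.second, current.microsecond)


# Monday 3 July 2017: the next business day skips the 4 July holiday.
print(DateTime(2017, 7, 3).add_workdays(1))
```

Because holidays is looked up through the class, a long time series of DateTime objects shares the single set rather than duplicating it.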
I am trying to create a list of dates to add to a Pandas dataframe as a new column using ...
df['Surveys_Last_Week'] = list
I have done this before without issues. However, with the code below I get the dates returned in the format I want but when I add them to a list the format changes and they become prefixed with datetime.date
2022-05-14
2022-07-09
2022-03-05
2022-03-12
[datetime.date(2022, 5, 14), datetime.date(2022, 7, 9), datetime.date(2022, 3, 5), datetime.date(2022, 3, 12)]
How can I get the dates into a list in the format that they return in?
The code I am using is as follows ...
today = datetime.date.today()
completion_list_80 = []
for value in df.Weeks_to_80pc:
    if value == float('inf'):
        pass
    else:
        remaining_weeks = datetime.timedelta(weeks=value)
        projected_completion = today + remaining_weeks
        print(projected_completion)
        completion_list_80.append(projected_completion)
print(completion_list_80)
Any help very much appreciated.
Thank you
To store the list of datetime.date objects as a datetime64 column (pandas converts it to a DatetimeIndex first), you can wrap pandas.to_datetime() around your list when you insert it into your DataFrame:
df['Surveys_Last_Week'] = pd.to_datetime(completion_list_80)
Alternatively, to convert the datetime.date objects to their yyyy-mm-dd string representations, you can format them with strftime as you append them to the list (however, I would recommend sticking to pandas' built-in DatetimeIndex support):
completion_list_80.append(projected_completion.strftime('%Y-%m-%d'))
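The difference the question observes comes from how Python displays objects: print() on a single date uses str(), while printing a list shows the repr() of each element. The dates in the list are unchanged. A small standard-library illustration (the variable names are only for this sketch):

```python
import datetime

dates = [datetime.date(2022, 5, 14), datetime.date(2022, 7, 9)]

# print() on a single date calls str(), which gives the ISO form.
single = str(dates[0])

# A printed list shows repr() of each element, hence the datetime.date(...) prefix.
as_shown_in_list = repr(dates[0])

# Converting explicitly yields plain strings if that is really what is wanted.
iso_strings = [d.isoformat() for d in dates]
print(single)
print(as_shown_in_list)
print(iso_strings)
```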
I am writing a django application where I have records stored on the basis of datetimefield.
first_record = MyModel.objects.filter().order_by('added').first()
first_record = (first_record.added.month, first_record.added.year)
last_record = MyModel.objects.filter().order_by('-added').first()
last_record = (last_record.added.month, last_record.added.year)
Now I want to make a list of all months/year between the first record and last record. A rough idea is:
for i in range(first_record, last_record):
# do something
Where the range function is supposed to give me a list to iterate over which looks like this:
[('01','2018'),('02','2018'),('03','2018'),....,('11','2020'),('12','2020')]
Any ideas on how to do that?
Also, is (last_record.added.month, last_record.added.year) the right way to get a tuple containing month and year? Note that I want months in the format 01 instead of 1, e.g. for the first month.
I believe Django has a built-in function. You can do:
>>> Entry.objects.dates('pub_date', 'month')
[datetime.date(2005, 2, 1), datetime.date(2005, 3, 1)]
>>> Entry.objects.dates('pub_date', 'week')
[datetime.date(2005, 2, 14), datetime.date(2005, 3, 14)]
Which, translated into your code, will be something like
MyModel.objects.dates('added', 'month')
Documentation
You can do this by using dateutil.relativedelta
Here is the code
from dateutil.relativedelta import relativedelta
import datetime

result = []
today = datetime.date.today()
current = datetime.date(2010, 8, 1)
while current <= today:
    result.append(current)
    current += relativedelta(months=1)
More details are in the documentation:
https://dateutil.readthedocs.io/en/latest/relativedelta.html
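If the goal is exactly the list of zero-padded (month, year) string tuples from the question, a standard-library-only sketch could look like the following. The helper name month_year_range is hypothetical, and it assumes both endpoints are given as (month, year) integer tuples, as in the question.

```python
def month_year_range(first, last):
    """Yield ('MM', 'YYYY') tuples from first to last, both inclusive.

    first and last are (month, year) tuples of integers.
    """
    month, year = first
    last_month, last_year = last
    # Compare as (year, month) so that ordering across years is correct.
    while (year, month) <= (last_year, last_month):
        yield ('%02d' % month, str(year))
        month += 1
        if month > 12:
            month = 1
            year += 1


months = list(month_year_range((11, 2018), (2, 2019)))
print(months)
# [('11', '2018'), ('12', '2018'), ('01', '2019'), ('02', '2019')]
```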
I am trying to do some Python date and timedelta maths and stumbled upon this.
>>> import datetime
>>> dt = datetime.date(2000, 4, 20)
>>> td = datetime.timedelta(days=1)
>>> dt - td
datetime.date(2000, 4, 19)
>>> -(td) + dt
datetime.date(2000, 4, 19)
>>> dt - td == dt + (-td)
True
So far so good, but when the timedelta also includes some hours it gets interesting.
>>> td = datetime.timedelta(days=1, hours=1)
>>> dt - td
datetime.date(2000, 4, 19)
>>> -(td) + dt
datetime.date(2000, 4, 18)
or in a comparison:
>>> dt - td == dt + (-td)
False
I would have expected that a - b == a + (-b), but this doesn't seem to work for date and timedelta. As far as I was able to track that down, this happens because adding/subtracting date and timedelta only considers the days field of timedelta, which is probably correct. However negating a timedelta considers all fields and may change the days field as well.
>>> -datetime.timedelta(days=1)
datetime.timedelta(-1)
>>> -datetime.timedelta(days=1, hours=1)
datetime.timedelta(-2, 82800)
As can be seen in the second example, days=-2 after the negation, and therefore date + timedelta will actually subtract 2 days.
Should this be considered a bug in the python datetime module? Or is this rather some 'normal' behaviour which needs to be taken into account when doing things like that?
Internally, when subtracting a timedelta from a date object, the datetime module creates a new timedelta with just the days field of the original timedelta object. That equates to the following code, which seems quite odd.
>>> dt + datetime.timedelta(-(-(-td).days))
datetime.date(2000, 4, 18)
I can't really see a reason for just using the negated days field when doing date - timedelta subtractions.
Edit:
Here is the relevant code path in python datetime module:
class date:
    ...
    def __sub__(self, other):
        """Subtract two dates, or a date and a timedelta."""
        if isinstance(other, timedelta):
            return self + timedelta(-other.days)
        ...
If it just passed on -other, then the condition a - b == a + (-b) would hold true. (It would change current behaviour though.)
class date:
    ...
    def __sub__(self, other):
        """Subtract two dates, or a date and a timedelta."""
        if isinstance(other, timedelta):
            return self + (-other)  # negate the whole timedelta, not just its days field
        ...
Should this be considered a bug in the python datetime module? Or is
this rather some 'normal' behaviour which needs to be taken into
account when doing things like that?
No, this should not be considered a bug. A date does not track its state in terms of hours, minutes, and seconds, which is what would be needed for it to behave in the way you suggest it ought to.
I would consider the code you've presented to be a bug: the programmer is using the wrong datatype for the work they're trying to accomplish. If you want to keep track of time in days, hours, minutes and seconds, then you need a datetime object. (which will happily provide you with a date once you've done all of the arithmetic you care to do)
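To illustrate the point about datatypes, here is a short comparison using the same values as the question. With a date the hours are dropped and the two forms disagree; with a datetime they agree, and a date can be taken at the end.

```python
import datetime

td = datetime.timedelta(days=1, hours=1)

# With a date, only the days field of the timedelta is used.
d = datetime.date(2000, 4, 20)
date_results = (d - td, d + (-td))

# With a datetime, the full timedelta is applied and both forms agree.
dt = datetime.datetime(2000, 4, 20)
datetime_results = (dt - td, dt + (-td))

print(date_results)                 # 2000-04-19 versus 2000-04-18
print(datetime_results)             # both 2000-04-18 23:00:00
print(datetime_results[0].date())   # 2000-04-18
```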
This is because of the way negative timedeltas are represented.
import datetime
td = datetime.timedelta(days=1, hours=1)
print (td.days, td.seconds)
# prints 1 3600
minus_td = -td
print (minus_td.days, minus_td.seconds)
# prints -2 82800
I hope you now better understand why days were affected.
Seconds in a timedelta are always normalized to a non-negative amount between 0 and 86399:
>>> print (datetime.timedelta(seconds=-10).seconds)
86390
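When the signed duration is what matters, total_seconds() recombines the normalized fields into a single number. A small check of the stored form against the total, using the timedelta from the question:

```python
import datetime

td = -datetime.timedelta(days=1, hours=1)

# Stored form: days may be negative, seconds always lies in range(86400).
stored = (td.days, td.seconds)

# total_seconds() recombines the fields into one signed value.
total = td.total_seconds()

print(stored)  # (-2, 82800)
print(total)   # -90000.0
```

Here -2 days plus 82800 seconds is -172800 + 82800 = -90000 seconds, i.e. minus 25 hours.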
I have a dict looking like this:
visits = {'visit_1': {'date': '23-11-2016'},
          'visit_2': {'date': '23-12-2016'}}
The dict consists of a lot of visits, where the 'date' is relative to the visit_1 date.
Question: Is it possible for a python dict to refer to self? Something like this:
from datetime import datetime, timedelta
visits = {'visit_1': {'date': datetime(2016, 11, 23)},
          'visit_2': {'date': <reference_to_self>['visit_1']['date'] + timedelta(weeks=4)}}
EDIT:
The first visit is not known at initialization. The dict defines a fairly complicated visit sequence (used in treatment of cancer patients)
If I understand the problem correctly, you can define the first visit date prior to creating the dictionary:
from datetime import datetime, timedelta
visit1_date = datetime(2016, 11, 23)
visits = {'visit_1': {'date': visit1_date},
          'visit_2': {'date': visit1_date + timedelta(weeks=4)}}
print(visits)
To extend that a little bit further, you may have a visits factory that would dynamically create a dictionary of visits based on a start date and the number of visits (assuming the visits come every 4 weeks evenly):
from datetime import datetime, timedelta

def visit_factory(start_date, number_of_visits):
    return {'visit_%d' % (index + 1): start_date + timedelta(weeks=4 * index)
            for index in range(number_of_visits)}

visits = visit_factory(datetime(2016, 11, 23), 2)
print(visits)
Python containers (e.g. dictionaries) can definitely refer to themselves. The problem with your example is that you are not just storing a reference, but the result of an expression (which involves the reference). This result will not itself keep any reference to how it was made, and so you lose the reference.
Another small problem is that you cannot create an object and reference it in the same expression. Since dicts are mutable, though, you can just create some of its elements first, and then add elements to it in further expressions, referencing the dict via its variable name.
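A minimal sketch of both points, using the dates from the question. The second half is one possible way to handle a first visit that is not known at initialization: store offsets and resolve them later. The names offsets and resolve are hypothetical, not part of the original posts.

```python
from datetime import datetime, timedelta

# Create the dict with the anchor entry first ...
visits = {'visit_1': {'date': datetime(2016, 11, 23)}}

# ... then add entries that read from the dict through its variable name.
visits['visit_2'] = {'date': visits['visit_1']['date'] + timedelta(weeks=4)}

# The stored value is a plain datetime; it does not follow later changes to visit_1.
visits['visit_1']['date'] = datetime(2017, 1, 1)
print(visits['visit_2']['date'])  # 2016-12-21 00:00:00

# If the first date is unknown up front, keep offsets and resolve them later.
offsets = {'visit_1': timedelta(0), 'visit_2': timedelta(weeks=4)}


def resolve(offsets, first_date):
    """Turn the offsets into concrete dates once first_date is known."""
    return {name: {'date': first_date + delta} for name, delta in offsets.items()}


resolved = resolve(offsets, datetime(2016, 11, 23))
```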
I'm creating a timestamp using datetime.date.today().day. Later in the code, this one shall be compared to another (current) timestamp, but just on the day-level: "If the current day is not the day of the former timestamp, do stuff".
To do this, I'm saving the first timestamp using pickle. Now I wonder, if the datetime-object will be auto-updated after pickle.load, if the loading date is not the "dumping" date. After all, the function is named "today"... I hope, this is not a stupid question and I managed to explain my issue properly.
The method datetime.datetime.today() creates a new datetime.datetime object for the current moment. The object itself doesn't know how it was created, i.e. neither the function nor the function's intention. It only knows when it was created, and this is what will be stored.
If you look at the documentation of the function (e.g. via help(datetime.datetime.today)), it provides this:
Current date or datetime: same as self.__class__.fromtimestamp(time.time())
Now, time.time() provides the current timestamp, e.g. 1468585949.653488. This is a plain number (float or int), which is constant once created. This number is then simply fed to datetime.datetime.fromtimestamp. For any given timestamp, this will always give you the same datetime [1].
In [12]: datetime.datetime.fromtimestamp(1468585949.653488)
Out[12]: datetime.datetime(2016, 7, 15, 14, 32, 29, 653487)
If you dump this object, you get a regular datetime.datetime object. It's just the plain class datetime.datetime and its data, no function or method reference such as datetime.datetime.today.
In [3]: print(pickle.dumps(datetime.datetime.fromtimestamp(1468585949.653488),protocol=0))
# cdatetime # class datetime.\
# datetime # datetime
# p0 # push last object (datetime.datetime) to stack as p0
# (S'\x07\xe0\x07\x0f\x0e \x1d\t\xf8\xb0' # group '(' start, string 'S' from binary ('\x07...')
# p1 # push last object (string) to stack as p1
# tp2 # create tuple from last stack group, push to stack as p2
# Rp3 # call p0(*p2)
# . # done
So, what does this piece of junk do? It looks up the object datetime.datetime as p0, stores the string '\x07\xe0\x07\x0f\x0e \x1d\t\xf8\xb0' as p1, creates the tuple p2 = tuple((p1,)), then calls p0(*p2).
Or in other words, datetime.datetime('\x07\xe0\x07\x0f\x0e \x1d\t\xf8\xb0') is returned. Note that the argument to datetime.datetime is a constant string. This will always give you the original datetime:
In [30]: datetime.datetime('\x07\xe0\x07\x0f\x0e \x1d\t\xf8\xb0')
Out[30]: datetime.datetime(2016, 7, 15, 14, 32, 29, 653488)
[1] Barring differences in timezones etc.
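The same conclusion can be checked without reading the pickle stream. This sketch round-trips a fixed datetime in memory and then does the day-level comparison described in the question; the variable names are illustrative only.

```python
import datetime
import pickle

saved = datetime.datetime(2016, 7, 15, 14, 32, 29, 653488)

# Dump and load in memory; the loaded object carries the same fixed fields.
loaded = pickle.loads(pickle.dumps(saved))

# Day-level comparison against the current moment.
now = datetime.datetime.today()
is_new_day = now.date() != loaded.date()

print(loaded == saved)  # True
print(is_new_day)
```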
It doesn't autoupdate after loading. To demonstrate it, check this small example:
import pickle
import datetime
from time import sleep

today1 = datetime.datetime.today()
with open('today', 'wb') as f:
    pickle.dump(today1, f)
sleep(5)
with open('today', 'rb') as f:  # pickle files must be opened in binary mode
    today2 = pickle.load(f)
# today1 => datetime.datetime(2016, 7, 15, 18, 6, 6, 578041)
# today2 => datetime.datetime(2016, 7, 15, 18, 6, 6, 578041)
You can see that even after a delay of 5 seconds, there is NO change in the attributes (year, month, day, hour, second, etc.) of the datetime objects today1 and today2.
Hope it helps : )