Python: How to make a dict field relative to another dict field - python

I have a dict looking like this:
visits = {'visit_1': {'date': '23-11-2016'},
'visit_2': {'date': '23-12-2016'}}
The dict consists of a lot of visits, where the 'date' is relative to the visit_1 date.
Question: Is it possible for a python dict to refer to self? Something like this:
from datetime import datetime, timedelta
visits = {'visit_1': {'date': datetime('23-11-2016')},
'visit_2': {'date': <reference_to_self>['visit_1']['date'] + timedelta(weeks=4)}
EDIT:
The first visit is not known at initialization. The dict defines a fairly complicated visit sequence (used in treatment of cancer patients)

If I understand the problem correctly, you can define the first visit date prior to creating the dictionary:
from datetime import datetime, timedelta
visit1_date = datetime(2016, 11, 23)
visits = {'visit_1': {'date': visit1_date},
'visit_2': {'date': visit1_date + timedelta(weeks=4)}}
print(visits)
To extend that a little bit further, you may have a visits factory that would dynamically create a dictionary of visits based on a start date and the number of visits (assuming the visits come every 4 weeks evenly):
from datetime import datetime, timedelta
def visit_factory(start_date, number_of_visits):
    return {'visit_%d' % (index + 1): start_date + timedelta(weeks=4 * index)
            for index in range(number_of_visits)}
visits = visit_factory(datetime(2016, 11, 23), 2)
print(visits)

Python containers (e.g. dictionaries) can definitely refer to themselves. The problem with your example is that you are not just storing a reference, but the result of an expression (which involves the reference). This result does not keep any record of how it was computed, so you lose the reference.
Another small problem is that you cannot create an object and reference it in the same expression. Since dicts are mutable, though, you can create some of its elements first and then add further elements in later statements, referencing the dict via its variable name.
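A minimal sketch of that two-step approach, using the dates from the question. The second half is an assumption on my part for the case mentioned in the EDIT, where the first visit date is not known up front: visit_offsets and resolve_visits are hypothetical names, not part of the question.

```python
from datetime import datetime, timedelta

# Step 1: create the dict with the anchor entry only.
visits = {'visit_1': {'date': datetime(2016, 11, 23)}}

# Step 2: add an entry that reads from the dict through its variable name.
# Only the computed date is stored, not a live link to visit_1.
visits['visit_2'] = {'date': visits['visit_1']['date'] + timedelta(weeks=4)}

# Hypothetical variant: keep offsets relative to the first visit and
# resolve them once the first date becomes known.
visit_offsets = {'visit_1': timedelta(0), 'visit_2': timedelta(weeks=4)}

def resolve_visits(offsets, first_date):
    """Turn relative offsets into concrete dates for a given first visit."""
    return {name: {'date': first_date + offset}
            for name, offset in offsets.items()}
```

Storing offsets keeps the relationship between visits explicit, so the whole schedule can be recomputed if the first date changes.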

Related

How can we find possible time zone names corresponding to a time zone abbreviation?

Given a time zone abbreviation, is there a way to get a list of possible time zones?
Like IST could mean one of the following:
Asia/Kolkata (India)
Asia/Tel_Aviv (Israel)
Europe/Dublin (Ireland)
What I was looking for was a way to get ['Asia/Kolkata', 'Asia/Tel_Aviv', 'Europe/Dublin'] as output when 'IST' is given as input.
I was hoping that there would be a way using the standard modules themselves. Can it be done with the new zoneinfo module?
Being inspired by this answer, I did
from datetime import datetime as dt
import zoneinfo
s = zoneinfo.available_timezones()
d = dict()
for tz in s:
    # abbreviation in effect right now for this zone
    t = zoneinfo.ZoneInfo(tz).tzname(dt.utcnow())
    if t not in d:
        d[t] = [tz]
    else:
        d[t].append(tz)
But this is giving 'PDT' as
'PDT': ['America/Ensenada',
'America/Santa_Isabel',
'Mexico/BajaNorte',
'US/Pacific-New',
'US/Pacific',
'America/Vancouver',
'PST8PDT',
'America/Tijuana',
'America/Los_Angeles',
and 'PST' as just
'PST': ['Asia/Manila'],
But wouldn't 'America/Los_Angeles' also come under 'PST' at some point (not right now, but later on)?
I'm obviously missing something, but I can't figure out what it is.
Can there be a way to get the list of possible timezones from a time zone abbreviation?
Extending the linked answer with a somewhat naive approach, you can generate your lookup table for a specific year like this:
from collections import defaultdict
from datetime import datetime, timedelta
import zoneinfo
year = 2021
checkdays = [datetime(year, 1, 1) + timedelta(i) for i in range(0, 365, 5)]
abbreviationmap = defaultdict(set)
for d in checkdays:
    for z in zoneinfo.available_timezones():
        abbreviationmap[zoneinfo.ZoneInfo(z).tzname(d)].add(z)
print(sorted(abbreviationmap['PDT']))
# ['America/Ensenada', 'America/Los_Angeles', 'America/Santa_Isabel',
# 'America/Tijuana', 'America/Vancouver', 'Canada/Pacific', 'Mexico/BajaNorte',
# 'PST8PDT', 'US/Pacific']

Given a list of dates, how can I combine/format them as a string like "Day1-Day3, Day5 Month Year"

Background:
I am working on a project that gives me properly formatted JSON data. For the purposes of this question, the data is not important. Specifically what is relevant to this question is the following example list of strings (dates, in the format YYYY-MM-DD):
dates = [ "2021-04-17", "2021-04-18", "2021-04-19", "2021-04-23" ]
What I want to do:
Given the example list dates above, I want to combine dates with the same month and year into a single line/string, with a format like so:
17-19, 23 APR 2021
What I tried:
I created a simple iteration on the dates list to create a dictionary, broken down like so: {year: {month: {days}}}. However this still leaves me with the issue of combining sequential days, e.g. [17, 18, 19, 23] needs to become 17-19, 23 when finally printing or creating the string. (See my comment below the code snippet)
It's beginning to feel like a sledgehammer solution for a nail problem. I feel like there has to be a better way to do this with list comprehension or something simpler.
Code snippet:
import datetime
dates = [ "2021-04-17", "2021-04-18", "2021-04-19", "2021-04-23" ]
parsed_dates = {}
for d in dates:
    current_date = datetime.datetime.strptime(d, '%Y-%m-%d')
    # local variables just for legibility
    year = current_date.year
    month = current_date.month
    day = current_date.day
    # if the current year is not in the dict, add it
    if year not in parsed_dates:
        parsed_dates[year] = {}
    # if the current month is not in the year dict, add it
    if month not in parsed_dates[year]:
        parsed_dates[year][month] = set()
    # for the day, simply add to the set (ignore duplicates)
    parsed_dates[year][month].add(day)
# dictionary sorting omitted for legibility
# ...
print(parsed_dates)
I haven't continued trying to parse the dictionary into my desired format because as mentioned I feel like I'm going down the wrong rabbit hole here.
See my posted answer below for the solution, it is definitely not optimal. Hoping someone else has better ideas.
Output:
{2021: {4: {17, 18, 19, 23}}}
Some parameters for the potential solution:
I am not in control of the source data, this is provided to me
The date formats will always be a string, with format YYYY-MM-DD. Assume all dates are valid, and will successfully parse with datetime.datetime.strptime using the format string '%Y-%m-%d' (e.g. there are no whacky strings like 2021-50-98)
The dates are not guaranteed to be sequential days or months, or in order (e.g. I could have [ ... "2021-02-05", "2020-01-01", ...]
The dates list may contain duplicates (e.g. [ ... "2021-04-21", "2021-04-21", ... ])
No consideration needs to be given to localization (e.g. assume everything is in English, and everyone understands both styles of date format, DD MMM YYYY and YYYY-MM-DD)
The solution does not need to be "bullet-proof" (e.g. we can make assumptions on format and input, and that it will always look like the example above; the only variable is the number of dates per list)
The solution does not need to be extensible (e.g. I only need this as a solution to the above problem given the list of dates with no consideration for permutations or changes to data)
Please let me know if any other clarification or details are needed, or if the post is too long. I tried to give as much information as possible but I realize this might be overwhelming to quickly read and formulate potential solutions.
For anyone looking at this question, my current solution is as posted below. It is a ton of code to do something simple, so I'm sure there is a better solution out there. I am happy to accept someone else's answer if they can optimize this or point out a core concept I might be missing.
Note: I added "2021-03-18" to the dates list here to demonstrate breaking up the strings properly after formatting.
Code:
import datetime
import calendar
dates = [ "2021-04-17", "2021-04-18", "2021-03-18", "2021-04-19", "2021-04-23" ]
dates_parsed = [datetime.datetime.strptime(d, '%Y-%m-%d') for d in dates]
dates_dict = {}
for d in dates_parsed:
    # local variables just for legibility
    year = d.year
    month = d.month
    day = d.day
    # if the current year is not in the dict, add it
    if year not in dates_dict:
        dates_dict[year] = {}
    # if the current month is not in the year dict, add it
    if month not in dates_dict[year]:
        dates_dict[year][month] = set()
    # for the day, simply add to the set (ignore duplicates)
    dates_dict[year][month].add(day)
# begin looking through the dictionary for ranges to format
for year in dates_dict:
    for month in dates_dict[year]:
        # sets are unordered, so sort the days before looking for consecutive runs
        days = sorted(dates_dict[year][month])
        day_ranges = []  # a list of lists, used to step through and format as needed
        curr_range = []  # the current list being operated on below
        for day in days:
            # check if current range is empty
            if not curr_range:
                curr_range = [day]
                continue
            # if the current range has elements, check if this day is the next one
            # if so, add it to the current range
            if curr_range[-1] + 1 == day:
                curr_range.append(day)
            # otherwise, we're not in a sequential day range
            # push this list to the day_ranges list, then start a new curr_range
            # containing the current day
            else:
                day_ranges.append(curr_range)
                curr_range = [day]
        # if the curr_range isn't empty when exiting the loop, add it to the day_ranges list
        if curr_range:
            day_ranges.append(curr_range)
        # begin formatting the day ranges
        # this formats something like [[17, 18, 19], [23]] to '17-19, 23'
        day_strings = []
        for d in day_ranges:
            if len(d) > 1:
                day_strings.append(f'{d[0]}-{d[-1]}')
            else:
                day_strings.append(str(d[0]))
        # finally, join up each day range into one string
        day_str = ', '.join(day_strings)
        month_name = calendar.month_name[month]
        print(f"{day_str} {month_name[:3].upper()} {year}")
Output:
17-19, 23 APR 2021
18 MAR 2021
(I'm not particularly concerned with the sorting of the year/month for this particular answer, that's easy enough to do elsewhere.)
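A more compact alternative, sketched under the assumptions stated in the question (valid YYYY-MM-DD strings, English month names). It is not the poster's method: it swaps the hand-written range loop for itertools.groupby, relying on the fact that within a run of consecutive days the difference between each day and its position in the sorted list is constant. format_dates is a hypothetical name.

```python
import calendar
from datetime import date
from itertools import groupby

def format_dates(dates):
    """Return one 'D1-D3, D5 MON YYYY' string per (year, month)."""
    # A set removes duplicates; sorting puts the dates in calendar order.
    parsed = sorted({date.fromisoformat(d) for d in dates})
    lines = []
    for (year, month), group in groupby(parsed, key=lambda d: (d.year, d.month)):
        days = [d.day for d in group]
        parts = []
        # Consecutive days share the same (day - index) value.
        for _, run in groupby(enumerate(days), key=lambda pair: pair[1] - pair[0]):
            run_days = [day for _, day in run]
            if len(run_days) > 1:
                parts.append(f"{run_days[0]}-{run_days[-1]}")
            else:
                parts.append(str(run_days[0]))
        month_abbr = calendar.month_abbr[month].upper()
        lines.append(f"{', '.join(parts)} {month_abbr} {year}")
    return lines
```

Unlike the code above, this returns the lines sorted by year and month instead of in insertion order.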

Avoiding repeat data in many instances of the same class

I'm working on subclassing the datetime.datetime class in an attempt to add some calendar-based operations. In particular, I wish to be able to add/subtract days/weeks/month/years from an anchor date while adjusting for weekends and bank holidays.
Here's a snippet that should cover the method:
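import datetime
import dateutil.relativedelta  # the submodule must be imported explicitly
class DateTime(datetime.datetime):
    def add_workdays(self, n=1):
        current = self
        x = 1
        while x <= n:
            current += dateutil.relativedelta.relativedelta(days=1)
            # weekday() is 0 for Monday and 6 for Sunday; only count Monday to Friday
            if current.weekday() <= 4:
                x += 1
        return DateTime(current.year, current.month, current.day, current.hour,
                        current.minute, current.second, current.microsecond)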
This method trivially adds n business days (only accounting for weekends) to the current date and returns the result.
In order to achieve the bank holiday corrections, I could easily just pass an array containing bank holidays (via a setter method or via overriding the __new__ method) to the method and adjust accordingly. However, the problem occurs when considering, for instance, time-series data. In this case, each datetime object would contain a copy of said array, which, I would suspect, could make memory usage quite high for long time-series data.
So my question is this: how would the sophisticated Python programmer deal with this? I've been looking at the way the bizdays package achieves this, but it seems to suffer from the same "shortcomings".
In other languages I would have been able to simply point to a single instance of a holiday array, but to my (admittedly sparse) knowledge of Python, this is not possible. Is it simply more correct to store dates as strings and convert to DateTime only when needed?
You could simply add a variable inside the class definition:
import datetime
class DateTime(datetime.datetime):
    holidays = [datetime.date(2017, 7, 4), datetime.date(2017, 12, 25)]  # ....
print(DateTime.holidays)
# [datetime.date(2017, 7, 4), datetime.date(2017, 12, 25)]
It will be available everywhere, including in any DateTime instance. There will only be one single copy for all your instances:
import datetime
class DateTime(datetime.datetime):
    holidays = [datetime.date(2017, 7, 4), datetime.date(2017, 12, 25)]  # ....
    def test_only_one_copy_of_holidays(self):  # Only defined for testing purposes
        return DateTime.holidays
holidays1 = DateTime(2017,7,21).test_only_one_copy_of_holidays()
holidays2 = DateTime(2017,7,30).test_only_one_copy_of_holidays()
print(holidays1 is holidays2) # Are they the same object?
# True
For multiple countries, you could use a dict of lists:
>>> import datetime
>>> holidays = {'France': [datetime.date(2017,7,14), datetime.date(2017,12,25)], 'USA': [datetime.date(2017,7,4), datetime.date(2017,12,25)]}
>>> holidays['USA']
[datetime.date(2017, 7, 4), datetime.date(2017, 12, 25)]
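To tie this back to the question, here is a rough sketch of how the shared class attribute could feed into add_workdays. It is an illustration, not the asker's or the bizdays implementation: BusinessDateTime is a hypothetical name, the two holidays are the sample dates from above, and datetime.timedelta is used in place of dateutil to keep the example standard-library only.

```python
import datetime

class BusinessDateTime(datetime.datetime):
    # Stored once on the class; instances do not get their own copy.
    holidays = frozenset({datetime.date(2017, 7, 4),
                          datetime.date(2017, 12, 25)})

    def add_workdays(self, n=1):
        """Return the date-time n business days after this one."""
        current = self
        added = 0
        while added < n:
            current = current + datetime.timedelta(days=1)
            # Count the day only if it is Monday to Friday and not a holiday.
            if current.weekday() < 5 and current.date() not in self.holidays:
                added += 1
        return current
```

A frozenset makes the membership test cheap and guards against one instance accidentally mutating the data shared by all of them.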

Querying a DateField with a range of 2 years

This is a pretty straight-forward question:
I have two models, each with a DateField. I want to query Model-A based on the date in Model-B. I want a query that returns all the objects of Model-A that have a date within 2 years, plus or minus, of the date in one object of Model-B. How can this be done?
Assuming you have a date value from model B, calculate two dates, one 2 years in the past and another 2 years in the future, with the help of the python-dateutil module (taken partially from here). Then use the __range lookup to filter A records by date range:
from dateutil.relativedelta import relativedelta
def yearsago(from_date, years):
    return from_date - relativedelta(years=years)
b_date = b.my_date
date_min, date_max = yearsago(b_date, 2), yearsago(b_date, -2)
data = A.objects.filter(my_date__range=(date_min, date_max))
where b is a B model instance.
Also see: Django database query: How to filter objects by date range?
Hope that helps.

Python Date Utility Library

Is there a library in python that can produce date dimensions given a certain day? I'd like to use this for data analysis. Often I have a time series of dates, but for aggregation purposes I'd like to be able to quickly produce dates associated with that day - like first date of month, first day in week, and the like.
I think I could create my own, but if there is something out there already it'd be nice.
Thanks
Have a look at dateutil.
The recurrence rules and relative deltas are what you want.
For example, if you wanted to get last monday:
import dateutil.relativedelta as rd
import datetime
last_monday = datetime.date.today() + rd.relativedelta(weekday=rd.MO(-1))
time and datetime modules
For some of your purposes you can use the time module's strftime() function, or the strftime() method of the date and datetime objects from the datetime module. It allows you to pull, among other data:
number of the week of the year,
number of the weekday (you can also use weekday() method for getting weekday number between 0 for Monday and 6 for Sunday),
year,
month,
Which will suffice to calculate first day of the month, first day of the week and some other data.
Examples
To pull the data you need, do just:
to pull the number of the day of the week
>>> from datetime import datetime
>>> datetime.now().weekday()
6
to pull the first day of the month use the replace() method of the datetime object:
>>> from datetime import datetime
>>> datetime.now()
datetime.datetime(2012, 3, 3, 21, 41, 20, 953000)
>>> first_day_of_the_month = datetime.now().replace(day=1)
>>> first_day_of_the_month
datetime.datetime(2012, 3, 1, 21, 41, 20, 953000)
EDIT: As J.F. Sebastian suggested within comments, datetime objects have weekday() methods, which makes using int(given_date.strftime('%w')) rather pointless. I have updated the answer above.
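Putting the pieces above together, a small standard-library sketch of a "date dimensions" helper might look like the following. The function name and the dictionary keys are my own choices for illustration, not an existing library API.

```python
from datetime import date, timedelta

def date_dimensions(day):
    """Collect a few derived values for one date (hypothetical helper)."""
    return {
        # Same month and year, day forced to 1.
        'first_of_month': day.replace(day=1),
        # weekday() is 0 for Monday, so stepping back that many days lands on Monday.
        'first_of_week': day - timedelta(days=day.weekday()),
        # ISO 8601 week number of the year.
        'iso_week': day.isocalendar()[1],
        'weekday': day.weekday(),
    }

dims = date_dimensions(date(2012, 3, 3))
print(dims['first_of_month'])  # 2012-03-01
print(dims['first_of_week'])   # 2012-02-27
```

For a whole time series the same function can be mapped over each date, and the resulting dictionaries used as grouping keys for aggregation.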
