I need to check that two dates do not fall within any date range in a list.
I want to find out whether a user can check in on the given dates (check_range_true: can, check_range_false: can't) or whether those dates are already booked (in date_ranges).
I have a list of ranges that looks like this:
date_ranges = [
['2020-1-12', '2020-1-13'],
['2020-1-14', '2020-1-15'],
['2020-1-15', '2020-1-16'],
['2020-1-16', '2020-1-18'],
['2020-1-18', '2020-1-19'],
['2020-1-21', '2020-1-23'],
['2020-1-23', '2020-1-27'],
['2020-1-30', '2020-2-1'],
['2020-2-5', '2020-2-7'],
['2020-2-7', '2020-2-9'],
['2020-2-9', '2020-2-11'],
['2020-2-14', '2020-2-18'],
['2020-2-20', '2020-2-26'],
['2020-3-26', '2020-3-30'],
['2020-5-29', '2020-5-30'],
['2020-10-10', '2021-1-15']
]
And two date ranges to check (for example):
check_range_true = ['2020-02-02', '2020-02-04']
check_range_false = ['2020-02-02', '2020-02-05']
I know how to check whether one date is in a range, but I don't understand how to do it with two dates.
What is the best way to check these dates against the ranges and get True for the first variable (because 2020-02-02 to 2020-02-04 does not "touch" any range) and False for the second variable (because 2020-02-05 falls in the range ['2020-2-5', '2020-2-7'])?
What you want to do is build the free slots between the bookings and check that both dates fall inside the same free slot, using (start < first_date < end) and (start < end_date < end) logic:
date_ranges = [
['2020-1-12', '2020-1-13'],
['2020-1-14', '2020-1-15'],
['2020-1-15', '2020-1-16'],
['2020-1-16', '2020-1-18'],
['2020-1-18', '2020-1-19'],
['2020-1-21', '2020-1-23'],
['2020-1-23', '2020-1-27'],
['2020-1-30', '2020-2-1'],
['2020-2-5', '2020-2-7'],
['2020-2-7', '2020-2-9'],
['2020-2-9', '2020-2-11'],
['2020-2-14', '2020-2-18'],
['2020-2-20', '2020-2-26'],
['2020-3-26', '2020-3-30'],
['2020-5-29', '2020-5-30'],
['2020-10-10', '2021-1-15']
]
#convert to a flat list
date_ranges = [k for i in date_ranges for k in i]
#truncate the start and the end value
date_ranges = date_ranges[1:-1]
#convert values to datetime
import datetime
date_ranges = [datetime.datetime.strptime(i, '%Y-%m-%d') for i in date_ranges]
#create available time slots
date_ranges = [[date_ranges[i],date_ranges[i+1]] for i in range(0,len(date_ranges),2)]
#convert the check date to date time
check_range = ['2020-02-02', '2020-02-04']
check_range = [datetime.datetime.strptime(i, '%Y-%m-%d') for i in check_range]
# apply the logic of start < date < end twice
any([(i[0] < check_range[0] < i[1]) and (i[0] < check_range[1] < i[1]) for i in date_ranges])
True
check_range = ['2020-02-02', '2020-02-05']
check_range = [datetime.datetime.strptime(i, '%Y-%m-%d') for i in check_range]
any([(i[0] < check_range[0] < i[1]) and (i[0] < check_range[1] < i[1]) for i in date_ranges])
False
If I understand this correctly, you want to check whether a given date range (e.g. check_range_true) overlaps with any date range in the list. To achieve this, I would first convert the string values to proper datetime objects for easier comparison. This can be done with a list comprehension and strptime:
from datetime import datetime
booked_date_ranges = [
    [datetime.strptime(start_date, '%Y-%m-%d'), datetime.strptime(end_date, '%Y-%m-%d')]
    for start_date, end_date in date_ranges
]
Then I would create a function that checks whether the provided date range (a start date and an end date) overlaps with any date range from the previously built list. Two ranges overlap when each one starts no later than the other ends; checking only whether the start date or the end date falls inside a booked range would miss a request that completely encloses a booking. It would be something along these lines:
def dates_overlap(date_range_to_check, booked_date_ranges):
    dates = [datetime.strptime(date, '%Y-%m-%d') for date in date_range_to_check]
    return any(
        # overlap: the booking starts before the request ends,
        # and the request starts before the booking ends
        start_date <= dates[1] and dates[0] <= end_date
        for start_date, end_date in booked_date_ranges
    )
Then if you want to check if given date range DOES NOT overlap, you can just use the dates_overlap function and negate the result:
>>> not dates_overlap(check_range_false, booked_date_ranges)
False
>>> not dates_overlap(check_range_true, booked_date_ranges)
True
I hope this answers your question. Of course this is just a draft and there's definitely some room for improvement, but should be a working solution to given problem.
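For comparison, here is a minimal, self-contained sketch of the same overlap test using plain date objects. The names parse and is_free are my own, and only a few of the booked ranges are repeated to keep it short. It assumes that touching a booked day counts as a conflict, as in the question.

```python
from datetime import datetime

# A shortened copy of the booked ranges from the question (assumption: a subset is enough to illustrate)
booked = [
    ['2020-1-30', '2020-2-1'],
    ['2020-2-5', '2020-2-7'],
    ['2020-2-7', '2020-2-9'],
]

def parse(text):
    """Turn 'YYYY-M-D' into a date object (hypothetical helper)."""
    return datetime.strptime(text, '%Y-%m-%d').date()

def is_free(check_range, booked_ranges):
    """Return True if check_range touches none of the booked ranges."""
    start, end = (parse(d) for d in check_range)
    # A request is free of a booking only if it ends before the booking
    # starts or starts after the booking ends.
    return all(
        end < parse(b_start) or start > parse(b_end)
        for b_start, b_end in booked_ranges
    )

print(is_free(['2020-02-02', '2020-02-04'], booked))
print(is_free(['2020-02-02', '2020-02-05'], booked))
```

Because the test is written per booking, it also rejects a request that completely encloses a booked range, such as 2020-02-04 to 2020-02-12.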
I'm currently developing something and was wondering if the new match statement in python 3.10 would be suited for such a use case, where I have conditional statements.
As input I have a timestamp and a dataframe with dates and values. The goal is to loop over all rows and add each value to the corresponding bin based on the date. Which bin the value goes into depends on the date in relation to the timestamp: a date within 1 month of the timestamp is placed in bin 1, within 2 months in bin 2, etc.
The code that I have now is as follows:
bins = [0] * 7
for date, value in zip(df.iloc[:,0],df.iloc[:,1]):
    match [date,value]:
        case [date,value] if date < timestamp + pd.Timedelta(1,'m'):
            bins[0] += value
        case [date,value] if date > timestamp + pd.Timedelta(1,'m') and date < timestamp + pd.Timedelta(2,'m'):
            bins[1] += value
        case [date,value] if date > timestamp + pd.Timedelta(2,'m') and date < timestamp + pd.Timedelta(3,'m'):
            bins[2] += value
        case [date,value] if date > timestamp + pd.Timedelta(3,'m') and date < timestamp + pd.Timedelta(4,'m'):
            bins[3] += value
        case [date,value] if date > timestamp + pd.Timedelta(4,'m') and date < timestamp + pd.Timedelta(5,'m'):
            bins[4] += value
        case [date,value] if date > timestamp + pd.Timedelta(5,'m') and date < timestamp + pd.Timedelta(6,'m'):
            bins[5] += value
Correction: originally I stated that this code does not work. It turns out that it actually does. However, I am still wondering if this would be an appropriate use of the match statement.
I'd say it's not a good use of structural pattern matching because there is no actual structure. You are checking values of a single object, so an if/elif chain is a much better, more readable and more natural choice.
I've got two more issues with the way you wrote it:
You do not consider values that are on the edges of the bins.
You are checking the same condition twice. When a case is reached in match/case, you are guaranteed that the previous ones did not match, so you do not need the date > timestamp + pd.Timedelta(1,'m') and ... part: if the earlier check date < timestamp + pd.Timedelta(1,'m') failed, you already know the date is not smaller. (There is an edge case of equality, but it should be handled somehow anyway.)
All in all I think this would be the cleaner solution:
for date, value in zip(df.iloc[:,0],df.iloc[:,1]):
    if date < timestamp + pd.Timedelta(1,'m'):
        bins[0] += value
    elif date < timestamp + pd.Timedelta(2,'m'):
        bins[1] += value
    elif date < timestamp + pd.Timedelta(3,'m'):
        bins[2] += value
    elif date < timestamp + pd.Timedelta(4,'m'):
        bins[3] += value
    elif date < timestamp + pd.Timedelta(5,'m'):
        bins[4] += value
    elif date < timestamp + pd.Timedelta(6,'m'):
        bins[5] += value
    else:
        pass
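Since every branch compares against the next bin edge, the chain can also be collapsed into a sorted list of edges and a binary search. The following is only a sketch with the standard library: bin_values, rows and width are hypothetical names, and it uses a fixed-width timedelta rather than calendar months. Note also that, as far as I know, the 'm' unit of pd.Timedelta means minutes, not months, so the snippets above actually bin by minute.

```python
from bisect import bisect_right
from datetime import datetime, timedelta

def bin_values(rows, timestamp, width, n_bins=6):
    """Sum values into n_bins consecutive bins of the given width after timestamp.

    rows is an iterable of (date, value) pairs (hypothetical input shape).
    """
    # Upper edge of each bin: timestamp + 1*width, timestamp + 2*width, ...
    edges = [timestamp + width * i for i in range(1, n_bins + 1)]
    bins = [0] * n_bins
    for date, value in rows:
        # Number of edges <= date, i.e. the index of the first edge the date is below
        idx = bisect_right(edges, date)
        if idx < n_bins:  # dates at or past the last edge are ignored
            bins[idx] += value
    return bins

ts = datetime(2024, 1, 1)
rows = [
    (ts + timedelta(seconds=30), 5),
    (ts + timedelta(seconds=60), 7),   # exactly on an edge: goes to the next bin
    (ts + timedelta(seconds=90), 1),
    (ts + timedelta(minutes=10), 100), # outside all bins
]
print(bin_values(rows, ts, timedelta(minutes=1)))
```

A date that lies exactly on an edge lands in the later bin, which matches the behaviour of the if/elif chain above.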
This should really be done directly with Pandas functions:
import pandas as pd
from datetime import datetime
timestamp = datetime.now()
bins = (
    [pd.Timestamp(year=1970, month=1, day=1)]
    + [pd.Timestamp(timestamp) + pd.Timedelta(i, 'm') for i in range(6)]
    + [pd.Timestamp(year=2100, month=1, day=1)]  # plus open bin on the right
)
n_samples = 1000
data = {
    'date': [pd.to_datetime(timestamp)+pd.Timedelta(i,'s') for i in range(n_samples)],
    'value': list(range(n_samples))
}
df = pd.DataFrame(data)
df['bin'] = pd.cut(df.date, bins, right=False)
df.groupby('bin').value.sum()
start = "Nov20"
end = "Jan21"
# Expected output:
["Nov20", "Dec20", "Jan21"]
What I've tried so far is the following, but I'm looking for a more elegant way.
from calendar import month_abbr
from time import strptime
def get_range(a, b):
    start = strptime(a[:3], '%b').tm_mon
    end = strptime(b[:3], '%b').tm_mon
    dates = []
    for m in month_abbr[start:]:
        dates.append(m+a[-2:])
    for mm in month_abbr[1:end + 1]:
        dates.append(mm+b[-2:])
    print(dates)
get_range('Nov20', 'Jan21')
Note: I don't want to use pandas, as it doesn't make sense to import such a library just to generate dates.
The date range may span different years so one way is to loop from the start date to end date and increment the month by 1 until end date is reached.
Try this:
from datetime import datetime
def get_range(a, b):
    start = datetime.strptime(a, '%b%y')
    end = datetime.strptime(b, '%b%y')
    dates = []
    while start <= end:
        dates.append(start.strftime('%b%y'))
        if start.month == 12:
            start = start.replace(month=1, year=start.year+1)
        else:
            start = start.replace(month=start.month+1)
    return dates
dates = get_range("Nov20", "Jan21")
print(dates)
Output:
['Nov20', 'Dec20', 'Jan21']
You can use timedelta to step one month (31 days) forward, but make sure you stay on the 1st of the month, otherwise the days might add up and eventually skip a month.
from datetime import datetime
from datetime import timedelta
def get_range(a, b):
    start = datetime.strptime(a, '%b%y')
    end = datetime.strptime(b, '%b%y')
    dates = []
    while start <= end:
        dates.append(start.strftime('%b%y'))
        start = (start + timedelta(days=31)).replace(day=1) # go to 1st of next month
    return dates
dates = get_range("Jan20", "Jan21")
print(dates)
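Another way to avoid the December special case is to count months as a single running integer (year * 12 + month) and convert back with divmod. This is only a sketch: month_range is a hypothetical name, and like the answers above it relies on '%b' matching English month abbreviations in the current locale.

```python
from datetime import datetime

def month_range(a, b, fmt='%b%y'):
    """List every month label from a to b inclusive, e.g. 'Nov20' to 'Jan21'."""
    start = datetime.strptime(a, fmt)
    end = datetime.strptime(b, fmt)
    # Zero-based month index counted from year 0, so year boundaries need no special handling
    first = start.year * 12 + start.month - 1
    last = end.year * 12 + end.month - 1
    labels = []
    for index in range(first, last + 1):
        year, month0 = divmod(index, 12)
        labels.append(datetime(year, month0 + 1, 1).strftime(fmt))
    return labels

print(month_range('Nov20', 'Jan21'))
```

If the end month precedes the start month, the range is empty and the function returns an empty list.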
I have a function that creates a dictionary like the one below.
x = {'filename': {'filetype': ('5/6/2019', '12/31/2019')}, 'filename2': {'filetype': ('3/24/2018', '5/6/2019')}}
I need to create a new function that takes a date and a file type and returns the filename whose date tuple contains that date.
def fn(date, filetype):
    # The date is passed as the first argument.
    # Check whether it lies between the start and end dates of the tuple
    # in the dictionary values above; if it does, return the file name.
    return filename
Question:
Is it possible to check whether a date lies between the two dates of a tuple?
You should convert the strings to datetime objects:
from datetime import datetime
x = {'filename': {'filetype': ('5/6/2019', '12/31/2019')}, 'filename2': {'filetype': ('3/24/2018', '5/6/2019')}}
def fn(dateobj, filetype):
    dateobj = datetime.strptime(dateobj, '%m/%d/%Y')
    startdate = datetime.strptime(filetype[0], '%m/%d/%Y')
    enddate = datetime.strptime(filetype[1], '%m/%d/%Y')
    return startdate <= dateobj <= enddate
print(fn('6/6/2019', x['filename']['filetype']))
print(fn('4/6/2019', x['filename']['filetype']))
this will print:
True
False
As people mentioned in the comments, transforming the string dates to datetime objects is recommended. One way to do it is:
from datetime import datetime
new_date = datetime.strptime('12/31/2019', '%m/%d/%Y')
Assuming all date strings (in the dictionary and in the argument) have already been converted to datetime objects, your function becomes:
def fn(date, filetype):
    for filename, ranges in x.items():
        if filetype in ranges:
            start, end = ranges[filetype]
            if start <= date <= end:
                return filename
This returns the filename if the date lies within the range, and None otherwise.
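Putting the conversion and the lookup together, a self-contained sketch could look like the following. The name find_filename and the FMT constant are my own; it assumes the dictionary keeps its dates as 'm/d/Y' strings, as in the question, and parses them on the fly.

```python
from datetime import datetime

FMT = '%m/%d/%Y'  # assumed date format, taken from the question's data

x = {'filename': {'filetype': ('5/6/2019', '12/31/2019')},
     'filename2': {'filetype': ('3/24/2018', '5/6/2019')}}

def find_filename(date_str, filetype, files=x):
    """Return the first filename whose (start, end) tuple for filetype contains date_str."""
    date = datetime.strptime(date_str, FMT)
    for filename, types in files.items():
        if filetype in types:
            start, end = (datetime.strptime(s, FMT) for s in types[filetype])
            if start <= date <= end:  # both ends inclusive
                return filename
    return None  # no range matched

print(find_filename('6/6/2019', 'filetype'))
print(find_filename('4/6/2019', 'filetype'))
```

Since both ends are inclusive, a date such as 5/6/2019 belongs to both ranges, and the function returns whichever file comes first in the dictionary.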
Use split to convert each date to 3 numeric values in this order: year, month, day. Then you can compare the dates as tuples.
def convert(datestr):
    m, d, y = datestr.split('/')
    return (int(y), int(m), int(d))
date1 = convert('12/31/2018')
date2 = convert('1/1/2019')
print(date1 < date2)
The same approach works with lists, but the two types must not be mixed: either all dates in a comparison are tuples, or all of them are lists.
For date intervals simply test (e.g. in an if statement):
begin <= date <= end
where all 3 values are as described above.
I'm trying to create a list of the hours contained within each specified interval, which would be quite complicated with a loop. Therefore, I wanted to ask for datetime recommendations.
# input in format DDHH/ddhh:
validity = ['2712/2812','2723/2805','2800/2812']
# demanded output:
val_hours = ['2712', '2713', '2714'..., '2717', '2723', '2800',...'2804',]
It would be great if the last hour of validity were considered non-valid, because the interval ends at that hour, or more precisely at the 59th minute of the previous one.
I've tried a rather complicated approach with if conditions and loops, but I am convinced that there is a better one, as always.
It is something like:
#input in format DDHH/ddhh:
validity = ['2712/2812','2723/2805','2800/2812']
output = []
#upbound = list with the length of each group, produced by a previously defined function
upbound = [24, 6, 12]
#For only the first 24-hour group:
for hour in range(0,upbound[0]):
    item = int(validity[0][-7:-5]) + hour
    if (hour >= 24):
        hour = hour - 24
    output = output + hour
Further, I would have to zero-pad values whose day is smaller than 10, like 112 (the 1st, 12:00 Zulu), and ensure the correct day.
Loops and ifs seem just too complicated to me. Not counting error handling, it looks like two or three conditions.
Thank you for your help!
For each validity string, I use datetime.strptime to parse it; then, depending on whether the start date is less than or equal to the end date or greater than it, I calculate the hours.
If the start date is less than or equal to the end date, I use the original validity string; otherwise I split it into two strings, start_date/3023 and 0100/end_date.
import datetime
validity = ['2712/2812','2723/2805','2800/2812','3012/0112','3023/0105','0110/0112']
def get_valid_hours(valid):
    hours_li = []
    #Parse the start date and end date as datetime
    start_date_str, end_date_str = valid.split('/')
    start_date = datetime.datetime.strptime(start_date_str,'%d%H')
    end_date = datetime.datetime.strptime(end_date_str, '%d%H')
    #If start date less than equal to end date
    if start_date <= end_date:
        dt = start_date
        i=0
        #Keep creating new dates until we hit end date
        while dt < end_date:
            #Append the dates to a list
            dt = start_date+datetime.timedelta(hours=i)
            hours_li.append(dt.strftime('%d%H'))
            i+=1
    #Else split the validity into two and calculate them separately
    else:
        start_date_str, end_date_str = valid.split('/')
        return get_valid_hours('{}/3023'.format(start_date_str)) + get_valid_hours('0100/{}'.format(end_date_str))
    #Append sublist to a bigger list
    return hours_li
for valid in validity:
    print(get_valid_hours(valid))
The output then looks like this (note that the end hour of each interval is included); I'm not sure if this was the format needed!
['2712', '2713', '2714', '2715', '2716', '2717', '2718', '2719', '2720', '2721', '2722', '2723', '2800', '2801', '2802', '2803', '2804', '2805', '2806', '2807', '2808', '2809', '2810', '2811', '2812']
['2723', '2800', '2801', '2802', '2803', '2804', '2805']
['2800', '2801', '2802', '2803', '2804', '2805', '2806', '2807', '2808', '2809', '2810', '2811', '2812']
['3012', '3013', '3014', '3015', '3016', '3017', '3018', '3019', '3020', '3021', '3022', '3023', '0100', '0101', '0102', '0103', '0104', '0105', '0106', '0107', '0108', '0109', '0110', '0111', '0112']
['0100', '0101', '0102', '0103', '0104', '0105']
['0110', '0111', '0112']
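If the final hour should be left out, as the question asks, and the month rollover should follow the real calendar instead of a fixed day 30, one option is to anchor the DDHH values to an actual year and month. The sketch below is only an illustration: valid_hours is a hypothetical name, and the year and month arguments are an assumption, since the DDHH/ddhh format itself does not carry them.

```python
from datetime import datetime, timedelta

def valid_hours(validity, year, month):
    """Expand 'DDHH/ddhh' into a list of 'DDHH' strings, excluding the end hour.

    year and month say which month the start day belongs to (assumed input).
    """
    start_s, end_s = validity.split('/')
    start = datetime(year, month, int(start_s[:2]), int(start_s[2:]))
    end_day, end_hour = int(end_s[:2]), int(end_s[2:])
    if end_day < start.day:
        # The interval runs into the following month: jump to the 1st of next month
        next_month = (start.replace(day=1) + timedelta(days=32)).replace(day=1)
        end = next_month.replace(day=end_day, hour=end_hour)
    else:
        end = start.replace(day=end_day, hour=end_hour)
    hours = []
    current = start
    while current < end:  # strict comparison leaves out the end hour
        hours.append(current.strftime('%d%H'))
        current += timedelta(hours=1)
    return hours

print(valid_hours('2723/2805', 2020, 1))
print(valid_hours('3023/0105', 2020, 4))
```

With April 2020 as the reference month, the second call rolls over from the 30th directly to the 1st; with a 31-day month it would pass through the 31st first.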
Finally, I created something easy like this:
validity = ['3012/0112','3023/0105','0110/0112']
upbound = [24, 6, 12]
hours_list = []
for idx, val in enumerate(validity):
    hours_li = []
    DD = val[:2]
    HH = val[2:4]
    dd = val[5:7]
    hh = val[7:9]
    # Interval starts and ends on the same day
    if DD == dd:
        for i in range(int(HH),upbound[idx]):
            hours_li.append(DD + str(i).zfill(2))
    # Interval crosses midnight: rest of the first day, then the start of the next
    if DD != dd:
        for i in range(int(HH),24):
            hours_li.append(DD + str(i).zfill(2))
        for j in range(0,int(hh)):
            hours_li.append(dd + str(j).zfill(2))
    hours_list.append(hours_li)
This works for 24h validity (it could be solved by one if condition and a similar block of concatenation) and does not use datetime, just numbers and str. It is neither pythonic nor fast, but it works.
I have two Python datetime and I want to count the days between those dates, counting ONLY the days belonging to the month I choose. The range might overlap multiple months/years.
Example:
If I have 2017-10-29 & 2017-11-04 and I chose to count the days in October, I get 3 (29, 30 & 31 Oct.).
I can't find a straightforward way to do this so I think I'm going to iterate over the days using datetime.timedelta(days=1), and increment a count each time the day belongs to the month I chose.
Do you know a more performant method?
I'm using Python 2.7.10 with the Django framework.
Iterating over the days would be the most straightforward way to do it. Otherwise, you would need to know how many days are in a given month and you would need different code for different scenarios:
1. The given month is the month of the first date
2. The given month is the month of the second date
3. The given month is between the first and the second date (if dates span more than two months)
If you want to support dates spanning more than one year then you would need the input to include month and year.
Your example fits scenario #1, which I guess you could do like this:
>>> from datetime import datetime, timedelta
>>>
>>> first_date = datetime(2017, 10, 29)
>>>
>>> first_day_of_next_month = first_date.replace(month=first_date.month + 1, day=1)
>>> last_day_of_this_month = first_day_of_next_month - timedelta(1)
>>> number_of_days_in_this_month = last_day_of_this_month.day
>>> number_of_days_in_this_month - first_date.day + 1
3
This is why I would suggest implementing it the way you originally intended and only turning to this if there's a performance concern.
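If iterating ever does become a bottleneck, the three scenarios can be handled at once by clamping the range to the boundaries of the chosen month. The following is a sketch rather than a drop-in solution: days_in_month_within is a hypothetical name, it takes the year as well as the month, it works on date objects instead of datetime, and it treats both end dates as inclusive.

```python
import calendar
from datetime import date

def days_in_month_within(start, end, year, month):
    """Count the days of the given month that lie in [start, end], both inclusive."""
    month_start = date(year, month, 1)
    # calendar.monthrange returns (weekday of the 1st, number of days in the month)
    month_end = date(year, month, calendar.monthrange(year, month)[1])
    lo = max(start, month_start)
    hi = min(end, month_end)
    # A negative span means the month and the range do not intersect
    return max((hi - lo).days + 1, 0)

print(days_in_month_within(date(2017, 10, 29), date(2017, 11, 4), 2017, 10))
```

This avoids the month + 1 arithmetic used above, which would fail for a date in December.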
You can get the difference between two datetime objects by simply subtracting them.
So, we start by getting the number of days between the two dates.
And then we generate all the dates between the two using
gen = (start_date + datetime.timedelta(days = e) for e in range(diff + 1))
Since we only want the dates that fall in the chosen month (October here), we apply a filter.
filter(lambda x : x.month == 10 , gen)
Then we count them.
And the final code is this:
import datetime
# start_date and end_date are the two datetime objects from the question
diff = (end_date - start_date).days
gen = (start_date + datetime.timedelta(days = e) for e in range(diff + 1))
filtered_dates = filter(
    lambda x : x.month == 10 ,
    gen
)
count = sum(1 for e in filtered_dates)
You can also use reduce but sum() is a lot more readable.
A potential method of achieving this is to first compare whether your start or end dates you are comparing have the same month that you want to choose.
For example:
start = datetime(2017, 10, 29)
end = datetime(2017, 11, 4)
We create a function to compare the dates like so:
def daysofmonth(start, end, monthsel):
    if start.month == monthsel:
        # days from the start date to the end of the selected month
        days = (datetime(start.year, monthsel+1, 1) - start).days
    elif end.month == monthsel:
        # days from the beginning of the selected month to the end date
        days = (end - datetime(end.year, monthsel, 1)).days
    elif not (start.month < monthsel and monthsel < end.month):
        # the selected month is outside the range
        return 0
    else:
        # the whole selected month lies inside the range
        days = (datetime(start.year, monthsel+1, 1) - datetime(start.year, monthsel, 1)).days
    return days
So, in our example, setting monthsel to 10 gives:
>>> daysofmonth(start, end, 10)
3
Using pandas with your dates:
import pandas as pd
from datetime import datetime
first_date = datetime(2017, 10, 29)
second_date = datetime(2017, 11, 4)
days_count = (second_date - first_date).days
month_date = first_date.strftime("%Y-%m")
values = pd.date_range(start=first_date,periods=days_count,freq='D').to_period('M').value_counts()
print(values)
print(values[month_date])
outputs
2017-10 3
2017-11 3
3