I'm having trouble building a logic/algorithm that creates a date, adds it to the URL and then when I create another URL, it will contain the next date.
It should iterate through every day of every month of every year (that's why I thought of the nested for loops).
Note that I only have one variable for date because I want the start and end dates to be the same.
# Setting range of years, months and days to be iterated through in the URL later
YYYY = []
years = range(2016, 2021)
for yyyy in years:
    YYYY.append(yyyy)
MM = []
months = range(1, 13)
for mm in months:
    MM.append(mm)
DD = []
days = range(1, 32)
for dd in days:
    DD.append(dd)
# Create iterating logic with i, j and k to define which year, month and day will be added to the URL
for i in YYYY:
    for j in MM:
        for k in DD:
            True
# start_date and e_date are the same, so we just define 'date'
date = str(YYYY[i]) + '-' + str(MM[j]) + '-' + str(DD[k])
print(date)
# Create URL with the date variable so it can be iterated through
URL = ('https://movement.uber.com/explore/atlanta/travel-times/query?si=1074&ti=&ag=taz&dt[tpb]=ALL_DAY&dt[wd;]=1,2,3,4,5,6,7&dt[dr][sd]=' +
date + '&dt[dr][ed]=' + date + '&dt[dr][ed]=2016-01-19&cd=&sa;=&sdn=&lang=en-US')
You could use datetime and timedelta objects to make a generator to produce the URLs to iterate over:
from datetime import datetime, timedelta
def get_url():
    date = datetime(2016, 1, 1)
    # <= so that the last day, 2020-12-31, is included
    while date <= datetime(2020, 12, 31):
        yield 'https://movement.uber.com/explore/atlanta/travel-times/query?si=1074&ti=&ag=taz&dt[tpb]=ALL_DAY&dt[wd;]=1,2,3,4,5,6,7&dt[dr][sd]=' + \
            date.strftime('%Y-%m-%d') + '&dt[dr][ed]=' + date.strftime('%Y-%m-%d') + '&dt[dr][ed]=2016-01-19&cd=&sa;=&sdn=&lang=en-US'
        date += timedelta(days=1)
# Print only a few of the generated URLs as a spot check
i = 0
for url in get_url():
    i += 1
    if i < 3 or i > 10:
        print(url)
    if i > 12:
        break
Output:
https://movement.uber.com/explore/atlanta/travel-times/query?si=1074&ti=&ag=taz&dt[tpb]=ALL_DAY&dt[wd;]=1,2,3,4,5,6,7&dt[dr][sd]=2016-01-01&dt[dr][ed]=2016-01-01&dt[dr][ed]=2016-01-19&cd=&sa;=&sdn=&lang=en-US
https://movement.uber.com/explore/atlanta/travel-times/query?si=1074&ti=&ag=taz&dt[tpb]=ALL_DAY&dt[wd;]=1,2,3,4,5,6,7&dt[dr][sd]=2016-01-02&dt[dr][ed]=2016-01-02&dt[dr][ed]=2016-01-19&cd=&sa;=&sdn=&lang=en-US
https://movement.uber.com/explore/atlanta/travel-times/query?si=1074&ti=&ag=taz&dt[tpb]=ALL_DAY&dt[wd;]=1,2,3,4,5,6,7&dt[dr][sd]=2016-01-11&dt[dr][ed]=2016-01-11&dt[dr][ed]=2016-01-19&cd=&sa;=&sdn=&lang=en-US
https://movement.uber.com/explore/atlanta/travel-times/query?si=1074&ti=&ag=taz&dt[tpb]=ALL_DAY&dt[wd;]=1,2,3,4,5,6,7&dt[dr][sd]=2016-01-12&dt[dr][ed]=2016-01-12&dt[dr][ed]=2016-01-19&cd=&sa;=&sdn=&lang=en-US
https://movement.uber.com/explore/atlanta/travel-times/query?si=1074&ti=&ag=taz&dt[tpb]=ALL_DAY&dt[wd;]=1,2,3,4,5,6,7&dt[dr][sd]=2016-01-13&dt[dr][ed]=2016-01-13&dt[dr][ed]=2016-01-19&cd=&sa;=&sdn=&lang=en-US
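If the nested-loop structure from the question is preferred, the day loop has to stop at the real length of each month, otherwise it produces dates such as February 31. A minimal sketch, assuming only the date strings are needed and the URL is assembled around them as in the question (the function name iter_days is made up for illustration):

```python
import calendar
from datetime import date

def iter_days(first_year, last_year):
    """Yield every calendar date from first_year through last_year as 'YYYY-MM-DD'."""
    for year in range(first_year, last_year + 1):
        for month in range(1, 13):
            # monthrange returns (weekday of the 1st, number of days in the month)
            days_in_month = calendar.monthrange(year, month)[1]
            for day in range(1, days_in_month + 1):
                yield date(year, month, day).isoformat()

all_days = list(iter_days(2016, 2020))
print(all_days[0], all_days[-1], len(all_days))
```

Each yielded string can be dropped into the start-date and end-date slots of the URL.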
Python 3 sum a list of time based on dates from another list.
I used the code below to arrive at the total time, but I am trying to aggregate the time for each date, e.g. '02-01-2019' should sum to '08:00:00'.
Date = ['01-01-2019', '02-01-2019', '02-01-2019']
Time = ['07:00:00', '06:00:00','02:00:00']
total = 0
for t in Time:
    h, m, s = map(int, t.split(":"))
    total += 3600*h + 60*m + s
# integer division keeps hours, minutes and seconds whole in Python 3
d = "%02d:%02d:%02d" % (total // 3600, total // 60 % 60, total % 60)
I need an if statement to check whether the sum of time for each date is >= '08:00:00', e.g.
if time_for_each_date >= '08:00:00':
    do something
else:
    do something else
This might help get you started on accomplishing your ultimate goal:
import datetime
Date = ['01-01-2019', '02-01-2019', '02-01-2019']
Time = ['07:00:00', '06:00:00', '02:00:00']
data = zip(Date, Time)
dates = []
for d in data:
    dt = datetime.datetime.strptime("{}, {}".format(*d), "%m-%d-%Y, %H:%M:%S")
    dates.append(dt)
# Note: only the hour component is summed; minutes and seconds are ignored
totals = {}
for d in dates:
    if d.date() not in totals:
        totals[d.date()] = d.hour
    else:
        totals[d.date()] += d.hour
for date, time in totals.items():
    if time >= 8:
        # do something
        print('do something:', date)
    else:
        print('do something else.')
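The code above adds up whole hours only. If minutes and seconds matter too, one option is to accumulate timedelta objects per date and compare against an eight-hour timedelta. This is a sketch rather than the answerer's method; the helper name sum_time_per_date is invented here, and the dates are kept as plain strings:

```python
from collections import defaultdict
from datetime import timedelta

def sum_time_per_date(dates, times):
    """Return {date_string: timedelta} with all times for each date added up."""
    totals = defaultdict(timedelta)  # missing keys start at a zero timedelta
    for day, clock in zip(dates, times):
        h, m, s = map(int, clock.split(":"))
        totals[day] += timedelta(hours=h, minutes=m, seconds=s)
    return dict(totals)

Date = ['01-01-2019', '02-01-2019', '02-01-2019']
Time = ['07:00:00', '06:00:00', '02:00:00']

threshold = timedelta(hours=8)
for day, total in sum_time_per_date(Date, Time).items():
    if total >= threshold:
        print(day, total, "-> do something")
    else:
        print(day, total, "-> do something else")
```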
I have the following code to do this, but how can I do it better? Right now I think it's better than nested loops, but it starts to get Perl-one-linerish when you have a generator in a list comprehension.
day_count = (end_date - start_date).days + 1
for single_date in [d for d in (start_date + timedelta(n) for n in range(day_count)) if d <= end_date]:
    print(single_date.strftime("%Y-%m-%d"))
Notes
I'm not actually using this to print. That's just for demo purposes.
The start_date and end_date variables are datetime.date objects because I don't need the timestamps. (They're going to be used to generate a report).
Sample Output
For a start date of 2009-05-30 and an end date of 2009-06-09:
2009-05-30
2009-05-31
2009-06-01
2009-06-02
2009-06-03
2009-06-04
2009-06-05
2009-06-06
2009-06-07
2009-06-08
2009-06-09
Why are there two nested iterations? For me it produces the same list of data with only one iteration:
for single_date in (start_date + timedelta(n) for n in range(day_count)):
    print ...
And no list gets stored, only one generator is iterated over. Also the "if" in the generator seems to be unnecessary.
After all, a linear sequence should only require one iterator, not two.
Update after discussion with John Machin:
Maybe the most elegant solution is using a generator function to completely hide/abstract the iteration over the range of dates:
from datetime import date, timedelta
def daterange(start_date, end_date):
    for n in range(int((end_date - start_date).days)):
        yield start_date + timedelta(n)
start_date = date(2013, 1, 1)
end_date = date(2015, 6, 2)
for single_date in daterange(start_date, end_date):
    print(single_date.strftime("%Y-%m-%d"))
NB: For consistency with the built-in range() function this iteration stops before reaching the end_date. So for inclusive iteration use the next day, as you would with range().
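To make the half-open behaviour concrete, here is a small sketch that reuses the same generator with the dates from the question's sample output and passes the day after end_date to get an inclusive range:

```python
from datetime import date, timedelta

def daterange(start_date, end_date):
    # Half-open like range(): yields start_date up to, but not including, end_date.
    for n in range((end_date - start_date).days):
        yield start_date + timedelta(n)

start_date = date(2009, 5, 30)
end_date = date(2009, 6, 9)

# Pass the day after end_date to make the iteration inclusive.
inclusive_days = list(daterange(start_date, end_date + timedelta(days=1)))
print(inclusive_days[0], inclusive_days[-1], len(inclusive_days))
```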
This might be more clear:
from datetime import date, timedelta
start_date = date(2019, 1, 1)
end_date = date(2020, 1, 1)
delta = timedelta(days=1)
while start_date <= end_date:
    print(start_date.strftime("%Y-%m-%d"))
    start_date += delta
Use the dateutil library:
from datetime import date
from dateutil.rrule import rrule, DAILY
a = date(2009, 5, 30)
b = date(2009, 6, 9)
for dt in rrule(DAILY, dtstart=a, until=b):
    print(dt.strftime("%Y-%m-%d"))
This Python library has many more advanced features, some very useful, like relative deltas, and is implemented as a single file (module) that's easily included into a project.
Pandas is great for time series in general, and has direct support for date ranges.
import pandas as pd
daterange = pd.date_range(start_date, end_date)
You can then loop over the daterange to print the date:
for single_date in daterange:
    print(single_date.strftime("%Y-%m-%d"))
It also has lots of options to make life easier. For example if you only wanted weekdays, you would just swap in bdate_range. See http://pandas.pydata.org/pandas-docs/stable/timeseries.html#generating-ranges-of-timestamps
The power of Pandas is really its dataframes, which support vectorized operations (much like numpy) that make operations across large quantities of data very fast and easy.
EDIT:
You could also completely skip the for loop and just print it directly, which is easier and more efficient:
print(daterange)
import datetime
def daterange(start, stop, step=datetime.timedelta(days=1), inclusive=False):
    # inclusive=False to behave like range by default
    if step.days > 0:
        while start < stop:
            yield start
            start = start + step
            # not +=! don't modify object passed in if it's mutable
            # since this function is not restricted to
            # only types from datetime module
    elif step.days < 0:
        while start > stop:
            yield start
            start = start + step
    if inclusive and start == stop:
        yield start
# ...
for date in daterange(start_date, end_date, inclusive=True):
    print(date.strftime("%Y-%m-%d"))
This function does more than you strictly require, by supporting negative step, etc. As long as you factor out your range logic, then you don't need the separate day_count and most importantly the code becomes easier to read as you call the function from multiple places.
This is the most human-readable solution I can think of.
import datetime
def daterange(start, end, step=datetime.timedelta(1)):
    curr = start
    while curr < end:
        yield curr
        curr += step
Numpy's arange function can be applied to dates:
import numpy as np
from datetime import datetime, timedelta
d0 = datetime(2009, 1,1)
d1 = datetime(2010, 1,1)
dt = timedelta(days = 1)
dates = np.arange(d0, d1, dt).astype(datetime)
The use of astype is to convert from numpy.datetime64 to an array of datetime.datetime objects.
Why not try:
import datetime as dt
start_date = dt.datetime(2012, 12, 1)
end_date = dt.datetime(2012, 12, 5)
total_days = (end_date - start_date).days + 1  # inclusive 5 days
for day_number in range(total_days):
    current_date = (start_date + dt.timedelta(days=day_number)).date()
    print(current_date)
Show the next n days starting from today:
import datetime
for i in range(0, 100):
    print((datetime.date.today() + datetime.timedelta(i)).isoformat())
Output:
2016-06-29
2016-06-30
2016-07-01
2016-07-02
2016-07-03
2016-07-04
For completeness, Pandas also has a period_range function for timestamps that are out of bounds:
import pandas as pd
pd.period_range(start='1/1/1626', end='1/08/1627', freq='D')
import datetime
def daterange(start, stop, step_days=1):
    current = start
    step = datetime.timedelta(step_days)
    if step_days > 0:
        while current < stop:
            yield current
            current += step
    elif step_days < 0:
        while current > stop:
            yield current
            current += step
    else:
        raise ValueError("daterange() step_days argument must not be zero")
if __name__ == "__main__":
    from pprint import pprint as pp
    lo = datetime.date(2008, 12, 27)
    hi = datetime.date(2009, 1, 5)
    pp(list(daterange(lo, hi)))
    pp(list(daterange(hi, lo, -1)))
    pp(list(daterange(lo, hi, 7)))
    pp(list(daterange(hi, lo, -7)))
    assert not list(daterange(lo, hi, -1))
    assert not list(daterange(hi, lo))
    assert not list(daterange(lo, hi, -7))
    assert not list(daterange(hi, lo, 7))
import datetime
for i in range(16):
    print(datetime.date.today() + datetime.timedelta(days=i))
I have a similar problem, but I need to iterate monthly instead of daily.
This is my solution
import calendar
from datetime import datetime, timedelta
def days_in_month(dt):
    return calendar.monthrange(dt.year, dt.month)[1]
def monthly_range(dt_start, dt_end):
    forward = dt_end >= dt_start
    finish = False
    dt = dt_start
    while not finish:
        yield dt.date()
        if forward:
            days = days_in_month(dt)
            dt = dt + timedelta(days=days)
            finish = dt > dt_end
        else:
            _tmp_dt = dt.replace(day=1) - timedelta(days=1)
            dt = (_tmp_dt.replace(day=dt.day))
            finish = dt < dt_end
Example #1
date_start = datetime(2016, 6, 1)
date_end = datetime(2017, 1, 1)
for p in monthly_range(date_start, date_end):
    print(p)
Output
2016-06-01
2016-07-01
2016-08-01
2016-09-01
2016-10-01
2016-11-01
2016-12-01
2017-01-01
Example #2
date_start = datetime(2017, 1, 1)
date_end = datetime(2016, 6, 1)
for p in monthly_range(date_start, date_end):
    print(p)
Output
2017-01-01
2016-12-01
2016-11-01
2016-10-01
2016-09-01
2016-08-01
2016-07-01
2016-06-01
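Both examples start on the first of the month. Adding the current month's length in days would drift when the start day falls late in the month (January 31 plus 31 days lands in March). A possible variant, sketched here with made-up helper names add_months and monthly_range_clamped, steps by calendar month and clamps the day to the length of the target month:

```python
import calendar
from datetime import date

def add_months(d, months):
    """Return d shifted by a number of months, clamping the day to the month's length."""
    month_index = d.year * 12 + (d.month - 1) + months
    year, month = divmod(month_index, 12)
    month += 1
    day = min(d.day, calendar.monthrange(year, month)[1])
    return date(year, month, day)

def monthly_range_clamped(start, end):
    """Yield dates one month apart from start to end (inclusive), forwards or backwards."""
    step = 1 if end >= start else -1
    n = 0
    current = start
    while (current <= end) if step > 0 else (current >= end):
        yield current
        n += step
        # Always offset from the original start so a clamped day (31 -> 29) does not stick.
        current = add_months(start, n)

for d in monthly_range_clamped(date(2016, 1, 31), date(2016, 4, 30)):
    print(d)
```

Offsetting from the original start date on every step is a design choice: it lets the day return to 31 in March after being clamped to 29 in February.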
You can generate a series of dates between two dates simply and reliably using the pandas library:
import pandas as pd
print(pd.date_range(start='1/1/2010', end='1/08/2018', freq='M'))
You can change the frequency of the generated dates by setting freq to D, M, Q or Y (daily, monthly, quarterly, yearly).
Using pendulum.period:
import pendulum
start = pendulum.from_format('2020-05-01', 'YYYY-MM-DD', formatter='alternative')
end = pendulum.from_format('2020-05-02', 'YYYY-MM-DD', formatter='alternative')
period = pendulum.period(start, end)
for dt in period:
    print(dt.to_date_string())
> pip install DateTimeRange
import datetime
from datetimerange import DateTimeRange
def dateRange(start, end, step):
    rangeList = []
    time_range = DateTimeRange(start, end)
    for value in time_range.range(datetime.timedelta(days=step)):
        rangeList.append(value.strftime('%m/%d/%Y'))
    return rangeList
dateRange("2018-09-07", "2018-12-25", 7)
Out[92]:
['09/07/2018',
'09/14/2018',
'09/21/2018',
'09/28/2018',
'10/05/2018',
'10/12/2018',
'10/19/2018',
'10/26/2018',
'11/02/2018',
'11/09/2018',
'11/16/2018',
'11/23/2018',
'11/30/2018',
'12/07/2018',
'12/14/2018',
'12/21/2018']
For those who are interested in a Pythonic functional way:
from datetime import date, timedelta
from itertools import count, takewhile
for d in takewhile(lambda x: x <= date(2009, 6, 9), map(lambda x: date(2009, 5, 30) + timedelta(days=x), count())):
    print(d)
What about the following for doing a range incremented by days (Python 2, hence xrange):
for d in map(lambda x: startDate + datetime.timedelta(days=x), xrange((stopDate - startDate).days)):
    pass  # Do stuff here
startDate and stopDate are datetime.date objects.
For a generic version:
for d in map(lambda x: startTime + x * stepTime, xrange(int((stopTime - startTime).total_seconds() / stepTime.total_seconds()))):
    pass  # Do stuff here
startTime and stopTime are datetime.date or datetime.datetime objects (both should be the same type).
stepTime is a timedelta object.
Note that .total_seconds() is only supported from Python 2.7 onwards. If you are stuck with an earlier version you can write your own function:
def total_seconds(td):
    return float(td.microseconds + (td.seconds + td.days * 24 * 3600) * 10**6) / 10**6
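As a quick sanity check, the fallback can be compared against the built-in method on any interpreter that has it. The sample timedeltas below are arbitrary and only meant to cover days, sub-day, negative and sub-second values:

```python
from datetime import timedelta

def total_seconds(td):
    # Same arithmetic as the fallback above: combine days, seconds and microseconds.
    return float(td.microseconds + (td.seconds + td.days * 24 * 3600) * 10**6) / 10**6

samples = (
    timedelta(days=1),
    timedelta(hours=1, minutes=30),
    timedelta(seconds=-90),
    timedelta(milliseconds=250),
)
for td in samples:
    print(td, total_seconds(td), td.total_seconds())
```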
This function has some extra features:
can pass a string matching the DATE_FORMAT for start or end and it is converted to a date object
can pass a date object for start or end
error checking in case the end is older than the start
import datetime
from datetime import timedelta
DATE_FORMAT = '%Y/%m/%d'
def daterange(start, end):
    def convert(date):
        try:
            date = datetime.datetime.strptime(date, DATE_FORMAT)
            return date.date()
        except TypeError:
            # not a string, so assume it is already a date object
            return date
    def get_date(n):
        return (convert(start) + timedelta(days=n)).strftime(DATE_FORMAT)
    days = (convert(end) - convert(start)).days
    if days <= 0:
        raise ValueError('The start date must be before the end date.')
    for n in range(0, days):
        yield get_date(n)
start = '2014/12/1'
end = '2014/12/31'
print(list(daterange(start, end)))
start = datetime.date.today()
end = '2015/12/1'  # must be later than today, otherwise ValueError is raised
print(list(daterange(start, end)))
Here's code for a general date range function, similar to Ber's answer, but more flexible:
import datetime
import functools
def count_timedelta(delta, step, seconds_in_interval):
    """Helper function for range_dt. Finds the number of intervals in the timedelta."""
    return int(delta.total_seconds() / (seconds_in_interval * step))
def range_dt(start, end, step=1, interval='day'):
    """Iterate over datetimes or dates, similar to builtin range."""
    intervals = functools.partial(count_timedelta, (end - start), step)
    if interval == 'week':
        for i in range(intervals(3600 * 24 * 7)):
            yield start + datetime.timedelta(weeks=i) * step
    elif interval == 'day':
        for i in range(intervals(3600 * 24)):
            yield start + datetime.timedelta(days=i) * step
    elif interval == 'hour':
        for i in range(intervals(3600)):
            yield start + datetime.timedelta(hours=i) * step
    elif interval == 'minute':
        for i in range(intervals(60)):
            yield start + datetime.timedelta(minutes=i) * step
    elif interval == 'second':
        for i in range(intervals(1)):
            yield start + datetime.timedelta(seconds=i) * step
    elif interval == 'millisecond':
        for i in range(intervals(1 / 1000)):
            yield start + datetime.timedelta(milliseconds=i) * step
    elif interval == 'microsecond':
        for i in range(intervals(1e-6)):
            yield start + datetime.timedelta(microseconds=i) * step
    else:
        raise AttributeError("Interval must be 'week', 'day', 'hour', 'minute', 'second', "
                             "'millisecond' or 'microsecond'.")
import datetime
from dateutil.rrule import DAILY,rrule
date=datetime.datetime(2019,1,10)
date1=datetime.datetime(2019,2,2)
for i in rrule(DAILY, dtstart=date, until=date1):
    print(i.strftime('%Y%b%d'), sep='\n')
OUTPUT:
2019Jan10
2019Jan11
2019Jan12
2019Jan13
2019Jan14
2019Jan15
2019Jan16
2019Jan17
2019Jan18
2019Jan19
2019Jan20
2019Jan21
2019Jan22
2019Jan23
2019Jan24
2019Jan25
2019Jan26
2019Jan27
2019Jan28
2019Jan29
2019Jan30
2019Jan31
2019Feb01
2019Feb02
from datetime import date,timedelta
delta = timedelta(days=1)
start = date(2020,1,1)
end=date(2020,9,1)
loop_date = start
while loop_date <= end:
    print(loop_date)
    loop_date += delta
You can use Arrow:
This is an example from the docs, iterating over hours:
>>> from datetime import datetime
>>> from arrow import Arrow
>>> start = datetime(2013, 5, 5, 12, 30)
>>> end = datetime(2013, 5, 5, 17, 15)
>>> for r in Arrow.range('hour', start, end):
...     print(repr(r))
...
<Arrow [2013-05-05T12:30:00+00:00]>
<Arrow [2013-05-05T13:30:00+00:00]>
<Arrow [2013-05-05T14:30:00+00:00]>
<Arrow [2013-05-05T15:30:00+00:00]>
<Arrow [2013-05-05T16:30:00+00:00]>
To iterate over days, you can use it like this:
>>> start = Arrow(2013, 5, 5)
>>> end = Arrow(2013, 5, 5)
>>> for r in Arrow.range('day', start, end):
...     print(repr(r))
(Didn't check if you can pass datetime.date objects, but anyways Arrow objects are easier in general)
If you are going to use a dynamic timedelta, you can use one of the following:
1. With a while loop
from datetime import datetime, timedelta
from typing import AsyncGenerator, Generator, List  # shared by all four variants
def datetime_range(start: datetime, end: datetime, delta: timedelta) -> Generator[datetime, None, None]:
    while start <= end:
        yield start
        start += delta
2. With a for loop
def datetime_range(start: datetime, end: datetime, delta: timedelta) -> Generator[datetime, None, None]:
    delta_units = int((end - start) / delta)
    for _ in range(delta_units + 1):
        yield start
        start += delta
3. If you are using async/await
async def datetime_range(start: datetime, end: datetime, delta: timedelta) -> AsyncGenerator[datetime, None]:
    delta_units = int((end - start) / delta)
    for _ in range(delta_units + 1):
        yield start
        start += delta
4. List comprehension
def datetime_range(start: datetime, end: datetime, delta: timedelta) -> List[datetime]:
    delta_units = int((end - start) / delta)
    return [start + (delta * index) for index in range(delta_units + 1)]
Solutions 1 and 2 can then be used like this:
start = datetime(2020, 10, 10, 10, 00)
end = datetime(2022, 10, 10, 18, 00)
delta = timedelta(minutes=30)
result = [time_part for time_part in datetime_range(start, end, delta)]
# or
for time_part in datetime_range(start, end, delta):
    print(time_part)
Solution 3 has to be used in an async context, because it returns an async generator object, which can only be consumed there:
start = datetime(2020, 10, 10, 10, 00)
end = datetime(2022, 10, 10, 18, 00)
delta = timedelta(minutes=30)
result = [time_part async for time_part in datetime_range(start, end, delta)]
async for time_part in datetime_range(start, end, delta):
    print(time_part)
The benefit of the solutions above is that all of them take a dynamic timedelta. This can be very useful when you do not know in advance which time delta you will need.
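The async usage only works inside a coroutine, so a complete script needs a wrapper and an event loop. A minimal runnable sketch, assuming a short two-hour window and a made-up wrapper name collect_half_hours:

```python
import asyncio
from datetime import datetime, timedelta
from typing import AsyncGenerator

async def datetime_range(start: datetime, end: datetime, delta: timedelta) -> AsyncGenerator[datetime, None]:
    # Number of whole steps that fit between start and end (delta must be positive).
    delta_units = int((end - start) / delta)
    for _ in range(delta_units + 1):
        yield start
        start += delta

async def collect_half_hours():
    start = datetime(2020, 10, 10, 10, 0)
    end = datetime(2020, 10, 10, 12, 0)
    # "async for" is only valid inside a coroutine, hence this wrapper.
    return [t async for t in datetime_range(start, end, timedelta(minutes=30))]

half_hours = asyncio.run(collect_half_hours())
print(len(half_hours), half_hours[0], half_hours[-1])
```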
Slightly different approach to reversible steps by storing range args in a tuple.
from datetime import timedelta
def date_range(start, stop, step=1, inclusive=False):
    day_count = (stop - start).days
    if inclusive:
        day_count += 1
    if step > 0:
        range_args = (0, day_count, step)
    elif step < 0:
        range_args = (day_count - 1, -1, step)
    else:
        raise ValueError("date_range(): step arg must be non-zero")
    for i in range(*range_args):
        yield start + timedelta(days=i)
Is there any Python function to count the number of Fridays or Thursdays in a date range? I searched Google and found many methods, which usually divide the number of days by 7, but that does not give an accurate count. For example, from 1/Nov/2016 to 12/Nov/2016 there are two Fridays and two Thursdays, so the result of the subtraction should be 8.
You can do it with numpy:
import numpy as np
from datetime import datetime
start_date = datetime(2022, 10, 19).strftime('%Y-%m-%d')
end_date = datetime(2022, 12, 21).strftime('%Y-%m-%d')
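# weekmask runs Monday..Sunday: '0000110' counts Fridays and Saturdays; use '0001100' for Thursdays and Fridays
weekend_days = np.busday_count(start_date, end_date, weekmask='0000110').item()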
numpy busday_count doc
Keep in mind that end date is excluded from the count.
Using the date object from the datetime module.
from datetime import date, timedelta
curr = date(2016, 11, 1)
end = date(2016, 11, 12)
step = timedelta(1)
num_thur_fri = 0
while curr <= end:
    if curr.weekday() in [3, 4]:  # Thursday and Friday (Monday is 0)
        num_thur_fri += 1
    curr += step
print(num_thur_fri)
More reading here: https://docs.python.org/2/library/datetime.html#module-datetime
#brianpck is right, this is a really naive solution. Here's a better one:
from datetime import date
begin = date(2016, 11, 1)
end = date(2016, 11, 12)
diff = (end - begin).days + 1  # number of days, both endpoints included
day_of_week = begin.weekday()
num_thur_fri = 2 * (diff // 7)  # every full week has one Thursday and one Friday
for i in range(diff % 7):
    if day_of_week in [3, 4]:  # Thursday and Friday (Monday is 0)
        num_thur_fri += 1
    day_of_week = (day_of_week + 1) % 7
print(num_thur_fri)
Here is a simpler and faster approach that will calculate this figure for long periods of time.
First, calculate the number of days between the two datetimes. Floor divide that by 7 to get the number of entire weeks and multiply by 2 to get the number of Thursdays and Fridays in those weeks. The final step is to take the remainder modulo seven to get the number of days at the tail and count how many of those are Thursdays or Fridays: this last step is the only one that actually requires knowing which weekday it is.
A full function would be:
from datetime import datetime, timedelta
def thursday_fridays_between(date1, date2):
    start = min(date1, date2)
    days_between = abs((date2 - date1).days)
    # each full week contains exactly one Thursday and one Friday
    thursday_friday = days_between // 7 * 2
    # check the remaining days, endpoints included; weekday(): Thursday is 3, Friday is 4
    thursday_friday += sum((start + timedelta(i)).weekday() in (3, 4) for i in range(days_between % 7 + 1))
    return thursday_friday
It can be used as follows:
>>> a = datetime(2016, 11, 1)
>>> b = datetime(2016, 11, 12)
>>> thursday_fridays_between(a, b)
4
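The same weeks-plus-remainder idea extends to any set of weekdays. The following is a sketch, not part of the answer above; the name count_weekdays and the inclusive-endpoint convention are assumptions made for illustration:

```python
from datetime import date

def count_weekdays(start, end, weekdays):
    """Count days in [start, end] (inclusive) whose weekday() is in `weekdays`."""
    if end < start:
        start, end = end, start
    total_days = (end - start).days + 1
    full_weeks, remainder = divmod(total_days, 7)
    # Every full week contains each weekday exactly once.
    count = full_weeks * len(set(weekdays))
    # The leftover days start on the same weekday as `start`.
    count += sum((start.weekday() + i) % 7 in weekdays for i in range(remainder))
    return count

THURSDAY, FRIDAY = 3, 4  # date.weekday(): Monday is 0
off_days = count_weekdays(date(2016, 11, 1), date(2016, 11, 12), {THURSDAY, FRIDAY})
print(off_days)        # number of Thursdays and Fridays in the range
print(12 - off_days)   # remaining days of the 12-day range
```

Only the leftover days need a weekday check, so the cost does not grow with the length of the range.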
I figured out a method; correct me if I am wrong.
Here is my code:
from datetime import timedelta, datetime
curr = "1-11-2016"
end = "30-11-2016"
date_format = "%d-%m-%Y"  # renamed from 'format' to avoid shadowing the built-in
start_date = datetime.strptime(curr, date_format)
end_date = datetime.strptime(end, date_format)
step = timedelta(1)
num_thur_fri = 0
off_days = ['Fri', 'Thu']
days = (end_date - start_date).days + 1  # + 1 so the end date is included
for x in range(days):
    day = start_date.strftime("%a")
    print(day)
    if day in off_days:
        num_thur_fri += 1
    start_date += step
print(num_thur_fri)
I have the following code to do this, but how can I do it better? Right now I think it's better than nested loops, but it starts to get Perl-one-linerish when you have a generator in a list comprehension.
day_count = (end_date - start_date).days + 1
for single_date in [d for d in (start_date + timedelta(n) for n in range(day_count)) if d <= end_date]:
print strftime("%Y-%m-%d", single_date.timetuple())
Notes
I'm not actually using this to print. That's just for demo purposes.
The start_date and end_date variables are datetime.date objects because I don't need the timestamps. (They're going to be used to generate a report).
Sample Output
For a start date of 2009-05-30 and an end date of 2009-06-09:
2009-05-30
2009-05-31
2009-06-01
2009-06-02
2009-06-03
2009-06-04
2009-06-05
2009-06-06
2009-06-07
2009-06-08
2009-06-09
Why are there two nested iterations? For me it produces the same list of data with only one iteration:
for single_date in (start_date + timedelta(n) for n in range(day_count)):
print ...
And no list gets stored, only one generator is iterated over. Also the "if" in the generator seems to be unnecessary.
After all, a linear sequence should only require one iterator, not two.
Update after discussion with John Machin:
Maybe the most elegant solution is using a generator function to completely hide/abstract the iteration over the range of dates:
from datetime import date, timedelta
def daterange(start_date, end_date):
for n in range(int((end_date - start_date).days)):
yield start_date + timedelta(n)
start_date = date(2013, 1, 1)
end_date = date(2015, 6, 2)
for single_date in daterange(start_date, end_date):
print(single_date.strftime("%Y-%m-%d"))
NB: For consistency with the built-in range() function this iteration stops before reaching the end_date. So for inclusive iteration use the next day, as you would with range().
This might be more clear:
from datetime import date, timedelta
start_date = date(2019, 1, 1)
end_date = date(2020, 1, 1)
delta = timedelta(days=1)
while start_date <= end_date:
print(start_date.strftime("%Y-%m-%d"))
start_date += delta
Use the dateutil library:
from datetime import date
from dateutil.rrule import rrule, DAILY
a = date(2009, 5, 30)
b = date(2009, 6, 9)
for dt in rrule(DAILY, dtstart=a, until=b):
print dt.strftime("%Y-%m-%d")
This python library has many more advanced features, some very useful, like relative deltas—and is implemented as a single file (module) that's easily included into a project.
Pandas is great for time series in general, and has direct support for date ranges.
import pandas as pd
daterange = pd.date_range(start_date, end_date)
You can then loop over the daterange to print the date:
for single_date in daterange:
print (single_date.strftime("%Y-%m-%d"))
It also has lots of options to make life easier. For example if you only wanted weekdays, you would just swap in bdate_range. See http://pandas.pydata.org/pandas-docs/stable/timeseries.html#generating-ranges-of-timestamps
The power of Pandas is really its dataframes, which support vectorized operations (much like numpy) that make operations across large quantities of data very fast and easy.
EDIT:
You could also completely skip the for loop and just print it directly, which is easier and more efficient:
print(daterange)
import datetime
def daterange(start, stop, step=datetime.timedelta(days=1), inclusive=False):
# inclusive=False to behave like range by default
if step.days > 0:
while start < stop:
yield start
start = start + step
# not +=! don't modify object passed in if it's mutable
# since this function is not restricted to
# only types from datetime module
elif step.days < 0:
while start > stop:
yield start
start = start + step
if inclusive and start == stop:
yield start
# ...
for date in daterange(start_date, end_date, inclusive=True):
print strftime("%Y-%m-%d", date.timetuple())
This function does more than you strictly require, by supporting negative step, etc. As long as you factor out your range logic, then you don't need the separate day_count and most importantly the code becomes easier to read as you call the function from multiple places.
This is the most human-readable solution I can think of.
import datetime
def daterange(start, end, step=datetime.timedelta(1)):
curr = start
while curr < end:
yield curr
curr += step
Numpy's arange function can be applied to dates:
import numpy as np
from datetime import datetime, timedelta
d0 = datetime(2009, 1,1)
d1 = datetime(2010, 1,1)
dt = timedelta(days = 1)
dates = np.arange(d0, d1, dt).astype(datetime)
The use of astype is to convert from numpy.datetime64 to an array of datetime.datetime objects.
Why not try:
import datetime as dt
start_date = dt.datetime(2012, 12,1)
end_date = dt.datetime(2012, 12,5)
total_days = (end_date - start_date).days + 1 #inclusive 5 days
for day_number in range(total_days):
current_date = (start_date + dt.timedelta(days = day_number)).date()
print current_date
Show the last n days from today:
import datetime
for i in range(0, 100):
print((datetime.date.today() + datetime.timedelta(i)).isoformat())
Output:
2016-06-29
2016-06-30
2016-07-01
2016-07-02
2016-07-03
2016-07-04
For completeness, Pandas also has a period_range function for timestamps that are out of bounds:
import pandas as pd
pd.period_range(start='1/1/1626', end='1/08/1627', freq='D')
import datetime
def daterange(start, stop, step_days=1):
current = start
step = datetime.timedelta(step_days)
if step_days > 0:
while current < stop:
yield current
current += step
elif step_days < 0:
while current > stop:
yield current
current += step
else:
raise ValueError("daterange() step_days argument must not be zero")
if __name__ == "__main__":
from pprint import pprint as pp
lo = datetime.date(2008, 12, 27)
hi = datetime.date(2009, 1, 5)
pp(list(daterange(lo, hi)))
pp(list(daterange(hi, lo, -1)))
pp(list(daterange(lo, hi, 7)))
pp(list(daterange(hi, lo, -7)))
assert not list(daterange(lo, hi, -1))
assert not list(daterange(hi, lo))
assert not list(daterange(lo, hi, -7))
assert not list(daterange(hi, lo, 7))
for i in range(16):
print datetime.date.today() + datetime.timedelta(days=i)
I have a similar problem, but I need to iterate monthly instead of daily.
This is my solution
import calendar
from datetime import datetime, timedelta
def days_in_month(dt):
return calendar.monthrange(dt.year, dt.month)[1]
def monthly_range(dt_start, dt_end):
forward = dt_end >= dt_start
finish = False
dt = dt_start
while not finish:
yield dt.date()
if forward:
days = days_in_month(dt)
dt = dt + timedelta(days=days)
finish = dt > dt_end
else:
_tmp_dt = dt.replace(day=1) - timedelta(days=1)
dt = (_tmp_dt.replace(day=dt.day))
finish = dt < dt_end
Example #1
date_start = datetime(2016, 6, 1)
date_end = datetime(2017, 1, 1)
for p in monthly_range(date_start, date_end):
print(p)
Output
2016-06-01
2016-07-01
2016-08-01
2016-09-01
2016-10-01
2016-11-01
2016-12-01
2017-01-01
Example #2
date_start = datetime(2017, 1, 1)
date_end = datetime(2016, 6, 1)
for p in monthly_range(date_start, date_end):
print(p)
Output
2017-01-01
2016-12-01
2016-11-01
2016-10-01
2016-09-01
2016-08-01
2016-07-01
2016-06-01
You can generate a series of date between two dates using the pandas library simply and trustfully
import pandas as pd
print pd.date_range(start='1/1/2010', end='1/08/2018', freq='M')
You can change the frequency of generating dates by setting freq as D, M, Q, Y
(daily, monthly, quarterly, yearly
)
Using pendulum.period:
import pendulum
start = pendulum.from_format('2020-05-01', 'YYYY-MM-DD', formatter='alternative')
end = pendulum.from_format('2020-05-02', 'YYYY-MM-DD', formatter='alternative')
period = pendulum.period(start, end)
for dt in period:
print(dt.to_date_string())
> pip install DateTimeRange
from datetimerange import DateTimeRange
def dateRange(start, end, step):
rangeList = []
time_range = DateTimeRange(start, end)
for value in time_range.range(datetime.timedelta(days=step)):
rangeList.append(value.strftime('%m/%d/%Y'))
return rangeList
dateRange("2018-09-07", "2018-12-25", 7)
Out[92]:
['09/07/2018',
'09/14/2018',
'09/21/2018',
'09/28/2018',
'10/05/2018',
'10/12/2018',
'10/19/2018',
'10/26/2018',
'11/02/2018',
'11/09/2018',
'11/16/2018',
'11/23/2018',
'11/30/2018',
'12/07/2018',
'12/14/2018',
'12/21/2018']
For those who are interested in Pythonic functional way:
from datetime import date, timedelta
from itertools import count, takewhile
for d in takewhile(lambda x: x<=date(2009,6,9), map(lambda x:date(2009,5,30)+timedelta(days=x), count())):
print(d)
What about the following for doing a range incremented by days:
for d in map( lambda x: startDate+datetime.timedelta(days=x), xrange( (stopDate-startDate).days ) ):
# Do stuff here
startDate and stopDate are datetime.date objects
For a generic version:
for d in map( lambda x: startTime+x*stepTime, xrange( (stopTime-startTime).total_seconds() / stepTime.total_seconds() ) ):
# Do stuff here
startTime and stopTime are datetime.date or datetime.datetime object
(both should be the same type)
stepTime is a timedelta object
Note that .total_seconds() is only supported after python 2.7 If you are stuck with an earlier version you can write your own function:
def total_seconds( td ):
return float(td.microseconds + (td.seconds + td.days * 24 * 3600) * 10**6) / 10**6
This function has some extra features:
can pass a string matching the DATE_FORMAT for start or end and it is converted to a date object
can pass a date object for start or end
error checking in case the end is older than the start
import datetime
from datetime import timedelta

DATE_FORMAT = '%Y/%m/%d'

def daterange(start, end):
    def convert(date):
        # Strings are parsed with DATE_FORMAT; date objects raise TypeError and are returned as-is
        try:
            date = datetime.datetime.strptime(date, DATE_FORMAT)
            return date.date()
        except TypeError:
            return date

    def get_date(n):
        return (convert(start) + timedelta(days=n)).strftime(DATE_FORMAT)

    days = (convert(end) - convert(start)).days
    if days <= 0:
        raise ValueError('The start date must be before the end date.')
    for n in range(0, days):
        yield get_date(n)

start = '2014/12/1'
end = '2014/12/31'
print(list(daterange(start, end)))

# start can also be a date object; this raises ValueError unless end is later than today
start = datetime.date.today()
end = '2015/12/1'
print(list(daterange(start, end)))
Here's code for a general date range function, similar to Ber's answer, but more flexible:
import datetime
import functools

def count_timedelta(delta, step, seconds_in_interval):
    """Helper function for range_dt. Finds the number of intervals in the timedelta."""
    return int(delta.total_seconds() / (seconds_in_interval * step))

def range_dt(start, end, step=1, interval='day'):
    """Iterate over datetimes or dates, similar to builtin range."""
    intervals = functools.partial(count_timedelta, (end - start), step)
    if interval == 'week':
        for i in range(intervals(3600 * 24 * 7)):
            yield start + datetime.timedelta(weeks=i) * step
    elif interval == 'day':
        for i in range(intervals(3600 * 24)):
            yield start + datetime.timedelta(days=i) * step
    elif interval == 'hour':
        for i in range(intervals(3600)):
            yield start + datetime.timedelta(hours=i) * step
    elif interval == 'minute':
        for i in range(intervals(60)):
            yield start + datetime.timedelta(minutes=i) * step
    elif interval == 'second':
        for i in range(intervals(1)):
            yield start + datetime.timedelta(seconds=i) * step
    elif interval == 'millisecond':
        for i in range(intervals(1e-3)):
            yield start + datetime.timedelta(milliseconds=i) * step
    elif interval == 'microsecond':
        for i in range(intervals(1e-6)):
            yield start + datetime.timedelta(microseconds=i) * step
    else:
        raise AttributeError("Interval must be 'week', 'day', 'hour', 'minute', "
                             "'second', 'millisecond' or 'microsecond'.")
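The if/elif chain above repeats the same loop once per unit. One possible way to shorten it is a lookup table that maps each interval name to a timedelta. This is a sketch only: the names UNITS and range_dt_compact are invented here, and it raises ValueError rather than the AttributeError used above.

```python
import datetime

# One unit of each supported interval, expressed as a timedelta.
UNITS = {
    'week': datetime.timedelta(weeks=1),
    'day': datetime.timedelta(days=1),
    'hour': datetime.timedelta(hours=1),
    'minute': datetime.timedelta(minutes=1),
    'second': datetime.timedelta(seconds=1),
    'millisecond': datetime.timedelta(milliseconds=1),
    'microsecond': datetime.timedelta(microseconds=1),
}

def range_dt_compact(start, end, step=1, interval='day'):
    """Iterate over dates or datetimes from start (inclusive) towards end (exclusive)."""
    try:
        delta = UNITS[interval] * step
    except KeyError:
        raise ValueError("Interval must be one of: " + ", ".join(UNITS)) from None
    # Number of whole steps that fit between start and end.
    count = (end - start) // delta
    for i in range(count):
        yield start + i * delta

weeks = list(range_dt_compact(datetime.date(2019, 1, 1), datetime.date(2019, 1, 22), interval='week'))
print(weeks[-1])        # 2019-01-15

two_hourly = list(range_dt_compact(datetime.datetime(2019, 1, 1, 0),
                                   datetime.datetime(2019, 1, 1, 6),
                                   step=2, interval='hour'))
print(two_hourly[-1])   # 2019-01-01 04:00:00
```

As with the builtin range, the end value is excluded. Because this is a generator, an unknown interval name is only reported when iteration starts.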
import datetime
from dateutil.rrule import DAILY, rrule
date = datetime.datetime(2019, 1, 10)
date1 = datetime.datetime(2019, 2, 2)
for i in rrule(DAILY, dtstart=date, until=date1):
    print(i.strftime('%Y%b%d'), sep='\n')
OUTPUT:
2019Jan10
2019Jan11
2019Jan12
2019Jan13
2019Jan14
2019Jan15
2019Jan16
2019Jan17
2019Jan18
2019Jan19
2019Jan20
2019Jan21
2019Jan22
2019Jan23
2019Jan24
2019Jan25
2019Jan26
2019Jan27
2019Jan28
2019Jan29
2019Jan30
2019Jan31
2019Feb01
2019Feb02
from datetime import date, timedelta
delta = timedelta(days=1)
start = date(2020, 1, 1)
end = date(2020, 9, 1)
loop_date = start
while loop_date <= end:
    print(loop_date)
    loop_date += delta
You can use Arrow:
This is an example from the docs, iterating over hours:
>>> from datetime import datetime
>>> from arrow import Arrow
>>> start = datetime(2013, 5, 5, 12, 30)
>>> end = datetime(2013, 5, 5, 17, 15)
>>> for r in Arrow.range('hour', start, end):
...     print(repr(r))
...
<Arrow [2013-05-05T12:30:00+00:00]>
<Arrow [2013-05-05T13:30:00+00:00]>
<Arrow [2013-05-05T14:30:00+00:00]>
<Arrow [2013-05-05T15:30:00+00:00]>
<Arrow [2013-05-05T16:30:00+00:00]>
To iterate over days, you can do this:
>>> start = Arrow(2013, 5, 5)
>>> end = Arrow(2013, 5, 5)
>>> for r in Arrow.range('day', start, end):
...     print(repr(r))
(Didn't check if you can pass datetime.date objects, but anyways Arrow objects are easier in general)
If you are going to use a dynamic timedelta, you can use one of the following:
1. With a while loop
from datetime import datetime, timedelta
from typing import Generator
def datetime_range(start: datetime, end: datetime, delta: timedelta) -> Generator[datetime, None, None]:
    while start <= end:
        yield start
        start += delta
2. With a for loop
from datetime import datetime, timedelta
from typing import Generator
def datetime_range(start: datetime, end: datetime, delta: timedelta) -> Generator[datetime, None, None]:
    delta_units = int((end - start) / delta)
    for _ in range(delta_units + 1):
        yield start
        start += delta
3. If you are using async/await
from datetime import datetime, timedelta
from typing import AsyncGenerator
async def datetime_range(start: datetime, end: datetime, delta: timedelta) -> AsyncGenerator[datetime, None]:
    delta_units = int((end - start) / delta)
    for _ in range(delta_units + 1):
        yield start
        start += delta
4. List comprehension
from datetime import datetime, timedelta
from typing import List
def datetime_range(start: datetime, end: datetime, delta: timedelta) -> List[datetime]:
    delta_units = int((end - start) / delta)
    return [start + (delta * index) for index in range(delta_units + 1)]
Solutions 1 and 2 can then be used like this:
start = datetime(2020, 10, 10, 10, 00)
end = datetime(2022, 10, 10, 18, 00)
delta = timedelta(minutes=30)
result = [time_part for time_part in datetime_range(start, end, delta)]
# or
for time_part in datetime_range(start, end, delta):
    print(time_part)
The third solution returns an async generator object, so it can only be consumed inside an async context, for example:
import asyncio
async def main():
    start = datetime(2020, 10, 10, 10, 00)
    end = datetime(2022, 10, 10, 18, 00)
    delta = timedelta(minutes=30)
    result = [time_part async for time_part in datetime_range(start, end, delta)]
    # or
    async for time_part in datetime_range(start, end, delta):
        print(time_part)
asyncio.run(main())
The benefit of the solutions above is that they all accept a dynamic timedelta. This can be very useful when you do not know in advance which time delta you will need.
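To make the "dynamic timedelta" point concrete, here is a small sketch in which the same generator is driven with two different deltas over a short window. The dates are arbitrary sample values, and the generator is the while-loop variant with a local variable instead of reassigning start.

```python
from datetime import datetime, timedelta
from typing import Iterator

def datetime_range(start: datetime, end: datetime, delta: timedelta) -> Iterator[datetime]:
    """Yield datetimes from start to end (inclusive) in steps of delta."""
    current = start
    while current <= end:
        yield current
        current += delta

start = datetime(2020, 10, 10, 10, 0)
end = datetime(2020, 10, 10, 12, 0)

# The step is decided by the caller, not baked into the function.
half_hours = list(datetime_range(start, end, timedelta(minutes=30)))
hours = list(datetime_range(start, end, timedelta(hours=1)))

print(len(half_hours), len(hours))  # 5 3
```

Note that the while loop never terminates if delta is zero or negative, so callers may want to validate it first.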
A slightly different approach to reversible steps: store the range arguments in a tuple.
from datetime import timedelta
def date_range(start, stop, step=1, inclusive=False):
    day_count = (stop - start).days
    if inclusive:
        day_count += 1
    if step > 0:
        range_args = (0, day_count, step)
    elif step < 0:
        range_args = (day_count - 1, -1, step)
    else:
        raise ValueError("date_range(): step arg must be non-zero")
    for i in range(*range_args):
        yield start + timedelta(days=i)
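A quick usage sketch may help show what the sign of step and the inclusive flag do. The function is repeated here only so that the snippet runs on its own, and the dates are arbitrary sample values.

```python
from datetime import date, timedelta

def date_range(start, stop, step=1, inclusive=False):
    # Same logic as the function above, repeated so the snippet runs standalone.
    day_count = (stop - start).days
    if inclusive:
        day_count += 1
    if step > 0:
        range_args = (0, day_count, step)
    elif step < 0:
        range_args = (day_count - 1, -1, step)
    else:
        raise ValueError("date_range(): step arg must be non-zero")
    for i in range(*range_args):
        yield start + timedelta(days=i)

start, stop = date(2021, 1, 1), date(2021, 1, 5)  # arbitrary sample dates

forward = [d.day for d in date_range(start, stop)]
backward = [d.day for d in date_range(start, stop, step=-1)]
every_other = [d.day for d in date_range(start, stop, step=-2, inclusive=True)]

print(forward)      # [1, 2, 3, 4]
print(backward)     # [4, 3, 2, 1]
print(every_other)  # [5, 3, 1]
```

A negative step walks the same day offsets from the far end back towards start, so start and stop keep their meaning and do not need to be swapped.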