Basically, I want my script to pause between 4 and 5 AM. The only way to do this I've come up with so far is this:
seconds_into_day = time.time() % (60*60*24)
if 60*60*4 < seconds_into_day < 60*60*5:
    sleep(time_left_till_5am)
Any "proper" way to do this? Aka some built-in function/lib for calculating time; rather than just using seconds all the time?
You want datetime
The datetime module supplies classes for manipulating dates and times in both simple and complex ways
If you use date.hour from datetime.now() you'll get the current hour:
from datetime import datetime
datetimenow = datetime.now()
if datetimenow.hour in range(4, 5):
    sleep(time_left_till_5am)
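You can calculate time_left_till_5am by multiplying 59 - datetimenow.minute by 60 and adding 60 - datetimenow.second. (Using 60 - datetimenow.minute would overshoot by a full minute, because the seconds term already covers the current partial minute.)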
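A quick way to sanity-check the remaining time is to count the seconds already elapsed in the current hour and subtract them from 3600. This is only a minimal sketch; seconds_left_in_hour is a hypothetical helper name, not something datetime provides, and it ignores microseconds.

```python
from datetime import datetime

def seconds_left_in_hour(now):
    """Seconds from `now` until the top of the next hour (microseconds ignored)."""
    elapsed = now.minute * 60 + now.second
    return 3600 - elapsed

print(seconds_left_in_hour(datetime(2010, 7, 5, 4, 30, 0)))   # 1800
print(seconds_left_in_hour(datetime(2010, 7, 5, 4, 59, 59)))  # 1
```

Between 4 and 5 AM this value is exactly the time left until 5 AM.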
Python has a built-in datetime library: http://docs.python.org/library/datetime.html
This should probably get you what you're after:
import datetime as dt
from time import sleep
now = dt.datetime.now()
if now.hour >= 4 and now.hour < 5:
    sleep((59 - now.minute)*60 + (60 - now.second))
OK, the above works, but here's the purer, less error-prone solution (and what I was originally thinking of but suddenly forgot how to do):
import datetime as dt
from time import sleep
now = dt.datetime.now()
pause = dt.datetime(now.year, now.month, now.day, 4)
start = dt.datetime(now.year, now.month, now.day, 5)
if now >= pause and now < start:
    sleep((start - now).seconds)
That's where my original "timedelta" comment came from -- what you get from subtracting two datetime objects is a timedelta object (which in this case we pull the 'seconds' attribute from).
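To make the timedelta point concrete, here is a small sketch comparing the seconds attribute with total_seconds(). The dates are made up for illustration. The two agree for a positive difference of less than a day, but seconds alone is misleading once the difference is negative or longer than a day, because it leaves out the days component.

```python
from datetime import datetime

start = datetime(2010, 7, 5, 5, 0)
now = datetime(2010, 7, 5, 4, 30)

delta = start - now
print(delta.seconds)          # 1800
print(delta.total_seconds())  # 1800.0

# A negative difference is normalised to days=-1 plus a positive seconds part.
late = datetime(2010, 7, 5, 5, 30)
negative = start - late
print(negative.seconds)          # 84600
print(negative.total_seconds())  # -1800.0
```

In the snippet above the subtraction only happens inside the 4 to 5 AM check, so seconds is safe there.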
The following code covers the more general case where a script needs to pause during any fixed window of less than 24 hours duration. Example: must sleep between 11:00 PM and 01:00 AM.
import datetime as dt
def sleep_duration(sleep_from, sleep_to, now=None):
    # sleep_* are datetime.time objects
    # now is a datetime.datetime object
    if now is None:
        now = dt.datetime.now()
    duration = 0
    lo = dt.datetime.combine(now, sleep_from)
    hi = dt.datetime.combine(now, sleep_to)
    if lo <= now < hi:
        duration = (hi - now).seconds
    elif hi < lo:
        if now >= lo:
            duration = (hi + dt.timedelta(hours=24) - now).seconds
        elif now < hi:
            duration = (hi - now).seconds
    return duration
tests = [
(4, 5, 3, 30),
(4, 5, 4, 0),
(4, 5, 4, 30),
(4, 5, 5, 0),
(4, 5, 5, 30),
(23, 1, 0, 0),
(23, 1, 0, 30),
(23, 1, 0, 59),
(23, 1, 1, 0),
(23, 1, 1, 30),
(23, 1, 22, 30),
(23, 1, 22, 59),
(23, 1, 23, 0),
(23, 1, 23, 1),
(23, 1, 23, 59),
]
for hfrom, hto, hnow, mnow in tests:
    sfrom = dt.time(hfrom)
    sto = dt.time(hto)
    dnow = dt.datetime(2010, 7, 5, hnow, mnow)
    print(sfrom, sto, dnow, sleep_duration(sfrom, sto, dnow))
and here's the output:
04:00:00 05:00:00 2010-07-05 03:30:00 0
04:00:00 05:00:00 2010-07-05 04:00:00 3600
04:00:00 05:00:00 2010-07-05 04:30:00 1800
04:00:00 05:00:00 2010-07-05 05:00:00 0
04:00:00 05:00:00 2010-07-05 05:30:00 0
23:00:00 01:00:00 2010-07-05 00:00:00 3600
23:00:00 01:00:00 2010-07-05 00:30:00 1800
23:00:00 01:00:00 2010-07-05 00:59:00 60
23:00:00 01:00:00 2010-07-05 01:00:00 0
23:00:00 01:00:00 2010-07-05 01:30:00 0
23:00:00 01:00:00 2010-07-05 22:30:00 0
23:00:00 01:00:00 2010-07-05 22:59:00 0
23:00:00 01:00:00 2010-07-05 23:00:00 7200
23:00:00 01:00:00 2010-07-05 23:01:00 7140
23:00:00 01:00:00 2010-07-05 23:59:00 3660
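If all you need is a yes/no answer to "are we inside the window right now?", the wrap-around logic can be reduced to a comparison on datetime.time objects. This is a rough sketch only; in_window is a hypothetical name, and it treats the window as half-open, [start, end).

```python
import datetime as dt

def in_window(t, start, end):
    """True if time-of-day t falls in [start, end), allowing wrap past midnight."""
    if start <= end:
        return start <= t < end
    # Window crosses midnight, e.g. 23:00 -> 01:00
    return t >= start or t < end

print(in_window(dt.time(4, 30), dt.time(4), dt.time(5)))    # True
print(in_window(dt.time(0, 30), dt.time(23), dt.time(1)))   # True
print(in_window(dt.time(22, 59), dt.time(23), dt.time(1)))  # False
```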
When dealing with dates and times in Python, I still prefer mxDateTime over the built-in datetime module. Although the built-in module has improved greatly over the years, it is still rather awkward and lacking in comparison. mxDateTime is free to download and use, and it makes datetime math much easier.
import mx.DateTime as dt
from time import sleep
now = dt.now()
if 4 <= now.hour < 5:
    stop = dt.RelativeDateTime(hour=5, minute=0, second=0)
    secs_remaining = ((now + stop) - now).seconds
    sleep(secs_remaining)
Related
I have the following dataset:
date event next_event duration_Minutes
2021-09-09 22:30:00 1 2021-09-09 23:00:00 30
2021-09-09 23:00:00 2 2021-09-09 23:10:00 10
2021-09-09 23:10:00 1 2021-09-09 23:50:00 40
2021-09-09 23:50:00 4 2021-09-10 00:50:00 60
2021-09-10 00:50:00 4 2021-09-12 00:50:00 2880
The main problem is that I would like to split multi-day events into separate events, one per day. For example, I would like the duration from 2021-09-09 23:50:00 until 2021-09-10 00:00:00, then the duration from 2021-09-10 00:00:00 to 2021-09-10 00:50:00, and so on. This would be useful because afterwards I need to group the events by day and calculate the duration of each event per day, so I want to handle the cases where the day changes between events.
I would like to obtain something like this:
date event next_event duration_Minutes
2021-09-09 22:30:00 1 2021-09-09 23:00:00 30
2021-09-09 23:00:00 2 2021-09-09 23:10:00 10
2021-09-09 23:10:00 1 2021-09-09 23:50:00 40
2021-09-09 23:50:00 4 2021-09-10 00:00:00 10
2021-09-10 00:00:00 4 2021-09-10 00:50:00 50
2021-09-10 00:50:00 4 2021-09-11 00:00:00 1390
2021-09-11 00:00:00 4 2021-09-12 00:00:00 1440
2021-09-12 00:00:00 4 2021-09-12 00:50:00 50
It should be able to handle situations in which we don't have an event for an entire day or more like in the example.
My current solution for now is:
first_record_hour_ts = df.index.floor('H')[0]
last_record_hour_ts = df.index.floor('H')[-1]
# Create a series from the first to the last date containing Nan
df_to_join = pd.Series(np.nan, index=pd.date_range(first_record_hour_ts, last_record_hour_ts, freq='H'))
df_to_join = pd.DataFrame(df_to_join)
# Concatenate with current status dataframe
df = pd.concat([df, df_to_join[~df_to_join.index.isin(df.index)]]).sort_index()
# Forward fill the NaNs
df.fillna(method='ffill', inplace=True)
df['next_event'] = df.index.shift(-1)
# Calculate the delta between the 2 status
df['duration'] = df['next_event'] - df.index
# Convert into minutes
df['duration_Minutes'] = df['duration'].apply(lambda x: x.total_seconds() // 60)
This doesn't solve the problem exactly, but I think it may achieve my goal, which is to be able to group by event and by day at the end.
OK, the code below looks a bit long, and there's certainly a better, more efficient, or shorter way of doing this. But I think it's reasonably simple to follow.
split_datetime_span_by_day below takes two dates: start_date and end_date. In your case, it would be date and next_event from your source data.
The function then checks whether that time period (start -> end) spans over midnight. If it doesn't, it returns the start date, the end date, and the time period in seconds. If it does span over midnight, it creates a new segment (start -> midnight), and then calls itself again (i.e. recurses), and the process continues until the time period does not span over midnight.
Just a note: the returned segment list is made up of tuples of (start, end, nmb_seconds). I'm returning the number of seconds, not the number of minutes as in your question, because I didn't know how you wanted to round the seconds (up, down, etc.). That's left as an exercise for the reader :-)
from datetime import datetime, timedelta
def split_datetime_span_by_day(start_date, end_date, split_segments=None):
    assert start_date < end_date  # sanity check
    # when is the next midnight after start_date?
    # adapted from https://ispycode.com/Blog/python/2016-07/Get-Midnight-Today
    start_next_midnight = datetime.combine(start_date, datetime.min.time()) + timedelta(days=1)
    if split_segments is None:
        split_segments = []
    if end_date < start_next_midnight:
        # end date is before next midnight, no split necessary
        return split_segments + [(
            start_date,
            end_date,
            (end_date - start_date).total_seconds()
        )]
    # otherwise, split at next midnight...
    split_segments += [(
        start_date,
        start_next_midnight,
        (start_next_midnight - start_date).total_seconds()
    )]
    if (end_date - start_next_midnight).total_seconds() > 0:
        # ...and recurse to get next segment
        return split_datetime_span_by_day(
            start_date=start_next_midnight,
            end_date=end_date,
            split_segments=split_segments
        )
    else:
        # case where start_next_midnight == end_date i.e. end_date is midnight
        # don't split & create a 0 second segment
        return split_segments
# test case:
start_date = datetime.strptime('2021-09-12 00:00:00', '%Y-%m-%d %H:%M:%S')
end_date = datetime.strptime('2021-09-14 01:00:00', '%Y-%m-%d %H:%M:%S')
print(split_datetime_span_by_day(start_date=start_date, end_date=end_date))
# returned values:
# [
# (datetime.datetime(2021, 9, 12, 0, 0), datetime.datetime(2021, 9, 13, 0, 0), 86400.0),
# (datetime.datetime(2021, 9, 13, 0, 0), datetime.datetime(2021, 9, 14, 0, 0), 86400.0),
# (datetime.datetime(2021, 9, 14, 0, 0), datetime.datetime(2021, 9, 14, 1, 0), 3600.0)
# ]
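For comparison, the same split can be written as a loop instead of recursion. This is just a sketch of the idea; split_by_day_iter is a hypothetical name, and it reports unrounded minutes as floats so you can choose your own rounding. The input below is the multi-day row from the question.

```python
from datetime import datetime, timedelta

def split_by_day_iter(start, end):
    """Yield (segment_start, segment_end, minutes) without recursion."""
    while start < end:
        # Midnight at the start of the day after `start`
        next_midnight = datetime.combine(start.date(), datetime.min.time()) + timedelta(days=1)
        stop = min(end, next_midnight)
        yield start, stop, (stop - start).total_seconds() / 60
        start = stop

rows = list(split_by_day_iter(datetime(2021, 9, 10, 0, 50), datetime(2021, 9, 12, 0, 50)))
for seg_start, seg_end, minutes in rows:
    print(seg_start, seg_end, minutes)
# 2021-09-10 00:50:00 2021-09-11 00:00:00 1390.0
# 2021-09-11 00:00:00 2021-09-12 00:00:00 1440.0
# 2021-09-12 00:00:00 2021-09-12 00:50:00 50.0
```

Because the loop stops as soon as start reaches end, an end date that falls exactly on midnight does not produce a zero-length segment.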
This is my dataframe.
Start_hour End_date
23:58:00 00:26:00
23:56:00 00:01:00
23:18:00 23:36:00
How can I get in a new column the difference (in minutes) between these two columns?
>>> from datetime import datetime
>>>
>>> before = datetime.now()
>>> print('wait for more than 1 minute')
wait for more than 1 minute
>>> after = datetime.now()
>>> td = after - before
>>>
>>> td
datetime.timedelta(seconds=98, microseconds=389121)
>>> td.total_seconds()
98.389121
>>> td.total_seconds() / 60
1.6398186833333335
Then you can round it or use it as-is.
You can do something like this:
import pandas as pd
df = pd.DataFrame({
'Start_hour': ['23:58:00', '23:56:00', '23:18:00'],
'End_date': ['00:26:00', '00:01:00', '23:36:00']}
)
df['Start_hour'] = pd.to_datetime(df['Start_hour'])
df['End_date'] = pd.to_datetime(df['End_date'])
df['diff'] = df.apply(
lambda row: (row['End_date']-row['Start_hour']).seconds / 60,
axis=1
)
print(df)
Start_hour End_date diff
0 2021-03-29 23:58:00 2021-03-29 00:26:00 28.0
1 2021-03-29 23:56:00 2021-03-29 00:01:00 5.0
2 2021-03-29 23:18:00 2021-03-29 23:36:00 18.0
You can also rearrange your dates as string again if you like:
df['Start_hour'] = df['Start_hour'].apply(lambda x: x.strftime('%H:%M:%S'))
df['End_date'] = df['End_date'].apply(lambda x: x.strftime('%H:%M:%S'))
print(df)
Output:
Start_hour End_date diff
0 23:58:00 00:26:00 28.0
1 23:56:00 00:01:00 5.0
2 23:18:00 23:36:00 18.0
Short answer:
df['interval'] = df['End_date'] - df['Start_hour']
df.loc[df['End_date'] < df['Start_hour'], 'interval'] += timedelta(hours=24)
Why so:
You are probably trying to solve the problem that your Start_hour and End_date values sometimes belong to different days, which is why you can't just subtract one from the other.
If your time window never exceeds a 24-hour interval, you can use some modular arithmetic to deal with the 23:59:59 - 00:00:00 border:
if End_date < Start_hour, this always means End_date belongs to the next day
this implies that if End_date - Start_hour < 0, we should add 24 hours to End_date to find the actual difference
The final formula is:
if rec['Start_hour'] < rec['End_date']:
    offset = timedelta(0)
else:
    offset = timedelta(hours=24)
rec['delta'] = offset + rec['End_date'] - rec['Start_hour']
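The same wrap-around rule can also be expressed with the modulo operator, since Python's timedelta supports % against another timedelta. The sketch below is an alternative formulation, not the snippet from this answer; wrapped_delta is a hypothetical name, and it assumes the real interval is shorter than 24 hours.

```python
from datetime import datetime, timedelta

DAY = timedelta(days=1)

def wrapped_delta(start, end):
    """Difference end - start, assuming the interval is shorter than 24 hours."""
    return (end - start) % DAY

print(wrapped_delta(datetime(1, 1, 1, 23, 58), datetime(1, 1, 1, 0, 26)))   # 0:28:00
print(wrapped_delta(datetime(1, 1, 1, 23, 58), datetime(1, 1, 1, 23, 59)))  # 0:01:00
```

One difference to be aware of: when both values are equal, the modulo version returns zero rather than 24 hours.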
To do the same with pandas.DataFrame we need to change code accordingly. And
that's how we get the snippet from the beginning of the answer.
from datetime import datetime, timedelta
import pandas as pd
df = pd.DataFrame([
{'Start_hour': datetime(1, 1, 1, 23, 58, 0), 'End_date': datetime(1, 1, 1, 0, 26, 0)},
{'Start_hour': datetime(1, 1, 1, 23, 58, 0), 'End_date': datetime(1, 1, 1, 23, 59, 0)},
])
# ...
df['interval'] = df['End_date'] - df['Start_hour']
df.loc[df['End_date'] < df['Start_hour'], 'interval'] += timedelta(hours=24)
> df
Start_hour End_date interval
0 0001-01-01 23:58:00 0001-01-01 00:26:00 0 days 00:28:00
1 0001-01-01 23:58:00 0001-01-01 23:59:00 0 days 00:01:00
import datetime
dt = datetime.datetime(2020, 7, 1)
t = datetime.time(12, 34)
final = datetime.datetime.combine(dt.date(), t)
loop = 1
while loop == 1:
    print(final)
I want the loop to print a different time on each pass, but it keeps printing the same time over and over; the time isn't changing.
An illustration of my comments, showing what you could do:
from datetime import datetime
t0 = datetime(2020,9,18,0,0,0)
for i in range(5):
    print(t0.replace(hour=i+1))
# prints
# 2020-09-18 01:00:00
# 2020-09-18 02:00:00
# 2020-09-18 03:00:00
# 2020-09-18 04:00:00
# 2020-09-18 05:00:00
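Note that replace(hour=...) raises a ValueError once the hour goes past 23. If the time should keep advancing across midnight, one option is to add a timedelta on each pass instead. A minimal sketch, with a starting time and step chosen purely for illustration:

```python
from datetime import datetime, timedelta

current = datetime(2020, 9, 18, 22, 0, 0)
step = timedelta(hours=1)
for _ in range(4):
    print(current)
    current += step  # rolls over to the next day automatically
# prints
# 2020-09-18 22:00:00
# 2020-09-18 23:00:00
# 2020-09-19 00:00:00
# 2020-09-19 01:00:00
```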
I have a CSV file that contains start-time and end-time for sessions.
I would like to understand how I can do End-time - Start-time to get the duration of a session.
So far I have this and it works
import datetime as dt
start_time = "2016-11-09 18:06:17"
end_time = "2016-11-09 18:21:07"
start_dt = dt.datetime.strptime(start_time, '%Y-%m-%d %H:%M:%S')
end_dt = dt.datetime.strptime(end_time, '%Y-%m-%d %H:%M:%S')
diff = (end_dt - start_dt)
duration = diff.seconds/60
print(duration)
but I want to do it for the whole column at once.
To import from a csv and then manipulate the date, pandas is the way to go. Since the only info you gave about your data was start and end time, I will show that.
Code:
import pandas as pd
df = pd.read_csv(data, parse_dates=['start_time', 'end_time'],
infer_datetime_format=True)
print(df)
df['time_delta'] = df.end_time.values - df.start_time.values
print(df.time_delta)
Test Data:
from io import StringIO
data = StringIO(u'\n'.join([x.strip() for x in """
start_time,end_time,a_number
2013-09-19 03:00:00,2013-09-19 04:00:00,221.0797
2013-09-19 04:00:00,2013-09-19 05:00:00,220.5083
2013-09-24 03:00:00,2013-09-24 05:00:00,221.7733
2013-09-24 04:00:00,2013-09-24 06:00:00,221.2493
""".split('\n')[1:-1]]))
Results:
start_time end_time a_number
0 2013-09-19 03:00:00 2013-09-19 04:00:00 221.0797
1 2013-09-19 04:00:00 2013-09-19 05:00:00 220.5083
2 2013-09-24 03:00:00 2013-09-24 05:00:00 221.7733
3 2013-09-24 04:00:00 2013-09-24 06:00:00 221.2493
0 01:00:00
1 01:00:00
2 02:00:00
3 02:00:00
Name: time_delta, dtype: timedelta64[ns]
It seems you are trying to run diff against strings, instead of datetime values.
How about something like this?
from datetime import datetime
start_time = datetime(2016, 11, 9, 18, 6, 17)
end_time = datetime(2016, 11, 9, 18, 21, 7)
diff = end_time - start_time
print(diff.total_seconds() / 60)
I think this should work.
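If you would rather stay with the standard library, the same subtraction can be applied to every row with the csv module. This is only a sketch: the StringIO object stands in for your real file, and the column names start_time and end_time are assumptions about your CSV header.

```python
import csv
from datetime import datetime
from io import StringIO

FMT = '%Y-%m-%d %H:%M:%S'

# Stand-in for the real CSV file; the column names are assumptions.
data = StringIO(
    "start_time,end_time\n"
    "2016-11-09 18:06:17,2016-11-09 18:21:07\n"
    "2016-11-09 23:50:00,2016-11-10 00:20:00\n"
)

durations = []
for row in csv.DictReader(data):
    start = datetime.strptime(row['start_time'], FMT)
    end = datetime.strptime(row['end_time'], FMT)
    # total_seconds() also handles sessions that cross midnight or last over a day
    durations.append((end - start).total_seconds() / 60)

print(durations)
```

Each entry in durations is the session length in minutes, in row order.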
I have the following model:
class Deal(models.Model):
    start_date = models.DateTimeField()
    end_date = models.DateTimeField()
I want to iterate through a given year
year = '2010'
For each month in year I want to execute a query to see if the month is between start_date and end_date.
How can I iterate through a given year? Use the month to do a query?
SELECT * FROM deals WHERE month BETWEEN start_date AND end_date
The outcome will tell me if I had a deal in January 2010 and/or in February 2010, etc.
How can I iterate through a given year?
You could use python-dateutil's rrule. Install with command pip install python-dateutil.
Example usage:
In [1]: from datetime import datetime
In [2]: from dateutil import rrule
In [3]: list(rrule.rrule(rrule.MONTHLY, dtstart=datetime(2010, 1, 1, 0, 1), count=12))
Out[3]:
[datetime.datetime(2010, 1, 1, 0, 1),
datetime.datetime(2010, 2, 1, 0, 1),
datetime.datetime(2010, 3, 1, 0, 1),
datetime.datetime(2010, 4, 1, 0, 1),
datetime.datetime(2010, 5, 1, 0, 1),
datetime.datetime(2010, 6, 1, 0, 1),
datetime.datetime(2010, 7, 1, 0, 1),
datetime.datetime(2010, 8, 1, 0, 1),
datetime.datetime(2010, 9, 1, 0, 1),
datetime.datetime(2010, 10, 1, 0, 1),
datetime.datetime(2010, 11, 1, 0, 1),
datetime.datetime(2010, 12, 1, 0, 1)]
Use the month to do a query?
You could iterate over months like this:
In [1]: from dateutil import rrule
In [2]: from datetime import datetime
In [3]: months = list(rrule.rrule(rrule.MONTHLY, dtstart=datetime(2010, 1, 1, 0, 1), count=13))
In [4]: i = 0
In [5]: while i < len(months) - 1:
   ...:     print("start_date", months[i], "end_date", months[i+1])
   ...:     i += 1
...:
start_date 2010-01-01 00:01:00 end_date 2010-02-01 00:01:00
start_date 2010-02-01 00:01:00 end_date 2010-03-01 00:01:00
start_date 2010-03-01 00:01:00 end_date 2010-04-01 00:01:00
start_date 2010-04-01 00:01:00 end_date 2010-05-01 00:01:00
start_date 2010-05-01 00:01:00 end_date 2010-06-01 00:01:00
start_date 2010-06-01 00:01:00 end_date 2010-07-01 00:01:00
start_date 2010-07-01 00:01:00 end_date 2010-08-01 00:01:00
start_date 2010-08-01 00:01:00 end_date 2010-09-01 00:01:00
start_date 2010-09-01 00:01:00 end_date 2010-10-01 00:01:00
start_date 2010-10-01 00:01:00 end_date 2010-11-01 00:01:00
start_date 2010-11-01 00:01:00 end_date 2010-12-01 00:01:00
start_date 2010-12-01 00:01:00 end_date 2011-01-01 00:01:00
Replace the "print" statement with a query. Feel free to adapt it to your needs.
There is probably a better way but that could do the job.
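If you don't want the extra dependency, the month boundaries can also be built with plain datetime, and the "was a deal active in this month?" test becomes an interval-overlap check. This is a sketch under assumptions: month_ranges and overlaps are hypothetical helper names, and the sample deal is made up. In Django, the equivalent filter would presumably be Deal.objects.filter(start_date__lt=month_end, end_date__gte=month_start), but that is my translation of the check, not something tested against your model.

```python
from datetime import datetime

def month_ranges(year):
    """Yield (month_start, next_month_start) pairs for each month of `year`."""
    for month in range(1, 13):
        start = datetime(year, month, 1)
        end = datetime(year + 1, 1, 1) if month == 12 else datetime(year, month + 1, 1)
        yield start, end

def overlaps(deal_start, deal_end, month_start, month_end):
    """True if the deal is active at any moment within [month_start, month_end)."""
    return deal_start < month_end and deal_end >= month_start

# Hypothetical deal running from mid-January to early March 2010.
deal = (datetime(2010, 1, 15), datetime(2010, 3, 2))
active = [start.month for start, end in month_ranges(2010) if overlaps(*deal, start, end)]
print(active)  # [1, 2, 3]
```

Since year is a string in the question, convert it first with int(year).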