This question already has answers here:
Difference between two days excluding weekends in hours
(3 answers)
Closed 2 years ago.
Here are two really useful questions for datetime comparison in Python:
Calculate Pandas DataFrame Time Difference Between Two Columns in Hours and Minutes
Determine the difference between two DateTimes, only counting opening hours
I have a dataframe in python with two columns:
A                      B
10:00:00 01.01.2019    12:00:00 02.01.2019
I also have opening hours, which are the only hours that should count for the calculation, so not the full 24 hours and possibly not every day. Say my business is open from 10:00:00 to 18:00:00 every day; how can I adjust:
df_time['td'] = df_time['B']-df_time['A']
so that the outcome would be 10 hours in this case? The business is open from Monday to Friday.
Not a full answer, but I would do the following (a rough sketch follows below):
Count the time in the first day: 18:00:00 - df['A'].dt.time
Count the time in the last day: df['B'].dt.time - 10:00:00
Count the business days in between and multiply by 8.
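A rough sketch of that approach, assuming opening hours of 10:00-18:00, Monday to Friday (the helper name business_hours and the clamping details are mine, not the asker's):

import pandas as pd

OPEN, CLOSE, HOURS_PER_DAY = 10, 18, 8

def business_hours(start, end):
    # Clamp each timestamp into its own day's opening window.
    start = min(max(start, start.normalize() + pd.Timedelta(hours=OPEN)),
                start.normalize() + pd.Timedelta(hours=CLOSE))
    end = min(max(end, end.normalize() + pd.Timedelta(hours=OPEN)),
              end.normalize() + pd.Timedelta(hours=CLOSE))
    if start.normalize() == end.normalize():
        return max((end - start).total_seconds() / 3600, 0)
    first_day = CLOSE - start.hour - start.minute / 60   # hours left on the first day
    last_day = end.hour + end.minute / 60 - OPEN         # hours elapsed on the last day
    # Business days strictly between the two dates, at 8 hours each
    # (weekend start/end days are not specially handled in this sketch).
    full_days = max(len(pd.bdate_range(start.normalize(), end.normalize())) - 2, 0)
    return first_day + last_day + full_days * HOURS_PER_DAY

df_time = pd.DataFrame({'A': [pd.Timestamp('2019-01-01 10:00:00')],
                        'B': [pd.Timestamp('2019-01-02 12:00:00')]})
df_time['td'] = [business_hours(a, b) for a, b in zip(df_time['A'], df_time['B'])]
print(df_time['td'])   # 0    10.0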
I have a dataframe with a column of timestamps of individual trades that occurred on BitMEX. I am now trying to work out the difference between each timestamp and the next using
timediff = df6['timestamp'].diff()
Then when I try df6.timediff.isnull().sum() I get a count of one, which is the NaT at the top of the column for the first row.
However, when I then draw a histogram I see many zeros, and on inspecting the dataframe I see many rows with a difference of zero.
Below are the timedeltas I see after doing the .diff(). I also notice that the zero entries no longer display milliseconds.
7463 0 days 00:00:00.342889
7464 0 days 00:01:07.891225
7465 0 days 00:00:00
7466 0 days 00:00:00.038494
7467 0 days 00:00:00.135066
7468 0 days 00:00:00
7469 0 days 00:00:00
7470 0 days 00:00:00
7471 0 days 00:00:00
7472 0 days 00:00:01.122758
7473 0 days 00:00:00.728908
7474 0 days 00:00:13.272938
My question is: how do I find the number of rows where the difference is actually zero, i.e. where the timedelta t(i) - t(i-1) shown above equals zero?
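For the counting itself, a comparison against a zero Timedelta should do it (a sketch, assuming timediff is the diff Series created above):

import pandas as pd

# Number of rows whose difference is exactly zero; NaT compares as False
# and so is not counted.
zero_rows = (timediff == pd.Timedelta(0)).sum()
print(zero_rows)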
This question already has answers here:
Pandas Timedelta in Days
(5 answers)
Closed 3 years ago.
Given two columns in a dataframe that are datetime objects:
Checkin              Checkout
2018-09-13 19:55:00  2018-09-16 13:08:00
I'd like to compute the time difference in days and have it output as an integer to a new column. So far, I've done this but the output also includes seconds.
delta = df['Checkout'] - df['Checkin']
print(delta)
The output however ends up being:
2 days 17:13:00
and is output as a timedelta object. I'd like it to just output as 2, as an integer, in a new column.
How would I go about doing that?
You need .dt.days:
(df['Checkout'] - df['Checkin']).dt.days
Output:
0    2
dtype: int64
I am trying to create a date range using pd.date_range from 2018-01-01 to 2018-01-31, for Mondays, Tuesdays and Wednesdays, from 6 AM to 6 PM, at 12-minute intervals.
Basically, I need an array of datetime objects or strings with a value every 12 minutes for particular days of the week, between particular business hours for the given range of dates. I am not able to use CustomBusinessDay, CustomBusinessHour and freq together to get the desired range of datetime objects.
Any suggestions?
You could use
index = pd.date_range('2018-01-01', '2018-02-01', freq='12min')
index[(index.dayofweek <= 2) & (index.hour >= 6) & (index.hour < 18)]
This question already has answers here:
Add months to a date in Pandas
(4 answers)
How can I get pandas Timestamp offset by certain amount of months?
(1 answer)
Closed 4 years ago.
I have multiple DataFrames, each indexed with timestamps for consecutive months. For example:
1996-01-01 01:00:00
1996-02-01 01:00:00
1996-03-01 01:00:00
1996-04-01 01:00:00
1996-05-01 01:00:00
1996-06-01 01:00:00
I'm trying to create a function where I can add an arbitrary number of rows onto the df, continuing on from whatever the last month happens to be. I tried to solve this by using:
df.iloc[-1].name + pd.Timedelta(1, unit='M')
in a for loop, but this only seems to add 30 days instead of incrementing the month by one. Is there a more reliable way to fetch a pd.Timestamp and add one month to it?
Thank you
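For the month arithmetic itself, pd.DateOffset is the calendar-aware option; a minimal sketch, not the asker's exact frame:

import pandas as pd

last = pd.Timestamp('1996-01-01 01:00:00')   # e.g. df.iloc[-1].name
print(last + pd.DateOffset(months=1))        # 1996-02-01 01:00:00
print(last + pd.Timedelta(30, unit='D'))     # 1996-01-31 01:00:00 -- a fixed 30 days, not "one month"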
I have data on spells (hospital stays), each with a start and end date, but I want to count the number of days spent in hospital per calendar month. Of course, this number can be zero for months that do not appear in any spell. But I cannot simply attribute the length of each spell to its starting month, as longer spells run over into the following month (or more).
Basically, it would suffice for me if I could cut spells at turn-of-month datetimes, getting from the data in the first example to the data in the second:
id  start                end
1   2011-01-01 10:00:00  2011-01-08 16:03:00
2   2011-01-28 03:45:00  2011-02-04 15:22:00
3   2011-03-02 11:04:00  2011-03-05 05:24:00
id  start                end                  month    stay
1   2011-01-01 10:00:00  2011-01-08 16:03:00  2011-01  7
2   2011-01-28 03:45:00  2011-01-31 23:59:59  2011-01  4
2   2011-02-01 00:00:00  2011-02-04 15:22:00  2011-02  4
3   2011-03-02 11:04:00  2011-03-05 05:24:00  2011-03  3
I read up on the Time Series / Date functionality of pandas, but I do not see a straightforward solution to this. How can one accomplish the slicing?
It's simpler than you think: just subtract the dates. The result is a time span. See Add column with number of days between dates in DataFrame pandas
You even get to do this for the entire frame at once:
http://pandas.pydata.org/pandas-docs/stable/generated/pandas.DataFrame.subtract.html
Update, now that I understand the problem better.
Add a new column: take the spell's end date; if the start date falls in a different month, set this new date's day to 01 and its time to 00:00.
This is the cut datetime you can use to compute the portion of the stay attributable to each month: cut - start falls in the first month, end - cut in the second.
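A rough sketch of that recipe, using the example frame from the question (the cut column name is mine; spells longer than two months would need to be cut repeatedly):

import pandas as pd

df = pd.DataFrame({
    'id': [1, 2, 3],
    'start': pd.to_datetime(['2011-01-01 10:00:00', '2011-01-28 03:45:00', '2011-03-02 11:04:00']),
    'end': pd.to_datetime(['2011-01-08 16:03:00', '2011-02-04 15:22:00', '2011-03-05 05:24:00']),
})

# First instant of the end's month; only relevant when a spell crosses a month boundary.
df['cut'] = df['end'].dt.to_period('M').dt.to_timestamp()
same_month = df['start'] >= df['cut']

# Portion of the stay attributable to the start month and to the end month.
df['first_part'] = df['end'].where(same_month, df['cut']) - df['start']
df['second_part'] = (df['end'] - df['cut']).where(~same_month, pd.Timedelta(0))
print(df[['id', 'first_part', 'second_part']])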