Round seconds of a date time column - python

I am trying to round seconds in a dataframe column which contains date and time in the format 01Jan2019:11:03:57.541.
I want to get the result as 01Jan2019:11:03:58.
The column has object dtype.
Could someone please help?

Convert the column with to_datetime, round it with Series.dt.round, and finally convert back to strings with strftime:
df = pd.DataFrame({'date':['01Jan2019:11:03:57.541','01Jan2019:11:03:57.241']})
print (df)
date
0 01Jan2019:11:03:57.541
1 01Jan2019:11:03:57.241
df['date'] = (pd.to_datetime(df['date'], format='%d%b%Y:%H:%M:%S.%f')
                .dt.round('s')
                .dt.strftime('%d%b%Y:%H:%M:%S'))
print (df)
date
0 01Jan2019:11:03:58
1 01Jan2019:11:03:57
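If it is not obvious whether you need rounding or truncation, a small self-contained sketch comparing round, floor and ceil on the same two values may help. It assumes an English locale for %b and uses the lowercase 's' alias, which newer pandas versions prefer over 'S'.

```python
import pandas as pd

# Same sample values as above, parsed into real datetimes
s = pd.to_datetime(
    pd.Series(['01Jan2019:11:03:57.541', '01Jan2019:11:03:57.241']),
    format='%d%b%Y:%H:%M:%S.%f',
)

fmt = '%d%b%Y:%H:%M:%S'
rounded = s.dt.round('s').dt.strftime(fmt)  # nearest second
floored = s.dt.floor('s').dt.strftime(fmt)  # always down (truncate)
ceiled = s.dt.ceil('s').dt.strftime(fmt)    # always up

print(pd.DataFrame({'round': rounded, 'floor': floored, 'ceil': ceiled}))
```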

Related

Use multiple dates in pd.date_range

I have a range of dates in the date column of a dataframe. The dates are scattered, e.g. 1st Feb, 5th Feb, 11th Feb etc.
I want to use pd.date_range with a frequency of one minute on every date in this column, so the start argument will be the date and the end argument will be the date + datetime.timedelta(days=1).
I'm struggling to use the apply function for this. Can someone help me with it, or is there some other function I can use here?
I don't want to use a for loop because the number of dates will be HUGE.
I tried this:
df.date.apply(lamda x : pd.date_range(start=df['date'],end = df['date']+datetime.timedelta(days=1),freq="1min"),axis =1)
but I'm getting an error.
Thanks in advance
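Use x inside the lambda function instead of df['date'], and remove axis=1 (Series.apply has no axis parameter). Also note that the keyword is spelled lambda, not lamda: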
df = pd.DataFrame({'date':pd.date_range('2021-11-26', periods=3)})
print (df)
date
0 2021-11-26
1 2021-11-27
2 2021-11-28
s = df['date'].apply(lambda x:pd.date_range(start=x,end=x+pd.Timedelta(days=1),freq="1min"))
print (s)
0 DatetimeIndex(['2021-11-26 00:00:00', '2021-11...
1 DatetimeIndex(['2021-11-27 00:00:00', '2021-11...
2 DatetimeIndex(['2021-11-28 00:00:00', '2021-11...
Name: date, dtype: object
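Since the number of dates is expected to be large, here is a sketch of a different technique that avoids apply entirely: NumPy broadcasting of each date against a fixed set of minute offsets. The names offsets, grid and flat are only for illustration, and this assumes you want one flat Series of timestamps rather than one DatetimeIndex per row.

```python
import pandas as pd

df = pd.DataFrame({'date': pd.date_range('2021-11-26', periods=3)})

# 0 min, 1 min, ..., 1 day: 1441 offsets, both endpoints included,
# matching date_range(start=x, end=x + 1 day, freq='1min')
offsets = pd.timedelta_range(start='0min', end='1D', freq='1min')

# Broadcast: (n_dates, 1) + (1, n_offsets) -> (n_dates, n_offsets)
grid = df['date'].to_numpy()[:, None] + offsets.to_numpy()[None, :]

# Flatten row by row into a single datetime Series
flat = pd.Series(grid.ravel())
print(flat.head())
print(len(flat))
```

Each row of grid holds the minutes for one input date, so you can also keep the 2-D array if you need to know which date a timestamp came from.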

Pandas DateTime for Month

I have month column with values formatted as: 2019M01
To find the seasonality I need this formatted into Pandas DateTime format.
How to format 2019M01 into datetime so that I can use it for my seasonality plotting?
Thanks.
Use to_datetime with the format parameter:
df = pd.DataFrame({'date': ['2019M01', '2019M03', '2019M04']})
print (df)
date
0 2019M01
1 2019M03
2 2019M04
df['date'] = pd.to_datetime(df['date'], format='%YM%m')
print (df)
date
0 2019-01-01
1 2019-03-01
2 2019-04-01
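For the seasonality part, a minimal sketch of pulling out the pieces you would typically group or plot by once the column is a real datetime. The column names year, month and month_name are just illustrative choices, and month_name() is assumed to use its default English locale.

```python
import pandas as pd

df = pd.DataFrame({'date': ['2019M01', '2019M03', '2019M04']})
df['date'] = pd.to_datetime(df['date'], format='%YM%m')

# Components that are handy for seasonal grouping or plotting
df['year'] = df['date'].dt.year
df['month'] = df['date'].dt.month
df['month_name'] = df['date'].dt.month_name()
print(df)
```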

Split Date Time string (not in usual format) and pull out month

I have a dataframe that has a date time string but is not in traditional date time format. I would like to separate out the date from the time into two separate columns. And then eventually also separate out the month.
This is what the date/time string looks like: 2019-03-20T16:55:52.981-06:00
>>> df.head()
Date Score
2019-03-20T16:55:52.981-06:00 10
2019-03-07T06:16:52.174-07:00 9
2019-06-17T04:32:09.749-06:00 1
I tried this but got a type error:
df['Month'] = pd.DatetimeIndex(df['Date']).month
This can be done just using pandas itself. You can first convert the Date column to datetime by passing utc = True:
df['Date'] = pd.to_datetime(df['Date'], utc = True)
And then just extract the month using dt.month:
df['Month'] = df['Date'].dt.month
Output:
Date Score Month
0 2019-03-20 22:55:52.981000+00:00 10 3
1 2019-03-07 13:16:52.174000+00:00 9 3
2 2019-06-17 10:32:09.749000+00:00 1 6
From the documentation of pd.to_datetime you can see a parameter:
utc : boolean, default None
Return UTC DatetimeIndex if True (converting any tz-aware datetime.datetime objects as well).
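The question also asked for the date and the time in separate columns. A sketch of that, using the three sample rows from the question; the column names Day and Time are my own choice. Note that after utc=True the values are in UTC, so the date and time parts are UTC wall-clock values, not the original local ones.

```python
import pandas as pd

df = pd.DataFrame({
    'Date': ['2019-03-20T16:55:52.981-06:00',
             '2019-03-07T06:16:52.174-07:00',
             '2019-06-17T04:32:09.749-06:00'],
    'Score': [10, 9, 1],
})

# Mixed UTC offsets are normalised to UTC
df['Date'] = pd.to_datetime(df['Date'], utc=True)

df['Day'] = df['Date'].dt.date     # datetime.date objects
df['Time'] = df['Date'].dt.time    # datetime.time objects (UTC wall time)
df['Month'] = df['Date'].dt.month
print(df)
```

Day and Time end up with object dtype, so keep the original Date column around if you still need datetime arithmetic.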

How to convert pandas column to date when column is something like "Jan-18"?

What is an efficient way to convert the column values into dates in "DD-MM-YYYY" format when the values are given like "Feb-15", which needs to become "01-02-2015"? If the value is "Dec-46", it must return "01-12-1946".
You can pass the format '%b-%y' to to_datetime:
In[42]:
df = pd.DataFrame({'date':["Feb-15","Dec-46"]})
df['new_date'] = pd.to_datetime(df['date'], format='%b-%y')
df
Out[42]:
date new_date
0 Feb-15 2015-02-01
1 Dec-46 2046-12-01
Note that the new dtype is datetime64, and you cannot control how it is displayed. If you insist on DD-MM-YYYY, you have to convert to strings using dt.strftime:
In[43]:
df['str_date'] = df['new_date'].dt.strftime('%d-%m-%Y')
df
Out[43]:
date new_date str_date
0 Feb-15 2015-02-01 01-02-2015
1 Dec-46 2046-12-01 01-12-2046
But then you have strings, which are not very useful if you need to perform arithmetic operations or filtering.
EDIT
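The year comes out as 2046 because %y follows the POSIX convention: two-digit years 00-68 are read as 2000-2068 and 69-99 as 1969-1999. datetime64[ns] itself can represent 1946 (its range is roughly 1677 to 2262), so to get '01-12-1946' you have to shift such dates back by 100 years yourself after parsing.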
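One way to get 1946 out of "Dec-46" is to parse with %y as above and then move every date that lands after a chosen cutoff back by a century. This is only a sketch: the cutoff below is an assumption ("nothing in this data is later than 2020") and you would pick it to suit your own data.

```python
import pandas as pd

df = pd.DataFrame({'date': ['Feb-15', 'Dec-46']})
parsed = pd.to_datetime(df['date'], format='%b-%y')  # 'Dec-46' -> 2046-12-01

# Assumed cutoff: anything parsed later than this is really last century
cutoff = pd.Timestamp('2020-01-01')
in_future = parsed > cutoff

# Keep dates up to the cutoff, shift the rest back 100 years
df['new_date'] = parsed.where(~in_future, parsed - pd.DateOffset(years=100))
df['str_date'] = df['new_date'].dt.strftime('%d-%m-%Y')
print(df)
```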

How to group a data frame by a time interval in pandas?

I have a data frame df
Date Mobile_No Amount Time .....
121526 2014-12-24 739637 200.00 9:44:00
121529 2014-12-28 199002 500.00 9:49:44
121531 2014-12-10 813770 100.00 9:50:41
121536 2014-12-09 178795 100.00 9:52:15
121537 2014-12-09 178795 100.00 9:52:24
where Date is of type datetime64 and Time is of type object. I need to group this data frame by a time interval of 5 minutes and by Mobile_No. In my expected output, the last two rows should be counted as one (same Mobile_No and less than 5 minutes apart).
Is there any way to achieve this?
At first I thought I could combine the Date and Time columns into a timestamp, use it as the index and apply pd.TimeGrouper(), but this doesn't seem to work.
>>>import datetime as dt
>>>import pandas as pd
...
>>> df.apply(lambda x: dt.datetime.combine(x['Date'], dt.time(x['Time'])), axis=1)
gives the error
'an integer is required', u'occurred at index 121526'
If that is causing problems, you could instead convert both columns to strings, concatenate them, and parse the result with to_datetime:
df['Time']=df['Time'].astype(str)
df['Date']=df['Date'].astype(str)
df['Timestamp'] = df['Date'] +' ' + df['Time']
df.index = pd.to_datetime(df['Timestamp'], format='%Y-%m-%d %H:%M:%S')
From there you can resample or use pd.Grouper as required.
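A sketch of what the grouping step could look like with pd.Grouper, using the five sample rows from the question. It assumes Date is datetime64 and Time holds 'H:MM:SS' strings, and it builds the timestamp by adding a timedelta instead of concatenating strings. The aggregations (count and sum of Amount) are just examples.

```python
import pandas as pd

df = pd.DataFrame({
    'Date': pd.to_datetime(['2014-12-24', '2014-12-28', '2014-12-10',
                            '2014-12-09', '2014-12-09']),
    'Mobile_No': [739637, 199002, 813770, 178795, 178795],
    'Amount': [200.0, 500.0, 100.0, 100.0, 100.0],
    'Time': ['9:44:00', '9:49:44', '9:50:41', '9:52:15', '9:52:24'],
})

# Date (midnight) + time of day as a timedelta = full timestamp
df['Timestamp'] = df['Date'] + pd.to_timedelta(df['Time'])

# Group by number and by fixed 5-minute bins of the timestamp
grouped = (df.groupby(['Mobile_No', pd.Grouper(key='Timestamp', freq='5min')])['Amount']
             .agg(['count', 'sum'])
             .reset_index())
print(grouped)
```

Keep in mind that these are fixed 5-minute bins (09:50 to 09:55 and so on), so two rows that are less than 5 minutes apart but fall on either side of a bin edge would still end up in different groups.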
