Converting String to datetime in pandas - python

I want to convert columns read from Excel from object dtype to datetime.
index    Planned Hours    Actual Hours
0        2:00             2:00
1        1:00             1:00
2        1:45             1:45
...      ...              ...
18676    35:00            35:00
Format of the columns: Planned Hours - object, Actual Hours - object
I used time['Actual Hours'] = pd.to_datetime(time['Actual Hours'], format='%H:%M') and got this error:
time data '35:00' does not match format '%H:%M' (match)
When I change the format to time['Actual Hours'] = pd.to_datetime(time['Actual Hours'], format='%HH:%MM'), I get this error instead:
time data '2:00' does not match format '%HH:%MM' (match)
How can I convert the string values into DateTime without the error?

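You probably want Timedelta, not datetime. A datetime represents an actual point in time, but you seem to be working with durations, and a value such as 35:00 is not a valid time of day, which is why %H rejects it. Appending ':00' turns the H:MM strings into H:MM:SS, which to_timedelta can parse: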
df['Planned Hours'] = pd.to_timedelta(df['Planned Hours'] + ':00')
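A minimal sketch of the same idea applied to both columns, assuming the values are plain H:MM strings as in the question. The sample frame is hypothetical and only mirrors a few of the rows shown above.

```python
import pandas as pd

# Hypothetical sample mirroring the question's columns (values are strings).
df = pd.DataFrame({
    "Planned Hours": ["2:00", "1:00", "1:45", "35:00"],
    "Actual Hours": ["2:00", "1:00", "1:45", "35:00"],
})

# Append ":00" so each value reads as H:MM:SS, then parse it as a duration.
for col in ["Planned Hours", "Actual Hours"]:
    df[col] = pd.to_timedelta(df[col] + ":00")

# Durations above 24 hours roll over into days instead of raising an error.
hours = df["Actual Hours"].dt.total_seconds() / 3600
print(df.dtypes)
print(hours.tolist())
```

Converting to hours with total_seconds() is convenient when the durations need to be summed or compared as plain numbers.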

Related

Python-PANDAS dataframe- Datetime format

I am trying to get a specific format, i.e. day-month-year, e.g. 01-12-2023 12:00:00;00 AM.
In my dataframe I have a column in datetime format, e.g. 2023-12-01.
I converted it to the format I need with strftime. Since I did not want the datatype to change, I then converted the result back to datetime.
After that, the column went back to the default display, i.e. 2023-12-01 instead of 01-12-2023 12:00:00;00 AM.
How can I get the required format without changing the datatype?
I checked a few examples, but they didn't work.
Code:
The code below creates a new column DOB1 from the Month column, which has a datetime dtype:
Dataframe bkp:
Month
2022-05-01
2023-06-01
bkp['DOB1'] = bkp['Month'].dt.strftime('%d-%m-%Y %I:%M:%S %p')
Output:
Month         DOB1
2022-05-01    01-05-2022 12:00:00:00 AM
2023-06-01    01-06-2023 12:00:00:00 AM
bkp.dtypes >>>> Month : datetime64[ns] , DOB1 : object
# Converting back to datetime to retain the original dtype
bkp['DOB1'] = pd.to_datetime(bkp['DOB1'])
Output:
Month         DOB1
2022-05-01    2022-05-01
2023-06-01    2023-06-01
Could anyone suggest how to get the required format and still keep the datetime dtype?
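A hedged sketch of why the round trip loses the format: a datetime64 column stores only timestamps and has no display format of its own, so converting the formatted strings back with pd.to_datetime discards the formatting. One common approach, assumed here rather than taken from the thread, is to keep the datetime column for computation and hold the formatted text in a separate string column. The frame below is a hypothetical rebuild of bkp.

```python
import pandas as pd

# Hypothetical rebuild of the question's frame.
bkp = pd.DataFrame({"Month": pd.to_datetime(["2022-05-01", "2023-06-01"])})

# A datetime64 column stores timestamps only; it carries no display format.
# Keep it for sorting and arithmetic, and render the text in a separate column.
bkp["DOB1"] = bkp["Month"].dt.strftime("%d-%m-%Y %I:%M:%S %p")

print(bkp.dtypes)
print(bkp)
```

Month stays a datetime column, while DOB1 holds the display strings and can be written out as-is when exporting.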

How to convert millisecond Unix timestamp to readable date?

I have the string 1615070997520. It is a Unix timestamp in milliseconds. When I run it through an online converter, it gives me the correct date (Saturday, March 6, 2021 10:49:57.520 PM GMT).
But with this code:
from datetime import datetime
ts = int("1615070997520")
print(datetime.utcfromtimestamp(ts).strftime('%Y-%m-%d %H:%M:%S'))
It raises ValueError: year 53149 is out of range.
Is there any way to convert it into a correct date like yyyy-mm-dd hh:mm:ss.ms using Python?
Try this. utcfromtimestamp expects seconds, so divide the millisecond value by 1000 first:
from datetime import datetime
ts = int("1615070997520") / 1000  # milliseconds -> seconds
print(datetime.utcfromtimestamp(ts).strftime('%Y-%m-%d %H:%M:%S'))
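The question also asks for milliseconds in the output, which the format string above drops. A minimal sketch that keeps them is shown here. It uses the timezone-aware datetime.fromtimestamp, since utcfromtimestamp is deprecated as of Python 3.12; the variable names are illustrative.

```python
from datetime import datetime, timedelta, timezone

ts_ms = int("1615070997520")

# Split into whole seconds and leftover milliseconds to avoid float rounding.
seconds, millis = divmod(ts_ms, 1000)
dt = datetime.fromtimestamp(seconds, tz=timezone.utc) + timedelta(milliseconds=millis)

# %f prints six digits of microseconds; dropping the last three leaves milliseconds.
formatted = dt.strftime("%Y-%m-%d %H:%M:%S.%f")[:-3]
print(formatted)  # 2021-03-06 22:49:57.520
```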

Converting Epoch time format to standard time format

I am having an issue converting the epoch timestamp 1585542406929 into a format like 2020-09-14 hours:minutes:seconds.
I tried running this, but it gives me an error:
from datetime import datetime
DATETIME_FORMAT = '%Y-%m-%d %H:%M:%S'
datetime.utcfromtimestamp(df2.timestamp_ms).strftime('%Y-%m-%d %H:%M:%S')
Error: cannot convert the series to <class 'int'>
What am I not understanding about this datetime function? Is there a better function that I should be using?
Edit: I should mention that timestamp_ms is a column of my dataframe called df.
Thanks to @chepner for helping me understand the format that this is in.
A quick solution is the following:
# make a new column with the Unix epoch start, as @ForceBru mentioned
start_date = '1970-01-01'
df3['helper'] = pd.to_datetime(start_date)
# convert your column of JSON dates / numbers to days
df3['timestamp_ms'] = df3['timestamp_ms'].apply(lambda x: (((x/1000)/60)/60/24))
# add a day adder column
df3['time_added'] = pd.to_timedelta(df3['timestamp_ms'],'d')
# add the two columns together
df3['actual_time'] = df3['helper'] + df3['time_added']
Note that the result is in UTC, so you may have to shift it to your local time zone. For instance, I sent my message at 10:40 am Central time (Midwest USA), but the timestamp showed 3:40 pm.
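As an alternative to the helper columns, pd.to_datetime can read epoch milliseconds directly through its unit argument. The sketch below assumes the column holds integer milliseconds, and the zone name "America/Chicago" is an assumption standing in for the Central time mentioned in the note. The one-row frame is hypothetical and uses the timestamp quoted in the question.

```python
import pandas as pd

# Hypothetical one-row frame using the timestamp quoted in the question.
df = pd.DataFrame({"timestamp_ms": [1585542406929]})

# Interpret the integers as milliseconds since 1970-01-01 UTC.
df["actual_time"] = pd.to_datetime(df["timestamp_ms"], unit="ms")

# Attach UTC, then convert to a local zone ("America/Chicago" is an assumption).
df["local_time"] = (
    df["actual_time"].dt.tz_localize("UTC").dt.tz_convert("America/Chicago")
)

# Format as text only when a string is actually needed.
df["as_text"] = df["actual_time"].dt.strftime("%Y-%m-%d %H:%M:%S")
print(df)
```

This works on the whole column at once, which avoids the "cannot convert the series" error raised by passing a Series to datetime.utcfromtimestamp.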

csv Pandas datetime convert time to seconds

I work with data from a data logger, and its timestamp format is not recognized by pandas.
I would like to convert this timestamp into a format pandas knows and then convert the datetime into seconds, starting at 0.
>>>df.time
0 05/20/2019 19:20:27:374
1 05/20/2019 19:20:28:674
2 05/20/2019 19:20:29:874
3 05/20/2019 19:20:30:274
Name: time, dtype: object
I tried to convert it from object to datetime64[ns], with %m or %b for the month:
df_time = pd.to_datetime(df["time"], format = '%m/%d/%y %H:%M:%S:%MS')
df_time = pd.to_datetime(df["time"], format = '%b/%d/%y %H:%M:%S:%MS')
Both fail with the error: redefinition of group name 'M' as group 7; was group 5 at position 155
I also tried to reduce the data set and remove the milliseconds, without success:
df['time'] = pd.to_datetime(df['time'],).str[:-3]
ValueError: ('Unknown string format:', '05/20/2019 19:20:26:383')
Or is it possible to just subtract the first time value from all the other values in the time column?
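Use '%m/%d/%Y %H:%M:%S:%f' as the format instead of '%m/%d/%y %H:%M:%S:%MS'. %Y matches a four-digit year (%y matches only two digits), and %f matches the fractional seconds. %MS is not a directive: it is read as %M followed by a literal S, which is why the error complains that group 'M' is defined twice.
Here is the format documentation for future reference
I am not exactly sure what you are looking for, but you can use the format above to parse your data and then drop parts of the result, such as the microseconds, by slicing the string: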
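from datetime import datetime
date = str(datetime.now())
print(date)
# 2019-07-28 14:04:28.986601
# skip the 11-character date prefix and the 7-character ".ffffff" suffix
print(date[11:-7])
# 14:04:28
time = date[11:-7]
print(time)
# 14:04:28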
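For the second half of the question (seconds starting at 0), a minimal sketch is to parse with the corrected format, subtract the first timestamp, and read the differences as seconds. The rows are copied from the question; the column name elapsed_s is illustrative.

```python
import pandas as pd

# Sample rows copied from the question.
df = pd.DataFrame({"time": [
    "05/20/2019 19:20:27:374",
    "05/20/2019 19:20:28:674",
    "05/20/2019 19:20:29:874",
    "05/20/2019 19:20:30:274",
]})

# %Y is the four-digit year; %f reads the digits after the last colon
# as a fraction of a second.
parsed = pd.to_datetime(df["time"], format="%m/%d/%Y %H:%M:%S:%f")

# Subtract the first timestamp and express each difference in seconds.
df["elapsed_s"] = (parsed - parsed.iloc[0]).dt.total_seconds()
print(df)
```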

How to get hours-minute-seconds from ISO 8601 date time format?

I am working with an Excel file in pandas and need to handle a Date column in ISO 8601 format. I want to split this column and store the date and the time in two separate columns. The values in these two columns need to be stored in Eastern Daylight Savings time. This is what they are supposed to look like:
Date                      Date (New)    Time (New)
1999-01-01T00:00:29.75    12/31/1998    6:59:58 PM
1999-01-01T00:00:30.00    12/31/1998    6:59:59 PM
1999-01-01T00:00:32.25    12/31/1998    7:00:00 PM
1999-01-01T00:00:30.50    12/31/1998    6:59:58 PM
I have achieved this partially.
I have converted the values to Eastern Daylight Savings time and stored the date value correctly. However, I want the time value stored in 12-hour format rather than the 24-hour format it has right now.
This is what my output looks like so far:
Date                      Date (New)    Time (New)
1999-01-01T00:00:29.75    1998-12-31    19:00:30
1999-01-01T00:00:30.00    1998-12-31    19:00:30
1999-01-01T00:00:32.25    1998-12-31    19:00:32
1999-01-01T00:00:30.50    1998-12-31    19:00:31
Does anyone have any idea what I can do about this?
from pytz import timezone
import dateutil.parser
from pytz import UTC
import datetime as dt
df3['Day']=pd.to_datetime(df['Date'], format='%Y-%m-%d %H:%M: %S.%f',errors='coerce').dt.tz_localize('UTC')
df3['Day']= df3['Day'].dt.tz_convert('US/Eastern')
df3['Date(New)'], df3['Time(New)'] = zip(*[(d.date(), d.time()) for d in df3['Day']])
You should use d.time().strftime("%I:%M:%S %p"), which formats the time as requested.
See: strftime() and strptime() Behavior
You can set the format used for output. The time value itself is (and should be) stored as datetime.time; if you want a specific string representation, create a string-type column in the format you want:
from pytz import timezone
import pandas as pd
import datetime as dt
df= pd.DataFrame([{"Date":dt.datetime.now()}])
df['Day']=pd.to_datetime( df['Date'], format='%Y-%m-%d %H:%M: %S.%f',
errors='coerce').dt.tz_localize('UTC')
df['Day']= df['Day'].dt.tz_convert('US/Eastern')
df['Date(New)'], df['Time(New)'] = zip(*[(d.date(), d.time()) for d in df['Day']])
# create strings with specific formatting
df['Date(asstring)'] = df['Day'].dt.strftime("%Y-%m-%d")
df['Time(asstring)'] = df["Day"].dt.strftime("%I:%M:%S %p")
# show resulting column / cell types
print(df.dtypes)
print(df.applymap(type))
# show df
print(df)
Output:
# df.dtypes
Date datetime64[ns]
Day datetime64[ns, US/Eastern]
Date(New) object
Time(New) object
Date(asstring) object
Time(asstring) object
# from df.applymap(type)
Date <class 'pandas._libs.tslib.Timestamp'>
Day <class 'pandas._libs.tslib.Timestamp'>
Date(New) <class 'datetime.date'>
Time(New) <class 'datetime.time'>
Date(asstring) <class 'str'>
Time(asstring) <class 'str'>
# from print(df)
                        Date                              Day   Date(New)        Time(New)
0 2019-01-04 00:40:02.802606 2019-01-03 19:40:02.802606-05:00  2019-01-03  19:40:02.802606
  Date(asstring) Time(asstring)
0     2019-01-03    07:40:02 PM
It looks like you are very close. %H is the 24 hour format. You should use %I instead.
See also: How can I account for period (AM/PM) with datetime.strptime?
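A short standard-library sketch of the difference between the two directives, using an arbitrary sample time. The AM/PM text produced by %p depends on the locale; the outputs in the comments assume an English or C locale.

```python
from datetime import time

t = time(19, 0, 30)

# %H is the 24-hour clock; %I is the 12-hour clock, usually paired with %p.
as_24h = t.strftime("%H:%M:%S")
as_12h = t.strftime("%I:%M:%S %p")

# %I zero-pads the hour; strip the leading zero for a "7:00:30 PM" style.
as_12h_unpadded = as_12h.lstrip("0")

print(as_24h)           # 19:00:30
print(as_12h)           # 07:00:30 PM
print(as_12h_unpadded)  # 7:00:30 PM
```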
