Finding the right format for pd.to_datetime - python

I'm trying to convert strings in my dataset (e.g. '2016-01-01 00:00:00') to timestamps using pd.to_datetime.
I'm trying:
pd.to_datetime(train["timestamp"],format='%Y/%m/%d %I:%M:%S')
but I get
time data '2016-01-01 00:00:00' does not match format '%Y/%m/%d %I:%M:%S' (match)
How can I fix this?

If you want it to be in the specific format that you mentioned, that is %Y/%m/%d %I:%M:%S, then do it like this.
First convert your string to datetime format using to_datetime:
df['timestamp'] = pd.to_datetime(df['timestamp'])
Now that your column is in datetime format, convert to the following format using strftime:
df['timestamp'] = df['timestamp'].dt.strftime('%Y/%m/%d %I:%M:%S')
Output:
timestamp
0 2016/01/01 12:00:00
1 2016/01/01 12:00:00
As others pointed out, use %H instead of %I for 24 hour format, like this:
df['timestamp'] = df['timestamp'].dt.strftime('%Y/%m/%d %H:%M:%S')
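To make the difference between the two directives concrete, here is a minimal sketch. The one-row DataFrame is a hypothetical stand-in for the asker's data, and the %p directive (AM/PM marker) is an addition that is not in the original answer.

```python
import pandas as pd

# Hypothetical one-row frame holding the midnight timestamp from the question.
df = pd.DataFrame({"timestamp": ["2016-01-01 00:00:00"]})
df["timestamp"] = pd.to_datetime(df["timestamp"])

# %I is the 12-hour clock, so midnight is rendered as 12.
# %p appends the AM/PM marker (its exact text depends on the locale).
twelve_hour = df["timestamp"].dt.strftime("%Y/%m/%d %I:%M:%S %p")

# %H is the 24-hour clock, so midnight is rendered as 00.
twenty_four_hour = df["timestamp"].dt.strftime("%Y/%m/%d %H:%M:%S")

print(twelve_hour.iloc[0])
print(twenty_four_hour.iloc[0])
```

Note that strftime returns strings, so the column is no longer a datetime column after this step.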

That's because the format you passed differs from the strings in your DataFrame. Use - as the date separator, and %H for the 24-hour clock:
pd.to_datetime(train["timestamp"],format='%Y-%m-%d %H:%M:%S')

Two issues here:
Use - instead of /
%I is the 12-hour clock (01-12); use %H for the 24-hour clock (00-23)
pd.to_datetime(train["timestamp"],format='%Y-%m-%d %H:%M:%S')
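A quick way to check both points is to run the two formats side by side. This is only a sketch: the Series below is a hypothetical stand-in for train["timestamp"].

```python
import pandas as pd

# Hypothetical stand-in for train["timestamp"] from the question.
timestamps = pd.Series(["2016-01-01 00:00:00", "2016-01-01 13:30:00"])

# The format mirrors the strings exactly: dashes in the date, 24-hour %H.
parsed = pd.to_datetime(timestamps, format="%Y-%m-%d %H:%M:%S")
print(parsed.dt.hour.tolist())

# The original format fails on both counts (slashes, 12-hour %I).
try:
    pd.to_datetime(timestamps, format="%Y/%m/%d %I:%M:%S")
    mismatch_raised = False
except ValueError:
    mismatch_raised = True
print(mismatch_raised)
```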

Related

Pandas converting date time in string to datetime format

I have a column in a pandas DataFrame that holds datetime entries as strings.
I have tried the syntax below, but it gives rise to this error.
Syntax
pd.to_datetime(df['Datetime'], format = '%y-%m-%d %H:%M:%S')
Error
time data '2020-11-01 16:23:12' does not match format '%y-%m-%d %H:%M:%S'
Try %Y; here is a cheat sheet: https://strftime.org/
Yes, you've used the wrong directive for the year: %y matches a two-digit year, while %Y matches a four-digit year such as 2020.
pd.to_datetime(df["Datetime"], format="%Y-%m-%d %H:%M:%S")
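The same mismatch can be reproduced with the standard library alone, which uses the same % directives. A small sketch, with the sample string taken from the error message above:

```python
from datetime import datetime

value = "2020-11-01 16:23:12"

# %Y consumes the four-digit year "2020".
parsed = datetime.strptime(value, "%Y-%m-%d %H:%M:%S")
print(parsed.year)

# %y consumes only two digits ("20"), so the literal "-" that follows
# in the format meets the digit "2" in the data and parsing fails.
try:
    datetime.strptime(value, "%y-%m-%d %H:%M:%S")
    two_digit_failed = False
except ValueError:
    two_digit_failed = True
print(two_digit_failed)
```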

Unable to get time difference between two pandas dataframe columns

I have a pandas dataframe that contains a couple of columns, two of which are start_time and end_time. The values in those columns look like 2020-01-04 01:38:33 +0000 UTC
I am not able to create a datetime object from these strings because I can't get the format right:
df['start_time'] = pd.to_datetime(df['start_time'], format="yyyy-MM-dd HH:mm:ss +0000 UTC")
I also tried using yyyy-MM-dd HH:mm:ss %z UTC as a format
This gives the error -
ValueError: time data '2020-01-04 01:38:33 +0000 UTC' does not match format 'yyyy-MM-dd HH:mm:ss +0000 UTC' (match)
You just need to use a format that to_datetime recognizes: strftime-style % directives rather than yyyy-MM-dd patterns.
df['start_time'] = pd.to_datetime(df['start_time'], format="%Y-%m-%d %H:%M:%S +0000 UTC")
Some notes on this problem:
1. About your error
"This gives the error -"
The format string you passed is not valid strftime syntax, which causes the error. See https://strftime.org/ for the correct directives. The correct format for this data would be: "%Y-%m-%d %H:%M:%S %z UTC"
2. Pandas limitation with timezone
Parsing the UTC offset with %z did not work on a pd.Series for me (it only worked on index values), so depending on your pandas version this call may fail:
df['startTime'] = pd.to_datetime(df.startTime, format="%Y-%m-%d %H:%M:%S %z UTC", utc=True)
A workaround is to parse each string with Python's built-in datetime module:
from datetime import datetime
f = lambda x: datetime.strptime(x, "%Y-%m-%d %H:%M:%S %z UTC")
df['startTime'] = pd.to_datetime(df.startTime.apply(f), utc=True)
@fmarm's answer only handles the date and time, not the UTC offset.
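Putting the workaround together with the original goal (the time difference between the two columns), a sketch could look like this. The DataFrame contents and the duration column name are made up for illustration.

```python
from datetime import datetime

import pandas as pd

# Hypothetical frame with the string layout described in the question.
df = pd.DataFrame({
    "start_time": ["2020-01-04 01:38:33 +0000 UTC"],
    "end_time": ["2020-01-04 02:08:33 +0000 UTC"],
})

FORMAT = "%Y-%m-%d %H:%M:%S %z UTC"


def parse(value):
    # %z reads the "+0000" offset; the trailing " UTC" is matched literally.
    return datetime.strptime(value, FORMAT)


for column in ("start_time", "end_time"):
    df[column] = pd.to_datetime(df[column].apply(parse), utc=True)

# Subtracting two datetime columns yields a timedelta column.
df["duration"] = df["end_time"] - df["start_time"]
print(df["duration"].iloc[0])
```

Applying a Python function row by row is slower than a vectorized parse, so on large frames it may be worth trying the format argument of to_datetime first.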

format 01-01-16 7:43 string to datetime

I have the following strings that I'd like to convert to datetime objects:
'01-01-16 7:43'
'01-01-16 3:24'
However, when I try to use strptime it always results in a does not match format error.
Pandas to_datetime function nicely handles the automatic conversion, but I'd like to solve it with the datetime library as well.
format_ = '%m-%d-%Y %H:%M'
my_date = datetime.strptime("01-01-16 4:51", format_)
ValueError: time data '01-01-16 4:51' does not match format '%m-%d-%Y %H:%M'
Your datetime string '01-01-16 7:43' has a 2-digit year, not a 4-digit one.
To parse a 2-digit year such as '16' rather than '2016', use %y instead of %Y.
You can do it like this:
from datetime import datetime
datetime_str = '01-01-16 7:43'
datetime_object = datetime.strptime(datetime_str, '%m-%d-%y %H:%M')
print(type(datetime_object))
print(datetime_object)
This gives the output:
<class 'datetime.datetime'>
2016-01-01 07:43:00
First of all, if you want to match 2016 you should write %Y while for 16 you should write %y.
That means you should write:
format_ = '%m-%d-%y %H:%M'
Check this link for all format codes.
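One detail worth knowing when relying on %y is how Python picks the century. The sketch below parses the two strings from the question and then shows the cutoff; the 68/69 sample dates are made up for illustration.

```python
from datetime import datetime

FORMAT = "%m-%d-%y %H:%M"

# Both strings from the question parse; %H accepts a single-digit hour.
for text in ("01-01-16 7:43", "01-01-16 3:24"):
    print(datetime.strptime(text, FORMAT))

# Two-digit years are mapped to a century: 00-68 -> 20xx, 69-99 -> 19xx.
recent = datetime.strptime("01-01-68 0:00", FORMAT)
older = datetime.strptime("01-01-69 0:00", FORMAT)
print(recent.year, older.year)
```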

Python change a date format in dataframe

I have a dataset containing a column "date":
date item
20.3.2010 17:08 a
20.3.2010 11:16 b
2010-03-20 15:55:14.060 c
2010-03-21 13:56:45.077 d
I would like to convert all values formatted like 20.3.2010 17:08 into the format of 2010-03-21 13:56:45.077.
Does anybody have an idea?
Thank you.
Try the following:
from datetime import datetime
INPUT_FORMAT = '%d.%m.%Y %H:%M'
OUTPUT_FORMAT = '%Y-%m-%d %H:%M:%S.%f'
datetime.strptime('20.3.2010 17:08',INPUT_FORMAT).strftime(OUTPUT_FORMAT)
#Output '2010-03-20 17:08:00.000000'
You can find more information in the official strptime and strftime documentation.
To do a 100% match with 3-digit fractional seconds (milliseconds) you could use this SO approach.
df['date'] = pd.to_datetime(df['date'], format='%Y-%m-%d %H:%M:%S.%f')
You can find more information on pd.to_datetime() here, and the format string type can be found here.
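Since the column mixes two layouts, one possible approach is to try each input format in turn. The sketch below uses only the standard library; normalize is a hypothetical helper name, and the list of formats is an assumption based on the four sample rows in the question.

```python
from datetime import datetime

# Candidate input layouts, taken from the sample rows in the question.
INPUT_FORMATS = ("%d.%m.%Y %H:%M", "%Y-%m-%d %H:%M:%S.%f")
OUTPUT_FORMAT = "%Y-%m-%d %H:%M:%S.%f"


def normalize(value):
    """Rewrite value as YYYY-MM-DD HH:MM:SS.mmm (hypothetical helper)."""
    for fmt in INPUT_FORMATS:
        try:
            parsed = datetime.strptime(value, fmt)
        except ValueError:
            continue  # Try the next layout.
        # %f prints six digits; drop the last three to keep milliseconds.
        return parsed.strftime(OUTPUT_FORMAT)[:-3]
    raise ValueError(f"unrecognized date layout: {value!r}")


print(normalize("20.3.2010 17:08"))
print(normalize("2010-03-20 15:55:14.060"))
```

On the DataFrame from the question this could be applied with df['date'] = df['date'].apply(normalize), which leaves the column as strings in a single layout.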

Converting string to date that contains 00:00:00

To convert a date string to a date, dropping the '00:00:00', I use:
import datetime
strDate = '2017-04-17 00:00:00'
datetime.datetime.strptime(strDate, '%Y/%m/%d %H:%M:%S').strftime('%Y-%m-%d')
This raises:
ValueError: time data '2017-04-17 00:00:00' does not match format '%Y/%m/%d %H:%M:%S'
Is %H:%M:%S not the correct format?
This is the correct way:
datetime.datetime.strptime(strDate, '%Y-%m-%d %H:%M:%S').strftime('%Y-%m-%d')
Notice the - instead of / in strptime. The date is converted to: 2017-04-17.
If you would like to have it displayed a different way, have a look here.
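If an actual date object is wanted rather than a string, calling .date() on the parsed value is one option. A short sketch showing both results side by side (the variable names are illustrative):

```python
from datetime import datetime

str_date = "2017-04-17 00:00:00"
parsed = datetime.strptime(str_date, "%Y-%m-%d %H:%M:%S")

as_date = parsed.date()                # datetime.date object, time part dropped
as_text = parsed.strftime("%Y-%m-%d")  # plain string
print(as_date, as_text)
```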
