Python pandas DataFrame: datetime format

I am trying to display dates in a specific day-month-year format, e.g. 01-12-2023 12:00:00 AM.
My dataframe has a column with a datetime dtype, shown as 2023-12-01.
I converted it to the format I need with strftime, but I did not want the dtype to change, so I converted the result back to datetime afterwards.
After that, the values were shown as 2023-12-01 again instead of 01-12-2023 12:00:00 AM.
How can I get the required format without changing the dtype?
I checked a few examples, but none of them worked.
Code:
The code below creates a new column DOB1 from the Month column, which has a datetime dtype:
Dataframe bkp:
Month
2022-05-01
2023-06-01
bkp['DOB1'] = bkp['Month'].dt.strftime('%d-%m-%Y %I:%M:%S %p')
Output:
Month       DOB1
2022-05-01  01-05-2022 12:00:00 AM
2023-06-01  01-06-2023 12:00:00 AM
bkp.dtypes >>>> Month : datetime64[ns] , DOB1 : object
# Converting back to datetime to retain the original dtype
bkp['DOB1'] = pd.to_datetime(bkp['DOB1'], format='%d-%m-%Y %I:%M:%S %p')
Output:
Month       DOB1
2022-05-01  2022-05-01
2023-06-01  2023-06-01
Could anyone suggest how to get the required format while keeping the datetime dtype?
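No answer was recorded for this question. As a hedged sketch of the usual explanation: a datetime64 column stores timestamp values, not text, so it has no display format of its own. The common approach is to keep the column as datetime64 and produce the formatted text only where it is shown or exported. The names fmt, display_text and roundtrip below are illustrative and not part of the question.

```python
import pandas as pd

# Rebuild the sample frame from the question.
bkp = pd.DataFrame({"Month": pd.to_datetime(["2022-05-01", "2023-06-01"])})

fmt = "%d-%m-%Y %I:%M:%S %p"

# Formatting always produces strings (object dtype); the source column is untouched.
display_text = bkp["Month"].dt.strftime(fmt)
print(display_text.tolist())

# Parsing the text back gives datetime64 again and the custom layout is lost,
# because the dtype stores only the timestamp value.
roundtrip = pd.to_datetime(display_text, format=fmt)
print(roundtrip.dtype)
```

If the format is only needed in an exported file, passing date_format=fmt to DataFrame.to_csv is one option that leaves the dtype in the frame unchanged.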

Related

time data mm/dd/YYYY doesn't match format specified

I have a column with the following format:
Original format:
mm/dd/YYYY
10/28/2021
10/28/2021
The output after printing the column:
print(df['mm/dd/YYYY'])
0   2021-10-28 00:00:00
1   2021-10-28 00:00:00
However, when I try to convert it to datetime, I get the following error:
pd.to_datetime(df['mm/dd/YYYY'], format='%Y-%m-%d %H:%M:%S')
time data mm/dd/YYYY doesn't match format specified
You are passing the wrong format. The format string has to describe the text as it is stored (month/day/year), not the layout you want to end up with. Try
pd.to_datetime(df['mm/dd/YYYY'], format='%m/%d/%Y')
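A minimal sketch of the point made in the answer, assuming the column holds strings such as 10/28/2021 (the frame below is rebuilt from the question's sample; mismatch_raised is an illustrative name):

```python
import pandas as pd

df = pd.DataFrame({"mm/dd/YYYY": ["10/28/2021", "10/28/2021"]})

# The format describes the input text: month/day/four-digit year.
parsed = pd.to_datetime(df["mm/dd/YYYY"], format="%m/%d/%Y")
print(parsed)

# A format that describes a different layout is rejected with a ValueError.
try:
    pd.to_datetime(df["mm/dd/YYYY"], format="%Y-%m-%d %H:%M:%S")
    mismatch_raised = False
except ValueError:
    mismatch_raised = True
print(mismatch_raised)
```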

How to read AM/PM times in pandas?

I am dealing with a csv file containing a column called startTime, containing times.
When opening this file with Excel, the times appear as AM/PM times in the formula bar, although the timestamps in the column appear improperly formatted:
startTime
16:02.0
17:45.0
18:57.0
20:23.0
When reading this file using pandas' read_csv, I am unable to format these timestamps properly:
import pandas as pd
df = pd.read_csv('example_file.csv')
print(df.startTime)
Simply yields:
0 16:02.0
1 17:45.0
2 18:57.0
3 20:23.0
I first attempted to convert the output Series using pd.to_datetime(df.startTime,format=" %H%M%S") but this yields the following error message:
time data '16:02.0' does not match format ' %H%M%S' (match)
I then tried pd.to_datetime(df.startTime,format=" %I:%M:%S %p") based on this answer, in order to account for the AM/PM convention, but this returned the same error message.
How can I use pandas to format these timestamps like Excel automatically does?
Your csv file contains text, not datetimes. First convert the text in this column to pandas datetime objects, then format those objects the way you want with the strftime method:
pd.to_datetime(df['startTime']).dt.strftime(date_format = '%I:%M:%S %p')
Outputs:
0 04:02:00 PM
1 05:45:00 PM
2 06:57:00 PM
3 08:23:00 PM
Note: these values are string values, not datetime.
Edit for this specific issue:
If the values are really minutes and seconds, prepend a 00 hour to each value before converting, so the times fall just after midnight:
pd.to_datetime(df['startTime'].apply(lambda x: f'00:{x}')).dt.strftime(date_format = '%I:%M:%S %p')
Outputs:
0 12:16:02 AM
1 12:17:45 AM
2 12:18:57 AM
3 12:20:23 AM
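The same minutes-and-seconds reading can be sketched with an explicit format instead of string concatenation. This assumes the values really are minute:second.fraction, which the question does not confirm; the Series below is rebuilt from the sample data. Note that %I is a 12-hour clock, so hour 0 is printed as 12.

```python
import pandas as pd

s = pd.Series(["16:02.0", "17:45.0", "18:57.0", "20:23.0"], name="startTime")

# %M:%S.%f reads the text as minutes, seconds and a fractional part.
# The date defaults to 1900-01-01 and the hour to 0.
parsed = pd.to_datetime(s.str.strip(), format="%M:%S.%f")

# strftime returns strings, not datetimes.
formatted = parsed.dt.strftime("%I:%M:%S %p")
print(formatted.tolist())
```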
Try:
>>> pd.to_datetime(df['startTime'].str.strip(), format='%H:%M.%S')
0 1900-01-01 16:02:00
1 1900-01-01 17:45:00
2 1900-01-01 18:57:00
3 1900-01-01 20:23:00
Name: startTime, dtype: datetime64[ns]
Coerce to datetime and format the time using dt.strftime (the result is a column of strings):
df['startTime']=pd.to_datetime(df['startTime']).dt.strftime('%I:%M.%S%p')

Converting String to datetime in pandas

I want to convert a column from excel from Object type to DateTime.
index  Planned Hours  Actual Hours
0      2:00           2:00
1      1:00           1:00
2      1:45           1:45
...    ...            ...
18676  35:00          35:00
Format of the columns:
Planned Hours - object, Actual Hours - object
I used time['Actual Hours'] = pd.to_datetime(time['Actual Hours'],format='%H:%M') and I got this error:
time data '35:00' does not match format '%H:%M' (match)
When I change the format to time['Actual Hours'] = pd.to_datetime(time['Actual Hours'],format='%HH:%MM') I get this error
time data '2:00' does not match format '%HH:%MM' (match)
How can I convert the string values into DateTime without the error?
You probably want Timedelta, not datetime. A datetime is for an actual date and time of day, where %H only accepts hours 0 to 23, but you seem to be working with durations, which can exceed 24 hours:
df['Planned Hours'] = pd.to_timedelta(df['Planned Hours'] + ':00')
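A short sketch of the Timedelta approach on a few of the sample values, including the 35:00 entry that broke the datetime parse. The planned and planned_hours names are illustrative; converting to a float number of hours is an optional extra step, not something the question asked for.

```python
import pandas as pd

df = pd.DataFrame({"Planned Hours": ["2:00", "1:00", "1:45", "35:00"]})

# Appending ":00" turns "H:MM" into "H:MM:SS", which to_timedelta can parse.
planned = pd.to_timedelta(df["Planned Hours"] + ":00")
print(planned)

# Durations can then be expressed as a plain number of hours.
planned_hours = planned.dt.total_seconds() / 3600
print(planned_hours.tolist())
```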

csv Pandas datetime convert time to seconds

I work with data from a datalogger, and its timestamp format is not recognised as a datetime in the pandas DataFrame.
I would like to convert this timestamp into a format pandas understands and then convert the datetime into seconds, starting at 0.
>>>df.time
0 05/20/2019 19:20:27:374
1 05/20/2019 19:20:28:674
2 05/20/2019 19:20:29:874
3 05/20/2019 19:20:30:274
Name: time, dtype: object
I tried to convert it from object to datetime64[ns], using either %m or %b for the month:
df_time = pd.to_datetime(df["time"], format = '%m/%d/%y %H:%M:%S:%MS')
df_time = pd.to_datetime(df["time"], format = '%b/%d/%y %H:%M:%S:%MS')
Both fail with this error: redefinition of group name 'M' as group 7; was group 5 at position 155
I also tried to shorten the strings and remove the milliseconds, without success:
df['time'] = pd.to_datetime(df['time'],).str[:-3]
ValueError: ('Unknown string format:', '05/20/2019 19:20:26:383')
Or is it possible to just subtract the first timestamp from all the other values in the time column?
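Use '%m/%d/%Y %H:%M:%S:%f' as the format instead of '%m/%d/%y %H:%M:%S:%MS'. %Y matches a four-digit year (%y matches only two digits) and %f matches the fractional seconds. %MS is not a directive: it is read as %M followed by a literal S, so %M appears twice, which is what the "redefinition of group name 'M'" error is reporting.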
Here is the format documentation for future reference
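As a sketch of how the suggested format answers both parts of the question (parsing, then seconds starting at 0), using the four sample rows. The parsed and elapsed names are illustrative; rounding to three decimals is an assumption based on the millisecond resolution of the data.

```python
import pandas as pd

df = pd.DataFrame({"time": [
    "05/20/2019 19:20:27:374",
    "05/20/2019 19:20:28:674",
    "05/20/2019 19:20:29:874",
    "05/20/2019 19:20:30:274",
]})

# %f consumes the trailing digits as a fraction of a second.
parsed = pd.to_datetime(df["time"], format="%m/%d/%Y %H:%M:%S:%f")

# Subtracting the first timestamp gives timedeltas;
# total_seconds() turns them into floats.
elapsed = (parsed - parsed.iloc[0]).dt.total_seconds().round(3)
print(elapsed.tolist())
```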
I am not exactly sure what you are looking for, but you can use the format above to parse your data and then drop parts of the result, such as the microseconds, by slicing the string:
from datetime import datetime
date = str(datetime.now())
print(date)
# 2019-07-28 14:04:28.986601
print(date[11:-7])
# 14:04:28
time = date[11:-7]
print(time)
# 14:04:28

Pandas to_datetime not formatting as expected

I have a data frame with a column 'Date' with data type datetime64. The values are in YYYY-MM-DD format.
How can I convert it to YYYY-MM format and still use it as a datetime64 object?
I tried converting my datetime object to a string in YYYY-MM format and then back to a datetime object in YYYY-MM format, but it didn't work.
Original data = 1988-01-01.
Converting the datetime object to a string in YYYY-MM format:
df['Date']=df['Date'].dt.strftime('%Y-%m')
This worked as expected, my column value became
1988-01
Converting the string back to a datetime object in YYYY-MM format:
df['Date']=pd.to_datetime(df['Date'],format= '%Y-%m')
I was expecting the Date column to be in YYYY-MM format, but it became YYYY-MM-DD:
1988-01-01
Can you please let me know if I am missing something?
Thanks
This is expected behaviour: a datetime always has year, month and day components, so a missing day is filled in as 1.
If you want to drop the day, convert to a monthly period with to_period:
df['Date'] = df['Date'].dt.to_period('M')
Or, if the column holds YYYY-MM strings:
df['Date'] = pd.to_datetime(df['Date'],format= '%Y-%m').dt.to_period('M')
Sample:
df = pd.DataFrame({'Date':pd.to_datetime(['1988-01-01','1999-01-15'])})
print (df)
        Date
0 1988-01-01
1 1999-01-15
df['Date'] = df['Date'].dt.to_period('M')
print (df)
      Date
0  1988-01
1  1999-01
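To show what the period column gives you in practice, here is a small sketch on the same sample data. The monthly and back names are illustrative. The result is a period[M] dtype rather than datetime64, but it still supports date-like accessors and can be converted back when a timestamp is needed.

```python
import pandas as pd

df = pd.DataFrame({"Date": pd.to_datetime(["1988-01-01", "1999-01-15"])})

# Monthly periods keep only year and month.
monthly = df["Date"].dt.to_period("M")
print(monthly.dtype)

# Periods still expose date-like accessors such as .dt.year.
print(monthly.dt.year.tolist())

# to_timestamp() converts back to datetime64 at the start of each month.
back = monthly.dt.to_timestamp()
print(back.tolist())
```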
