Converting an arbitrary date-time format to a pandas time series - Python

I'm trying to convert a column in a dataframe to a time series. The values in the column are strings of the following form:
12/10/202110:42:05.397
which means 12-10-2021 at 10:42:05 and 397 milliseconds. This is the format in which LabVIEW saves the data to a file.
I'm trying to use the following command, but I can't figure out how to define the format for my case:
pd.to_datetime(df.DateTime, format=???)
Note that there is no space between the year 2021 and the hour 10.

Use:
df['dt'] = pd.to_datetime(df['DateTime'], format='%d/%m/%Y%H:%M:%S.%f')
print(df)
# Output
                 DateTime                      dt
0  12/10/202110:42:05.397 2021-10-12 10:42:05.397
Setup:
df = pd.DataFrame({'DateTime': ['12/10/202110:42:05.397']})
As suggested by @RaymondKwok, see the documentation:
strftime() and strptime() Format Codes
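A minimal sketch of the same call on a column that also contains a bad value. The second row is a made-up example, not data from the question, and errors='coerce' is an optional addition that turns unparseable strings into NaT instead of raising. The format assumes the day comes first, as in the answer above; swap %d and %m if the file is actually month-first.

```python
import pandas as pd

# Second row is hypothetical, added only to show what happens with bad input.
df = pd.DataFrame({'DateTime': ['12/10/202110:42:05.397', 'not a timestamp']})

# No separator between %Y and %H because the file has none between year and hour.
# %f accepts 1 to 6 fractional digits, so the 3-digit millisecond part parses fine.
df['dt'] = pd.to_datetime(
    df['DateTime'],
    format='%d/%m/%Y%H:%M:%S.%f',
    errors='coerce',  # optional: unparseable rows become NaT
)
print(df)
```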

Related

How to convert two different date formats from a pandas dataframe column into same format?

I have two different date formats in a pandas column, DD-MM-YYYY and MM/DD/YYYY, and I want to convert them into the same format.
I tried using this code:
data['SALE DATE'] = pd.to_datetime(data['SALE DATE']).dt.strftime('%m/%d/%Y')
but the output in data['SALE DATE'] still contains a mix of
DD/MM/YYYY and MM/DD/YYYY dates.
I want a Python solution to overcome this problem. Any leads will be very helpful.
The most intuitive solution is to write a custom conversion function,
something like:
def myDateConv(tt):
    # The third character is the separator, which tells the formats apart
    sep = tt[2]
    if sep == '-':
        return pd.to_datetime(tt, format='%d-%m-%Y')
    elif sep == '/':
        return pd.to_datetime(tt, format='%m/%d/%Y')
    else:
        # Unknown layout: return the value unchanged
        return tt
and then pass it as a converter for the column in question:
df = pd.read_csv('Input.csv', converters={'Date': myDateConv})
I prepared a CSV file. Reading it with read_csv without any
custom converter gave the original content, with both columns of
object type:
         Date Input format
0  03-05-2020   DD-MM-YYYY
1  05/07/2020   MM/DD/YYYY
But reading the same file with the above converter gave:
        Date Input format
0 2020-05-03   DD-MM-YYYY
1 2020-05-07   MM/DD/YYYY
with the Date column of datetime64[ns] type and both dates in
May, just as intended.
Or, if you have this DataFrame from another source and you want to
convert this column, run:
df.Date = df.Date.apply(myDateConv)
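As a vectorized alternative to the row-by-row converter, one possible sketch is to parse the whole column once per format with errors='coerce' and then fill the gaps of the first result with the second. The Series below reuses the two sample dates from the CSV above; the variable names are illustrative only.

```python
import pandas as pd

s = pd.Series(['03-05-2020', '05/07/2020'])

# Each pass parses only the rows that match its format; the rest become NaT.
dash_first = pd.to_datetime(s, format='%d-%m-%Y', errors='coerce')
slash_first = pd.to_datetime(s, format='%m/%d/%Y', errors='coerce')

# Combine: take the dash result where it exists, else the slash result.
parsed = dash_first.fillna(slash_first)

# Optional: render everything as MM/DD/YYYY strings.
formatted = parsed.dt.strftime('%m/%d/%Y')
print(formatted.tolist())
```

Rows that match neither format stay NaT, which makes them easy to find with parsed.isna().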
If you are using pandas version 1.xx you can use the following solution:
pd.to_datetime(["11-08-2018", "05-03-2016", "08/30/2017", "09/21/2018"], infer_datetime_format=True, dayfirst=True).strftime("%m/%d/%Y")
This gives the following result:
Index(['08/11/2018', '03/05/2016', '08/30/2017', '09/21/2018'], dtype='object')
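... the important argument here is dayfirst=True.
Note that in pandas 2.0 and later infer_datetime_format is deprecated, and as far as I know mixed-format input like this needs format='mixed' instead.
See the pd.to_datetime docs for more.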

How to convert python dataframe timestamp to datetime format

I have a dataframe with date information in one column.
The date visually appears in the dataframe in this format: 2019-11-24
but when you print the type it shows up as:
Timestamp('2019-11-24 00:00:00')
I'd like to convert each value in the dataframe to a format like this:
24-Nov
or
7-Nov
for single digit days.
I've tried using various datetime and strptime commands to convert but I am getting errors.
Here's a way to do it:
df = pd.DataFrame({'date': ["2014-10-23","2016-09-08"]})
df['date_new'] = pd.to_datetime(df['date'])
df['date_new'] = df['date_new'].dt.strftime("%d-%b")
         date date_new
0  2014-10-23   23-Oct
1  2016-09-08   08-Sep
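The question asks for single-digit days without a leading zero ("7-Nov"), which %d does not produce. A portable sketch, assuming the column already holds Timestamps, is to build the day part from .dt.day and append the abbreviated month. The sample dates are made up to match the question, and the month abbreviation from %b depends on the active locale (English assumed here).

```python
import pandas as pd

# Hypothetical sample: one two-digit day and one single-digit day.
df = pd.DataFrame({'date': pd.to_datetime(['2019-11-24', '2019-11-07'])})

# .dt.day is an integer, so converting it to str never adds a leading zero.
df['date_new'] = df['date'].dt.day.astype(str) + '-' + df['date'].dt.strftime('%b')
print(df)
```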

How do I format date using pandas?

My data 'df' shows data 'Date' as 1970-01-01 00:00:00.019990103 when this is formatted to date_to using pandas. How do I show the date as 01/03/1999?
Consider LoneWanderer's comment for next time and show some of the code that you have tried.
I would try this:
from datetime import datetime
now = datetime.now()
print(now.strftime('%d/%m/%Y'))
You can print now to see that it is in the same format as yours, and that strftime then converts it to the required format.
I see that the actual date is in the last 8 characters of your source string.
To convert such strings to a Timestamp (ignoring the starting part), run:
df.Date = df.Date.apply(lambda src: pd.to_datetime(src[-8:]))
It is worth considering keeping this date as a Timestamp, as that
simplifies operations on dates and times, and applying your formatting only in printouts.
But if you want to have this date as a string in "your" format, in the
"original" column, perform the second conversion (Timestamp to string):
df.Date = df.Date.dt.strftime('%m/%d/%Y')
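The value 1970-01-01 00:00:00.019990103 is what pandas produces when the integer 19990103 is read as nanoseconds since the epoch. If the raw column really holds integers in YYYYMMDD form (an assumption, since the question does not show the raw data), a sketch of parsing them directly looks like this. The second value is a made-up extra row.

```python
import pandas as pd

# Assumed raw data: dates stored as YYYYMMDD integers.
df = pd.DataFrame({'Date': [19990103, 20001231]})

# Convert to str first so each value is matched against the explicit format
# instead of being treated as a nanosecond offset from 1970-01-01.
df['Date'] = pd.to_datetime(df['Date'].astype(str), format='%Y%m%d')

# Format only when a string is needed, e.g. for display.
formatted = df['Date'].dt.strftime('%m/%d/%Y')
print(formatted.tolist())
```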

Parse timestamp having hour beyond 23 in python

I am learning Python and came across an issue where I am trying to read a timestamp from a CSV file in the format below,
43:32.0
where 43 is in the hours position, and convert it to datetime format in pandas.
I tried this code:
df['time'] = df['time'].astype(str).str[:-2]
df['time'] = pd.to_datetime(df['time'], errors='coerce')
But this converts all values to NaT.
I need the output to be in the format mm/dd/yyyy hh:mm:ss
I'm going to assume that this is a Date for 11-29-17 (today's date)?
I believe you need to add an extra 0: in the beginning of the string. Basic Example:
import pandas as pd
# creating a dataframe of your string
df1 = pd.DataFrame({'A':['43:32.0']})
# adding '0:' to the front
df1['A'] = '0:' + df1['A'].astype(str)
# making new column to show the output
df1['B'] = pd.to_datetime(df1['A'], errors='coerce')
# output
           A                   B
0  0:43:32.0 2017-11-29 00:43:32
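Another way to sketch this, under the same assumption that 43:32.0 means 43 minutes and 32.0 seconds, is to treat the value as a duration with pd.to_timedelta and add it to an explicit base date, so the result does not depend on the day the script runs. The base date and the second sample row are hypothetical.

```python
import pandas as pd

df1 = pd.DataFrame({'A': ['43:32.0', '05:07.5']})  # second row is made up

# Prefix an hours field so the string has the hh:mm:ss.f shape to_timedelta expects.
elapsed = pd.to_timedelta('00:' + df1['A'])

# Hypothetical base date; replace with the real recording date.
base = pd.Timestamp('2017-11-29')
df1['B'] = base + elapsed

# Render in the mm/dd/yyyy hh:mm:ss layout requested in the question.
df1['C'] = df1['B'].dt.strftime('%m/%d/%Y %H:%M:%S')
print(df1)
```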

Working on dates with mm-dd-YY & YY-mm-dd format in pandas

I am trying to do a simple test of pandas' capabilities for handling dates and formats.
For that I have created a dataframe with values like the ones below:
df = pd.DataFrame({'date1' : ['10-11-11','12-11-12','10-10-10','12-11-11',
'12-12-12','11-12-11','11-11-11']})
Here I am assuming that the values are dates. And I am converting it into proper format using pandas' to_datetime function.
df['format_date1'] = pd.to_datetime(df['date1'])
print(df)
Out[3]:
      date1 format_date1
0  10-11-11   2011-10-11
1  12-11-12   2012-12-11
2  10-10-10   2010-10-10
3  12-11-11   2011-12-11
4  12-12-12   2012-12-12
5  11-12-11   2011-11-12
6  11-11-11   2011-11-11
Here, Pandas is reading the date of the dataframe as "MM/DD/YY" and converting it in native format (i.e. YYYY/MM/DD). I want to check if Pandas can take my input indicating that the date format is actually "YY/MM/DD" and then let it convert into its native format. This will change the value of row no.: 5. To do this, I have run following code. But it is giving me an error.
df3['format_date2'] = pd.to_datetime(df3['date1'], format='%Y/%m/%d')
ValueError: time data '10-10-10' does not match format '%Y/%m/%d' (match)
I have seen the sort of solution here. But I was hoping to get a little easy and crisp answer.
%Y in the format specifier takes the 4-digit year (i.e. 2016). %y takes the 2-digit year (i.e. 16, meaning 2016). Change the %Y to %y and it should work.
Also, your format specifier uses slashes while the data uses dashes. You need to change your format to %y-%m-%d.
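Putting both corrections together, a minimal sketch on the dataframe from the question (named df here, as in the setup code) would be:

```python
import pandas as pd

df = pd.DataFrame({'date1': ['10-11-11', '12-11-12', '10-10-10', '12-11-11',
                             '12-12-12', '11-12-11', '11-11-11']})

# %y = two-digit year; dashes in the format match the dashes in the data.
df['format_date2'] = pd.to_datetime(df['date1'], format='%y-%m-%d')
print(df)
```

With this format, '11-12-11' in row 5 is read as year 11, month 12, day 11, so it becomes 2011-12-11 rather than 2011-11-12.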
