I have a dataframe column in Python that is in the format YYMM, e.g. January 1996 is 9601.
I'm having a hard time converting it from 9601 to a usable datetime format. I want the new format to be 01-01-1996. Does anyone have any suggestions? I tried the pd.to_datetime function but it's not giving the results I'm looking for.
Use to_datetime with parameter format:
df = pd.DataFrame({'col':['9601', '9705']})
df['col'] = pd.to_datetime(df['col'], format='%y%m')
print (df)
col
0 1996-01-01
1 1997-05-01
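If the column is stored as integers rather than strings, or the 01-01-1996 text form is needed, the same idea can be extended as in the sketch below. It assumes the column name col, that a value such as 1 means January 2000 (YYMM with the leading zeros lost), and that the desired layout is day-month-year; the date and date_str column names are made up for illustration.

```python
import pandas as pd

# Hypothetical data: YYMM stored as integers, so leading zeros are lost
df = pd.DataFrame({'col': [9601, 9705, 1]})

# Restore the four-character YYMM string before parsing
yymm = df['col'].astype(str).str.zfill(4)

# %y maps 69-99 to 1969-1999 and 00-68 to 2000-2068
df['date'] = pd.to_datetime(yymm, format='%y%m')

# Render as DD-MM-YYYY text; this column holds strings, not datetimes
df['date_str'] = df['date'].dt.strftime('%d-%m-%Y')
print(df)
```

Keep the datetime column for sorting and arithmetic, and use the string column only for display.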
I have a dataframe with date information in one column.
The date visually appears in the dataframe in this format: 2019-11-24
but when you print the type it shows up as:
Timestamp('2019-11-24 00:00:00')
I'd like to convert each value in the dataframe to a format like this:
24-Nov
or
7-Nov
for single digit days.
I've tried using various datetime and strptime commands to convert but I am getting errors.
Here's a way to do it:
df = pd.DataFrame({'date': ["2014-10-23","2016-09-08"]})
df['date_new'] = pd.to_datetime(df['date'])
df['date_new'] = df['date_new'].dt.strftime("%d-%b")
date date_new
0 2014-10-23 23-Oct
1 2016-09-08 08-Sep
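Note that %d always zero-pads the day, while the question asks for 7-Nov rather than 07-Nov. Unpadded directives such as %-d are platform-specific, so one portable sketch builds the string from the integer day instead. It assumes an English locale for the month abbreviation:

```python
import pandas as pd

df = pd.DataFrame({'date': ['2019-11-24', '2019-11-07']})
dates = pd.to_datetime(df['date'])

# Integer day (no padding) joined with the abbreviated month name
df['date_new'] = dates.dt.day.astype(str) + '-' + dates.dt.strftime('%b')
print(df)
```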
I am trying to convert a column in my dataframe to dates, which are meant to be birthdays. The data was manually captured over a period of years with different formats. I can't get pandas to format the whole column correctly.
formats include:
YYYYMMDD
DDMMYYYY
DD/MM/YYYY
DD-MMM-YYYY (eg JAN)
I have tried
dates['BIRTH-DATE(MAIN)'] = pd.to_datetime(dates['BIRTH-DATE(MAIN)'])
but I get the error
ValueError: year 19670314 is out of range
Not sure how I can get it to include multiple date formats?
You could create your own function to handle this. For example, something like:
df = pd.DataFrame({'date': {0: '20180101', 1: '01022018', 2: '01/02/2018', 3: '01-JAN-2018'}})
def fix_date(series, patterns=['%Y%m%d', '%d%m%Y', '%d/%m/%Y', '%d-%b-%Y']):
    datetimes = []
    for pat in patterns:
        datetimes.append(pd.to_datetime(series, format=pat, errors='coerce'))
    return pd.concat(datetimes, axis=1).ffill(axis=1).iloc[:, -1]
df['fixed_dates'] = fix_date(df['date'])
print(df)
[out]
date fixed_dates
0 20180101 2018-01-01
1 01022018 2018-02-01
2 01/02/2018 2018-02-01
3 01-JAN-2018 2018-01-01
In my view pandas is really good at converting dates, but it is nearly impossible to always guess the right format automatically. Use pd.to_datetime with the option errors='coerce' and check by hand the dates that were not converted.
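One way to combine both suggestions is sketched below: try each known format with errors='coerce', keep the first successful parse, and list whatever is still NaT for manual review. The function name parse_mixed_dates and the sample values are made up for illustration, and the order of the patterns decides which reading wins for ambiguous strings.

```python
import pandas as pd

def parse_mixed_dates(series, patterns=('%Y%m%d', '%d%m%Y', '%d/%m/%Y', '%d-%b-%Y')):
    """Try each format in order; the first one that parses a value wins."""
    result = pd.to_datetime(series, format=patterns[0], errors='coerce')
    for pat in patterns[1:]:
        # Only fill the rows that every earlier pattern failed to parse
        result = result.fillna(pd.to_datetime(series, format=pat, errors='coerce'))
    return result

# Hypothetical sample covering the four formats plus one bad value
raw = pd.Series(['19670314', '14031967', '14/03/1967', '14-MAR-1967', 'unknown'])
parsed = parse_mixed_dates(raw)

# Rows that no pattern could parse, to be checked by hand
unparsed = raw[parsed.isna()]
print(parsed)
print(unparsed)
```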
I am learning Python and ran into an issue reading a timestamp from a CSV file in the format below:
43:32.0
Here 43 is in the hours position, and I want to convert it to a datetime in pandas.
I tried this code:
df['time'] = df['time'].astype(str).str[:-2]
df['time'] = pd.to_datetime(df['time'], errors='coerce')
But, this is converting all values to NaT
I need the output to be in format - mm/dd/yyyy hh:mm:ss
I'm going to assume that this is a Date for 11-29-17 (today's date)?
I believe you need to add an extra 0: in the beginning of the string. Basic Example:
import pandas as pd
# creating a dataframe of your string
df1 = pd.DataFrame({'A':['43:32.0']})
# adding '0:' to the front
df1['A'] = '0:' + df1['A'].astype(str)
# making new column to show the output
df1['B'] = pd.to_datetime(df1['A'], errors='coerce')
#output
A B
0 0:43:32.0 2017-11-29 00:43:32
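An alternative sketch that does not depend on the current date treats the value as a duration (minutes:seconds) and adds it to an explicit base date. The base date here is an assumption taken from the answer above, and the column name B_str is made up for illustration:

```python
import pandas as pd

df1 = pd.DataFrame({'A': ['43:32.0', '05:07.5']})

# Prefix zero hours so the string reads as HH:MM:SS.f, then parse as a duration
elapsed = pd.to_timedelta('00:' + df1['A'])

# Assumed base date; replace with the date the readings actually belong to
base_date = pd.Timestamp('2017-11-29')
df1['B'] = base_date + elapsed

# Text rendering as mm/dd/yyyy hh:mm:ss (fractional seconds are dropped)
df1['B_str'] = df1['B'].dt.strftime('%m/%d/%Y %H:%M:%S')
print(df1)
```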
I have a particular format of date in my dataframe as
df:
Date
12-Jun-16
22-Jan-12
I want to convert it to this format
df:
Date
12-Jan-2015
Any help as to how to do it?
I think you need to convert the column with to_datetime and then, if you need to change the format, add strftime:
df.Date = pd.to_datetime(df.Date).dt.strftime('%d-%b-%Y')
print (df)
Date
0 12-Jun-2016
1 22-Jan-2012
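Passing the input format explicitly avoids relying on format inference. A minimal sketch, assuming every value follows the day-abbreviated month-two digit year layout shown in the question:

```python
import pandas as pd

df = pd.DataFrame({'Date': ['12-Jun-16', '22-Jan-12']})

# Parse with the known input layout, then render with a four-digit year
parsed = pd.to_datetime(df['Date'], format='%d-%b-%y')
df['Date'] = parsed.dt.strftime('%d-%b-%Y')
print(df)
```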
I have a dataframe as follows, and I am trying to reduce the dataframe to only contain rows for which the Date is greater than a variable curve_enddate. The df['Date'] is in datetime and hence I'm trying to convert curve_enddate[i][0] which gives a string of the form 2015-06-24 to datetime but am getting the error ValueError: time data '2015-06-24' does not match format '%Y-%b-%d'.
Date Maturity Yield_pct Currency
0 2015-06-24 0.25 na CAD
1 2015-06-25 0.25 0.0948511020 CAD
The line where I get the Error:
df = df[df['Date'] > time.strptime(curve_enddate[i][0], '%Y-%b-%d')]
Thank You
You are using the wrong date format: %b is for named months (abbreviations like Jan or Feb), so use %m for numbered months.
Code -
df = df[df['Date'] > time.strptime(curve_enddate[i][0], '%Y-%m-%d')]
You cannot compare a time.struct_time, which is what time.strptime returns, to a Timestamp, so you need to change that as well as switching to '%Y-%m-%d', where %m is the month as a decimal number. You can use pd.to_datetime to create the object to compare, passing the format by keyword because the second positional argument of pd.to_datetime is errors:
df = df[df['Date'] > pd.to_datetime(curve_enddate[i][0], format='%Y-%m-%d')]
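A self-contained sketch of that comparison, using a small made-up frame and a plain string standing in for curve_enddate[i][0]:

```python
import pandas as pd

df = pd.DataFrame({
    'Date': pd.to_datetime(['2015-06-24', '2015-06-25', '2015-06-26']),
    'Maturity': [0.25, 0.25, 0.25],
})

# Hypothetical stand-in for curve_enddate[i][0]
curve_enddate = '2015-06-24'

# Convert the string once, then compare Timestamp to Timestamp
cutoff = pd.to_datetime(curve_enddate, format='%Y-%m-%d')
filtered = df[df['Date'] > cutoff]
print(filtered)
```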