Convert string column to DateTime format - python

I have a DataFrame column whose values are strings such as 'June 6, 2016, 6', and I want to convert them to datetime in 'YYYY-MM-DD HH:MM' format.
When I convert a single value on its own, I get the right result.
import datetime
stringDate = "June 6, 2016, 11"
dateObject = datetime.datetime.strptime(stringDate, "%B %d, %Y, %H")
print(dateObject)
**Output : 2016-06-06 11:00:00**
But when I tried different options to apply the same conversion to a pandas DataFrame column, the time part is missing from the result.
**Option1**
df['Date'] = df.Date.apply(lambda x: dt.datetime.strptime(x, "%B %d, %Y, %H").date())
**Option2**
df['Date'] = pd.to_datetime(df.Date.apply(lambda x: dt.datetime.strptime(x, "%B %d, %Y, %H")))
Output: in both cases I got 2016-06-06
Any suggestions will be appreciated.

I think you need to pass the format parameter to to_datetime:
print (pd.to_datetime('June 6, 2016, 11', format='%B %d, %Y, %H'))
2016-06-06 11:00:00
It works with DataFrame too:
df = pd.DataFrame({'Date':['June 6, 2016, 11', 'May 6, 2016, 11']})
print (df)
               Date
0  June 6, 2016, 11
1   May 6, 2016, 11
print (pd.to_datetime(df['Date'], format='%B %d, %Y, %H'))
0   2016-06-06 11:00:00
1   2016-05-06 11:00:00
Name: Date, dtype: datetime64[ns]
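Once the column is parsed, the 'YYYY-MM-DD HH:MM' text asked for in the question can be produced as a separate formatting step. The following is a minimal sketch, assuming the sample frame above; the column name Formatted is made up for illustration. Note that Option 1 in the question loses the hour because .date() discards the time part.

```python
import pandas as pd

df = pd.DataFrame({'Date': ['June 6, 2016, 11', 'May 6, 2016, 11']})

# Parse once; the column keeps full timestamps (a datetime64 dtype).
df['Date'] = pd.to_datetime(df['Date'], format='%B %d, %Y, %H')

# 'Formatted' is a hypothetical column name holding display strings.
df['Formatted'] = df['Date'].dt.strftime('%Y-%m-%d %H:%M')
print(df['Formatted'].tolist())
```

Keeping the datetime column and formatting only for display means sorting and date arithmetic still work on the parsed values.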

Related

Date and time format from string

I'm converting the date in this string as shown below, but I get the error "time data 'Aug 6, 2022, 10:44 AM' does not match format '%m %d, %Y, %I:%Mp'".
fechaDAT = 'Aug 6, 2022, 10:44 AM'
dateC = datetime.strptime(fechaDAT, "%m %d, %Y, %I:%Mp")
Here is the right format (the original used %m, which expects a numeric month, and a literal p instead of %p):
dateC = datetime.strptime(fechaDAT, "%b %d, %Y, %I:%M %p")
%b is the abbreviated month name
%p is the locale's AM/PM marker
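To see the corrected format string in action, here is a small sketch using only the standard library. The PM string is a made-up variant of the question's value, and the example assumes an English (or C) locale, since %b and %p are locale-dependent.

```python
from datetime import datetime

fmt = "%b %d, %Y, %I:%M %p"

morning = datetime.strptime("Aug 6, 2022, 10:44 AM", fmt)
# Made-up PM variant to show that %I together with %p yields a 24-hour value.
evening = datetime.strptime("Aug 6, 2022, 10:44 PM", fmt)

print(morning)       # 2022-08-06 10:44:00
print(evening.hour)  # 22
```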

Creating a column with certain position of value from a cell in pandas

I have a dataset with a column "date" with values like "Jul 31, 2014", "Sep 23, 2018"...
I want to put the months in a separate column, convert them to integers using "pd.to_datetime(df.MONTH, format='%b').dt.month", and then use that to sort by the date index.
How can I choose only the first 3 letters from the cells?
You can try to_datetime with the date format %b %d, %Y:
df["date"] = pd.to_datetime(df["date"], format='%b %d, %Y')
df["month"] = df["date"].dt.month
Code:
print(df)
# date
# 0 Jul 31, 2014
# 1 Sep 23, 2018
df["date"] = pd.to_datetime(df["date"], format='%b %d, %Y')
df["month"] = df["date"].dt.month
print(df)
# date month
# 0 2014-07-31 7
# 1 2018-09-23 9
For more detail on date format codes, refer to the documentation.
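For the literal question of taking the first three letters, string slicing through the .str accessor is one option. This is a sketch under assumptions: the column names month_abbr and parsed are hypothetical, and the rows are deliberately out of order so the sort is visible.

```python
import pandas as pd

df = pd.DataFrame({"date": ["Sep 23, 2018", "Jul 31, 2014"]})

# First three characters of each cell, e.g. "Sep", "Jul".
df["month_abbr"] = df["date"].str[:3]  # hypothetical column name

# Turn the abbreviation into a month number.
df["month"] = pd.to_datetime(df["month_abbr"], format="%b").dt.month

# Sorting by the fully parsed date also respects year and day.
df["parsed"] = pd.to_datetime(df["date"], format="%b %d, %Y")  # hypothetical column name
df = df.sort_values("parsed").reset_index(drop=True)
print(df)
```

Sorting on the month number alone would put Jul 2014 and Jul 2018 next to each other, so the parsed date is usually the safer sort key.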

Changing date string to pandas datestamp

I have a dataframe with column date which looks like this:
Feb 24, 2020 # 12:47:31.616
I would like it to become this:
2020-02-24
I can achieve this by slicing, since I am dealing with only one week's data and every month will be Feb.
Is there a neat pandas way to change the datestamp to the date format I want?
Thank you for your suggestions.
Use to_datetime with format %b %d, %Y # %H:%M:%S.%f and then if necessary convert to dates by Series.dt.date or to datetimes by Series.dt.floor:
#dates
df = pd.DataFrame({'dates':['Feb 24, 2020 # 12:47:31.616','Feb 24, 2020 # 12:47:31.616']})
df['dates'] = pd.to_datetime(df['dates'], format='%b %d, %Y # %H:%M:%S.%f').dt.date
#datetimes
df['dates'] = pd.to_datetime(df['dates'], format='%b %d, %Y # %H:%M:%S.%f').dt.floor('d')
print (df)
dates
0 2020-02-24
1 2020-02-24
Using pd.to_datetime with Series.str.split:
df = pd.DataFrame({'date':['Feb 24, 2020 # 12:47:31.616']})
date
0 Feb 24, 2020 # 12:47:31.616
df['date'] = pd.to_datetime(df['date'].str.split(r'\s#\s').str[0], format='%b %d, %Y')
date
0 2020-02-24
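The difference between Series.dt.date and Series.dt.floor is easiest to see by inspecting the results. A minimal sketch, assuming the sample string from the question; the variable names are illustrative only.

```python
import pandas as pd

raw = pd.Series(['Feb 24, 2020 # 12:47:31.616'])
parsed = pd.to_datetime(raw, format='%b %d, %Y # %H:%M:%S.%f')

# Plain Python date objects; the column is no longer a datetime64 column.
as_dates = parsed.dt.date

# Still datetime64, with the time truncated to midnight.
as_midnight = parsed.dt.floor('D')

print(as_dates.iloc[0])
print(as_midnight.iloc[0])
print(as_midnight.dtype)
```

If the column will be used for further date arithmetic or resampling, keeping it as datetime64 via floor is usually more convenient than converting to date objects.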

python: convert column from string to datetime with mixed formats

I've converted from string to datetimes in columns numerous times. However in each of those instances, the string format was consistent. Now I have a dataframe with mixed formats to change. Example below, but this is throughout 100,000s of rows.
index date
0 30 Jan 2018
1 January 30 2018
I could convert each type on an individual basis, but is there a way to convert that df['date'] to datetime with mixed formats easily?
Here is a module that can do this for you: dateparser.
from dateparser import parse
print(parse('2018-04-18 22:33:40'))
print(parse('Wed 11 Jul 2018 23:00:00 GMT'))
Output:
datetime.datetime(2018, 4, 18, 22, 33, 40)
datetime.datetime(2018, 7, 11, 23, 0, tzinfo=<StaticTzInfo 'GMT'>)
Here is a way to do it using datetime.strptime
from datetime import datetime
def IsNumber(s):
    try:
        int(s)
        return True
    except ValueError:
        return False

def ConvertToDatetime(date):
    date = date.split(" ")  # split by space
    if IsNumber(date[0]):  # of the form: day month year
        if len(date[1]) == 3:  # month is abbreviated (Jan, Feb, ...)
            datetime_object = datetime.strptime(" ".join(date), '%d %b %Y')
        else:  # month is spelled out (January, February, ...)
            datetime_object = datetime.strptime(" ".join(date), '%d %B %Y')
    else:  # of the form: month day year
        if len(date[0]) == 3:  # month is abbreviated (Jan, Feb, ...)
            datetime_object = datetime.strptime(" ".join(date), '%b %d %Y')
        else:  # month is spelled out (January, February, ...)
            datetime_object = datetime.strptime(" ".join(date), '%B %d %Y')
    return datetime_object
You can add more cases based on the documentation and the formats you need.
Examples for the two formats in your question:
ConvertToDatetime("30 Jan 2018")
2018-01-30 00:00:00
ConvertToDatetime("January 30 2018")
2018-01-30 00:00:00
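An alternative to branching on the shape of the string is to try a list of candidate formats in order. This is a sketch, not the answerer's method; parse_with_formats and FORMATS are hypothetical names, and the format list only covers the layouts discussed here.

```python
from datetime import datetime

# Candidate formats, tried in order; extend this tuple for other layouts.
FORMATS = ('%d %b %Y', '%d %B %Y', '%b %d %Y', '%B %d %Y')

def parse_with_formats(text, formats=FORMATS):
    """Return the first successful strptime result, or None if nothing matches."""
    for fmt in formats:
        try:
            return datetime.strptime(text, fmt)
        except ValueError:
            continue
    return None

print(parse_with_formats("30 Jan 2018"))      # 2018-01-30 00:00:00
print(parse_with_formats("January 30 2018"))  # 2018-01-30 00:00:00
print(parse_with_formats("not a date"))       # None
```

Returning None instead of raising makes it easy to find the rows that need a new format added to the list.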

Convert different date format string into datetime format

I have a date column that contains dates in different formats:
publish_date = ["Feb. 2, 2000", "June 4, 1989", "Mar. 13, 2018"]
I was using strptime() to convert one type of string. How can I convert multiple date formats in the same column?
type 1: %b %d, %Y
type 2: %B %d, %Y
You could use the third-party dateparser module.
Install it with pip install dateparser, then:
>>> import dateparser
>>> publish_date = ["Feb. 2, 2000", "June 4, 1989", "Mar. 13, 2018"]
>>> for d in publish_date:
...     print(dateparser.parse(d))
...
2000-02-02 00:00:00
1989-06-04 00:00:00
2018-03-13 00:00:00
dateparser accepts a huge range of formats, but you can restrict it to just the ones you're interested in if you like:
>>> for d in publish_date:
...     print(dateparser.parse(d, date_formats=['%b %d, %Y', '%B %d, %Y']))
...
2000-02-02 00:00:00
1989-06-04 00:00:00
2018-03-13 00:00:00
You can also use dateutil
Demo:
from dateutil.parser import parse
publish_date = ["Feb. 2, 2000", "June 4, 1989", "Mar. 13, 2018"]
for date in publish_date:
    print(parse(date))
Output:
2000-02-02 00:00:00
1989-06-04 00:00:00
2018-03-13 00:00:00
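If adding a third-party dependency is not an option, another approach is to normalize the month to its three-letter abbreviation first, so that a single strptime format covers every row. This is only a sketch: normalize_month is a hypothetical helper, and it assumes English month names at the start of the string, optionally followed by a period.

```python
import re
from datetime import datetime

publish_date = ["Feb. 2, 2000", "June 4, 1989", "Mar. 13, 2018"]

def normalize_month(text):
    """Shorten the leading month name to three letters and drop a trailing period."""
    return re.sub(r'^\s*([A-Za-z]{3})[A-Za-z]*\.?', r'\1', text)

# After normalization every value matches the single format '%b %d, %Y'.
parsed = [datetime.strptime(normalize_month(d), '%b %d, %Y') for d in publish_date]
for p in parsed:
    print(p)
```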
