Convert different date format string into datetime format - python

I have a column of dates in different formats:
publish_date = ["Feb. 2, 2000", "June 4, 1989", "Mar. 13, 2018"]
I was using strptime() to convert one type of string. How can I convert multiple date formats in the same column?
type 1: %b %d, %Y
type 2: %B %d, %Y

You could use the third-party dateparser module.
Install it with pip install dateparser, then:
>>> import dateparser
>>> publish_date = ["Feb. 2, 2000", "June 4, 1989", "Mar. 13, 2018"]
>>> for d in publish_date:
...     print(dateparser.parse(d))
...
2000-02-02 00:00:00
1989-06-04 00:00:00
2018-03-13 00:00:00
dateparser accepts a huge range of formats, but you can restrict it to just the ones you're interested in if you like:
>>> for d in publish_date:
...     print(dateparser.parse(d, date_formats=['%b %d, %Y', '%B %d, %Y']))
...
2000-02-02 00:00:00
1989-06-04 00:00:00
2018-03-13 00:00:00

You can also use dateutil
Demo:
from dateutil.parser import parse
publish_date = ["Feb. 2, 2000", "June 4, 1989", "Mar. 13, 2018"]
for date in publish_date:
    print(parse(date))
Output:
2000-02-02 00:00:00
1989-06-04 00:00:00
2018-03-13 00:00:00
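If a third-party dependency is not an option, the same result can be sketched with strptime alone by trying each known format in turn. This is only a minimal sketch: parse_publish_date is a hypothetical helper name, the two formats are the ones listed in the question, and removing the period after abbreviated months is an assumption based on the sample data ("Feb.", "Mar."). It also assumes English month names, i.e. the default C locale.

```python
from datetime import datetime

# The two formats named in the question.
FORMATS = ["%b %d, %Y", "%B %d, %Y"]

def parse_publish_date(text):
    """Try each known format in turn and return the first successful parse."""
    # Assumption: abbreviated months carry a trailing period ("Feb."), which
    # %b does not accept, so periods are removed before parsing.
    cleaned = text.replace(".", "")
    for fmt in FORMATS:
        try:
            return datetime.strptime(cleaned, fmt)
        except ValueError:
            continue  # this format did not match, try the next one
    raise ValueError("no known format matches {!r}".format(text))

publish_date = ["Feb. 2, 2000", "June 4, 1989", "Mar. 13, 2018"]
for d in publish_date:
    print(parse_publish_date(d))
```

Raising an error when nothing matches, rather than returning None, makes unexpected formats visible early; that choice is a matter of taste.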

Related

python: convert column from string to datetime with mixed formats

I've converted from string to datetimes in columns numerous times. However in each of those instances, the string format was consistent. Now I have a dataframe with mixed formats to change. Example below, but this is throughout 100,000s of rows.
index date
0 30 Jan 2018
1 January 30 2018
I could convert each type on an individual basis, but is there a way to convert that df['date'] to datetime with mixed formats easily?
The dateparser module can do this for you:
from dateparser import parse
print(repr(parse('2018-04-18 22:33:40')))
print(repr(parse('Wed 11 Jul 2018 23:00:00 GMT')))
Output:
datetime.datetime(2018, 4, 18, 22, 33, 40)
datetime.datetime(2018, 7, 11, 23, 0, tzinfo=<StaticTzInfo 'GMT'>)
Here is a way to do it using datetime.strptime:
from datetime import datetime
def IsNumber(s):
    try:
        int(s)
        return True
    except ValueError:
        return False
def ConvertToDatetime(date):
    date = date.split(" ")  # split by space
    if IsNumber(date[0]):  # is of the form dd month year
        if len(date[1]) == 3:  # month is of the form Jan, Feb, ...
            datetime_object = datetime.strptime(" ".join(date), '%d %b %Y')
        else:  # month is of the form January, February, ...
            datetime_object = datetime.strptime(" ".join(date), '%d %B %Y')
    else:  # is of the form month date year
        if len(date[0]) == 3:  # month is of the form Jan, Feb, ...
            datetime_object = datetime.strptime(" ".join(date), '%b %d %Y')
        else:  # month is of the form January, February, ...
            datetime_object = datetime.strptime(" ".join(date), '%B %d %Y')
    return datetime_object
You can add more cases based on the strptime documentation and the formats in your data.
For the two examples in your question:
print(ConvertToDatetime("30 Jan 2018"))
2018-01-30 00:00:00
print(ConvertToDatetime("January 30 2018"))
2018-01-30 00:00:00
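With hundreds of thousands of rows, the same four cases can also be written as a loop over candidate formats, and repeated strings can be memoized so that each distinct value is parsed only once. The following is a rough sketch using only the standard library; CANDIDATE_FORMATS and parse_mixed are hypothetical names, and the format list is an assumption that covers just the four layouts discussed here.

```python
from datetime import datetime
from functools import lru_cache

# Assumed set of layouts: day-first and month-first, short and long month names.
CANDIDATE_FORMATS = ("%d %b %Y", "%d %B %Y", "%b %d %Y", "%B %d %Y")

@lru_cache(maxsize=None)
def parse_mixed(text):
    """Parse one date string, caching the result for repeated inputs."""
    for fmt in CANDIDATE_FORMATS:
        try:
            return datetime.strptime(text, fmt)
        except ValueError:
            continue  # not this layout, try the next one
    raise ValueError("no known format matches {!r}".format(text))

rows = ["30 Jan 2018", "January 30 2018", "30 Jan 2018"]
parsed = [parse_mixed(r) for r in rows]
print(parsed[0], parsed[1])
print(parse_mixed.cache_info().hits)  # the third row is served from the cache
```

On a DataFrame, something like df['date'].map(parse_mixed) should apply it to the whole column. Whether the cache helps depends on how many distinct strings the column contains.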

How to get date from this format "Tue May 15 2018 00:00:00 GMT-0400 (EDT)" in python

How can I convert a date in the format "Tue May 15 2018 00:00:00 GMT-0400 (EDT)" to a date in the format yyyy-mm-dd?
dateutil is your friend:
>>> import dateutil.parser
>>> dt = dateutil.parser.parse('Tue May 15 2018 00:00:00 GMT-0400 (EDT)')
>>> dt
datetime.datetime(2018, 5, 15, 0, 0, tzinfo=tzoffset('EDT', 14400))
>>> dt.strftime('%Y-%m-%d')
'2018-05-15'
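You can try strptime on the date portion of the string (its first 15 characters), then format the result with strftime:
from datetime import datetime
time_string = 'Tue May 15 2018 00:00:00 GMT-0400 (EDT)'
date_converted = datetime.strptime(time_string[:15], '%a %b %d %Y')
print(date_converted.strftime('%Y-%m-%d'))
More info on the format codes is in the strftime/strptime section of the datetime docs.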
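The string can also be parsed with the standard library alone while keeping the UTC offset. This is a minimal sketch, assuming the input always ends with a parenthesised zone name as in the question; to_iso_date is a hypothetical helper name. The zone name is dropped before parsing because strptime's %Z only recognises a small set of names.

```python
from datetime import datetime

def to_iso_date(text):
    """Return the yyyy-mm-dd date from a 'Tue May 15 2018 ... GMT-0400 (EDT)' string."""
    # Assumption: the zone name is the final " (XXX)" part; keep what precedes it.
    head = text.split(" (")[0]
    # "GMT" is matched literally and %z consumes the numeric offset.
    parsed = datetime.strptime(head, "%a %b %d %Y %H:%M:%S GMT%z")
    return parsed.strftime("%Y-%m-%d")

print(to_iso_date("Tue May 15 2018 00:00:00 GMT-0400 (EDT)"))  # 2018-05-15
```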

How to convert string date with timezone to datetime?

I have a date as a string:
Tue Oct 04 2016 12:13:00 GMT+0200 (CEST)
and I use (according to https://docs.python.org/2/library/datetime.html#strftime-strptime-behavior):
datetime.strptime(datetime_string, '%a %b %m %Y %H:%M:%S %z %Z')
but I get error:
ValueError: 'z' is a bad directive in format '%a %b %m %Y %H:%M:%S %z %Z'
How to do it correctly?
%z is the +0200, %Z is CEST. Therefore:
>>> s = "Tue Oct 04 2016 12:13:00 GMT+0200 (CEST)"
>>> datetime.strptime(s, '%a %b %d %Y %H:%M:%S GMT%z (%Z)')
datetime.datetime(2016, 10, 4, 12, 13, tzinfo=datetime.timezone(datetime.timedelta(0, 7200), 'CEST'))
I also replaced your %m with %d; %m is the month, numerically, so in your case 04 would be parsed as April.
Python's datetime can't parse the GMT part on its own (you may need to write it literally in your format string). You can use dateutil instead:
In [16]: s = 'Tue Oct 04 2016 12:13:00 GMT+0200 (CEST)'
In [17]: from dateutil import parser
In [18]: d = parser.parse(s)
In [19]: d
Out[19]: datetime.datetime(2016, 10, 4, 12, 13, tzinfo=tzoffset(u'CEST', -7200))
In [30]: d.utcoffset()
Out[30]: datetime.timedelta(-1, 79200)
In [31]: d.tzname()
Out[31]: 'CEST'
A simpler way to achieve this, without having to deal with datetime format codes, is to use dateutil.parser.parse(). For example:
>>> import dateutil.parser
>>> date_string = 'Tue Oct 04 2016 12:13:00 GMT+0200 (CEST)'
>>> dateutil.parser.parse(date_string)
datetime.datetime(2016, 10, 4, 12, 13, tzinfo=tzoffset(u'CEST', -7200))
If you want to parse all the datetime data in a column of a pandas DataFrame, you can use the apply method together with dateutil.parser.parse to parse the whole column:
from dateutil.parser import parse
df['col_name'] = df['col_name'].apply(parse)
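The strptime answer and the dateutil outputs quoted in this thread report opposite signs for the offset (+7200 seconds versus -7200), so it is worth checking which instant a parsed value actually refers to. The sketch below uses only the standard library and reads "GMT+0200" as "two hours ahead of UTC", which is what the string means if it came from JavaScript's Date.toString(); that origin is an assumption. The zone name in parentheses is dropped so the example does not depend on %Z.

```python
from datetime import datetime, timezone

s = "Tue Oct 04 2016 12:13:00 GMT+0200 (CEST)"

# Parse everything before " (CEST)"; "GMT" is literal, %z reads "+0200".
local = datetime.strptime(s.split(" (")[0], "%a %b %d %Y %H:%M:%S GMT%z")
print(local.utcoffset())   # 2:00:00

# Normalising to UTC makes the intended instant explicit.
as_utc = local.astimezone(timezone.utc)
print(as_utc.isoformat())  # 2016-10-04T10:13:00+00:00
```

If the UTC time printed here is not the instant your data is supposed to describe, the offset sign is being read the other way round.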

Convert string column to DateTime format

I have a DataFrame column whose values are strings such as 'June 6, 2016, 6', and I want to convert it to datetime in 'YYYY-MM-DD HH:MM' format.
When I tried the conversion on a single value, I was able to convert it to the right format.
import datetime
stringDate = "June 6, 2016, 11"
dateObject = datetime.datetime.strptime(stringDate, "%B %d, %Y, %H")
print(dateObject)
**Output : 2016-06-06 11:00:00**
But when I tried different options to apply the same conversion to a DataFrame column, I did not get the time part in the result.
**Option1**
df['Date'] = df.Date.apply(lambda x: dt.datetime.strptime(x, "%B %d, %Y, %H").date())
**Option2**
df['Date'] = pd.to_datetime(df['Date'] = df.Date.apply(lambda x: dt.datetime.strptime(x, "%B %d, %Y, %H"))
Output: in both cases I got 2016-06-06
Any suggestions will be appreciated.
I think you need to add the format parameter to to_datetime:
print (pd.to_datetime('June 6, 2016, 11', format='%B %d, %Y, %H'))
2016-06-06 11:00:00
It works with DataFrame too:
df = pd.DataFrame({'Date':['June 6, 2016, 11', 'May 6, 2016, 11']})
print (df)
               Date
0  June 6, 2016, 11
1   May 6, 2016, 11
print (pd.to_datetime(df['Date'], format='%B %d, %Y, %H'))
0   2016-06-06 11:00:00
1   2016-05-06 11:00:00
Name: Date, dtype: datetime64[ns]
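The missing time in Option1 most likely comes from the trailing .date() call, which discards the time component of the parsed value. A small standard-library sketch of the difference, using the sample value from the question (the variable names are only illustrative):

```python
from datetime import datetime

parsed = datetime.strptime("June 6, 2016, 11", "%B %d, %Y, %H")

print(parsed)                             # 2016-06-06 11:00:00
print(parsed.date())                      # 2016-06-06  (time part dropped)
print(parsed.strftime("%Y-%m-%d %H:%M"))  # 2016-06-06 11:00
```

Note that strftime returns a string, so it suits display; for further date arithmetic the datetime value itself is the one to keep.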

How to convert "Tue Aug 25 10:00:00 2015" this time stamp to "2015-08-25 10:00" in python

How can I convert the timestamp "Tue Aug 25 10:00:00 2015" to "2015-08-25 10:00" in Python?
from datetime import datetime
date_object = datetime.strptime('Jun 1 2005 1:33PM', '%b %d %Y %I:%M%p')
With the correct format string, you can use datetime.strptime to parse the string and format it again:
import datetime
date = datetime.datetime.strptime('Tue Aug 25 10:00:00 2015', '%a %b %d %H:%M:%S %Y')
print(date.strftime('%Y-%m-%d %H:%M'))
Use the dateutil parser (install it with pip install python-dateutil):
>>> from dateutil import parser
>>> str(parser.parse("Tue Aug 25 10:00:00 2015"))
'2015-08-25 10:00:00'
