I'm trying to convert strings in a list to datetime format in Python. I am unable to use pd.DateTime at the moment, and the imported datetime package doesn't seem to work. I'm new to this.
Please help.
Cheers.
You should consider using official datetime formats
Example:
from datetime import datetime
#datetime(year, month, day)
date = datetime(2018, 11, 28)
# datetime(year, month, day, hour, minute, second, microsecond)
date = datetime(2017, 11, 28, 23, 55, 59, 342380)
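You can use datetime.strptime:
from datetime import datetime
# strptime is a method of the datetime class, not a module-level function.
# '%Y' matches a four-digit year; use '%y' if your strings have two-digit years.
my_datetime_list = [datetime.strptime(string_date, '%Y-%m-%d') for string_date in list_of_string_dates]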
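Here is a minimal runnable sketch of the list conversion above. The sample list, the parse_dates helper name and the '%Y-%m-%d' format are all assumptions, since the original code was only posted as an image; adjust the format string to match your data.

```python
from datetime import datetime

# Hypothetical input; the real list was not shown in the question.
list_of_string_dates = ["2018-11-28", "2017-11-28", "not a date"]

def parse_dates(strings, fmt="%Y-%m-%d"):
    """Return (parsed, rejected): datetimes that matched fmt, and strings that did not."""
    parsed, rejected = [], []
    for text in strings:
        try:
            parsed.append(datetime.strptime(text, fmt))
        except ValueError:
            # strptime raises ValueError when the string does not match the format
            rejected.append(text)
    return parsed, rejected

parsed, rejected = parse_dates(list_of_string_dates)
print(parsed)
print(rejected)
```

Collecting the rejected strings separately makes it easier to spot which entries need a different format.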
What is the best method in Python to convert a string to a given format? My problem is that I have scraped dates that have the following format: Dec 13, 2019 6:01 am
Ideally I want to analyse the scraped data in Excel, but unfortunately Excel cannot read this date format.
Do you think it is best to do that in Python or in Excel?
Thanks
You can definitely do this in Python, using either the standard library or the dateparser package.
>>> import dateparser
>>> dateparser.parse('Dec 13, 2019 6:01 am')
datetime.datetime(2019, 12, 13, 6, 1)
Or directly to ISO format:
>>> dateparser.parse('Dec 13, 2019 6:01 am').isoformat()
'2019-12-13T06:01:00'
Another thing to look out for when working with time programmatically is the time zone, which is where bugs are very likely to appear. There's a very nice package for working with datetime data in Python called pendulum, and I cannot stress enough how convenient it is. Its API is largely compatible with the standard library's datetime, so in most cases you can just do import pendulum as dt instead of import datetime as dt and it will work.
It also has a great parser tool with support for time zones:
>>> import pendulum
>>> dt = pendulum.parse('1975-05-21T22:00:00')
>>> print(dt)
'1975-05-21T22:00:00+00:00'
# You can pass a tz keyword to specify the timezone
>>> dt = pendulum.parse('1975-05-21T22:00:00', tz='Europe/Paris')
>>> print(dt)
'1975-05-21T22:00:00+01:00'
# Not ISO 8601 compliant but common
>>> dt = pendulum.parse('1975-05-21 22:00:00')
By passing the tz keyword argument you can parse and specify time zone at the same time.
You can use strptime() to convert a string to a datetime object.
>>> from datetime import datetime
>>> parsed = datetime.strptime("Dec 13, 2019 6:01 am", "%b %d, %Y %I:%M %p")
>>> parsed.strftime("%d-%m-%Y %H:%M")
'13-12-2019 06:01'
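For the Excel use case, a small sketch along the same lines: parse the scraped text and write it back out year-first. The helper name to_excel_friendly and the output format are my assumptions; whether Excel recognises the result as a date can depend on your regional settings.

```python
from datetime import datetime

SCRAPED_FORMAT = "%b %d, %Y %I:%M %p"  # e.g. "Dec 13, 2019 6:01 am"
EXCEL_FORMAT = "%Y-%m-%d %H:%M:%S"     # assumed target format, year first

def to_excel_friendly(text):
    """Parse a scraped date string and re-emit it in a year-first format."""
    return datetime.strptime(text, SCRAPED_FORMAT).strftime(EXCEL_FORMAT)

morning = to_excel_friendly("Dec 13, 2019 6:01 am")
evening = to_excel_friendly("Dec 13, 2019 6:01 pm")
print(morning)
print(evening)
```

Note that %b and %p are locale-dependent, so this assumes English month abbreviations and the default locale.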
You can use Python's built-in datetime library.
Check this: https://docs.python.org/3.6/library/datetime.html
I have a Python datetime string that is timezone aware and need to convert it to UTC timestamp.
'2016-07-15T10:00:00-06:00'
Most of the SO links talk about getting the current datetime in UTC, but not about converting a given datetime to UTC.
Hi this was a bit tricky, but here is my, probably far from perfect, answer:
[IN]
import datetime
import pytz
date_str = '2016-07-15T10:00:00-06:00'
# Have to get rid of that bothersome final colon for %z to work
datetime_object = datetime.datetime.strptime(date_str[:-3] + date_str[-2:],
                                             '%Y-%m-%dT%H:%M:%S%z')
datetime_object.astimezone(pytz.utc)
[OUT]
datetime.datetime(2016, 7, 15, 16, 0, tzinfo=<UTC>)
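On newer interpreters the colon workaround should not be needed. A minimal sketch, assuming Python 3.7 or later, where datetime.fromisoformat() accepts an offset written as -06:00, and using only the standard library instead of pytz:

```python
from datetime import datetime, timezone

date_str = "2016-07-15T10:00:00-06:00"

# fromisoformat() (Python 3.7+) keeps the -06:00 offset as tzinfo
aware = datetime.fromisoformat(date_str)

# Same instant, expressed in UTC
as_utc = aware.astimezone(timezone.utc)
print(as_utc.isoformat())

# POSIX timestamp (seconds since the epoch); it is the same for both objects
unix_seconds = aware.timestamp()
print(unix_seconds)
```

timestamp() on an aware datetime already accounts for the offset, so converting to UTC first is only needed if you want the UTC wall-clock fields.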
Why doesn't Python 2.7 include the Z character (Zulu, or zero offset) at the end of a UTC datetime object's isoformat string, unlike JavaScript?
>>> datetime.datetime.utcnow().isoformat()
'2013-10-29T09:14:03.895210'
Whereas in JavaScript:
>>> console.log(new Date().toISOString());
2013-10-29T09:38:41.341Z
Option: isoformat()
Python's datetime does not emit military time zone suffixes such as 'Z' for UTC. For a timezone-aware UTC datetime, the following simple string replacement does the trick:
In [1]: import datetime
In [2]: d = datetime.datetime(2014, 12, 10, 12, 0, 0, tzinfo=datetime.timezone.utc)  # Python 3.2+; on 2.7 use pytz.utc
In [3]: str(d).replace('+00:00', 'Z')
Out[3]: '2014-12-10 12:00:00Z'
str(d) is essentially the same as d.isoformat(sep=' ')
See: Datetime, Python Standard Library
Option: strftime()
Or you could use strftime to achieve the same effect:
In [4]: d.strftime('%Y-%m-%dT%H:%M:%SZ')
Out[4]: '2014-12-10T12:00:00Z'
Note: This option works only when you know the date specified is in UTC.
See: datetime.strftime()
Additional: Human Readable Timezone
Going further, you may want to display human-readable time zone information. Use pytz together with the strftime %Z flag:
In [5]: import pytz
In [6]: d = datetime.datetime(2014, 12, 10, 12, 0, 0, tzinfo=pytz.utc)
In [7]: d
Out[7]: datetime.datetime(2014, 12, 10, 12, 0, tzinfo=<UTC>)
In [8]: d.strftime('%Y-%m-%d %H:%M:%S %Z')
Out[8]: '2014-12-10 12:00:00 UTC'
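The same %Z output is available without pytz. A small sketch, assuming Python 3.2 or later, where datetime.timezone.utc exists:

```python
from datetime import datetime, timezone

# Aware datetime using the standard library's UTC object instead of pytz.utc
d = datetime(2014, 12, 10, 12, 0, 0, tzinfo=timezone.utc)
label = d.strftime('%Y-%m-%d %H:%M:%S %Z')
print(label)

# For a naive datetime, %Z expands to an empty string
naive_label = datetime(2014, 12, 10, 12, 0, 0).strftime('%Z')
print(repr(naive_label))
```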
Python datetime objects don't have time zone info by default, and without it, Python actually violates the ISO 8601 specification (if no time zone info is given, the time is assumed to be local time). You can use the pytz package to get some default time zones, or subclass tzinfo directly yourself:
from datetime import datetime, tzinfo, timedelta
class simple_utc(tzinfo):
    # Minimal fixed-offset UTC tzinfo; datetime passes dt positionally to both methods
    def tzname(self, dt):
        return "UTC"
    def utcoffset(self, dt):
        return timedelta(0)
Then you can manually add the time zone info to utcnow():
>>> datetime.utcnow().replace(tzinfo=simple_utc()).isoformat()
'2014-05-16T22:51:53.015001+00:00'
Note that this DOES conform to the ISO 8601 format, which allows either Z or +00:00 as the suffix for UTC. The latter is arguably closer to how the standard represents time zones in general (UTC is a special case).
Short answer
datetime.now(timezone.utc).isoformat().replace("+00:00", "Z")
Long answer
The "Z" is not included because datetime.now() and even datetime.utcnow() return timezone-naive datetimes, that is, datetimes with no timezone information attached. To get a timezone-aware datetime, you need to pass a timezone as an argument to datetime.now. For example:
from datetime import datetime, timezone
datetime.utcnow()
#> datetime.datetime(2020, 9, 3, 20, 58, 49, 22253)
# This is timezone naive
datetime.now(timezone.utc)
#> datetime.datetime(2020, 9, 3, 20, 58, 49, 22253, tzinfo=datetime.timezone.utc)
# This is timezone aware
Once you have a timezone aware timestamp, isoformat will include a timezone designation. Thus, you can then get an ISO 8601 timestamp via:
datetime.now(timezone.utc).isoformat()
#> '2020-09-03T20:53:07.337670+00:00'
"+00:00" is a valid ISO 8601 timezone designation for UTC. If you want to have "Z" instead of "+00:00", you have to do the replacement yourself:
datetime.now(timezone.utc).isoformat().replace("+00:00", "Z")
#> '2020-09-03T20:53:07.337670Z'
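The replacement can be wrapped in a small helper that also works for aware datetimes in other zones. This is just a sketch; the name to_zulu is hypothetical, and refusing naive input is my own design choice, not something the standard library does.

```python
from datetime import datetime, timedelta, timezone

def to_zulu(dt):
    """Return an ISO 8601 string in UTC with a Z suffix for an aware datetime."""
    if dt.tzinfo is None or dt.utcoffset() is None:
        # A naive datetime has no defined instant, so refuse to guess
        raise ValueError("naive datetime: attach a timezone first")
    return dt.astimezone(timezone.utc).isoformat().replace("+00:00", "Z")

utc_example = to_zulu(datetime(2020, 9, 3, 20, 53, 7, 337670, tzinfo=timezone.utc))
offset_example = to_zulu(datetime(2016, 7, 15, 10, 0, tzinfo=timezone(timedelta(hours=-6))))
print(utc_example)
print(offset_example)
```

Converting with astimezone(timezone.utc) first is what makes the replace() safe: after it, the offset is always +00:00.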
The following JavaScript and Python snippets give output in the same format. I think it's what you are looking for.
JavaScript
new Date().toISOString()
Python
from datetime import datetime
datetime.utcnow().isoformat(timespec='milliseconds') + 'Z'  # timespec needs Python 3.6+; slicing with [:-3] breaks when microseconds are 0
The output is the UTC (Zulu) time formatted as an ISO string with millisecond precision (three fractional digits) and a trailing Z.
2019-01-19T23:20:25.459Z
Your goal shouldn't be to add a Z character, it should be to generate a UTC "aware" datetime string in ISO 8601 format. The solution is to pass a UTC timezone object to datetime.now() instead of using datetime.utcnow():
from datetime import datetime, timezone
datetime.now(timezone.utc)
>>> datetime.datetime(2020, 1, 8, 6, 6, 24, 260810, tzinfo=datetime.timezone.utc)
datetime.now(timezone.utc).isoformat()
>>> '2020-01-08T06:07:04.492045+00:00'
That looks good, so let's see what Django and dateutil think:
from django.utils.timezone import is_aware
is_aware(datetime.now(timezone.utc))
>>> True
from dateutil.parser import isoparse
is_aware(isoparse(datetime.now(timezone.utc).isoformat()))
>>> True
Note that you need to use isoparse() from dateutil.parser because the Python documentation for datetime.fromisoformat() says it "does not support parsing arbitrary ISO 8601 strings".
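If you want to stay in the standard library for the common cases, here is a minimal sketch, assuming Python 3.7 or later. The helper name parse_utc_z is hypothetical. As far as I know, fromisoformat() only started accepting a trailing Z in Python 3.11, so the helper normalises it first.

```python
from datetime import datetime, timezone

def parse_utc_z(text):
    """Parse an ISO 8601 string, treating a trailing Z as +00:00."""
    if text.endswith("Z"):
        # Older fromisoformat() versions reject "Z", so rewrite it as an offset
        text = text[:-1] + "+00:00"
    return datetime.fromisoformat(text)

from_z = parse_utc_z("2020-01-08T06:07:04.492045Z")
from_offset = parse_utc_z("2020-01-08T06:07:04.492045+00:00")
print(from_z)
print(from_z == from_offset)
```

This covers strings produced by isoformat() itself; for arbitrary ISO 8601 input, dateutil's isoparse() remains the more general tool.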
Okay, the Python datetime object and the ISO 8601 string are both UTC "aware". Now let's look at what JavaScript thinks of the datetime string. Borrowing from this answer we get:
let date = '2020-01-08T06:07:04.492045+00:00';
const dateParsed = new Date(Date.parse(date))
document.write(dateParsed);
document.write("\n");
// Tue Jan 07 2020 22:07:04 GMT-0800 (Pacific Standard Time)
document.write(dateParsed.toISOString());
document.write("\n");
// 2020-01-08T06:07:04.492Z
document.write(dateParsed.toUTCString());
document.write("\n");
// Wed, 08 Jan 2020 06:07:04 GMT
Notes:
I approached this problem with a few goals:
generate a UTC "aware" datetime string in ISO 8601 format
use only Python Standard Library functions for datetime object and string creation
validate the datetime object and string with the Django timezone utility function, the dateutil parser and JavaScript functions
Note that this approach does not include a Z suffix and does not use utcnow(). But it's based on the recommendation in the Python documentation and it passes muster with both Django and JavaScript.
See also:
Stop using utcnow and utcfromtimestamp
What is the “right” JSON date format?
In Python >= 3.2 you can simply use this:
>>> from datetime import datetime, timezone
>>> datetime.now(timezone.utc).isoformat()
'2019-03-14T07:55:36.979511+00:00'
Python datetimes are a little clunky. Use arrow.
> str(arrow.utcnow())
'2014-05-17T01:18:47.944126+00:00'
Arrow has essentially the same API as datetime, but with timezones and some extra niceties that should be in the main library.
A format compatible with JavaScript can be achieved by:
arrow.utcnow().isoformat().replace("+00:00", "Z")
'2018-11-30T02:46:40.714281Z'
JavaScript's Date.parse will quietly drop microseconds from the timestamp.
I use pendulum:
import pendulum
d = pendulum.now("UTC").to_iso8601_string()
print(d)
>>> 2019-10-30T00:11:21.818265Z
There are a lot of good answers on the post, but I wanted the format to come out exactly as it does with JavaScript. This is what I'm using and it works well.
In [1]: import datetime
In [2]: now = datetime.datetime.utcnow()
In [3]: now.strftime('%Y-%m-%dT%H:%M:%S') + now.strftime('.%f')[:4] + 'Z'
Out[3]: '2018-10-16T13:18:34.856Z'
Using only standard libraries, making no assumption that the timezone is already UTC, and returning the exact format requested in the question:
dt.astimezone(timezone.utc).replace(tzinfo=None).isoformat(timespec='milliseconds') + 'Z'
This does require Python 3.6 or later though.
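As a sketch, the one-liner above can be wrapped in a function. The name to_js_iso is hypothetical, and the input is assumed to be a timezone-aware datetime (astimezone() treats a naive one as local time).

```python
from datetime import datetime, timedelta, timezone

def to_js_iso(dt):
    """Format an aware datetime like JavaScript's toISOString(): UTC, milliseconds, Z."""
    # Convert to UTC, then drop tzinfo so isoformat() does not append +00:00
    utc_naive = dt.astimezone(timezone.utc).replace(tzinfo=None)
    # timespec='milliseconds' (Python 3.6+) truncates to three fractional digits
    return utc_naive.isoformat(timespec="milliseconds") + "Z"

plus_two = timezone(timedelta(hours=2))
shifted = to_js_iso(datetime(2018, 10, 16, 15, 18, 34, 856789, tzinfo=plus_two))
whole_second = to_js_iso(datetime(2020, 1, 1, tzinfo=timezone.utc))
print(shifted)
print(whole_second)
```

Unlike slicing the isoformat() string, timespec keeps the .000 even when the microsecond field is zero.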
>>> import arrow
>>> now = arrow.utcnow().format('YYYY-MM-DDTHH:mm:ss.SSS')
>>> now
'2018-11-28T21:34:59.235'
>>> zulu = "{}Z".format(now)
>>> zulu
'2018-11-28T21:34:59.235Z'
Or, to get it in one fell swoop:
>>> zulu = "{}Z".format(arrow.utcnow().format('YYYY-MM-DDTHH:mm:ss.SSS'))
>>> zulu
'2018-11-28T21:54:49.639Z'
Combining the answers above, I came up with the following function:
from datetime import datetime, tzinfo, timedelta
class simple_utc(tzinfo):
    def tzname(self, dt):
        return "UTC"
    def utcoffset(self, dt):
        return timedelta(0)
def getdata(yy, mm, dd, h, m, s):
    # Build a UTC-aware datetime and format it as ISO 8601 with a Z suffix
    d = datetime(yy, mm, dd, h, m, s)
    d = d.replace(tzinfo=simple_utc()).isoformat()
    d = str(d).replace('+00:00', 'Z')
    return d
print(getdata(2018, 2, 3, 15, 0, 14))
pip install python-dateutil
>>> import dateutil.parser
>>> a = "2019-06-27T02:14:49.443814497Z"
>>> dateutil.parser.parse(a)
datetime.datetime(2019, 6, 27, 2, 14, 49, 443814, tzinfo=tzutc())
I have a datetime string like this:
2011-10-23T08:00:00-07:00
How do I parse this string into a datetime object?
I tried the following after reading the documentation:
date = datetime.strptime(data[4],"%Y-%m-%d%Z")
But I get the error:
ValueError: time data '2011-10-23T08:00:00-07:00' does not match format '%Y-%m-%d%Z'
which is very clear.
But I am not sure how to read this format.
Any suggestions.
Thanks
Edit: Also, I must add, all I care about is the date part
The standard datetime.datetime.strptime has trouble with UTC offsets that contain a colon (before Python 3.7, %z does not accept -07:00). Use dateutil.parser instead:
>>> from dateutil import parser
>>> parser.parse("2011-10-23T08:00:00-07:00")
datetime.datetime(2011, 10, 23, 8, 0, tzinfo=tzoffset(None, -25200))
If you care about the date part only, you can try it without dateutil.parser:
>>> from datetime import datetime
>>> datetime.strptime(data[4].partition('T')[0], '%Y-%m-%d').date()
datetime.date(2011, 10, 23)
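On recent Python versions the standard library can handle this string directly. A minimal sketch, assuming Python 3.7 or later, where datetime.fromisoformat() is available; the variable stamp stands in for data[4] from the question:

```python
from datetime import datetime

stamp = "2011-10-23T08:00:00-07:00"  # stands in for data[4]

# fromisoformat() parses the date, time and the -07:00 offset in one call
parsed = datetime.fromisoformat(stamp)

# Only the calendar date, as asked in the edit to the question
date_only = parsed.date()
print(date_only)

# The offset is still available if it is needed later
offset_seconds = parsed.utcoffset().total_seconds()
print(offset_seconds)
```

Note that .date() returns the date in the string's own offset, not the UTC date.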