1 hour off when converting RFC 2822 date to datetime - python

Here's what I'm trying to do:
>>> from email.utils import parsedate
>>> tup = parsedate("Fri, 22 Jan 2016 10:15:00 GMT")
>>> tup
(2016, 1, 22, 10, 15, 0, 0, 1, -1)
>>> import datetime
>>> import time
>>> timestamp = time.mktime(tup)
>>> timestamp
1453454100.0
>>> datetime.datetime.utcfromtimestamp(timestamp)
datetime.datetime(2016, 1, 22, 9, 15)
I'm using the email.utils.parsedate function to parse an RFC 2822 date into a time tuple. This looks correct: the hour part is 10. Then I convert it to a timestamp using time.mktime and try to get a UTC datetime out of it using datetime.utcfromtimestamp, but for some odd reason the hour in the datetime is 9. I don't really get why.
I'm in UTC+1, so there's probably a conversion to local time happening somewhere, but I have no clue where.

The problem is that mktime expects the tuple to be in local time. There's also calendar.timegm, which expects it to be in UTC. I might actually just end up using dateutil as @Boaz recommends.

I recommend just using dateutil:
https://pypi.python.org/pypi/python-dateutil
It converts the string directly to a correct datetime object:
from dateutil import parser
parser.parse("Fri, 22 Jan 2016 10:15:00 GMT")

From Time access and conversions:
time.mktime(t):
This is the inverse function of localtime(). Its argument is the struct_time or full 9-tuple (since the dst flag is needed; use -1 as the dst flag if it is unknown) which expresses the time in local time, not UTC.
For correct results you should use calendar.timegm():
>>> import calendar
>>> calendar.timegm(tup)
1453457700
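The difference between the two functions can be sketched with the standard library alone. This is a minimal sketch, assuming Python 3.3 or later (for email.utils.parsedate_to_datetime); the variable names are illustrative and not taken from the question.

```python
import calendar
from datetime import datetime, timezone
from email.utils import parsedate, parsedate_to_datetime

raw = "Fri, 22 Jan 2016 10:15:00 GMT"

# parsedate() returns a 9-tuple whose fields are already in UTC for this input.
tup = parsedate(raw)

# calendar.timegm() treats the tuple as UTC, unlike time.mktime(),
# which treats it as local time.
timestamp = calendar.timegm(tup)

# Build a timezone-aware datetime from the timestamp.
utc_dt = datetime.fromtimestamp(timestamp, tz=timezone.utc)

# Alternative: parse straight to an aware datetime in one step.
aware = parsedate_to_datetime(raw)

print(timestamp)
print(utc_dt.isoformat())
print(aware.isoformat())
```

Both routes end at the same instant, so no local-time conversion sneaks in along the way.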

You can also use datetime.strptime. Note that the result is a naive datetime; the %Z directive does not attach a tzinfo:
>>> from datetime import datetime as dt
>>> d = "Fri, 22 Jan 2016 10:15:00 GMT"
>>> dt.strptime(d, "%a, %d %b %Y %H:%M:%S %Z")
datetime.datetime(2016, 1, 22, 10, 15)
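Since the parsed value carries no timezone, one way to make it unambiguous is to attach UTC explicitly. A minimal sketch, assuming Python 3.2 or later (for datetime.timezone) and an English locale for the %a and %b directives; the names naive and aware are illustrative:

```python
from datetime import datetime, timezone

d = "Fri, 22 Jan 2016 10:15:00 GMT"

# strptime accepts the "GMT" token via %Z but returns a naive datetime.
naive = datetime.strptime(d, "%a, %d %b %Y %H:%M:%S %Z")

# We know the input is GMT, so attach UTC without shifting the clock fields.
aware = naive.replace(tzinfo=timezone.utc)

print(naive.tzinfo)
print(aware.isoformat())
print(aware.timestamp())
```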

Related

How to convert datetime like `2021-06-25 15:00:08+00:00` to local timezone datetime.datetime python?

I have many datetime.datetime objects like 2021-06-25 15:00:08+00:00, where the timezone differs between records. For example, another value is 2021-06-24 06:33:06-07:00. I want to save all of them after converting them to a local timezone. How can I do that?
The datetime.datetime.astimezone() method will return a datetime object with the same UTC time but in the local timezone. For your example times:
>>> dt_1 = datetime.fromisoformat('2021-06-25 15:00:08+00:00')
>>> dt_1.astimezone()
datetime.datetime(2021, 6, 25, 11, 0, 8, tzinfo=datetime.timezone(datetime.timedelta(days=-1, seconds=72000), 'EDT'))
>>> dt_2 = datetime.fromisoformat('2021-06-24 06:33:06-07:00')
>>> dt_2.astimezone()
datetime.datetime(2021, 6, 24, 9, 33, 6, tzinfo=datetime.timezone(datetime.timedelta(days=-1, seconds=72000), 'EDT'))
Since datetime.datetime objects with tzinfo are timezone-aware, the information will be stored in the objects regardless. This is just a handy way to get the local time.
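To make the "same UTC time, different wall clock" point concrete, here is a small sketch that converts both example values to one target zone. It assumes Python 3.7 or later (for fromisoformat) and uses a fixed UTC-4 offset as a stand-in for the local zone, so the result does not depend on the machine it runs on; calling astimezone() with no argument would use the system zone instead.

```python
from datetime import datetime, timedelta, timezone

# Assumption: a fixed offset stands in for the machine's local zone.
target = timezone(timedelta(hours=-4), "EDT")

dt_1 = datetime.fromisoformat("2021-06-25 15:00:08+00:00")
dt_2 = datetime.fromisoformat("2021-06-24 06:33:06-07:00")

# astimezone() keeps the instant and changes only the representation.
local_1 = dt_1.astimezone(target)
local_2 = dt_2.astimezone(target)

print(local_1.isoformat())
print(local_2.isoformat())
print(local_1 == dt_1)  # aware datetimes compare by instant
```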
UPDATE, based on a follow-up question below:
astimezone() doesn't depend on the way the datetime object is created. For differently formatted date/time strings, datetime.strptime can be used to create timezone-aware datetime objects. From the example given in that follow-up question:
>>> dt_3 = datetime.strptime('Sat, 26 Jun 2021 15:00:09 +0000 (UTC)',
...                          '%a, %d %b %Y %H:%M:%S %z (%Z)')
>>> dt_3.astimezone()
datetime.datetime(2021, 6, 26, 11, 0, 9, tzinfo=datetime.timezone(datetime.timedelta(days=-1, seconds=72000), 'EDT'))
You could use the pytz library:
from datetime import datetime
import pytz
dt_input = datetime.fromisoformat('2021-06-24 06:33:06-07:00')
print(dt_input) # prints datetime in input timezone
local_tz = pytz.timezone('Asia/Kolkata') #provide your timezone here
dt_local = dt_input.astimezone(local_tz)
print(dt_local) #prints in your local timezone as provided above
You can refer to this SO question similar to your question:
How to convert a UTC datetime to a local datetime using only standard library?
EDIT:
Convert any string to datetime object:
You can use strptime('datestring', 'dateformat')
example from your comment:
#This will convert the string to datetime object
datetime.strptime('Sat, 26 Jun 2021 15:00:09 +0000 (UTC)','%a, %d %b %Y %H:%M:%S %z (%Z)')
Once it is converted to datetime object you can convert it to your local timezone as mentioned above
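The two steps (parse, then convert) can also be sketched without pytz. This assumes Python 3.2 or later and an English locale for %a and %b; the fixed +05:30 offset is a stand-in for a named zone such as Asia/Kolkata and would not track rule changes the way a real zone database does.

```python
from datetime import datetime, timedelta, timezone

raw = "Sat, 26 Jun 2021 15:00:09 +0000 (UTC)"

# %z consumes the numeric offset and makes the result timezone-aware.
parsed = datetime.strptime(raw, "%a, %d %b %Y %H:%M:%S %z (%Z)")

# Assumption: a fixed offset approximates the target zone.
kolkata_offset = timezone(timedelta(hours=5, minutes=30), "IST")
converted = parsed.astimezone(kolkata_offset)

print(parsed.isoformat())
print(converted.isoformat())
```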

Can't make sense of date conversion to UTC

I have a string containing a date and time in UTC (not part of the string, but I know that it's UTC). So I create an aware datetime object using the following code:
>>> import datetime
>>> import pytz
>>> mystr = '01/09/2018 00:15:00'
>>> start_time = pytz.utc.localize(datetime.datetime.strptime(mystr, '%d/%m/%Y %H:%M:%S'))
>>> start_time
datetime.datetime(2018, 9, 1, 0, 15, tzinfo=<UTC>)
>>> str(start_time)
'2018-09-01 00:15:00+00:00'
>>> start_time.strftime('%s')
'1535757300'
All seems fine so far, but if I run this in the shell:
$ TZ=UTC date -d @1535757300
Fri Aug 31 23:15:00 UTC 2018
Shouldn't I be getting Sat Sep 1 00:15:00 UTC 2018 instead (ie, the same date I started with)?
You need to use the following:
int((start_time - datetime.datetime(1970, 1, 1, tzinfo=pytz.utc)).total_seconds())
Because you are using strftime('%s'), the result is computed from your system's local time zone and the tzinfo is ignored. It's more reliable to just calculate it yourself as above.
EDIT:
As the follow-up post by the OP states, it's easier to just use start_time.timestamp(), but that works only in Python 3.3 and up.
After some more tries, it looks like %s is not supported among the format specifiers listed in the official documentation at https://docs.python.org/3/library/datetime.html#strftime-strptime-behavior. Instead, I get the correct result by using the timestamp() method:
>>> start_time.timestamp()
1535760900.0
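Both approaches can be checked side by side. A minimal sketch using only the standard library (timezone.utc in place of pytz.utc, which is an assumption of this example and not what the question used):

```python
from datetime import datetime, timezone

mystr = "01/09/2018 00:15:00"

# Parse the naive string, then declare it to be UTC.
start_time = datetime.strptime(mystr, "%d/%m/%Y %H:%M:%S").replace(tzinfo=timezone.utc)

# Manual calculation: both operands must be aware, or subtraction raises TypeError.
epoch = datetime(1970, 1, 1, tzinfo=timezone.utc)
seconds_manual = int((start_time - epoch).total_seconds())

# Built-in calculation: timestamp() honours tzinfo on aware datetimes.
seconds_method = int(start_time.timestamp())

# Round trip back to an aware UTC datetime.
round_trip = datetime.fromtimestamp(seconds_method, tz=timezone.utc)

print(seconds_manual, seconds_method)
print(round_trip.isoformat())
```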

Python UTC datetime object's ISO format doesn't include Z (Zulu or Zero offset)

Why doesn't Python 2.7 include the Z character (Zulu or zero offset) at the end of a UTC datetime object's isoformat string, unlike JavaScript?
>>> datetime.datetime.utcnow().isoformat()
'2013-10-29T09:14:03.895210'
Whereas in javascript
>>> console.log(new Date().toISOString());
2013-10-29T09:38:41.341Z
Option: isoformat()
Python's datetime does not emit military timezone suffixes such as the 'Z' suffix for UTC. The following simple string replacement does the trick:
In [1]: import datetime
In [2]: d = datetime.datetime(2014, 12, 10, 12, 0, 0, tzinfo=datetime.timezone.utc)
In [3]: str(d).replace('+00:00', 'Z')
Out[3]: '2014-12-10 12:00:00Z'
str(d) is essentially the same as d.isoformat(sep=' ')
See: Datetime, Python Standard Library
Option: strftime()
Or you could use strftime to achieve the same effect:
In [4]: d.strftime('%Y-%m-%dT%H:%M:%SZ')
Out[4]: '2014-12-10T12:00:00Z'
Note: This option works only when you know the date specified is in UTC.
See: datetime.strftime()
Additional: Human Readable Timezone
Going further, you may be interested in displaying human-readable timezone information by combining pytz with the strftime %Z directive:
In [5]: import pytz
In [6]: d = datetime.datetime(2014, 12, 10, 12, 0, 0, tzinfo=pytz.utc)
In [7]: d
Out[7]: datetime.datetime(2014, 12, 10, 12, 0, tzinfo=<UTC>)
In [8]: d.strftime('%Y-%m-%d %H:%M:%S %Z')
Out[8]: '2014-12-10 12:00:00 UTC'
Python datetime objects don't have time zone info by default, and without it, Python actually violates the ISO 8601 specification (if no time zone info is given, assumed to be local time). You can use the pytz package to get some default time zones, or directly subclass tzinfo yourself:
from datetime import datetime, tzinfo, timedelta
class simple_utc(tzinfo):
    def tzname(self, dt):
        return "UTC"
    def utcoffset(self, dt):
        return timedelta(0)
Then you can manually add the time zone info to utcnow():
>>> datetime.utcnow().replace(tzinfo=simple_utc()).isoformat()
'2014-05-16T22:51:53.015001+00:00'
Note that this DOES conform to the ISO 8601 format, which allows for either Z or +00:00 as the suffix for UTC. Note that the latter actually conforms to the standard better, with how time zones are represented in general (UTC is a special case.)
Short answer
datetime.now(timezone.utc).isoformat().replace("+00:00", "Z")
Long answer
The "Z" is not included because datetime.now() and even datetime.utcnow() return timezone-naive datetimes, that is to say, datetimes with no timezone information associated. To get a timezone-aware datetime, you need to pass a timezone as an argument to datetime.now. For example:
from datetime import datetime, timezone
datetime.utcnow()
#> datetime.datetime(2020, 9, 3, 20, 58, 49, 22253)
# This is timezone naive
datetime.now(timezone.utc)
#> datetime.datetime(2020, 9, 3, 20, 58, 49, 22253, tzinfo=datetime.timezone.utc)
# This is timezone aware
Once you have a timezone aware timestamp, isoformat will include a timezone designation. Thus, you can then get an ISO 8601 timestamp via:
datetime.now(timezone.utc).isoformat()
#> '2020-09-03T20:53:07.337670+00:00'
"+00:00" is a valid ISO 8601 timezone designation for UTC. If you want to have "Z" instead of "+00:00", you have to do the replacement yourself:
datetime.now(timezone.utc).isoformat().replace("+00:00", "Z")
#> '2020-09-03T20:53:07.337670Z'
The following javascript and python scripts give identical outputs. I think it's what you are looking for.
JavaScript
new Date().toISOString()
Python
from datetime import datetime
datetime.utcnow().isoformat()[:-3]+'Z'
The output they give is the UTC (Zulu) time formatted as an ISO string with millisecond precision (three fractional digits) and a trailing Z.
2019-01-19T23:20:25.459Z
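One caveat with slicing the isoformat() string: when the microsecond field happens to be zero, isoformat() omits the fractional part entirely, so [:-3] cuts into the seconds. The sketch below illustrates this with a fixed value and shows the timespec argument as an alternative (an assumption of this example: Python 3.6 or later).

```python
from datetime import datetime

# A value that falls exactly on a whole second.
whole = datetime(2019, 1, 19, 23, 20, 25)

# Slicing assumes six fractional digits are present; here they are not.
sliced = whole.isoformat()[:-3] + "Z"

# timespec='milliseconds' always emits exactly three fractional digits.
safe = whole.isoformat(timespec="milliseconds") + "Z"

print(sliced)
print(safe)
```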
Your goal shouldn't be to add a Z character, it should be to generate a UTC "aware" datetime string in ISO 8601 format. The solution is to pass a UTC timezone object to datetime.now() instead of using datetime.utcnow():
from datetime import datetime, timezone
datetime.now(timezone.utc)
>>> datetime.datetime(2020, 1, 8, 6, 6, 24, 260810, tzinfo=datetime.timezone.utc)
datetime.now(timezone.utc).isoformat()
>>> '2020-01-08T06:07:04.492045+00:00'
That looks good, so let's see what Django and dateutil think:
from django.utils.timezone import is_aware
is_aware(datetime.now(timezone.utc))
>>> True
from dateutil.parser import isoparse
is_aware(isoparse(datetime.now(timezone.utc).isoformat()))
>>> True
Note that you need to use isoparse() from dateutil.parser because the Python documentation for datetime.fromisoformat() says it "does not support parsing arbitrary ISO 8601 strings".
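If adding dateutil is not an option, the common trailing-Z case can be handled with a small workaround. This is only a sketch, assuming Python 3.7 or later; parse_iso_z is a hypothetical helper name, and it covers just the Z suffix, not arbitrary ISO 8601 input.

```python
from datetime import datetime, timezone


def parse_iso_z(text):
    """Parse an isoformat()-style string, also accepting a trailing 'Z'."""
    # Hypothetical helper: rewrite 'Z' to the equivalent numeric offset,
    # which fromisoformat() understands on Python 3.7+.
    if text.endswith("Z"):
        text = text[:-1] + "+00:00"
    return datetime.fromisoformat(text)


from_z = parse_iso_z("2020-01-08T06:07:04.492045Z")
from_offset = parse_iso_z("2020-01-08T06:07:04.492045+00:00")

print(from_z.isoformat())
print(from_z == from_offset)
```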
Okay, the Python datetime object and the ISO 8601 string are both UTC "aware". Now let's look at what JavaScript thinks of the datetime string. Borrowing from this answer we get:
let date = '2020-01-08T06:07:04.492045+00:00';
const dateParsed = new Date(Date.parse(date))
document.write(dateParsed);
document.write("\n");
// Tue Jan 07 2020 22:07:04 GMT-0800 (Pacific Standard Time)
document.write(dateParsed.toISOString());
document.write("\n");
// 2020-01-08T06:07:04.492Z
document.write(dateParsed.toUTCString());
document.write("\n");
// Wed, 08 Jan 2020 06:07:04 GMT
Notes:
I approached this problem with a few goals:
generate a UTC "aware" datetime string in ISO 8601 format
use only Python Standard Library functions for datetime object and string creation
validate the datetime object and string with the Django timezone utility function, the dateutil parser and JavaScript functions
Note that this approach does not include a Z suffix and does not use utcnow(). But it's based on the recommendation in the Python documentation and it passes muster with both Django and JavaScript.
See also:
Stop using utcnow and utcfromtimestamp
What is the “right” JSON date format?
In Python >= 3.2 you can simply use this:
>>> from datetime import datetime, timezone
>>> datetime.now(timezone.utc).isoformat()
'2019-03-14T07:55:36.979511+00:00'
Python datetimes are a little clunky. Use arrow.
> str(arrow.utcnow())
'2014-05-17T01:18:47.944126+00:00'
Arrow has essentially the same api as datetime, but with timezones and some extra niceties that should be in the main library.
A format compatible with Javascript can be achieved by:
arrow.utcnow().isoformat().replace("+00:00", "Z")
'2018-11-30T02:46:40.714281Z'
Javascript Date.parse will quietly drop microseconds from the timestamp.
I use pendulum:
import pendulum
d = pendulum.now("UTC").to_iso8601_string()
print(d)
>>> 2019-10-30T00:11:21.818265Z
There are a lot of good answers on the post, but I wanted the format to come out exactly as it does with JavaScript. This is what I'm using and it works well.
In [1]: import datetime
In [2]: now = datetime.datetime.utcnow()
In [3]: now.strftime('%Y-%m-%dT%H:%M:%S') + now.strftime('.%f')[:4] + 'Z'
Out[3]: '2018-10-16T13:18:34.856Z'
Using only standard libraries, making no assumption that the timezone is already UTC, and returning the exact format requested in the question:
dt.astimezone(timezone.utc).replace(tzinfo=None).isoformat(timespec='milliseconds') + 'Z'
This does require Python 3.6 or later though.
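Wrapped in a function and applied to a fixed input, the one-liner behaves as follows. This is a sketch; to_js_iso is a hypothetical name, and the input must be timezone-aware for the result to be independent of the machine's local zone.

```python
from datetime import datetime, timedelta, timezone


def to_js_iso(dt):
    """Format an aware datetime like JavaScript's Date.prototype.toISOString()."""
    # Convert to UTC, drop tzinfo so isoformat() emits no offset, then add 'Z'.
    utc_naive = dt.astimezone(timezone.utc).replace(tzinfo=None)
    return utc_naive.isoformat(timespec="milliseconds") + "Z"


# An aware datetime two hours ahead of UTC.
sample = datetime(2018, 10, 16, 15, 18, 34, 856123, tzinfo=timezone(timedelta(hours=2)))
formatted = to_js_iso(sample)

# A whole-second value still gets three fractional digits.
on_the_second = to_js_iso(datetime(2018, 10, 16, 13, 18, 34, tzinfo=timezone.utc))

print(formatted)
print(on_the_second)
```

Note that timespec='milliseconds' truncates the microseconds and does not round them.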
>>> import arrow
>>> now = arrow.utcnow().format('YYYY-MM-DDTHH:mm:ss.SSS')
>>> now
'2018-11-28T21:34:59.235'
>>> zulu = "{}Z".format(now)
>>> zulu
'2018-11-28T21:34:59.235Z'
Or, to get it in one fell swoop:
>>> zulu = "{}Z".format(arrow.utcnow().format('YYYY-MM-DDTHH:mm:ss.SSS'))
>>> zulu
'2018-11-28T21:54:49.639Z'
By combining all the answers above I came up with the following function:
from datetime import datetime, tzinfo, timedelta
class simple_utc(tzinfo):
    def tzname(self, dt):
        return "UTC"
    def utcoffset(self, dt):
        return timedelta(0)
def getdata(yy, mm, dd, h, m, s):
    d = datetime(yy, mm, dd, h, m, s)
    d = d.replace(tzinfo=simple_utc()).isoformat()
    d = str(d).replace('+00:00', 'Z')
    return d
print(getdata(2018, 2, 3, 15, 0, 14))
pip install python-dateutil
>>> import dateutil.parser
>>> a = "2019-06-27T02:14:49.443814497Z"
>>> dateutil.parser.parse(a)
datetime.datetime(2019, 6, 27, 2, 14, 49, 443814, tzinfo=tzutc())

Comparing dates in Python - how to handle time zone modifiers

I am doing a date comparison in Python:
Assume I have a date like this: 'Fri Aug 17 12:34:00 2012 +0000'
I am parsing it in the following manner:
dt=datetime.strptime('Fri Aug 17 12:34:00 2012 +0000', '%a %b %d %H:%M:%S %Y +0000')
I could not find in the documentation how to handle the trailing +0000.
I want a more generic solution than a hardcoded value.
Perhaps this is quite easy. Any hints?
The default datetime module does not handle timezones very well; beyond your current machine timezone and UTC, they are basically not supported.
You'll have to use an external library for that or handle the timezone offset manually.
External library options:
dateutil.parser can handle just about any date and/or time format you care to throw at it:
from dateutil import parser
dt = parser.parse(s)
The iso8601 library handles only ISO 8601 formats, which include timezone offsets of the same form:
import iso8601
datetimetext, tz = s.rsplit(None, 1) # only grab the timezone portion.
timezone = iso8601.iso8601.parse_timezone('{}:{}'.format(tz[:3], tz[3:]))
dt = datetime.strptime(datetimetext, '%a %b %d %H:%M:%S %Y').replace(tzinfo=timezone)
Demonstration of each approach:
>>> import datetime
>>> s = 'Fri Aug 17 12:34:00 2012 +0000'
>>> import iso8601
>>> datetimetext, tz = s.rsplit(None, 1)
>>> timezone = iso8601.iso8601.parse_timezone('{}:{}'.format(tz[:3], tz[3:]))
>>> datetime.datetime.strptime(datetimetext, '%a %b %d %H:%M:%S %Y').replace(tzinfo=timezone)
datetime.datetime(2012, 8, 17, 12, 34, tzinfo=<FixedOffset '+00:00'>)
>>> from dateutil import parser
>>> parser.parse(s)
datetime.datetime(2012, 8, 17, 12, 34, tzinfo=tzutc())
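On Python 3.2 and later, the standard library's strptime can also parse the offset itself through the %z directive, which may be enough when the goal is simply to compare dates. A minimal sketch, assuming an English locale for %a and %b; the second and third input strings are made up for illustration:

```python
from datetime import datetime, timedelta

FORMAT = "%a %b %d %H:%M:%S %Y %z"

first = datetime.strptime("Fri Aug 17 12:34:00 2012 +0000", FORMAT)
# Illustrative inputs (not from the question) with other offsets.
second = datetime.strptime("Fri Aug 17 14:34:00 2012 +0200", FORMAT)
third = datetime.strptime("Fri Aug 17 12:34:00 2012 -0700", FORMAT)

# Aware datetimes compare by instant, not by wall-clock fields.
print(first == second)
print(first < third)
```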
You might also want to give Delorean a look. It is a wrapper around both pytz and dateutil that provides timezone manipulation as well as easy time zone shifts.
Here is how I would solve your question with Delorean:
>>> from delorean import parse
>>> parse("2011/01/01 00:00:00 -0700")
Delorean(datetime=2011-01-01 07:00:00+00:00, timezone=UTC)
From there you can get the datetime by simply reading the .datetime attribute. If you want to do some time shifts, simply use the Delorean object and call .shift("UTC") etc.
Use the dateutil.parser:
>>> import dateutil.parser
>>> dateutil.parser.parse('Fri Aug 17 12:34:00 2012 +0000')
datetime.datetime(2012, 8, 17, 12, 34, tzinfo=tzutc())

Parsing time string in Python

I have a date-time string that I don't know how to parse in Python.
The string is like this:
Tue May 08 15:14:45 +0800 2012
I tried
datetime.strptime("Tue May 08 15:14:45 +0800 2012","%a %b %d %H:%M:%S %z %Y")
but Python raises
'z' is a bad directive in format '%a %b %d %H:%M:%S %z %Y'
According to Python doc:
%z UTC offset in the form +HHMM or -HHMM (empty string if the object is naive).
What is the right format to parse this time string?
datetime.datetime.strptime has problems with timezone parsing. Have a look at the dateutil package:
>>> from dateutil import parser
>>> parser.parse("Tue May 08 15:14:45 +0800 2012")
datetime.datetime(2012, 5, 8, 15, 14, 45, tzinfo=tzoffset(None, 28800))
Your best bet is to have a look at strptime()
Something along the lines of
>>> from datetime import datetime
>>> date_str = 'Tue May 08 15:14:45 +0800 2012'
>>> date = datetime.strptime(date_str, '%a %b %d %H:%M:%S +0800 %Y')
>>> date
datetime.datetime(2012, 5, 8, 15, 14, 45)
I'm not sure how to handle the +0800 timezone, unfortunately; maybe someone else can help out with that.
The formatting strings can be found at http://docs.python.org/library/time.html#time.strftime and are the same for formatting the string for printing.
Hope that helps
Mark
PS: Your best bet for timezones is installing pytz from PyPI ( http://pytz.sourceforge.net/ ).
In fact, I think pytz has a great datetime parsing method, if I remember correctly. The standard lib is a little thin on the ground with timezone functionality.
Here's a stdlib solution that supports a variable utc offset in the input time string:
>>> from email.utils import parsedate_tz, mktime_tz
>>> from datetime import datetime, timedelta
>>> timestamp = mktime_tz(parsedate_tz('Tue May 08 15:14:45 +0800 2012'))
>>> utc_time = datetime(1970, 1, 1) + timedelta(seconds=timestamp)
>>> utc_time
datetime.datetime(2012, 5, 8, 7, 14, 45)
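The same parsedate_tz result can also be turned into a timezone-aware datetime that keeps the original offset. A sketch under the assumption of Python 3.2 or later (for datetime.timezone); the names offset, local_time and utc_time are illustrative:

```python
from datetime import datetime, timedelta, timezone
from email.utils import parsedate_tz

parsed = parsedate_tz("Tue May 08 15:14:45 +0800 2012")

# The tenth item of the tuple is the UTC offset in seconds (28800 here).
offset = timezone(timedelta(seconds=parsed[9]))

# Keep the wall-clock fields and attach the parsed offset.
local_time = datetime(*parsed[:6], tzinfo=offset)

# Convert to UTC when a normalized value is needed.
utc_time = local_time.astimezone(timezone.utc)

print(local_time.isoformat())
print(utc_time.isoformat())
```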
This has been discussed many times on SO. In short, "%z" is not supported here because the platform does not support it.
My solution is a different one: just skip the time zone.
datetime.datetime.strptime(re.sub(r"[+-]([0-9])+", "", "Tue May 08 15:14:45 +0800 2012"),"%a %b %d %H:%M:%S %Y")
In [117]: datetime.datetime.strptime?
Type: builtin_function_or_method
Base Class: <type 'builtin_function_or_method'>
String Form: <built-in method strptime of type object at 0x9a2520>
Namespace: Interactive
Docstring:
string, format -> new datetime parsed from a string (like time.strptime()).
