dateutil results in wrong UTC offset when using lower case

I'm trying to convert dates into different timezones using dateutil.
I've noticed that using lower-case letters when creating the tz objects leads to incorrect timezones.
>>> from datetime import datetime
>>> from dateutil import tz
>>> tz.gettz('utc-6')
tzstr('utc-6')
>>> tz.gettz('UTC-6')
tzstr('UTC-6')
Up to this point everything seems correct. However, when the tzstr is passed to a datetime object, the lower-case tzstr applies the inverted offset. This happens for both + and -, and for all offsets.
I used %z to show the applied offset, but the issue affects all datetime operations.
>>> datetime.now(tz.gettz('UTC-6')).strftime('%z')
'-0600'
>>> datetime.now(tz.gettz('utc-6')).strftime('%z')
'+0600'
>>> datetime.now(tz.gettz('utc-06')).strftime('%z')
'+0600'
>>>
>>> datetime.now(tz.gettz('utc-8')).strftime('%z')
'+0800'
>>>
Is this a big oversight on my part, or is it indeed a bug in the package?
I couldn't find anything in the docs limiting the user input to capitalized UTC+X.
(Python version is tags/v3.9.9:ccb0e6a on Windows with python-dateutil 2.8.2, but it occurs on Linux with python-dateutil 2.8.2 as well)

The dateutil library, by default, diverges from POSIX-style time zones, which "use an inverted offset format, so normally GMT+3 would be parsed as an offset 3 hours behind GMT. The tzstr time zone object will parse this as an offset 3 hours ahead of GMT" (dateutil documentation on dateutil.tz.tzstr()).
The library decides when to invert the offset by checking whether the timezone abbreviation is exactly "UTC" or "GMT" (case sensitive); if it is, the parsed offset is multiplied by -1. Because the check is case sensitive, "UTC-6" is correctly converted to -0600, but "utc-6" keeps the POSIX-style sign and comes out as +0600 (tzstr source code).
# Here we break the compatibility with the TZ variable handling.
# GMT-3 actually *means* the timezone -3.
if res.stdabbr in ("GMT", "UTC") and not posix_offset:
    res.stdoffset *= -1
Regardless of whether this was an intentional decision, with the current tzstr implementation you should uppercase your timezone strings before passing them to gettz().
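One way to apply that advice is sketched below. The helper normalize_tz_name is hypothetical (it is not part of dateutil), and it upper-cases only a leading "utc"/"gmt" abbreviation on the assumption that other names, such as "Europe/Paris", should reach gettz() untouched. The dateutil call is guarded so the sketch still runs where the package is not installed.

```python
import re

# Hypothetical helper, not part of dateutil: upper-case a leading
# "utc"/"gmt" abbreviation only when it stands alone or is followed by
# an offset, and leave every other name exactly as given.
_ABBR = re.compile(r'^(utc|gmt)(?=[+-]?\d|$)', re.IGNORECASE)

def normalize_tz_name(name):
    return _ABBR.sub(lambda m: m.group(1).upper(), name)

print(normalize_tz_name('utc-6'))         # UTC-6
print(normalize_tz_name('Europe/Paris'))  # Europe/Paris

try:
    from dateutil import tz
except ImportError:
    tz = None  # python-dateutil is not installed

if tz is not None:
    from datetime import datetime
    # Per the transcript in the question, 'UTC-6' yields '-0600'.
    print(datetime.now(tz.gettz(normalize_tz_name('utc-6'))).strftime('%z'))
```

Upper-casing the whole string would also work for "utc-6", but the narrower rule avoids altering names that gettz() may look up case-sensitively.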

Related

What's the correct datetime format for the specified date string?

I have the following date string: 2019-05-12T14:52:13.136621898Z
For the life of me I can't figure out the format. The closest datetime format string that would make sense to me is:
%Y-%m-%dT%H:%M:%S.%fZ
I've searched StackOverflow, Google and the Python docs.

For such issues, it is worth looking at the dateutil module and its parser.parse function, which parses this kind of datetime string without you needing to provide the format!
from dateutil import parser
dt_obj = parser.parse('2019-05-12T14:52:13.136621898Z')
print(dt_obj)
Also, the closest format that fits your requirement is '%Y-%m-%dT%H:%M:%S.%fZ', which works with 2019-05-12T14:52:13.136621Z: '%f' matches fractional seconds of up to 6 digits (microseconds), but since your value has 9 digits, this format won't work for you!
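If dateutil is not an option, a standard-library workaround is to cut the fraction down to six digits before calling strptime. This is only a sketch: the helper name parse_nanosecond_timestamp is made up for illustration, it assumes the string always ends in a literal Z (UTC), and it truncates the extra digits rather than rounding them.

```python
import re
from datetime import datetime, timezone

def parse_nanosecond_timestamp(text):
    # Keep the first six fractional digits and drop the rest, because
    # %f accepts at most six digits (microseconds).
    trimmed = re.sub(r'(\.\d{6})\d+', r'\1', text)
    naive = datetime.strptime(trimmed, '%Y-%m-%dT%H:%M:%S.%fZ')
    # The trailing Z means UTC, so attach that explicitly.
    return naive.replace(tzinfo=timezone.utc)

dt = parse_nanosecond_timestamp('2019-05-12T14:52:13.136621898Z')
print(dt.isoformat())  # 2019-05-12T14:52:13.136621+00:00
```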

The format is called "round trip format" and Python does not have a format specifier for it, while e.g. .NET has the o format specifier. Therefore it might be a bit harder to craft a format string:
In your case it has 9 digits for sub-seconds. @Devesh said it should be 8 digits and the .NET framework uses 7 digits for it. But generally, %f should be ok.
You can't use a hard-coded Z at the end, because that's the time zone. Z would only work for UTC, but would ignore all other time zones. The time zone format specifier is %z.
As the datetime documentation says:
utcoffset() is transformed into a string of the form ±HHMM[SS[.ffffff]], where HH is a 2-digit string giving the number of UTC offset hours, MM is a 2-digit string giving the number of UTC offset minutes, SS is a 2-digit string giving the number of UTC offset seconds and ffffff is a 6-digit string giving the number of UTC offset microseconds. The ffffff part is omitted when the offset is a whole number of seconds and both the ffffff and the SS part is omitted when the offset is a whole number of minutes. For example, if utcoffset() returns timedelta(hours=-3, minutes=-30), %z is replaced with the string '-0330'
and
For example, '+01:00:00' will be parsed as an offset of one hour. In addition, providing 'Z' is identical to '+00:00'
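The quoted behaviour can be checked with a short sketch, assuming Python 3.7 or later (where strptime's %z accepts both 'Z' and offsets containing a colon). The sample strings use six fractional digits so that %f can match them.

```python
from datetime import datetime, timedelta

FMT = '%Y-%m-%dT%H:%M:%S.%f%z'

# 'Z' is parsed as a zero offset, i.e. UTC.
utc_dt = datetime.strptime('2019-05-12T14:52:13.136621Z', FMT)
print(utc_dt.utcoffset())        # 0:00:00

# A non-UTC offset is kept, and %z formats it back without the colon.
offset_dt = datetime.strptime('2019-05-12T14:52:13.136621-03:30', FMT)
print(offset_dt.strftime('%z'))  # -0330
```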

Does python time.strftime process timezone options correctly (for RFC 3339)

I'm trying to get RFC3339 compatible output from python's time module, using the time.strftime() function.
With the Linux 'date' command, I can use a format string like the following: "date +%F_%T%:z"
$ date +%F_%T%:z
2017-06-29_16:13:29-07:00
When used with python time.strftime, the %:z appears to not be supported.
$ python
>>> import time
>>> print time.strftime("%F %T%:z")
2017-06-29 16:16:15%:z
Apparently, '%z' is supported, but '%:z' is not:
>>> print time.strftime("%F %T%z")
2017-05-29 16:15:35-0700
RFC3339 specifically uses the timezone offset with the embedded colon.
That would be 07:00 in my case, instead of 0700.
I believe the omission of support for the '%:z' option is due to the underlying C implementation of strftime() not supporting the colon variants of the timezone offset formatter, that is '%:z', '%::z', etc.
Is there any workaround for this (e.g. another Python module, or some option I'm missing in the 'time' module), other than writing code to take the %z output and reformat it in %:z format?
EDIT: Another question (Generate RFC 3339 timestamp in Python) gives solutions for other modules that can be used to output RFC3339 output. I'm going to self-answer with information that I found for the question in the title.

The strict answer to the question in the title "Does python time.strftime process timezone options correctly (for RFC3339)?" is: No.
The "%:z" supported by the Linux 'date' command is a GNU extension, and is not in the POSIX spec, or in the C implementation of strftime (as of this writing).
With regard to workarounds (requested in the body of the question), the answers in Generate RFC 3339 timestamp in Python can be used as alternatives to time.strftime to produce RFC3339-compliant output.
Specifically, I used the pytz module to get timezone information, and datetime class isoformat() function to print in RFC3339-compliant format (with a colon in the timezone offset portion of the output). Like so:
(in Python 2.7 on Ubuntu 14.04)
>>> import pytz, datetime
>>> latz = pytz.timezone("America/Los_Angeles")
>>> latz
<DstTzInfo 'America/Los_Angeles' PST-1 day, 16:00:00 STD>
>>> dt = datetime.datetime.now(latz)
>>> dt2 = dt.replace(microsecond=0)
>>> dt2.isoformat()
'2017-07-06T11:50:07-07:00'
Note the conversion from dt to dt2, which sets microseconds to 0. This keeps isoformat from printing microseconds as a decimal fraction of the seconds. RFC 3339 does permit fractional seconds, so this step is optional. Passing a pytz zone as tzinfo to the datetime constructor does not apply the DST rules and would report -08:00 (PST) for this July date, so use dt.replace() to keep the tzinfo that now() attached.
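On Python 3.6 or later the same result is available from the standard library alone. The sketch below is an illustration under that assumption. The helper names rfc3339 and add_colon are invented here, and add_colon is the "reformat the %z output" workaround for cases where only a '-0700' style string is at hand.

```python
from datetime import datetime, timedelta, timezone

def rfc3339(dt):
    """Format an aware datetime with whole seconds and a colon in the offset."""
    if dt.utcoffset() is None:
        raise ValueError('naive datetime has no UTC offset')
    return dt.isoformat(timespec='seconds')

def add_colon(offset):
    """Turn a %z-style offset such as '-0700' into '-07:00'."""
    return offset[:3] + ':' + offset[3:]

sample = datetime(2017, 6, 29, 16, 13, 29, 123456,
                  tzinfo=timezone(timedelta(hours=-7)))
print(rfc3339(sample))     # 2017-06-29T16:13:29-07:00
print(add_colon('-0700'))  # -07:00

# Current local time with the system's UTC offset attached.
print(rfc3339(datetime.now(timezone.utc).astimezone()))
```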

Converting ISO 8601 with non-decimal UTC zone offset [duplicate]

How do I modify the code below to handle a timezone? Note that there is no decimal (fractional seconds) part.
For 2015-12-22T11:57:11-08:00, the -08:00 is causing me issues. Does epoch time take the time zone into account?
timegm(datetime.strptime(datestring, "%Y-%m-%dT%H:%M:%S.%f").timetuple())

There are a few issues here.
If your time doesn't include a microsecond block, this date string will not work with the format string you provided. Just try it without the -08:00 bit. That means you either need to assume that all your times won't have that block or you need to account for both possibilities.
strptime does a TERRIBLE job of dealing with ISO8601 offsets. If you look at the formatting guide, you'll notice that you can use %z for +/-HHMM, but ISO8601 time zones (in my experience) are almost always presented with the format +/-HH:MM. And even then %z has a nasty habit of being called a bad directive in the format string.
To answer your question, yes, time zone matters. The UNIX epoch is seconds since 1970-01-01T00:00:00+00:00. More importantly, even if you correctly assign the datetime object's tzinfo when you parse the string, timetuple will NOT take that tzinfo into account. You need to use utctimetuple instead.
So now you just need to properly parse the datetime. There are solutions that don't use external libraries, but I find the easiest way to parse ISO8601 date strings is to use the python-dateutil package available via pip:
>>> import calendar
>>> from dateutil import parser
>>> datestring = '2015-12-22T11:57:11-08:00'
>>> tz_aware_datetime = parser.parse(datestring)
>>> tz_aware_datetime
datetime.datetime(2015, 12, 22, 11, 57, 11, tzinfo=tzoffset(None, -28800))
>>> calendar.timegm(tz_aware_datetime.utctimetuple())
1450814231
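For reference, one of the solutions without external libraries is sketched here. It assumes Python 3.7 or later, where datetime.fromisoformat accepts offsets written as +/-HH:MM.

```python
from datetime import datetime

datestring = '2015-12-22T11:57:11-08:00'

# fromisoformat keeps the -08:00 offset as tzinfo, and timestamp() on an
# aware datetime takes that offset into account.
aware = datetime.fromisoformat(datestring)
epoch_seconds = int(aware.timestamp())
print(epoch_seconds)  # 1450814231
```

The value matches the calendar.timegm(...utctimetuple()) result shown above.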

time.strftime() incorrect timezone format [duplicate]

Every time I use:
time.strftime("%z")
I get:
Eastern Daylight Time
However, I would like the UTC offset in the form +HHMM or -HHMM. I have even tried:
time.strftime("%Z")
Which still yields:
Eastern Daylight Time
I have read several other posts related to strftime() and %z always seems to return the UTC offset in the proper +HHMM or -HHMM format. How do I get strftime() to output in the +HHMM or -HHMM format for python 3.3?
Edit: I'm running Windows 7

In 2.x, if you look at the docs for time.strftime, they don't even mention %z. It's not guaranteed to exist at all, much less to be consistent across platforms. In fact, as footnote 1 implies, it's left up to the C strftime function. In 3.x, on the other hand, they do mention %z, and the footnote that explains that it doesn't work the way you'd expect is not easy to see; that's an open bug.
However, in 2.6+ (including all 3.x versions), datetime.strftime is guaranteed to support %z as "UTC offset in the form +HHMM or -HHMM (empty string if the object is naive)." So, that makes for a pretty easy workaround: use datetime instead of time. Exactly how to change things depends on what exactly you're trying to do. With python-dateutil's tz module, datetime.now(tz.tzlocal()).strftime('%z') is the way to get just the local timezone formatted as a GMT offset, but if you're trying to format a complete time the details will be a little different.
If you look at the source, time.strftime basically just checks the format string for valid-for-the-platform specifiers and calls the native strftime function, while datetime.strftime has a bunch of special handling for different specifiers, including %z; in particular, it will replace the %z with a formatted version of utcoffset before passing things on to strftime. The code has changed a few times since 2.7, and even been radically reorganized once, but the same difference is basically there even in the pre-3.5 trunk.

For a proper solution, see abarnert’s answer below.
You can use time.altzone which returns a negative offset in seconds. For example, I’m on CEST at the moment (UTC+2), so I get this:
>>> time.altzone
-7200
And to put it in your desired format:
>>> '{}{:0>2}{:0>2}'.format('-' if time.altzone > 0 else '+', abs(time.altzone) // 3600, abs(time.altzone // 60) % 60)
'+0200'
As abarnert mentioned in the comments, time.altzone gives the offset when DST is active while time.timezone does for when DST is not active. To figure out which to use, you can do what J.F. Sebastian suggested in his answer to a different question. So you can get the correct offset like this:
time.altzone if time.daylight and time.localtime().tm_isdst > 0 else time.timezone
As also suggested by him, you can use the following in Python 3 to get the desired format using datetime.timezone:
>>> from datetime import datetime, timezone
>>> datetime.now(timezone.utc).astimezone().strftime('%z')
'+0200'
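The time-module approach can be wrapped up as a small sketch. Both function names are hypothetical. time.timezone and time.altzone count seconds west of UTC, which is why the sign is flipped when formatting.

```python
import time

def format_utc_offset(seconds_west, with_colon=False):
    """Format seconds west of UTC as +HHMM/-HHMM, or +HH:MM/-HH:MM."""
    # Positive "seconds west" means the zone is behind UTC.
    sign = '-' if seconds_west > 0 else '+'
    hours, minutes = divmod(abs(seconds_west) // 60, 60)
    separator = ':' if with_colon else ''
    return f'{sign}{hours:02d}{separator}{minutes:02d}'

def local_seconds_west():
    """Pick altzone while DST is in effect, timezone otherwise."""
    dst_active = time.daylight and time.localtime().tm_isdst > 0
    return time.altzone if dst_active else time.timezone

print(format_utc_offset(-7200))        # +0200
print(format_utc_offset(14400, True))  # -04:00
print(format_utc_offset(local_seconds_west()))
```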

Use time.timezone to get the time offset in seconds.
Format it using:
("-" if time.timezone > 0 else "+") + time.strftime("%H:%M", time.gmtime(abs(time.timezone)))
to convert it to +/-HH:MM format.
BTW, isn't this supposed to be a bug, according to the strftime docs?
Also, I thought this SO answer might help you convert from a zone offset string to HH:MM format. But since "%z" is not working as expected, I feel it's moot.
NOTE: time.timezone does not account for daylight saving time.

It will come as no surprise that this bug persists in the latest Windows version currently available, Win 10 Version 1703 (Creators). However, time marches on and there is a lovely date-and-time library called pendulum that does what the question asks for. Sébastien Eustace (principal author of the product?) has shown me this.
>>> import pendulum
>>> pendulum.now().strftime('%z')
'-0400'
pendulum assumes UTC/GMT unless told otherwise, and keeps timezone with the date-time object. There are many other possibilities, amongst them these:
>>> pendulum.now(tz='Europe/Paris').strftime('%z')
'+0200'
>>> pendulum.create(year=2016, month=11, day=5, hour=16, minute=23, tz='America/Winnipeg').strftime('%z')
'-0500'
>>> pendulum.now(tz='America/Winnipeg').strftime('%z')
'-0500'

Checking if a date string is in UTC format

I have a date string like "2011-11-06 14:00:00+00:00". Is there a way to check whether it is in UTC format or not? I tried to convert the string to a datetime object using utc = datetime.strptime('2011-11-06 14:00:00+00:00','%Y-%m-%d %H:%M%S+%z') so that I can compare it with pytz.utc, but I get "ValueError: 'z' is a bad directive in format '%Y-%m-%d %H:%M%S+%z'".
How can I check whether the date string is in UTC? An example would be really appreciated.
Thank You

A simple regular expression will do:
>>> import re
>>> RE = re.compile(r'^\d{4}-\d{2}-\d{2}[ T]\d{2}:\d{2}:\d{2}[+-]\d{2}:\d{2}$')
>>> bool(RE.search('2011-11-06 14:00:00+00:00'))
True

By 'in UTC format', do you actually mean ISO 8601? This is a pretty common question.

The problem with your format string is that strptime just passes the job of parsing time strings on to C's strptime, and different flavors of C accept different directives. In your case (and mine, it seems), the %z directive is not accepted.
There's some ambiguity in the doc pages about this. The datetime.datetime.strptime docs point to the format specification for time.strptime which doesn't contain a lower-case %z directive, and indicates that
Additional directives may be supported on certain platforms, but only the ones listed here have a meaning standardized by ANSI C.
But then it also points here which does contain a lower-case %z, but reiterates that
The full set of format codes supported varies across platforms, because Python calls the platform C library’s strftime() function, and platform variations are common.
There's also a bug report about this issue.
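On Python 3.7 or later the platform dependency can be avoided altogether. The sketch below is an assumption about what "in UTC" should mean here, namely that the string carries an explicit offset of zero, and the function name is_utc is invented for illustration. datetime.fromisoformat parses the offset without going through the C library's strptime.

```python
from datetime import datetime, timedelta

def is_utc(text):
    """Return True when the string carries an explicit zero UTC offset."""
    parsed = datetime.fromisoformat(text)
    # utcoffset() is None for a naive string with no offset at all.
    return parsed.utcoffset() == timedelta(0)

print(is_utc('2011-11-06 14:00:00+00:00'))  # True
print(is_utc('2011-11-06 14:00:00-05:00'))  # False
print(is_utc('2011-11-06 14:00:00'))        # False
```

Strings that are not valid ISO 8601 raise ValueError, which the caller can catch if malformed input is expected.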
