Is there a way to obtain a timezone-aware datetime object from a string in Python using only standard library modules?
I know that I can use dateutil.parser.parse, but unfortunately that's not a good enough reason to add it as a dependency to my project. I already have the mx.DateTime module as a dependency, but:
>>> dateutil.parser.parse('2011-10-24T06:51:47-07:00')
datetime.datetime(2011, 10, 24, 6, 51, 47, tzinfo=tzoffset(None, -25200))
>>> mx.DateTime.ISO.ParseDateTimeUTC('2011-10-24T06:51:47-07:00')
<mx.DateTime.DateTime object for '2011-10-24 13:51:47.00' at 29c7e48>
ParseDateTimeUTC fails to detect the offset, even though its documentation says:
Returns a DateTime instance in UTC reflecting the given ISO
date. A time part is optional and must be delimited from the
date by a space or 'T'. Timezones are honored.
mx.DateTime.ISO.ParseDateTimeUTC is doing the right thing - it is applying the specified timezone to adjust the time to UTC. The resulting UTC time doesn't have a timezone because it isn't a local time anymore.
The standard Python library doesn't contain any concrete timezone classes according to the documentation:
tzinfo is an abstract base class, meaning that this class should not
be instantiated directly. You need to derive a concrete subclass, and
(at least) supply implementations of the standard tzinfo methods
needed by the datetime methods you use. The datetime module does not
supply any concrete subclasses of tzinfo.
I've always been surprised that they didn't at least include a class for UTC, or a generic implementation like dateutil's tzoffset.
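For readers on newer Python versions, the documentation quoted above is out of date: datetime.timezone (a fixed-offset tzinfo subclass) was added in Python 3.2, and datetime.fromisoformat in Python 3.7. A minimal sketch, assuming Python 3.7 or later, that parses the string from the question with the standard library only and checks it against the UTC value mx.DateTime reported:

```python
from datetime import datetime, timedelta, timezone

# Parse an ISO 8601 string that carries a UTC offset (requires Python 3.7+).
aware = datetime.fromisoformat('2011-10-24T06:51:47-07:00')

print(aware.tzinfo)       # a fixed-offset datetime.timezone instance
print(aware.utcoffset())  # the -07:00 offset as a timedelta

# Converting to UTC gives the same wall-clock time that ParseDateTimeUTC returned.
as_utc = aware.astimezone(timezone.utc)
print(as_utc.isoformat())
```

On older interpreters this is not available, and deriving your own tzinfo subclass (or using dateutil) remains the way to go.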
I'm literally days-new to Python programming after years of C++ and C programming, and am trying to get a feel for the grammar.
In the following very-beginner code:
from datetime import datetime
from datetime import date
print( datetime.now() )
# print( now() ) # NameError: name 'now' is not defined
print( date(2005, 2, 27) )
# print( datetime.date(2005, 2, 27) ) # TypeError: descriptor 'date' requires a 'datetime.datetime' object but received a 'int'
...why is it necessary to scope now() in datetime but apparently incorrect to do so with date(...)?
The learning material I'm referencing said the two import statements mean I'm "importing the date and datetime classes from the datetime standard module." Possibly biased from my C++ background, I'm equating module with namespace and would have thought this meant (1) you'd need to explicitly scope functions and classes with the module they came from (like std::sort()), or (2) not need explicit scoping because the from/import clause is akin to C++'s using clause. So the grammar of the above looks odd to me because it looks like I'm using two "things" that come from the datetime "namespace," and I must scope one thing but not the other.
FWIW, I use vim as my editor - I wonder: would something about this have been more transparent with a graphical/autosuggest-enabled editor?
To any answerers, I'd be grateful if you could explain how an experienced Python programmer would go about finding out the answer to a question like this. What I mean is: in C/C++, I'd look up whatever .h I #include to find out what's what - how do you go about "looking up" the datetime "module"?
You're correct - you don't need to scope! This is a slightly confusing situation because the datetime module has a class which is also called datetime.
So what's happening in each of these:
print(datetime.now()) # Call the now() class method of the datetime class and print its output
print(now()) # now() is not defined in the namespace, hence the error
print(date(2005, 2, 27)) # Instantiate a date object and print its representation
print(datetime.date(2005, 2, 27)) # datetime here is the class, so this calls its date() instance method, which expects a datetime object rather than integers, hence the TypeError.
With the last case, if you had just done import datetime, the whole datetime module would have been imported. In that case, you can instantiate a date class object by doing datetime.date(2005, 2, 27).
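To make the module/class distinction concrete, here is a small sketch. The alias dt_module is only an illustrative name, chosen so the module and the class of the same name can sit side by side:

```python
import datetime as dt_module          # the module, under an illustrative alias
from datetime import date, datetime   # two classes taken from that module

# The same class is reachable through the module or through the imported name.
print(dt_module.datetime is datetime)

# Both spellings construct the same date object.
via_module = dt_module.date(2005, 2, 27)
via_name = date(2005, 2, 27)
print(via_module == via_name)

# date() on a datetime *instance* returns its date part.
moment = datetime(2005, 2, 27, 12, 30)
print(moment.date())
```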
Hope that makes a tiny bit of sense!
Importing in Python is simple. When you write
from datetime import datetime
you import only the name datetime (in this case a class) from the datetime module. The interpreter doesn't know a bare function called "now", but you can reach it through the class you imported:
datetime.now()
After your second import
from datetime import date
the interpreter knows both classes, date and datetime. When you try
print( datetime.date(2005, 2, 27) )
it does not do what you expect. You are calling the date method of the datetime class, which takes different parameters than the date class from the datetime module.
The confusing part of the datetime module is that it contains a class that is also named datetime.
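As for how to look this up without header files, the interpreter itself can answer the question. A small sketch using the built-in dir() and the standard inspect module (calling help(datetime) prints the full documentation in the same way):

```python
import datetime
import inspect

# After 'import datetime', the name refers to the module...
print(inspect.ismodule(datetime))
# ...and datetime.datetime inside it is a class.
print(inspect.isclass(datetime.datetime))

# List the public names the module exposes.
public_names = [name for name in dir(datetime) if not name.startswith('_')]
print(public_names)

# now() lives on the class, not on the module.
print(hasattr(datetime, 'now'), hasattr(datetime.datetime, 'now'))
```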
I can create a timezone specific datetime object like this
import datetime
d = datetime.datetime.now().astimezone()
Result is
datetime.datetime(2018, 4, 2, 15, 12, 2, 807451, tzinfo=datetime.timezone(datetime.timedelta(0, 7200), 'CEST'))
It looks like that tzinfo is represented by two values/attributes: A timedelta and a string. But how can I access them?
I would like to do something like this
d.tzinfo.delta
d.tzinfo.name
I need this information to be able to (de)serialize the datetime to and from JSON.
I don't want to use third-party packages for such solutions.
tzinfo in this case is an instance of the datetime.timezone() class:
The timezone class is a subclass of tzinfo, each instance of which represents a timezone defined by a fixed offset from UTC.
You can use the tzinfo.utcoffset() and tzinfo.tzname() methods to access the delta and name. For timezone() instances the argument each of these takes is ignored, but normally you'd pass in the datetime instance they are attached to:
d.tzinfo.utcoffset(d)
d.tzinfo.tzname(d)
You'd usually call these on the datetime.datetime instance, which has the same methods (but which take no arguments) and these will then handle passing in the right argument to the methods on the contained tzinfo attribute.
Demo:
>>> import datetime
>>> d = datetime.datetime.now().astimezone()
>>> d.utcoffset()
datetime.timedelta(seconds=7200)
>>> d.tzname()
'CEST'
>>> d.tzinfo.utcoffset(d)
datetime.timedelta(seconds=7200)
>>> d.tzinfo.utcoffset(d) is d.utcoffset() # they are the same object
True
The datetime.timezone() subclass is just one implementation of a tzinfo time zone. Third-party libraries like pytz offer their own, and the utcoffset() and tzname() return values may well vary for timezones with historical information attached.
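Putting utcoffset() and tzname() to work for the JSON round trip mentioned in the question could look like the sketch below. The helper names dump_aware and load_aware and the JSON field names are made up for illustration, and the sketch assumes Python 3.7+ (for fromisoformat) and a tzinfo whose tzname() returns a string, as datetime.timezone always does:

```python
import json
from datetime import datetime, timedelta, timezone

def dump_aware(d):
    """Serialize an aware datetime to a JSON string (hypothetical format)."""
    return json.dumps({
        "naive_iso": d.replace(tzinfo=None).isoformat(),
        "offset_seconds": d.utcoffset().total_seconds(),
        "tz_name": d.tzname(),
    })

def load_aware(text):
    """Rebuild the aware datetime with a fixed-offset timezone."""
    data = json.loads(text)
    tz = timezone(timedelta(seconds=data["offset_seconds"]), data["tz_name"])
    return datetime.fromisoformat(data["naive_iso"]).replace(tzinfo=tz)

original = datetime(2018, 4, 2, 15, 12, 2, 807451,
                    tzinfo=timezone(timedelta(hours=2), "CEST"))
restored = load_aware(dump_aware(original))
print(restored)
```

Note that the restored object always carries a fixed offset, so any daylight-saving rules of the original zone are not preserved.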
You're asking about datetime.tzinfo, which is an abstract base class, as documented here.
datetime comes with an implementation of the tzinfo abstract base class called datetime.timezone, which is documented here.
Just below that last link are the docs on timezone.utcoffset and timezone.tzname, which are ways to access the properties you asked about on the tzinfo if it is a timezone. However, this is not the only implementation of that abstract class. If you are using the pytz timezone, for example, then you'll need to read the docs on that instead.
>>> import dateutil.parser, dateutil.tz as tz
>>> dateutil.parser.parse('2017-08-09 10:45 am').replace(tzinfo=tz.gettz('America/New_York'))
datetime.datetime(2017, 8, 9, 10, 45, tzinfo=tzfile('/usr/share/zoneinfo/America/New_York'))
Is that really the way that we're supposed to set a default timezone for parsing? I've read the documentation for the parser and examples but I cannot seem to find anything that says, "This is how to set the default timezone for dateutil.parser.parse", or even anything like it.
Because while this works, there are cases where it would do the wrong thing, if the zone were provided. Does that mean we should do this?
>>> d = dateutil.parser.parse('2017-08-09 10:45 am +06:00')
>>> d = d.replace(tzinfo=d.tzinfo or tz.gettz('America/Chicago'))
Because that's clunky, too.
What's the recommended way to set a default timezone when parsing?
There are basically two "correct" ways to do this. You can see that this was brought up as Issue #94 on dateutil's issue tracker, and "set a default time zone" is determined to be out of scope, since this is something that can be easily done with the information returned by the parser anyway (and thus no need to build it in to the parser itself). The two ways are:
Provide a default date that has a time zone. If you don't care what the default date is, you can just specify some date literal and be done with it. If you want the behavior to be basically the same as dateutil's default behavior (replacing missing elements from "today's date at midnight"), you have to have a bit of boilerplate:
from datetime import datetime, time
from dateutil import tz, parser
default_date = datetime.combine(datetime.now(),
time(0, tzinfo=tz.gettz("America/New_York")))
dt = parser.parse(some_dt_str, default=default_date)
Use your second method with .replace:
from dateutil import parser, tz
def my_parser(*args, default_tzinfo=tz.gettz("America/New_York"), **kwargs):
    dt = parser.parse(*args, **kwargs)
    return dt.replace(tzinfo=dt.tzinfo or default_tzinfo)
This last one is probably slightly cleaner than the first, but it is slightly slower if run in a tight loop, since the first approach only needs the default date created once. That said, dateutil's parser is itself quite slow, so an extra date construction is likely the least of your problems if you're running it in a tight loop.
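The fallback pattern itself (keep the parsed zone if there is one, otherwise attach a default) can be tried without dateutil. In this sketch datetime.fromisoformat stands in for parser.parse, and both DEFAULT_TZ (a fixed UTC-05:00 offset rather than a real America/New_York zone) and the name parse_with_default_tz are assumptions for illustration:

```python
from datetime import datetime, timedelta, timezone

# Hypothetical default zone: a fixed UTC-05:00 offset stands in for a real zone object.
DEFAULT_TZ = timezone(timedelta(hours=-5))

def parse_with_default_tz(text, default_tzinfo=DEFAULT_TZ):
    """Parse an ISO 8601 string; attach default_tzinfo only if none was parsed."""
    dt = datetime.fromisoformat(text)  # stand-in for dateutil's parser.parse
    return dt.replace(tzinfo=dt.tzinfo or default_tzinfo)

no_zone = parse_with_default_tz('2017-08-09T10:45:00')
with_zone = parse_with_default_tz('2017-08-09T10:45:00+06:00')

print(no_zone.utcoffset())    # the default offset was attached
print(with_zone.utcoffset())  # the parsed offset was kept
```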
Fleshing out Paul's comment - because a datetime has to be at least a year, month, and day, dateutil already has a default that it uses:
>>> from datetime import datetime
>>> datetime.now()
datetime.datetime(2017, 10, 13, 15, 16, 13, 548750)
>>> dateutil.parser.parse('2017')
datetime.datetime(2017, 10, 13, 0, 0)
Given this, the appropriate choice would be to create a default that contains the timezone and is either just the current date, or whatever date makes sense:
>>> dateutil.parser.parse('2017', default=datetime(2017, 10, 13, tzinfo=tz.gettz('America/New_York')))
Naturally you can store the default as something sensible, like default_datetime or something, then it becomes:
>>> dateutil.parser.parse('2017', default=default_datetime)
I'm trying to perform what is, to me, a simple task: generating the current date/time in a specific time zone. All I see is suggestions to use pytz even though datetime includes the tzinfo class to deal with timezones. However, if I try to use tzinfo, it does not work:
>>> from datetime import datetime, tzinfo
>>> d = datetime.now(tzinfo.tzname("EDT"))
TypeError: descriptor 'tzname' requires a 'datetime.tzinfo' object but received a 'str'
The docs say you can use a time zone name like "EDT" or "GMT". What's wrong with this?
The function tzinfo.tzname does the opposite of what you think it does.
It takes a datetime object and returns a string indicating the time zone.
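What datetime.now() expects is a tzinfo instance, not a name. A minimal sketch using the standard library's fixed-offset datetime.timezone class (Python 3.2+); the -4 hour offset for "EDT" is supplied by hand here, since the class does not look names up and knows nothing about daylight-saving transitions:

```python
from datetime import datetime, timedelta, timezone

# Build a fixed-offset zone by hand: UTC-04:00, labelled "EDT".
edt = timezone(timedelta(hours=-4), "EDT")

now_edt = datetime.now(edt)  # the current time expressed in that zone
print(now_edt)
print(now_edt.tzname())      # here tzname() goes the other way: datetime -> string
```

For zones with real transition rules you would still reach for pytz, or for the zoneinfo module available from Python 3.9.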
Coming from C#, I've learned to always be aware of time-zones when handling date/time. Python has proper timezone handling and useful helpers like datetime.utcnow, which makes working with date/time straightforward. But, when reading in the python docs, I noticed that there is something called a "naive" datetime instance. As far as I can see, this is just a datetime without any timezone.
What is the use-case for a naive datetime?
Isn't a datetime without a time-zone pretty useless?
And why doesn't datetime.now() return a datetime in the current locale (like .NET)?
I'm sure I'm missing something crucial, so I hope someone can shed some light on this.
What is the point of a naive datetime
A naive datetime is very useful!
In some cases you don't know or don't want to specify a timezone.
Imagine that you are parsing an ancient external program log file, and you don't know what timezone the datetimes are in - your best bet is leave them as-is. Attaching a timezone to such datetimes would be wrong and could lead to errors, as you'd be pretending to have information you don't actually have.
And why doesn't datetime.now() return a datetime in the current locale (like .NET)?
datetime.now() does return a value in the current locale timezone, but it doesn't have a timezone associated with it (a tzinfo attribute), which is probably what you meant. Notice that the same is true for utcnow(): both return naive datetimes.
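The difference is easy to see in the interpreter. A short sketch, assuming Python 3.6 or later (where astimezone() on a naive datetime assumes local time):

```python
from datetime import datetime, timezone

naive_local = datetime.now()            # local wall-clock time, no tzinfo attached
aware_local = naive_local.astimezone()  # same instant, with the system's local offset attached
aware_utc = datetime.now(timezone.utc)  # an aware counterpart to utcnow()

print(naive_local.tzinfo)  # None
print(aware_local.tzinfo)
print(aware_utc.tzinfo)
```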
The rationale for not including timezone support in the datetime module is alluded to in the docs:
Note that no concrete tzinfo classes are supplied by the datetime module. [...] The rules for time adjustment across the world are more political than rational, and there is no standard suitable for every application.
If you included timezone support in the standard library, you'd get wrong results somewhere in the world.
Timezones are a political concept and change several times a year, globally. The life expectancy of the locally installed Python standard library is (generally) much longer than the period for which timezone data stays correct.
What should I do to support timezones
Disclaimer: you should just use UTC in almost all cases. Local timezones should only be used as a last step when showing values to the user.
To use timezones, your program should depend on the pytz package, which gives you proper timezone support.
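from time import tzname
from pytz import timezone
from datetime import datetime
# tzname[0] is the local non-DST abbreviation (e.g. 'CET'); pytz may not recognize every abbreviation
timezone(tzname[0]).localize(datetime.now())  # attach the local zone to the naive "now"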
Remember that your program or the local system administrator will need to keep the package up to date.
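The "UTC everywhere, local time only for display" advice above can be sketched with the standard library alone. The display zone here is a hypothetical fixed UTC+01:00 offset, used in place of a real pytz zone so the example has no dependencies:

```python
from datetime import datetime, timedelta, timezone

# Store and compute in UTC.
stored = datetime(2024, 1, 15, 12, 0, tzinfo=timezone.utc)

# Hypothetical display zone: a fixed UTC+01:00 offset with no daylight-saving rules.
display_tz = timezone(timedelta(hours=1), 'CET')

# Convert only at the last step, when showing the value to the user.
shown = stored.astimezone(display_tz)
print(shown.isoformat())
```

The converted value still compares equal to the stored one, because both describe the same instant.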
What is the use-case for a naive datetime?
Python does not allocate space for the pointer to the timezone object if the datetime object is naive:
/* ---------------------------------------------------------------------------
* Basic object allocation: tp_alloc implementations. These allocate
* Python objects of the right size and type, and do the Python object-
* initialization bit. If there's not enough memory, they return NULL after
* setting MemoryError. All data members remain uninitialized trash.
*
* We abuse the tp_alloc "nitems" argument to communicate whether a tzinfo
* member is needed. This is ugly, imprecise, and possibly insecure.
* tp_basicsize for the time and datetime types is set to the size of the
* struct that has room for the tzinfo member, so subclasses in Python will
* allocate enough space for a tzinfo member whether or not one is actually
* needed. That's the "ugly and imprecise" parts. The "possibly insecure"
* part is that PyType_GenericAlloc() (which subclasses in Python end up
* using) just happens today to effectively ignore the nitems argument
* when tp_itemsize is 0, which it is for these type objects. If that
* changes, perhaps the callers of tp_alloc slots in this file should
* be changed to force a 0 nitems argument unless the type being allocated
* is a base type implemented in this file (so that tp_alloc is time_alloc
* or datetime_alloc below, which know about the nitems abuse).
*/
static PyObject *
time_alloc(PyTypeObject *type, Py_ssize_t aware)
{
    PyObject *self;
    self = (PyObject *)
        PyObject_MALLOC(aware ?
                        sizeof(PyDateTime_Time) :
                        sizeof(_PyDateTime_BaseTime));
    if (self == NULL)
        return (PyObject *)PyErr_NoMemory();
    PyObject_INIT(self, type);
    return self;
}
How you use naive datetime objects is up to you. In my code I treat naive datetime objects in one of two ways:
As if they had the UTC timezone associated with them, or
as values whose timezone I don't care about at all.
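The first convention (treating a naive value as UTC) has to be made explicit before mixing it with aware values, because Python refuses to order a naive datetime against an aware one. A small sketch with made-up sample values:

```python
from datetime import datetime, timezone

naive = datetime(2024, 1, 15, 12, 0)
aware = datetime(2024, 1, 15, 12, 0, tzinfo=timezone.utc)

# Ordering comparisons between naive and aware datetimes raise TypeError.
try:
    naive < aware
    mixed_comparison_failed = False
except TypeError:
    mixed_comparison_failed = True
print(mixed_comparison_failed)

# Declare the convention explicitly: this naive value is meant as UTC.
as_utc = naive.replace(tzinfo=timezone.utc)
print(as_utc == aware)
```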