Mystery times conversion (pandas and datetime) - python

Can anyone explain this?
import pandas as pd
import datetime
pd.to_datetime(1532329236726000, unit="us")
returns Timestamp('2018-07-23 07:00:36.726000')
datetime.datetime(2018, 7, 23, 8, 0, 36, 726000).timestamp() * 10**6
returns 1532329236726000.0.
So, is 1532329236726000 2018-07-23 07:00:36 or 2018-07-23 08:00:36 ?

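This depends on the timezone information of the datetime objects you create. You are in fact creating a naive datetime object in both cases, i.e. one that has no timezone set.
For a naive datetime.datetime() object, .timestamp() assumes the local timezone, whereas pd.to_datetime() with unit="us" interprets the number as microseconds since the epoch in UTC. Your local timezone was evidently one hour ahead of UTC on that date, hence the one-hour difference.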
datetime.datetime(2018, 7, 23, 7, 0, 36, 726000).replace(tzinfo=pytz.utc).timestamp() * 10**6 returns the same epoch time you put into your original question
From the python docs
"A naive object does not contain enough information to unambiguously locate itself relative to other date/time objects. Whether a naive object represents Coordinated Universal Time (UTC), local time, or time in some other timezone is purely up to the program, just like it is up to the program whether a particular number represents metres, miles, or mass. Naive objects are easy to understand and to work with, at the cost of ignoring some aspects of reality."
https://docs.python.org/3/library/datetime.html
You can explicitly tell both functions in your question to use UTC with a keyword argument (and without using pytz), as below:
datetime.datetime(2018, 7, 23, 7, 0, 36, 726000, tzinfo=datetime.timezone.utc)
pd.to_datetime(1532329236726000, unit="us", utc=True)
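As a standard-library cross-check, the sketch below does the epoch arithmetic with integers only, so neither the local timezone nor float rounding is involved. The names EPOCH, micros, aware, and round_trip are illustrative and not from the question.

```python
from datetime import datetime, timedelta, timezone

# The Unix epoch as an aware UTC datetime.
EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)

micros = 1532329236726000  # value from the question: microseconds since the epoch

# Integer microseconds -> aware datetime in UTC.
aware = EPOCH + timedelta(microseconds=micros)
print(aware.isoformat())  # 2018-07-23T07:00:36.726000+00:00

# Aware datetime -> integer microseconds (exact, no float involved).
round_trip = (aware - EPOCH) // timedelta(microseconds=1)
print(round_trip == micros)  # True
```

Because both operands are aware, the result does not depend on the machine's local timezone.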

Related

Can Dates be Timezone-Aware in Python?

There are lots of SO answers on ensuring datetimes are in a particular timezone. For example, you can get the current time in UTC with
from datetime import datetime
import pytz
now_utc = datetime.utcnow()
which yields:
datetime.datetime(2017, 5, 11, 17, 37, 5, 602054)
you can make that datetime aware of its timezone (e.g. for asserting two different datetime objects are from the same timezone) with
now_utc_aware = datetime.now(pytz.utc)
datetime.datetime(2017, 5, 11, 17, 38, 2, 757587, tzinfo=<UTC>)
But when I pull the date from a timezone-aware datetime, I seem to lose the timezone-awareness.
now_utc_aware.date()
datetime.date(2017, 5, 11)
Interestingly, there's an SO question which seems to ask exactly this, and about a date specifically (datetime.today()), but the answers (including the accepted one) relate to datetimes. The code I've seen for adding timezone awareness to datetimes all seems to throw errors on my datetime.date object.
Is it possible to add timezone awareness to a date object?
From the Python docs:
class datetime.date
An idealized naive date, ... Attributes: year, month, and day.
There's nothing there for time or time zone. It's just a date.
While it is true that not everywhere on Earth is on the same date simultaneously (because of time zones), that doesn't mean a date itself has time zone awareness.
As a real-world analogy, think of a date as just a square on a calendar. One cannot start talking about timezones without introducing time, which is measured by a clock, not a calendar.
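To illustrate: a date object has no tzinfo attribute at all. If timezone context is needed, one option (an assumption about the use case, not something the question states) is to carry a datetime such as midnight of that day instead of a bare date. The variable names are illustrative.

```python
from datetime import datetime, time, timezone

aware = datetime(2017, 5, 11, 17, 38, 2, tzinfo=timezone.utc)

# Extracting the date drops the time and the timezone together.
d = aware.date()
print(hasattr(d, "tzinfo"))  # False

# One workaround: represent "that day in UTC" as midnight of the day in UTC.
midnight_utc = datetime.combine(d, time.min, tzinfo=timezone.utc)
print(midnight_utc.isoformat())  # 2017-05-11T00:00:00+00:00
```

The tzinfo argument of datetime.combine() requires Python 3.6 or later.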

What does Django Date filter mean by 'Naive'?

I want to use the 'c' date format with the Django date filter. That format references 'naive' dates. I don't want to have timezones in my template (that's held elsewhere in my xml).
I'm not sure what that is, the documentation for django doesn't mention it, nor does the PHP site it references.
What is it, and how do I get rid of it?
The documentation refers to python dates, of these available types.
An object of type time or datetime may be naive or aware. A datetime
object d is aware if d.tzinfo is not None and d.tzinfo.utcoffset(d)
does not return None. If d.tzinfo is None, or if d.tzinfo is not None
but d.tzinfo.utcoffset(d) returns None, d is naive. A time object t is
aware if t.tzinfo is not None and t.tzinfo.utcoffset(None) does not
return None. Otherwise, t is naive.
So naive just means it does not have any time zone information.
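The rule quoted above can be written as a small check. This is a minimal sketch; is_aware is a hypothetical helper name defined here, not a function from the datetime module.

```python
from datetime import datetime, timezone


def is_aware(d):
    """Return True if datetime d is aware according to the rule quoted above."""
    return d.tzinfo is not None and d.tzinfo.utcoffset(d) is not None


naive = datetime(2011, 8, 15, 8, 15, 12)
aware = datetime(2011, 8, 15, 8, 15, 12, tzinfo=timezone.utc)

print(is_aware(naive))  # False
print(is_aware(aware))  # True
```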
To make something 'aware' follow this method by unutbu:
In general, to make a naive datetime timezone-aware, use the localize
method:
import datetime
import pytz
unaware = datetime.datetime(2011, 8, 15, 8, 15, 12, 0)
aware = datetime.datetime(2011, 8, 15, 8, 15, 12, 0, pytz.UTC)
now_aware = pytz.utc.localize(unaware)
assert aware == now_aware
For the UTC timezone, it is not really necessary to use localize
since there is no daylight savings time calculation to handle:
now_aware = unaware.replace(tzinfo=pytz.UTC)
works. (.replace returns a new datetime; it does not modify
unaware.)
To make it unaware set the timezone to None.
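A short sketch of going from aware back to naive, using only the standard library. The UTC+02:00 offset is an arbitrary example value. Note that replace(tzinfo=None) keeps the wall-clock fields as they are, so convert first if the naive value is meant to be in UTC.

```python
from datetime import datetime, timedelta, timezone

# Example aware datetime at UTC+02:00 (arbitrary offset for illustration).
aware = datetime(2011, 8, 15, 8, 15, 12, tzinfo=timezone(timedelta(hours=2)))

# Drop the timezone; the wall-clock fields stay unchanged.
naive_local = aware.replace(tzinfo=None)
print(naive_local)  # 2011-08-15 08:15:12

# Convert to UTC first when the naive value should represent UTC.
naive_utc = aware.astimezone(timezone.utc).replace(tzinfo=None)
print(naive_utc)  # 2011-08-15 06:15:12
```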

Datetime Timezone conversion using pytz

This is just another post on pytz.
There are two functions to convert datetime objects between two timezones. The second function works for all cases. The first function fails in two cases, (3) and (4). Similar SO posts did not have an issue like this. Any explanation based on the difference between localize(datetime.datetime) and replace(tzinfo) would be a great help.
>>> from dateutil.parser import parse
>>> import pytz
First function (buggy)
The function below uses datetime.datetime.replace(tzinfo).
def buggy_timezone_converter(input_dt, current_tz='UTC', target_tz='US/Eastern'):
    '''input_dt is a datetime.datetime object'''
    current_tz = pytz.timezone(current_tz)
    target_tz = pytz.timezone(target_tz)
    target_dt = input_dt.replace(tzinfo=current_tz).astimezone(target_tz)
    return target_tz.normalize(target_dt)
Notice the four datetime conversions now.
(1) from UTC to EST -- OK
>>> buggy_timezone_converter(parse('2013-02-26T04:00:00'))
Out[608]: datetime.datetime(2013, 2, 25, 23, 0, tzinfo=<DstTzInfo 'US/Eastern' EST-1 day, 19:00:00 STD>)
(2) from UTC to EDT -- OK
>>> buggy_timezone_converter(parse('2013-05-26T04:00:00'))
Out[609]: datetime.datetime(2013, 5, 26, 0, 0, tzinfo=<DstTzInfo 'US/Eastern' EDT-1 day, 20:00:00 DST>)
(3) from EST to UTC -- Not OK. Time offset is 4 hours 56 minutes. It is supposed to be 5 hours
>>> buggy_timezone_converter(parse('2013-02-26T04:00:00'), target_tz='UTC', current_tz='US/Eastern')
Out[610]: datetime.datetime(2013, 2, 26, 8, 56, tzinfo=<UTC>)
(4) from EDT to UTC -- Not OK. Time offset is 4 hours 56 minutes. It is supposed to be 4 hours. Daylight saving is not considered.
>>> buggy_timezone_converter(parse('2013-05-26T04:00:00'), current_tz='US/Eastern', target_tz='UTC')
Out[611]: datetime.datetime(2013, 5, 26, 8, 56, tzinfo=<UTC>)
Second function (Works perfectly)
The function below uses pytz.timezone.localize(datetime.datetime). It works perfectly
def good_timezone_converter(input_dt, current_tz='UTC', target_tz='US/Eastern'):
    '''input_dt is a naive datetime.datetime object'''
    current_tz = pytz.timezone(current_tz)
    target_tz = pytz.timezone(target_tz)
    target_dt = current_tz.localize(input_dt).astimezone(target_tz)
    return target_tz.normalize(target_dt)
(1) from UTC to EST -- OK
>>> good_timezone_converter(parse('2013-02-26T04:00:00'))
Out[618]: datetime.datetime(2013, 2, 25, 23, 0, tzinfo=<DstTzInfo 'US/Eastern' EST-1 day, 19:00:00 STD>)
(2) from UTC to EDT -- OK
>>> good_timezone_converter(parse('2013-05-26T04:00:00'))
Out[619]: datetime.datetime(2013, 5, 26, 0, 0, tzinfo=<DstTzInfo 'US/Eastern' EDT-1 day, 20:00:00 DST>)
(3) from EST to UTC -- OK.
>>> good_timezone_converter(parse('2013-02-26T04:00:00'), current_tz='US/Eastern', target_tz='UTC')
Out[621]: datetime.datetime(2013, 2, 26, 9, 0, tzinfo=<UTC>)
(4) from EDT to UTC -- OK.
>>> good_timezone_converter(parse('2013-05-26T04:00:00'), current_tz='US/Eastern', target_tz='UTC')
Out[620]: datetime.datetime(2013, 5, 26, 8, 0, tzinfo=<UTC>)
I assume you have these questions:
why does the first function work for UTC timezone?
why does it fail for 'US/Eastern' timezone (DstTzInfo instance)?
why does the second function work for all provided examples?
The first function is incorrect because it uses d.replace(tzinfo=dsttzinfo_instance) instead of dsttzinfo_instance.localize(d) .
The second function is correct most of the time except during ambiguous or non-existing times e.g., during DST transitions -- you can change the behaviour by passing is_dst parameter to .localize(): False(default)/True/None(raise an exception).
The first function works for the UTC timezone because UTC has a fixed UTC offset (zero) for any date. Other timezones such as America/New_York may have different UTC offsets at different times (daylight saving time, war time, any time that some local politician might think is a good idea; it can be anything, and the tz database covers most cases).
To implement the tzinfo.utcoffset(dt), tzinfo.tzname(dt), and tzinfo.dst(dt) methods, pytz uses a collection of DstTzInfo instances, each with a different set of (_tzname, _utcoffset, _dst) attributes. Given dt (date/time) and is_dst, the .localize() method chooses an appropriate (in most cases but not always) DstTzInfo instance from the collection.
pytz.timezone('America/New_York') returns a DstTzInfo instance whose (_tzname, _utcoffset, _dst) attributes correspond to some undocumented moment in time. Different pytz versions may return different values; the current version may return the tzinfo instance that corresponds to the earliest date for which zoneinfo is available. You don't want this value most of the time. I think the motivation behind the choice of the default value is to highlight the error (passing pytz.timezone to the datetime constructor or the .replace() method).
To summarize: .localize() selects appropriate utcoffset, tzname, dst values, .replace() uses the default (inappropriate) value. UTC has only one set of utcoffset, tzname, dst therefore the default value may be used and .replace() method works with UTC timezone. You need to pass a datetime object and is_dst parameter to select appropriate values for other timezones such as 'America/New_York'.
In principle, pytz could have called localize() method to implement utcoffset(), tzname(), dst() methods even if dt.tzinfo == self: it would make these methods O(log n) in time where n is number of intervals with different (utcoffset, tzname, dst) values but datetime constructor and .replace() would work as is i.e., the explicit localize() call would be necessary only to pass is_dst.
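The mechanism described in this answer can be imitated with a toy model that uses only the standard library. This is not pytz's code: ToyZone, its default attribute, and its localize method are hypothetical names, the model covers 2013 only, and it ignores ambiguous and non-existent times. The LMT offset of -4:56 is taken from the outputs in the question.

```python
from datetime import datetime, timedelta, timezone

# Fixed-offset stand-ins for the per-period instances a zone keeps internally.
LMT = timezone(timedelta(hours=-4, minutes=-56), "LMT")
EST = timezone(timedelta(hours=-5), "EST")
EDT = timezone(timedelta(hours=-4), "EDT")


class ToyZone:
    """Hypothetical, simplified model of US/Eastern for the year 2013."""

    # The "default" instance: what replace(tzinfo=...) effectively attaches.
    default = LMT

    # Local wall-clock bounds of daylight saving time in 2013.
    dst_start = datetime(2013, 3, 10, 2, 0)
    dst_end = datetime(2013, 11, 3, 2, 0)

    def localize(self, naive):
        """Attach the offset that is valid for this particular date."""
        in_dst = self.dst_start <= naive < self.dst_end
        return naive.replace(tzinfo=EDT if in_dst else EST)


zone = ToyZone()
winter = datetime(2013, 2, 26, 4, 0)
summer = datetime(2013, 5, 26, 4, 0)

for naive in (winter, summer):
    wrong = naive.replace(tzinfo=zone.default).astimezone(timezone.utc)
    right = zone.localize(naive).astimezone(timezone.utc)
    print(wrong.time(), right.time())
# 08:56:00 09:00:00
# 08:56:00 08:00:00
```

The "wrong" column matches the 4 hours 56 minutes offset seen in cases (3) and (4) of the buggy function, and the "right" column matches the output of the localize-based function.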

python compare datetimes with different timezones

I'm implementing a feature for scheduled publishing of an object.
The user chooses the time to publish, and I created a cron task that runs every minute and checks whether it's time to publish.
Users are in different timezones.
So I need to compare two datetimes:
>>user_chosen_time
datetime.datetime(2012, 12, 4, 14, 0, tzinfo=tzinfo(120))
>>curdate=datetime.datetime.now()
datetime.datetime(2012, 12, 4, 18, 4, 20, 17340)
>>user_chosen_time==curdate
*** TypeError: can't compare offset-naive and offset-aware datetimes
Sorry for rather stupid question but i need to discuss this. Thanks
As the error suggests, you "can't compare offset-naive and offset-aware datetimes". It means that you should compare two datetimes that are both timezone-aware or both timezone-naive (not timezone-aware). In your code, curdate has no timezone info and thus cannot be compared with user_chosen_time, which is timezone-aware.
First you should assign correct timezone to each datetime. And then you could directly compare two datetimes with different timezones.
Example (with pytz):
import pytz
import datetime as dt
# create timezone
nytz=pytz.timezone('America/New_York')
jptz=pytz.timezone('Asia/Tokyo')
# randomly initiate two timestamps
a=dt.datetime(2018,12,13,11,2)
b=dt.datetime(2018,12,13,22,45)
# assign timezone to timestamps
a=nytz.localize(a)
b=jptz.localize(b)
# a = datetime.datetime(2018, 12, 13, 11, 2, tzinfo=<DstTzInfo 'America/New_York' EST-1 day, 19:00:00 STD>)
# b = datetime.datetime(2018, 12, 13, 22, 45, tzinfo=<DstTzInfo 'Asia/Tokyo' JST+9:00:00 STD>)
a>b # True
b>a # False
For other methods you could refer to Convert a python UTC datetime to a local datetime using only python standard library?.
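A variant for the cron check that needs only the standard library is sketched below. It assumes that tzinfo(120) in the question means an offset of +120 minutes (UTC+02:00), and it uses a fixed "now" so the result is reproducible; in the real task, datetime.now(timezone.utc) would take its place.

```python
from datetime import datetime, timedelta, timezone

# Assumption: tzinfo(120) in the question stands for UTC+02:00.
user_chosen_time = datetime(2012, 12, 4, 14, 0,
                            tzinfo=timezone(timedelta(minutes=120)))

# Ordering a naive datetime against an aware one raises TypeError.
naive_now = datetime(2012, 12, 4, 18, 4, 20)
try:
    naive_now >= user_chosen_time
    mixed_comparison_failed = False
except TypeError:
    mixed_comparison_failed = True
print(mixed_comparison_failed)  # True

# Use an aware "now" instead; in production: datetime.now(timezone.utc).
aware_now = datetime(2012, 12, 4, 18, 4, 20, tzinfo=timezone.utc)

# Aware datetimes compare by the instant they represent, whatever their offsets.
is_due = aware_now >= user_chosen_time
print(is_due)  # True
```

Since the cron task runs once a minute, a >= check (plus a "published" flag) is likely more practical than == on exact datetimes.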
http://pytz.sourceforge.net/ is where you want to look when you want to eliminate the timezone differences :)
edit: just found this post on SO that may give you a lot more information on your problem

pytz localize vs datetime replace

I'm having some weird issues with pytz's .localize() function. Sometimes it wouldn't make adjustments to the localized datetime:
.localize behaviour:
>>> tz
<DstTzInfo 'Africa/Abidjan' LMT-1 day, 23:44:00 STD>
>>> d
datetime.datetime(2009, 9, 2, 14, 45, 42, 91421)
>>> tz.localize(d)
datetime.datetime(2009, 9, 2, 14, 45, 42, 91421,
tzinfo=<DstTzInfo 'Africa/Abidjan' GMT0:00:00 STD>)
>>> tz.normalize(tz.localize(d))
datetime.datetime(2009, 9, 2, 14, 45, 42, 91421,
tzinfo=<DstTzInfo 'Africa/Abidjan' GMT0:00:00 STD>)
As you can see, time has not been changed as a result of localize/normalize operations.
However, if .replace is used:
>>> d.replace(tzinfo=tz)
datetime.datetime(2009, 9, 2, 14, 45, 42, 91421,
tzinfo=<DstTzInfo 'Africa/Abidjan' LMT-1 day, 23:44:00 STD>)
>>> tz.normalize(d.replace(tzinfo=tz))
datetime.datetime(2009, 9, 2, 15, 1, 42, 91421,
tzinfo=<DstTzInfo 'Africa/Abidjan' GMT0:00:00 STD>)
Which seems to make adjustments into datetime.
Question is - which is correct and why other's wrong?
localize just assumes that the naive datetime you pass it is "right" (except for not knowing about the timezone!) and so just sets the timezone, no other adjustments.
You can (and it's advisable...) internally work in UTC (rather than with naive datetimes) and use replace when you need to perform I/O of datetimes in a localized way (normalize will handle DST and the like).
localize is the correct function to use for creating datetime aware objects with an initial fixed datetime value. The resulting datetime aware object will have the original datetime value. A very common usage pattern in my view, and one that perhaps pytz can better document.
replace(tzinfo = ...) is unfortunately named. It is a function that is random in its behaviour. I would advise avoiding the use of this function to set timezones unless you enjoy self-inflicted pain. I have already suffered enough from using this function.
This DstTzInfo class is used for timezones where the offset from UTC changes at certain points in time. For example (as you are probably aware), many locations transition to "daylight savings time" at the beginning of Summer, and then back to "standard time" at the end of Summer. Each DstTzInfo instance only represents one of these timezones, but the "localize" and "normalize" methods help you get the right instance.
For Abidjan, there has only ever been one transition (according to pytz), and that was in 1912:
>>> tz = pytz.timezone('Africa/Abidjan')
>>> tz._utc_transition_times
[datetime.datetime(1, 1, 1, 0, 0), datetime.datetime(1912, 1, 1, 0, 16, 8)]
The tz object we get out of pytz represents the pre-1912 timezone:
>>> tz
<DstTzInfo 'Africa/Abidjan' LMT-1 day, 23:44:00 STD>
Now looking up at your two examples, see that when you call tz.localize(d) you do NOT get this pre-1912 timezone added to your naive datetime object. It assumes that the datetime object you give it represents local time in the correct timezone for that local time, which is the post-1912 timezone.
However, in your second example using d.replace(tzinfo=tz), it takes your datetime object to represent the time in the pre-1912 timezone. This is probably not what you meant. Then when you call tz.normalize it converts this to the timezone that is correct for that datetime value, i.e. the post-1912 timezone.
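The 16-minute shift can be reproduced with plain fixed offsets. This is only an illustration of the arithmetic, not of pytz internals: the LMT and GMT objects below are stand-ins built from the offsets shown in the reprs above (LMT is UTC-00:16, GMT is UTC+00:00).

```python
from datetime import datetime, timedelta, timezone

# Stand-ins for the two Abidjan instances, using the offsets from the reprs.
LMT = timezone(timedelta(minutes=-16), "LMT")  # pre-1912
GMT = timezone(timedelta(0), "GMT")            # post-1912

d = datetime(2009, 9, 2, 14, 45, 42, 91421)

# replace(tzinfo=tz) labels the wall time as pre-1912 LMT ...
as_lmt = d.replace(tzinfo=LMT)
# ... and normalize then converts that instant to the offset valid in 2009.
normalized = as_lmt.astimezone(GMT)
print(normalized.time())  # 15:01:42.091421

# localize, by contrast, attaches the offset valid in 2009 directly.
localized = d.replace(tzinfo=GMT)
print(localized.time())  # 14:45:42.091421
```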
I realize I'm a little late on this...
but here is what I found to work well.
Work in UTC as Alex stated:
tz = pytz.timezone('Africa/Abidjan')
now = datetime.datetime.utcnow()
Then to localize:
tzoffset = tz.utcoffset(now)
mynow = now+tzoffset
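This handles DST in most cases. Note, however, that tz.utcoffset(now) interprets the naive now as local wall-clock time rather than UTC, so the result can be off around a DST transition.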
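An alternative sketch that avoids adding offsets by hand: keep the value as an aware UTC datetime and let astimezone() do the conversion. utc_to_local is a hypothetical helper name. A pytz timezone (or any other tzinfo) can be passed as tz; a fixed UTC+01:00 offset is used here only to keep the example self-contained.

```python
from datetime import datetime, timedelta, timezone


def utc_to_local(utc_dt, tz):
    """Convert an aware datetime to the timezone tz (any tzinfo object)."""
    if utc_dt.tzinfo is None:
        # Refuse naive input: its timezone would have to be guessed.
        raise ValueError("expected an aware datetime")
    return utc_dt.astimezone(tz)


# Fixed offset used purely for illustration.
plus_one = timezone(timedelta(hours=1))

utc_value = datetime(2018, 7, 23, 7, 0, 36, tzinfo=timezone.utc)
local = utc_to_local(utc_value, plus_one)
print(local.isoformat())  # 2018-07-23T08:00:36+01:00
```

In live code, datetime.datetime.now(datetime.timezone.utc) yields an aware "now" to pass in as utc_dt.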
