Convert a UTC time to epoch - python

I am looking to analyze traffic flow with relation to weather data. The traffic data has a UNIX timestamp (aka epoch), but I am running into trouble with converting the timestamp (in the weather data) to epoch. The problem is that I am in Norway and the UTC timestamp in the weather data isn't in the same timezone as me (GMT+1).
My initial approach
I first tried converting it into epoch and treating the data as if it was in the GMT+1 timezone. Then I compensated by subtracting the difference in number of seconds between UTC and GMT+1.
Problems with the approach
I realize that this approach is primitive and inelegant (at best an ugly hack). The biggest problem, however, is that the difference between UTC and my local time is not constant (due to daylight saving time).
Question
Is there any reliable way of turning UTC time to a UNIX time stamp in python (taking into account that my machine is in GMT+1)? The timestamp is in the following format:
Y-m-d HH:MM:SS
Edit:
Tried rmunns' solution:
def convert_UTC_to_epoch(timestamp):
    tz_UTC = pytz.timezone('UTC')
    time_format = "%Y-%m-%d %H:%M:%S"
    naive_timestamp = datetime.datetime.strptime(timestamp, time_format)
    aware_timestamp = tz_UTC.localize(naive_timestamp)
    epoch = aware_timestamp.strftime("%s")
    return int(epoch)
This does not work properly as evidenced below:
#Current time at time of the edit is 15:55:00 UTC on June 9th 2014.
>>> diff = time.time() - convert_UTC_to_epoch("2014-06-09 15:55:00")
>>> diff
3663.25887799263
>>> #This is about an hour off.

The solution was to use the calendar module (inspired from here)
>>> # Quick and dirty demo
>>> print(calendar.timegm(datetime.datetime.utcnow().utctimetuple()) - time.time())
-0.6182510852813721
And here is the conversion function:
import calendar, datetime, time
# timestamp is a datetime object in UTC time
def UTC_time_to_epoch(timestamp):
    epoch = calendar.timegm(timestamp.utctimetuple())
    return epoch
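Since the weather timestamps arrive as strings rather than datetime objects, the function above can be combined with the parsing step. This is a minimal sketch, assuming the Y-m-d HH:MM:SS format from the question; the name utc_string_to_epoch is hypothetical.

```python
import calendar
import datetime

TIME_FORMAT = "%Y-%m-%d %H:%M:%S"  # format given in the question

def utc_string_to_epoch(timestamp):
    """Parse a UTC wall-clock string and return whole seconds since the epoch."""
    naive_utc = datetime.datetime.strptime(timestamp, TIME_FORMAT)
    # timegm() treats the time tuple as UTC, so the machine's timezone is irrelevant
    return calendar.timegm(naive_utc.timetuple())

print(utc_string_to_epoch("2014-06-09 15:55:00"))
```

Because nothing here consults the local timezone, the result should be the same on a machine in Norway and on one set to UTC.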

An alternative: datetime has its own .strptime() method.
http://en.wikipedia.org/wiki/Unix_time
The Unix epoch is the time 00:00:00 UTC on 1 January 1970 (or 1970-01-01T00:00:00Z ISO 8601).
import datetime
unix_epoch = datetime.datetime(1970, 1, 1)
log_dt = datetime.datetime.strptime("14-05-07 12:14:16", "%y-%m-%d %H:%M:%S")
seconds_from_epoch = (log_dt - unix_epoch).total_seconds()
# -> 1399464856.0
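On Python 3.3 or later, the same subtraction is available through datetime.timestamp(), provided the datetime is marked as UTC first. A hedged sketch follows; the sample string is taken from the question's edit rather than from this answer.

```python
from datetime import datetime, timezone

log_dt = datetime.strptime("2014-06-09 15:55:00", "%Y-%m-%d %H:%M:%S")
# Mark the parsed value as UTC; a naive datetime would be read as local time
aware_dt = log_dt.replace(tzinfo=timezone.utc)
seconds_from_epoch = aware_dt.timestamp()

# Equivalent explicit subtraction against an aware epoch
unix_epoch = datetime(1970, 1, 1, tzinfo=timezone.utc)
assert seconds_from_epoch == (aware_dt - unix_epoch).total_seconds()
print(seconds_from_epoch)
```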

The pytz module will probably help you. It allows you to write code like:
import pytz
import datetime
tz_oslo = pytz.timezone('Europe/Oslo')
time_format = "%Y-%m-%d %H:%M:%S"
naive_timestamp = datetime.datetime(2014, 6, 4, 12, 34, 56)
# Or:
naive_timestamp = datetime.datetime.strptime("2014-06-04 12:34:56", time_format)
aware_timestamp = tz_oslo.localize(naive_timestamp)
print(aware_timestamp.strftime(time_format + " %Z%z"))
This should print "2014-06-04 12:34:56 CEST+0200". Note that localize() attaches the timezone without shifting the wall-clock time.
Do note the following from the pytz manual:
The preferred way of dealing with times is to always work in UTC, converting to localtime only when generating output to be read by humans.
So keep that in mind as you write your code: do the conversion to local time once and once only, and you'll have a much easier time doing, say, comparisons between two timestamps correctly.
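To illustrate the difference between attaching a zone and converting between zones, here is a small sketch that uses only the standard library. It assumes a fixed +02:00 offset named CEST as a stand-in for pytz's Europe/Oslo in June; a fixed offset does not follow daylight saving rules, so this is for illustration only.

```python
from datetime import datetime, timedelta, timezone

# Assumption: fixed offset standing in for Europe/Oslo during summer time
cest = timezone(timedelta(hours=2), "CEST")

naive = datetime(2014, 6, 4, 12, 34, 56)
# Attach the zone; the wall clock is unchanged.
# (With pytz zones, use tz.localize(naive) instead of replace().)
local_dt = naive.replace(tzinfo=cest)
# Express the same instant in UTC; here the wall clock does change
utc_dt = local_dt.astimezone(timezone.utc)

time_format = "%Y-%m-%d %H:%M:%S"
print(local_dt.strftime(time_format + " %Z%z"))
print(utc_dt.strftime(time_format + " %Z%z"))
```

Both objects describe the same moment, so comparing them with == gives True even though their printed wall-clock times differ.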
Update: Here are a couple of videos you may find useful:
What you need to know about datetimes, a PyCon 2012 presentation by Taavi Burns (30 minutes)
Drive-in Double Header: Datetimes and Log Analysis, a two-part presentation. (Caution: annoying buzz in the video, but I couldn't find a copy with better sound). The first part is the "What you need to know about datetimes" presentation I linked just above, and the second part has some practical tips for parsing log files and doing useful things with them. (50 minutes)
Update 2: The convert_UTC_to_epoch() function you mention in your updated question (which I've reproduced below) is returning local time, not UTC:
def convert_UTC_to_epoch(timestamp):
    tz_UTC = pytz.timezone('UTC')
    time_format = "%Y-%m-%d %H:%M:%S"
    naive_timestamp = datetime.datetime.strptime(timestamp, time_format)
    aware_timestamp = tz_UTC.localize(naive_timestamp)
    epoch = aware_timestamp.strftime("%s")
    return int(epoch)
The problem is that you're using strftime("%s"), which is undocumented and is returning the wrong result. Python doesn't support the %s parameter, but it appears to work because it gets passed to your system's strftime() function, which does support the %s parameter -- but it returns local time! You're taking a UTC timestamp and parsing it as local time, which is why it's an hour off. (The mystery is why it isn't two hours off -- isn't Norway in daylight savings time right now? Shouldn't you be at UTC+2?)
As you can see from the interactive Python session below, I'm in the UTC+7 timezone and your convert_UTC_to_epoch() function is seven hours off for me.
# Current time is 02:42 UTC on June 10th 2014, 09:42 local time
>>> time.timezone
-25200
>>> time.time() - convert_UTC_to_epoch("2014-06-10 02:42:00")
25204.16531395912
>>> time.time() + time.timezone - convert_UTC_to_epoch("2014-06-10 02:42:00")
6.813306093215942
The strftime("%s") call is interpreting 02:42 on June 10th as being in local time, which would be 19:42 UTC on June 9th. Subtracting 19:42 UTC on June 9th from 02:42 UTC June 10th (which is what time.time() returns) gives a difference of seven hours. See Convert python datetime to epoch with strftime for more details on why you should never use strftime("%s").
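The effect can be reproduced without strftime("%s"). The sketch below converts the same naive value once as UTC (calendar.timegm) and once as local time (time.mktime, which, as far as I know, is roughly what the C library does for %s). The variable names are illustrative.

```python
import calendar
import time
from datetime import datetime

naive = datetime(2014, 6, 10, 2, 42, 0)   # wall-clock value that is meant as UTC

as_utc = calendar.timegm(naive.timetuple())      # interprets the fields as UTC
as_local = int(time.mktime(naive.timetuple()))   # interprets the fields as local time

# The gap is the machine's UTC offset at that moment (0 on a UTC machine)
skew = as_utc - as_local
print(as_utc, as_local, skew)
```

On a machine in UTC+7 the skew would be 25200 seconds, which matches the seven-hour error in the session above.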
(By the way, if you saw what I had previously written under the heading "Update 2", where I claimed that time.time() was returning local time, ignore that -- I got it wrong. I was fooled at first by the strftime("%s") bug just like you were.)

You can use the time and datetime modules:
import time, datetime
date = "14-05-07 12:14:16" #Change to whatever date you want
date = time.strptime(date, "%y-%m-%d %H:%M:%S")
epoch = datetime.datetime.fromtimestamp(time.mktime(date)).strftime('%s')
This runs as:
>>> import time, datetime
>>> date = "14-05-07 12:14:16"
>>> date = time.strptime(date, "%y-%m-%d %H:%M:%S")
>>> epoch = datetime.datetime.fromtimestamp(time.mktime(date)).strftime('%s')
>>> epoch
'1399490056'
>>>

Related

'Europe/Madrid' timezone doesn't match 'Etc/GMT+1'

I'm trying to convert a UTC timestamp to one in the Spanish timezone.
>>> import datetime as dt
>>> import pytz
>>> today = dt.datetime.utcfromtimestamp(1573516800)
>>> today
datetime.datetime(2019, 11, 12, 0, 0)
>>> today.replace(tzinfo=pytz.timezone('Europe/Madrid')).timestamp()
1573517700.0
>>> today.replace(tzinfo=pytz.timezone('Etc/GMT+1')).timestamp()
1573520400.0
I'm surprised that I get different results for Europe/Madrid and Etc/GMT+1. Why is this? Should Europe/Madrid be used differently, or is it possibly a bug?
A few things:
Europe/Madrid is UTC+1 during standard time, and UTC+2 during summer time (aka daylight saving time).
Etc/GMT+1 is UTC-1 for the entire year. Note the sign is opposite what you might expect. See the explanation in the tzdata sources, and on Wikipedia.
Since Madrid is on UTC+1 on the date you gave, you would get the same result for that date if you used Etc/GMT-1. However, I don't recommend that, as you would then later get the wrong result for a date during summer time.
The Etc/GMT±X zones are intended to be used primarily for non-localizable scenarios such as tracking time onboard ships at sea - not for populated locations on land.
As Mason's answer showed, you should be using the localize function rather than replace to assign a time zone. This is covered in the pytz documentation.
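The sign inversion can be checked with plain arithmetic. This sketch uses fixed offsets from the standard library as stand-ins for the Etc/GMT zones (an assumption made to avoid depending on pytz).

```python
from datetime import datetime, timedelta, timezone

today = datetime(2019, 11, 12, 0, 0)   # the naive value from the question

# Etc/GMT+1 follows the POSIX sign convention, so it means UTC-1
etc_gmt_plus_1 = timezone(timedelta(hours=-1))
# Etc/GMT-1 means UTC+1, which matches Madrid in November
etc_gmt_minus_1 = timezone(timedelta(hours=1))

ts_plus = today.replace(tzinfo=etc_gmt_plus_1).timestamp()
ts_minus = today.replace(tzinfo=etc_gmt_minus_1).timestamp()
print(ts_plus)
print(ts_minus)
```

The first value reproduces the 1573520400.0 from the question. The second is one hour before the original 1573516800, as expected for midnight at UTC+1.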
UTC timestamp: the number of seconds since January 1st, 1970 at 00:00 UTC.
Python datetime: a user-friendly way of representing that time.
The UTC timestamp is not affected by timezones, but the datetime is.
This code takes the given timestamp, converts it to a UTC datetime, and then converts that to the Europe/Madrid timezone.
import datetime as dt
import pytz
# define the old and new timezones
old_timezone = pytz.timezone("UTC")
new_timezone = pytz.timezone("Europe/Madrid")
# get an 'offset-aware' datetime
today = dt.datetime.utcfromtimestamp(1573516800)
my_datetime = old_timezone.localize(today)
# returns datetime in the new timezone
my_datetime_in_new_timezone = my_datetime.astimezone(new_timezone)
print("Old:", str(my_datetime), "\nNew:", str(my_datetime_in_new_timezone), "\nDifference:",
      str(my_datetime - my_datetime_in_new_timezone))
Output:
Old: 2019-11-12 00:00:00+00:00
New: 2019-11-12 01:00:00+01:00
Difference: 0:00:00
Code adapted from:
Python: How do you convert datetime/timestamp from one timezone to another timezone?

Python Date to EPOCH conversion

import time
print(int(time.mktime(time.strptime('2017-08-12T17:07:46', '%Y-%m-%dT%H:%M:%S'))))
I get 1502582866 but I expected 1502557666. Any help is welcome.
Yes, @Paul is correct: according to the documentation, mktime takes a struct_time in local time and converts it to seconds since the epoch, which has neither timezones nor daylight saving time.
So, your initial time of:
`2017-08-12T17:07:46`
becomes:
1502582866 ==> `GMT: Sunday, August 13, 2017 12:07:46 AM`
which is correct.
If you are wondering how to correctly convert back, you would use localtime:
import time
epoch_time = 1502582866
time_string = time.strftime('%Y-%m-%dT%H:%M:%S', time.localtime(epoch_time))
print(time_string)
# -> 2017-08-12T17:07:46
Also see a sample in action here: https://eval.in/845075
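If the string is meant as UTC rather than local time, which would explain the expected value of 1502557666, calendar.timegm() can replace time.mktime(). A minimal sketch under that assumption:

```python
import calendar
import time

parsed = time.strptime('2017-08-12T17:07:46', '%Y-%m-%dT%H:%M:%S')
# timegm() is the UTC counterpart of mktime(): no local timezone is applied
epoch_utc = calendar.timegm(parsed)
print(epoch_utc)
```

To convert back under the same assumption, time.gmtime(epoch_utc) would be used in place of time.localtime().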

Python time library: how do I preserve dst with strptime and strftime

I need to store a timestamp in a readable format, and then later on I need to convert it to epoch for comparison purposes.
I tried doing this:
import time
format = '%Y %m %d %H:%M:%S +0000'
timestamp1 = time.strftime(format,time.gmtime()) # '2016 03 25 04:06:22 +0000'
t1 = time.strptime(timestamp1, format) # time.struct_time(..., tm_isdst=-1)
time.sleep(1)
epoch_now = time.mktime(time.gmtime())
epoch_t1 = time.mktime(t1)
print("Delta: %s" % (epoch_now - epoch_t1))
Running this, instead of getting Delta of 1 sec, I get 3601 (1 hr 1 sec), CONSISTENTLY.
Investigating further, it seems that when I just do time.gmtime(), the struct has tm_isdst=0, whereas the converted struct t1 from timestamp1 string has tm_isdst=-1.
How can I ensure the isdst is preserved to 0. I think that's probably the issue here.
Or is there a better way to record time in human readable format (UTC), and yet be able to convert back to epoch properly for time diff calculation?
UPDATES:
After doing more research last night, I switched to using datetime because it preserves more information in the datetime object, and this is confirmed by albertoql's answer below.
Here's what I have now:
import time
from datetime import datetime
format = '%Y-%m-%d %H:%M:%S.%f +0000' # +0000 is optional; only for user to see it's UTC
d1 = datetime.utcnow()
timestamp1 = d1.strftime(format)
d1a = datetime.strptime(timestamp1, format)
time.sleep(1)
d2 = datetime.utcnow()
print("Delta: %s" % (d2 - d1a).seconds)
I chose not to add tz to keep it simple/shorter; I can still strptime that way.
Below, first an explanation about the problem, then two possible solutions, one using time, another using datetime.
Problem explanation
The problem lies in the observation that the OP made in the question: tm_isdst=-1. tm_isdst is a flag that indicates whether daylight saving time is in effect (for more details see https://docs.python.org/2/library/time.html#time.struct_time).
Specifically, given the format of the time string from the OP (which complies with the RFC 2822 Internet email standard), time.strptime does not store the timezone information, namely +0000. Thus, when the struct_time is rebuilt from the string, tm_isdst=-1, meaning unknown. How that gap is filled during the calculation depends on the local system. For example, if the system is set to a North American timezone where daylight saving time is in effect, tm_isdst is treated as set.
Solution with time
If you want to use only the time package, the easiest way to parse the information directly is to state that the time is in UTC by adding %Z to the format. Note that time provides no way to store timezone information in struct_time, so time.strftime does not print the timezone associated with the stored time; the zone name is taken from the system. It is therefore not possible to use the same format string for time.strftime. The code for writing and reading the string would look like:
import time
format = '%Y %m %d %H:%M:%S UTC'
format2 = '%Y %m %d %H:%M:%S %Z'
timestamp1 = time.strftime(format, time.gmtime())
t1 = time.strptime(timestamp1, format2)
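Another option that stays within the standard library is to sidestep tm_isdst altogether by converting with calendar.timegm(), which always treats the struct_time as UTC. This is a swapped-in technique rather than the approach described above, sketched with the format string from the question.

```python
import calendar
import time

fmt = '%Y %m %d %H:%M:%S +0000'   # the literal +0000 is only a label for readers

now = int(time.time())
timestamp1 = time.strftime(fmt, time.gmtime(now))   # store as readable UTC text
t1 = time.strptime(timestamp1, fmt)                 # tm_isdst is -1 here
epoch_t1 = calendar.timegm(t1)                      # timegm() ignores tm_isdst

print("Delta: %s" % (now - epoch_t1))
```

Since time.mktime() is never called, the local timezone and its DST rules play no part in the round trip.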
Solution with datetime
Another solution involves the use of the datetime and dateutil packages, which directly support timezones, and the code could be (assuming that preserving the timezone information is a requirement):
from datetime import datetime
from dateutil import tz, parser
import time
time_format = '%Y %m %d %H:%M:%S %z'
utc_zone = tz.gettz('UTC')
utc_time1 = datetime.utcnow()
utc_time1 = utc_time1.replace(tzinfo=utc_zone)
utc_time1_string = utc_time1.strftime(time_format)
utc_time1 = parser.parse(utc_time1_string)
time.sleep(1)
utc_time2 = datetime.utcnow()
utc_time2 = utc_time2.replace(tzinfo=utc_zone)
print("Delta: %s" % (utc_time2 - utc_time1).total_seconds())
There are some aspects to pay attention to:
After the call of utcnow, the timezone is not set, as it is a naive UTC datetime. If the information about UTC is not needed, it is possible to delete both lines where the timezone is set for the two times, and the result would be the same, as there is no guess about DST.
It is not possible to use datetime.strptime here because %z is not correctly parsed (this applies to Python 2; Python 3.2 and later do parse %z). If the string contains timezone information, then parser should be used.
It is possible to directly perform the difference from two instances of datetime and transform the resulting delta into seconds.
If it is necessary to get the time in seconds since the epoch, an explicit computation should be made, as there is no direct function that does that automatically in datetime (at the time of the answer). Because utc_time2 is timezone-aware, the epoch it is subtracted from must be aware as well. Below is the code, for example for utc_time2:
epoch_time = datetime(1970, 1, 1, tzinfo=utc_zone)
epoch2 = (utc_time2 - epoch_time).total_seconds()
Keep datetime.resolution in mind, namely the smallest possible difference between two non-equal datetime objects: the computed difference is accurate only up to that resolution.
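On Python 3, much of this is available directly in datetime: strptime() parses %z, and aware datetimes have a timestamp() method. Here is a hedged sketch of the same round trip without dateutil, using a fixed sample time (the one from the question's comment) so the result is reproducible.

```python
from datetime import datetime, timezone

TIME_FORMAT = "%Y %m %d %H:%M:%S %z"

# Fixed sample instant; in real code this would be datetime.now(timezone.utc)
original = datetime(2016, 3, 25, 4, 6, 22, tzinfo=timezone.utc)

text = original.strftime(TIME_FORMAT)             # readable form with +0000
restored = datetime.strptime(text, TIME_FORMAT)   # aware datetime again

print(text)
print(restored.timestamp())   # seconds since the epoch, no manual subtraction
```

Because the restored value is aware, no guess about DST or the local timezone is involved at any step.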

python converting string in localtime to UTC epoch timestamp

I have strings in YMD hms format that had the timezone stripped. But I know they are in Eastern time with daylight savings time.
I am trying to convert them into epoch timestamps for UTC time.
I wrote the following function:
def ymdhms_timezone_dst_to_epoch(input_str, tz="US/Eastern"):
    print(input_str)
    dt = datetime.datetime.fromtimestamp(time.mktime(time.strptime(input_str,'%Y-%m-%d %H:%M:%S')))
    local_dt = pytz.timezone(tz).localize(dt)
    print(local_dt.strftime('%Y-%m-%d %H:%M:%S %Z%z'))
    utc_dt = local_dt.astimezone(pytz.utc)
    print(utc_dt.strftime('%Y-%m-%d %H:%M:%S %Z%z'))
    e = int(utc_dt.strftime("%s"))
    print(e)
    return e
Given string `2015-04-20 21:12:07` this prints:
2015-04-20 21:12:07
2015-04-20 21:12:07 EDT-0400 #<- so far so good?
2015-04-21 01:12:07 UTC+0000 #<- so far so good?
1429596727
which looks ok up to the epoch timestamp. But http://www.epochconverter.com/epoch/timezones.php?epoch=1429596727 says it should map to
Greenwich Mean Time Apr 21 2015 06:12:07 UTC.
What is wrong?
I have strings in YMD hms format that had the timezone stripped. But I know they are in Eastern time with daylight savings time.
A portable way is to use pytz:
#!/usr/bin/env python
from datetime import datetime
import pytz # $ pip install pytz
naive_dt = datetime.strptime('2015-04-20 21:12:07', '%Y-%m-%d %H:%M:%S')
tz = pytz.timezone('US/Eastern')
eastern_dt = tz.normalize(tz.localize(naive_dt))
print(eastern_dt)
# -> 2015-04-20 21:12:07-04:00
I am trying to convert them into epoch timestamps for UTC time.
timestamp = (eastern_dt - datetime(1970, 1, 1, tzinfo=pytz.utc)).total_seconds()
# -> 1429578727.0
See Converting datetime.date to UTC timestamp in Python.
There are multiple issues in your code:
time.mktime() may return a wrong result for ambiguous input time (50% chance) e.g., during "fall back" DST transition in the Fall
time.mktime() and datetime.fromtimestamp() may fail for past/future dates if they have no access to a historical timezone database on a system (notably, Windows)
localize(dt) may return a wrong result for ambiguous or non-existent time i.e., during DST transitions. If you know that the time corresponds to the summer time then use is_dst=True. tz.normalize() is necessary here, to adjust possible non-existing times in the input
utc_dt.strftime("%s") is not portable and it does not respect tzinfo object. It interprets input as a local time i.e., it returns a wrong result unless your local timezone is UTC.
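The epoch subtraction itself needs nothing beyond the standard library once the UTC offset is known. The sketch below hard-codes EDT as UTC-4, which is an assumption valid only for this sample date; real code should obtain the offset from a timezone database such as pytz, as shown above.

```python
from datetime import datetime, timedelta, timezone

# Assumption: 2015-04-20 falls in daylight saving time, so US/Eastern is UTC-4
edt = timezone(timedelta(hours=-4), "EDT")

naive_dt = datetime.strptime('2015-04-20 21:12:07', '%Y-%m-%d %H:%M:%S')
eastern_dt = naive_dt.replace(tzinfo=edt)

# Subtract an aware epoch; no mktime() and no strftime("%s") involved
epoch = datetime(1970, 1, 1, tzinfo=timezone.utc)
timestamp = (eastern_dt - epoch).total_seconds()
print(timestamp)
```

The result agrees with the pytz-based value above and does not depend on the timezone of the machine running the code.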
Can I just always set is_dst=True?
You can, if you don't mind getting imprecise results for ambiguous or non-existent times e.g., there is DST transition in the Fall in America/New_York time zone:
>>> from datetime import datetime
>>> import pytz # $ pip install pytz
>>> tz = pytz.timezone('America/New_York')
>>> ambiguous_time = datetime(2015, 11, 1, 1, 30)
>>> time_fmt = '%Y-%m-%d %H:%M:%S%z (%Z)'
>>> tz.localize(ambiguous_time).strftime(time_fmt)
'2015-11-01 01:30:00-0500 (EST)'
>>> tz.localize(ambiguous_time, is_dst=False).strftime(time_fmt) # same
'2015-11-01 01:30:00-0500 (EST)'
>>> tz.localize(ambiguous_time, is_dst=True).strftime(time_fmt) # different
'2015-11-01 01:30:00-0400 (EDT)'
>>> tz.localize(ambiguous_time, is_dst=None).strftime(time_fmt)
Traceback (most recent call last):
...
pytz.exceptions.AmbiguousTimeError: 2015-11-01 01:30:00
The clocks are turned back at 2a.m. on the first Sunday in November:
is_dst disambiguation flag may have three values:
False -- default, assume the winter time
True -- assume the summer time
None -- raise an exception for ambiguous/non-existent times.
is_dst value is ignored for existing unique local times.
Here's a plot from PEP 0495 -- Local Time Disambiguation that illustrates the DST transition:
The local time repeats itself twice in the fold (summer time -- before the fold, winter time -- after).
To be able to disambiguate the local time automatically, you need some additional info e.g., if you read a series of local times then it may help if you know that they are sorted: Parsing of Ordered Timestamps in Local Time (to UTC) While Observing Daylight Saving Time.
First of all, '%s' is not supported on all platforms; it's actually working for you because your platform C library’s strftime() function (which is called by Python) supports it. This function is most probably what is causing the issue. I am guessing it's not timezone-aware, hence when taking the difference from the epoch it uses your local timezone, which is most probably EST(?)
Instead of relying on '%s', which only works on a few platforms (Linux, I believe), you should manually subtract the epoch (1970/1/1 00:00:00) from the datetime you got to get the actual seconds since the epoch. Example -
e = (utc_dt - datetime.datetime(1970,1,1,0,0,0,tzinfo=pytz.utc)).total_seconds()
Demo -
>>> (utc_dt - datetime.datetime(1970,1,1,0,0,0,tzinfo=pytz.utc)).total_seconds()
1429578727.0
This correctly corresponds to the date-time you get.
I don't exactly know why but you have to remove the timezone info from your utc_dt before using %s to print it.
e = int(utc_dt.replace(tzinfo=None).strftime("%s"))
print(e)
return e

Does Python's time.time() return the local or UTC timestamp?

Does time.time() in the Python time module return the system's time or the time in UTC?
The time.time() function returns the number of seconds since the epoch, as a float. Note that "the epoch" is defined as the start of January 1st, 1970 in UTC. So the epoch is defined in terms of UTC and establishes a global moment in time. No matter where on Earth you are, "seconds past epoch" (time.time()) returns the same value at the same moment.
Here is some sample output I ran on my computer, converting it to a string as well.
>>> import time
>>> ts = time.time()
>>> ts
1355563265.81
>>> import datetime
>>> datetime.datetime.fromtimestamp(ts).strftime('%Y-%m-%d %H:%M:%S')
'2012-12-15 01:21:05'
>>>
The ts variable is the time returned in seconds. I then converted it to a human-readable string using the datetime library.
This is for the text form of a timestamp that can be used in your text files. (The title of the question was different in the past, so the introduction to this answer was changed to clarify how it could be interpreted as the time. [updated 2016-01-14])
You can get the timestamp as a string using the .now() or .utcnow() of the datetime.datetime:
>>> import datetime
>>> print(datetime.datetime.utcnow())
2012-12-15 10:14:51.898000
The now differs from utcnow as expected -- otherwise they work the same way:
>>> print(datetime.datetime.now())
2012-12-15 11:15:09.205000
You can render the timestamp to the string explicitly:
>>> str(datetime.datetime.now())
'2012-12-15 11:15:24.984000'
Or you can be even more explicit to format the timestamp the way you like:
>>> datetime.datetime.now().strftime("%A, %d. %B %Y %I:%M%p")
'Saturday, 15. December 2012 11:19AM'
If you want the ISO format, use the .isoformat() method of the object:
>>> datetime.datetime.now().isoformat()
'2013-11-18T08:18:31.809000'
You can use these in variables for calculations and printing without conversions.
>>> ts = datetime.datetime.now()
>>> tf = datetime.datetime.now()
>>> te = tf - ts
>>> print(ts)
2015-04-21 12:02:19.209915
>>> print(tf)
2015-04-21 12:02:30.449895
>>> print(te)
0:00:11.239980
Based on the answer from @squiguy, to get a true timestamp I would cast it from float to int.
>>> import time
>>> ts = int(time.time())
>>> print(ts)
1389177318
At least that's the concept.
The answer could be neither or both.
neither: time.time() returns approximately the number of seconds elapsed since the Epoch. The result doesn't depend on timezone so it is neither UTC nor local time. Here's the POSIX definition for "Seconds Since the Epoch".
both: time.time() doesn't require your system's clock to be synchronized so it reflects its value (though it has nothing to do with local timezone). Different computers may get different results at the same time. On the other hand if your computer time is synchronized then it is easy to get UTC time from the timestamp (if we ignore leap seconds):
import time
from datetime import datetime
timestamp = time.time()
utc_dt = datetime.utcfromtimestamp(timestamp)
On how to get timestamps from UTC time in various Python versions, see How can I get a date converted to seconds since epoch according to UTC?
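On Python 3, an aware UTC datetime can be requested directly, which avoids the naive result that utcfromtimestamp() returns. This is a small sketch using the sample timestamp from the earlier answer, truncated to whole seconds.

```python
from datetime import datetime, timezone

ts = 1355563265   # sample value, truncated to whole seconds

utc_dt = datetime.fromtimestamp(ts, tz=timezone.utc)   # aware, always UTC
print(utc_dt.isoformat())

# Converting back yields the original number regardless of the local timezone
assert utc_dt.timestamp() == ts
```

Passing tz explicitly is the design choice that matters here: without it, fromtimestamp() returns a naive datetime in the machine's local time.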
To get a local timestamp using the datetime library (Python 3.x):
#wanted format: year-month-day hour:minute:seconds
from datetime import datetime
# get time now
dt = datetime.now()
# format it to a string
timeStamp = dt.strftime('%Y-%m-%d %H:%M:%S')
# print it to screen
print(timeStamp)
I eventually settled for:
>>> import time
>>> time.mktime(time.gmtime())
1509467455.0
There is no such thing as an "epoch" in a specific timezone. The epoch is well-defined as a specific moment in time, so if you change the timezone, the time itself changes as well. Specifically, this time is Jan 1 1970 00:00:00 UTC. So time.time() returns the number of seconds since the epoch.
A timestamp is always in UTC, but when you call datetime.datetime.fromtimestamp it returns the time in your local timezone corresponding to that timestamp, so the result depends on your system's timezone setting.
>>> import time, datetime
>>> time.time()
1564494136.0434234
>>> datetime.datetime.now()
datetime.datetime(2019, 7, 30, 16, 42, 3, 899179)
>>> datetime.datetime.fromtimestamp(time.time())
datetime.datetime(2019, 7, 30, 16, 43, 12, 4610)
There is a nice library, arrow, that behaves differently. In the same case it returns a time object with the UTC timezone.
>>> import arrow
>>> arrow.now()
<Arrow [2019-07-30T16:43:27.868760+03:00]>
>>> arrow.get(time.time())
<Arrow [2019-07-30T13:43:56.565342+00:00]>
time.time() returns the Unix timestamp.
You can use the datetime library to get the local time or the UTC time.
import datetime
local_time = datetime.datetime.now()
print(local_time.strftime('%Y%m%d %H%M%S'))
utc_time = datetime.datetime.utcnow()
print(utc_time.strftime('%Y%m%d %H%M%S'))
