I have a file of tweet data. Each tweet has a 'created_at' field in the following format:
u'created_at': 1369859382
What does this 10-digit number correspond to?
Any help will be appreciated.
That could be a UNIX timestamp: http://www.onlineconversion.com/unix_time.htm
The example you gave is equivalent to Wed, 29 May 2013 20:29:42 GMT.
Here is a useful resource for mystery date/time formats: http://www.fmdiff.com/fm/timestamp.html?session=vc8uqio2fsg9op81ohnhbthclmsb21j3
It is the time in seconds since January 1, 1970. The number in your example is May 29, 2013, 1:29:42 PM (in the PDT time zone, anyway, seven hours behind UTC).
>>> import datetime
>>> datetime.datetime.fromtimestamp(1369859382)
datetime.datetime(2013, 5, 29, 13, 29, 42)
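Note that fromtimestamp() without a tz argument returns local time, which is why the result above depends on the machine's time zone. A minimal sketch, assuming you want the zone-independent UTC value instead, passes tz=timezone.utc (both names come from the standard datetime module):

```python
from datetime import datetime, timezone

created_at = 1369859382  # value from the question

# Passing tz makes the result timezone-aware and independent of the local zone.
utc_time = datetime.fromtimestamp(created_at, tz=timezone.utc)
print(utc_time.isoformat())  # 2013-05-29T20:29:42+00:00
```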
I am capturing the Date header of an email, which comes in the format below:
Fri, 27 Mar 2020 12:05:17 +0000 (UTC)
I need to transform it to the format: YYYY-MM-DD HH:MM
I tried to use the datetime.strptime function, but to no avail.
Can anyone help me with this? I'm just starting out in Python and I'm struggling with it!
I would use python-dateutil to convert your string to a datetime object, then use strftime to output a string of the desired format.
import datetime
from dateutil import parser
x = parser.parse("Fri, 27 Mar 2020 12:05:17 +0000 (UTC)")
print(x.strftime("%Y-%m-%d %H:%M"))
The output should be:
2020-03-27 12:05
Here is a custom implementation that avoids the datetime library, if you don't want that dependency.
date_given = "Fri, 27 Mar 2020 12:05:17 +0000 (UTC)"
month ={'Jan':'01','Feb':'02','Mar':'03','Apr':'04','May':'05','Jun':'06','Jul':'07','Aug':'08','Sep':'09','Oct':'10','Nov':'11','Dec':'12'}
_,d,m,y,t,_,_ = date_given.split(' ')
print(str(y)+"-"+str(month[m])+'-'+str(d)+' '+str(t[:-3]))
This gives you 2020-03-27 12:05
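Since the question mentions datetime.strptime, here is a sketch of how that route could work with only the standard library. It assumes that any trailing parenthesised comment such as " (UTC)" can simply be dropped and that the process runs with an English locale (for %a and %b). The helper name reformat_email_date is made up for illustration.

```python
from datetime import datetime

def reformat_email_date(header_value):
    """Hypothetical helper: turn an email Date header into 'YYYY-MM-DD HH:MM'."""
    # Drop the parenthesised comment, which strptime has no directive for.
    cleaned = header_value.split(" (")[0]
    # %z parses the numeric UTC offset, e.g. +0000.
    parsed = datetime.strptime(cleaned, "%a, %d %b %Y %H:%M:%S %z")
    return parsed.strftime("%Y-%m-%d %H:%M")

print(reformat_email_date("Fri, 27 Mar 2020 12:05:17 +0000 (UTC)"))  # 2020-03-27 12:05
```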
The standard library's email package provides tools to parse RFC5322 format datetime strings.
from email.headerregistry import DateHeader
kwds = {} # This dict is modified in-place
DateHeader.parse('Fri, 27 Mar 2020 12:05:17 +0000 (UTC)', kwds)
kwds['datetime']
datetime.datetime(2020, 3, 27, 12, 5, 17, tzinfo=datetime.timezone.utc)
While DateHeader is the modern tool for parsing date headers, the legacy* function email.utils.parsedate_to_datetime is easier to use:
from email.utils import parsedate_to_datetime
parsedate_to_datetime('Fri, 27 Mar 2020 12:05:17 +0000 (UTC)')
datetime.datetime(2020, 3, 27, 12, 5, 17, tzinfo=datetime.timezone.utc)
*While the docs list the utils module under the legacy API heading, parsedate_to_datetime is used internally by DateHeader to parse datetime strings, so it probably isn't going away any time soon.
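To get from the parsed value to the YYYY-MM-DD HH:MM string the question asks for, the datetime can be passed through strftime. A short sketch using only the standard library (the variable names are arbitrary):

```python
from email.utils import parsedate_to_datetime

header_value = "Fri, 27 Mar 2020 12:05:17 +0000 (UTC)"

# parsedate_to_datetime handles the RFC 5322 syntax, including the "(UTC)" comment.
parsed = parsedate_to_datetime(header_value)
formatted = parsed.strftime("%Y-%m-%d %H:%M")
print(formatted)  # 2020-03-27 12:05
```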
I receive this data from the GPS Integrated in my Vehicle:
INS_Time::INS_Time_Millisec[ms] # example of the value: 295584830.0
INS_Time::INS_Time_Week[Week] # example of the value: 2077.0
INS_Time::Leap_seconds[s] # example of the value: 18.0
What I need is a UTC timestamp, so I assume that I must combine all those different GPS values to get one, but I don't know how this should be done. I would be thankful if someone could guide me through this.
Example of the result I want to have: 1572430625230
I'm using Python 3.7. If there is a library for this, that would be very helpful; otherwise I'm also looking for an algorithm to do it.
My guess:
According to https://en.wikipedia.org/wiki/Epoch_(computing)#Notable_epoch_dates_in_computing , GPS epoch is 6 January 1980, GPS counts weeks (a week is defined to start on Sunday) and 6 January is the first Sunday of 1980
And according to http://leapsecond.com/java/gpsclock.htm , GPS time was zero at 0h 6-Jan-1980 and since it is not perturbed by leap seconds GPS is now ahead of UTC by 18 seconds.
So we have to define a gps_epoch and subtract the given leap seconds in order to get the UTC datetime:
from datetime import datetime, timedelta
import pytz
def gps_datetime(time_week, time_ms, leap_seconds):
    gps_epoch = datetime(1980, 1, 6, tzinfo=pytz.utc)
    # gps_time - utc_time = leap_seconds
    return gps_epoch + timedelta(weeks=time_week, milliseconds=time_ms, seconds=-leap_seconds)
With your example:
>>> gps_datetime(2077, 295584830.0, 18.0)
datetime.datetime(2019, 10, 30, 10, 6, 6, 830000, tzinfo=<UTC>)
>>> gps_datetime(2077, 295584830.0, 18.0).timestamp()
1572429966.83
But this is still some way off your expected result: 1572430625230 (presumably in milliseconds) is about 658 seconds later than 1572429966.83.
Do not forget to pip install pytz.
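The same calculation can also be sketched without pytz, using datetime.timezone.utc from the standard library and returning integer milliseconds since the Unix epoch, which is the unit the question seems to expect. The function name gps_to_unix_ms and the two constants are illustrative, and the leap-second count is taken from the receiver rather than from a table.

```python
from datetime import datetime, timedelta, timezone

GPS_EPOCH = datetime(1980, 1, 6, tzinfo=timezone.utc)
UNIX_EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)

def gps_to_unix_ms(time_week, time_ms, leap_seconds):
    """Convert GPS week + milliseconds-of-week to Unix time in milliseconds (UTC)."""
    # GPS time runs ahead of UTC by the leap-second count, so subtract it.
    utc = GPS_EPOCH + timedelta(weeks=time_week,
                                milliseconds=time_ms,
                                seconds=-leap_seconds)
    # Floor-dividing two timedeltas yields an exact integer count of milliseconds.
    return (utc - UNIX_EPOCH) // timedelta(milliseconds=1)

print(gps_to_unix_ms(2077.0, 295584830.0, 18.0))  # 1572429966830
```

Dividing timedeltas avoids the floating-point rounding that multiplying timestamp() by 1000 could introduce.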
I'm trying to parse a date string using the following code:
from dateutil.parser import parse
datestring = 'Thu Jul 25 15:13:16 GMT+06:00 2019'
d = parse(datestring)
print (d)
The parsed date is:
datetime.datetime(2019, 7, 25, 15, 13, 16, tzinfo=tzoffset(None, -21600))
As you can see, instead of adding 6 hours to GMT, it actually subtracted 6 hours.
What am I doing wrong here? Any help on how I can parse a datestring in this format?
There's a comment in the source: https://github.com/dateutil/dateutil/blob/cbcc0871792e7eed4a42cc62630a08ec7a78be30/dateutil/parser/_parser.py#L803.
# Check for something like GMT+3, or BRST+3. Notice
# that it doesn't mean "I am 3 hours after GMT", but
# "my time +3 is GMT". If found, we reverse the
# logic so that timezone parsing code will get it
# right.
Important parts
Notice that it doesn't mean "I am 3 hours after GMT", but "my time +3 is GMT"
If found, we reverse the logic so that timezone parsing code will get it right
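The last sentence in that comment (the second point above) explains why 6 hours are subtracted: dateutil reads GMT+06:00 as "my time + 6 hours is GMT". Hence, Thu Jul 25 15:13:16 GMT+06:00 2019 is parsed as 15:13:16 at an offset of -06:00, i.e. Thu Jul 25 21:13:16 2019 GMT.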
Take a look at http://www.timebie.com/tz/timediff.php?q1=Universal%20Time&q2=GMT%20+6%20Time for more context.
dateutil's parse does not read GMT+06:00 as "six hours ahead of GMT". It reads the input as a local time of 15:13:16 to which 6 hours must be added to reach GMT, so the result is 15:13:16 with an offset of -06:00.
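If the strings in question are known to mean "six hours ahead of GMT", one possible workaround is to bypass dateutil and parse them with datetime.strptime, treating "GMT" as literal text and letting %z read the offset. This is a sketch under two assumptions: the format is always exactly as shown, and Python is 3.7 or later (earlier versions do not accept a colon in the %z offset).

```python
from datetime import datetime, timezone

datestring = "Thu Jul 25 15:13:16 GMT+06:00 2019"

# "GMT" is matched literally; %z then reads "+06:00" as a UTC offset.
d = datetime.strptime(datestring, "%a %b %d %H:%M:%S GMT%z %Y")
print(d.isoformat())                           # 2019-07-25T15:13:16+06:00
print(d.astimezone(timezone.utc).isoformat())  # 2019-07-25T09:13:16+00:00
```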
I have a date in this format: "Tue, 28 Feb 2017 18:30:32 GMT"
I can parse it using time.strptime("Tue, 28 Feb 2017 18:30:32 GMT", "%a, %d %b %Y %H:%M:%S %Z"), but the resulting object does not keep track of the timezone.
I want to be able to know the timezone. How can I achieve that? Any help is much appreciated.
from dateutil import parser
parser.parse("Tue, 28 Feb 2017 18:30:32 GMT")
datetime.datetime(2017, 2, 28, 18, 30, 32, tzinfo=tzutc())
This problem is actually more involved than it might first appear. I understand that timezone names are not unique and that there are throngs of the things. However, if the number of them that you need to work with is manageable, and if your inputs are limited to that format, then this approach might be good for you.
>>> from dateutil.parser import *
>>> tzinfos = {'GMT': 0, 'PST': -50, 'DST': 22 }
>>> aDate = parse("Tue, 28 Feb 2017 18:30:32 GMT", tzinfos=tzinfos)
>>> aDate.tzinfo.tzname(0)
'GMT'
>>> aDate = parse("Tue, 28 Feb 2017 18:30:32 PST", tzinfos=tzinfos)
>>> aDate.tzinfo.tzname(0)
'PST'
>>> aDate = parse("Tue, 28 Feb 2017 18:30:32 DST", tzinfos=tzinfos)
>>> aDate.tzinfo.tzname(0)
'DST'
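Load the alternative timezone abbreviations into a dictionary (called tzinfos in this code), then parse away. The integer values are UTC offsets in seconds; the ones above are placeholders rather than real offsets. The timezone parsed from the date expression becomes available in the construct shown.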
Other date items are available, as you would expect.
>>> aDate.day
28
>>> aDate.month
2
>>> aDate.year
2017
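If pulling in dateutil is not an option, the same idea (a lookup table from abbreviation to UTC offset) can be sketched with the standard library. TZ_OFFSETS and parse_with_abbrev are hypothetical names, the table is yours to fill in with the abbreviations you actually expect, and the sketch assumes the abbreviation is always the last space-separated token.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical table: abbreviation -> UTC offset in seconds.
# Abbreviations are not globally unique, so list only the ones you expect.
TZ_OFFSETS = {"GMT": 0, "PST": -8 * 3600}

def parse_with_abbrev(text):
    """Parse 'Tue, 28 Feb 2017 18:30:32 GMT' into a timezone-aware datetime."""
    body, abbrev = text.rsplit(" ", 1)
    naive = datetime.strptime(body, "%a, %d %b %Y %H:%M:%S")
    # A fixed-offset timezone that also remembers its abbreviation.
    tz = timezone(timedelta(seconds=TZ_OFFSETS[abbrev]), abbrev)
    return naive.replace(tzinfo=tz)

a_date = parse_with_abbrev("Tue, 28 Feb 2017 18:30:32 PST")
print(a_date.tzname(), a_date.utcoffset())  # PST -1 day, 16:00:00
```

An unknown abbreviation raises KeyError here, which seems preferable to silently returning a naive datetime.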
I'm trying to convert 48 bits (6 octets) to a timestamp using Python for a little security project. I'm working with some network packets from the DNP3 protocol and I'm trying to decode the timestamp values for each DNP3 class object.
According to the DNP3 standard, "DNP3 time (in the form of an UINT48): Absolute time value expressed as the number of milliseconds since the start of January 1, 1970".
I have the following octets which need to be converted into a datetime:
# List of DNP3 timestamps
DNP3_ts = []
# Feb 20, 2016 00:27:07.628000000 UTC
DNP3_ts.append('\xec\x58\xed\xf9\x52\x01')
# Feb 20, 2016 00:34:08.107000000 UTC
DNP3_ts.append('\x6b\xc3\xf3\xf9\x52\x01')
# Feb 20, 2016 00:42:40.460000000 UTC
DNP3_ts.append('\xcc\x94\xfb\xf9\x52\x01')
# Feb 20, 2016 00:56:47.642000000 UTC
DNP3_ts.append('\x1a\x82\x08\xfa\x52\x01')
# Feb 20, 2016 00:56:48.295000000 UTC
DNP3_ts.append('\xa7\x84\x08\xfa\x52\x01')
# Feb 20, 2016 00:58:21.036000000 UTC
DNP3_ts.append('\xec\xee\x09\xfa\x52\x01')
# Feb 20, 2016 01:17:09.147000000 UTC
DNP3_ts.append('\x9b\x25\x1b\xfa\x52\x01')
# Feb 20, 2016 01:49:05.895000000 UTC
DNP3_ts.append('\xe7\x64\x38\xfa\x52\x01')
# Feb 20, 2016 01:58:30.648000000 UTC
DNP3_ts.append('\xf8\x02\x41\xfa\x52\x01')
for ts in DNP3_ts:
    print [ts]
So I need to figure out the following steps:
# 1. Converting the octets into a 48bit Integer (which can't be done in python)
# 2. Using datetime to calculate time from 01/01/1970
# 3. Convert current time to 48bits (6 octets)
If anyone can help me out with these steps it would be very much appreciated!
You can trivially combine the bytes to create a 48-bit integer with some bitwise operations. You can convert each octet to a uint8 with ord() and left shift them by a different multiple of 8 so they all occupy a different location in the 48-bit number.
DNP3 encodes the bytes in reverse order (least significant octet first). To visualise this, call your octets A-F from left to right, and the bits of A aaaaaaaa, etc. To go from your octets to the 48-bit number, you want to achieve this order:
A B C D E F
ffffffff eeeeeeee dddddddd cccccccc bbbbbbbb aaaaaaaa
Once you have the milliseconds, divide them by 1000 to get a float number in seconds and pass that to datetime.datetime.utcfromtimestamp(). You can further format this datetime object with the strftime() method. The code to achieve all this is
from datetime import datetime
def dnp3_to_datetime(octets):
    milliseconds = 0
    for i, value in enumerate(octets):
        milliseconds = milliseconds | (ord(value) << (i*8))
    date = datetime.utcfromtimestamp(milliseconds/1000.)
    return date.strftime('%b %d, %Y %H:%M:%S.%f UTC')
By calling this function for each of your DNP3 times, you get the following results.
Feb 19, 2016 14:27:07.628000 UTC
Feb 19, 2016 14:34:08.107000 UTC
Feb 19, 2016 14:42:40.460000 UTC
Feb 19, 2016 14:56:47.642000 UTC
Feb 19, 2016 14:56:48.295000 UTC
Feb 19, 2016 14:58:21.036000 UTC
Feb 19, 2016 15:17:09.147000 UTC
Feb 19, 2016 15:49:05.895000 UTC
Feb 19, 2016 15:58:30.648000 UTC
You'll notice that these results lag the times in your comments by exactly 10 hours. I can't figure out this discrepancy, but I don't think my approach is wrong.
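For reference, here is a Python 3 port of the same shift-and-OR decoding (iterating over a bytes object already yields integers, so ord() is not needed), followed by a check of how far the first decoded value is from the time given in the question's comment. This is a sketch rather than the original code: it returns a timezone-aware datetime instead of a formatted string.

```python
from datetime import datetime, timedelta, timezone

def dnp3_to_datetime(octets: bytes) -> datetime:
    """Decode a 6-octet little-endian DNP3 time (ms since 1970-01-01 UTC)."""
    milliseconds = 0
    for i, value in enumerate(octets):
        # Octet i contributes bits 8*i .. 8*i+7 of the 48-bit integer.
        milliseconds |= value << (i * 8)
    return datetime(1970, 1, 1, tzinfo=timezone.utc) + timedelta(milliseconds=milliseconds)

decoded = dnp3_to_datetime(b"\xec\x58\xed\xf9\x52\x01")
commented = datetime(2016, 2, 20, 0, 27, 7, 628000, tzinfo=timezone.utc)
print(decoded.isoformat())   # 2016-02-19T14:27:07.628000+00:00
print(commented - decoded)   # 10:00:00
```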
In order to go from a datetime to a DNP3 time, start by converting the time to a timestamp of milliseconds. Then, by right shifting and masking 8 bits at a time you can construct the DNP3 octets.
def datetime_to_dnp3(date=None):
    if date is None:
        date = datetime.utcnow()
    seconds = (date - datetime(1970, 1, 1)).total_seconds()
    milliseconds = int(seconds * 1000)
    return ''.join(chr((milliseconds >> (i*8)) & 0xff) for i in xrange(6))
If you call it without arguments, it'll give you the current time, but you have the option to specify any specific datetime. For example, datetime(1970, 1, 1) will return \x00\x00\x00\x00\x00\x00 and datetime(1970, 1, 1, 0, 0, 0, 1000) (one millisecond after the 1970 epoch) will return \x01\x00\x00\x00\x00\x00.
Note that, depending on the bytes in the DNP3 time, you may get weird symbols if you try to print them. Don't worry though, the bytes are still there; it's just that Python is trying to render them as characters. If you want to see the individual bytes without them interfering with each other, simply print list(DNP3_ts[i]). You may notice that it prints '\x52' as R (and similarly for other printable ASCII characters), but they are equivalent.
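In Python 3 the encoding direction can be sketched more compactly with int.to_bytes, which replaces the manual shift-and-mask loop. This version is an adaptation, not the code above: it returns a bytes object, and it assumes that a naive datetime passed in is already expressed in UTC.

```python
from datetime import datetime, timedelta, timezone

UNIX_EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)

def datetime_to_dnp3(date=None) -> bytes:
    """Encode a datetime as a 6-octet little-endian DNP3 time."""
    if date is None:
        date = datetime.now(timezone.utc)
    elif date.tzinfo is None:
        # Assumption: naive datetimes are taken to be UTC.
        date = date.replace(tzinfo=timezone.utc)
    # Exact integer milliseconds since the epoch, without float rounding.
    milliseconds = (date - UNIX_EPOCH) // timedelta(milliseconds=1)
    return milliseconds.to_bytes(6, "little")

print(datetime_to_dnp3(datetime(2016, 2, 19, 14, 27, 7, 628000)))  # b'\xecX\xed\xf9R\x01'
```

The printed value is the first timestamp from the question; Python shows \x58 and \x52 as the printable characters X and R.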
To get an integer from the input bytes in Python 3:
>>> millis = int.from_bytes(b'\xec\x58\xed\xf9\x52\x01', 'little')
>>> millis
1455892027628
To interpret the integer as "milliseconds since epoch":
>>> from datetime import datetime, timedelta
>>> utc_time = datetime(1970, 1, 1) + timedelta(milliseconds=millis)
>>> str(utc_time)
'2016-02-19 14:27:07.628000'
Note: the result is different from the one provided in the comment in your question (Feb 20, 2016 00:27:07.628).
If you need to support older Python versions, to convert bytes in the little-endian order to an unsigned integer:
>>> import binascii
>>> int(binascii.hexlify(b'\xec\x58\xed\xf9\x52\x01'[::-1]), 16)
1455892027628