I have a string:
r = 'Thu Dec 17 08:56:41 CST 2020'
Here CST represents China Standard Time ('Asia/Shanghai'). I want to parse it into a datetime. I am doing something like this:
import pytz
from dateparser import parse
r1 = parse(r)
Which is giving me r1 as:
2020-12-17 08:56:41-06:00
And I am also doing this
r2 = r1.replace(tzinfo=pytz.timezone("Asia/Shanghai"))
And this is giving me r2 as:
2020-12-17 08:50:41+08:00
There is a 6-minute lag in r2. Can someone tell me why? And how do I correctly convert my raw string r into the desired r2, which is:
2020-12-17 08:56:41 in Asia/Shanghai timezone
Thanks
Using dateutil.parser you can directly parse your date correctly.
Note that CST is an ambiguous timezone, so you need to specify which one you mean. You can either do this directly in the tzinfos parameter of the parse() call or you can define a dictionary that has mappings for timezones and pass this. In this dict, you can either specify the offset, e.g.
timezone_info = {
"CDT": -5 * 3600,
"CEST": 2 * 3600,
"CST": 8 * 3600
}
parser.parse(r, tzinfos=timezone_info)
or (using gettz) directly specify a timezone:
timezone_info = {
"CDT": gettz("America/Chicago"),
"CEST": gettz("Europe/Berlin"),
"CST": gettz("Asia/Shanghai")
}
parser.parse(r, tzinfos=timezone_info)
See also the dateutil.parser documentation and the answers to this SO question.
Be aware that the latter approach is tricky if you have a location with daylight saving time! Depending on the date you apply it to, gettz("America/Chicago") will have UTC-5 or UTC-6 as a result (as Chicago switches between Central Standard Time and Central Daylight Time). So depending on your input data, the second example may actually not really be correct and yield the wrong outcome! Currently, China observes China Standard Time (CST) all year, so for your use case it makes no difference (may depend on your date range though).
Overall:
from dateutil import parser
from dateutil.tz import gettz
timezone_info = {"CST": gettz("Asia/Shanghai")}
r = 'Thu Dec 17 08:56:41 CST 2020'
d = parser.parse(r, tzinfos=timezone_info)
print(d)
print(d.strftime('%Y-%m-%d %H:%M:%S %Z%z'))
gets you
2020-12-17 08:56:41+08:00
2020-12-17 08:56:41 CST+0800
EDIT: Printing the human-readable timezone name instead of the abbreviated one is a little more complicated with this approach, because dateutil.tz.gettz() returns a tzfile object that has no public attribute holding just the name. However, you can derive it from the protected _filename attribute using split():
print(d.strftime('%Y-%m-%d %H:%M:%S') + " in " + "/".join(d.tzinfo._filename.split('/')[-2:]))
yields
2020-12-17 08:56:41 in Asia/Shanghai
This of course only works if you used gettz() to set the timezone in the first place.
EDIT 2: If you know that all your dates are in CST anyway, you can also ignore the timezone when parsing. This gets you naive (or unaware) datetimes, to which you can later attach a human-readable timezone. With dateutil you can do this using replace() and gettz() as shown above. With the pytz module, use the localize() method of the timezone object rather than replace(), because passing a pytz timezone straight to tzinfo picks up the zone's historical local mean time offset (about +08:06 for Asia/Shanghai) instead of +08:00:
from dateutil import parser
from dateutil.tz import gettz
import pytz
r = 'Thu Dec 17 08:56:41 CST 2020'
d = parser.parse(r, ignoretz=True)
d_dateutil = d.replace(tzinfo=gettz('Asia/Shanghai'))
d_pytz = pytz.timezone('Asia/Shanghai').localize(d)
Note that depending on which module you use to add the timezone information, the class of tzinfo differs. For the pytz object, there is a more direct way of accessing the timezone in human readable form:
print(type(d_dateutil.tzinfo))
print("/".join(d_dateutil.tzinfo._filename.split('/')[-2:]))
print(type(d_pytz.tzinfo))
print(d_pytz.tzinfo.zone)
produces
<class 'dateutil.tz.tz.tzfile'>
Asia/Shanghai
<class 'pytz.tzfile.Asia/Shanghai'>
Asia/Shanghai
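As for the six-minute discrepancy in the question: it most likely comes from passing a pytz timezone directly as tzinfo. A pytz zone carries its whole offset history, and replace() attaches the first entry, which for Asia/Shanghai is local mean time (roughly UTC+08:06) rather than the offset valid on the given date. A minimal sketch, assuming pytz is installed and using the naive datetime from the question:

```python
from datetime import datetime, timedelta
import pytz

shanghai = pytz.timezone('Asia/Shanghai')
naive = datetime(2020, 12, 17, 8, 56, 41)

# replace() attaches the zone's first historical offset (local mean time)
wrong = naive.replace(tzinfo=shanghai)
# localize() looks up the offset that actually applies on this date
right = shanghai.localize(naive)

print(wrong.utcoffset())  # not exactly 8 hours
print(right)              # 2020-12-17 08:56:41+08:00
```

The same rule applies to any pytz zone: use localize() to attach a zone to a naive datetime, and astimezone() to convert between zones.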
from datetime import datetime
import pytz
# The datetime string you have
r = "Thu Dec 17 08:56:41 CST 2020"
# The time-zone string you want to use
offset_string = 'Asia/Shanghai'
# convert the time zone string into a numeric UTC offset such as +0800
# strftime('%z') formats the offset directly and also handles negative
# and fractional-hour offsets (e.g. -0600, +0530)
offset_num_repr = datetime.now(pytz.timezone(offset_string)).strftime('%z')
print('Numeric representation of the offset: ', offset_num_repr)
# replace the CST 2020 with numeric timezone offset
# a. replace it with the offset computed above
updated_datetime = str(r).replace('CST', offset_num_repr)
print('\t Modified datetime string: ', updated_datetime)
# Now parse your string into datetime object
r = datetime.strptime(updated_datetime, "%a %b %d %H:%M:%S %z %Y")
print('\tFinal parsed datetime object: ', r)
Should produce:
Numeric representation of the offset: +0800
Modified datetime string: Thu Dec 17 08:56:41 +0800 2020
Final parsed datetime object: 2020-12-17 08:56:41+08:00
I'm trying to parse a date string using the following code:
from dateutil.parser import parse
datestring = 'Thu Jul 25 15:13:16 GMT+06:00 2019'
d = parse(datestring)
print (d)
The parsed date is:
datetime.datetime(2019, 7, 25, 15, 13, 16, tzinfo=tzoffset(None, -21600))
As you can see, instead of adding 6 hours to GMT, it actually subtracted 6 hours.
What am I doing wrong here? How can I parse a datestring in this format?
There's a comment in the source: https://github.com/dateutil/dateutil/blob/cbcc0871792e7eed4a42cc62630a08ec7a78be30/dateutil/parser/_parser.py#L803.
# Check for something like GMT+3, or BRST+3. Notice
# that it doesn't mean "I am 3 hours after GMT", but
# "my time +3 is GMT". If found, we reverse the
# logic so that timezone parsing code will get it
# right.
Important parts
Notice that it doesn't mean "I am 3 hours after GMT", but "my time +3 is GMT"
If found, we reverse the logic so that timezone parsing code will get it right
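The last sentence in that comment (and the second point above) explains why 6 hours are subtracted: dateutil reads GMT+06:00 as "my local time plus 6 hours is GMT". Hence, Thu Jul 25 15:13:16 GMT+06:00 2019 is interpreted as Thu Jul 25 21:13:16 2019 GMT, which is why the parsed offset is -06:00.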
Take a look at http://www.timebie.com/tz/timediff.php?q1=Universal%20Time&q2=GMT%20+6%20Time for more context.
dateutil.parser reads the input as 15:13:16 local time in a zone where adding 6 hours gives GMT. Such a zone is 6 hours behind GMT, so the result is 15:13:16-06:00.
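If the string is known to come from a source that means "six hours ahead of GMT" (an assumption about the data, not something dateutil can detect), one possible workaround is to drop the literal GMT and let the standard library's %z directive read the offset. This sketch assumes Python 3.7 or later, where %z accepts an offset containing a colon:

```python
from datetime import datetime, timedelta, timezone

datestring = 'Thu Jul 25 15:13:16 GMT+06:00 2019'

# Drop the literal "GMT" so that %z sees a plain "+06:00" offset
cleaned = datestring.replace('GMT', '')
d = datetime.strptime(cleaned, '%a %b %d %H:%M:%S %z %Y')

print(d)                           # 2019-07-25 15:13:16+06:00
print(d.astimezone(timezone.utc))  # 2019-07-25 09:13:16+00:00
```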
I have a Unix timestamp whose value is 1502878840. It can be converted to a human-readable form such as Aug 16, 2017 10:20:40.
I have the following two pieces of Python code to convert 1502878840 to Aug 16, 2017 10:20:40. Both give the same result (Aug 16, 2017 10:20:40).
First method
utc = datetime.fromtimestamp(1502878840)
Second method
utc = datetime(1970, 1, 1) + timedelta(seconds=1502878840)
Could anyone answer the following two questions?
1. The results of the two methods are the same. But from the point of view of the Python code's logic, is there any case that could make the results differ?
I ask because I see most Python code using the first method.
2. As I read here, Unix time will have a problem on 19 January 2038 03:14:08 GMT.
I ran a timestamp whose date is after 19 January 2038 (2148632440, i.e. Feb 01, 2038 10:20:40). The result is as follows
First method: ValueError: timestamp out of range for platform time_t
Second method: 2038-02-01 10:20:40
Question is: Can I use Second method to overcome the problem of "Year 2038 problem"?
Quoting the documentation:
fromtimestamp() may raise OverflowError, if the timestamp is out of the range of values supported by the platform C localtime() or gmtime() functions, and OSError on localtime() or gmtime() failure. It’s common for this to be restricted to years in 1970 through 2038. Note that on non-POSIX systems that include leap seconds in their notion of a timestamp, leap seconds are ignored by fromtimestamp(), and then it’s possible to have two timestamps differing by a second that yield identical datetime objects. See also utcfromtimestamp().
The second solution solves your problem:
utc = datetime(1970, 1, 1) + timedelta(seconds=1502878840)
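On the first question: fromtimestamp() without a tz argument returns local time, so the two methods agree only when the machine's local zone is UTC. A small sketch of a zone-independent variant of the second method follows. The helper name from_unix_seconds is made up for illustration, and the arithmetic does not go through the platform's time_t, so it also covers dates past 2038:

```python
from datetime import datetime, timedelta, timezone

# Epoch as a timezone-aware datetime, so the results are explicitly UTC
EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)

def from_unix_seconds(ts):
    """Convert Unix seconds to an aware UTC datetime using plain arithmetic."""
    return EPOCH + timedelta(seconds=ts)

print(from_unix_seconds(1502878840))  # 2017-08-16 10:20:40+00:00
print(from_unix_seconds(2148632440))  # 2038-02-01 10:20:40+00:00
```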
I have tweet data file. Each has feature as 'created_at' in the following format:
u'created_at': 1369859382
What does this 10 digit number correspond to?
Any help will be appreciated.
That could be a UNIX timestamp ...http://www.onlineconversion.com/unix_time.htm
The example you suggested is equivalent to Wed, 29 May 2013 20:29:42 GMT
Here is a useful resource for mystery date/times formats ... http://www.fmdiff.com/fm/timestamp.html?session=vc8uqio2fsg9op81ohnhbthclmsb21j3
It is the time in seconds since January 1, 1970. The number in your example is May 29, 2013, 1:29:42 PM (in the PDT time zone, anyway, seven hours behind UTC).
>>> import datetime
>>> datetime.datetime.fromtimestamp(1369859382)
datetime.datetime(2013, 5, 29, 13, 29, 42)
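The result above depends on the local timezone of the machine running it. To get the same answer everywhere, one option is to pass an explicit timezone to fromtimestamp(). A short sketch using the value from the question:

```python
from datetime import datetime, timezone

created_at = 1369859382

# Passing tz makes the result independent of the machine's local zone
utc_time = datetime.fromtimestamp(created_at, tz=timezone.utc)

print(utc_time)              # 2013-05-29 20:29:42+00:00
print(utc_time.isoformat())  # 2013-05-29T20:29:42+00:00
```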
I see a lot on converting a date string to a datetime object in Python, but I want to go the other way.
I've got
datetime.datetime(2012, 2, 23, 0, 0)
and I would like to convert it to string like '2/23/2012'.
You can use strftime to help you format your date.
E.g.,
import datetime
t = datetime.datetime(2012, 2, 23, 0, 0)
t.strftime('%m/%d/%Y')
will yield:
'02/23/2012'
More information about formatting see here
date and datetime objects (and time as well) support a mini-language to specify output, and there are three ways to access it:
direct method call: dt.strftime('format here')
format method (python 2.6+): '{:format here}'.format(dt)
f-strings (python 3.6+): f'{dt:format here}'
So your example could look like:
dt.strftime('The date is %b %d, %Y')
'The date is {:%b %d, %Y}'.format(dt)
f'The date is {dt:%b %d, %Y}'
In all three cases the output is:
The date is Feb 23, 2012
For completeness' sake: you can also directly access the attributes of the object, but then you only get the numbers:
'The date is %s/%s/%s' % (dt.month, dt.day, dt.year)
# The date is 2/23/2012
The time taken to learn the mini-language is worth it.
For reference, here are the codes used in the mini-language:
%a Weekday as locale’s abbreviated name.
%A Weekday as locale’s full name.
%w Weekday as a decimal number, where 0 is Sunday and 6 is Saturday.
%d Day of the month as a zero-padded decimal number.
%b Month as locale’s abbreviated name.
%B Month as locale’s full name.
%m Month as a zero-padded decimal number. 01, ..., 12
%y Year without century as a zero-padded decimal number. 00, ..., 99
%Y Year with century as a decimal number. 1970, 1988, 2001, 2013
%H Hour (24-hour clock) as a zero-padded decimal number. 00, ..., 23
%I Hour (12-hour clock) as a zero-padded decimal number. 01, ..., 12
%p Locale’s equivalent of either AM or PM.
%M Minute as a zero-padded decimal number. 00, ..., 59
%S Second as a zero-padded decimal number. 00, ..., 59
%f Microsecond as a decimal number, zero-padded on the left. 000000, ..., 999999
%z UTC offset in the form +HHMM or -HHMM (empty if naive), +0000, -0400, +1030
%Z Time zone name (empty if naive), UTC, EST, CST
%j Day of the year as a zero-padded decimal number. 001, ..., 366
%U Week number of the year (Sunday is the first) as a zero padded decimal number.
%W Week number of the year (Monday is first) as a decimal number.
%c Locale’s appropriate date and time representation.
%x Locale’s appropriate date representation.
%X Locale’s appropriate time representation.
%% A literal '%' character.
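To see a few of these codes in action, here is a small sketch. The datetime value is arbitrary, and the names shown assume the default C/English locale:

```python
from datetime import datetime

dt = datetime(2012, 2, 23, 14, 5, 9)

print(dt.strftime('%A, %B %d, %Y'))  # Thursday, February 23, 2012
print(dt.strftime('%I:%M:%S %p'))    # 02:05:09 PM
print(dt.strftime('%y%m%d'))         # 120223
print(dt.strftime('day %j'))         # day 054
print(f'{dt:%Y-%m-%d %H:%M}')        # 2012-02-23 14:05
```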
Another option:
import datetime
now=datetime.datetime.now()
now.isoformat()
# output --> '2016-03-09T08:18:20.860968'
If you are looking for a simple way to convert a datetime to a string and do not need a custom format, you can convert the datetime object with str() and then use string slicing.
In [1]: from datetime import datetime
In [2]: now = datetime.now()
In [3]: str(now)
Out[3]: '2019-04-26 18:03:50.941332'
In [5]: str(now)[:10]
Out[5]: '2019-04-26'
In [6]: str(now)[:19]
Out[6]: '2019-04-26 18:03:50'
But note one thing: where the other solutions raise an AttributeError when the variable is None, this approach silently gives you the string 'None'.
In [9]: str(None)[:19]
Out[9]: 'None'
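The same two strings can also be produced without slicing, which avoids depending on the exact layout of str(). A minimal sketch with a fixed datetime so the output is reproducible (the timespec argument requires Python 3.6 or later):

```python
from datetime import datetime

# Fixed value instead of datetime.now() so the output is reproducible
now = datetime(2019, 4, 26, 18, 3, 50, 941332)

print(now.date().isoformat())                      # 2019-04-26
print(now.isoformat(sep=' ', timespec='seconds'))  # 2019-04-26 18:03:50
```

Unlike the slicing approach, these calls raise an AttributeError if the variable is None.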
You could use simple string formatting methods:
>>> dt = datetime.datetime(2012, 2, 23, 0, 0)
>>> '{0.month}/{0.day}/{0.year}'.format(dt)
'2/23/2012'
>>> '%s/%s/%s' % (dt.month, dt.day, dt.year)
'2/23/2012'
You can easily convert the datetime to a string in this way:
from datetime import datetime
date_time = datetime(2012, 2, 23, 0, 0)
date = date_time.strftime('%m/%d/%Y')
print("date: %s" % date)
For a better understanding, take a look at this article on how to convert strings to datetime and datetime to string in Python, or at the official strftime documentation.
type-specific formatting can be used as well:
t = datetime.datetime(2012, 2, 23, 0, 0)
"{:%m/%d/%Y}".format(t)
Output:
'02/23/2012'
If you want the time as well, just go with
datetime.datetime.now().__str__()
Prints 2019-07-11 19:36:31.118766 in console for me
The sexiest version by far is with format strings.
from datetime import datetime
print(f'{datetime.today():%Y-%m-%d}')
It is possible to convert a datetime object into a string by working directly with the components of the datetime object.
from datetime import date
myDate = date.today()
# print(myDate) would output 2017-05-23, because that was today's date
# myDate.month, myDate.day and myDate.year are integers (5, 23 and 2017),
# so convert each one with str() and join them with "/"
dateStr = str(myDate.month) + "/" + str(myDate.day) + "/" + str(myDate.year)
# dateStr is now the string "5/23/2017"; the final line prints it
print(dateStr)
Output --> 5/23/2017
You can convert datetime to string.
published_at = "{}".format(self.published_at)
String concatenation, str.join, can be used to build the string.
d = datetime.now()
'/'.join(str(x) for x in (d.month, d.day, d.year))
'3/7/2016'
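from datetime import datetime
# strftime() is a datetime method, so parse the string into a datetime first
end_date = datetime.strptime("2021-04-18 16:00:00", "%Y-%m-%d %H:%M:%S")
end_date_string = end_date.strftime("%Y-%m-%d")
print(end_date_string)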
An approach for showing how long ago something happened.
It supports different languages: pass in the parameter li, a list of unit names corresponding to the time units.
from datetime import datetime
from dateutil import parser
t1 = parser.parse("Tue May 26 15:14:45 2021")
t2 = parser.parse("Tue May 26 15:9:45 2021")
# 5min
t3 = parser.parse("Tue May 26 11:14:45 2021")
# 4h
t4 = parser.parse("Tue May 26 11:9:45 2021")
# 1day
t6 = parser.parse("Tue May 25 11:14:45 2021")
# 1day4h
t7 = parser.parse("Tue May 25 11:9:45 2021")
# 1day4h5min
t8 = parser.parse("Tue May 19 11:9:45 2021")
# 1w
t9 = parser.parse("Tue Apr 26 11:14:45 2021")
# 1m
t10 = parser.parse("Tue Oct 08 06:00:20 2019")
# 1y7m, 19m
t11 = parser.parse("Tue Jan 08 00:00:00 2019")
# 2y4m, 28m
# create: datetime the object was created
# now: current datetime
# li: list of unit names, smallest unit first (in any language)
# lst: suffix (in any language)
# long: maximum number of units to display
def howLongAgo(create, now, li, lst, long=2):
    dif = now - create
    sec = dif.days * 24 * 60 * 60 + dif.seconds
    minute, sec = divmod(sec, 60)
    hour, minute = divmod(minute, 60)
    day, hour = divmod(hour, 24)
    # approximate a month as 30 days and a year as 12 such months
    month, day = divmod(day, 30)
    week, day = divmod(day, 7)
    year, month = divmod(month, 12)
    s = []
    for ii, tt in enumerate([sec, minute, hour, day, week, month, year]):
        ss = li[ii]
        if tt != 0:
            if tt == 1:
                s.append(str(tt) + ss)
            else:
                s.append(str(tt) + ss + 's')
    # largest units first, keep at most `long` of them
    return ' '.join(list(reversed(s))[:long]) + ' ' + lst
t = howLongAgo(t11, t1, [
    'second',
    'minute',
    'hour',
    'day',
    'week',
    'month',
    'year',
], 'ago')
print(t)
# 2years 4months ago
I have used this method to insert dates to JSON object
my_json_string = json.dumps({'date_of_birth': '''{}'''.format(date_of_birth)})
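An alternative that avoids formatting each value by hand is the default hook of json.dumps, which is called for any object the encoder cannot serialize. A minimal sketch: the helper name encode_dates and the sample birth date are made up for illustration.

```python
import json
from datetime import date, datetime

def encode_dates(obj):
    """Fallback for json.dumps: turn date/datetime objects into ISO 8601 strings."""
    if isinstance(obj, (date, datetime)):
        return obj.isoformat()
    raise TypeError(f'Object of type {type(obj).__name__} is not JSON serializable')

date_of_birth = date(1990, 5, 17)  # illustrative value
my_json_string = json.dumps({'date_of_birth': date_of_birth}, default=encode_dates)
print(my_json_string)  # {"date_of_birth": "1990-05-17"}
```

ISO 8601 strings also have the advantage that date.fromisoformat() and datetime.fromisoformat() can parse them back on Python 3.7 or later.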