Python: converting a date from a string breaks when the year has two digits

I am trying to convert a date string to a date format:
>>> str = "04-18-2002 03:50PM"
>>> time.strptime(str, '%m-%d-%Y %H:%M%p')
time.struct_time(tm_year=2002, tm_mon=4, tm_mday=18, tm_hour=3, tm_min=50, tm_sec=0, tm_wday=3, tm_yday=108, tm_isdst=-1)
However, when the year has two digits, it breaks:
>>> str = "04-18-02 03:50PM"
>>> time.strptime(str, '%m-%d-%Y %H:%M%p')
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/usr/lib/python2.7/_strptime.py", line 454, in _strptime_time
return _strptime(data_string, format)[0]
File "/usr/lib/python2.7/_strptime.py", line 325, in _strptime
(data_string, format))
ValueError: time data '04-18-02 03:50' does not match format '%m-%d-%Y %H:%M'
Any ideas?

%Y in the format string denotes a four-digit year; see the documentation. For a two-digit year, use %y instead. To support both formats, try one of them first, catch the ValueError, and then try the other one.

The correct format for a two-digit year is %y (lowercase y).
As a side note, please don't name the string variable str, as that shadows the built-in.
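The "try one format, then the other" idea from the answers above could look roughly like the sketch below. The names FORMATS and parse_date are made up for illustration. Note one assumption that differs from the question: the sketch uses %I (12-hour clock) instead of %H, because %p only affects the parsed hour when it is paired with %I.

```python
from datetime import datetime

# Candidate formats, tried in order: four-digit year first, then two-digit.
# %I (12-hour clock) is used so that the %p (AM/PM) marker is honoured.
FORMATS = ("%m-%d-%Y %I:%M%p", "%m-%d-%y %I:%M%p")


def parse_date(text):
    """Return a datetime for text, trying each known format in turn."""
    for fmt in FORMATS:
        try:
            return datetime.strptime(text, fmt)
        except ValueError:
            continue  # this format did not match, try the next one
    raise ValueError("no known format matches %r" % text)


date_string = "04-18-02 03:50PM"  # not named str, to avoid shadowing the built-in
print(parse_date("04-18-2002 03:50PM"))
print(parse_date(date_string))
```

Both calls should give the same moment, since a two-digit 02 is read as 2002.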

Related

Abbreviated month or day of the week with a DOT (".") at the end of the string

I need to convert, with Python, a lot of strings in a Spanish date format (DDMMMYYYY, where MMM is the abbreviated month in Spanish) to another datetime format. I'm having problems because, with my Spanish locale settings, the abbreviated month name has a "." (a dot) at the end.
By default, Python uses the English names, but I can change the language with the locale library.
When I select 'esp' or 'es_ES.utf8', the dot at the end of the abbreviated month appears.
Does it depend on the regional settings of my Windows 10 machine? (I checked them and everything seems OK.) Does it depend on the locale library settings?
The same code on Ubuntu runs OK (without the dot).
How can I solve this problem?
I don't want to transform all the strings like this:
str_date = str_date[:5] + "." + str_date[5:]
Thanks a lot!!
Example (I changed the language with locale beforehand):
>>> datetime.strptime('2021-01-18', '%Y-%m-%d').strftime('%b')
'ene.'
>>> print(datetime.strptime('18ene2021', '%d%b%Y'))
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "C:\Users\galonsoi\AppData\Local\Programs\Python\Python36\lib\_strptime.py", line 565, in _strptime_datetime
tt, fraction = _strptime(data_string, format)
File "C:\Users\galonsoi\AppData\Local\Programs\Python\Python36\lib\_strptime.py", line 362, in _strptime
(data_string, format))
ValueError: time data '18ene2021' does not match format '%d%b%Y'
>>> print(datetime.strptime('18ene.2021', '%d%b%Y'))
2021-01-18 00:00:00 ----> THIS IS OK BECAUSE I WRITE THE DOT AT THE END OF THE ABBREVIATED MONTH
Complete sequence of the Example
>>> import locale
>>> from datetime import datetime
>>>
>>> locale.getlocale()
(None, None)
>>> print (datetime.strptime('2021-01-18', '%Y-%m-%d').strftime('%b'))
Jan
>>> locale.setlocale(locale.LC_ALL, '')
'Spanish_Spain.1252'
>>> locale.getlocale()
('es_ES', 'cp1252')
# INCORRECT FORMAT: A "." IS ADDED AT THE END
>>> print (datetime.strptime('2021-01-18', '%Y-%m-%d').strftime('%b'))
ene.
>>> locale.setlocale(locale.LC_ALL, 'es_ES.UTF-8')
'es_ES.UTF-8'
# INCORRECT FORMAT: A "." IS ADDED
>>> print (datetime.strptime('2021-01-18', '%Y-%m-%d').strftime('%b'))
ene.
>>> print(datetime.strptime('18ene2021', '%d%b%Y'))
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "C:\Users\galonsoi\AppData\Local\Programs\Python\Python36\lib\_strptime.py", line 565, in _strptime_datetime
tt, fraction = _strptime(data_string, format)
File "C:\Users\galonsoi\AppData\Local\Programs\Python\Python36\lib\_strptime.py", line 362, in _strptime
(data_string, format))
ValueError: time data '18ene2021' does not match format '%d%b%Y'
>>> print(datetime.strptime('18ene.2021', '%d%b%Y'))
2021-01-18 00:00:00 ----> THIS IS OK BECAUSE I WROTE THE DOT AT THE END OF THE ABBREVIATED MONTH
You could make use of dateutil's parser, where you can set custom month names via the parser.parserinfo class. Ex:
import locale
locale.setlocale(locale.LC_ALL, 'Spanish_Spain.1252') # set locale for reproducibility
import calendar
from dateutil import parser
# subclass parser.parserinfo and set custom month names with dots stripped:
class LocaleParserInfo(parser.parserinfo):
    MONTHS = [(ma.strip('.'), ml) for ma, ml in zip(calendar.month_abbr, calendar.month_name)][1:]
s = '18ene2021'
print(parser.parse(s, parserinfo=LocaleParserInfo()))
# 2021-01-18 00:00:00
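If installing dateutil is not an option, a locale-independent alternative is to map the abbreviations yourself. The following is only a sketch under stated assumptions: the SPANISH_MONTHS table is hard-coded here (it is not read from the system locale, and some platforms abbreviate September as "sept"), and parse_spanish_date is a hypothetical helper name. It accepts the month with or without the trailing dot.

```python
import re
from datetime import datetime

# Assumed Spanish month abbreviations, without the trailing dot.
SPANISH_MONTHS = {
    "ene": 1, "feb": 2, "mar": 3, "abr": 4, "may": 5, "jun": 6,
    "jul": 7, "ago": 8, "sep": 9, "oct": 10, "nov": 11, "dic": 12,
}

# Day, three-letter month, optional dot, four-digit year (DDMMMYYYY).
DATE_RE = re.compile(r"^(\d{1,2})([a-z]{3})\.?(\d{4})$", re.IGNORECASE)


def parse_spanish_date(text):
    """Parse strings such as '18ene2021' or '18ene.2021' into a datetime."""
    match = DATE_RE.match(text.strip())
    if match is None:
        raise ValueError("unrecognized date: %r" % text)
    day, month_abbr, year = match.groups()
    try:
        month = SPANISH_MONTHS[month_abbr.lower()]
    except KeyError:
        raise ValueError("unknown month abbreviation: %r" % month_abbr)
    return datetime(int(year), month, int(day))


print(parse_spanish_date("18ene2021"))
print(parse_spanish_date("18ene.2021"))
```

Because nothing here depends on locale.setlocale, the result should be the same on Windows and Ubuntu.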

Time data does not match format '%c'

This is very unexpected behavior...
I create a time string using the '%c' directive.
%c is the Locale’s appropriate date and time representation.
Then I try to parse the resulting time string, specifying the same '%c' as the string's format.
However this does not work as you can see from the error below. What am I missing?
I need to be able to store the time in a human-readable localized string, and then convert the string back into a struct_time so I can extract information from it.
(It is extremely important that the string be localized, and I of course don't want to write parsing algorithms for all locales around the world!)
# Ensure the locale is set.
import locale
locale.setlocale(locale.LC_ALL, '')
'en_US.UTF-8'
# 1. Create a localized time string using the '%c' directive.
import datetime
time_stamp = datetime.datetime.now().strftime('%c')
time_stamp
'Mon 21 Dec 2020 03:47:55 PM '
# 2. Try to parse the string using the same directive used to create it.
import time
time.strptime(time_stamp, '%c')
# 3. Unexpected error...
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/usr/lib/python3.8/_strptime.py", line 562, in _strptime_time
tt = _strptime(data_string, format)[0]
File "/usr/lib/python3.8/_strptime.py", line 349, in _strptime
raise ValueError("time data %r does not match format %r" %
ValueError: time data 'Mon 21 Dec 2020 03:47:55 PM ' does not match format '%c'
Your locale is probably not configuring .strftime("%c") the way you expect, and .strptime is objecting to the trailing %p (PM).
Use locale.nl_langinfo(locale.D_T_FMT) to build your format instead!
>>> locale.nl_langinfo(locale.D_T_FMT)
'%a %b %e %H:%M:%S %Y'
>>> locale.setlocale(locale.LC_ALL, '')
'en_US.UTF-8'
>>> locale.nl_langinfo(locale.D_T_FMT)
'%a %b %e %X %Y'
However, if you
.. know the exact structure of the output, filter exact matches with a regex and then parse them
.. can control the format, don't bother with a localized string and store time.time() directly
.. can choose the representation, always work in UTC and format as ISO 8601, creating a tz-aware object and reading it back with a custom parser (refer to the Caution on .fromisoformat)
>>> datetime.datetime.now(tz=datetime.timezone.utc)
datetime.datetime(2020, 12, 22, 0, 4, 29, 537007, tzinfo=datetime.timezone.utc)
.. or use pytz, which handles a wide range of time zones on top of the built-in datetime library
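The "work in UTC and store ISO 8601" suggestion could be sketched as follows. This assumes Python 3.7 or later (for datetime.fromisoformat) and that the stored strings were produced by isoformat() itself; the helper names to_iso_utc and from_iso are illustrative only. The localized '%c' rendering is then used purely for display and never parsed back.

```python
from datetime import datetime, timezone


def to_iso_utc(moment):
    """Serialize an aware datetime as an ISO 8601 string in UTC."""
    return moment.astimezone(timezone.utc).isoformat()


def from_iso(text):
    """Read back a string previously written by to_iso_utc."""
    return datetime.fromisoformat(text)


now = datetime.now(tz=timezone.utc)
stored = to_iso_utc(now)      # machine-readable form, safe to store
restored = from_iso(stored)   # round-trips without locale involvement
assert restored == now

print(stored)
print(restored.strftime("%c"))  # localized form, for humans only
```

The design choice is to keep one unambiguous stored value and derive the human-readable string from it on demand, rather than the other way round.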
Instead of using %c, you can specify how you want to format the date using %a, %b and other directives. For example:
import locale
locale.setlocale(locale.LC_ALL, 'en_US.utf-8')
import datetime
fmt = '%a %b %d %Y %H:%M:%S'
time_stamp = datetime.datetime.now().strftime(fmt)
print(time_stamp)
import time
print(time.strptime(time_stamp, fmt))
This produces the output you are looking for:
Mon Dec 21 2020 21:27:50
time.struct_time(tm_year=2020, tm_mon=12, tm_mday=21, tm_hour=21, tm_min=27, tm_sec=50, tm_wday=0, tm_yday=356, tm_isdst=-1)

How to parse datetime with Z letter with no specified seconds after semicolon

I'm parsing the logs of a program whose source code I don't have access to.
The log contains an unusual timestamp in each event record:
2018-11-02T06:25:03870000Z. It looks strange to me and I don't know whether it is correct. I tend to think that 03870000Z describes the seconds (%S) part, and I would like to extract as much information from this record as possible.
I'm trying to parse this example in Python 3.7 like this:
import datetime as dt
d = '2018-11-02T06:25:03870000Z'
dt.datetime.strptime(d, '%Y-%m-%dT%H:%M:%S')
It generates a predictable error:
Traceback (most recent call last):
File "<input>", line 1, in <module>
File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/_strptime.py", line 577, in _strptime_datetime
tt, fraction, gmtoff_fraction = _strptime(data_string, format)
File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/_strptime.py", line 362, in _strptime
data_string[found.end():])
ValueError: unconverted data remains: 870000Z
Update:
I have a dirty solution for this, but is there a better approach than the following?
sc = d.split(':')[-1][:2]
dd = d.split(':')
en = ':'.join(dd[:-1])
en += ':' + sc
>>> en
'2018-11-02T06:25:03'
Questions:
How do I parse such a datetime correctly (treating 03 in the example as the seconds)?
(Optional) Is this timestamp format valid (in terms of ISO 8601 or any other standard)?
The Z specifies the Zulu time zone (UTC), and the seconds are given as whole seconds (03) followed by microseconds (870000), so you can parse the date fully using:
d = '2018-11-02T06:25:03870000Z'
dt.datetime.strptime(d, '%Y-%m-%dT%H:%M:%S%fZ')
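As a self-contained sketch of the same approach: strptime returns a naive datetime here because the Z is matched as a literal character, so the example below attaches UTC explicitly. The function name parse_log_timestamp is hypothetical. It relies on %S consuming exactly two digits and %f consuming the following one to six digits. (On the optional question: ISO 8601 normally separates the fractional part with a "." or ",", so this log format looks like a non-standard variant.)

```python
from datetime import datetime, timezone


def parse_log_timestamp(text):
    """Parse e.g. '2018-11-02T06:25:03870000Z' into an aware UTC datetime."""
    # %S takes the two-digit seconds, %f the six-digit microseconds,
    # and the trailing Z is matched literally.
    naive = datetime.strptime(text, "%Y-%m-%dT%H:%M:%S%fZ")
    return naive.replace(tzinfo=timezone.utc)


parsed = parse_log_timestamp("2018-11-02T06:25:03870000Z")
print(parsed)
print(parsed.second, parsed.microsecond)
```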
I would use
import re
d = '2018-11-02T06:25:03870000Z'
date = re.findall(r'\d+', d)
This gives you a list of all occurrences of one or more digits in a row, and now you can do with it what you want, for example:
print("Y: %s, M: %s, D: %s, H: %s, m: %s, S: %s" %(tuple(date)))
Of course, you can then also trim the seconds field so that it has only two digits.

Python string to timestamp

I am trying to convert these strings to timestamps:
python test.py
2015-02-15T14:25:54+00:00
2015-02-15T16:59:01+00:00
2015-02-15T18:44:13+00:00
2015-02-15T18:45:24+00:00
2015-02-15T18:52:11+00:00
2015-02-15T18:52:33+00:00
2015-02-15T18:59:00+00:00
2015-02-15T19:06:16+00:00
2015-02-15T19:07:02+00:00
I get this output when executing the code below:
for member in members_dict['members']:
    s = member['timestamp_signup']
    print s
But when I try to get the timestamp:
for member in members_dict['members']:
    s = member['timestamp_signup']
    print s
    print time.mktime(datetime.datetime.strptime(s, "%Y-%m-%dT%H:%M:%S+00:00").timetuple())
I get this error:
Traceback (most recent call last):
File "test.py", line 20, in <module>
print datetime.strptime(s, '"%Y-%m-%dT%H:%M:%S+00:00"').date()
File "/usr/lib/python2.7/_strptime.py", line 325, in _strptime
(data_string, format))
ValueError: time data '' does not match format '"%Y-%m-%dT%H:%M:%S+00:00"'
What am I doing wrong here?
Your code to convert string to datetime is fine. For example:
>>> from datetime import datetime
>>> my_str = '2015-02-15T14:25:54+00:00'
>>> datetime.strptime(my_str, "%Y-%m-%dT%H:%M:%S+00:00")
datetime.datetime(2015, 2, 15, 14, 25, 54)
The error you are getting is due to an empty string in your input. I inferred this from your error message:
ValueError: time data '' does not match format
# empty string ^
Possibly there is an empty line at the end of your file (or somewhere else).
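One way to guard against such empty entries is sketched below, written for Python 3. The helper name to_timestamp and the sample list are made up for illustration. The sketch also swaps time.mktime for calendar.timegm, on the assumption that the +00:00 strings are UTC: time.mktime interprets the time tuple as local time, which would shift the result on machines not set to UTC.

```python
import calendar
from datetime import datetime

FMT = "%Y-%m-%dT%H:%M:%S+00:00"


def to_timestamp(text):
    """Return seconds since the epoch for a UTC string, or None if blank."""
    text = text.strip()
    if not text:
        return None  # skip empty values instead of letting strptime fail
    parsed = datetime.strptime(text, FMT)
    # timegm treats the tuple as UTC, matching the +00:00 suffix.
    return calendar.timegm(parsed.timetuple())


signups = ["2015-02-15T14:25:54+00:00", "", "2015-02-15T16:59:01+00:00"]
for s in signups:
    print(repr(s), to_timestamp(s))
```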

Issue using eval function on value in list

Here is my code. The last line inside the function, which calls eval, raises a syntax error.
def format_date():
    month_abrv = ['Jan','Feb','Mar','Apr','May','Jun','Jul',
                  'Aug','Sep','Oct','Nov','Dec']
    print('''This program takes an input in the format MM/DD/YYYY(ex:09/23/2014)
and outputs date in format DD Mth, YYYY (ex: 23 Sep, 2014)''')
    date_string = input('\nInput date in format MM/DD/YYYY ')
    date_list = date_string.split('/')
    date_list[0] = eval(date_list[0])
format_date()
Here is the error
Traceback (most recent call last):
File "C:/Python34/Python ICS 140/FormatDate.py", line 16, in <module>
format_date()
File "C:/Python34/Python ICS 140/FormatDate.py", line 14, in format_date
date_list[0] = eval(date_list[0])
File "<string>", line 1
09
^
SyntaxError: invalid token
After doing a little more digging, I found that Python 2 interprets integer literals starting with 0 as octal numbers, and Python 3 rejects such literals outright, so eval('09') fails. Using the int() function to change '09' to 9 worked in this case.
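For completeness, here is a minimal sketch of the whole conversion using int() instead of eval. It is one possible reading of what the program was meant to do, not the original author's finished code; the function takes the date string as a parameter rather than calling input(), so that it can be tried directly.

```python
MONTH_ABBR = ['Jan', 'Feb', 'Mar', 'Apr', 'May', 'Jun', 'Jul',
              'Aug', 'Sep', 'Oct', 'Nov', 'Dec']


def format_date(date_string):
    """Convert 'MM/DD/YYYY' (e.g. '09/23/2014') to 'DD Mth, YYYY'."""
    month, day, year = date_string.split('/')
    month_index = int(month)  # int('09') is 9; no eval needed
    if not 1 <= month_index <= 12:
        raise ValueError("month out of range: %r" % month)
    return '%d %s, %s' % (int(day), MONTH_ABBR[month_index - 1], year)


print(format_date('09/23/2014'))
```

Besides avoiding the leading-zero problem, int() also avoids running arbitrary user input as code, which eval would do.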
