Time data does not match format '%c' - python

This is very unexpected behavior...
I create a time string using the '%c' directive.
%c is the Locale’s appropriate date and time representation.
Then I try to parse the resulting time string, specifying the same '%c' as the string's format.
However, this does not work, as you can see from the error below. What am I missing?
I need to be able to store the time in a human-readable localized string, and then convert the string back into a struct_time so I can extract information from it.
(It is extremely important that the string be localized, and I of course don't want to write parsing algorithms for all locales around the world!)
# Ensure the locale is set.
import locale
locale.setlocale(locale.LC_ALL, '')
'en_US.UTF-8'
# 1. Create a localized time string using the '%c' directive.
import datetime
time_stamp = datetime.datetime.now().strftime('%c')
time_stamp
'Mon 21 Dec 2020 03:47:55 PM '
# 2. Try to parse the string using the same directive used to create it.
import time
time.strptime(time_stamp, '%c')
# 3. Unexpected error...
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/lib/python3.8/_strptime.py", line 562, in _strptime_time
    tt = _strptime(data_string, format)[0]
  File "/usr/lib/python3.8/_strptime.py", line 349, in _strptime
    raise ValueError("time data %r does not match format %r" %
ValueError: time data 'Mon 21 Dec 2020 03:47:55 PM ' does not match format '%c'

Your locale is probably not configuring .strftime("%c") the way you expect, and .strptime is objecting to the trailing %p (PM).
Use locale.nl_langinfo(locale.D_T_FMT) to build your format instead!
>>> locale.nl_langinfo(locale.D_T_FMT)
'%a %b %e %H:%M:%S %Y'
>>> locale.setlocale(locale.LC_ALL, '')
'en_US.UTF-8'
>>> locale.nl_langinfo(locale.D_T_FMT)
'%a %b %e %X %Y'
However, if you
- know the exact structure of the output, filter exact matches with a regex and then parse them
- can control the format, skip the formatting and store time.time() directly
- or always work in UTC and format as ISO 8601, deriving a tz-aware object and reading it back with a custom parser (refer to the Caution on .fromisoformat)
>>> datetime.datetime.now(tz=datetime.timezone.utc)
datetime.datetime(2020, 12, 22, 0, 4, 29, 537007, tzinfo=datetime.timezone.utc)
- or use pytz, which is much "smarter" than the datetime builtin lib about time zones (note that it handles time zones, not locales)
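The UTC and ISO 8601 option can be sketched roughly as follows. This is a minimal illustration, not the only way to do it: the machine-readable ISO string is what gets stored, and the localized '%c' text is produced only for display and never parsed back. It assumes Python 3.7+, where datetime.fromisoformat() exists as the inverse of datetime.isoformat(); the helper names to_stored and from_stored are made up for this example.

```python
from datetime import datetime, timezone


def to_stored(moment):
    """Serialize a tz-aware datetime to an ISO 8601 string (hypothetical helper)."""
    return moment.isoformat()


def from_stored(text):
    """Read back a string produced by to_stored() (hypothetical helper)."""
    # fromisoformat() is intended as the inverse of isoformat(),
    # which is exactly how it is used here.
    return datetime.fromisoformat(text)


# A fixed UTC timestamp keeps the example reproducible.
moment = datetime(2020, 12, 22, 0, 4, 29, tzinfo=timezone.utc)
stored = to_stored(moment)        # store this, not the '%c' string
restored = from_stored(stored)    # lossless round trip

# Localized text is generated for humans only and is never parsed again.
display = restored.strftime('%c')
print(stored)
print(display)
```

The design choice is to keep one unambiguous representation for the program and treat the localized string as output only, so no locale-specific parsing is needed.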

Instead of using %c, you can specify how you want to format the date using %a, %b and other directives. For example:
import locale
locale.setlocale(locale.LC_ALL, 'en_US.utf-8')
import datetime
fmt = '%a %b %d %Y %H:%M:%S'
time_stamp = datetime.datetime.now().strftime(fmt)
print(time_stamp)
import time
print(time.strptime(time_stamp, fmt))
This produces the output you are looking for:
Mon Dec 21 2020 21:27:50
time.struct_time(tm_year=2020, tm_mon=12, tm_mday=21, tm_hour=21, tm_min=27, tm_sec=50, tm_wday=0, tm_yday=356, tm_isdst=-1)

Related

Abbreviated month or day of the week with a DOT (".") at the end of the string

I need to convert a lot of strings in a Spanish date format (DDMMMYYYY, where MMM is the abbreviated month in Spanish) to another datetime format with Python, but I'm having problems because my Spanish locale settings put a "." (a dot) at the end of the abbreviated month name.
By default, Python uses the English names, but I can change the language with the locale library.
When I select 'esp' or 'es_ES.utf8', the dot at the end of the abbreviated month appears.
Does it depend on the regional settings of my Windows 10? (I checked them and everything seems OK.) Does it depend on the locale library settings?
The same code on Ubuntu runs OK (without the dot).
How can I solve this problem?
I don't want to transform all the strings like this:
str_date = str_date[:5] + "." + str_date[5:]
Thanks a lot!!
Example (after changing the language with locale):
>>> datetime.strptime('2021-01-18', '%Y-%m-%d').strftime('%b')
'ene.'
>>> print(datetime.strptime('18ene2021', '%d%b%Y'))
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "C:\Users\galonsoi\AppData\Local\Programs\Python\Python36\lib\_strptime.py", line 565, in _strptime_datetime
    tt, fraction = _strptime(data_string, format)
  File "C:\Users\galonsoi\AppData\Local\Programs\Python\Python36\lib\_strptime.py", line 362, in _strptime
    (data_string, format))
ValueError: time data '18ene2021' does not match format '%d%b%Y'
>>> print(datetime.strptime('18ene.2021', '%d%b%Y'))
2021-01-18 00:00:00 ----> THIS IS OK BECAUSE I WRITE THE DOT AT THE END OF THE ABBREVIATED MONTH
Complete sequence of the Example
>>> import locale
>>> from datetime import datetime
>>>
>>> locale.getlocale()
(None, None)
>>> print (datetime.strptime('2021-01-18', '%Y-%m-%d').strftime('%b'))
Jan
>>> locale.setlocale(locale.LC_ALL, '')
'Spanish_Spain.1252'
>>> locale.getlocale()
('es_ES', 'cp1252')
#INCORRECT FORMAT, ADD A "." AT THE END
>>> print (datetime.strptime('2021-01-18', '%Y-%m-%d').strftime('%b'))
ene.
>>> locale.setlocale(locale.LC_ALL, 'es_ES.UTF-8')
'es_ES.UTF-8'
#INCORRECT FORMAT, ADDS A "."
>>> print (datetime.strptime('2021-01-18', '%Y-%m-%d').strftime('%b'))
ene.
>>> print(datetime.strptime('18ene2021', '%d%b%Y'))
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "C:\Users\galonsoi\AppData\Local\Programs\Python\Python36\lib\_strptime.py", line 565, in _strptime_datetime
    tt, fraction = _strptime(data_string, format)
  File "C:\Users\galonsoi\AppData\Local\Programs\Python\Python36\lib\_strptime.py", line 362, in _strptime
    (data_string, format))
ValueError: time data '18ene2021' does not match format '%d%b%Y'
>>> print(datetime.strptime('18ene.2021', '%d%b%Y'))
2021-01-18 00:00:00 ----> THIS IS OK BECAUSE I WROTE THE DOT AT THE END OF THE ABBREVIATED MONTH
You could make use of dateutil's parser, where you can set custom month names via the parser.parserinfo class. Ex:
import locale
locale.setlocale(locale.LC_ALL, 'Spanish_Spain.1252') # set locale for reproducibility
import calendar
from dateutil import parser
# subclass parser.parserinfo and set custom month names with dots stripped:
class LocaleParserInfo(parser.parserinfo):
    MONTHS = [(ma.strip('.'), ml) for ma, ml in zip(calendar.month_abbr, calendar.month_name)][1:]
s = '18ene2021'
print(parser.parse(s, parserinfo=LocaleParserInfo()))
# 2021-01-18 00:00:00
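If adding dateutil is not an option, the same idea (strip the dot from the locale's abbreviations before matching) can be sketched with the standard library only. This is an illustration under assumptions: parse_ddmmmyyyy and month_lookup are hypothetical names, and the Spanish abbreviation list is written out by hand as a guess at what the Windows locale reports, so that the example does not depend on the machine's installed locales.

```python
import calendar
import re
from datetime import date


def month_lookup(abbreviations=None):
    """Map lowercase month abbreviations, trailing dots removed, to month numbers."""
    if abbreviations is None:
        # calendar.month_abbr follows the current locale; index 0 is an empty string.
        abbreviations = list(calendar.month_abbr)[1:]
    return {abbr.rstrip('.').lower(): number
            for number, abbr in enumerate(abbreviations, start=1)}


# Day (1-2 digits), month letters, an optional dot, then a four-digit year.
_DATE_RE = re.compile(r'(\d{1,2})([^\W\d_]+)\.?(\d{4})')


def parse_ddmmmyyyy(text, lookup):
    """Parse strings such as '18ene2021' or '18ene.2021' into a date."""
    match = _DATE_RE.fullmatch(text.strip())
    if match is None:
        raise ValueError('unrecognized date string: %r' % (text,))
    day, abbr, year = match.groups()
    if abbr.lower() not in lookup:
        raise ValueError('unknown month abbreviation: %r' % (abbr,))
    return date(int(year), lookup[abbr.lower()], int(day))


# Assumed abbreviations, hard-coded here for reproducibility.
SPANISH_ABBR = ['ene.', 'feb.', 'mar.', 'abr.', 'may.', 'jun.',
                'jul.', 'ago.', 'sep.', 'oct.', 'nov.', 'dic.']
lookup = month_lookup(SPANISH_ABBR)

print(parse_ddmmmyyyy('18ene2021', lookup))
print(parse_ddmmmyyyy('18ene.2021', lookup))
```

Calling month_lookup() with no argument uses whatever locale is currently active, which mirrors the calendar.month_abbr approach in the answer above.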

not able to convert file read() to an integer

I have a simple file which contains exactly one integer. This integer is an epoch timestamp value.
ts_f = open('latest_ts','r')
pattern = '%a %b %d %H:%M:%S NZDT %Y'
tmp = ts_f.read()
# Do some processing to update the timestamp value.
ts_f.close()
ts_f = open('latest_ts','w+')
ts_f.write(latest_ts_epoch)
ts_f.close()
Since both of these are integer values and read() returns a string, I tried to convert tmp to an integer with int(tmp). It does not let me convert tmp to an integer and gives this error:
ValueError: invalid literal for int() with base 10:
See the added line below: your file starts with a BOM, which needs to be decoded first.
ts_f = open('latest_ts','r')
pattern = '%a %b %d %H:%M:%S NZDT %Y'
tmp = ts_f.read()
tmp = tmp.decode("utf-8-sig")
ts_f.close()
ts_f = open('latest_ts','w+')
ts_f.write(latest_ts_epoch)
ts_f.close()
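The .decode() call above applies to byte strings, as returned by read() in Python 2. On Python 3, where read() in text mode already returns str, a rough equivalent is to open the file with the 'utf-8-sig' codec, which drops a leading BOM if one is present. The sketch below assumes the BOM diagnosis is right; read_epoch and write_epoch are hypothetical helper names, and the temporary file only stands in for 'latest_ts'.

```python
import os
import tempfile


def read_epoch(path):
    """Read a file holding a single integer, ignoring a leading UTF-8 BOM."""
    # 'utf-8-sig' strips the BOM when present and otherwise acts like UTF-8.
    with open(path, 'r', encoding='utf-8-sig') as handle:
        return int(handle.read().strip())


def write_epoch(path, value):
    """Write the integer back as text; write() needs a str, not an int."""
    with open(path, 'w', encoding='utf-8') as handle:
        handle.write(str(value))


# Demonstration with a temporary file that starts with a BOM.
with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, 'latest_ts')
    with open(path, 'wb') as handle:
        handle.write(b'\xef\xbb\xbf1608565675\n')
    latest = read_epoch(path)
    write_epoch(path, latest + 60)
    updated = read_epoch(path)

print(latest, updated)
```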

python convert date from string breaks when year is two digit

I am trying to convert a date string to a date:
>>> str = "04-18-2002 03:50PM"
>>> time.strptime(str, '%m-%d-%Y %H:%M%p')
time.struct_time(tm_year=2002, tm_mon=4, tm_mday=18, tm_hour=3, tm_min=50, tm_sec=0, tm_wday=3, tm_yday=108, tm_isdst=-1)
However, when the year has two digits, it breaks:
>>> str = "04-18-02 03:50PM"
>>> time.strptime(str, '%m-%d-%Y %H:%M%p')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/lib/python2.7/_strptime.py", line 454, in _strptime_time
    return _strptime(data_string, format)[0]
  File "/usr/lib/python2.7/_strptime.py", line 325, in _strptime
    (data_string, format))
ValueError: time data '04-18-02 03:50' does not match format '%m-%d-%Y %H:%M'
any ideas??
%Y in the format string denotes a four-digit year; see the documentation. For a two-digit year, use %y instead. To support both formats, try one of the formats first, catch the ValueError, and then try the other one.
The correct format for a two-digit year is %y (lowercase y).
As a side note, please don't call the string variable str as it shadows the builtin.
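The try-one-then-the-other approach could look like the sketch below. Two things here are not from the answers: the function name parse_timestamp is made up, and %I is used in place of the question's %H, because %p only affects the parsed hour when it is combined with %I (with %H, '03:50PM' parses as hour 3, as the first output in the question shows).

```python
from datetime import datetime

# Four-digit year first, then two-digit year; %I makes %p (AM/PM) take effect.
FORMATS = ('%m-%d-%Y %I:%M%p', '%m-%d-%y %I:%M%p')


def parse_timestamp(text):
    """Return a datetime for the first format in FORMATS that matches."""
    for fmt in FORMATS:
        try:
            return datetime.strptime(text, fmt)
        except ValueError:
            continue  # this format did not match; try the next one
    raise ValueError('no known format matches %r' % (text,))


date_string = '04-18-02 03:50PM'  # not named str, to avoid shadowing the builtin
print(parse_timestamp(date_string))
print(parse_timestamp('04-18-2002 03:50PM'))
```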

python convert string to datetime

I have a loop where I process a set of data, and one of the steps is converting an ordinary string to a datetime. Everything works fine, except that sometimes a weird thing happens. Here is what I know:
exactly the same parameters always enter the function
those parameters are always of the same type
the first time I run it, it always gets through
when it gets to the second element in the loop, in approximately 80% of cases it throws a ValueError (time data did not match format)
but after I run it again, everything is OK, and it gets stuck on the next element ...
Because my function is pretty big and there are many things happening in it, I decided to provide some sample code, which I wrote right here just for clarification:
import datetime
data = ['January 20 1999', 'March 4 2010', 'June 11 1819']
dformat = '%B %d %Y'
for item in data:
    out = datetime.datetime.strptime(item, dformat)
    print(out)
Although this clearly works, in my program it doesn't. I have tried everything I could come up with but haven't succeeded yet, so I would be glad for any ideas you can provide. Thanks.
By the way, the error I always get looks like this:
ValueError: time data did not match format: data=March 4 2010 fmt=%B %d %Y
You probably have a different locale set up. %B is March in locales that use English, but in other locales it will fail.
For example:
>>> import locale
>>> locale.setlocale(locale.LC_ALL, 'sv_SE.utf8')
'sv_SE.utf8'
>>> import datetime
>>>
>>> data = ['January 20 1999', 'March 4 2010', 'June 11 1819']
>>> for item in data:
...     print datetime.datetime.strptime(item, '%B %d %Y')
...
Traceback (most recent call last):
  File "<stdin>", line 2, in <module>
  File "/usr/lib/python2.6/_strptime.py", line 325, in _strptime
    (data_string, format))
ValueError: time data 'January 20 1999' does not match format '%B %d %Y'
Here you see that even though the format does match, it claims it doesn't. That's because the month names don't match. Change them to the Swedish month names, and it works again:
>>> import locale
>>> locale.setlocale(locale.LC_ALL, 'sv_SE.utf8')
'sv_SE.utf8'
>>> import datetime
>>>
>>> data = ['Januari 20 1999', 'Mars 4 2010', 'Juni 11 1819']
>>> for item in data:
...     print datetime.datetime.strptime(item, '%B %d %Y')
...
1999-01-20 00:00:00
2010-03-04 00:00:00
1819-06-11 00:00:00
(Note that the above locale 'sv_SE.utf8' might not work for you, because you need to have that specific locale installed. To see which locales are installed on a Unix machine, run this command from the command line:
$ locale -a
C
en_AG
en_AU.utf8
en_BW.utf8
en_CA.utf8
en_DK.utf8
en_GB.utf8
en_HK.utf8
en_IE.utf8
en_IN
en_NG
en_NZ.utf8
en_PH.utf8
en_SG.utf8
en_US.utf8
en_ZA.utf8
en_ZW.utf8
POSIX
sv_FI.utf8
sv_SE.utf8
)
Pretty weird, though: the locale usually doesn't change within a single run. However, if your program keeps doing this, you might want to call setlocale every time the code enters the loop (an ugly solution, I know).
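One way to make the setlocale idea a little less ugly is to switch LC_TIME only for the duration of the parse and restore it afterwards. The sketch below is an assumption-laden illustration: lc_time is a hypothetical helper, the 'C' locale is chosen because it provides English month names and is always available, and locale.setlocale() is not thread-safe, so this only suits single-threaded code.

```python
import contextlib
import datetime
import locale


@contextlib.contextmanager
def lc_time(name):
    """Temporarily switch LC_TIME, restoring the previous setting on exit."""
    saved = locale.setlocale(locale.LC_TIME)  # no second argument: query only
    try:
        locale.setlocale(locale.LC_TIME, name)
        yield
    finally:
        locale.setlocale(locale.LC_TIME, saved)


data = ['January 20 1999', 'March 4 2010', 'June 11 1819']

# Parse English month names regardless of what the rest of the program uses.
with lc_time('C'):
    parsed = [datetime.datetime.strptime(item, '%B %d %Y') for item in data]

for value in parsed:
    print(value)
```

If some other part of the program (or a library it imports) changes the locale between iterations, wrapping only the strptime call this way keeps the parse independent of that change.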

Using a Unicode format for Python's `time.strftime()`

I am trying to call Python's time.strftime() function using a Unicode format string:
u'%d\u200f/%m\u200f/%Y %H:%M:%S'
(\u200f is the "Right-To-Left Mark" (RLM).)
However, I am getting an exception that the RLM character cannot be encoded into ascii:
UnicodeEncodeError: 'ascii' codec can't encode character u'\u200f' in position 2: ordinal not in range(128)
I have tried searching for an alternative but could not find a reasonable one. Is there an alternative to this function, or a way to make it work with Unicode characters?
Many standard library functions still don't support Unicode the way they should. You can use this workaround:
import time
my_format = u'%d\u200f/%m\u200f/%Y %H:%M:%S'
my_time = time.localtime()
time.strftime(my_format.encode('utf-8'), my_time).decode('utf-8')
You can format the string by encoding the format to UTF-8 and decoding the result:
time.strftime(u'%d\u200f/%m\u200f/%Y %H:%M:%S'.encode('utf-8'), t).decode('utf-8')
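The encode/decode workaround above is written for Python 2 byte strings. Another way to sidestep the encoding question, sketched here as an assumption rather than as the approach from the answers, is to pass only ASCII directives to time.strftime() and insert the Right-To-Left Mark with ordinary string operations. The name format_with_rlm is hypothetical.

```python
import time

RLM = u'\u200f'  # RIGHT-TO-LEFT MARK


def format_with_rlm(t):
    """Build the equivalent of u'%d\\u200f/%m\\u200f/%Y %H:%M:%S' from ASCII-only formats."""
    # Each strftime call sees a plain ASCII directive, so no codec is involved.
    parts = [time.strftime(code, t) for code in ('%d', '%m', '%Y')]
    date_part = (RLM + u'/').join(parts)
    return date_part + u' ' + time.strftime('%H:%M:%S', t)


# time.gmtime(0) is the Unix epoch, which keeps the example reproducible.
result = format_with_rlm(time.gmtime(0))
print(repr(result))
```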
You should read the file as Unicode and then convert the text to a datetime.
import datetime
f = open(LogFilePath, 'r', encoding='utf-8')
# Read first line of log file and remove '\n' from end of it
Log_DateTime = f.readline()[:-1]
You can define the date-time format like this:
fmt = "%Y-%m-%d %H:%M:%S.%f"
But some programming languages, like C#, don't support it easily, so you can change it to:
fmt = "%Y-%m-%d %H:%M:%S"
Or you can do the following (to satisfy .%f):
Log_DateTime = Log_DateTime + '.000000'
If the line starts with an unrecognized symbol (a Unicode symbol), you should remove it too.
# Remove the unrecognized symbol at the start of the line (first character)
Log_DateTime = Log_DateTime[1:] + '.000000'
Finally, convert the date-time string to a real datetime object:
Log_DateTime = datetime.datetime.strptime(Log_DateTime, fmt)
Current_Datetime = datetime.datetime.now() # Default format is '%Y-%m-%d %H:%M:%S.%f'
# Calculate the difference between the two datetimes and take suitable action
Current_Log_Diff = (Current_Datetime - Log_DateTime).total_seconds()
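The steps above (drop a stray leading symbol, cope with a missing fractional part, then parse) can be folded into one helper. This is only a sketch under assumptions: parse_log_timestamp is a made-up name, the stray symbol is assumed to be a byte order mark (U+FEFF), and the two formats are the ones given in this answer.

```python
from datetime import datetime

# With and without fractional seconds, as in the formats above.
LOG_FORMATS = ('%Y-%m-%d %H:%M:%S.%f', '%Y-%m-%d %H:%M:%S')


def parse_log_timestamp(line):
    """Parse the first line of a log file into a datetime."""
    # Remove an assumed leading BOM, then the trailing newline and spaces.
    text = line.lstrip(u'\ufeff').strip()
    for fmt in LOG_FORMATS:
        try:
            return datetime.strptime(text, fmt)
        except ValueError:
            continue  # try the next format
    raise ValueError('unrecognized log timestamp: %r' % (text,))


first = parse_log_timestamp(u'\ufeff2021-01-18 12:30:00\n')
second = parse_log_timestamp(u'2021-01-18 12:30:05.250000\n')

# Difference between the two timestamps, in seconds.
elapsed = (second - first).total_seconds()
print(first, second, elapsed)
```

Trying both formats avoids having to append '.000000' by hand, and stripping only a known character is safer than unconditionally dropping the first character of the line.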
