NaN with Python 2.5 on Windows

How do I create a NaN with Python 2.5 on Windows?
float('nan') fails with the error ValueError: invalid literal for float(): nan
Summary of the answers: Neither float('inf') nor float('nan') works with Python 2.5 on Windows. This is a bug that was fixed in Python 2.6.
If you are using numpy, then you can use numpy.inf and numpy.nan.
If you need a workaround without numpy, then you can use an expression that overflows, such as 1e1000, to get an inf, and 1e1000 / 1e1000 or 1e1000 - 1e1000 to get a NaN.

Another way is dividing inf by itself:
>>> float('inf') / float('inf')
nan
Or in a more obscure way, which might not work across platforms (but works around that specific bug in Python 2.5 on Windows):
>>> 1e31337 / 1e31337
nan
>>> 1e31337 - 1e31337
nan
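
If you want reusable constants rather than repeating the overflow trick, here is a minimal sketch of my own (the names INF and NAN are not from the answers above):
>>> INF = 1e1000      # any literal too large for a double overflows to inf
>>> NAN = INF - INF   # inf - inf is NaN
>>> NAN != NAN        # NaN is the only float that is not equal to itself
True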

There is already an accepted answer to this question, but I think the following should work if you don't want to rely on overflow and have numpy installed (not tested, as I don't have Python 2.5 or Windows):
>>> import numpy as np
>>> np.nan
nan
>>> np.inf
inf
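
If you go the numpy route, numpy also provides checks for these values, which is useful because NaN never compares equal to itself:
>>> np.isnan(np.nan)
True
>>> np.isinf(np.inf)
True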

Upgrade your Python distribution if possible. The behavior you listed is considered a bug.
Canonically, Python is supposed to support this definition of NaN in a cross-platform manner. This behavior appears to have been fixed in Python 2.6 and 3.0.
Of course, this works in the Linux versions of Python:
$ python2.4
Python 2.4.3 (#1, Sep 21 2011, 19:55:41)
[GCC 4.1.2 20080704 (Red Hat 4.1.2-51)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> float('nan')
nan
$ python2.5
Python 2.5.2 (r252:60911, Jun 26 2008, 10:20:40)
[GCC 4.1.2 20070626 (Red Hat 4.1.2-14)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> float('nan')
nan
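
For code that has to run on both the broken and the fixed interpreters, a small fallback wrapper is one option. This is a sketch of my own (make_nan is not a standard function), reusing the overflow trick from above:
def make_nan():
    # float('nan') works on Python 2.6+; fall back to the overflow
    # trick where it raises ValueError (e.g. Python 2.5 on Windows).
    try:
        return float('nan')
    except ValueError:
        return 1e1000 / 1e1000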

Related

Replace accented letters with the respective non-accented ones at Python 3

I am not sure that this popular answer works in Python 3, since there is no unicode type in Python 3.
So how can I replace accented letters with their non-accented counterparts in Python 3?
For example,
sentence = 'intérêt'
to
new_sentence = 'interet'
The linked answer references the third-party module unidecode, not Python 2's unicode type.
$ python3
Python 3.7.1 (default, Nov 19 2018, 13:04:22)
[Clang 10.0.0 (clang-1000.11.45.5)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import unidecode
>>> unidecode.unidecode('intérêt')
'interet'
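
If you would rather avoid the third-party dependency, the standard library's unicodedata module can strip most accents. This is a common alternative (not from the linked answer) and only covers characters that decompose into an ASCII base letter plus combining marks:
>>> import unicodedata
>>> unicodedata.normalize('NFKD', 'intérêt').encode('ascii', 'ignore').decode('ascii')
'interet'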

String/unicode references in Python embedded dictionaries [duplicate]

This question already has answers here:
Why does comparing strings using either '==' or 'is' sometimes produce a different result?
(15 answers)
Python string interning
(2 answers)
Closed 5 years ago.
I have a question about Python 2.7.5 through 2.7.13. It may be about semantics or it may be a genuine Python bug; I'm not entirely sure which. Here is the simplest code I can construct that shows the issue:
Python 2.7.13 |Enthought, Inc. (x86_64)| (default, Mar 2 2017, 08:20:50)
[GCC 4.2.1 Compatible Apple LLVM 6.0 (clang-600.0.57)] on darwin
>>> dd = {'foo': {'yy':u'Tannenbaum'}}
>>> dd['foo']['yy'] is u'Tannenbaum'
False
>>> dd['foo']['yy'] == u'Tannenbaum'
True
Note: If u'Tannenbaum' is changed from a unicode literal to a plain str literal, the outcome changes: both of the final tests are True.
The question is: why do the two final tests differ in the unicode case? My understanding is that since unicode objects and strings are both immutable, the "is" and "==" tests should never differ in value. But I get this behavior in both Python 2.7.13 and the old 2.7.5 that came installed on my Mac. Am I relying on something I shouldn't rely on? Is the moral that I should never use "is" for string equality? And if so, what is the principle that tells me that?
Postscript: I have access to a Python 3.6.2 on another machine, and lo and behold, I cannot reproduce this anomaly.
Python 3.6.2 (default, Jul 30 2017, 12:03:06)
[GCC 4.8.5 20150623 (Red Hat 4.8.5-11)] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> dd = {'foo': {'yy':u'Tannenbaum'}}
>>> dd['foo']['yy'] is u'Tannenbaum'
True
>>> dd['foo']['yy'] == u'Tannenbaum'
True
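
As the linked duplicates explain, '==' compares values while 'is' compares object identity, and CPython only interns some string literals as an implementation detail, so identity of equal strings is never guaranteed. A small sketch of my own that shows the distinction without relying on interning:
>>> prefix = u'Tannen'
>>> a = prefix + u'baum'   # built at runtime, so a brand-new object
>>> b = u'Tannenbaum'
>>> a == b                 # same characters
True
>>> a is b                 # distinct objects in CPython
False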

Does python support unicode beyond basic multilingual plane?

Below is a simple test. repr seems to work fine, yet len() and [x for x in ...] don't seem to split the unicode text correctly in Python 2.6 and 2.7:
In [1]: u"爨爵"
Out[1]: u'\U0002f920\U0002f921'
In [2]: [x for x in u"爨爵"]
Out[2]: [u'\ud87e', u'\udd20', u'\ud87e', u'\udd21']
Good news is Python 3.3 does the right thing ™.
Is there any hope for Python 2.x series?
Yes, provided you compiled your Python with wide-unicode support.
By default, Python is built with narrow unicode support only. Enable wide support with:
./configure --enable-unicode=ucs4
You can verify what configuration was used by testing sys.maxunicode:
import sys
if sys.maxunicode == 0x10FFFF:
    print 'Python built with UCS4 (wide unicode) support'
else:
    print 'Python built with UCS2 (narrow unicode) support'
A wide build stores every character of a unicode value in four bytes (UCS-4), doubling memory use for these values. Python 3.3 switched to a flexible internal representation: each string uses only as many bytes per character as its widest character requires.
Quick demo showing that a wide build handles your sample Unicode string correctly:
$ python2.6
Python 2.6.6 (r266:84292, Dec 27 2010, 00:02:40)
[GCC 4.4.5] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import sys
>>> sys.maxunicode
1114111
>>> [x for x in u'\U0002f920\U0002f921']
[u'\U0002f920', u'\U0002f921']
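
If rebuilding Python is not an option, a narrow build can still be handled by re-joining surrogate pairs by hand. The helper below is a sketch of mine (iter_code_points is not a standard function), not part of the original answer:
def iter_code_points(u):
    # Yield integer code points, combining the UTF-16 surrogate pairs
    # that narrow (UCS-2) builds use for characters beyond the BMP.
    i, n = 0, len(u)
    while i < n:
        hi = ord(u[i])
        if 0xD800 <= hi <= 0xDBFF and i + 1 < n:
            lo = ord(u[i + 1])
            if 0xDC00 <= lo <= 0xDFFF:
                yield 0x10000 + ((hi - 0xD800) << 10) + (lo - 0xDC00)
                i += 2
                continue
        yield hi
        i += 1
With this, [hex(cp) for cp in iter_code_points(u'\U0002f920\U0002f921')] gives ['0x2f920', '0x2f921'] on both narrow and wide builds.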

Python sys.maxint, sys.maxunicode on Linux and windows

On 64-bit Debian Linux 6:
Python 2.6.6 (r266:84292, Dec 26 2010, 22:31:48)
[GCC 4.4.5] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import sys
>>> sys.maxint
9223372036854775807
>>> sys.maxunicode
1114111
On 64-bit Windows 7:
Python 2.7.1 (r271:86832, Nov 27 2010, 17:19:03) [MSC v.1500 64 bit (AMD64)] on
win32
Type "help", "copyright", "credits" or "license" for more information.
>>> import sys
>>> sys.maxint
2147483647
>>> sys.maxunicode
65535
Both operating systems are 64-bit, yet they report different values for sys.maxunicode. According to Wikipedia, there are 1,114,112 code points in Unicode. Is sys.maxunicode on Windows wrong?
And why do they have different values for sys.maxint?
I don't know what your question is, but sys.maxunicode is not wrong on Windows.
See the docs:
sys.maxunicode
An integer giving the largest supported code point for a Unicode character. The value of this depends on the configuration option that specifies whether Unicode characters are stored as UCS-2 or UCS-4.
Python on Windows uses UCS-2, so the largest code point is 65,535 (and the supplementary-plane characters are encoded by 2*16 bit "surrogate pairs").
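For example, on a narrow build such as the Windows Python 2.7 shown above, a single supplementary-plane character is stored as two surrogate code units (illustrative session, assuming a narrow build):
>>> s = u'\U0001D11E'          # MUSICAL SYMBOL G CLEF, outside the BMP
>>> len(s)                     # 2 on a narrow build, 1 on a wide build
2
>>> [hex(ord(c)) for c in s]
['0xd834', '0xdd1e']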
About sys.maxint: this is the point at which Python 2 switches from plain integers (123) to long integers (12345678987654321L). Evidently the Windows build uses 32 bits for this and the Linux build uses 64 bits. Since Python 3, the distinction is gone because the plain and long integer types have been merged into one, so sys.maxint no longer exists in Python 3.
Regarding the difference in sys.maxint, see What is the bit size of long on 64-bit Windows?. Python 2.x stores a plain integer in a C long internally, which is 32 bits on 64-bit Windows but 64 bits on 64-bit Linux.
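A quick interactive check of that promotion point (Python 2, any platform):
>>> import sys
>>> type(sys.maxint)
<type 'int'>
>>> type(sys.maxint + 1)   # one past maxint is promoted to long
<type 'long'>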

Python2.4 and 2.6 behaves differently for os.path.getmtime() on Windows

Getting two different modification time when calculated from different Python versions on Windows XP.
Python2.4
C:\Copy of elisp>c:\python24\python
Python 2.4.4 (#71, Oct 18 2006, 08:34:43) [MSC v.1310 32 bit (Intel)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> import os
>>> os.path.getmtime("auto-complete-emacs-lisp.el")
1251684178
>>> ^Z
Python2.6
C:\Copy of elisp>C:\Python26\python
Python 2.6.4 (r264:75708, Oct 26 2009, 08:23:19) [MSC v.1500 32 bit (Intel)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> import os
>>> os.path.getmtime("auto-complete-emacs-lisp.el")
1251687778.0
>>>
There is a difference of 3600 seconds between the values reported by Python 2.6 and Python 2.4.
What is the reason for this strange behavior?
It's a bug in Microsoft's implementation of the C standard library. Python 2.4 used to use the stdlib fstat call to get file information, and hence could end up an hour out in locales that use DST.
In Python 2.5 and later, os.stat calls the direct Win32-only API to get file information when running on Windows, resulting in the correct output. See this thread for more.
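Just to make the DST connection explicit, the two timestamps in the question differ by exactly one hour:
>>> 1251687778.0 - 1251684178   # exactly one hour
3600.0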
There is a difference of 3600 seconds ...
This should be the kicker. It's a timezone problem, pure and simple.
Now all you have to do is find out why 2.4 and 2.6 are using different timezone information :-)
