Python: make scipy use numpy.float128 instead of numpy.float64?

Is it possible, and if so, how can I make scipy use numpy.float128 by default? For example:
>>> from scipy.stats import norm
>>> type(norm.pdf(10, 10, 1))
<class 'numpy.float64'>
and I want it to be
>>> from scipy.stats import norm
>>> type(norm.pdf(10, 10, 1))
<class 'numpy.float128'>
If not, I will need to implement the norm.pdf function myself, which is easy but does not solve my problem.

In general, you can't. Some of the scipy routines are wrappers of code written in C or Fortran that is only available in double precision. Even if you figure out which ones are pure Python+numpy, and manage to ensure that the operations performed in the computation preserve the data type, you'll find that many of the functions use hardcoded 64-bit constants such as numpy.pi.
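For completeness, here is a minimal sketch of what rolling your own pdf in extended precision could look like, assuming numpy.float128 exists on your platform (it is an alias for the C long double, so it is typically 80-bit extended precision rather than true quad), and keeping in mind the caveat above about 64-bit constants such as numpy.pi:

import numpy as np

def norm_pdf_128(x, loc=0.0, scale=1.0):
    # Promote everything to float128 before doing any arithmetic.
    x = np.float128(x)
    loc = np.float128(loc)
    scale = np.float128(scale)
    z = (x - loc) / scale
    # np.pi is a 64-bit literal; casting it only preserves the digits it already has.
    two_pi = np.float128(2) * np.float128(np.pi)
    return np.exp(-z * z / 2) / (scale * np.sqrt(two_pi))

print(type(norm_pdf_128(10, 10, 1)))  # <class 'numpy.float128'>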

Related

Is it possible to display all of the functions that SymPy tries during `simplify()`?

The SymPy docs state the following:
SymPy has dozens of functions to perform various kinds of simplification. There is also one general function called simplify() that attempts to apply all of these functions in an intelligent way to arrive at the simplest form of an expression.
I am using SymPy as a tool to help me relearn maths, so it would be really useful if I could view all of the functions that SymPy tries.
Is it possible to display all of the functions that SymPy tries during simplify()? How can I do this?
The source of simplify (in sympy/simplify/simplify.py) shows that SymPy attempts the following operations, most of which are documented in the simplify module docs (the page you linked is from the SymPy tutorial, which does not go into details):
cancel(expr)
_mexpand(expr).cancel()
together(expr, deep=True)
factor_terms(expr, sign=False)
hyperexpand(expr)
piecewise_fold(expr)
besselsimp(expr)
trigsimp(expr, deep=True)
expand_log(expr, deep=True)
logcombine(expr)
combsimp(expr)
sum_simplify(expr)
product_simplify(expr)
quantity_simplify(expr)
powsimp(expr, combine='exp', deep=True)
powsimp(expr)
expand_power_exp(expand_mul(expr))
exptrigsimp(expr)
To try these directly, import
from sympy import *
from sympy.simplify.simplify import sum_simplify, product_simplify
from sympy.core.function import _mexpand
However, simplify does not just try these methods one by one: most of them are used only when the expression matches some pattern, and some of them are used in combinations.
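To get a feel for what the individual passes do, you can apply a few of them by hand to one expression and compare the results (a toy illustration, not the exact decision logic that simplify itself uses):

from sympy import symbols, sin, cos, cancel, factor_terms, trigsimp

x = symbols('x')
expr = (sin(x)**2 + cos(x)**2 + x**2 - 1) / (x + 1)

print(cancel(expr))                     # rational-function cancellation only; no trig knowledge
print(factor_terms(expr, sign=False))   # pulls common factors out of the terms
print(trigsimp(expr, deep=True))        # uses sin(x)**2 + cos(x)**2 == 1, giving x**2/(x + 1)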

Calling Fortran routines in scipy.special in a function jitted with numba

Is there a way to directly or indirectly call the Fortran routines found at https://github.com/scipy/scipy/tree/master/scipy/special/cdflib, which are used by scipy.stats, from a function that is compiled by numba in nopython mode?
Concretely, because scipy.stats.norm.cdf() is fairly slow, I am currently using scipy.special.ndtr directly, which is called by the former. However, I'm doing this in a loop, and my intent is to speed it up using numba.
I would take a look at rvlib, which uses Numba and CFFI to call RMath, the standalone C library that R uses to calculate statistical distributions. The functions it provides should be callable by Numba in nopython mode. Take a look at the README for an example of the function that is equivalent to scipy.stats.norm.cdf().
If you're still interested in wrapping cdflib yourself, I would recommend using CFFI. You'll have to build a C-interface for the functions you want. You might find this blog post that I wrote helpful in getting started:
https://www.continuum.io/blog/developer-blog/calling-c-libraries-numba-using-cffi
If all you need is the normal CDF, then you can implement it using the erfc function in the standard library's math module.
import numba
from math import erfc, sqrt

SQRT2 = sqrt(2.0)

@numba.jit(nopython=True)
def normcdf(x):
    # If X ~ N(0, 1), returns P(X < x).
    return erfc(-x / SQRT2) / 2.0
You can probably vectorize it using numba.vectorize instead of numba.jit.
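A sketch of that vectorized variant (the explicit signature string is one reasonable choice, not a requirement):

import numpy as np
import numba
from math import erfc, sqrt

SQRT2 = sqrt(2.0)

@numba.vectorize(['float64(float64)'])
def normcdf_vec(x):
    # Elementwise standard-normal CDF, usable like a NumPy ufunc.
    return erfc(-x / SQRT2) / 2.0

print(normcdf_vec(np.linspace(-3.0, 3.0, 7)))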

Conjugate transpose operator ".H" in numpy

It is very convenient in numpy to use the .T attribute to get a transposed version of an ndarray. However, there is no similar way to get the conjugate transpose. Numpy's matrix class has the .H operator, but not ndarray. Because I like readable code, and because I'm too lazy to always write .conj().T, I would like the .H property to always be available to me. How can I add this feature? Is it possible to add it so that it is brainlessly available every time numpy is imported?
(A similar question could be asked about the .I inverse operator.)
You can subclass the ndarray object like:
import numpy as np

class myarray(np.ndarray):
    @property
    def H(self):
        return self.conj().T
such that:
a = np.random.rand(3, 3).view(myarray)
a.H
will give you the desired behavior.
Edit:
As suggested by @slek120, you can restrict the conjugate transpose to the last two axes with:
self.swapaxes(-2, -1).conj()
instead of self.conj().T.
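Put together, that batched variant of the subclass might look like this (a sketch):

import numpy as np

class myarray(np.ndarray):
    @property
    def H(self):
        # Conjugate-transpose only the last two axes, so stacks of
        # matrices with shape (..., m, n) behave as expected.
        return self.swapaxes(-2, -1).conj()

a = (np.random.rand(2, 3, 4) + 1j * np.random.rand(2, 3, 4)).view(myarray)
print(a.H.shape)  # (2, 4, 3)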
In general, the difficulty in this problem is that Numpy is a C-extension, which cannot be monkey patched...or can it? The forbiddenfruit module allows one to do this, although it feels a little like playing with knives.
So here is what I've done:
Install the very simple forbiddenfruit package
Determine the user customization directory:
import site
print site.getusersitepackages()
In that directory, edit usercustomize.py to include the following:
from forbiddenfruit import curse
from numpy import ndarray
from numpy.linalg import inv
curse(ndarray,'H',property(fget=lambda A: A.conj().T))
curse(ndarray,'I',property(fget=lambda A: inv(A)))
Test it:
python -c "import numpy as np; A = np.array([[1,1j]]); print A; print A.H"
Results in:
[[ 1.+0.j 0.+1.j]]
[[ 1.-0.j]
[ 0.-1.j]]

Why does built-in sum behave wrongly after "from numpy import *"?

I have some code like:
import math, csv, sys, re, time, datetime, pickle, os, gzip
from numpy import *
x = [1, 2, 3, ... ]
y = sum(x)
The sum of the actual values in x is 2165496761, which is larger than a 32-bit signed integer can hold. The reported y value is -2129470535, implying integer overflow.
Why did this happen? I thought the built-in sum was supposed to use Python's arbitrary-size integers?
See How to restore a builtin that I overwrote by accident? if you've accidentally done something like this at the REPL (interpreter prompt).
Doing from numpy import * causes the built-in sum function to be replaced with numpy.sum:
>>> sum(xrange(10**7))
49999995000000L
>>> from numpy import sum
>>> sum(xrange(10**7)) # assuming a 32-bit platform
-2014260032
To verify that numpy.sum is in use, try to check the type of the result:
>>> sum([721832253, 721832254, 721832254])
-2129470535
>>> type(sum([721832253, 721832254, 721832254]))
<type 'numpy.int32'>
To avoid this problem, don't use star import.
If you must use numpy.sum and want an arbitrary-sized integer result, specify a dtype for the result like so:
>>> sum([721832253, 721832254, 721832254],dtype=object)
2165496761L
or refer to the builtin sum explicitly (possibly giving it a more convenient binding):
>>> __builtins__.sum([721832253, 721832254, 721832254])
2165496761L
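Putting the namespacing advice and the wide-dtype fix together on the numbers from the question (a sketch; the np alias is just the usual convention):

import numpy as np   # namespaced import: the built-in sum stays intact

x = [721832253, 721832254, 721832254]
print(sum(x))                      # built-in sum -> 2165496761
print(np.sum(x, dtype=np.int64))   # numpy's sum with a 64-bit accumulator -> 2165496761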
The reason why you get this invalid value is that numpy's sum is accumulating the result in int32. Nothing prevents you from using a wider dtype such as np.int64 to represent your data. You could, for example, just use
np.asarray(x, dtype=np.int64).sum()
On a side note, please make sure that you never use from numpy import *. It's a terrible practice and a habit you should get rid of as soon as possible. When you use from ... import *, you might be overwriting some Python built-ins, which makes bugs very difficult to track down. Typical examples are functions like sum or max.
Python handles large numbers with arbitrary precision:
>>> sum([721832253, 721832254, 721832254])
2165496761
Just sum them up!
To make sure you don't use numpy.sum, try __builtins__.sum() instead.

Changing floating point behavior in Python to Numpy style

Is there a way to make Python floating point numbers follow numpy's rules regarding +/- Inf and NaN? For instance, making 1.0/0.0 = Inf.
>>> from numpy import *
>>> ones(1)/0
array([ Inf])
>>> 1.0/0.0
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
ZeroDivisionError: float division
Numpy's divide function gives divide(1.0, 0.0) = inf; however, it is not clear whether that behavior can be enabled globally for plain Python floats, in the way from __future__ import division changes the division operator.
You should have a look at how Sage does it. IIRC they wrap the Python REPL in their own preprocessor.
I tried to do something similar, and I never figured out how to do it nicely. But I can tell you a few things I tried that didn't work:
Setting float = numpy.float -- Python still uses the old float.
Trying to replace float.__div__ with a user-defined function -- "TypeError: can't set attributes of built-in/extension type 'float'". Python also doesn't like you mucking with the __dict__ of built-in objects.
I decided to go in and change the actual CPython source code to make it do what I wanted, which is obviously not practical, but it worked.
I think the reason why something like this is not possible is that float/int/list are implemented in C in the background, and their behavior cannot be changed cleanly from inside the language.
You could wrap all your floats in numpy.float64, which is the numpy float type.
from numpy import float64

a = float64(1.)
a / 0  # inf (numpy emits a RuntimeWarning rather than raising ZeroDivisionError)
In fact, you only need to wrap the floats on the left-hand side of the arithmetic operations.
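A short sketch of the resulting behavior, including how to silence the RuntimeWarning that numpy emits instead of raising an exception (np.errstate is the standard knob for this):

import numpy as np

print(np.float64(1.0) / 0.0)    # inf
print(np.float64(-1.0) / 0.0)   # -inf
print(np.float64(0.0) / 0.0)    # nan

# Silence (or escalate) the floating-point warnings for a block of code.
with np.errstate(divide='ignore', invalid='ignore'):
    print(np.float64(1.0) / 0.0)  # inf, no warning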
