I'd like to know how to calculate the factorial of a matrix elementwise. For example,
import numpy as np
mat = np.array([[1,2,3],[2,3,4]])
np.the_function_i_want(mat)
would give a matrix mat2 such that mat2[i,j] = mat[i,j]!. I've tried something like
np.fromfunction(lambda i,j: np.math.factorial(mat[i,j]))
but it passes the entire matrix as argument for np.math.factorial. I've also tried to use scipy.vectorize but for matrices larger than 10x10 I get an error. This is the code I wrote:
import scipy as sp
javi = sp.fromfunction(lambda i,j: i+j, (15,15))
fact = sp.vectorize(sp.math.factorial)
fact(javi)
OverflowError: Python int too large to convert to C long
Such an integer would have to be greater than about 2e9, so I don't understand what this means.
There's a factorial function in scipy.special which allows element-wise computations on arrays:
>>> from scipy.special import factorial
>>> factorial(mat)
array([[ 1.,  2.,  6.],
       [ 2.,  6., 24.]])
The function returns an array of float values and so can compute "larger" factorials up to the accuracy floating point numbers allow:
>>> factorial(15)
array(1307674368000.0)
You may need to adjust the print precision of NumPy arrays if you want to avoid the number being displayed in scientific notation.
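For instance (a small sketch; suppress is the NumPy print option that forces fixed-point notation):
import numpy as np
from scipy.special import factorial
np.set_printoptions(suppress=True)  # print floats in fixed-point rather than scientific notation
print(factorial(np.array([10, 15, 17])))  # e.g. shows 1307674368000. instead of 1.30767437e+12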
Regarding scipy.vectorize: the OverflowError implies that the results of some of the calculations are too big to be stored as integers (normally int32 or int64).
If you want to vectorize sp.math.factorial and want arbitrarily large integers, you'll need to specify that the function return an output array with the 'object' datatype. For instance:
fact = sp.vectorize(sp.math.factorial, otypes='O')
Specifying the 'object' type allows Python integers to be returned by fact. These are not limited in size and so you can calculate factorials as large as your computer's memory will permit. Be aware that arrays of this type lose some of the speed and efficiency benefits which regular NumPy arrays have.
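For example, applied to the 15x15 array from the question (a quick sketch using numpy and the standard math module directly, in case sp.fromfunction / sp.math are not available in your SciPy version):
import math
import numpy as np
javi = np.fromfunction(lambda i, j: i + j, (15, 15), dtype=int)
# otypes='O' makes the output an object array holding exact Python ints,
# which are not limited to the C long range.
fact = np.vectorize(math.factorial, otypes='O')
big = fact(javi)
print(big.dtype)    # object
print(big[14, 14])  # 28! as an exact 30-digit Python integer, with no OverflowError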
What is the equivalent of the math.remainder() function in NumPy? Basically I would like to compute y = x - np.around(x) for a NumPy array x. (It is important that all elements of y have absolute value at most 1/2.) Looking at the NumPy documentation, neither np.fmod nor np.remainder does the job.
Obviously, I thought about writing x - np.around(x) but I am afraid that for large values of x subtraction produces floating-point inaccuracies. For example:
import numpy as np
a = np.arange(1000) * 1e-9
x = a / 1e-9
y = x - np.around(x)
This should produce an all-zero vector y, but in practice there are some errors (which get larger if I increase the array size from 1000 to 10000).
The reason I am asking this question is to figure out if there is a NumPy function for this purpose that calls directly C math library remainder (as math.remainder must do) so as to minimize the floating-points errors.
I don't think this currently exists in NumPy. That said, the automagic numpy.vectorize call does the right thing for me, e.g.:
import math
import numpy as np
ieee_remainder = np.vectorize(math.remainder)
ieee_remainder(np.arange(5), 5)
This broadcasts over the parameters nicely, giving:
array([ 0., 1., 2., -2., -1.])
which might be what you want.
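Applied to the construction from the question, the divisor is simply 1.0 (a sketch):
import math
import numpy as np
ieee_remainder = np.vectorize(math.remainder)
a = np.arange(1000) * 1e-9
x = a / 1e-9
y = ieee_remainder(x, 1.0)  # signed distance from each element to its nearest integer
print(np.abs(y).max())      # guaranteed to be at most 0.5, and tiny here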
Performance is about 10 times slower than you'd get with a native implementation. Given an array of 10,000 elements, my laptop takes approximately the following times:
1200 µs for ieee_remainder
150 µs for a Cython version I hacked together
120 µs for a C program doing a naive loop
80 µs for numpy.fmod
Given the complexity of glibc's remainder implementation, I'm amazed it's as fast as it is.
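For reference, a rough sketch of how such timings could be reproduced with timeit (the numbers above are from the answerer's laptop; yours will differ):
import math
import timeit
import numpy as np
ieee_remainder = np.vectorize(math.remainder)
x = np.random.uniform(0.0, 100.0, 10_000)
for label, fn in [("ieee_remainder", lambda: ieee_remainder(x, 5.0)),
                  ("numpy.fmod", lambda: np.fmod(x, 5.0))]:
    t = timeit.timeit(fn, number=100) / 100
    print(f"{label}: {t * 1e6:.0f} µs per call")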
I am trying to get rid of the for loop and instead do an array-matrix multiplication to decrease the processing time when the weights array is very large:
import numpy as np
sequence = [np.random.random(10), np.random.random(10), np.random.random(10)]
weights = np.array([[0.1,0.3,0.6],[0.5,0.2,0.3],[0.1,0.8,0.1]])
Cov_matrix = np.matrix(np.cov(sequence))
results = []
for w in weights:
    result = np.matrix(w)*Cov_matrix*np.matrix(w).T
    results.append(result.A)
Where:
Cov_matrix is a 3x3 matrix
weights is an array of length n containing n 1x3 row vectors.
Is there a way to multiply/map weights to Cov_matrix and bypass the for loop? I am not very familiar with all the numpy functions.
I'd like to reiterate what's already been said in another answer: the np.matrix class has many more disadvantages than advantages these days, and I suggest moving to the use of the np.array class alone. Matrix multiplication of arrays can easily be written using the @ operator, so the notation is in most cases as elegant as for the matrix class (and arrays don't have several restrictions that matrices do).
With that out of the way, what you need can be done in terms of a call to np.einsum. We need to contract certain indices of three matrices while keeping one index alone in two of them. That is, we want to compute w_{ij} * Cov_{jk} * (w^T)_{ki} with a summation over j and k, giving an array indexed by i. The following call to einsum will do it:
res = np.einsum('ij,jk,ik->i', weights, Cov_matrix, weights)
Note that the above will give you a single 1d array, whereas you originally had a list of arrays with shape (1,1). I suspect the 1d result will actually be more convenient. Also, note that I omitted the transpose in the second weights argument, which is why the corresponding summation indices appear as ik rather than ki. This should be marginally faster.
To prove that the above gives the same result:
In [8]: results # original
Out[8]: [array([[0.02803215]]), array([[0.02280609]]), array([[0.0318784]])]
In [9]: res # einsum
Out[9]: array([0.02803215, 0.02280609, 0.0318784 ])
The same can be achieved by working with the weights as a matrix and then looking at the diagonal elements of the result. Namely:
np.diag(weights.dot(Cov_matrix).dot(weights.transpose()))
which gives:
array([0.03553664, 0.02394509, 0.03765553])
This does more calculations than necessary (calculates off-diagonals) so maybe someone will suggest a more efficient method.
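One such alternative (a quick sketch, equivalent to the einsum above) avoids the off-diagonal work by summing elementwise products row by row:
import numpy as np
weights = np.array([[0.1, 0.3, 0.6], [0.5, 0.2, 0.3], [0.1, 0.8, 0.1]])
Cov_matrix = np.cov([np.random.random(10) for _ in range(3)])
# Row i of (weights @ Cov_matrix), multiplied elementwise by row i of weights
# and summed, gives w_i @ Cov_matrix @ w_i.T without forming off-diagonal terms.
res = (weights @ Cov_matrix * weights).sum(axis=1)
print(res)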
Note: I'd suggest slowly moving away from np.matrix and instead work with np.array. It takes a bit of getting used to not being able to do A*b but will pay dividends in the long run. Here is a related discussion.
I'm trying to compute a dot product of an expression that is supposed to be symmetric.
It turns out that it just isn't.
B is a 4D array whose last two dimensions I must transpose to obtain B^t.
D is a 2D array. (It's an expression of the stiffness matrix familiar to finite element method programmers.)
numpy.dot combined with numpy.transpose, and as a second alternative numpy.einsum (the idea came from this topic: Numpy Matrix Multiplication U*B*U.T Results in Non-symmetric Matrix), have already been used and the problem persists.
By the end of the calculations the product B^tDB is obtained, and when it is checked for symmetry by subtracting its transpose (B^tDB)^t, there is still a residue.
The Dot product or the Einstein Summation are used only over the dimensions of interest (last ones).
The question is: How can these residues be eliminated?
You need to use arbitrary precision floating point math. Here's how you can combine numpy and the mpmath package to define an arbitrary precision version of matrix multiplication (ie the np.dot method):
from mpmath import mp, mpf
import numpy as np
# stands for "decimal places". Larger values
# mean higher precision, but slower computation
mp.dps = 75
def tompf(arr):
    """Convert any numpy array to one of arbitrary precision mpmath.mpf floats
    """
    if arr.size and not isinstance(arr.flat[0], mpf):
        return np.array([mpf(x) for x in arr.flat]).reshape(*arr.shape)
    else:
        return arr

def dotmpf(arr0, arr1):
    """An arbitrary precision version of np.dot
    """
    return tompf(arr0).dot(tompf(arr1))
As an example, if you then set up B, B^t, and D matrices as so:
bshape = (8,8,8,8)
dshape = (8,8)
B = np.random.rand(*bshape)
BT = np.swapaxes(B, -2, -1)
d = np.random.rand(*dshape)
D = d.dot(d.T)
then B^tDB - (B^tDB)^t will always have a non-zero value if you calculate it using the standard matrix multiplication method from numpy:
M = np.dot(np.dot(B, D), BT)
np.sum(M - M.T)
but if you use the arbitrary precision version given above it won't have a residue:
M = dotmpf(dotmpf(B, D), BT)
np.sum(M - M.T)
Watch out though. Calculations using arbitrary precision math run much slower than those done using standard floating point numbers.
I am working on a vision algorithm with OpenCV in Python. One of the components of it requires comparing points in color-space, where the x and y components are not integers. Our list of points is stored as ndarray with dtype = float64, and our numbers range from -10 to 10 give or take.
Part of our algorithm involves running a convex hull on some of the points in this space, but cv2.convexHull() requires an ndarray with dtype = int.
Given the narrow range of the values we are comparing, simple truncation causes us to lose ~60 bits of information. Is there any way to have numpy directly interpret the float array as an int array? Since the scale has no significance, I would like all 64 bits to be considered.
Is there any defined way to separate the exponent from the mantissa in a numpy float, without doing bitwise extraction for every element?
"Part of our algorithm involves running a convex hull on some of the points in this space, but cv2.convexHull() requires an ndarray with dtype = int."
cv2.convexHull() also accepts NumPy arrays with float32 numbers.
Try using cv2.convexHull(numpy.array(a, dtype='float32')) where a is a list of shape n*2 (n = number of points).
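A minimal sketch of that (the point coordinates are made up here, just to match the -10 to 10 range described in the question):
import numpy as np
import cv2
pts = np.random.uniform(-10.0, 10.0, size=(20, 2))  # float64 points, as in the question
hull = cv2.convexHull(pts.astype(np.float32))        # convert to float32 instead of int
print(hull.dtype, hull.shape)  # float32, (k, 1, 2) -- hull vertices keep their sub-integer precision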
I have a numpy array A with mpf elements that have decimal precision 100. Is this precision cast away if I decide to take the numpy dot product of A with itself?
If this is the case, is there any way to convert a numpy array to an mpmath matrix, so I can keep the precision?
Numpy arrays can hold objects, in particular mpf objects, and their methods such as dot can use the addition/multiplication methods of these objects. Example:
import mpmath
import numpy
mpmath.mp.dps = 25 # higher precision for demonstration
a = [mpmath.sin(mpmath.pi*n/3) for n in range(99)]
b = numpy.array(a)
b.dot(b)
outputs mpf('49.50000000000000000000000165')
For comparison, this is what happens if the array elements are cast to double-precision floats when converting to numpy:
c = numpy.array(a, dtype=float)
c.dot(c)
outputs 49.499999999999993. So, the higher precision provided by mpmath is preserved when the dot method is invoked in the first version.
The previous answer is correct. However, some things that work in numpy do not work in mpmath (or at least are done differently). Hence the original, more general, question:
"...is there any way to convert a numpy array to an mpmath matrix, so I can keep the precision?.."
In my experience, this more general question still needs a general answer. One approach is to convert the numpy array to a list first, and then the list to an mpmath matrix.
Here is a simple example that works for me (warning, may not be efficient):
import mpmath as mp, numpy as np
N = 5
L = np.ones(N)
M = np.diag(L, 2)  # a 7x7 numpy array with ones on the 2nd superdiagonal
# Note that mpmath's diag() function accepts only one argument
M = mp.matrix(M.tolist())
print(type(M),'\n', M)
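Going the other way (an mpmath matrix back into a NumPy array without losing precision) can be done with an object-dtype array; a small sketch, assuming mpmath's matrix.tolist() method:
import mpmath as mp
import numpy as np
mp.mp.dps = 50
M = mp.matrix([[mp.mpf(1)/3, 0], [0, mp.mpf(2)/3]])
# dtype=object keeps the mpf entries intact instead of casting them to float64
A = np.array(M.tolist(), dtype=object)
print(A.dtype, A[0, 0])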