In numpy, if we want to raise a matrix A to the power N (raised as defined in mathematics, in linear algebra in particular), it seems we need to use this function:
numpy.linalg.matrix_power
Isn't there a simpler way? Some Python symbol/operator?
E.g. I was expecting A**N to do this but it doesn't.
It seems that A**N raises each element to the power N, not the whole matrix to the power N (in the usual math sense). So A**N is a somewhat surprising element-wise power.
By matrix I mean of course a two-dimensional ndarray.
In [4]: x=np.arange(4).reshape(2,2)
For this square array:
In [6]: np.linalg.matrix_power(x,3)
Out[6]:
array([[ 6, 11],
       [22, 39]])
In [7]: x@x@x
Out[7]:
array([[ 6, 11],
       [22, 39]])
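For contrast, ** on the same plain array is element-wise, which is what the question observed:

x**3
# array([[ 0,  1],
#        [ 8, 27]])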
matrix_power is written in Python, so you can easily read it. It essentially does a sequence of dot products, with some refinements (repeated squaring) to reduce the number of steps.
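A minimal sketch of that idea (my own illustration, not the actual library source):

import numpy as np

def matpow(a, n):
    # square-and-multiply: O(log n) matrix products instead of n-1
    result = np.eye(a.shape[0], dtype=a.dtype)
    while n > 0:
        if n % 2:                # current binary digit of n is 1
            result = result @ a
        a = a @ a                # square for the next digit
        n //= 2
    return result

matpow(x, 3) agrees with np.linalg.matrix_power(x, 3) for the example above.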
For the np.matrix subclass, ** does the same thing:
In [8]: mx=np.matrix(x)
In [9]: mx**3
Out[9]:
matrix([[ 6, 11],
        [22, 39]])
** is translated by the interpreter to a __pow__ call. For this class that just amounts to a matrix_power call:
In [10]: mx.__pow__??
Signature: mx.__pow__(other)
Docstring: Return pow(self, value, mod).
Source:
def __pow__(self, other):
return matrix_power(self, other)
File: c:\users\paul\anaconda3\lib\site-packages\numpy\matrixlib\defmatrix.py
Type: method
But for ndarray this method is a compiled one:
In [3]: x.__pow__??
Signature: x.__pow__(value, mod=None, /)
Call signature: x.__pow__(*args, **kwargs)
Type: method-wrapper
String form: <method-wrapper '__pow__' of numpy.ndarray object at 0x0000022A8B2B5ED0>
Docstring: Return pow(self, value, mod).
numpy does not alter Python syntax. It has not added any operators. The @ operator was added to Python several years ago, largely as a convenience for packages like numpy, but it had to be added to the interpreter's syntax first.
Note that matrix_power works for

    a : (..., M, M) array_like
        Matrix to be "powered".

That means the input has to have at least 2 dimensions, and the trailing two must be of equal size. So it even extends the normal linear algebra definition (which is limited to 2d).
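For example, with a numpy recent enough to support stacked input, the power is applied to each trailing M-by-M matrix independently; a quick check:

import numpy as np

stack = np.arange(8).reshape(2, 2, 2)       # a stack of two 2x2 matrices
squared = np.linalg.matrix_power(stack, 2)  # squares each matrix in the stack
print(np.array_equal(squared[0], stack[0] @ stack[0]))  # True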
numpy isn't just a linear algebra package. It's meant to be a general purpose array tool. Linear algebra is just a subset of math that can be performed with multidimensional collections of numbers (and other objects).
numpy.linalg.matrix_power is the best way as far as I know. You could use dot or * in a loop, but that would just be more code, and probably less efficient.
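For illustration, a naive loop version (more code, and n-1 products instead of O(log n)):

import numpy as np
from functools import reduce

def naive_power(a, n):
    # chain n-1 dot products: a . a . ... . a
    return reduce(np.dot, [a] * n)

naive_power(x, 3) gives the same result as matrix_power(x, 3).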
**Scenario 1: sum**
I found that when working on 2d arrays in numpy, summing has different options - i.e., the Python built-in sum sums along one axis only, whereas the numpy sum can sum over the whole 2d array (matrix).
**Scenario 2: and versus &**
I noticed that logical and (and) and bitwise and (&) both work on the same data but produce different results. In fact, the logical and does not work on a dataframe Series, whereas the bitwise & works just fine.
Why does this happen? Can anybody provide insights based on the language's history, design, purpose, etc., so one can understand better?
numpy operates within Python, and gets all its special behavior from ndarray class methods and module functions. It does not alter Python syntax.
Python's sum treats its input as an iterable; with a 1d array that's easy to understand, since it's just like operating on a list. But on a 2d array it's harder to understand:
In [52]: x = np.arange(12).reshape(3,4)
In [53]: sum(x)
Out[53]: array([12, 15, 18, 21]) # what's this doing?
In [54]: x.sum() # or np.sum(x)
Out[54]: 66
In [55]: x.sum(axis=0)
Out[55]: array([12, 15, 18, 21]) # sum down rows, one per column
In [56]: x.sum(axis=1)
Out[56]: array([ 6, 22, 38]) # sum across columns, one per row
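The explanation for In [53]: Python's sum iterates over the first axis, adding the rows elementwise, which is why plain sum(x) matches x.sum(axis=0). A quick check, reusing the x above:

print(all(sum(x) == x[0] + x[1] + x[2]))      # True: sum(x) adds the rows
print(np.array_equal(sum(x), x.sum(axis=0)))  # True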
Python and is a short-circuiting operator. Like an if statement, using it with numpy arrays is likely to raise an ambiguity error ("The truth value of an array with more than one element is ambiguous"). Comparisons of arrays produce boolean arrays, and boolean arrays cannot be used in Python contexts that require a scalar boolean value.
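A short demonstration of the difference:

import numpy as np

a = np.array([1, 2, 3])
print((a > 0) & (a < 3))   # elementwise: [ True  True False]
try:
    (a > 0) and (a < 3)    # and needs one scalar truth value per operand
except ValueError as e:
    print(e)               # "The truth value of an array ... is ambiguous"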
Operators like +, *, and & have class-specific meanings/methods. [1,2,3]*3 is different from np.array([1,2,3])*3, and "a"+"string" is different from np.arange(3)+3.
I know np.exp2(x) exists, which calculates 2^x where x is a numpy array; however, I am looking for a method that does K^x where K is an arbitrary number.
Is there any elegant way of doing it, rather than stretching K out to the shape of x and doing a piecewise exponent?
Just use the standard Python exponentiation operator **:
K**x
For example, if you have:
x = np.array([1,2,3])
K = 3
print(K**x)
The output is:
[ 3 9 27]
Notes
For Python classes, the behavior of the binary ** operator is implemented via the __pow__, __rpow__, and __ipow__ magic methods (the reality for np.ndarray is slightly more complicated since it's implemented in the C layer, but that's not actually important here). For Numpy arrays, these magic methods in turn appear to call numpy.power, so you can expect that ** will have the same behavior as documented for numpy.power. In particular,
Note that an integer type raised to a negative integer power will raise a ValueError.
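A quick check of both behaviors (the ValueError applies only to integer dtypes; float bases are fine):

import numpy as np

x = np.array([1, 2, 3])
print(3 ** x)                     # [ 3  9 27], same as np.power(3, x)
print(2.0 ** np.array([-1, -2]))  # [0.5  0.25] -- float base works
try:
    np.power(2, np.array([-1]))   # integer base, negative integer power
except ValueError as e:
    print(e)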
With numpy you can just use numpy.power
import numpy
arr = numpy.array([1,2,3])
print(numpy.power(3,arr))  # outputs: [ 3  9 27]
I just accidentally discovered that I can mix sympy expressions up with numpy arrays:
>>> import numpy as np
>>> import sympy as sym
>>> x, y, z = sym.symbols('x y z')
>>> np.ones(5)*x
array([1.0*x, 1.0*x, 1.0*x, 1.0*x, 1.0*x], dtype=object)
# I was expecting this to throw an error!
# sum works and collects terms etc. as I would expect
>>> np.sum(np.array([x+0.1,y,z+y]))
x + 2*y + z + 0.1
# dot works too
>>> np.dot(np.array([x,y,z]),np.array([z,y,x]))
2*x*z + y**2
>>> np.dot(np.array([x,y,z]),np.array([1,2,3]))
x + 2*y + 3*z
This is quite useful for me, because I'm doing both numerical and symbolic calculations in the same program. However, I'm curious about the pitfalls and limitations of this approach --- it seems for example that neither np.sin nor sym.sin are supported on Numpy arrays containing Sympy objects, since both give an error.
However, this numpy-sympy integration doesn't appear to be documented anywhere. Is it just an accident of how these libraries are implemented, or is it a deliberate feature? If the latter, when is it designed to be used, and when would it be better to use sympy.Matrix or other solutions? Can I expect to keep some of numpy's speed when working with arrays of this kind, or will it just drop back to Python loops as soon as a sympy symbol is involved?
In short I'm pleased to find this feature exists, but I would like to know more about it!
This is just NumPy's support for arrays of objects. It is not specific to SymPy. NumPy examines the operands and finds that not all of them are scalars; there are some objects involved. So it calls that object's __mul__ or __rmul__, and puts the result into an array of objects. For example, with mpmath objects:
>>> import mpmath as mp
>>> np.ones(5) * mp.mpf('1.23')
array([mpf('1.23'), mpf('1.23'), mpf('1.23'), mpf('1.23'), mpf('1.23')],
dtype=object)
or lists:
>>> np.array([[2], 3])*5
array([list([2, 2, 2, 2, 2]), 15], dtype=object)
>>> np.array([2, 3])*[[1, 1], [2]]
array([list([1, 1, 1, 1]), list([2, 2, 2])], dtype=object)
Can I expect to keep some of numpy's speed when working with arrays of this kind,
No. NumPy object arrays have no performance benefits over Python lists; there is probably more overhead in accessing elements than there would be in a list. See also: Storing Python objects in a Python list vs. a fixed-length Numpy array.
There is no reason to use such arrays if a more specific data structure is available.
I just came across a relevant note in the numpy 1.15 release notes (https://docs.scipy.org/doc/numpy-1.15.1/release.html):
Comparison ufuncs accept dtype=object, overriding the default bool
This allows object arrays of symbolic types, which override == and other operators to return expressions, to be compared elementwise with np.equal(a, b, dtype=object).
I think that means this works, but didn't before:
In [9]: np.array([x+.1, 2*y])==np.array([.1+x, y*2])
Out[9]: array([ True, True])
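The ufunc form mentioned in the note, spelled out explicitly (assuming numpy >= 1.15):

import numpy as np
import sympy as sym

x, y = sym.symbols('x y')
a = np.array([x + .1, 2*y])
b = np.array([.1 + x, y*2])
# dtype=object makes np.equal return whatever each element's __eq__ produces
print(np.equal(a, b, dtype=object))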
I'm processing a numpy.matrix and I'm missing the round-up and round-down functions.
I.e. I can do:
data = [[1, -20],[-30, 2]]
np.matrix(data).mean(0).round().astype(int).tolist()[0]
Out[58]: [-14, -9]
Thus use .round(). But I cannot use .floor() or .ceil().
They are also not mentioned in the SciPy NumPy 1.14 reference.
Why are these (quite essential) functions missing?
edit:
I've found that you can do np.floor(np.matrix(data).mean(0)).astype(int).tolist()[0]. But why the difference? Why is .round() a method and .floor not?
As with most of these why questions we can only deduce likely reasons from patterns, and some knowledge of the history.
https://docs.scipy.org/doc/numpy/reference/ufuncs.html#floating-functions
floor and ceil are classed as floating ufuncs. rint is also a ufunc that performs like round. ufuncs have a standardized interface, including parameters like out and where.
np.round is in /usr/local/lib/python3.6/dist-packages/numpy/core/fromnumeric.py. numeric is one of the original packages that were merged to form the current numpy. np.round is an alias for np.round_, which ends up calling np.around, also in fromnumeric. Note that the available parameters include out, but also decimals (which rint lacks). And it delegates to the .round method.
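A quick comparison of the two interfaces (the ufuncs take out= and where=, while around takes decimals=):

import numpy as np

a = np.array([1.2345, -6.789])
print(np.around(a, decimals=2))    # [ 1.23 -6.79] -- delegates to a.round()
print(np.rint(a))                  # [ 1. -7.] -- nearest integer, no decimals
out = np.zeros_like(a)
np.floor(a, out=out, where=a > 0)  # ufunc interface: out= and where=
print(out)                         # [1. 0.] -- untouched where the mask is False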
One advantage to having a function is that you don't have to first convert the list into an array:
In [115]: data = [[1, -20],[-30, 2]]
In [119]: np.mean(data,0)
Out[119]: array([-14.5, -9. ])
In [120]: np.mean(data,0).round()
Out[120]: array([-14., -9.])
In [121]: np.rint(np.mean(data,0))
Out[121]: array([-14., -9.])
using other parameters:
In [138]: np.mean(data,axis=0, keepdims=True,dtype=int)
Out[138]: array([[-14, -9]])
In Python 3.5, the @ operator was introduced for matrix multiplication, following PEP 465. This is implemented e.g. in numpy as the matmul operator.
However, as proposed by the PEP, the numpy operator throws an exception when called with a scalar operand:
>>> import numpy as np
>>> np.array([[1,2],[3,4]]) @ np.array([[1,2],[3,4]])  # works
array([[ 7, 10],
       [15, 22]])
>>> 1 @ 2  # doesn't work
Traceback (most recent call last):
  File "<input>", line 1, in <module>
TypeError: unsupported operand type(s) for @: 'int' and 'int'
This is a real turnoff for me, since I'm implementing numerical signal processing algorithms that should work for both scalars and matrices. The equations for both cases are mathematically exactly equivalent, which is no surprise, since 1x1 matrix multiplication is equivalent to scalar multiplication. The current state however forces me to write duplicate code in order to handle both cases correctly.
So, given that the current state is not satisfactory, is there any reasonable way I can make the # operator work for scalars? I thought about adding a custom __matmul__(self, other) method to scalar data types, but this seems like a lot of hassle considering the number of involved internal data types. Could I change the implementation of the __matmul__ method for numpy array data types to not throw an exception for 1x1 array operands?
And, on a sidenote, which is the rationale behind this design decision? Off the top of my head, I cannot think of any compelling reasons not to implement that operator for scalars as well.
As ajcr suggested, you can work around this issue by forcing some minimal dimensionality on objects being multiplied. There are two reasonable options: atleast_1d and atleast_2d, which have different results in regard to the type being returned by @: a scalar versus a 1-by-1 2D array.
x = 3
y = 5
z = np.atleast_1d(x) @ np.atleast_1d(y)  # returns 15
z = np.atleast_2d(x) @ np.atleast_2d(y)  # returns array([[15]])
However:

- Using atleast_2d will lead to an error if x and y are 1D arrays that would otherwise be multiplied normally.
- Using atleast_1d will result in a product that is either a scalar or a matrix, and you don't know which.
Both of these are more verbose than np.dot(x, y) which would handle all of those cases.
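For reference, np.dot handles the scalar case directly:

import numpy as np

print(np.dot(3, 5))            # 15 -- both 0-d: plain multiplication
print(np.dot([1, 2], [3, 4]))  # 11 -- 1-D inputs: inner product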
Also, the atleast_1d version suffers from the same flaw that would also be shared by having scalar @ scalar = scalar: you don't know what can be done with the output. Will z.T or z.shape throw an error? These work for 1-by-1 matrices but not for scalars. In the setting of Python, one simply cannot ignore the distinction between scalars and 1-by-1 arrays without also giving up all the methods and properties that the latter have.
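If the duplicate code is the main pain point, one hypothetical workaround (the helper name and the fallback choice are my own, not numpy API) is to dispatch on dimensionality yourself:

import numpy as np

def smart_matmul(x, y):
    # fall back to plain multiplication when either operand is 0-d
    if np.ndim(x) == 0 or np.ndim(y) == 0:
        return x * y
    return x @ y

print(smart_matmul(3, 5))                   # 15
print(smart_matmul(np.eye(2), np.ones(2)))  # [1. 1.]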