Contraction along the last axis in numpy tensordot

I am not very familiar with tensor algebra and I am having trouble understanding how to make numpy.tensordot do what I want.
The example I am working with is simple: given a tensor a with shape (2,2,3) and another b with shape (2,1,3), I want a result tensor c with shape (2,2). This tensor would be the result of the following equivalent Python code:
n = a.shape[2]
c = np.zeros(a.shape[:2])
for k in range(n):
    c += a[:,:,k]*b[:,:,k]
The documentation says of the optional parameter axes:
If an int N, sum over the last N axes of a and the first N axes of b in order. The sizes of the corresponding axes must match.
But I don't understand which "axes" are needed here (furthermore, when axes is a tuple or a tuple of tuples it gets even more confusing). Examples aren't very clear to me either.

The way tensordot works, it won't work here (at least not directly) because of the alignment requirement along the first axes. You can use np.einsum to solve your case, though -
c = np.einsum('ijk,ilk->ij',a,b)
Alternatively, use np.matmul or the @ operator (Python 3.5+) -
np.matmul(a,b.swapaxes(1,2))[...,0] # or (a @ b.swapaxes(1,2))[...,0]
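As a quick sanity check, here is a minimal sketch comparing the explicit loop with the einsum and matmul versions. The random test data and the variable names (c_loop, c_einsum, c_matmul, t) are made up for illustration. It also shows what tensordot returns when asked to contract the last axes: it keeps the leading axis of a and the leading axis of b as separate dimensions, which is the alignment problem mentioned above.

```python
import numpy as np

rng = np.random.default_rng(0)
a = rng.random((2, 2, 3))
b = rng.random((2, 1, 3))

# Reference: explicit loop over the last axis; (2, 2) * (2, 1) broadcasts to (2, 2).
c_loop = np.zeros(a.shape[:2])
for k in range(a.shape[2]):
    c_loop += a[:, :, k] * b[:, :, k]

# The two vectorized alternatives from the answer.
c_einsum = np.einsum('ijk,ilk->ij', a, b)
c_matmul = (a @ b.swapaxes(1, 2))[..., 0]

# tensordot contracts the last axes but does not align the first axes:
# the result has shape (2, 2, 2, 1), indexed as [i, j, i', l].
t = np.tensordot(a, b, axes=([2], [2]))
print(c_loop.shape, t.shape)
```

The wanted values are still inside t (they sit at t[i, :, i, 0]), but tensordot computes every (i, i') pair to get there, which is why einsum or matmul is the more direct tool here.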

Related

Does np.dot automatically transpose vectors?

I am trying to calculate the first and second order moments for a portfolio of stocks (i.e. expected return and standard deviation).
expected_returns_annual
Out[54]:
ticker
adj_close CNP 0.091859
F -0.007358
GE 0.095399
TSLA 0.204873
WMT -0.000943
dtype: float64
type(expected_returns_annual)
Out[55]: pandas.core.series.Series
weights = np.random.random(num_assets)
weights /= np.sum(weights)
returns = np.dot(expected_returns_annual, weights)
So normally the expected return is calculated by
(x1,...,xn' * (R1,...,Rn)
where x1,...,xn are the weights, subject to the constraint that they sum to 1, and ' means that the vector is transposed.
Now I am wondering a bit about the numpy dot function, because
returns = np.dot(expected_returns_annual, weights)
and
returns = np.dot(expected_returns_annual, weights.T)
give the same results.
I tested also the shape of weights.T and weights.
weights.shape
Out[58]: (5,)
weights.T.shape
Out[59]: (5,)
The shape of weights.T should be (,5) and not (5,), but numpy displays them as equal (I also tried np.transpose, but there is the same result)
Does anybody know why numpy behaves this way? In my opinion, np.dot automatically shapes the vector the right way so that the vector product works. Is that correct?
Best regards
Tom

The semantics of np.dot are not great
As Dominique Paul points out, np.dot has very heterogeneous behavior depending on the shapes of the inputs. Adding to the confusion, as the OP points out in his question, given that weights is a 1D array, np.array_equal(weights, weights.T) is True (array_equal tests for equality of both value and shape).
Recommendation: use np.matmul or the equivalent @ instead
If you are someone just starting out with Numpy, my advice to you would be to ditch np.dot completely. Don't use it in your code at all. Instead, use np.matmul, or the equivalent operator @. The behavior of @ is more predictable than that of np.dot, while still being convenient to use. For example, you would get the same dot product for the two 1D arrays you have in your code like so:
returns = expected_returns_annual @ weights
You can prove to yourself that this gives the same answer as np.dot with this assert:
assert expected_returns_annual @ weights == expected_returns_annual.dot(weights)
Conceptually, @ handles this case by promoting the two 1D arrays to appropriate 2D arrays (though the implementation doesn't necessarily do this). For example, if you have x with shape (N,) and y with shape (M,), if you do x @ y the shapes will be promoted such that:
x.shape == (1, N)
y.shape == (M, 1)
Complete behavior of matmul/@
Here's what the docs have to say about matmul/@ and the shapes of inputs/outputs:
If both arguments are 2-D they are multiplied like conventional matrices.
If either argument is N-D, N > 2, it is treated as a stack of matrices residing in the last two indexes and broadcast accordingly.
If the first argument is 1-D, it is promoted to a matrix by prepending a 1 to its dimensions. After matrix multiplication the prepended 1 is removed.
If the second argument is 1-D, it is promoted to a matrix by appending a 1 to its dimensions. After matrix multiplication the appended 1 is removed.
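To make these four rules concrete, here is a small sketch that only inspects result shapes. The arrays x, M and stack are made-up examples, not taken from the question.

```python
import numpy as np

x = np.arange(3.0)                  # 1-D, shape (3,)
M = np.arange(6.0).reshape(2, 3)    # 2-D, shape (2, 3)
stack = np.ones((4, 2, 3))          # a stack of four 2x3 matrices

shape_vec_vec = (x @ x).shape            # 1-D @ 1-D: both promoted, then both 1s removed
shape_mat_vec = (M @ x).shape            # 2-D @ 1-D: x treated as a column, trailing 1 removed
shape_vec_mat = (x @ M.T).shape          # 1-D @ 2-D: x treated as a row, leading 1 removed
shape_stack_vec = (stack @ x).shape      # N-D @ 1-D: applied to every matrix in the stack
shape_stack_mat = (stack @ M.T).shape    # N-D @ 2-D: M.T is broadcast across the stack

print(shape_vec_vec, shape_mat_vec, shape_vec_mat, shape_stack_vec, shape_stack_mat)
```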
Notes: the arguments for using @ over dot
As hpaulj points out in the comments, np.array_equal(x.dot(y), x @ y) for all x and y that are 1D or 2D arrays. So why do I (and why should you) prefer @? I think the best argument for using @ is that it helps to improve your code in small but significant ways:
@ is explicitly a matrix multiplication operator. x @ y will raise an error if y is a scalar, whereas dot will make the assumption that you actually just wanted elementwise multiplication. This can potentially result in a hard-to-localize bug in which dot silently returns a garbage result (I've personally run into that one). Thus, @ allows you to be explicit about your own intent for the behavior of a line of code.
Because @ is an operator, it has some nice short syntax for coercing various sequence types into arrays, without having to explicitly cast them. For example, [0,1,2] @ np.arange(3) is valid syntax.
To be fair, while [0,1,2].dot(arr) is obviously not valid, np.dot([0,1,2], arr) is valid (though more verbose than using @).
When you do need to extend your code to deal with many matrix multiplications instead of just one, the ND cases for @ are a conceptually straightforward generalization/vectorization of the lower-D cases.
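A short sketch of the first two points, using a made-up array x. The exception type is caught broadly here because I have not checked which exception every NumPy version raises for a scalar operand.

```python
import numpy as np

x = np.arange(3.0)

# np.dot silently falls back to elementwise multiplication for a scalar.
dot_result = np.dot(x, 2.0)

# The @ operator refuses a scalar operand instead of guessing.
try:
    x @ 2.0
    matmul_error = None
except (ValueError, TypeError) as exc:
    matmul_error = exc

# A plain list on the left is coerced to an array by the operator.
list_result = [0, 1, 2] @ np.arange(3)

print(dot_result, type(matmul_error).__name__, list_result)
```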

I had the same question some time ago. It seems that when one of your matrices is one dimensional, then numpy will figure out automatically what you are trying to do.
The documentation for the dot function has a more specific explanation of the logic applied:
If both a and b are 1-D arrays, it is inner product of vectors
(without complex conjugation).
If both a and b are 2-D arrays, it is matrix multiplication, but using
matmul or a @ b is preferred.
If either a or b is 0-D (scalar), it is equivalent to multiply and
using numpy.multiply(a, b) or a * b is preferred.
If a is an N-D array and b is a 1-D array, it is a sum product over
the last axis of a and b.
If a is an N-D array and b is an M-D array (where M>=2), it is a sum
product over the last axis of a and the second-to-last axis of b:

In NumPy, a transpose .T reverses the order of dimensions, which means that it doesn't do anything to your one-dimensional array weights.
This is a common source of confusion for people coming from Matlab, in which one-dimensional arrays do not exist. See Transposing a NumPy Array for some earlier discussion of this.
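A minimal sketch of this, with a made-up weights vector w: transposing the 1-D array is a no-op, and you only get distinct "row" and "column" shapes once you add a second axis explicitly.

```python
import numpy as np

w = np.array([0.1, 0.2, 0.3, 0.4])

# Reversing the axes of a 1-D array leaves it unchanged.
same_shape = w.T.shape            # still (4,)

# Explicit 2-D versions do transpose into one another.
row = w[np.newaxis, :]            # shape (1, 4)
col = w[:, np.newaxis]            # shape (4, 1)

inner = w @ w                     # scalar inner product of the 1-D arrays
as_matrices = row @ col           # same value, but wrapped in a (1, 1) array

print(same_shape, row.T.shape, as_matrices.shape)
```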
np.dot(x,y) has complicated behavior on higher-dimensional arrays, but its behavior when it's fed two one-dimensional arrays is very simple: it takes the inner product. If we wanted to get the equivalent result as a matrix product of a row and column instead, we'd have to write something like
(x @ y[:, np.newaxis]).item()
adding a trailing dimension to y to turn it into a "column", multiplying, and then converting our one-element array back into a scalar. But np.dot(x,y) is much faster and more efficient, so we just use that.
Edit: actually, this was dumb on my part. You can, of course, just write matrix multiplication x @ y to get equivalent behavior to np.dot for one-dimensional arrays, as tel's excellent answer points out.

The shape of weights.T should be (,5) and not (5,),
suggests some confusion over the shape attribute. shape is an ordinary Python tuple, i.e. just a set of numbers, one for each dimension of the array. That's analogous to the size of a MATLAB matrix.
(5,) is just the way of displaying a 1 element tuple. The , is required because of older Python history of using () as a simple grouping.
In [22]: tuple([5])
Out[22]: (5,)
Thus the , in (5,) does not have a special numpy meaning, and
In [23]: (,5)
File "<ipython-input-23-08574acbf5a7>", line 1
(,5)
^
SyntaxError: invalid syntax
A key difference between numpy and MATLAB is that arrays can have any number of dimensions (up to 32). MATLAB has a lower boundary of 2.
The result is that a 5 element numpy array can have shapes (5,), (1,5), (5,1), (1,5,1), etc.
The handling of a 1d weight array in your example is best explained by the np.dot documentation. Describing it as inner product seems clear enough to me. But I'm also happy with the
sum product over the last axis of a and the second-to-last axis of b
description, adjusted for the case where b has only one axis.
(5,) with (5,n) => (n,) # 5 is the common dimension
(n,5) with (5,) => (n,)
(n,5) with (5,1) => (n,1)
In:
(x1,...,xn' * (R1,...,Rn)
are you missing a )?
(x1,...,xn)' * (R1,...,Rn)
And the * means matrix product? Not elementwise product (.* in MATLAB)? (R1,...,Rn) would have size (n,1). (x1,...,xn)' size (1,n). The product (1,1).
By the way, that raises another difference. MATLAB expands dimensions to the right (n,1,1...). numpy expands them to the left (1,1,n) (if needed by broadcasting). The initial dimensions are the outermost ones. That's not as critical a difference as the lower size 2 boundary, but shouldn't be ignored.

Seamlessly solve square linear system that could be 1-dimensional in numpy

I am solving a linear system of equations Ax=b.
It is known that A is square and of full rank, but it is the result of a few matrix multiplications, say A = numpy.dot(C,numpy.dot(D,E)) in which the result can be 1x1 depending on the inputs C,D,E. In that case A is a float.
b is ensured to be a vector, even when it is a 1x1 one.
I am currently doing
A = numpy.dot(C,numpy.dot(D,E))
try:
    x = numpy.linalg.solve(A,b)
except:
    x = b[0] / A
I searched numpy's documentation and didn't find other alternatives for solve and dot that would accept scalars for the first or output arrays for the second. Actually numpy.linalg.solve requires dimension at least 2. If we were going to produce an A = numpy.array([5]) it would complain too.
Is there some alternative that I missed?

in which the result can be 1x1 depending on the inputs C,D,E. In that case A is a float.
This is not true; it is a 1x1 matrix, as expected:
x=np.array([[1,2]])
z=x.dot(x.T) # 1x2 matrix times 2x1
print(z.shape) # (1, 1)
which works just fine with linalg.solve
linalg.solve(z, z) # returns [[1]], as expected

While you could expand the dimensions of A:
A = numpy.atleast_2d(A)
it sounds like A never should have been a float in the first place, and you should instead fix whatever is causing it to be one.
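If A really can arrive as a scalar, a minimal sketch of the atleast_2d approach could look like the following. The helper name solve_maybe_scalar is hypothetical and only meant to show the idea, not a recommended API.

```python
import numpy as np

def solve_maybe_scalar(A, b):
    """Solve A x = b, accepting a scalar A as a 1x1 system (hypothetical helper)."""
    A2 = np.atleast_2d(A)   # scalar -> shape (1, 1); 2-D input is left unchanged
    b1 = np.atleast_1d(b)   # make sure b is at least a length-1 vector
    return np.linalg.solve(A2, b1)

# Scalar A with a length-1 right-hand side.
x_scalar = solve_maybe_scalar(4.0, np.array([2.0]))

# Ordinary 2x2 system goes through the same code path.
x_matrix = solve_maybe_scalar(np.array([[2.0, 0.0], [0.0, 4.0]]), np.array([2.0, 4.0]))

print(x_scalar, x_matrix)
```

This avoids the bare try/except from the question, but as noted above it is still worth finding out why A collapses to a float in the first place.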

Preserving dimensions when slicing symbolic block matrices in sympy

I am using sympy (python 3.6, sympy 1.0) to facilitate the calculation of matrix-transformations in mathematical proofs.
To calculate the Schur complements it is necessary to slice a block-matrix consisting of symbolic matrices.
As directly addressing the matrix with:
M[0:1,1]
is not working, I tried sympy.matrices.expressions.blockmatrix.blocks. Unfortunately, blocks messes up the dimensions of the matrices when addressing a range of blocks:
from sympy import *
n = Symbol('n')
Aj = MatrixSymbol('Aj', n,n)
M = BlockMatrix([[Aj, Aj],[Aj, Aj]])
M1 = M.blocks[0:1,0:1]
M2 = M.blocks[0,0]
print(M1.shape)
print(M2.shape)
M.blocks returns a matrix with the dimension 1,1 for the matrix M1 while the matrix M2 has the right dimension n,n.
Any suggestions on how to get the right dimensions when using an interval?

The method blocks returns an ImmutableMatrix object, not a BlockMatrix object. Here it is for reference:
def blocks(self):
    from sympy.matrices.immutable import ImmutableMatrix
    mats = self.args
    data = [[mats[i] if i == j else ZeroMatrix(mats[i].rows, mats[j].cols)
             for j in range(len(mats))]
            for i in range(len(mats))]
    return ImmutableMatrix(data)
The shape of an ImmutableMatrix object is determined by the number of symbols it contains; the structure of symbols is not taken into account. Hence, you get (1,1).
When using M.blocks[0,0] you access an element of the matrix, which is Aj. This is known as a MatrixSymbol, so the shape works as expected.
When using M.blocks[0:1, 0:1] you slice a SymPy matrix. Slicing always returns a submatrix, even if the size of the slice is 1 by 1. So you get an ImmutableMatrix with one entry, Matrix([[Aj]]). As said above, the shape of this thing is (1,1) since there is no recognition of the block structure.
As user2357112 suggested, converting the sliced output of blocks into a BlockMatrix causes the shape to be determined on the basis of the shape of Aj:
>>> M3 = BlockMatrix(M.blocks[0:, 0:1])
>>> M3.shape
(2*n, n)
It's often useful to check the type of objects that behave in unexpected ways: e.g., type(M1) and type(M2).

numpy einsum with '...'

The code below is meant to conduct a linear coordinate transformation on a set of 3d coordinates. The transformation matrix is A, and the array containing the coordinates is x. The zeroth axis of x runs over the dimensions x, y, z. It can have any arbitrary shape beyond that.
Here's my attempt:
A = np.random.random((3, 3))
x = np.random.random((3, 4, 2))
x_prime = np.einsum('ij,j...->i...', A, x)
The output is:
x_prime = np.einsum('ij,j...->i...', A, x)
ValueError: operand 0 did not have enough dimensions
to match the broadcasting, and couldn't be extended
because einstein sum subscripts were specified at both
the start and end
If I specify the additional subscripts in x explicitly, the error goes away. In other words, the following works:
x_prime = np.einsum('ij,jkl->ikl', A, x)
I'd like x to be able to have any arbitrary number of axes after the zeroth axis, so the workaround I gave above is not optimal. I'm actually not sure why the first einsum example is not working. I'm using numpy 1.6.1. Is this a bug, or am I misunderstanding the documentation?

Yep, it's a bug. It was fixed in this pull request: https://github.com/numpy/numpy/pull/4099
This was only merged a month ago, so it'll be a while before it makes it to a stable release.
EDIT: As @hpaulj mentions in the comments, you can work around this limitation by adding an ellipsis even when all indices are specified:
np.einsum('...ij,j...->i...', A, x)
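For anyone on a NumPy release that includes the fix, here is a small sketch for checking both spellings against np.tensordot, which performs the same contraction (last axis of A with first axis of x). The random data and the names x_prime, x_workaround and x_ref are only for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.random((3, 3))
x = rng.random((3, 4, 2))

# Original form from the question (fails on numpy 1.6.1, works after the fix).
x_prime = np.einsum('ij,j...->i...', A, x)

# Workaround form with an ellipsis on both operands.
x_workaround = np.einsum('...ij,j...->i...', A, x)

# Independent reference: contract A's last axis with x's first axis.
x_ref = np.tensordot(A, x, axes=1)

print(x_prime.shape)
```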

How to create the histogram of an array with masked values, in Numpy?

In Numpy 1.4.1, what is the simplest or most efficient way of calculating the histogram of a masked array? numpy.histogram and pyplot.hist do count the masked elements, by default!
The only simple solution I can think of right now involves creating a new array with the non-masked value:
histogram(m_arr[~m_arr.mask])
This is not very efficient, though, as this unnecessarily creates a new array. I'd be happy to read about better ideas!

(Undeleting this as per discussion above...)
I'm not sure whether or not the numpy developers would consider this a bug or expected behavior. I asked on the mailing list, so I guess we'll see what they say.
Either way, it's an easy fix. Patching numpy/lib/function_base.py to use numpy.asanyarray rather than numpy.asarray on the inputs to the function will allow it to properly use masked arrays (or any other subclass of an ndarray) without creating a copy.
Edit: It seems like it is expected behavior. As discussed here:
If you want to ignore masked data it's
just on extra function call
histogram(m_arr.compressed())
I don't think the fact that this makes
an extra copy will be relevant,
because I guess full masked array
handling inside histogram will be a
lot more expensive.
Using asanyarray would also allow
matrices in and other subtypes that
might not be handled correctly by the
histogram calculations.
For anything else besides dropping
masked observations, it would be
necessary to figure out what the
masked array definition of a histogram
is, as Bruce pointed out.

Try hist(m_arr.compressed()).

This is a super old question, but these days I just use:
numpy.histogram(m_arr, bins=.., range=.., density=False, weights=m_arr_mask)
Where m_arr_mask is an array with the same shape as m_arr, consisting of 0 values for elements of m_arr to be excluded from the histogram and 1 values for elements that are to be included.
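A minimal sketch comparing the two approaches suggested so far, compressed() and a 0/1 weights array, on made-up data. The bin count and range are arbitrary choices for the example, and the weights are built as floats from the inverted mask.

```python
import numpy as np

data = np.array([0.5, 1.5, 2.5, 3.5, 1.2, 2.8])
m_arr = np.ma.masked_array(data, mask=[False, True, False, False, True, False])
bins, value_range = 4, (0.0, 4.0)

# Approach 1: drop the masked entries (makes a copy), then histogram.
counts_compressed, edges = np.histogram(m_arr.compressed(), bins=bins, range=value_range)

# Approach 2: histogram the raw data with weight 1 for valid and 0 for masked entries.
m_arr_mask = (~np.ma.getmaskarray(m_arr)).astype(float)
counts_weighted, _ = np.histogram(m_arr.data, bins=bins, range=value_range, weights=m_arr_mask)

print(counts_compressed, counts_weighted)
```

Note that the weighted version returns float counts, and the range is given explicitly in both calls so that masked values cannot influence the bin edges.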

After running into casting issues by trying Erik's solution (see https://github.com/numpy/numpy/issues/16616), I decided to write a numba function to achieve this behavior.
Some of the code was inspired by https://numba.pydata.org/numba-examples/examples/density_estimation/histogram/results.html. I added the mask bit.
import numpy
import numba
@numba.jit(nopython=True)
def compute_bin(x, bin_edges):
    # assuming uniform bins for now
    n = bin_edges.shape[0] - 1
    a_min = bin_edges[0]
    a_max = bin_edges[-1]
    # special case to mirror NumPy behavior for last bin
    if x == a_max:
        return n - 1 # a_max always in last bin
    bin = int(n * (x - a_min) / (a_max - a_min))
    if bin < 0 or bin >= n:
        return None
    else:
        return bin
@numba.jit(nopython=True)
def masked_histogram(img, bin_edges, mask):
    hist = numpy.zeros(len(bin_edges) - 1, dtype=numpy.intp)
    for i, value in enumerate(img.flat):
        if mask.flat[i]:
            bin = compute_bin(value, bin_edges)
            if bin is not None:
                hist[int(bin)] += 1
    return hist # , bin_edges
The speedup is significant. On a (1000, 1000) image:
