Applying function on multiple dimensions of higher dimensional array

Suppose you have a higher dimensional array (3 or more dimensions) which is composed of a series of 2d images. If this array is called x, then a single 2d image is x[0,0,:,:]. What I want to do is apply a function that takes in a 2d image and outputs a scalar across this array, so that the result has two fewer dimensions than the original. How would I do such a thing?
In other words, what is the faster numpy way of doing this: np.array([[f(x[i,j,:,:]) for j in range(x.shape[1])] for i in range(x.shape[0])]), for a list of axes and some function f that takes in an array?
I've looked at numpy.apply_along_axis, but that only acts on 1d slices and the shape must be identical. numpy.apply_over_axes also doesn't work, since it doesn't reduce the number of dimensions given to the function (it gives my function a 4d array, not a 2d array which I can work with). numpy.vectorize doesn't work because it never applies the function to more than one element at once.
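
A possible sketch, not taken from the original thread: collapse the leading axes with reshape, apply f to each 2d image, then restore the leading shape. The helper name apply_over_images and the example f are hypothetical. This is still a Python-level loop, so it is tidier rather than faster. When f is built from NumPy reductions, passing a tuple of axes such as axis=(-2, -1) avoids the loop entirely.

```python
import numpy as np

def apply_over_images(f, x):
    """Apply f (2d array -> scalar) to every trailing 2d slice of x."""
    lead_shape = x.shape[:-2]
    # Collapse all leading axes into one so a single loop visits every image.
    images = x.reshape(-1, *x.shape[-2:])
    out = np.array([f(img) for img in images])
    # Restore the leading axes; the two image axes are gone.
    return out.reshape(lead_shape)

x = np.arange(2 * 3 * 4 * 5, dtype=float).reshape(2, 3, 4, 5)
f = lambda img: img.max() - img.min()  # hypothetical example function

result = apply_over_images(f, x)
# Same computation without a Python loop, possible because f is a reduction.
vectorized = x.max(axis=(-2, -1)) - x.min(axis=(-2, -1))

print(result.shape)  # (2, 3)
```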

Related

Numpy shape function for single dimension array

I've started to learn NumPy. When I create an array and then check its .shape attribute, I understand the result in most cases. However, the result does not make sense to me for a single-dimensional array. Can someone please explain the outcome?
array = np.array([4,5,6])
print(array.shape)
The outcome is (3,)

The tuple of ints returned by np.shape gives the length of each array dimension. For a 2d array this tuple is (n, m), in which n and m indicate the number of rows and columns, respectively. For a single-dimension array, the tuple is just (n,), in which n indicates the number of array elements.
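
A small illustration of the point above (the variable names are only for this example): the shape tuple has one entry per dimension, so a 1d array gets a one-element tuple.

```python
import numpy as np

vector = np.array([4, 5, 6])               # one axis
matrix = np.array([[1, 2, 3], [4, 5, 6]])  # two axes: 2 rows, 3 columns

print(vector.shape, vector.ndim)  # (3,) 1
print(matrix.shape, matrix.ndim)  # (2, 3) 2
```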

What does (n,) mean in the context of numpy and vectors?

I've tried searching StackOverflow, googling, and even using symbolhound to do character searches, but was unable to find an answer. Specifically, I'm confused about Ch. 1 of Nielsen's Neural Networks and Deep Learning, where he says "It is assumed that the input a is an (n, 1) Numpy ndarray, not a (n,) vector."
At first I thought (n,) referred to the orientation of the array - so it might refer to a one-column vector as opposed to a vector with only one row. But then I don't see why we need (n,) and (n, 1) both - they seem to say the same thing. I know I'm misunderstanding something but am unsure.
For reference a refers to a vector of activations that will be input to a given layer of a neural network, before being transformed by the weights and biases to produce the output vector of activations for the next layer.
EDIT: This question equivocates between a "one-column vector" (there's no such thing) and a "one-column matrix" (does actually exist). Same for "one-row vector" and "one-row matrix".
A vector is only a list of numbers, or (equivalently) a list of scalar transformations on the basis vectors of a vector space. A vector might look like a matrix when we write it out, if it only has one row (or one column). Confusingly, we will sometimes refer to a "vector of activations" but actually mean "a single-row matrix of activation values transposed so that it is a single-column."
Be aware that in neither case are we discussing a one-dimensional vector, which would be a vector defined by only one number (unless, trivially, n==1, in which case the concept of a "column" or "row" distinction would be meaningless).

In numpy an array can have a number of different dimensions, 0, 1, 2 etc.
The typical 2d array has dimension (n,m) (this is a Python tuple). We tend to describe this as having n rows, m columns. So a (n,1) array has just 1 column, and a (1,m) has 1 row.
But because an array may have just 1 dimension, it is possible to have a shape (n,) (Python notation for a 1-element tuple).
For many purposes (n,), (1,n), (n,1) arrays are equivalent (also (1,n,1,1) (4d)). They all have n terms, and can be reshaped to each other.
But sometimes that extra dimension of size 1 matters. A (1,m) array can multiply a (n,1) array to produce a (n,m) array. A (n,1) array can be indexed like a (n,m) array, with 2 indices, x[:,0], whereas a (n,) array only accepts one index, x[0].
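
A short sketch of those two cases, with sizes and names chosen only for illustration:

```python
import numpy as np

n, m = 3, 4
flat = np.arange(n)               # shape (3,)
col = flat.reshape(n, 1)          # shape (3, 1)
row = np.arange(m).reshape(1, m)  # shape (1, 4)

outer = col * row         # broadcasting gives shape (3, 4)
first_column = col[:, 0]  # two indices work on the 2d array
# flat[:, 0] would raise IndexError: a 1d array accepts only one index

print(outer.shape, first_column.shape)  # (3, 4) (3,)
```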
MATLAB matrices are always 2d (or higher). So people transferring ideas from MATLAB tend to expect 2 dimensions. There is a np.matrix subclass that is supposed to imitate that.
For numpy programmers the distinctions between vector, row vector, column vector, matrix are loose and relatively unimportant. Or the use is derived from the application rather than from numpy itself. I think that's what's happening with this network book - the notation and expectations come from outside of numpy.
See as well this answer for how to interpret the shapes with respect to the data stored in ndarrays. It also provides insight on how to use .reshape: https://stackoverflow.com/a/22074424/3277902

(n,) is a tuple of length 1, whose only element is n. (The syntax isn't (n) because that's just n instead of making a tuple.)
If an array has shape (n,), that means it's a 1-dimensional array with a length of n along its only dimension. It's not a row vector or a column vector; it doesn't have rows or columns. It's just a vector.
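
To make the difference concrete, here is a minimal check (the value of n is arbitrary). Transposing shows that the 1d array has no row or column orientation to swap.

```python
import numpy as np

n = 5
print(type((n,)), type((n)))  # <class 'tuple'> <class 'int'>

a = np.zeros((n,))    # 1d: no rows or columns
b = np.zeros((n, 1))  # 2d: n rows, 1 column

print(a.shape, b.shape)      # (5,) (5, 1)
print(a.T.shape, b.T.shape)  # (5,) (1, 5)
```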

Multiplying 3D matrix with 2D matrix

I have two matrices to multiply. One is the weight matrix W, whose size is 900x2x2. Another is input matrix I, whose size is 2x2.
I want to perform a summation over c = WI which will be a 900x1 matrix, but when I perform the operation it multiplies them and gives me a 900x2x2 matrix again.
Question #2 (related): So I made both of them 2D and multiplied 900x4 * 4x1, but that gives me an error saying:
ValueError: operands could not be broadcast together with shapes (900,4) (4,1)

It seems you are trying to sum-reduce the last two axes of the first (weight) array against the only two axes of the second (input) array. We could translate that idea into NumPy code with np.tensordot, assuming arr1 and arr2 as the input arrays respectively, like so -
np.tensordot(arr1,arr2,axes=([1,2],[0,1]))
Another, simpler way to put this into NumPy code would be with np.einsum, like so -
np.einsum('ijk,jk',arr1,arr2)
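
A quick check of both expressions on random data, assuming the intended result is c[i] = sum over j,k of W[i,j,k] * I[j,k] (the random arrays are only stand-ins for the real weights and input). It also covers Question #2: the broadcast error comes from using *, which is elementwise; the matrix product of the flattened arrays uses @ or np.dot.

```python
import numpy as np

rng = np.random.default_rng(0)
W = rng.standard_normal((900, 2, 2))  # stand-in weights
I = rng.standard_normal((2, 2))       # stand-in input

c_tensordot = np.tensordot(W, I, axes=([1, 2], [0, 1]))
c_einsum = np.einsum('ijk,jk', W, I)

# Flattened 2d form from Question #2: @ is the matrix product, * is elementwise.
c_matmul = W.reshape(900, 4) @ I.reshape(4, 1)

print(c_tensordot.shape, c_einsum.shape, c_matmul.shape)  # (900,) (900,) (900, 1)
```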

Concatenating numpy arrays of different shapes

I have several N-dimensional arrays of different shapes and want to combine them into a new (N+1)-dimensional array, where the new axis has a length corresponding to the number of initial N-d arrays.
This answer is sufficient if the original arrays are all the same shape; however, it does not work if they have different shapes.
I don't really want to reshape the arrays to a congruent size and fill with empty elements due to the subsequent analysis I need to perform on the final array.
Specifically, I have four 4D arrays. One of the things I want to do with the resulting 5D array is plot parts of the four arrays on the same matplotlib figure. Obviously I could plot each one separately, however soon I will have more than four 4D arrays and am looking for a dynamic solution.

While I was writing this, Sven gave the same answer in the comments...
Put the arrays in a python list in the following manner:
# Python names cannot start with a digit, hence list_5d rather than 5d_list.
list_5d = []
list_5d.append(array_4d_1)
list_5d.append(array_4d_2)
...
Then you can unpack them:
for array_4d in list_5d:
    pass  # plot array_4d on the figure here

Axis elimination

I'm having trouble understanding the concept of axis elimination in numpy. Suppose I have the following 2D matrix:
A =
1 2 3
3 4 5
6 7 8
Ok I understand that sum(A, axis=0) will sum each column down and will give a 1D array with 3 elements. I also understand that sum(A, axis=1) will sum each row.
But my trouble is when I read that axis=0 eliminates the 0th axis and axis=1 eliminates the 1st axis. Also, people sometimes say "reduce" instead of "eliminate". I can't see what is being eliminated. For example, sum(A, axis=0) will sum each column from top to bottom, but I don't see elimination or reduction here. What's the point? The same goes for sum(A, axis=1).
And how does it work for higher dimensions?
P.S. I always get confused between matrix dimensions and array dimensions. I wish the people who write the numpy documentation made this distinction very clear.

http://docs.scipy.org/doc/numpy/reference/generated/numpy.ufunc.reduce.html
Reduces a's dimension by one, by applying ufunc along one axis.
For example, add.reduce() is equivalent to sum().
In numpy, the base class is ndarray, a multidimensional array (it can be 0d, 1d, or more).
http://docs.scipy.org/doc/numpy/reference/arrays.ndarray.html
Matrix is a subclass of array
http://docs.scipy.org/doc/numpy/reference/arrays.classes.html
Matrix objects are always two-dimensional
The history of the numpy Matrix is old, but basically it's meant to resemble the MATLAB matrix object. In the original MATLAB nearly everything was a matrix, which was always 2d. Later they generalized it to allow more dimensions. But it can't have fewer dimensions. MATLAB does have 'vectors', but they are just matrices with one dimension being 1 (row vector versus column vector).
'axis elimination' is not a common term when working with numpy. It could, conceivably, refer to any of several ways that reduce the number of dimensions of an array. Reduction, as in sum(), is one. Indexing is another: a[:,0,:]. Reshaping can also change the number of dimensions. np.squeeze is another.
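
A short sketch of those dimension-reducing operations, using the matrix A from the question and an arbitrary 3d array B. Watching the shapes shows what "eliminating" an axis means: the summed axis disappears from the shape tuple.

```python
import numpy as np

A = np.array([[1, 2, 3],
              [3, 4, 5],
              [6, 7, 8]])

col_sums = A.sum(axis=0)  # axis 0 is summed away: shape (3, 3) -> (3,)
row_sums = A.sum(axis=1)  # axis 1 is summed away: shape (3, 3) -> (3,)
print(col_sums.tolist(), row_sums.tolist())  # [10, 13, 16] [6, 12, 21]

B = np.ones((2, 3, 4))
print(B.sum(axis=1).shape)                 # (2, 4): the length-3 axis is gone
print(np.add.reduce(B, axis=1).shape)      # (2, 4): same reduction as sum
print(B[:, 0, :].shape)                    # (2, 4): integer indexing drops an axis
print(np.squeeze(np.ones((2, 1, 4))).shape)  # (2, 4): squeeze drops length-1 axes
print(B.sum(axis=1, keepdims=True).shape)  # (2, 1, 4): axis kept with length 1
```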
