Role of 'axes' in keras.backend.batch_dot

Role of 'axes' in keras.backend.batch_dot - python

I have a custom layer to multiply two tensors A & B of size (x,1) & (1,y), where I want to produce an output C of size (x,y).
To take into account batching i.e. matrices size are actually (?,x,1) & (?,1,y), I am calling:
C = K.batch_dot(A,B, axes = [2,1])
This seems to producing the desired output, but I don't really understand what the axes variable represents here. My intuition is that these are the axes over which we want to perform the matrix multiplication, but I don't understand why it is in the order [2,1] rather than [1,2] (which produced an error).
Can anyone assist me in my understanding?

As per the official documentation here
The lengths of axes[0] and axes[1] should be the same
In your case A has dimensions (?, x, 1) and B has dimensions (?, 1, y).
So its quite clear that from axis = [2, 1], second dimension of A i.e. 1 equals first dimensions of B i.e. 1 (axis dims starts from 0) and produces the desired results.

Related

Intuition behind the correlation

I'm following this tutorial online from kaggle and I can't get my head round why .T is changing the shape of the matrix. Here is the part I am stuck at:
#saleprice correlation matrix
k = 10 #number of variables for heatmap
cols = corrmat.nlargest(k, 'SalePrice')['SalePrice'].index
cm = np.corrcoef(df_train[cols].values.T)
sns.set(font_scale=1.25)
hm = sns.heatmap(cm, cbar=True, annot=True, square=True, fmt='.2f', annot_kws={'size': 10}, yticklabels=cols.values, xticklabels=cols.values)
plt.show()
I'm basically trouble shooting the code and tried this:
cm = np.corrcoef(df_train[cols].values)
cm.shape
returns a matrix with shape 1460x1460. But when I input:
cm = np.corrcoef(df_train[cols].values.T)
cm.shape
it returns a matrix with shape 10x10. Does anyone know why it does this? I can't figure out.

The correlation gives you a normalized representation of the covariance matrix between all the "columns" of the dataframe. For instance, in the case of having only two variables, you'd end up with a matrix of the shape:
Rx = [[ 1, r_xy],
[r_yx, 1]]
This is quite an expensive computation, since it involves taking the dot product of each column with the rest, resulting in a correlation coefficient for each combination.
So in matrix notation, since you want to end up with a 10x10 matrix, you want to have the shapes correctly aligned. In this case you want (10,1460)x(1460,10) so you get a 10,10 matrix. Hence you need to transpose the 2D-array so that it has shape (10,1460) when you feed it to np.corrcoef.
Though you might find it a little easier by playing around with it yourself and seeing how the actual Pearson correlation is computed:
X = np.random.randint(0,10,(500,2))
print(np.corrcoef(X.T))
array([[1. , 0.04400245],
[0.04400245, 1. ]])
Which is doing the same as:
mean_X = X.mean(axis=0)
std_X = X.std(axis=0)
n, _ = X.shape
print((X.T-mean_X[:,None]).dot(X-mean_X)/(n*std_X**2))
array([[1. , 0.04416552],
[0.04383998, 1. ]])
Note that as mentioned, this is giving as result a normalized dot product of X with itself, so for each (1,1460)x(1460,1) product your getting a single number. So X here, just as in your example, has to be transposed so the dimensions are correctly aligned.

From numpy documentation of corrcoef:
x : array_like
A 1-D or 2-D array containing multiple variables and observations.
Each row of x represents a variable, and
each column a single observation of all those variables. Also see rowvar below.
Note that each row represents a variable, in the first case you have 1460 rows and 10 columns and in the second one you have 10 rows with 1460 columns.
So when you transpose your NumPy array your basically changing from 1460 variables with 10 values for each one to 10 variables with 1460 values for each one.
If you are dealing with pandas you could just use the built-in .corr() method that computes the correlation between columns.

Numpy dot product problems

A=np.array([
[1,2],
[3,4]
])
B=np.ones(2)
A is clearly of shape 2X2
How does numpy allow me to compute a dot product np.dot(A,B)
[1,2] (dot) [1,1]
[3,4]
B has to have dimensions of 2X1 for a dot product or rather this
[1,2] (dot) [1]
[3,4] [1]
This is a very silly question but i am not able to figure out where i am going wrong here?
Earlier i used to think that np.ones(2) would give me this:
[1]
[1]
But it gives me this:
[1,1]

I'm copying part of an answer I wrote earlier today:
You should resist the urge to think of numpy arrays as having rows
and columns, but instead consider them as having dimensions and
shape. This is an important point which differentiates np.array and np.matrix:
x = np.array([1, 2, 3])
print(x.ndim, x.shape) # 1 (3,)
y = np.matrix([1, 2, 3])
print(y.ndim, y.shape) # 2 (1, 3)
An n-D array can only use n integer(s) to represent its shape.
Therefore, a 1-D array only uses 1 integer to specify its shape.
In practice, combining calculations between 1-D and 2-D arrays is not
a problem for numpy, and syntactically clean since # matrix
operation was introduced in Python 3.5. Therefore, there is rarely a
need to resort to np.matrix in order to satisfy the urge to see
expected row and column counts.

This behavior is by design. The NumPy docs state:
If a is an N-D array and b is a 1-D array, it is a sum product over the last axis of a and b.

Most of the rules for vector and matrix shapes relating to the dot product exist mostly in order to have a coherent method that scales up into higher tensor orders. But they aren't very important when dealing with 1st order (vectors) and 2nd order (matrix) tensors. And those orders are what the vast majority of numpy users need.
As a result, # and np.dot are optimized (both mathematically and input parsing) for those orders, always summing over the last axis of the first and the second to last axis (if applicable) of the second. The "if applicable" is sort of an idiot-proofing to assure the output is what is expected in the vast majority of cases, even if the shapes don't technically fit.
Those of us who use higher-order tensors, meanwhile, are relegated to np.tensordot or np.einsum, which come complete with all the niggling little rules about dimension matching.

How to label regions that they do not have same values?

I have stacked 5 probability map in a numpy array (a with the shape 256x256x5), that I have stacked them and then I get the argmax of all of them that final output is show by different 5 colors, however, the values correspond to a pixel within an area are not same (values are changing between [0,1]).
max_= np.argmax(a, axis=2)
plt.imshow(max_)
plt.show()
I do not know how to separate each object by value, because pixels inside a region do not have same values. Does someone know how to label this five objects (colored parts and including background)?

If I understand the question, you want the maximum probabilities themselves, not the indices of the maximum probabilities. (Small point: if you array really is shape 5 × 256 × 256, then I think you did np.argmanx(a, axis=0) to get that result.)
This will give you the maximum probabilities themselves:
max_prob = np.amax(a, axis=0)
If you want each 'object' on its own, you could then do this for each of the regions:
prob_1 = np.zeros((256, 256))
prob_1[max_ == 1] = max_prob[max_ == 1]
prob_1[prob_1 == 0] = np.nan

List of simple arrays with pyplot.plot

I have some trouble to understand how pyplot.plot works.
I take a simple example: I want to plot pyplot.plot(lst2, lst2) where lst2 is a list.
The difficulty comes from the fact that each element of lst2 is an array of shape (1,1). If the elements were floating and not array, there would be no problems.
import numpy as np
import matplotlib.pyplot as plt
%matplotlib inline
V2 = np.array([[1]])
W2 = np.array([[2]])
print('The shape of V2 is', V2.shape)
print('The shape of W2 is', W2.shape)
lst2 = [V2, W2]
plt.plot(lst2, lst2)
plt.show
Below is the end of the error message I got:
~\Anaconda3\lib\site-packages\matplotlib\axes\_base.py in _xy_from_xy(self,x, y)
245 if x.ndim > 2 or y.ndim > 2:
246 raise ValueError("x and y can be no greater than 2-D, but have "
--> 247 "shapes {} and {}".format(x.shape, y.shape))
248
249 if x.ndim == 1:
ValueError: x and y can be no greater than 2-D, but have shapes (2, 1, 1) and (2, 1, 1)
What surprised me in the error message is the mention of an array of dimension (2,1,1). It seems like the array np.array([V2,W2]) is built when we call pyplot.plot.
My question is then what happens behind the scenes when we call pyplot.plot(x,y) with x and y list? It seems like an array with the elements of x is built (and same for y). And these arrays must have maximum 2 axis. Am I correct?
I know that if I use numpy.squeeze on V2 and W2, it would work. But I would like to understand what it happening inside pyplot.plot in the example I gave.

Take a closer look at what you're doing:
V2 = np.array([[1]])
W2 = np.array([[2]])
lst2 = [V2, W2]
plt.plot(lst2, lst2)
For some odd reason you're defining your arrays to be of shape (1,1) by using a nested pair of brackets. When you construct lst2, you stack your arrays along a new leading dimension. This has nothing do with pyplot, this is numpy.
Numpy arrays are rectangular, and they are compatible with lists of lists of ... of lists. The level of nesting determines the number of dimensions of an array. Look at a simple 2d example:
>>> M = np.arange(2*3).reshape(2,3)
>>> print(repr(M))
array([[0, 1, 2],
[3, 4, 5]])
You can for all intents and purposes think of this 2x3 matrix as two row vectors. M[0] is the same as M[0,:] and is the first row, M[1] is the same as M[1,:] is the second row. You could then also construct this array from the two rows in the following way:
row1 = [0, 1, 2]
row2 = [3, 4, 5]
lst = [row1, row2]
np.array(lst)
My point is that we took two flat lists of length 3 (which are compatible with 1d numpy arrays of shape (3,)), and concatenated them in a list. The result was compatible with a 2d array of shape (2,3). The "2" is due to the fact that we put 2 lists into lst, and the "3" is due to the fact that both lists had a length of 3.
So, when you create lst2 above, you're doing something that is equivalent to this:
lst2 = [ [[1]], [[2]] ]
You put two nested sublists into an array-compatible list, and both sublists are compatible with shape (1,1). This implies that you'll end up with a 3d array (in accordance with the fact that you have three opening brackets at the deepest level of nesting), with shape (2,1,1). Again the 2 comes from the fact that you have two arrays inside, and the trailing dimensions come from the contents.
The real question is what you're trying to do. For one, your data shouldn't really be of shape (1,1). In the most straightforward application of pyplot.plot you have 1d datasets: one for the x and one for the y coordinates of your plot. For this you can use a simple (flat) list or 1d array for both x and y. What matters is that they are of the same length.
Then when you plot the two against each other, you pass the x coordinates first, then the y coordinates second. You presumably meant something like
plt.plot(V2,W2)
In which case you'd pass 2d arrays to plot, and you wouldn't see the error caused by passing a 3d-array-like. However, the behaviour of pyplot.plot is non-trivial for 2d inputs (columns of both datasets will get plotted against one another), and you have to make sure that you really want to pass 2d arrays as inputs. But you almost never want to pass the same object as the first two arguments to pyplot.plot.

Why does indexing a matrix by an integer produce a different shape than the dot product with a one hot vector in numpy?

I have a matrix that I initialized with numpy.random.uniform like so:
W = np.random.uniform(-1, 1, (V,N))
In my case, V = 10000 and N = 50, x is a positive integer
When I multiply W by a one hot vector x_vec of dimension V X 1, like W.T.dot(x_vec), I get a column vector with a shape of (50,1). When I try to get the same vector by indexing W, as in W[x].T or W[x,:].T I get shape (50,).
Can anyone explain to me why these two expression return different shapes and if it's possible to return a (50,1) matrix (vector) with the indexing method. The vector of shape (50,) is problematic because it doesn't behave the same way as the (50,1) vector when I multiply it with other matrices, but I'd like to use indexing to speed things up a little.
*Sorry in advance if this question should be in a place like Cross Validated instead of Stack Exchange

They are different operations. matrix (in the maths sense) times matrix gives matrix, some of your matrices just happen to have width 1.
Indexing with an integer scalar eats the dimension you are indexing into. Once you are down to a single dimension, .T does nothing because it doesn't have enough axes to shuffle.
If you want to go from (50,) to (50, 1) shape-wise, the recipe is indexing with None like so v[:, None]. In your case you have at least two one-line options:
W[x, :][:, None] # or W[x][:, None] or
W[x:x+1, :].T # or W[x:x+1].T
The second-line option preserves the first dimension of W by requesting a subrange of length one. The first option can be contracted into a single indexing operation - thanks to #hpaulj for pointing this out - which gives the arguably most readable option:
W[x, :, None]
The first index (scalar integer x) consumes the first dimension of W, the second dimension (unaffected by :) becomes the first and None creates a new dimension on the right.

We Keep Coding

Python is a programming language that lets you work quickly and integrate systems more effectively.

Role of 'axes' in keras.backend.batch_dot - python

Related

Intuition behind the correlation

Numpy dot product problems

How to label regions that they do not have same values?

List of simple arrays with pyplot.plot

Why does indexing a matrix by an integer produce a different shape than the dot product with a one hot vector in numpy?

Categories

Resources