I have an array of shape [batch_size, N], for example:
[[1 2]
[3 4]
[5 6]]
and I need to create a three-index array with shape [batch_size, N, N] where, for every batch element, I have an N x N diagonal matrix whose diagonal is taken from the corresponding batch row. In this simple case, the result I am looking for is:
[
[[1,0],[0,2]],
[[3,0],[0,4]],
[[5,0],[0,6]],
]
How can I do this without for loops, exploiting vectorization? I guess it is an extension of dimension, but I cannot find the correct function to do this.
(I need it as I am working with tensorflow and prototyping with numpy).
Try it in tensorflow:
import tensorflow as tf
A = [[1,2],[3 ,4],[5,6]]
B = tf.matrix_diag(A)
print(B.eval(session=tf.Session()))
[[[1 0]
[0 2]]
[[3 0]
[0 4]]
[[5 0]
[0 6]]]
Approach #1
Here's a vectorized one with np.einsum for input array, a -
# Initialize o/p array
out = np.zeros(a.shape + (a.shape[1],),dtype=a.dtype)
# Get diagonal view and assign into it input array values
diag = np.einsum('ijj->ij',out)
diag[:] = a
Approach #2
Another based on slicing for assignment -
m,n = a.shape
out = np.zeros((m,n,n),dtype=a.dtype)
out.reshape(-1,n**2)[...,::n+1] = a
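For anyone who wants to try both approaches on the array from the question, here is a minimal runnable sketch. The function names batch_diag_einsum and batch_diag_stride are made up for this illustration; the bodies are the two snippets above wrapped in functions.

```python
import numpy as np

def batch_diag_einsum(a):
    """Approach #1: write into the diagonal view returned by np.einsum."""
    a = np.asarray(a)
    out = np.zeros(a.shape + (a.shape[1],), dtype=a.dtype)
    # 'ijj->ij' selects the diagonal of every (n, n) matrix as a writable view
    np.einsum('ijj->ij', out)[:] = a
    return out

def batch_diag_stride(a):
    """Approach #2: in a flattened n*n matrix, diagonal entries sit n + 1 apart."""
    a = np.asarray(a)
    m, n = a.shape
    out = np.zeros((m, n, n), dtype=a.dtype)
    # out is freshly allocated and contiguous, so reshape gives a view
    out.reshape(m, n * n)[:, ::n + 1] = a
    return out

a = np.array([[1, 2], [3, 4], [5, 6]])
print(batch_diag_einsum(a))
print(np.array_equal(batch_diag_einsum(a), batch_diag_stride(a)))
```

Both functions keep the input dtype, so an integer input gives an integer result.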
Using np.expand_dims with an element-wise product with np.eye
a = np.array([[1, 2],
[3, 4],
[5, 6]])
N = a.shape[1]
a = np.expand_dims(a, axis=1)
a*np.eye(N)
array([[[1., 0.],
[0., 2.]],
[[3., 0.],
[0., 4.]],
[[5., 0.],
[0., 6.]]])
Explanation
np.expand_dims(a, axis=1) adds a new axis to a, which will now be a (3, 1, 2) ndarray:
array([[[1, 2]],
[[3, 4]],
[[5, 6]]])
You can now multiply this array with a size N identity matrix, which you can generate with np.eye:
np.eye(N)
array([[1., 0.],
[0., 1.]])
Which will yield the desired output:
a*np.eye(N)
array([[[1., 0.],
[0., 2.]],
[[3., 0.],
[0., 4.]],
[[5., 0.],
[0., 6.]]])
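A small variation on the same idea, in case the float output is unwanted: passing dtype to np.eye keeps the result in the input's dtype, and a[:, None, :] is just a shorthand for np.expand_dims(a, axis=1). This is only a sketch of the broadcasting trick described above, not a different method.

```python
import numpy as np

a = np.array([[1, 2],
              [3, 4],
              [5, 6]])
N = a.shape[1]

# a[:, None, :] has shape (3, 1, 2), same as np.expand_dims(a, axis=1)
expanded = a[:, None, :]

# Broadcasting (3, 1, 2) against (2, 2) gives (3, 2, 2);
# the zeros of the identity matrix wipe out the off-diagonal entries.
out = expanded * np.eye(N, dtype=a.dtype)

print(out.shape)
print(out.dtype == a.dtype)
```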
You can use numpy.diag
m = [[1, 2],
[3, 4],
[5, 6]]
[np.diag(b) for b in m]
EDIT: The following plot shows the average execution time for the solution above (solid line) compared against @Divakar's (dashed line) for different batch sizes and different matrix sizes.
I don't believe you get much of an improvement, but this is based only on this simple metric.
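If you want to repeat this kind of comparison yourself, a rough timing harness could look like the sketch below. The batch size, matrix size, and repeat count are arbitrary assumptions, and the function names are made up here; the numbers you get will depend on your machine and NumPy version, so no timings are quoted.

```python
import timeit
import numpy as np

def diag_listcomp(a):
    # The np.diag solution above, stacked into a single array
    return np.array([np.diag(row) for row in a])

def diag_einsum(a):
    # The einsum-view approach from the other answer
    out = np.zeros(a.shape + (a.shape[1],), dtype=a.dtype)
    np.einsum('ijj->ij', out)[:] = a
    return out

rng = np.random.default_rng(0)
a = rng.random((1000, 8))  # assumed sizes: batch of 1000, N = 8

# Check that both give the same result before timing them
same = np.array_equal(diag_listcomp(a), diag_einsum(a))

t_list = timeit.timeit(lambda: diag_listcomp(a), number=20)
t_einsum = timeit.timeit(lambda: diag_einsum(a), number=20)
print(f"list comprehension: {t_list:.4f}s, einsum view: {t_einsum:.4f}s")
```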
You basically want a function that does the opposite of/reverses np.block(..)
I needed the same thing, so I wrote this little function:
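import numpy as np

def split_blocks(x, m=2, n=2):
    """
    Reverse the action of np.block(..)
    >>> x = np.random.uniform(-1, 1, (2, 18, 20))
    >>> assert (np.block(split_blocks(x, 3, 4)) == x).all()
    :param x: (.., M, N) input matrix to split into blocks
    :param m: number of row splits
    :param n: number of column splits
    :return: nested list (m lists of n blocks each) that np.block accepts
    """
    # np.asarray avoids a copy when possible (np.array(x, copy=False) raises in NumPy 2 if a copy is needed)
    x = np.asarray(x)
    nd = x.ndim
    *shape, nr, nc = x.shape
    # Split rows into (m, nr//m) and columns into (n, nc//n), then move the block indices to the front
    return list(map(list, x.reshape((*shape, m, nr//m, n, nc//n)).transpose(nd-2, nd, *range(nd-2), nd-1, nd+1)))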
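For the plain 2-D case, a similar round trip can be sketched with np.vsplit and np.hsplit. This is an alternative I am adding for illustration, not the function above; the name split_blocks_2d is made up, and it assumes the row and column counts divide evenly by m and n.

```python
import numpy as np

def split_blocks_2d(x, m=2, n=2):
    """Split a 2-D array into an m x n nested list of equally sized blocks."""
    x = np.asarray(x)
    # vsplit cuts the rows into m groups, hsplit cuts each group into n blocks
    return [np.hsplit(rows, n) for rows in np.vsplit(x, m)]

x = np.arange(24).reshape(4, 6)
blocks = split_blocks_2d(x, m=2, n=3)

print(blocks[0][0])                         # top-left 2 x 2 block
print(np.array_equal(np.block(blocks), x))  # np.block puts it back together
```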
I have a question about typed array initialization in IronPython. I want to initialize a typed two-dimensional array inline in IronPython.
In IronPython I discovered how to initialize a simple typed array:
pythonTypedArray = Array[int]([0, 1, 2, 3, 4])
and how to initialize a typed array of arrays:
pythonTypedArrayOfArrays = Array[Array[int]]([Array[int]([0, 1]), Array[int]([2, 3])])
For example, in C# I can do it like so:
int[,] twoDimensionalArray = new int[,] { {0, 1, 2, 3, 4}, {5, 6, 7, 8, 9} };
Can I initialize a two-dimensional typed array inline in IronPython? If not, what is the best way to initialize a two-dimensional typed array in IronPython?
So, you need a matrix. In Python it can be achieved by doing this:
matrix = [[0 for x in range(w)] for y in range(h)]
Or using numpy (you can install it by running pip install numpy):
>>> import numpy
>>> numpy.zeros((4, 4))
array([[ 0., 0., 0., 0.],
[ 0., 0., 0., 0.],
[ 0., 0., 0., 0.],
[ 0., 0., 0., 0.]])
In numpy, generating a matrix with random numbers is as simple as:
>>> import numpy as np
>>> np.random.rand(2,3)
array([[ 0.22568268, 0.0053246 , 0.41282024],
[ 0.68824936, 0.68086462, 0.6854153 ]])
IMO, whenever you want matrices, you want to use numpy because it has some nice methods which really help you process them.
//EDIT: Since you added more context, in IronPython you can create an array of arrays by doing:
array = Array[Array[int]]( ( (1,2), (3,4) ) )
Alternatively, you can create a multidimensional array with Array.CreateInstance, passing the element type as the first argument followed by the dimension sizes:
array = Array.CreateInstance(int, 2, 3)
You can always read the docs when you need more information
import numpy
a = numpy.arange(12)
a.shape = 3, 4
print(a)
>>>
[[ 0 1 2 3]
[ 4 5 6 7]
[ 8 9 10 11]]
I'm trying to pad a numpy array, and I cannot seem to find the right approach from the documentation for numpy. I have an array:
a = array([2, 1, 3, 5, 7])
This represents the indices for an array I wish to create. At index 2, 1, 3, etc. I would like to have a one in the array, and everywhere else the target array should be padded with zeros, sort of like an array mask. I would also like to specify the overall length of the target array, l. So my ideal function would look something like:
>>> foo(a,l)
array([0,1,1,1,0,1,0,1,0,0])
, where l=10 for the above example.
EDIT:
So I wrote this function:
def padwithones(a,l) :
    p = np.zeros(l)
    for i in a :
        p = np.insert(p,i,1)
    return p
Which gives:
Out[19]:
array([ 0., 1., 0., 1., 1., 1., 0., 1., 0., 0., 0., 0., 0.,
0., 0.])
Which isn't correct!
What you're looking for is basically a one-hot array:
def onehot(foo, l):
    a = np.zeros(l, dtype=np.int32)
    a[foo] = 1
    return a
Example:
In [126]: onehot([2, 1, 3, 5, 7], 10)
Out[126]: array([0, 1, 1, 1, 0, 1, 0, 1, 0, 0])
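To make it clearer why the np.insert attempt from the question goes wrong, here is a short sketch contrasting it with index assignment. The np.isin variant is an extra alternative added for illustration, not part of the answer above.

```python
import numpy as np

idx = np.array([2, 1, 3, 5, 7])
l = 10

# Index assignment overwrites existing zeros, so the length stays l
mask = np.zeros(l, dtype=int)
mask[idx] = 1

# Alternative: test every position 0..l-1 for membership in idx
mask_isin = np.isin(np.arange(l), idx).astype(int)

# np.insert adds a new element instead of overwriting one,
# which is why the loop in the question ends up with l + len(a) entries
grown = np.insert(np.zeros(l), 2, 1)

print(mask)
print(len(grown))
```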
m is a ndarray with shape (12, 21, 21), now I want to take only a sparse slice of it to form a new 2D array, with
sliceid = 0
indx = np.array([0, 2, 4, 6, 8, 10])
so that sparse_slice is, intuitively,
sparse_slice = m[sliceid, indx, indx]
but apparently the above operation does not work. Currently what I am using is
sparse_slice = m[sliceid,indx,:][:, indx]
Why is the first "intuitive" way not working? And is there a more compact way than my current solution? All my previous ndarray slicing attempts were based on nothing but intuition; maybe I should switch to reading a serious manual now...
The more compact way is to do new = m[0, :12:2, :12:2]. This is what the numpy docs call "basic indexing", meaning that you index with an integer or a slice object (i.e. 0:12:2). When you use basic indexing, numpy returns a view of the original array. For example:
In [3]: a = np.zeros((2, 3, 4))
In [4]: b = a[0, 1, ::2]
In [5]: b
Out[5]: array([ 0., 0.])
In [6]: b[:] = 7
In [7]: a
Out[7]:
array([[[ 0., 0., 0., 0.],
[ 7., 0., 7., 0.],
[ 0., 0., 0., 0.]],
[[ 0., 0., 0., 0.],
[ 0., 0., 0., 0.],
[ 0., 0., 0., 0.]]])
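If you are unsure whether a particular indexing expression gave you a view or a copy, np.shares_memory can tell you. A quick sketch on an array shaped like the one in the question (the variable names are just for this example):

```python
import numpy as np

m = np.arange(12 * 21 * 21).reshape(12, 21, 21)

# Integers and slices only: basic indexing, so this is a view into m
view = m[0, :12:2, :12:2]

# A list as an index triggers advanced indexing, so this is a copy
copy = m[0, [0, 2, 4]]

print(np.shares_memory(view, m))  # True
print(np.shares_memory(copy, m))  # False

# Writing through the view changes the original array
view[0, 0] = -1
print(m[0, 0, 0])  # -1
```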
In your "intuitive" approach, what you're doing is indexing an array with another array. When you index a numpy array with other arrays, the index arrays need to have the same shape (or they need to broadcast against each other, more on this in a sec). In the docs this is called fancy indexing or advanced indexing. For example:
In [10]: a = np.arange(9).reshape(3,3)
In [11]: a
Out[11]:
array([[0, 1, 2],
[3, 4, 5],
[6, 7, 8]])
In [12]: index = np.array([0,1,2])
In [13]: b = a[index, index]
In [14]: b
Out[14]: array([0, 4, 8])
You see that I get a[0,0], a[1,1], and a[2,2] not a[0,0], a[0,1] ... If you want the "outer product" of index with index you can do the following.
In [22]: index1 = np.array([[0,0],[1,1]])
In [23]: index2 = np.array([[0,1],[0,1]])
In [24]: b = a[index1, index2]
In [25]: b
Out[25]:
array([[0, 1],
[3, 4]])
There is a shorthand for doing the above, like this:
In [28]: index = np.array([0,1])
In [29]: index1, index2 = np.ix_(index, index)
In [31]: index1
Out[31]:
array([[0],
[1]])
In [32]: index2
Out[32]: array([[0, 1]])
In [33]: a[index1, index2]
Out[33]:
array([[0, 1],
[3, 4]])
In [34]: a[np.ix_(index, index)]
Out[34]:
array([[0, 1],
[3, 4]])
You'll notice that index1 is (2, 1) and index2 is (1, 2), not (2, 2). That's because the two arrays get broadcast against one another; you can read more about broadcasting here. Keep in mind that when you're using fancy indexing you get a copy of the original data, not a view. Sometimes this is better (if you want to leave the original data unchanged) and sometimes it just takes more memory. More about indexing here.
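Applied to the array from the question, the same ideas can be sketched as follows. The setup of m with np.arange is an assumption made so the values are easy to check; sliceid and indx are taken from the question.

```python
import numpy as np

m = np.arange(12 * 21 * 21).reshape(12, 21, 21)
sliceid = 0
indx = np.array([0, 2, 4, 6, 8, 10])

# The "intuitive" form pairs the indices element by element: shape (6,)
paired = m[sliceid, indx, indx]

# np.ix_ builds the (6, 1) and (1, 6) index arrays that broadcast to (6, 6)
outer = m[sliceid][np.ix_(indx, indx)]

# The same thing written out by hand with an extra axis on the row index
outer_bc = m[sliceid, indx[:, None], indx]

# The two-step version from the question, for comparison
two_step = m[sliceid, indx, :][:, indx]

print(paired.shape, outer.shape)
print(np.array_equal(outer, two_step))
```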
If I'm not mistaken, for input m = np.array(range(5292)).reshape(12,21,21) you are expecting output sparse_slice = m[sliceid,indx,:][:, indx] of
array([[ 0, 2, 4, 6, 8, 10],
[ 42, 44, 46, 48, 50, 52],
[ 84, 86, 88, 90, 92, 94],
[126, 128, 130, 132, 134, 136],
[168, 170, 172, 174, 176, 178],
[210, 212, 214, 216, 218, 220]])
In that case, you can get it using the step part of a slice:
m[0, :12:2, :12:2]
In Matlab, one can access a column of an array with the colon operator (:):
>> array=[1 2 3; 4 5 6]
array =
1 2 3
4 5 6
>> array(:,2)
ans =
2
5
How to do this in Python?
>>> array=[[1,2,3],[4,5,6]]
>>> array[:,2]
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
TypeError: list indices must be integers, not tuple
>>> array[:][2]
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
IndexError: list index out of range
Addendum
I'd like an example applied to an array with more than two dimensions:
>> B = cat(3, eye(3), ones(3), magic(3))
B(:,:,1) =
1 0 0
0 1 0
0 0 1
B(:,:,2) =
1 1 1
1 1 1
1 1 1
B(:,:,3) =
8 1 6
3 5 7
4 9 2
>> B(:,:,1)
ans =
1 0 0
0 1 0
0 0 1
>> B(:,2,:)
ans(:,:,1) =
0
1
0
ans(:,:,2) =
1
1
1
ans(:,:,3) =
1
5
9
Use Numpy.
>>> import numpy as np
>>>
>>> a = np.array([[1,2,3],[4,5,6]])
>>> a[:, 2]
array([3, 6])
If you come from Matlab, this should be of interest: Link
You can group data in a two-dimensional list by column using the built-in zip() function (in Python 3, wrap it in list() because zip() returns an iterator):
>>> array=[[1,2,3],[4,5,6]]
>>> list(zip(*array))
[(1, 4), (2, 5), (3, 6)]
>>> list(zip(*array))[1]
(2, 5)
Note that the index starts at 0, so to get the second column as in your example you use list(zip(*array))[1] instead of list(zip(*array))[2]. zip() yields tuples instead of lists; depending on how you are using the result this may not be a problem, but if you need lists you can always do [list(col) for col in zip(*array)] or list(list(zip(*array))[1]) to do the conversion. (In Python 2, zip() returned a list directly, so the outer list() was not needed.)
If you use Matlab, you probably will want to install NumPy:
Using NumPy, you can do this:
In [172]: import numpy as np
In [173]: arr = np.matrix('1 2 3; 4 5 6')
In [174]: arr
Out[174]:
matrix([[1, 2, 3],
[4, 5, 6]])
In [175]: arr[:,2]
Out[175]:
matrix([[3],
[6]])
Since Python uses 0-based indexing (while Matlab uses 1-based indexing), to get the same slice you posted you would do:
In [176]: arr[:,1]
Out[176]:
matrix([[2],
[5]])
It is easy to build numpy arrays of higher dimension as well. You could use np.dstack for instance:
In [199]: B = np.dstack( (np.eye(3), np.ones((3,3)), np.arange(9).reshape(3,3)) )
In [200]: B.shape
Out[200]: (3, 3, 3)
In [201]: B[:,:,0]
Out[201]:
array([[ 1., 0., 0.],
[ 0., 1., 0.],
[ 0., 0., 1.]])
In [202]: B[:,:,1]
Out[202]:
array([[ 1., 1., 1.],
[ 1., 1., 1.],
[ 1., 1., 1.]])
In [203]: B[:,:,2]
Out[203]:
array([[ 0., 1., 2.],
[ 3., 4., 5.],
[ 6., 7., 8.]])
And here is the array formed from the second column from each of the 3 arrays above:
In [204]: B[:,1,:]
Out[204]:
array([[ 0., 1., 1.],
[ 1., 1., 4.],
[ 0., 1., 7.]])
Numpy doesn't have a function to create magic squares, however. sniff
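To mirror the Matlab addendum from the question more closely, here is a sketch with the values of magic(3) typed in by hand (copied from the Matlab output in the question, since NumPy has no equivalent function). np.stack with axis=2 is used here in place of np.dstack; for 2-D inputs the two give the same result.

```python
import numpy as np

# Values of Matlab's magic(3), entered manually
magic3 = np.array([[8, 1, 6],
                   [3, 5, 7],
                   [4, 9, 2]])

# Equivalent of cat(3, eye(3), ones(3), magic(3))
B = np.stack([np.eye(3, dtype=int), np.ones((3, 3), dtype=int), magic3], axis=2)

# Matlab B(:,:,1) -> first page (0-based index 0)
print(B[:, :, 0])

# Matlab B(:,2,:) -> second column of every page.
# An integer index drops that axis: shape (3, 3)
col = B[:, 1, :]

# A length-1 slice keeps the axis, like Matlab's 3x1x3 result: shape (3, 1, 3)
col_keep = B[:, 1:2, :]

print(col)
print(col_keep.shape)
```

Each column of col corresponds to one of the ans(:,:,k) pages in the Matlab output.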
Indexing and slicing with the colon in Python behaves a bit differently than in Matlab. For a list, [:] makes a (shallow) copy of it. If you want all values at a specific index of the nested lists, you probably want something like this:
array = [[1,2,3],[4,5,6]]
col1 = [inner[0] for inner in array] # note column1 is index 0 in Python.
If using nested lists, you can use a list comprehension:
array = [ [1, 2, 3], [4, 5, 6] ]
col2 = [ row[1] for row in array ]
Keep in mind that since Python doesn't natively know about matrices, col2 is a list, and as such both "rows" and "columns" are the same type, namely lists. Use the numpy package for better support for matrix math.
def get_column(array, col):
    result = []
    for row in array:
        result.append(row[col])
    return result
Use like this (remember that indexes start from 0):
>>> a = [[1,2,3], [2,3,4]]
>>> get_column(a, 1)
[2, 3]
Use a list comprehension to build a list of values from that column:
def nthcolumn(n, matrix):
    return [row[n] for row in matrix]
Optionally use itemgetter if you need a (probably slight) performance boost:
from operator import itemgetter
def nthcolumn(n, matrix):
    nthvalue = itemgetter(n)
    return [nthvalue(row) for row in matrix]
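As a quick side-by-side, the column-extraction approaches from this thread can be run on the same nested list like this (written for Python 3; the variable names are just for the example):

```python
from operator import itemgetter
import numpy as np

matrix = [[1, 2, 3], [4, 5, 6]]
n = 1  # second column, since indexing starts at 0

by_comprehension = [row[n] for row in matrix]
by_zip = list(zip(*matrix))[n]                    # a tuple, not a list
by_itemgetter = list(map(itemgetter(n), matrix))
by_numpy = np.array(matrix)[:, n]                 # a NumPy array

print(by_comprehension, by_zip, by_itemgetter, by_numpy)
```

All four pick out the same values; they differ only in the type of container you get back.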