Numpy, how to reshape a vector to multi column array - python

I am wondering how to use np.reshape to reshape a long vector into an n-column array without specifying the number of rows.
Normally I compute the number of rows with len(a)//n:
a = np.arange(0, 10)
n = 2
b = a.reshape(len(a)//n,n)
Is there a more direct way that avoids len(a)//n?

You can pass -1 for one dimension, and NumPy will infer what that number should be:
a = np.arange(0, 10)
n = 2
b = a.reshape(-1, n)
The doc is pretty clear about this feature: https://docs.scipy.org/doc/numpy/reference/generated/numpy.reshape.html
One shape dimension can be -1. In this case, the value is inferred
from the length of the array and remaining dimensions.
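As a small sketch of that inference (the array size here is only illustrative), -1 can sit in either position, and reshape raises a ValueError when the total size is not divisible by the fixed dimension:

```python
import numpy as np

a = np.arange(10)

# -1 asks NumPy to infer that dimension from the total size.
two_cols = a.reshape(-1, 2)   # 10 elements / 2 columns -> 5 rows
two_rows = a.reshape(2, -1)   # 10 elements / 2 rows -> 5 columns
print(two_cols.shape, two_rows.shape)

# If the size is not divisible by the fixed dimension, reshape fails.
try:
    a.reshape(-1, 3)
except ValueError as exc:
    print("reshape failed:", exc)
```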

Related

Element wise divide like MATLAB's ./ operator?

I am trying to normalize some Nx3 data. If X is a Nx3 array and D is a Nx1 array, in MATLAB, I can do
Y = X./D
If I do the following in Python, I get an error
X = np.random.randn(100,3)
D = np.linalg.norm(X,axis=1)
Y = X/D
ValueError: operands could not be broadcast together with shapes (100,3) (100,)
Any suggestions?
Edit: Thanks to dm2.
Y = X/D.reshape((100,1))
Another way is to use scikit-learn.
from sklearn import preprocessing
Y = preprocessing.normalize(X)
From the NumPy documentation on array broadcasting:
When operating on two arrays, NumPy compares their shapes
element-wise. It starts with the trailing (i.e. rightmost) dimensions
and works its way left. Two dimensions are compatible when
they are equal, or
one of them is 1
Both of your arrays have the same first dimension, but broadcasting compares shapes starting from the right. X has shape (100, 3) and D has shape (100,), so the trailing dimensions 3 and 100 are compared. They are neither equal nor 1, so the shapes of these two arrays do not meet the requirements to be broadcast together.
To make sure they do, you could reshape your D array into a 2-dimensional array of shape (100,1), which would satisfy the requirements to broadcast: rightmost dimensions are 3 and 1 (one of them is 1) and the other dimensions are equal (100 and 100).
So:
Y = X/D.reshape((-1,1))
or
Y = X/D.reshape((100,1))
or
Y = X/D[:,np.newaxis]
Any of these should give you the result you're after.
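Another option, sketched here with made-up random data, is to ask np.linalg.norm to keep the reduced axis via its keepdims argument, so that D already has shape (100, 1) and no reshape is needed:

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.standard_normal((100, 3))

# keepdims=True leaves the reduced axis in place with length 1,
# so D has shape (100, 1) and broadcasts against X's shape (100, 3).
D = np.linalg.norm(X, axis=1, keepdims=True)
Y = X / D

print(D.shape)
print(np.allclose(np.linalg.norm(Y, axis=1), 1.0))
```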

Setting numpy array to slice without any in-place operations

How can I do this operation efficiently without any inplace operations?
n_id = np.random.choice(np.arange(2708), size=100)
z = np.random.rand(100, 64)
z_sparse = np.zeros((2708,64))
z_sparse[n_id[:100]] = z
Essentially I want the n_id rows of z_sparse to contain z's rows, but I can't do any in-place operations because my end goal is to use this in a PyTorch problem.
One thought would be to create zero rows within z precisely so that the rows of z end up in the positions n_id, but I'm not sure how this would work efficiently.
Essentially row 1 of z should be placed at row n_id[0] of z_sparse, then row 2 of z should be at row n_id[1] of z_sparse, and so on...
Here's the PyTorch error in case you are curious:
RuntimeError: one of the variables needed for gradient computation has been modified by an inplace operation
If n_id is a fixed index array, you can get z_sparse as a matrix multiplication:
# N, n, m = 2708, 100, 64
row_mat = (n_id[:n] == np.arange(N)[:,None])
# for pytorch tensor
# row_mat = Tensor(n_id[:n] == np.arange(N)[:,None])
z_sparse = row_mat @ z
Since row_mat is a constant array (tensor), your graph should work just fine.
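A minimal NumPy sketch of this idea, checked against the in-place assignment. It assumes the indices are unique (drawn with replace=False, unlike the question's np.random.choice call); with repeated indices the matrix product would sum the duplicate rows, whereas assignment keeps only the last one:

```python
import numpy as np

N, n, m = 2708, 100, 64
rng = np.random.default_rng(0)

# Assumption: unique indices, so both approaches give the same result.
n_id = rng.choice(N, size=n, replace=False)
z = rng.random((n, m))

# In-place reference result, used only for comparison.
expected = np.zeros((N, m))
expected[n_id] = z

# row_mat[i, j] is 1.0 exactly when n_id[j] == i, so row_mat @ z
# places row j of z at row n_id[j] of the result.
row_mat = (n_id == np.arange(N)[:, None]).astype(z.dtype)
z_sparse = row_mat @ z

print(row_mat.shape, z_sparse.shape)
print(np.allclose(z_sparse, expected))
```

Note that row_mat is a dense (N, n) matrix, so this trades memory for the absence of in-place writes.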

How can I ensure a numpy array to be either a 2D row- or column vector?

Is there a numpy function to ensure a 1D- or 2D- array to be either a column or row vector?
For example, I have either one of the following vectors/lists. What is the easiest way to convert any of the input into a column vector?
x1 = np.array(range(5))
x2 = x1[np.newaxis, :]
x3 = x1[:, np.newaxis]
def ensureCol1D(x):
    # The input is either 1D, or 2D with one axis of length 1.
    assert(len(x.shape)==1 or (len(x.shape)==2 and 1 in x.shape))
    x = np.atleast_2d(x)
    n = x.size
    print(x.shape, n)
    return x if x.shape[0] == n else x.T
assert(ensureCol1D(x1).shape == (x1.size, 1))
assert(ensureCol1D(x2).shape == (x2.size, 1))
assert(ensureCol1D(x3).shape == (x3.size, 1))
Instead of writing my own function ensureCol1D, is there something similar already available in numpy that ensures a vector to be column?
Your question is essentially how to convert an array into a "column", a column being a 2D array with a row length of 1. This can be done with ndarray.reshape(-1, 1).
This means that you reshape your array to have a row length of one, and let numpy infer the number of rows / column length.
x1 = np.array(range(5))
print(x1.reshape(-1, 1))
Output:
[[0]
 [1]
 [2]
 [3]
 [4]]
You get the same output when reshaping x2 and x3. Additionally this also works for n-dimensional arrays:
x = np.random.rand(1, 2, 3)
print(x.reshape(-1, 1).shape)
Output:
(6, 1)
Finally the only thing missing here is that you make some assertions to ensure that arrays that cannot be converted are not converted incorrectly. The main check you're making is that the number of non-one integers in the shape is less than or equal to one. This can be done with:
assert sum(i != 1 for i in x1.shape) <= 1
This check, along with .reshape, lets you apply your logic to all NumPy arrays.
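Putting the check and the reshape together might look like the sketch below. The name as_column is hypothetical (it is not a NumPy function), and raising a ValueError instead of using assert is a design choice made here so the check survives running Python with -O:

```python
import numpy as np

def as_column(x):
    """Return x as an (n, 1) column; hypothetical helper, not part of NumPy."""
    x = np.asarray(x)
    # Reject arrays with more than one axis whose length is not 1.
    if sum(dim != 1 for dim in x.shape) > 1:
        raise ValueError(f"cannot treat shape {x.shape} as a vector")
    return x.reshape(-1, 1)

x1 = np.arange(5)
for x in (x1, x1[np.newaxis, :], x1[:, np.newaxis]):
    print(x.shape, "->", as_column(x).shape)
```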

how to randomly sample in 2D matrix in numpy

I have a 2D array/matrix like the one below. How would I randomly pick a value from it, for example getting a value like [-62, 29.23]? I looked at numpy.random.choice, but it is built for 1D arrays.
The following is my example with 4 rows and 8 columns
Space_Position=[
[[-62,29.23],[-49.73,29.23],[-31.82,29.23],[-14.2,29.23],[3.51,29.23],[21.21,29.23],[39.04,29.23],[57.1,29.23]],
[[-62,11.28],[-49.73,11.28],[-31.82,11.28],[-14.2,11.28],[3.51,11.28],[21.21,11.28] ,[39.04,11.28],[57.1,11.8]],
[[-62,-5.54],[-49.73,-5.54],[-31.82,-5.54] ,[-14.2,-5.54],[3.51,-5.54],[21.21,-5.54],[39.04,-5.54],[57.1,-5.54]],
[[-62,-23.1],[-49.73,-23.1],[-31.82,-23.1],[-14.2,-23.1],[3.51,-23.1],[21.21,-23.1],[39.04,-23.1] ,[57.1,-23.1]]
]
In the answers the following solution was given:
random_index1 = np.random.randint(0, Space_Position.shape[0])
random_index2 = np.random.randint(0, Space_Position.shape[1])
Space_Position[random_index1][random_index2]
This indeed works to give me one sample, but how can I draw more than one sample, like np.random.choice() does?
Another way I am thinking of is to transform the matrix into a flat array of pairs, like
Space_Position=[
[-62,29.23],[-49.73,29.23],[-31.82,29.23],[-14.2,29.23],[3.51,29.23],[21.21,29.23],[39.04,29.23],[57.1,29.23], ..... ]
and finally use np.random.choice(). However, I could not find a way to do the transformation; flatten() turns the array into
Space_Position=[-62,29.23,-49.73,29.2, ....]
Just use random indices (two in your case, because the array has 3 dimensions):
import numpy as np
Space_Position = np.array(Space_Position)
random_index1 = np.random.randint(0, Space_Position.shape[0])
random_index2 = np.random.randint(0, Space_Position.shape[1])
Space_Position[random_index1, random_index2] # get the random element.
The alternative is to actually make it 2D:
Space_Position = np.array(Space_Position).reshape(-1, 2)
and then use one random index:
Space_Position = np.array(Space_Position).reshape(-1, 2) # make it 2D
random_index = np.random.randint(0, Space_Position.shape[0]) # generate a random index
Space_Position[random_index] # get the random element.
If you want N samples with replacement:
N = 5
Space_Position = np.array(Space_Position).reshape(-1, 2) # make it 2D
random_indices = np.random.randint(0, Space_Position.shape[0], size=N) # generate N random indices
Space_Position[random_indices] # get N samples with replacement
or without replacement:
Space_Position = np.array(Space_Position).reshape(-1, 2) # make it 2D
random_indices = np.arange(0, Space_Position.shape[0]) # array of all indices
np.random.shuffle(random_indices) # shuffle the array
Space_Position[random_indices[:N]] # get N samples without replacement
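Since the question asks for something like np.random.choice, one more variant is to let np.random.choice draw the row indices rather than the rows themselves. This is a sketch on a small stand-in grid (the names grid and points are illustrative, not from the question):

```python
import numpy as np

# Small stand-in for Space_Position: a 2 x 3 grid of [x, y] pairs.
grid = np.array([
    [[-62.0, 29.23], [-49.73, 29.23], [-31.82, 29.23]],
    [[-62.0, 11.28], [-49.73, 11.28], [-31.82, 11.28]],
])
points = grid.reshape(-1, 2)  # shape (6, 2): one [x, y] pair per row

N = 4
# np.random.choice only accepts 1-D input, so sample row indices instead.
idx = np.random.choice(points.shape[0], size=N, replace=False)
samples = points[idx]
print(samples)
```

Setting replace=True (the default) would allow the same pair to be drawn more than once.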
Referring to numpy.random.choice:
Sampling random rows from a 2-D array is not possible with this function, but is possible with Generator.choice through its axis keyword.
The Generator documentation is at numpy.random.Generator.choice.
Using this, you can create a generator and then sample from your array with choice:
rng = np.random.default_rng() #creates the generator ==> Generator(PCG64) at 0x2AA703BCE50
N = 3 #Number of Choices
a = np.array(Space_Position) #makes sure, a is an ndarray and numpy-supported
s = a.shape #(4,8,2)
a = a.reshape((s[0] * s[1], s[2])) #makes your array 2 dimensional keeping the last dimension separated
a.shape #(32, 2)
b = rng.choice(a, N, axis=0, replace=False) #returns N choices of a in array b, e.g. array([[ 57.1 , 11.8 ], [ 21.21, -5.54], [ 39.04, 11.28]])
#Note: replace=False prevents having the same entry several times in the result
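Space_Position[np.random.randint(0, len(Space_Position))][np.random.randint(0, len(Space_Position[0]))]
gives you what you want. Note that the second index must be bounded by the number of columns, len(Space_Position[0]), not the number of rows.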

Adapting matrix array multiplication to use Numpy Tensordot

I'm trying to speed up my code to perform some numerical calculations where I need to multiply 3 matrices with an array. The structure of the problem is the following:
The array has a shape of (N, 10)
The first matrix is constant along the dynamic dimension of the array and has a shape of (10, 10)
The other two matrices vary along the first dimension of the array and have a (N, 10, 10) shape
The result of the calculation should be an array with shape (N, 10)
I've implemented a solution using for loops that is working, but I'd like to have a better performance so I'm trying to use the numpy functions. I've tried using numpy.tensordot but when multiplying the dynamic matrices with the array I get a shape of (N, 10, N) instead of (N, 10)
My for loop is the following:
res = np.zeros(temp_rho.shape, dtype=np.complex128)
for i in range(temp_rho.shape[0]):
    res[i] = np.dot(self.constMatrix, temp_rho[i])
    res[i] += np.dot(self.dinMat1[i], temp_rho[i])
    res[i] += np.dot(self.dinMat2[i], np.conj(temp_rho[i]))
#temp_rho.shape = (N, 10)
#res.shape = (N, 10)
#self.constMatrix.shape = (10, 10)
#self.dinMat1.shape = (N, 10, 10)
#self.dinMat2.shape = (N, 10, 10)
How should this code be implemented with NumPy's dot-product functions so that it returns the correct dimensions?
Here's an approach using a combination of np.dot and np.einsum -
parte1 = constMatrix.dot(temp_rho.T).T
parte2 = np.einsum('ijk,ik->ij',dinMat1, temp_rho)
parte3 = np.einsum('ijk,ik->ij',dinMat2, np.conj(temp_rho))
out = parte1 + parte2 + parte3
Alternative way to get parte1 would be with np.tensordot -
parte1 = np.tensordot(temp_rho, constMatrix, axes=([1],[1]))
Why doesn't numpy.tensordot work for the latter two sum-reductions?
We need to keep the first axis of dinMat1/dinMat2 aligned with the first axis of temp_rho/np.conj(temp_rho). That isn't possible with np.tensordot: every axis that is not sum-reduced is kept as a separate output axis (an outer product), so the two length-N axes cannot be paired element-wise. Used here, np.tensordot would produce an output with two axes of length N, one from each input, which is the (N, 10, N) shape you observed.
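The difference can be checked on small random inputs. In this sketch the snake_case names and N = 4 are illustrative stand-ins for the question's attributes, and the loop from the question serves as the reference result:

```python
import numpy as np

N = 4
rng = np.random.default_rng(0)
temp_rho = rng.standard_normal((N, 10)) + 1j * rng.standard_normal((N, 10))
const_matrix = rng.standard_normal((10, 10))
din_mat1 = rng.standard_normal((N, 10, 10))
din_mat2 = rng.standard_normal((N, 10, 10))

# Reference result from the explicit loop.
expected = np.zeros(temp_rho.shape, dtype=np.complex128)
for i in range(N):
    expected[i] = const_matrix @ temp_rho[i]
    expected[i] += din_mat1[i] @ temp_rho[i]
    expected[i] += din_mat2[i] @ np.conj(temp_rho[i])

# tensordot keeps both length-N axes: (N, 10, 10) x (N, 10) -> (N, 10, N).
outer = np.tensordot(din_mat1, temp_rho, axes=([2], [1]))
print(outer.shape)

# einsum pairs the shared index i element-wise and sums only over k.
out = (
    temp_rho @ const_matrix.T
    + np.einsum('ijk,ik->ij', din_mat1, temp_rho)
    + np.einsum('ijk,ik->ij', din_mat2, np.conj(temp_rho))
)
print(out.shape, np.allclose(out, expected))
```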
