I want to merge multiple 2D NumPy arrays of shapes, let's say, (r, a), (r, b), (r, c), ..., (r, z) into a single 2D array of shape (r, a+b+c+...+z).
I tried np.hstack, but it seems to need arrays of the same shape, and np.concatenate only seems to operate on a tuple of arrays.
You can use np.concatenate or np.hstack. Here is an example:
>>> a = np.arange(15).reshape(5,3)
>>> a
array([[ 0, 1, 2],
[ 3, 4, 5],
[ 6, 7, 8],
[ 9, 10, 11],
[12, 13, 14]])
>>> b = np.arange(10).reshape(5,2)
>>> b
array([[0, 1],
[2, 3],
[4, 5],
[6, 7],
[8, 9]])
>>> np.concatenate((a,b), axis =1)
array([[ 0, 1, 2, 0, 1],
[ 3, 4, 5, 2, 3],
[ 6, 7, 8, 4, 5],
[ 9, 10, 11, 6, 7],
[12, 13, 14, 8, 9]])
>>> np.hstack((a,b))
array([[ 0, 1, 2, 0, 1],
[ 3, 4, 5, 2, 3],
[ 6, 7, 8, 4, 5],
[ 9, 10, 11, 6, 7],
[12, 13, 14, 8, 9]])
Hope it helps
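The same calls extend to any number of arrays, as long as they all have the same number of rows. A minimal sketch, where the list name blocks and the shapes are made up for illustration:

```python
import numpy as np

r = 4
# Hypothetical inputs: same number of rows, different numbers of columns.
blocks = [np.ones((r, 3)), np.zeros((r, 2)), np.full((r, 5), 7.0)]

# A list works just as well as a tuple; the columns are appended left to right.
merged = np.hstack(blocks)          # same result as np.concatenate(blocks, axis=1)
print(merged.shape)
```

Only the row count has to match; the column counts (3, 2 and 5 here) can all differ.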
I am new to NumPy, but I think it's not possible. The precondition is
"The arrays must have the same shape along all but the second axis, except 1-D arrays which can be any length."
Actually, one of my functions was returning a scipy.sparse.csr.csr_matrix, and I was converting it into an np.array along with lists returned by another function so that I could merge them all, but the sparse matrix was converted into
array(<73194x17 sparse matrix of type '' with 203371 stored elements in Compressed Sparse Row format>, dtype=object)
which was not compatible with np.hstack.
Sorry for the inconvenience.
I figured out my solution: instead of numpy.hstack I used SciPy's hstack function (scipy.sparse.hstack).
Thank you, Everyone, for responding.
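For anyone hitting the same problem, here is a rough sketch of that fix. The variable names and the tiny matrices are hypothetical stand-ins, not the original data, and the dense part is wrapped in a csr_matrix first just to keep the inputs uniform:

```python
import numpy as np
from scipy import sparse

# Hypothetical stand-ins for the two kinds of data described above.
sp_part = sparse.csr_matrix(np.eye(3))        # sparse block, shape (3, 3)
dense_part = np.arange(6).reshape(3, 2)       # dense block, shape (3, 2)

# Stack column-wise while staying sparse, instead of wrapping the
# sparse matrix in np.array(...) and passing it to np.hstack.
merged = sparse.hstack([sp_part, sparse.csr_matrix(dense_part)]).tocsr()

# Convert to a dense ndarray only if the result is small enough to fit in memory.
dense_merged = merged.toarray()
print(dense_merged.shape)
```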
I have a 2-d numpy array of shape NxM which represents M contiguous samples from N different sequences. I need to present patches of L samples (L << M) covering the entire dataset as a 2-d numpy array. There is too much data to construct a new dataset by simply copying all the patches.
If there were a single sequence, it would be very straightforward to generate the overlapping patches without copying any data using the as_strided trick:
patches = np.lib.stride_tricks.as_strided(data, shape=(N*M-L+1, L), strides=(8, 8))
The problem with this approach for my data is that it produces patches that overlap separate sequences.
I can also see how to generate a 3-d array of shape N,M-L+1,L using something like:
patches = np.lib.stride_tricks.as_strided(data, shape=(N, M-L+1, L), strides=(8*M, 8, 8))
This produces the correct patches, but I am not sure how to collapse the first two dimensions into one.
There are obviously several SO answers related to as_strided, but I could not find any that address these particular requirements.
Any ideas are appreciated.
Edit: Short example follows
Here is an example of using as_strided to make a 3-d array that almost accomplishes the task:
>>> a = np.array([[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12]])
>>> a
array([[ 1, 2, 3, 4],
[ 5, 6, 7, 8],
[ 9, 10, 11, 12]])
>>> b = np.lib.stride_tricks.as_strided(a, shape=(3, 3, 2), strides=(32,8,8))
>>> b
array([[[ 1, 2],
[ 2, 3],
[ 3, 4]],
[[ 5, 6],
[ 6, 7],
[ 7, 8]],
[[ 9, 10],
[10, 11],
[11, 12]]])
>>>
The issue with trying to flatten this 3-d array into 2-d, as suggested by @Divakar, is that the reshape produces the correct data but does so by making a copy, which creates an unmanageable amount of data for the actual problem at hand:
>>> c = b.reshape(-1,b.shape[-1])
>>> c
array([[ 1, 2],
[ 2, 3],
[ 3, 4],
[ 5, 6],
[ 6, 7],
[ 7, 8],
[ 9, 10],
[10, 11],
[11, 12]])
>>> b[0][0][0] = 9999
>>> c
array([[ 1, 2],
[ 2, 3],
[ 3, 4],
[ 5, 6],
[ 6, 7],
[ 7, 8],
[ 9, 10],
[10, 11],
[11, 12]])
>>>
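The copy can also be confirmed directly with np.shares_memory. The sketch below is not from the original post: it swaps in sliding_window_view (available in NumPy 1.20 and later, as far as I know) for the hand-written strides, which avoids hard-coding the 8-byte item size.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

a = np.arange(1, 13).reshape(3, 4)   # same values as the example above
L = 2                                # patch length

# Windows are taken along axis 1 only, so no patch spans two sequences.
b = sliding_window_view(a, L, axis=1)   # shape (N, M-L+1, L), read-only view
c = b.reshape(-1, L)                    # shape (N*(M-L+1), L)

print(np.shares_memory(a, b))   # the 3-d windows still use a's buffer
print(np.shares_memory(a, c))   # the flattened 2-d array is a separate copy
```

The reason a 2-d view cannot exist here is that the step between consecutive patches is not constant: it is one item inside a sequence but L items at each sequence boundary, and a strided array allows only one step size per axis.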
In NumPy, given a stack of large images A of shape (N, hl, wl) and coordinates x and y, each of shape (N,), I want to get smaller images of shape (N, 16, 16).
In a for loop it would look like this:
B = numpy.zeros((N, 16, 16))
for i in range(N):
    B[i, :, :] = A[i, y[i]:y[i]+16, x[i]:x[i]+16]
But can I do this just with indexing?
Bonus question: Will this indexing also work in pytorch? If not how can I implement this there?
In NumPy, slicing is very simple, and the same logic works in PyTorch. For example:
import numpy as np
import torch
imgs = np.random.normal(size=(16,24,24))
imgs[:,0:12,0:12].shape
imgs_tensor = torch.from_numpy(imgs)
imgs_tensor[:,0:12,0:12].size()
where the first : in the slicing selects all the images in the batch. The 2nd and 3rd slices select the ranges for height and width.
This is pretty simple with view_as_windows from scikit-image, which gives the sliding-window views as a 6D array whose fourth axis is a singleton. Then use advanced indexing to select the windows we want, using the y and x indices to index into the second and third axes of the windowed array, which gives our B.
Hence, the implementation would be -
import numpy as np
from skimage.util.shape import view_as_windows
BSZ = 16, 16 # Blocksize
A6D = view_as_windows(A,(1,BSZ[0],BSZ[1]))
B_out = A6D[np.arange(N),y,x,0]
Explanation
To explain to other readers what's really going on with the problem, here's a sample run on a smaller dataset and with a blocksize of (2,2) -
1) Input array (3D) :
In [78]: A
Out[78]:
array([[[ 5, 5, 3, 5, 3, 8],
[ 5, *2, 6, 2, 2, 4],
[ 4, 3, 4, 9, 3, 8],
[ 6, 3, 3, 10, 4, 5],
[10, 2, 5, 7, 6, 7],
[ 5, 4, 2, 5, 2, 10]],
[[ 4, 9, 8, 4, 9, 8],
[ 7, 10, 8, 2, 10, 9],
[10, *9, 3, 2, 4, 7],
[ 5, 10, 8, 3, 5, 4],
[ 6, 8, 2, 4, 10, 4],
[ 2, 8, 6, 2, 7, 5]],
[[ *4, 8, 7, 2, 9, 9],
[ 2, 10, 2, 3, 8, 8],
[10, 7, 5, 8, 2, 10],
[ 7, 4, 10, 9, 6, 9],
[ 3, 4, 9, 9, 10, 3],
[ 6, 4, 10, 2, 6, 3]]])
2) y and x indices to index into the second and third axes :
In [79]: y
Out[79]: array([1, 2, 0])
In [80]: x
Out[80]: array([1, 1, 0])
3) Finally, the desired output, which is one block from each 2D slice along the first axis, with its starting point (top-left corner) at (y, x) on that slice. Refer to the asterisks in A for those -
In [81]: B
Out[81]:
array([[[ 2, 6],
[ 3, 4]],
[[ 9, 3],
[10, 8]],
[[ 4, 8],
[ 2, 10]]])
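If scikit-image is not available, the same selection can be sketched with plain NumPy advanced indexing. This is an alternative to the view_as_windows approach above, not the same method; the function name crop_blocks and the random test data are made up for illustration.

```python
import numpy as np

def crop_blocks(A, y, x, h, w):
    """Hypothetical helper: return B with B[i] = A[i, y[i]:y[i]+h, x[i]:x[i]+w]."""
    n = np.arange(A.shape[0])[:, None, None]                 # shape (N, 1, 1)
    rows = y[:, None, None] + np.arange(h)[None, :, None]    # shape (N, h, 1)
    cols = x[:, None, None] + np.arange(w)[None, None, :]    # shape (N, 1, w)
    # The three index arrays broadcast to (N, h, w), one entry per output pixel.
    return A[n, rows, cols]

rng = np.random.default_rng(0)
A = rng.integers(0, 10, size=(3, 6, 6))
y = np.array([1, 2, 0])
x = np.array([1, 1, 0])
B = crop_blocks(A, y, x, 2, 2)
print(B.shape)
```

Unlike the windowed view, this builds index arrays of the output size, so it trades a bit of extra memory for having no extra dependency.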
This is a PyTorch implementation of extract_glimpse, similar to tf.image.extract_glimpse. It should satisfy your need:
https://github.com/jimmysue/xvision/blob/main/xvision/ops/extract_glimpse.py#L14
I have a matrix with dimensions (2,5) and a vector of values to fill into that matrix. What is the best way? I can think of three methods, but I have trouble using np.empty & fill and np.full without loops.
x=np.array(range(0,10))
mat=x.reshape(2,5)
array([[0, 1, 2, 3, 4],
[5, 6, 7, 8, 9]])
mat=np.empty((2,5))
newMat=mat.fill(x) # Error: The x has to be scalar
mat=np.full((2,5),x) # Error: The x has to be scalar
full and fill are for setting all elements the same
In [557]: np.full((2,5),10)
Out[557]:
array([[10, 10, 10, 10, 10],
[10, 10, 10, 10, 10]])
Assigning an array works provided the shapes match (in the broadcasting sense):
In [558]: arr[...] = x.reshape(2,5) # make source the same shape as target
In [559]: arr
Out[559]:
array([[0, 1, 2, 3, 4],
[5, 6, 7, 8, 9]])
In [560]: arr.flat = x # make target same shape as source
In [561]: arr
Out[561]:
array([[0, 1, 2, 3, 4],
[5, 6, 7, 8, 9]])
arr.flat and arr.ravel() are equivalent. Well, not quite:
In [562]: arr.flat = x.reshape(2,5) # don't need the [:] with flat @wim
In [563]: arr
Out[563]:
array([[0, 1, 2, 3, 4],
[5, 6, 7, 8, 9]])
In [564]: arr.ravel()[:] = x.reshape(2,5)
ValueError: could not broadcast input array from shape (2,5) into shape (10)
In [565]: arr.ravel()[:] = x.reshape(2,5).flat
flat works with any shape source, even ones that require replication
In [570]: arr.flat = [1,2,3]
In [571]: arr
Out[571]:
array([[1, 2, 3, 1, 2],
[3, 1, 2, 3, 1]])
More broadcasted inputs
In [572]: arr[...] = np.ones((2,1))
In [573]: arr
Out[573]:
array([[1, 1, 1, 1, 1],
[1, 1, 1, 1, 1]])
In [574]: arr[...] = np.arange(5)
In [575]: arr
Out[575]:
array([[0, 1, 2, 3, 4],
[0, 1, 2, 3, 4]])
An example of the problem Eric mentioned. The ravel (or other reshape) of a transpose is (often) a copy. So writing to that does not modify the original.
In [578]: arr.T.ravel()[:]=10
In [579]: arr
Out[579]:
array([[0, 1, 2, 3, 4],
[0, 1, 2, 3, 4]])
In [580]: arr.T.flat=10
In [581]: arr
Out[581]:
array([[10, 10, 10, 10, 10],
[10, 10, 10, 10, 10]])
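Whether ravel returned a copy can be checked with np.shares_memory. A small sketch of the same situation, with variable names chosen here for illustration:

```python
import numpy as np

arr = np.arange(10).reshape(2, 5)

flat_copy = arr.T.ravel()          # C-order ravel of a transpose needs a copy
print(np.shares_memory(arr, flat_copy))

flat_copy[:] = 10                  # only the temporary copy changes
unchanged = arr.copy()             # snapshot: arr still holds 0..9

arr.T.flat = 10                    # flat writes through to arr's own buffer
print(arr)
```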
ndarray.flat returns an object which can modify the contents of the array by direct assignment:
>>> array = np.empty((2,5), dtype=int)
>>> vals = range(10)
>>> array.flat = vals
>>> array
array([[0, 1, 2, 3, 4],
[5, 6, 7, 8, 9]])
If that seems kind of magical to you, then read about the descriptor protocol.
Warning: assigning to flat does not raise exceptions for size mismatch. If there are not enough values on the right hand side of the assignment, the data will be rolled/repeated. If there are too many values, only the first few will be used.
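A short sketch of both cases from the warning, plus one way to get an error instead of silent repetition. The helper fill_strict is hypothetical, not part of NumPy:

```python
import numpy as np

arr = np.zeros((2, 5), dtype=int)

arr.flat = [1, 2, 3]               # too few values: they are repeated
too_few = arr.copy()

arr.flat = range(12)               # too many values: the extras are ignored
too_many = arr.copy()

def fill_strict(target, values):
    """Hypothetical helper: fill target in place, raising ValueError on a size mismatch."""
    # reshape fails unless the number of values equals target.size
    target[...] = np.asarray(values).reshape(target.shape)

fill_strict(arr, range(10))        # fine: exactly 10 values
try:
    fill_strict(arr, [1, 2, 3])    # wrong size
except ValueError as exc:
    print("rejected:", exc)
```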
If you want a 10x2 matrix filled with 5:
np.ones((10,2))*5
If you have a list of values and just want them in a particular shape:
datavalues = [1,2,3,4,5,6,7,8,9,10]
np.reshape(datavalues,(2,5))
array([[ 1, 2, 3, 4, 5],
[ 6, 7, 8, 9, 10]])
Related to How to get indices of N maximum values in a numpy array?, I have a numpy matrix a, and I would like to produce the array whose i-th row is the column indices of the top N elements of the i-th row of a.
Following the top-voted answer to the linked question, adapting it for arrays, here is what I have so far (using N=4):
>>> a
array([[9, 4, 4, 3, 3, 9, 0, 4, 6, 0],
[3, 4, 6, 9, 5, 7, 1, 2, 8, 4]])
>>> ind=np.argpartition(a,-4)[:,-4:]
>>> ind
array([[1, 5, 8, 0],
[2, 3, 8, 5]])
>>> rows=np.transpose([np.arange(a.shape[0])])
>>> rows
array([[0],
[1]])
>>> ind_sorted = ind[rows,np.argsort(a[rows,ind])]
>>> ind_sorted
array([[1, 8, 5, 0],
[2, 5, 8, 3]])
This works, but seems to be not very (num)pythonic. I'm sure there's a better way to do the indexing that doesn't require a dummy array. Any suggestions?
Slicing the last four elements of the order index by row seems to be working:
a.argsort(axis = 1)[:, -4:]
# array([[7, 8, 0, 5],
# [2, 5, 8, 3]])
How ties are broken is not defined, so the result can differ from yours: here index 7 is returned instead of 1 (both hold the value 4), and the order of 0 and 5 (both hold 9) is swapped.
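If sorting the full rows is too slow, the argpartition approach from the question can be kept while dropping the dummy rows array by using np.take_along_axis (NumPy 1.15 or later). A sketch, with the function name top_n_indices made up here:

```python
import numpy as np

def top_n_indices(a, n):
    """Hypothetical helper: column indices of the n largest values per row, ascending by value."""
    part = np.argpartition(a, -n, axis=1)[:, -n:]    # top-n indices, unordered
    vals = np.take_along_axis(a, part, axis=1)       # the values at those indices
    order = np.argsort(vals, axis=1)                 # sort only the n candidates
    return np.take_along_axis(part, order, axis=1)

a = np.array([[9, 4, 4, 3, 3, 9, 0, 4, 6, 0],
              [3, 4, 6, 9, 5, 7, 1, 2, 8, 4]])
idx = top_n_indices(a, 4)
print(idx)
```

Which of several tied indices comes back is still unspecified, so only the selected values, not the exact indices, are guaranteed to match the argsort version.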
Here I am posting code for getting indices for top 2 values in each row.
Unsorted_Array = np.array ([[23, 56, 12, 24], [6, 36, 9, 99], [24, 11, 87, 9], [7,29,103, 5]])
Index_of_Top_2_Values_of_EachRow = np.argsort(Unsorted_Array,axis = 1)[:,-2:]
print(Index_of_Top_2_Values_of_EachRow)
I can't figure out the difference between these two kinds of indexing. It seems like they should produce the same results but they do not. Any explanation?
A[1:3, 0:2] takes rows 1 and 2 (the stop index 3 is excluded) and columns 0 and 1, thus returning a 2x2 array.
A[1:3][0:2] first takes rows 1 and 2 and then, from this subarray, takes rows 0 and 1, resulting in a 2xn array where n is the original number of columns.
In [1]: import numpy as np
In [2]: a = np.arange(16).reshape(4,4)
In [3]: a
Out[3]:
array([[ 0, 1, 2, 3],
[ 4, 5, 6, 7],
[ 8, 9, 10, 11],
[12, 13, 14, 15]])
In [4]: a[1:3,0:2]
Out[4]:
array([[4, 5],
[8, 9]])
In [5]: a[1:3]
Out[5]:
array([[ 4, 5, 6, 7],
[ 8, 9, 10, 11]])
In [6]: a[1:3][0:2]
Out[6]:
array([[ 4, 5, 6, 7],
[ 8, 9, 10, 11]])
The equivalent of A[1:3,0:2] using two [] is: A[1:3][:,0:2]:
In [7]: a[1:3][:,0:2]
Out[7]:
array([[4, 5],
[8, 9]])
Here : means "all the rows". So you first select the rows via [1:3] and then, from all of those rows, select columns 0 and 1.
A[1:3][0:2] means first apply [1:3] to A, and then apply [0:2] to the array returned from the first step, so both slices are applied only to the rows. OTOH, A[1:3, 0:2] means apply 1:3 to the rows and 0:2 to the columns, i.e. get only the second and third rows and only the first two columns of those rows.
>>> import numpy as np
>>> a = np.arange(12).reshape(3, 4)
>>> a
array([[ 0, 1, 2, 3],
[ 4, 5, 6, 7],
[ 8, 9, 10, 11]])
>>> a[1:3][0:2]
array([[ 4, 5, 6, 7],
[ 8, 9, 10, 11]])
>>> a[1:3] #Get 2nd and 3rd row.
array([[ 4, 5, 6, 7],
[ 8, 9, 10, 11]])
>>> _[0:2] #Get the first two rows of the last array.
array([[ 4, 5, 6, 7],
[ 8, 9, 10, 11]])
>>> a[1:3, 0:2]
array([[4, 5],
[8, 9]])
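It may help to see what Python actually passes to the array in each case. In this sketch the slices are written out explicitly with slice objects; the variable names are just for illustration:

```python
import numpy as np

a = np.arange(12).reshape(3, 4)

# One indexing call with a tuple of two slices: first for rows, second for columns.
both_axes = a[(slice(1, 3), slice(0, 2))]     # same as a[1:3, 0:2]

# Two chained calls, each receiving a single slice that acts on axis 0.
chained = a[slice(1, 3)][slice(0, 2)]         # same as a[1:3][0:2]

print(both_axes.shape, chained.shape)
```

The comma builds a tuple that NumPy spreads across the axes, while each separate pair of brackets starts again from the first axis of whatever the previous step returned.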