I have a matrix with dimension (2,5) and a vector of values to fill into that matrix. What is the best way? I can think of three methods, but I have trouble using np.empty & fill and np.full without loops.
x=np.array(range(0,10))
mat=x.reshape(2,5)
array([[0, 1, 2, 3, 4],
[5, 6, 7, 8, 9]])
mat=np.empty((2,5))
newMat=mat.fill(x) # Error: x has to be a scalar
mat=np.full((2,5),x) # Error: x has to be a scalar
full and fill are for setting all elements to the same value:
In [557]: np.full((2,5),10)
Out[557]:
array([[10, 10, 10, 10, 10],
[10, 10, 10, 10, 10]])
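For reference, ndarray.fill behaves the same way; it only accepts a scalar (a minimal sketch):

import numpy as np

mat = np.empty((2, 5))
mat.fill(10)          # fill() writes one scalar into every element, in place
print(mat)
# [[10. 10. 10. 10. 10.]
#  [10. 10. 10. 10. 10.]]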
Assigning an array works provided the shapes match (in the broadcasting sense); here arr is a pre-allocated (2,5) array:
In [558]: arr[...] = x.reshape(2,5) # make source the same shape as target
In [559]: arr
Out[559]:
array([[0, 1, 2, 3, 4],
[5, 6, 7, 8, 9]])
In [560]: arr.flat = x # make target same shape as source
In [561]: arr
Out[561]:
array([[0, 1, 2, 3, 4],
[5, 6, 7, 8, 9]])
arr.flat and arr.ravel() are equivalent. Well, not quite:
In [562]: arr.flat = x.reshape(2,5) # don't need the [:] with flat #wim
In [563]: arr
Out[563]:
array([[0, 1, 2, 3, 4],
[5, 6, 7, 8, 9]])
In [564]: arr.ravel()[:] = x.reshape(2,5)
ValueError: could not broadcast input array from shape (2,5) into shape (10)
In [565]: arr.ravel()[:] = x.reshape(2,5).flat
flat works with any shape source, even ones that require replication
In [570]: arr.flat = [1,2,3]
In [571]: arr
Out[571]:
array([[1, 2, 3, 1, 2],
[3, 1, 2, 3, 1]])
More broadcasted inputs
In [572]: arr[...] = np.ones((2,1))
In [573]: arr
Out[573]:
array([[1, 1, 1, 1, 1],
[1, 1, 1, 1, 1]])
In [574]: arr[...] = np.arange(5)
In [575]: arr
Out[575]:
array([[0, 1, 2, 3, 4],
[0, 1, 2, 3, 4]])
An example of the problem Eric mentioned. The ravel (or other reshape) of a transpose is (often) a copy. So writing to that does not modify the original.
In [578]: arr.T.ravel()[:]=10
In [579]: arr
Out[579]:
array([[0, 1, 2, 3, 4],
[0, 1, 2, 3, 4]])
In [580]: arr.T.flat=10
In [581]: arr
Out[581]:
array([[10, 10, 10, 10, 10],
[10, 10, 10, 10, 10]])
ndarray.flat returns an object which can modify the contents of the array by direct assignment:
>>> array = np.empty((2,5), dtype=int)
>>> vals = range(10)
>>> array.flat = vals
>>> array
array([[0, 1, 2, 3, 4],
[5, 6, 7, 8, 9]])
If that seems kind of magical to you, then read about the descriptor protocol.
Warning: assigning to flat does not raise exceptions for size mismatch. If there are not enough values on the right hand side of the assignment, the data will be rolled/repeated. If there are too many values, only the first few will be used.
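A small sketch of both behaviours, assuming a fresh (2,3) integer array:

import numpy as np

arr = np.empty((2, 3), dtype=int)

arr.flat = [1, 2]        # too few values: the sequence is repeated, no error
print(arr)
# [[1 2 1]
#  [2 1 2]]

arr.flat = range(100)    # too many values: only the first 6 are used, no error
print(arr)
# [[0 1 2]
#  [3 4 5]]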
If you want a 10x2 matrix filled with 5:
np.ones((10,2))*5
If you have a list of values and just want them in a particular shape:
datavalues = [1,2,3,4,5,6,7,8,9,10]
np.reshape(datavalues,(2,5))
array([[ 1, 2, 3, 4, 5],
[ 6, 7, 8, 9, 10]])
I want to merge multiple 2D NumPy arrays of shapes, say, (r,a), (r,b), (r,c), ..., (r,z) into a single 2D array of shape (r, a+b+c+...+z).
I tried np.hstack, but it seemed to require the same shape, and np.concatenate operates on a tuple of arrays rather than taking a second array argument.
You can use np.concatenate or np.hstack. Here is an example:
>>> a = np.arange(15).reshape(5,3)
>>> a
array([[ 0, 1, 2],
[ 3, 4, 5],
[ 6, 7, 8],
[ 9, 10, 11],
[12, 13, 14]])
>>> b = np.arange(10).reshape(5,2)
>>> b
array([[0, 1],
[2, 3],
[4, 5],
[6, 7],
[8, 9]])
>>> np.concatenate((a,b), axis=1)
array([[ 0, 1, 2, 0, 1],
[ 3, 4, 5, 2, 3],
[ 6, 7, 8, 4, 5],
[ 9, 10, 11, 6, 7],
[12, 13, 14, 8, 9]])
>>> np.hstack((a,b))
array([[ 0, 1, 2, 0, 1],
[ 3, 4, 5, 2, 3],
[ 6, 7, 8, 4, 5],
[ 9, 10, 11, 6, 7],
[12, 13, 14, 8, 9]])
Hope it helps
I am new to numpy, but I think it's not possible. The precondition is:
"The arrays must have the same shape along all but the second axis, except 1-D arrays which can be any length."
Actually, one of my functions was returning a scipy.sparse.csr.csr_matrix, and I was converting it into an np.array along with lists returned by another function so that I could merge them all, but the sparse matrix was converted into
array(<73194x17 sparse matrix of type '' with 203371 stored elements in Compressed Sparse Row format>, dtype=object)
which was not compatible with np.hstack.
Sorry for the inconvenience.
I figured out my solution: instead of numpy.hstack I used scipy's hstack function.
Thank you, everyone, for responding.
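For reference, a minimal sketch of the scipy.sparse.hstack approach, with hypothetical shapes:

import numpy as np
from scipy import sparse

A = sparse.random(5, 3, density=0.5, format='csr')   # sparse block
B = np.arange(10).reshape(5, 2)                      # dense block

merged = sparse.hstack([A, sparse.csr_matrix(B)])    # stays sparse, shape (5, 5)
print(merged.shape)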
I have a numpy array of shape 28 x 1875. Each element is a 3-element list (only floats). I need to split each of these elements into individual ones, to obtain an array of shape 28 x 5625 (1875*3). I've tried np.split, however it only separates each element, not each sub-element. Is there a fast way to do this?
Making a 2d array of lists:
In [522]: arr = np.empty(6,object)
In [523]: arr[:] = [list(range(i,i+3)) for i in range(6)]
In [524]: arr = arr.reshape(2,3)
In [525]: arr
Out[525]:
array([[list([0, 1, 2]), list([1, 2, 3]), list([2, 3, 4])],
[list([3, 4, 5]), list([4, 5, 6]), list([5, 6, 7])]], dtype=object)
It's easier to fill such an array if it is 1d, which is why I start with (6,) and reshape after.
Paul Panzer's suggestion:
In [526]: np.array(arr.tolist())
Out[526]:
array([[[0, 1, 2],
[1, 2, 3],
[2, 3, 4]],
[[3, 4, 5],
[4, 5, 6],
[5, 6, 7]]])
In [527]: _.reshape(2,-1)
Out[527]:
array([[0, 1, 2, 1, 2, 3, 2, 3, 4],
[3, 4, 5, 4, 5, 6, 5, 6, 7]])
You can also use np.stack (a variant of np.concatenate) to create an nd array. It does, though, require a 1d object array, hence the ravel:
In [536]: np.stack(arr.ravel())
Out[536]:
array([[0, 1, 2],
[1, 2, 3],
[2, 3, 4],
[3, 4, 5],
[4, 5, 6],
[5, 6, 7]])
That can be reshaped as needed:
In [537]: np.stack(arr.ravel()).reshape(2,-1)
Out[537]:
array([[0, 1, 2, 1, 2, 3, 2, 3, 4],
[3, 4, 5, 4, 5, 6, 5, 6, 7]])
In some cases we need to transpose axes to get the desired order.
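For example, a small sketch with a hypothetical (2, 3, 2) array, where each "cell" holds 2 sub-elements:

import numpy as np

a = np.arange(12).reshape(2, 3, 2)

# keep the sub-elements of each cell together
print(a.reshape(2, -1))
# [[ 0  1  2  3  4  5]
#  [ 6  7  8  9 10 11]]

# group by sub-element index instead: transpose before reshaping
print(a.transpose(0, 2, 1).reshape(2, -1))
# [[ 0  2  4  1  3  5]
#  [ 6  8 10  7  9 11]]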
I have a memory usage problem in python but haven't been able to find a satisfying solution yet.
The problem is quite simple :
I have a collection of images as numpy arrays of shape (n_samples, size_image). I need to slice each image in the same way and feed these slices to a classification algorithm all at once.
How do you take numpy array slices without duplicating data in memory?
Naively, as slices are simple "views" of the original data, I assume that there must be a way to do the slicing without copying data in the memory.
The problem being critical when dealing with large datasets such as the MNIST handwritten digits dataset.
I have tried to find a solution using numpy.lib.stride_tricks.as_strided, but I struggle to get it to work on collections of images.
A similar toy problem would be to slice the scikit handwritten digits in a memory-friendly way.
from sklearn.datasets import load_digits
digits = load_digits()
X = digits.data
X has shape (1797, 64), i.e. each picture is an 8x8 image.
With a window size of 6x6 this gives (8-6+1)*(8-6+1) = 9 slices of size 36 per image, resulting in an array sliced_X of shape (16173, 36).
Now the question is: how do you get from X to sliced_X without using too much memory?
I would start off assuming that the input array is (M,n1,n2) (if it's not, we can always reshape it). Here's an implementation to get a sliding windowed view into it, with an output array of shape (M, n1-b1+1, n2-b2+1, b1, b2) for a block size of (b1,b2) -
import numpy as np

def strided_lastaxis(a, blocksize):
    # a: (M, n1, n2) array; blocksize: (b1, b2)
    d0, d1, d2 = a.shape
    s0, s1, s2 = a.strides
    strided = np.lib.stride_tricks.as_strided
    # output shape: (M, n1-b1+1, n2-b2+1, b1, b2)
    out_shp = (d0,) + tuple(np.array([d1, d2]) - blocksize + 1) + tuple(blocksize)
    # the window-position axes reuse the same strides as the within-window axes
    return strided(a, out_shp, (s0, s1, s2, s1, s2))
Being a view, it won't occupy any more memory, so we are doing okay there. But keep in mind that we shouldn't reshape the result, as that would force a memory copy.
Here's a sample run with a manual check -
Setup input and get output :
In [72]: a = np.random.randint(0,9,(2, 6, 6))
In [73]: out = strided_lastaxis(a, blocksize=(4,4))
In [74]: np.may_share_memory(a, out) # Verify this is a view
Out[74]: True
In [75]: a
Out[75]:
array([[[1, 7, 3, 5, 6, 3],
[3, 2, 3, 0, 1, 5],
[6, 3, 5, 5, 3, 5],
[0, 7, 0, 8, 2, 4],
[0, 3, 7, 3, 4, 4],
[0, 1, 0, 8, 8, 1]],
[[4, 1, 4, 5, 0, 8],
[0, 6, 5, 6, 6, 7],
[6, 3, 1, 8, 6, 0],
[0, 1, 1, 7, 6, 8],
[6, 3, 3, 1, 6, 1],
[0, 0, 2, 4, 8, 3]]])
In [76]: out.shape
Out[76]: (2, 3, 3, 4, 4)
Output values :
In [77]: out[0,0,0]
Out[77]:
array([[1, 7, 3, 5],
[3, 2, 3, 0],
[6, 3, 5, 5],
[0, 7, 0, 8]])
In [78]: out[0,0,1]
Out[78]:
array([[7, 3, 5, 6],
[2, 3, 0, 1],
[3, 5, 5, 3],
[7, 0, 8, 2]])
In [79]: out[0,0,2]
Out[79]:
array([[3, 5, 6, 3],
[3, 0, 1, 5],
[5, 5, 3, 5],
[0, 8, 2, 4]]) # ............
In [80]: out[1,2,2] # last block
Out[80]:
array([[1, 8, 6, 0],
[1, 7, 6, 8],
[3, 1, 6, 1],
[2, 4, 8, 3]])
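As an aside, if your NumPy is 1.20 or newer, np.lib.stride_tricks.sliding_window_view should give the same kind of windowed view without the manual stride arithmetic (a sketch, same assumptions as above):

import numpy as np

a = np.random.randint(0, 9, (2, 6, 6))
out = np.lib.stride_tricks.sliding_window_view(a, (4, 4), axis=(1, 2))
print(out.shape)                      # (2, 3, 3, 4, 4)
print(np.may_share_memory(a, out))    # True -- still a view (read-only)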
I have a static shape-(l,l) array C. I want to extract portions of it into some other array K, which has shape (m,m,n,n). The starting index of what I want to extract from C is given in array i0, which has shape (m,m).
Some element of K will be given by K[i,j,:,:] = C[i0[i,j]:i0[i,j]+n, i0[i,j]:i0[i,j]+n]. So going off some other similar questions it seemed like this might do the job...
C[i0[None, None, ...] + np.arange(n)[..., None, None],
i0[None, None, ...] + np.arange(n)[..., None, None], I, J]
which raises an IndexError. I guess this is because C is only 2D, and its dimensionality can't be increased like that. That could be fixed easily by tiling C, but since C is large, remaking it m*m times would be rather expensive.
So my question is how to extract different (2D) portions of a 2D array into corresponding portions of a 4D array.
One way would be to use np.meshgrid to create 2D indexing meshes corresponding to the (n,n) window, add those to i0 extended with two new axes along which broadcasting takes place, and finally index into C to get the desired 4D output. Thus, one implementation would be like so -
N = np.arange(n)
X,Y = np.meshgrid(N,N)
out = C[i0[...,None,None] + Y,i0[...,None,None] + X]
Sample run -
In [153]: C
Out[153]:
array([[3, 5, 1, 6, 3, 5, 8, 7, 0, 2],
[8, 4, 6, 8, 7, 2, 6, 2, 5, 0],
[3, 7, 7, 7, 3, 4, 4, 6, 7, 6],
[7, 0, 8, 2, 1, 1, 0, 4, 4, 6],
[2, 4, 6, 0, 0, 5, 6, 8, 0, 0],
[4, 6, 1, 0, 5, 6, 2, 1, 7, 4],
[0, 5, 5, 3, 7, 5, 7, 1, 4, 0],
[6, 4, 4, 7, 2, 4, 6, 6, 6, 5],
[5, 2, 3, 2, 2, 5, 4, 5, 2, 5],
[3, 7, 1, 0, 4, 4, 6, 6, 2, 2]])
In [154]: i0
Out[154]:
array([[1, 0, 4, 4],
[0, 4, 4, 0],
[2, 3, 1, 3],
[2, 2, 0, 4]])
In [155]: n = 3
In [157]: out[0,0,:,:]
Out[157]:
array([[4, 6, 8],
[7, 7, 7],
[0, 8, 2]])
In [158]: C[i0[0,0]:i0[0,0]+n,i0[0,0]:i0[0,0]+n]
Out[158]:
array([[4, 6, 8],
[7, 7, 7],
[0, 8, 2]])
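An equivalent sketch without meshgrid, using plain broadcasting with np.arange (hypothetical small inputs, not the sample above):

import numpy as np

C = np.arange(100).reshape(10, 10)
i0 = np.array([[1, 0], [2, 3]])
n = 3

r = np.arange(n)
rows = i0[..., None, None] + r[:, None]   # (m, m, n, 1)
cols = i0[..., None, None] + r[None, :]   # (m, m, 1, n)
out = C[rows, cols]                       # (m, m, n, n)

# spot check against the direct slice
assert np.array_equal(out[0, 0], C[i0[0, 0]:i0[0, 0]+n, i0[0, 0]:i0[0, 0]+n])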
I would like to generate a 2-by-N array in python for use with scipy.optimize.curve_fit.
I have a function of two independent variables stored as 1-D arrays, and the data in a 2-D array. curve_fit requires that the data be flattened, which is easy with data.ravel().
However, this is the hack I'm using to generate the 2xN array of ordinate values:
ordinate = np.array([[l,t] for l in length for t in time]).T
which works, but is slow. What's the (vectorized?) faster way?
If I got the question correctly, you are looking to form a 2D mesh out of the two independent variables stored as 1D arrays. For that, you can use np.meshgrid -
time2D,length2D = np.meshgrid(time,length)
ordinate_vectorized = np.row_stack((length2D.ravel(),time2D.ravel()))
Sample run -
In [149]: time
Out[149]: array([7, 2, 1, 9, 6])
In [150]: length
Out[150]: array([3, 5])
In [151]: ordinate = np.array([[l,t] for l in length for t in time]).T
In [152]: ordinate
Out[152]:
array([[3, 3, 3, 3, 3, 5, 5, 5, 5, 5],
[7, 2, 1, 9, 6, 7, 2, 1, 9, 6]])
In [153]: time2D,length2D = np.meshgrid(time,length)
...: ordinate_vectorized = np.row_stack((length2D.ravel(),time2D.ravel()))
...:
In [154]: ordinate_vectorized
Out[154]:
array([[3, 3, 3, 3, 3, 5, 5, 5, 5, 5],
[7, 2, 1, 9, 6, 7, 2, 1, 9, 6]])
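An alternative sketch without meshgrid, using np.repeat and np.tile with the same time and length values as the sample run above:

import numpy as np

time = np.array([7, 2, 1, 9, 6])
length = np.array([3, 5])

ordinate_alt = np.vstack((np.repeat(length, time.size),   # each length, once per time value
                          np.tile(time, length.size)))    # the whole time vector, once per length
print(ordinate_alt)
# [[3 3 3 3 3 5 5 5 5 5]
#  [7 2 1 9 6 7 2 1 9 6]]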