I would like to select alternating groups of n columns in a numpy array: take the first n columns, skip the next n columns, take the next n, skip the next n, and so on.
For example, with the following array and n=2:
import numpy as np
arr = np.array([[1, 2, 3, 4, 5, 6, 7, 8, 9, 10],
[11, 12, 13, 14, 15, 16, 17, 18, 19, 20]])
I would like to get:
[[1, 2, 5, 6, 9, 10],
[11, 12, 15, 16, 19, 20]]
And with n=3:
[[1, 2, 3, 7, 8, 9],
[11, 12, 13, 17, 18, 19]]
With n=1 we can simply use the syntax arr[:,::2], but is there something similar for n>1?
You can use modulus to create ramps that run from 0 to 2n-1 and then keep the first n positions of each ramp. For each ramp, the first n entries are True and the rest are False, giving a boolean array that covers the entire width of the array. Then we simply use boolean indexing along the columns to select the valid columns for the final output. The implementation would look something like this -
arr[:,np.mod(np.arange(arr.shape[-1]),2*n)<n]
Step by step code runs to give a better idea -
In [43]: arr
Out[43]:
array([[ 1, 2, 3, 4, 5, 6, 7, 8, 9, 10],
[11, 12, 13, 14, 15, 16, 17, 18, 19, 20]])
In [44]: n = 3
In [45]: np.mod(np.arange(arr.shape[-1]),2*n)
Out[45]: array([0, 1, 2, 3, 4, 5, 0, 1, 2, 3])
In [46]: np.mod(np.arange(arr.shape[-1]),2*n)<n
Out[46]: array([ True,  True,  True, False, False, False,  True,  True,  True, False])
In [47]: arr[:,np.mod(np.arange(arr.shape[-1]),2*n)<n]
Out[47]:
array([[ 1, 2, 3, 7, 8, 9],
[11, 12, 13, 17, 18, 19]])
Sample runs across various n -
In [29]: arr
Out[29]:
array([[ 1, 2, 3, 4, 5, 6, 7, 8, 9, 10],
[11, 12, 13, 14, 15, 16, 17, 18, 19, 20]])
In [30]: n = 1
In [31]: arr[:,np.mod(np.arange(arr.shape[-1]),2*n)<n]
Out[31]:
array([[ 1, 3, 5, 7, 9],
[11, 13, 15, 17, 19]])
In [32]: n = 2
In [33]: arr[:,np.mod(np.arange(arr.shape[-1]),2*n)<n]
Out[33]:
array([[ 1, 2, 5, 6, 9, 10],
[11, 12, 15, 16, 19, 20]])
In [34]: n = 3
In [35]: arr[:,np.mod(np.arange(arr.shape[-1]),2*n)<n]
Out[35]:
array([[ 1, 2, 3, 7, 8, 9],
[11, 12, 13, 17, 18, 19]])
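If you need this in more than one place, the mask can be wrapped in a small helper (just a sketch; the function name is mine):
import numpy as np

def take_first_n_of_every_2n(arr, n):
    # True for the first n positions of every block of 2*n columns
    mask = np.mod(np.arange(arr.shape[-1]), 2 * n) < n
    return arr[:, mask]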
I am using this for loop to split a dataset into groups, but the list "y" is being converted into an array with an error.
import numpy as np

def to_sequences(dataset, seq_size=1):
    x = []
    y = []
    for i in range(len(dataset) - seq_size):
        window = dataset[i:(i + seq_size), 0]
        x.append(window)
        window2 = dataset[(i + seq_size):i + seq_size + 5, 0]
        y.append(window2)
    return np.array(x), np.array(y)
seq_size = 5
trainX, trainY = to_sequences(train, seq_size)
print("Shape of training set: {}".format(trainX.shape))
print("Shape of training set: {}".format(trainY.shape))
And this is the error message I get
VisibleDeprecationWarning: Creating an ndarray from ragged nested sequences (which is a list-or-tuple of lists-or-tuples-or ndarrays with different lengths or shapes) is deprecated. If you meant to do this, you must specify 'dtype=object' when creating the ndarray.
return np.array(x),np.array(y)
I couldn't figure out why it works for 'x' but not for 'y'. Any ideas?
In [247]: dataset = np.arange(20)
In [248]: def to_sequences(dataset, seq_size=1):
     ...:     x = []
     ...:     y = []
     ...:     for i in range(len(dataset)-seq_size):
     ...:         window = dataset[i:(i+seq_size), 0]
     ...:         x.append(window)
     ...:         window2 = dataset[(i+seq_size):i+seq_size+5, 0]
     ...:         y.append(window2)
     ...:     return np.array(x),np.array(y)
     ...:
and a sample run:
In [250]: to_sequences(dataset[:,None], 5)
<ipython-input-248-176eb762993c>:9: VisibleDeprecationWarning: Creating an ndarray from ragged nested sequences (which is a list-or-tuple of lists-or-tuples-or ndarrays with different lengths or shapes) is deprecated. If you meant to do this, you must specify 'dtype=object' when creating the ndarray.
return np.array(x),np.array(y)
Out[250]:
(array([[ 0, 1, 2, 3, 4],
[ 1, 2, 3, 4, 5],
[ 2, 3, 4, 5, 6],
[ 3, 4, 5, 6, 7],
[ 4, 5, 6, 7, 8],
[ 5, 6, 7, 8, 9],
[ 6, 7, 8, 9, 10],
[ 7, 8, 9, 10, 11],
[ 8, 9, 10, 11, 12],
[ 9, 10, 11, 12, 13],
[10, 11, 12, 13, 14],
[11, 12, 13, 14, 15],
[12, 13, 14, 15, 16],
[13, 14, 15, 16, 17],
[14, 15, 16, 17, 18]]),
array([array([5, 6, 7, 8, 9]), array([ 6, 7, 8, 9, 10]),
array([ 7, 8, 9, 10, 11]), array([ 8, 9, 10, 11, 12]),
array([ 9, 10, 11, 12, 13]), array([10, 11, 12, 13, 14]),
array([11, 12, 13, 14, 15]), array([12, 13, 14, 15, 16]),
array([13, 14, 15, 16, 17]), array([14, 15, 16, 17, 18]),
array([15, 16, 17, 18, 19]), array([16, 17, 18, 19]),
array([17, 18, 19]), array([18, 19]), array([19])], dtype=object))
The first array is (n,5) with int dtype. The second is object dtype, containing arrays; most of them have shape (5,), but the last ones are (4,), (3,), (2,) and (1,).
dataset[(i+seq_size):i+seq_size+5, 0] is slicing past the end of dataset. Python/numpy allows that, but the result is truncated.
You'll have to rethink that y slicing if you want an (n,5) shaped array; one possible fix is sketched after the list example below.
Slicing past the end of a list:
In [252]: [1,2,3,4,5][1:4]
Out[252]: [2, 3, 4]
In [253]: [1,2,3,4,5][3:6]
Out[253]: [4, 5]
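One way to rethink the y slicing (a sketch with a hypothetical horizon parameter, not the original code) is to stop the loop early enough that every y window is full length:
import numpy as np

def to_sequences_fixed(dataset, seq_size=1, horizon=5):
    x, y = [], []
    # Stop early enough that every y window has exactly `horizon` elements,
    # so np.array(y) builds a regular (n, horizon) array instead of an object array.
    for i in range(len(dataset) - seq_size - horizon + 1):
        x.append(dataset[i:i + seq_size, 0])
        y.append(dataset[i + seq_size:i + seq_size + horizon, 0])
    return np.array(x), np.array(y)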
I have a matrix-like array in numpy:
import numpy as np
a = np.array([[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12], [13, 14, 15, 16], [17, 18, 19, 20]])
The desired array is:
array([[ 9, 10, 11, 12],
[13, 14, 15, 16],
[17, 18, 19, 20],
[ 1, 2, 3, 4],
[ 5, 6, 7, 8]])
Description: the first and second rows move to the end of the matrix.
I tried converting a to a list, using append and del, and then converting back to a numpy array, but that does not seem like a good way to do it in Python.
Is there a numpy function to move rows to a different position within a larger matrix-like array?
Here is a function that takes the number of rotations:
In [5]: a
Out[5]:
array([[ 1, 2, 3, 4],
[ 5, 6, 7, 8],
[ 9, 10, 11, 12],
[13, 14, 15, 16],
[17, 18, 19, 20]])
In [14]: def rotate(n):
...: n = n%len(a)
...: return np.concatenate([a[n:], a[:n]])
In [13]: rotate(2)
Out[13]:
array([[ 9, 10, 11, 12],
[13, 14, 15, 16],
[17, 18, 19, 20],
[ 1, 2, 3, 4],
[ 5, 6, 7, 8]])
What if you pass an n larger than the length of the array? That case is handled by n = n % len(a):
In [16]: rotate(9)
Out[16]:
array([[17, 18, 19, 20],
[ 1, 2, 3, 4],
[ 5, 6, 7, 8],
[ 9, 10, 11, 12],
[13, 14, 15, 16]])
Another solution, suggested in the comments, is np.roll:
In [6]: a
Out[6]:
array([[ 1, 2, 3, 4],
[ 5, 6, 7, 8],
[ 9, 10, 11, 12],
[13, 14, 15, 16],
[17, 18, 19, 20]])
In [7]: def rotate(n):
...: n = n % len(a)
...: return np.roll(a,-n,axis=0)
...:
In [8]: rotate(8)
Out[8]:
array([[13, 14, 15, 16],
[17, 18, 19, 20],
[ 1, 2, 3, 4],
[ 5, 6, 7, 8],
[ 9, 10, 11, 12]])
In [9]: rotate(2)
Out[9]:
array([[ 9, 10, 11, 12],
[13, 14, 15, 16],
[17, 18, 19, 20],
[ 1, 2, 3, 4],
[ 5, 6, 7, 8]])
This is easy with a single line of code; no helper function is needed.
Simply use numpy.roll (see the numpy documentation for an explanation).
# Assume your matrix is named a.
>>> a
array([[ 1, 2, 3, 4],
[ 5, 6, 7, 8],
[ 9, 10, 11, 12],
[13, 14, 15, 16],
[17, 18, 19, 20]])
>>> n = 2
>>> np.roll(a, -(n % len(a)), axis=0)
array([[ 9, 10, 11, 12],
[13, 14, 15, 16],
[17, 18, 19, 20],
[ 1, 2, 3, 4],
[ 5, 6, 7, 8]])
Numpy ravel works well if I need to create a vector by reading by rows or by columns. However, I would like to transform a matrix to a 1d array, by using a method that is often used in image processing. This is an example with initial matrix A and final result B:
A = np.array([[ 0, 1, 2, 3],
[ 4, 5, 6, 7],
[ 8, 9, 10, 11],
[12, 13, 14, 15]])
B = np.array([ 0, 1, 4, 8, 5, 2, 3, 6, 9, 12, 13, 10, 7, 11, 14, 15])
Is there an existing function already that could help me with that? If not, can you give me some hints on how to solve this problem? PS. the matrix A is NxN.
I've been using numpy for several years, and I've never seen such a function.
Here's one way you could do it (not necessarily the most efficient):
In [47]: a
Out[47]:
array([[ 0, 1, 2, 3],
[ 4, 5, 6, 7],
[ 8, 9, 10, 11],
[12, 13, 14, 15]])
In [48]: np.concatenate([np.diagonal(a[::-1,:], k)[::(2*(k % 2)-1)] for k in range(1-a.shape[0], a.shape[0])])
Out[48]: array([ 0, 1, 4, 8, 5, 2, 3, 6, 9, 12, 13, 10, 7, 11, 14, 15])
Breaking down the one-liner into separate steps:
a[::-1, :] reverses the rows:
In [59]: a[::-1, :]
Out[59]:
array([[12, 13, 14, 15],
[ 8, 9, 10, 11],
[ 4, 5, 6, 7],
[ 0, 1, 2, 3]])
(This could also be written a[::-1] or np.flipud(a).)
np.diagonal(a, k) extracts the kth diagonal, where k=0 is the main diagonal. So, for example,
In [65]: np.diagonal(a[::-1, :], -3)
Out[65]: array([0])
In [66]: np.diagonal(a[::-1, :], -2)
Out[66]: array([4, 1])
In [67]: np.diagonal(a[::-1, :], 0)
Out[67]: array([12, 9, 6, 3])
In [68]: np.diagonal(a[::-1, :], 2)
Out[68]: array([14, 11])
In the list comprehension, k gives the diagonal to be extracted. We want to reverse the elements in every other diagonal. The expression 2*(k % 2) - 1 gives the values 1, -1, 1, ... as k varies from -3 to 3. Indexing with [::1] leaves the order of the array being indexed unchanged, and indexing with [::-1] reverses the order of the array. So np.diagonal(a[::-1, :], k)[::(2*(k % 2)-1)] gives the kth diagonal, but with every other diagonal reversed:
In [71]: [np.diagonal(a[::-1,:], k)[::(2*(k % 2)-1)] for k in range(1-a.shape[0], a.shape[0])]
Out[71]:
[array([0]),
array([1, 4]),
array([8, 5, 2]),
array([ 3, 6, 9, 12]),
array([13, 10, 7]),
array([11, 14]),
array([15])]
np.concatenate() puts them all into a single array:
In [72]: np.concatenate([np.diagonal(a[::-1,:], k)[::(2*(k % 2)-1)] for k in range(1-a.shape[0], a.shape[0])])
Out[72]: array([ 0, 1, 4, 8, 5, 2, 3, 6, 9, 12, 13, 10, 7, 11, 14, 15])
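If you need the scan more than once, the one-liner can be wrapped in a function (a sketch; the name zigzag is mine, and it assumes a square 2-D input):
import numpy as np

def zigzag(a):
    # Concatenate the anti-diagonals, reversing every other one,
    # exactly as in the one-liner above.
    n = a.shape[0]
    return np.concatenate(
        [np.diagonal(a[::-1, :], k)[::(2 * (k % 2) - 1)] for k in range(1 - n, n)])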
I found discussion of the zigzag scan for MATLAB, but not much for numpy. One project appears to use a hardcoded indexing array for 8x8 blocks:
https://github.com/lot9s/lfv-compression/blob/master/scripts/our_mpeg/zigzag.py
ZIG = np.array([[0, 1, 5, 6, 14, 15, 27, 28],
[2, 4, 7, 13, 16, 26, 29, 42],
[3, 8, 12, 17, 25, 30, 41, 43],
[9, 11, 18, 24, 31, 40, 44,53],
[10, 19, 23, 32, 39, 45, 52,54],
[20, 22, 33, 38, 46, 51, 55,60],
[21, 34, 37, 47, 50, 56, 59,61],
[35, 36, 48, 49, 57, 58, 62,63]])
Apparently it's used in JPEG and MPEG compression.
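If an 8x8 block size is acceptable, the hardcoded table can be applied with an argsort (a sketch; block is a hypothetical 8x8 array and ZIG is the table quoted above):
import numpy as np

# ZIG[i, j] is the position of element (i, j) in the zigzag sequence,
# so argsorting the flattened table gives the scan order as flat indices.
block = np.arange(64).reshape(8, 8)   # hypothetical 8x8 block
zigzagged = block.ravel()[np.argsort(ZIG.ravel())]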
Data = [day(1) day(2)...day(N)...day(2N)..day(K-N)...day(K)]
I am looking to create a numpy array from two arrays, N and K, with shapes (120,) and (300,). The array needs to be of the form:
x1 = [day(1) day(2) day (3)...day(N)]
x2 = [day(2) day(3)...day(N) day(N+1)]
xN = [day(N) day(N+1) day(N+2)...day(2N)]
xK-N = [day(K-N) day(K-N+1)...day(K)]
X is basically of shape (K-N)xN, with the above x1, x2, ..., xK-N as rows. I have tried using iloc for getting two arrays N and K with the same shapes. Good till then. But when I try to merge the arrays using X = np.array([np.concatenate((N[i:], K[:i])) for i in range(len(N))]), I get an NxN array that is only an overlap array, not the desired format.
Is this what you are trying to produce (with simpler data)?
In [253]: N,K=10,15
In [254]: data = np.arange(K)+10
In [255]: data
Out[255]: array([10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24])
In [256]: np.array([data[np.arange(N)+i] for i in range(K-N+1)])
Out[256]:
array([[10, 11, 12, 13, 14, 15, 16, 17, 18, 19],
[11, 12, 13, 14, 15, 16, 17, 18, 19, 20],
[12, 13, 14, 15, 16, 17, 18, 19, 20, 21],
[13, 14, 15, 16, 17, 18, 19, 20, 21, 22],
[14, 15, 16, 17, 18, 19, 20, 21, 22, 23],
[15, 16, 17, 18, 19, 20, 21, 22, 23, 24]])
There's another way of generating this, using advanced ideas about strides:
np.lib.stride_tricks.as_strided(data, shape=(K-N+1, N), strides=(data.itemsize, data.itemsize))
(Strides are measured in bytes, hence data.itemsize rather than a hardcoded byte count.)
In the first case, all values in the new array are copies of the original. The strided case is actually a view. So any changes to data appear in the 2d array. And without data copying, the 2nd is also faster. I can try to explain it if you are interested.
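On newer numpy (1.20 and later, if I recall correctly) there is also np.lib.stride_tricks.sliding_window_view, which builds the same view without hand-computed strides (a minimal sketch):
import numpy as np

K, N = 15, 10
data = np.arange(K) + 10

# Same (K-N+1, N) sliding-window view as as_strided, no manual byte strides.
windows = np.lib.stride_tricks.sliding_window_view(data, N)
print(windows.shape)   # (6, 10)
print(windows[0])      # [10 11 12 13 14 15 16 17 18 19]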
Warren suggests using hankel. That's a short function, which in our case does essentially:
a, b = np.ogrid[0:K-N+1, 0:N]
data[a+b]
a+b is an array like:
array([[ 0, 1, 2, 3, 4, 5, 6, 7, 8, 9],
[ 1, 2, 3, 4, 5, 6, 7, 8, 9, 10],
[ 2, 3, 4, 5, 6, 7, 8, 9, 10, 11],
[ 3, 4, 5, 6, 7, 8, 9, 10, 11, 12],
[ 4, 5, 6, 7, 8, 9, 10, 11, 12, 13],
[ 5, 6, 7, 8, 9, 10, 11, 12, 13, 14]])
In this example case it is just a bit better than the list comprehension solution, but I expect it will be a lot better for much larger cases.
It is probably not worth adding a dependence on scipy for the following, but if you are already using scipy in your code, you could use the function scipy.linalg.hankel:
In [75]: from scipy.linalg import hankel
In [76]: K = 16
In [77]: x = np.arange(K)
In [78]: x
Out[78]: array([ 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15])
In [79]: N = 8
In [80]: hankel(x[:K-N+1], x[K-N:])
Out[80]:
array([[ 0, 1, 2, 3, 4, 5, 6, 7],
[ 1, 2, 3, 4, 5, 6, 7, 8],
[ 2, 3, 4, 5, 6, 7, 8, 9],
[ 3, 4, 5, 6, 7, 8, 9, 10],
[ 4, 5, 6, 7, 8, 9, 10, 11],
[ 5, 6, 7, 8, 9, 10, 11, 12],
[ 6, 7, 8, 9, 10, 11, 12, 13],
[ 7, 8, 9, 10, 11, 12, 13, 14],
[ 8, 9, 10, 11, 12, 13, 14, 15]])
I have a multidimensional NumPy array:
In [1]: m = np.arange(1,26).reshape((5,5))
In [2]: m
Out[2]:
array([[ 1, 2, 3, 4, 5],
[ 6, 7, 8, 9, 10],
[11, 12, 13, 14, 15],
[16, 17, 18, 19, 20],
[21, 22, 23, 24, 25]])
and another array p = np.asarray([[1,1],[3,3]]). I want p to act as an array of indices into m, i.e.:
m[p]
array([7, 19])
However I get:
In [4]: m[p]
Out[4]:
array([[[ 6, 7, 8, 9, 10],
[ 6, 7, 8, 9, 10]],
[[16, 17, 18, 19, 20],
[16, 17, 18, 19, 20]]])
How can I get the desired slice of m using p?
Numpy is using your array to index the first dimension only. As a general rule, indices for a multidimensional array should be in a tuple. This will get you a little closer to what you want:
>>> m[tuple(p)]
array([9, 9])
But now you are indexing the first dimension twice with 1, and the second twice with 3. To index the first dimension with a 1 and a 3, and then the second with a 1 and a 3 also, you could transpose your array:
>>> m[tuple(p.T)]
array([ 7, 19])
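Equivalently, you can unpack the columns of p yourself and pass one index array per dimension (a small sketch):
import numpy as np

m = np.arange(1, 26).reshape(5, 5)
p = np.asarray([[1, 1], [3, 3]])

# One index array per dimension: rows [1, 3] paired with columns [1, 3].
rows, cols = p[:, 0], p[:, 1]
print(m[rows, cols])   # [ 7 19]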