reshaping 2-d array using specific block shape [duplicate] - python

I have a problem reshaping a simple 2-D array into another one.
Let's assume this matrix:
[[4 1 2 1 2 4 1 2 4]
[2 3 0 3 0 2 3 0 2]
[5 5 1 5 1 5 5 1 5]
[6 6 6 6 6 6 6 6 6]]
I want to reshape it into a (12, 3) matrix, but built from (4, 3) blocks: the array is cut into three blocks of three columns each, and those blocks are stacked vertically. The result should look like this:
[[4 1 2
2 3 0
5 5 1
6 6 6

1 2 4
3 0 2
5 1 5
6 6 6

1 2 4
3 0 2
5 1 5
6 6 6]]
I have marked the edges where the matrix is cut with an extra blank line.
I've tried numpy reshape (with every available value of the order parameter), but I still get an array with "mixed" values.

You can always do this manually for custom reshapes:
import numpy as np
data = [[4, 1, 2, 1, 2, 4, 1, 2, 4],
[2, 3, 0, 3, 0, 2, 3, 0, 2],
[5, 5, 1, 5, 1, 5, 5, 1, 5],
[6, 6, 6, 6, 6, 6, 6, 6, 6]]
X = np.array(data)
Z = np.r_[X[:, 0:3], X[:, 3:6], X[:, 6:9]]
print(repr(Z))
yields
array([[4, 1, 2],
[2, 3, 0],
[5, 5, 1],
[6, 6, 6],
[1, 2, 4],
[3, 0, 2],
[5, 1, 5],
[6, 6, 6],
[1, 2, 4],
[3, 0, 2],
[5, 1, 5],
[6, 6, 6]])
Note the special np.r_ object, which concatenates the slices along the first axis (rows). Here it is a handy shorthand for np.concatenate((X[:, 0:3], X[:, 3:6], X[:, 6:9]), axis=0).
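When the block width should not be hard-coded, the same stacking can be done with reshape plus an axis swap. This is a minimal sketch, assuming the number of columns is an exact multiple of the block width; the helper name stack_column_blocks is made up for illustration.

```python
import numpy as np

def stack_column_blocks(arr, block_width):
    """Cut a 2-D array into column blocks and stack the blocks vertically."""
    n_rows, n_cols = arr.shape
    if n_cols % block_width != 0:
        raise ValueError("number of columns must be a multiple of block_width")
    n_blocks = n_cols // block_width
    # (rows, blocks, width) -> (blocks, rows, width) -> (blocks * rows, width)
    return (arr.reshape(n_rows, n_blocks, block_width)
               .swapaxes(0, 1)
               .reshape(-1, block_width))

X = np.array([[4, 1, 2, 1, 2, 4, 1, 2, 4],
              [2, 3, 0, 3, 0, 2, 3, 0, 2],
              [5, 5, 1, 5, 1, 5, 5, 1, 5],
              [6, 6, 6, 6, 6, 6, 6, 6, 6]])
Z = stack_column_blocks(X, 3)
print(Z.shape)  # (12, 3)
```

A plain X.reshape(12, 3) mixes the values because it reads each row from left to right; swapping the axes first moves the block index to the front, so each block is read as a whole.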

Related

How to store numbers in chunks from an array and create another array or list? [duplicate]

I have two arrays x and y:
x = [2 3 1 1 2 5 7 3 6]
y = [0 0 4 2 4 5 8 4 5 6 7 0 5 3 2 8 1 3 1 0 4 2 4 5 4 4 5 6 7 0]
I want to create a list z that stores groups (chunks) of numbers from y, where the size of each group is given by the corresponding value in x.
So z should be:
z = [[0,0],[4,2,4],[5],[8],[4,5],[6,7,0,5,3],[2,8,1,3,1,0,4],[2,4,5],[4,4,5,6,7,0]]
I tried this loop:
h=[]
for j in x:
    h=[[a] for i in range(j) for a in y[i:i+1]]
But it only stores the result for the last value of x.
Also, I am not sure whether the title of this question is appropriate for this problem. Anyone can edit it if it is confusing. Thank you so much.
You're reassigning h each time through the loop, so it ends up holding only the last iteration's value.
You should append to it instead of assigning it, and keep track of where the next chunk starts in y:
h = []
start = 0
for j in x:
    h.append(y[start:start+j])
    start += j
Another way to do it is to create an iterator over y and consume it as you go:
x = [2, 3, 1, 1, 2, 5, 7, 3, 6]
y = [0, 0, 4, 2, 4, 5, 8, 4, 5, 6, 7, 0, 5, 3, 2, 8, 1, 3, 1, 0, 4, 2, 4, 5, 4, 4, 5, 6, 7, 0]
yi = iter(y)
res = [[next(yi) for _ in range(i)] for i in x]
print(res) # -> [[0, 0], [4, 2, 4], [5], [8], [4, 5], [6, 7, 0, 5, 3], [2, 8, 1, 3, 1, 0, 4], [2, 4, 5], [4, 4, 5, 6, 7, 0]]
Apart from the problem you are facing, and as a general rule, try to give your variables more meaningful names.
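Since the question mentions arrays, the same chunking can also be sketched with NumPy: the cumulative sums of the chunk sizes are the positions where y has to be cut. This assumes the sizes in x add up to exactly len(y), as they do in the example; the names cut_points and chunks are only illustrative.

```python
import numpy as np

x = [2, 3, 1, 1, 2, 5, 7, 3, 6]
y = [0, 0, 4, 2, 4, 5, 8, 4, 5, 6, 7, 0, 5, 3, 2,
     8, 1, 3, 1, 0, 4, 2, 4, 5, 4, 4, 5, 6, 7, 0]

# Cumulative sums of the chunk sizes are the cut positions in y.
# The last one equals len(y) here, so it is dropped to avoid a
# trailing empty chunk.
cut_points = np.cumsum(x)[:-1]
chunks = [chunk.tolist() for chunk in np.split(np.array(y), cut_points)]
print(chunks)
```

np.split returns a list of arrays; the tolist() call only converts them back to plain lists so the result matches the z shown in the question.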

Using numpy to select rows based on a condition of one column

I have a file with several columns, for example:
1 2 3 4 5 6
2 4 5 6 7 4
3 4 5 6 7 6
2 0 1 5 6 0
2 4 6 8 9 9
I would like to select the rows whose value in column two lies in the range [0, 2] and save them (with all their columns) to a new file.
The new file should contain:
1 2 3 4 5 6
2 0 1 5 6 0
Kindly assist me. I would prefer to do this with numpy in Python.
For an array a, you can use:
a[(a[:,1] <= 2) & (a[:,1] >= 0)]
Here, the condition builds a boolean mask from the second column (index 1), and indexing with that mask keeps only the matching rows.
For your example:
>>> a
array([[1, 2, 3, 4, 5, 6],
[2, 4, 5, 6, 7, 4],
[3, 4, 5, 6, 7, 6],
[2, 0, 1, 5, 6, 0],
[2, 4, 6, 8, 9, 9]])
>>> a[(a[:,1] <= 2) & (a[:,1] >= 0)]
array([[1, 2, 3, 4, 5, 6],
[2, 0, 1, 5, 6, 0]])
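The question also asks about reading from and writing to files. The sketch below shows one way to do the full round trip with np.loadtxt and np.savetxt. It uses in-memory io.StringIO buffers as stand-ins for the real files, and assumes the data is whitespace-separated integers as in the example.

```python
import io
import numpy as np

# Stand-in for the input file; np.loadtxt also accepts a file name.
source = io.StringIO(
    "1 2 3 4 5 6\n"
    "2 4 5 6 7 4\n"
    "3 4 5 6 7 6\n"
    "2 0 1 5 6 0\n"
    "2 4 6 8 9 9\n"
)
a = np.loadtxt(source, dtype=int)

# Keep the rows whose second column lies in [0, 2].
second_column = a[:, 1]
selected = a[(second_column >= 0) & (second_column <= 2)]

# Stand-in for the output file; pass a file name to write to disk instead.
target = io.StringIO()
np.savetxt(target, selected, fmt="%d")
print(target.getvalue())
```

The fmt="%d" argument keeps the values as plain integers in the output; without it np.savetxt writes them in scientific notation.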

Sliding windows along last axis of a 2D array to give a 3D array using NumPy strides

I am trying to use the function as_strided from numpy.lib.stride_tricks to extract sub-series from a larger 2D array, but I am struggling to find the right value for the strides argument.
Let's say I have a matrix m that contains 5 1D arrays of length a = 10. I want to extract every 1D sub-array of length b = 4 from each 1D array in m.
import numpy
from numpy.lib.stride_tricks import as_strided
a, b = 10, 4
m = numpy.array([range(i,i+a) for i in range(5)])
# first try
sub_m = as_strided(m, shape=(m.shape[0], m.shape[1]-b+1, b))
print(sub_m.shape)  # (5, 7, 4), which is what I expected
print(sub_m[-1, -1, -1])  # some unexpected strange number: 8227625857902995061
# second try with strides argument
sub_m = as_strided(m, shape=(m.shape[0], m.shape[1]-b+1, b), strides=(m.itemize,m.itemize,m.itemize))
# gives error, see below
AttributeError: 'numpy.ndarray' object has no attribute 'itemize'
As you can see, I managed to get the right shape for sub_m on my first try. However, I can't work out what to pass as strides=().
For information:
m = [[ 0 1 2 3 4 5 6 7 8 9]
[ 1 2 3 4 5 6 7 8 9 10]
[ 2 3 4 5 6 7 8 9 10 11]
[ 3 4 5 6 7 8 9 10 11 12]
[ 4 5 6 7 8 9 10 11 12 13]]
Expected output:
sub_m = [
[[0 1 2 3] [1 2 3 4] ... [5 6 7 8] [6 7 8 9]]
[[1 2 3 4] [2 3 4 5] ... [6 7 8 9] [7 8 9 10]]
[[2 3 4 5] [3 4 5 6] ... [7 8 9 10] [8 9 10 11]]
[[3 4 5 6] [4 5 6 7] ... [8 9 10 11] [9 10 11 12]]
[[4 5 6 7] [5 6 7 8] ... [9 10 11 12] [10 11 12 13]]
]
Edit: I have much more data than this, which is why I want to use as_strided (for efficiency).
Here's one approach with np.lib.stride_tricks.as_strided:
import numpy as np

def strided_lastaxis(a, L):
    # a: 2D array, L: window length along the last axis
    s0, s1 = a.strides
    m, n = a.shape
    return np.lib.stride_tricks.as_strided(a, shape=(m, n-L+1, L), strides=(s0, s1, s1))
A bit of explanation on the strides passed to as_strided:
The output is 3D, so it needs three strides. Moving along the last (third) axis advances by one element, so its stride is s1. Moving along the second axis shifts the window by that same one-element distance, so its stride is also s1. Moving along the first axis goes to the next row, exactly as in the input array, so its stride is s0.
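To make the stride values concrete, here is a small sketch on the matrix m from the question. Note that the attribute is spelled itemsize (the itemize in the question is what raises the AttributeError), and that strides are measured in bytes. The dtype is fixed to int64 here only so that the numbers are the same on every platform; the names row_stride and item_stride are illustrative.

```python
import numpy as np
from numpy.lib.stride_tricks import as_strided

a, b = 10, 4
# Fixed dtype so the byte counts below do not depend on the platform.
m = np.array([range(i, i + a) for i in range(5)], dtype=np.int64)

row_stride, item_stride = m.strides  # bytes to the next row / next element
print(m.itemsize, m.strides)         # 8 (80, 8)

sub_m = as_strided(m,
                   shape=(m.shape[0], m.shape[1] - b + 1, b),
                   strides=(row_stride, item_stride, item_stride))
print(sub_m[-1, -1])                 # last window of the last row
```

Using the item size for all three strides would not work: the first axis has to jump a whole row (80 bytes here), not a single element.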
Sample run -
In [46]: a
Out[46]:
array([[0, 5, 6, 2, 3, 6, 7, 1, 4, 8],
[2, 1, 3, 7, 0, 3, 5, 4, 0, 1]])
In [47]: strided_lastaxis(a, L=4)
Out[47]:
array([[[0, 5, 6, 2],
[5, 6, 2, 3],
[6, 2, 3, 6],
[2, 3, 6, 7],
[3, 6, 7, 1],
[6, 7, 1, 4],
[7, 1, 4, 8]],
[[2, 1, 3, 7],
[1, 3, 7, 0],
[3, 7, 0, 3],
[7, 0, 3, 5],
[0, 3, 5, 4],
[3, 5, 4, 0],
[5, 4, 0, 1]]])
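As a possible alternative, NumPy 1.20 and later provide numpy.lib.stride_tricks.sliding_window_view, which computes the shape and strides itself. The sketch below applies it to the matrix m from the question, assuming such a NumPy version is available.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

a, b = 10, 4
m = np.array([range(i, i + a) for i in range(5)])

# Windows of length b along axis 1; the window axis is appended last.
sub_m = sliding_window_view(m, window_shape=b, axis=1)
print(sub_m.shape)    # (5, 7, 4)
print(sub_m[-1, -1])  # last window of the last row
```

The result is a view on m, so no data is copied, and it is read-only by default, which guards against accidentally writing through overlapping windows.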

Translate reshape from Matlab to Python

I'm using numpy and I don't know how to translate this MATLAB code to Python:
C = reshape(A(B.',:).', 6, []).';
I think the only part I got right is:
temp=A[B.transpose(),:]
but I don't know how to translate the rest of the expression.
Example matrices:
A =
1 2
1 3
1 4
1 5
1 6
2 3
2 4
2 5
2 6
B =
1 2 3
1 2 4
1 2 5
1 2 6
1 2 7
1 2 8
1 2 9
C =
1 2 1 3 1 4
1 2 1 3 1 5
1 2 1 3 1 6
1 2 1 3 2 3
1 2 1 3 2 4
1 2 1 3 2 5
1 2 1 3 2 6
This looks like an indexing plus reshaping operation. One thing to keep in mind is that NumPy is zero-indexed, while MATLAB is one-indexed. That means you need to index A with B - 1, and then reshape the result as desired. For example:
import numpy as np
A = np.array([[1, 2],
[1, 3],
[1, 4],
[1, 5],
[1, 6],
[2, 3],
[2, 4],
[2, 5],
[2, 6]])
B = np.array([[1, 2, 3],
[1, 2, 4],
[1, 2, 5],
[1, 2, 6],
[1, 2, 7],
[1, 2, 8],
[1, 2, 9]])
C = A[B - 1].reshape(B.shape[0], -1)
The result is:
>>> C
array([[1, 2, 1, 3, 1, 4],
[1, 2, 1, 3, 1, 5],
[1, 2, 1, 3, 1, 6],
[1, 2, 1, 3, 2, 3],
[1, 2, 1, 3, 2, 4],
[1, 2, 1, 3, 2, 5],
[1, 2, 1, 3, 2, 6]])
One potentially confusing piece: the -1 in the reshape method is a marker that indicates numpy should calculate the appropriate dimension to preserve the size of the array.
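For comparison, here is a sketch of a step-by-step translation that follows the MATLAB expression literally. The extra order="F" arguments are needed because MATLAB flattens and fills arrays column by column, while NumPy defaults to row by row. The intermediate names idx, rows and cols are only illustrative.

```python
import numpy as np

A = np.array([[1, 2], [1, 3], [1, 4], [1, 5], [1, 6],
              [2, 3], [2, 4], [2, 5], [2, 6]])
B = np.array([[1, 2, 3], [1, 2, 4], [1, 2, 5], [1, 2, 6],
              [1, 2, 7], [1, 2, 8], [1, 2, 9]])

# A(B.',:)  -> MATLAB flattens the index matrix B.' column by column.
idx = (B.T - 1).ravel(order="F")           # 21 zero-based row indices
rows = A[idx, :]                           # shape (21, 2)
# reshape(rows.', 6, [])  -> MATLAB fills the result column by column.
cols = rows.T.reshape((6, -1), order="F")  # shape (6, 7)
C_literal = cols.T                         # final .'  -> shape (7, 6)

# The short version from the answer above.
C_short = A[B - 1].reshape(B.shape[0], -1)
print(np.array_equal(C_literal, C_short))  # True
```

The short version works because indexing A with the 2-D array B - 1 already yields a (7, 3, 2) array whose row-major layout matches the desired output, so the transposes in the MATLAB code are not needed.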

Filter a 2D numpy array from an array of values

Let's say I have a numpy array like the following:
nonSortedNonFiltered=np.array([[9,8,5,4,6,7,1,2,3],[1,3,2,6,4,5,7,9,8]])
I want to :
- Sort the array according to nonSortedNonFiltered[1]
- Filter the array according to nonSortedNonFiltered[0] and an array of values
I currently do the sorting with :
sortedNonFiltered=nonSortedNonFiltered[:,nonSortedNonFiltered[1].argsort()]
Which gives: np.array([[9, 5, 8, 6, 7, 4, 1, 3, 2], [1, 2, 3, 4, 5, 6, 7, 8, 9]])
Now I want to filter sortedNonFiltered from an array of values, for example :
sortedNonFiltered=np.array([[9,5,8,6,7,4,1,3,2],[1,2,3,4,5,6,7,8,9]])
listOfValues=np.array([8,6,5,2,1])
# ...something here...
# np.array([[5,8,6,1,2],[2,3,4,7,9]])  <- what I want to get in the end
Note : Each value in a column of my 2D array is exclusive.
You can use np.in1d to get a boolean mask and use it to filter the columns of the sorted array, like this:
output = sortedNonFiltered[:,np.in1d(sortedNonFiltered[0],listOfValues)]
Sample run -
In [76]: nonSortedNonFiltered
Out[76]:
array([[9, 8, 5, 4, 6, 7, 1, 2, 3],
[1, 3, 2, 6, 4, 5, 7, 9, 8]])
In [77]: sortedNonFiltered
Out[77]:
array([[9, 5, 8, 6, 7, 4, 1, 3, 2],
[1, 2, 3, 4, 5, 6, 7, 8, 9]])
In [78]: listOfValues
Out[78]: array([8, 6, 5, 2, 1])
In [79]: sortedNonFiltered[:,np.in1d(sortedNonFiltered[0],listOfValues)]
Out[79]:
array([[5, 8, 6, 1, 2],
[2, 3, 4, 7, 9]])
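The same filtering can be sketched with np.isin, which has been available since NumPy 1.13 and is the function the NumPy documentation recommends over np.in1d for new code. The example below repeats the sort from the question so that it runs on its own; the name keep is only illustrative.

```python
import numpy as np

nonSortedNonFiltered = np.array([[9, 8, 5, 4, 6, 7, 1, 2, 3],
                                 [1, 3, 2, 6, 4, 5, 7, 9, 8]])
listOfValues = np.array([8, 6, 5, 2, 1])

# Sort the columns by the second row, as in the question.
sortedNonFiltered = nonSortedNonFiltered[:, nonSortedNonFiltered[1].argsort()]

# True for every column whose first-row value appears in listOfValues.
keep = np.isin(sortedNonFiltered[0], listOfValues)
output = sortedNonFiltered[:, keep]
print(output)
```

The mask keeps the columns in their sorted order; the order of the values inside listOfValues does not affect the result.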
