Python list of matrices

I'm new to Python and could use some help. I looked for similar questions, but they all differ slightly and their answers don't work for my problem. I use PyCharm and Python 3.8.
Cutting to the chase:
I have a list of matrices and I want to average all the matrix values. I am already struggling with accessing the values.
A small test list looks like this:
data = [[[1, 2, 3], [1, 2, 3], [1, 2, 3]],
[[2, 3, 4], [2, 3, 4], [2, 3, 4]],
[[3, 4, 5], [3, 4, 5], [3, 4, 5]],
[[4, 5, 6], [4, 5, 6], [4, 5, 6]]]
I want to access all values at position (n, m) across the list; for (1, 1), for example, I expect [2, 3, 4, 5].
I tried to use:
print(data[:][1][1])
And got the result:
[2, 3, 4], which is one entry short. I also think this is just data[1][1], which is not what I want.
Can someone tell me what I’m doing wrong?

When working with matrices in Python, I advise using NumPy.
Your data is a list of four 3x3 matrices:
data = [
[
[1, 2, 3],
[1, 2, 3],
[1, 2, 3]
],
[
[2, 3, 4],
[2, 3, 4],
[2, 3, 4]
],
[
[3, 4, 5],
[3, 4, 5],
[3, 4, 5]
],
[
[4, 5, 6],
[4, 5, 6],
[4, 5, 6]
],
]
We can easily convert this to a numpy array:
import numpy as np
data_np = np.array(data)
print(data_np.shape)
The last statement prints (4, 3, 3), the same structure as your data: a 3-dimensional array in which the first index selects a matrix and the last two select an element within that matrix. Now you can slice along any dimension, including your desired result:
data_np[:, 1, 1]
which returns array([2, 3, 4, 5]). You can also convert it to a Python list if needed with data_np[:, 1, 1].tolist()
There's also a pure-Python version, which I do not recommend here, but it can be a useful pattern in less obvious cases. Using a list comprehension, we visit each matrix and pick out the element at the index of interest.
[matrix[1][1] for matrix in data]
which returns a list [2, 3, 4, 5]
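If the pure-Python route is needed for more than one position, the comprehension can be wrapped in a small helper. This is only a sketch: the name values_at is made up for illustration, and it assumes every matrix in the list has the same shape.

```python
from statistics import mean

data = [[[1, 2, 3], [1, 2, 3], [1, 2, 3]],
        [[2, 3, 4], [2, 3, 4], [2, 3, 4]],
        [[3, 4, 5], [3, 4, 5], [3, 4, 5]],
        [[4, 5, 6], [4, 5, 6], [4, 5, 6]]]


def values_at(matrices, n, m):
    """Hypothetical helper: collect element (n, m) from every matrix."""
    return [matrix[n][m] for matrix in matrices]


column = values_at(data, 1, 1)
print(column)        # [2, 3, 4, 5]
print(mean(column))  # 3.5
```

The second print shows how the collected values could feed into the averaging step the question mentions.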

I suggest using the numpy package, which is great for all sorts of matrix and vector operations. It makes indexing a lot easier.
Note that I added some line breaks to your data array to make it more readable.
Example:
import numpy as np
data = np.array(
[
[[1,2,3],[1,2,3],[1,2,3]],
[[2,3,4],[2,3,4],[2,3,4]],
[[3,4,5],[3,4,5],[3,4,5]],
[[4,5,6],[4,5,6],[4,5,6]]
]
)
print(data[:,1,1])
Outputs:
[2 3 4 5]

Lists cannot be indexed in the same way as NumPy arrays.
Have a look at what is happening with your slices:
The first slice just returns a (shallow) copy of the whole list
data[:]
[[[1, 2, 3], [1, 2, 3], [1, 2, 3]],
[[2, 3, 4], [2, 3, 4], [2, 3, 4]],
[[3, 4, 5], [3, 4, 5], [3, 4, 5]],
[[4, 5, 6], [4, 5, 6], [4, 5, 6]]]
The second step is a plain index rather than a slice: it returns the second matrix (list indexes start at 0, so index 1 is the second element)
data[:][1]
Out[46]: [[2, 3, 4], [2, 3, 4], [2, 3, 4]]
The third index returns the second row of that second matrix
data[:][1][1]
Out[47]: [2, 3, 4]
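A quick check makes the point concrete. This sketch assumes the data list from the question and relies only on built-in list behaviour: data[:] is a shallow copy of the outer list, so indexing it is the same as indexing data directly.

```python
data = [[[1, 2, 3], [1, 2, 3], [1, 2, 3]],
        [[2, 3, 4], [2, 3, 4], [2, 3, 4]],
        [[3, 4, 5], [3, 4, 5], [3, 4, 5]],
        [[4, 5, 6], [4, 5, 6], [4, 5, 6]]]

outer_copy = data[:]                 # new outer list, same inner matrices
print(outer_copy == data)            # True: equal contents
print(outer_copy is data)            # False: a different list object
print(data[:][1][1] == data[1][1])   # True: the slice changes nothing here
```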
If you want to achieve what you are looking for with a list, you would use:
[x[1][1] for x in data]
[2, 3, 4, 5]
This loops through each matrix and selects the second element of its second row.
However, it would be better to use numpy:
import numpy as np
arr = np.array(data)
arr[:, 1, 1]
Out[56]: array([2, 3, 4, 5])
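Since the stated goal was to average the matrix values, here is a minimal sketch of that last step. It assumes "average" means the element-wise mean across the four matrices (axis 0); if a single overall mean is wanted instead, calling mean() without an axis would apply.

```python
import numpy as np

data = [[[1, 2, 3], [1, 2, 3], [1, 2, 3]],
        [[2, 3, 4], [2, 3, 4], [2, 3, 4]],
        [[3, 4, 5], [3, 4, 5], [3, 4, 5]],
        [[4, 5, 6], [4, 5, 6], [4, 5, 6]]]

arr = np.array(data)

# Mean over the first axis: one 3x3 matrix of per-position averages.
elementwise_mean = arr.mean(axis=0)
print(elementwise_mean.tolist())
# [[2.5, 3.5, 4.5], [2.5, 3.5, 4.5], [2.5, 3.5, 4.5]]

# Mean over every value in every matrix.
overall_mean = arr.mean()
print(overall_mean)  # 3.5
```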

Related

Numpy 2D to 3D array based on data in a column

Let's say I have data structured in a 2D array like this:
[[1, 3, 4, 6],
[1, 4, 8, 2],
[1, 3, 2, 9],
[2, 2, 4, 8],
[2, 4, 9, 1],
[2, 2, 9, 3]]
The first column denotes a third dimension, so I want to convert this to the following 3D array:
[[[3, 4, 6],
[4, 8, 2],
[3, 2, 9]],
[[2, 4, 8],
[4, 9, 1],
[2, 9, 3]]]
Is there a built-in numpy function to do this?
You can try code below:
import numpy as np
array = np.array([[1, 3, 4, 6],
[1, 4, 8, 2],
[1, 3, 2, 9],
[2, 2, 4, 8],
[2, 4, 9, 1],
[2, 2, 9, 3]])
array = np.delete(array, 0, 1)
array.reshape(2,3,-1)
Output
array([[[3, 4, 6],
[4, 8, 2],
[3, 2, 9]],
[[2, 4, 8],
[4, 9, 1],
[2, 9, 3]]])
This code works when you know the array's shape in advance. If you only know that the number of rows is a multiple of 3 (three rows per group), you can derive the first dimension from the array itself. Note that -1 is the placeholder that reshape documents for the inferred dimension:
array.reshape(array.shape[0]//3, 3, -1)
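If the group size is not known up front, it can be read off the first column. The following is a sketch under two assumptions that the question's data satisfies but that are not guaranteed in general: rows belonging to the same group are contiguous and already in order, and every group has the same number of rows.

```python
import numpy as np

array = np.array([[1, 3, 4, 6],
                  [1, 4, 8, 2],
                  [1, 3, 2, 9],
                  [2, 2, 4, 8],
                  [2, 4, 9, 1],
                  [2, 2, 9, 3]])

# Distinct values of the first column and how often each occurs.
labels, counts = np.unique(array[:, 0], return_counts=True)
if len(set(counts.tolist())) != 1:
    raise ValueError("groups have different sizes; cannot form a regular 3D array")

# Drop the first column, then split the rows into len(labels) equal groups.
result = array[:, 1:].reshape(len(labels), counts[0], -1)
print(result.shape)  # (2, 3, 3)
```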
Use numpy array slicing with reshape function.
import numpy as np
arr = [[1, 3, 4, 6],
[1, 4, 8, 2],
[1, 3, 2, 9],
[2, 2, 4, 8],
[2, 4, 9, 1],
[2, 2, 9, 3]]
# convert the list to numpy array
arr = np.array(arr)
# remove first column from numpy array
arr = arr[:,1:]
# reshape the remaining array to desired shape
arr = arr.reshape(len(arr)//3,3,-1)
print(arr)
Output:
[[[3 4 6]
[4 8 2]
[3 2 9]]
[[2 4 8]
[4 9 1]
[2 9 3]]]
Your example shows plain lists rather than a NumPy array. I am unsure whether you are suggesting NumPy merely as a means to get a plain-list result, or whether you actually want a NumPy array as the result. If you don't actually need NumPy, you could do something like this:
arr = [[1, 3, 4, 6],
[1, 4, 8, 2],
[1, 3, 2, 9],
[2, 2, 4, 8],
[2, 4, 9, 1],
[2, 2, 9, 3]]
# Length of the 3rd and 2nd dimension.
nz = arr[-1][0] + (arr[0][0]==0)
ny = len(arr) // nz
res = [[arr[ny*z_idx+y_idx][1:] for y_idx in range(ny)] for z_idx in range(nz)]
OUTPUT:
[[[3, 4, 6], [4, 8, 2], [3, 2, 9]], [[2, 4, 8], [4, 9, 1], [2, 9, 3]]]
Note that the calculation of nz handles a 3rd-dimension index that is either 0-based (as Python is by default) or 1-based (as in your example).
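An alternative pure-Python sketch groups rows by their first element with itertools.groupby. It assumes rows with the same index are adjacent in the input (as in the question); in exchange, it does not need the groups to have equal length or the index to start at a particular value.

```python
from itertools import groupby
from operator import itemgetter

arr = [[1, 3, 4, 6],
       [1, 4, 8, 2],
       [1, 3, 2, 9],
       [2, 2, 4, 8],
       [2, 4, 9, 1],
       [2, 2, 9, 3]]

# Group consecutive rows sharing the same first element, then drop that element.
res = [[row[1:] for row in rows] for _, rows in groupby(arr, key=itemgetter(0))]
print(res)
# [[[3, 4, 6], [4, 8, 2], [3, 2, 9]], [[2, 4, 8], [4, 9, 1], [2, 9, 3]]]
```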

Rank elements in nested list without sorting list

Let's say I have a nested list:
list = [[10, 2, 8, 4], [12, 6, 4, 1], [8, 4, 3, 2], [9, 3, 4, 6]]
I want to rank the elements in the sublist against each other to create a new nested list with the rankings.
result = [[1, 4, 2, 3], [1, 2, 3, 4], [1, 2, 3, 4], [1, 4, 3, 2]]
in the first sublist 10 would be 1st, 8 2nd, etc.
There are already some good solutions. Here is another one, a functional approach, for reference:
No third-party library used.
lst = [[10, 2, 8, 4], [12, 6, 4, 1], [8, 4, 3, 2], [9, 3, 4, 6]]  # your lists; don't shadow the builtin "list"
def ranking(nums):
    # map each value to its 1-based position in descending order
    ranks = {x: i for i, x in enumerate(sorted(nums, reverse=True), 1)}
    return [ranks[x] for x in nums]  # quick mapping back: O(1) per lookup
Calling it:
result = list(map(ranking, lst))
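Putting the pieces together on the question's data gives the expected result. The sketch also shows one property worth knowing: because the dictionary keeps the last position seen for a value, tied values share the lowest (worst) of their ranks. The tie input is made up for illustration.

```python
def ranking(nums):
    # Later duplicates overwrite earlier ones, so ties end up with the worst rank.
    ranks = {x: i for i, x in enumerate(sorted(nums, reverse=True), 1)}
    return [ranks[x] for x in nums]


lst = [[10, 2, 8, 4], [12, 6, 4, 1], [8, 4, 3, 2], [9, 3, 4, 6]]
result = list(map(ranking, lst))
print(result)
# [[1, 4, 2, 3], [1, 2, 3, 4], [1, 2, 3, 4], [1, 4, 3, 2]]

tied = ranking([5, 5, 3])
print(tied)  # [2, 2, 3]
```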
As already mentioned in the comments, you can use numpy.argsort. Applying it twice gives the (0-based, ascending) rank of each value; subtract that from the length of the sublist to rank from highest to lowest. A list comprehension does this for all the sublists.
>>> import numpy as np
>>> lst = [[10, 2, 8, 4], [12, 6, 4, 1], [8, 4, 3, 2], [9, 3, 4, 6]]
>>> [(len(sub)-np.argsort(sub).argsort()).tolist() for sub in lst]
[[1, 4, 2, 3], [1, 2, 3, 4], [1, 2, 3, 4], [1, 4, 3, 2]]
You can even use a 2D NumPy array: negate the values, call argsort twice on the result, and finally add 1:
>>> (-np.array(lst)).argsort().argsort()+1
array([[1, 4, 2, 3],
[1, 2, 3, 4],
[1, 2, 3, 4],
[1, 4, 3, 2]], dtype=int64)
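To see why two argsort calls produce ranks, it may help to print the intermediate arrays for one sublist. This is just an illustrative walk-through with values taken from the question; the variable names are chosen for illustration.

```python
import numpy as np

sub = np.array([10, 2, 8, 4])

order = np.argsort(sub)          # indices that would sort sub in ascending order
asc_rank = np.argsort(order)     # 0-based position of each element in that order
desc_rank = len(sub) - asc_rank  # flip so the largest value gets rank 1

print(order.tolist())      # [1, 3, 2, 0]
print(asc_rank.tolist())   # [3, 0, 2, 1]
print(desc_rank.tolist())  # [1, 4, 2, 3]
```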
You can use scipy.stats.rankdata:
my_list = [[10, 2, 8, 4], [12, 6, 4, 1], [8, 4, 3, 2], [9, 3, 4, 6]]
from scipy.stats import rankdata
[list(len(l)+1-rankdata(l).astype(int)) for l in my_list]
output:
[[1, 4, 2, 3], [1, 2, 3, 4], [1, 2, 3, 4], [1, 4, 3, 2]]
Without numpy/scipy:
[[sorted(li, reverse=True).index(x)+1 for x in li] for li in my_list]
[[1, 4, 2, 3], [1, 2, 3, 4], [1, 2, 3, 4], [1, 4, 3, 2]]
Another solution with no external libraries, and with a better time complexity, just in case your sublists are a bit longer than 4 items (this has some overhead but I presume it is O(n log n) because of the call to sorted).
def rank_all(ls):
    result = []
    for subls in ls:
        pairs = sorted([(subls[j], j) for j in range(len(subls))], reverse=True)
        ranked = [0] * len(subls)
        for j, p in enumerate(pairs):
            ranked[p[1]] = j + 1
        result.append(ranked)
    return result
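For completeness, here is a usage sketch of this function on the question's data. The function is repeated so the snippet runs on its own. Note that tied values receive distinct consecutive ranks with this approach; the tie input is a made-up example.

```python
def rank_all(ls):
    result = []
    for subls in ls:
        # Pair each value with its original index, largest value first.
        pairs = sorted([(subls[j], j) for j in range(len(subls))], reverse=True)
        ranked = [0] * len(subls)
        for j, p in enumerate(pairs):
            ranked[p[1]] = j + 1  # write the rank back at the original index
        result.append(ranked)
    return result


lst = [[10, 2, 8, 4], [12, 6, 4, 1], [8, 4, 3, 2], [9, 3, 4, 6]]
ranks = rank_all(lst)
print(ranks)
# [[1, 4, 2, 3], [1, 2, 3, 4], [1, 2, 3, 4], [1, 4, 3, 2]]

tied = rank_all([[5, 5, 3]])
print(tied)  # [[2, 1, 3]]
```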

How to create list of lists for repeating elements

I have the following list A and would like to create the list of lists B. How can I detect the repeating elements and avoid the later repeats because there is a different element in between?
A = [1,3,3,3,3,4,4,4,6,6,6,6,3,3,4,4,4,4,5]
B = [[1],[3,3,3,3],[4,4,4],[6,6,6,6],[3,3],[4,4,4,4],[5]]
itertools.groupby can do this for you.
>>> import itertools
>>> [list(x[1]) for x in itertools.groupby(A)]
[[1], [3, 3, 3, 3], [4, 4, 4], [6, 6, 6, 6], [3, 3], [4, 4, 4, 4], [5]]
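groupby only merges consecutive equal items, which is why the later runs of 3 and 4 stay separate from the earlier ones. For readers who want to see the mechanics, here is a hand-written equivalent; group_runs is a hypothetical name, not a library function.

```python
def group_runs(items):
    """Hypothetical helper: split a sequence into runs of consecutive equal items."""
    groups = []
    for item in items:
        if groups and groups[-1][-1] == item:
            groups[-1].append(item)  # same value as the current run: extend it
        else:
            groups.append([item])    # value changed: start a new run
    return groups


A = [1, 3, 3, 3, 3, 4, 4, 4, 6, 6, 6, 6, 3, 3, 4, 4, 4, 4, 5]
B = group_runs(A)
print(B)
# [[1], [3, 3, 3, 3], [4, 4, 4], [6, 6, 6, 6], [3, 3], [4, 4, 4, 4], [5]]
```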

Numba list argument for array indexing

I'm looking to leverage Numba to iterate over a large 2D array, where on each iteration a subset of the array is selected by [x, y] location (passed as an argument). I'm having trouble structuring this in a way that plays nicely with Numba, specifically when passing a list of lists as an argument into the function. Any pointers?
x_y_list = [[1, 2], [3, 4], [5, 6]]
array = ([[1, 2, 3, 4, 5, 6],
[1, 2, 3, 4, 5, 6],
[1, 2, 3, 4, 5, 6]])
#jit
def arrIndexing(array, x_y_list):
    for index in x_y_list:
        subset = array[index[0]-1:index[0]+1, index[1]-1:index[1]+1]
        # do some other stuff
Something like this?
Should do well with Numba but I haven’t tested (did this on my phone which doesn’t support Numba)
import numpy as np
def xy():
    x_y_list = np.array([[1, 2], [2, 4], [0, 5]])
    array = np.array([[1, 2, 3, 4, 5, 6], [1, 2, 3, 4, 5, 6], [1, 2, 3, 4, 5, 6]])
    for i, j in x_y_list:
        print(array[np.ix_((i-1, i), (j-1, j))])
>>> xy()
[[2 3]
[2 3]]
[[4 5]
[4 5]]
[[5 6]
[5 6]]
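For the Numba side of the question, one common approach is to pass the coordinates as a 2D NumPy array rather than a list of lists, since Numba's nopython mode works most naturally with NumPy arrays. The sketch below has not been checked against any particular Numba version; the function name window_sums, the choice of summing each window, the in-range coordinates, and the fallback decorator used when Numba is not installed are all assumptions made for illustration.

```python
import numpy as np

try:
    from numba import njit
except ImportError:  # fallback so the sketch still runs without Numba
    def njit(func):
        return func


@njit
def window_sums(array, x_y):
    # One result per (x, y) pair; "sum" stands in for "do some other stuff".
    out = np.empty(x_y.shape[0], dtype=np.int64)
    for k in range(x_y.shape[0]):
        i = x_y[k, 0]
        j = x_y[k, 1]
        # Same 2x2 window as the slicing in the question.
        out[k] = array[i - 1:i + 1, j - 1:j + 1].sum()
    return out


array = np.array([[1, 2, 3, 4, 5, 6],
                  [1, 2, 3, 4, 5, 6],
                  [1, 2, 3, 4, 5, 6]], dtype=np.int64)
x_y = np.array([[1, 2], [2, 4]], dtype=np.int64)

sums = window_sums(array, x_y)
print(sums.tolist())  # [10, 18]
```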

Subsampling 3D array using the neighbourhood sum

The title is probably confusing. I have a reasonably large 3D numpy array. I'd like to cut its size by 2^3 by binning blocks of size (2,2,2). Each element in the new 3D array should then contain the sum of the elements in its respective block in the original array.
As an example, consider a 4x4x4 array:
input = [[[1, 1, 2, 2],
[1, 1, 2, 2],
[3, 3, 4, 4],
[3, 3, 4, 4]],
[[1, 1, 2, 2],
[1, 1, 2, 2],
[3, 3, 4, 4],
[3, 3, 4, 4]],
... ]]]
(I'm only representing half of it to save space). Notice that all the elements with the same value constitute a (2x2x2) block. The output should be a 2x2x2 array such that each element is the sum of a block:
output = [[[8, 16],
[24, 32]],
... ]]]
So 8 is the sum of all 1's, 16 is the sum of the 2's, and so on.
There's a ready-made function for such block-wise reductions, skimage.measure.block_reduce (from scikit-image):
In [36]: a
Out[36]:
array([[[1, 1, 2, 2],
[1, 1, 2, 2],
[3, 3, 4, 4],
[3, 3, 4, 4]],
[[1, 1, 2, 2],
[1, 1, 2, 2],
[3, 3, 4, 4],
[3, 3, 4, 4]]])
In [37]: from skimage.measure import block_reduce
In [39]: block_reduce(a, block_size=(2,2,2), func=np.sum)
Out[39]:
array([[[ 8, 16],
[24, 32]]])
Other reduction functions work too, say a max-reduction:
In [40]: block_reduce(a, block_size=(2,2,2), func=np.max)
Out[40]:
array([[[1, 2],
[3, 4]]])
Implementing such a function isn't that difficult with NumPy tools and could be done like so:
def block_reduce_numpy(a, block_size, func):
    shp = a.shape
    # split every axis of length i into (i//j blocks, j elements per block)
    new_shp = np.hstack([(i//j, j) for (i, j) in zip(shp, block_size)])
    # the within-block axes are the odd ones: 1, 3, 5, ...
    select_axes = tuple(np.arange(a.ndim)*2+1)
    return func(a.reshape(new_shp), axis=select_axes)
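The reshape trick can be checked on a full 4x4x4 input. In this sketch the second half of the array, which the question elides, is filled with the assumed values 5 to 8; the reshape and reduction follow the function above, written out inline for the (2,2,2) case.

```python
import numpy as np

# One value per 2x2x2 block; 5..8 for the elided half is an assumption.
blocks = np.arange(1, 9).reshape(2, 2, 2)

# Expand every value into a 2x2x2 block to build the 4x4x4 input.
a = blocks
for axis in range(3):
    a = np.repeat(a, 2, axis=axis)

# Split each axis of length 4 into (2 blocks, 2 elements per block) ...
split = a.reshape(2, 2, 2, 2, 2, 2)
# ... and sum over the three within-block axes.
summed = split.sum(axis=(1, 3, 5))

print(a[0].tolist())
# [[1, 1, 2, 2], [1, 1, 2, 2], [3, 3, 4, 4], [3, 3, 4, 4]]
print(summed[0].tolist())  # [[8, 16], [24, 32]]
```

Each output element is eight times the block value, matching the sums described in the question for the half that is shown.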
