Can't seem to flatten numpy array - python

I have a numpy array which, when printed, looks like this:
print(a.shape)
(21,)
print(a)
[array([8.55570588e+03, 4.23078573e+05, 2.81254715e+07, 2.10356201e+09,
4.24558286e+05, 2.10032147e+07, 1.39638949e+09, 1.04453957e+11,
2.81593475e+07, 1.39354786e+09, 9.26480296e+10, 6.92992796e+12,
2.10047682e+09, 1.03982525e+11, 6.91296507e+12, 5.17021191e+14])
array([8.55404706e+03, 4.23328400e+05, 2.80891690e+07, 2.09651453e+09,
4.23874124e+05, 2.09628073e+07, 1.39044370e+09, 1.03745119e+11,
2.81060928e+07, 1.38935279e+09, 9.21288996e+10, 6.87207671e+12,
2.09626303e+09, 1.03584989e+11, 6.86712650e+12, 5.12107449e+14])
array([6.71569608e+03, 3.32364057e+05, 2.20526342e+07, 1.64564735e+09,
3.32826578e+05, 1.64539763e+07, 1.09116635e+09, 8.13888141e+10,
2.20612069e+07, 1.08976996e+09, 7.22409501e+10, 5.38629510e+12,
1.64474898e+09, 8.11907944e+10, 5.38026989e+12, 4.01021156e+14])
array([ 97, 120, 147, 106, 115, 151, 300, 268, 326, 454, 684,
1594, 2202, 2229, 1205, 2])
array([ 1, 0, 0, 0, 0, 1, 0, 1, 0, 2, 1,
11, 359, 1355, 3921, 4348])
array([ 1, 0, 0, 1, 0, 0, 6, 11, 31, 644, 2312,
3046, 3618, 321, 7, 2])
625.0 625.0 625.0 537178.875 1874648.75 1373895.875 1.275734191674592
2.066594119913508 1.6749058704798478 0.11276410212887233 2.55304393588347
1.1167704949278905 2.177796835501801 1.1323869527951895
1.3940068452456151]
Ideally I would want all of these values in one big array of length (3*16 + 3*16 + 15).
np.concatenate did not work, and flatten did not bring the desired result either.

One simple way is to use np.hstack, which flattens a list of arrays and scalars into a single 1D array. Example usage is as follows:
import numpy as np
a = [np.array([1, 2, 3]), np.array([4, 5, 6]), 7, 8, 9]
np.hstack(a)
>> array([1, 2, 3, 4, 5, 6, 7, 8, 9])
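The array in the question has shape (21,) and holds a mix of arrays and floats, so it is presumably an object-dtype array rather than a plain list. A minimal sketch of the same np.hstack idea on such an array follows; the small values are made up for illustration, and converting to a list first is just a cautious choice, not a requirement stated in the answer.

```python
import numpy as np

# Build a small object array that mixes 1D arrays and scalars,
# mimicking the structure printed in the question (values are made up).
a = np.empty(4, dtype=object)
a[0] = np.array([1.5, 2.5])
a[1] = np.array([3, 4])
a[2] = 5.0
a[3] = 6.0

# hstack promotes each scalar to a 1-element array and joins everything.
flat = np.hstack(list(a))
print(flat.shape)  # (6,)
print(flat.dtype)  # float64
```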

Related

How to modify rows of numpy arrays stored as a list

I want to modify rows of numpy arrays stored in a list. The arrays do not all have the same number of rows. I have several huge numpy arrays stored in a list. This is my data (for simplicity I copied only a small list of arrays):
elements= [array([[971, 466, 697, 1, 15, 18, 28],
[5445, 4, 301, 2, 12, 47, 5]]),
array([[5883, 316, 377, 2, 9, 87, 1]])]
Then, I want to replace the fifth column (index 4) of each row with the last one and then delete the last column. I want to have the following result:
[array([[971, 466, 697, 1, 28, 18],
[5445, 4, 301, 2, 5, 47]]),
array([[5883, 316, 377, 2, 1, 87]])]
I tried the following code but it was not successful:
length=[len(i) for i in elements] # To find the length of each array
h=sum(length) # to find the total number of rows
for i in range (h):
    elements[:,[4,-1]] = elements[:,[-1,4]]
elements=np.delete(elements,[-1],1)
I am facing the following error:
TypeError: list indices must be integers or slices, not tuple
I appreciate any help in advance.
You can do it without loops, but it's still slower (1.75 times on large data) than the accepted solution:
counts = list(map(len, elements))
arr = np.concatenate(elements)
arr[:, 4] = arr[:, -1]
new_elements = np.split(arr[:,:-1], np.cumsum(counts)[:-1])
Concatenation is quite slow in numpy.
A simple inefficient solution:
import numpy as np
elements= [np.array([[971, 466, 697, 1, 15, 18, 28],
[5445, 4, 301, 2, 12, 47, 5]]),
np.array([[5883, 316, 377, 2, 9, 87, 1]])]
new_elements = list()
for arr in elements:
    arr[:, 4] = arr[:, -1]
    new_elements.append(arr[:, :-1])
The new list output is:
new_elements
Out[11]:
[array([[ 971, 466, 697, 1, 28, 18],
[5445, 4, 301, 2, 5, 47]]),
array([[5883, 316, 377, 2, 1, 87]])]
Try this one
p=[]
for x in range(len(elements)):
    for y in range(len(elements[x])):
        p.append(list(elements[x][y][:4])+[elements[x][y][-1]]+[elements[x][y][-2]])
print(p)
[[971, 466, 697, 1, 28, 18],
[5445, 4, 301, 2, 5, 47],
[5883, 316, 377, 2, 1, 87]]
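Note that the loop-based answer above writes into the original arrays in place. If the inputs should stay untouched, one possible variation (a sketch, not one of the posted answers) is to pick the wanted columns with an index list, which returns a copy. The index list assumes every array has 7 columns, as in the question's sample data.

```python
import numpy as np

elements = [np.array([[971, 466, 697, 1, 15, 18, 28],
                      [5445, 4, 301, 2, 12, 47, 5]]),
            np.array([[5883, 316, 377, 2, 9, 87, 1]])]

# Keep columns 0-3, put the last column where column 4 was, then column 5.
# Indexing with a list copies the data, so `elements` is left unchanged.
cols = [0, 1, 2, 3, -1, 5]
new_elements = [arr[:, cols] for arr in elements]

print(new_elements[0])
print(new_elements[1])
```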

2D cross-correlation of a 6x6 array with three 3x3 kernels

I have a 6x6 matrix: e.g. matrix A
array([[ 0, 1, 2, 3, 4, 5],
[ 6, 7, 8, 9, 10, 11],
[12, 13, 14, 15, 16, 17],
[18, 19, 20, 21, 22, 23],
[24, 25, 26, 27, 28, 29],
[30, 31, 32, 33, 34, 35]])
I also have a 3x3x3 matrix: e.g. matrix B
array([[[ 1, 7, 2],
[ 5, 9, 3],
[ 2, 8, 6]],
[[ 3, 4, 6],
[ 6, 8, 9],
[ 4, 2, 8]],
[[ 6, 4, 7],
[ 8, 7, 8],
[ 4, 4, 7]]])
Finally, I have a 3x4x4 matrix C, (4 rows, 4 columns, 3 dimensions), that's empty (filled with 0s)
I want to multiply each "3rd dimension" of B (i.e. [1,:,:], [2,:,:], [3,:,:]) with A. However, for each dimension I want to multiply B in "windows": slide by 1 each time across A until I cannot go further, then move back to the beginning, slide 1 unit down, and again slide across one by one, multiplying B with A, until the end; then move down and repeat without going over the border. The results are stored in the respective "3rd dimension" of matrix C, so my result would be a [3x4x4] matrix.
Ex. (multiplication is the dot product giving a scalar value, np.sum((np.multiply(x,y)))), so...
imagining B "overtop" of A, starting in the top-left corner, I multiply that 3x3 part of A with B's [1x3x3] part, storing the result in C...
referring to 1st unit (located in 1st row and 1st column) in the 1st dimension of C...
C[1,0,0] = 340, because [[0,1,2],[6,7,8],[12,13,14]] dot product [[1,7,2],[5,9,3],[2,8,6]]
sliding B matrix over by 1 on A, and storing my 2nd result in C...
C[1,0,1] = 383, because [[1,2,3],[7,8,9],[13,14,15]] dot product [[1,7,2],[5,9,3],[2,8,6]]
Then repeat this procedure of sliding across and down and across and ..., for B[2,:,:] and B[3,:,:] over A again, storing in C[2,:,:] and C[3,:,:] respectively.
What is a good way to do this?
I think you're asking about 2D cross-correlation with three different kernels, rather than straightforward matrix multiplication.
The following piece of code is not the most efficient way to do this, but does this give you the answer you are looking for? I'm using scipy.signal.correlate2d to achieve 2D correlation here...
>>> from scipy.signal import correlate2d
>>> C = np.dstack([correlate2d(A, B[:, :, i], 'valid') for i in range(B.shape[2])])
>>> C.shape
(4, 4, 3)
>>> C
array([[[ 333, 316, 464],
[ 372, 369, 520],
[ 411, 422, 576],
[ 450, 475, 632]],
[[ 567, 634, 800],
[ 606, 687, 856],
[ 645, 740, 912],
[ 684, 793, 968]],
[[ 801, 952, 1136],
[ 840, 1005, 1192],
[ 879, 1058, 1248],
[ 918, 1111, 1304]],
[[1035, 1270, 1472],
[1074, 1323, 1528],
[1113, 1376, 1584],
[1152, 1429, 1640]]])
Here's a more "fun" way of doing this which doesn't use scipy, but using stride_tricks instead. I'm not sure if it's more efficient:
>>> import numpy.lib.stride_tricks as st
>>> s, t = A.strides
>>> i, j = A.shape
>>> k, l, m = B.shape
>>> D = st.as_strided(A, shape=(i-k+1, j-l+1, k, l), strides=(s, t, s, t))
>>> E = np.einsum('ijkl,klm->ijm', D, B)
>>> (E == C).all()
True
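Both snippets above take the kernels along the last axis of B (B[:, :, i]), which is why the first entry is 333 rather than the 340 in the question's worked example. If the kernels are meant to be B[0], B[1], B[2] (slices along the first axis), a sketch along the following lines reproduces the question's numbers and gives a (3, 4, 4) result. It assumes NumPy 1.20 or later for sliding_window_view, which is used here in place of the manual as_strided call.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view  # NumPy >= 1.20

A = np.arange(36).reshape(6, 6)
B = np.array([[[1, 7, 2], [5, 9, 3], [2, 8, 6]],
              [[3, 4, 6], [6, 8, 9], [4, 2, 8]],
              [[6, 4, 7], [8, 7, 8], [4, 4, 7]]])

# All 3x3 windows of A: shape (4, 4, 3, 3).
windows = sliding_window_view(A, B.shape[1:])

# For each kernel m and window position (i, j), sum the elementwise
# product over the window axes (k, l). Result has shape (3, 4, 4).
C = np.einsum('ijkl,mkl->mij', windows, B)

print(C.shape)                  # (3, 4, 4)
print(C[0, 0, 0], C[0, 0, 1])   # 340 383
```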

Avoid nested for with numpy arrays

I need to implement a generic operation on the elements of some np 2D arrays (A,B,C). In pseudo-code
for i in A.height:
    for j in A.width:
        A[i,j] = f(B[i,j],C[i,j])
where f() is concatenating the bits of the two variables by means of struct.pack(), struct.unpack()
x = struct.pack('2B', B[i, j], C[i, j])
y = struct.unpack('H', x)
This code takes a really long time to execute (0.25 s for 640*480 matrices; maybe that is normal, yet I could use something faster), so I was wondering if anybody could suggest a pythonic way of achieving the same result that would also improve performance.
Your function:
In [310]: def foo(a,b):
     ...:     x = struct.pack('2B', a,b)
     ...:     return struct.unpack('H',x)[0]
np.vectorize is a convenient way of broadcasting arrays. It passes scalar values to the function. It does not speed up the code (the related np.frompyfunc may give a 2x speedup relative to plain iteration).
In [311]: fv = np.vectorize(foo)
In [312]: fv(np.arange(5)[:,None],np.arange(10))
Out[312]:
array([[ 0, 256, 512, 768, 1024, 1280, 1536, 1792, 2048, 2304],
[ 1, 257, 513, 769, 1025, 1281, 1537, 1793, 2049, 2305],
[ 2, 258, 514, 770, 1026, 1282, 1538, 1794, 2050, 2306],
[ 3, 259, 515, 771, 1027, 1283, 1539, 1795, 2051, 2307],
[ 4, 260, 516, 772, 1028, 1284, 1540, 1796, 2052, 2308]])
I can replicate those values with a simple math expression on the same arrays:
In [313]: np.arange(5)[:,None]+np.arange(10)*256
Out[313]:
array([[ 0, 256, 512, 768, 1024, 1280, 1536, 1792, 2048, 2304],
[ 1, 257, 513, 769, 1025, 1281, 1537, 1793, 2049, 2305],
[ 2, 258, 514, 770, 1026, 1282, 1538, 1794, 2050, 2306],
[ 3, 259, 515, 771, 1027, 1283, 1539, 1795, 2051, 2307],
[ 4, 260, 516, 772, 1028, 1284, 1540, 1796, 2052, 2308]])
This probably only works for limited ranges of values, but it gives an idea of how you can properly 'vectorize' calculations in numpy.
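To make the "limited ranges" caveat concrete, here is a hedged sketch of the same idea written with bit operations. It assumes B and C hold unsigned 8-bit values and that the machine is little-endian, so that struct.pack('2B', b, c) followed by struct.unpack('H', ...) puts b in the low byte and c in the high byte. The random test data and the array sizes are made up for illustration.

```python
import struct
import numpy as np

rng = np.random.default_rng(0)
B = rng.integers(0, 256, size=(480, 640), dtype=np.uint8)
C = rng.integers(0, 256, size=(480, 640), dtype=np.uint8)

# Widen to 16 bits first so the shift does not overflow uint8,
# then place C in the high byte and B in the low byte.
A = B.astype(np.uint16) | (C.astype(np.uint16) << 8)

# Spot-check one element against the struct-based version
# (explicit little-endian format so the check is platform independent).
i, j = 3, 5
expected = struct.unpack('<H', struct.pack('<2B', int(B[i, j]), int(C[i, j])))[0]
print(A[i, j] == expected)  # True
```

On a big-endian machine the native 'H' format would swap the roles of the two bytes, so the shift would have to be applied to B instead.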
Depends on what 'f' does... Not sure if this is what you mean
b = np.arange(3*4).reshape(3,4)
c = np.arange(3*4).reshape(3,4)[::-1]
b
array([[ 0, 1, 2, 3],
[ 4, 5, 6, 7],
[ 8, 9, 10, 11]])
c
array([[ 8, 9, 10, 11],
[ 4, 5, 6, 7],
[ 0, 1, 2, 3]])
def f(b, c):
    """some function"""
    a = b + c
    return a
a = f(b, c)
a
array([[ 8, 10, 12, 14],
[ 8, 10, 12, 14],
[ 8, 10, 12, 14]])

Insert elements to beginning and end of numpy array

I have a numpy array:
import numpy as np
a = np.array([2, 56, 4, 8, 564])
and I want to add two elements: one at the beginning of the array, 88, and one at the end, 77.
I can do this with:
a = np.insert(np.append(a, [77]), 0, 88)
so that a ends up looking like:
array([ 88, 2, 56, 4, 8, 564, 77])
The question: what is the correct way of doing this? I feel like nesting a np.append in a np.insert is quite likely not the pythonic way to do this.
Another way to do that would be to use numpy.concatenate . Example -
np.concatenate([[88],a,[77]])
Demo -
In [62]: a = np.array([2, 56, 4, 8, 564])
In [64]: np.concatenate([[88],a,[77]])
Out[64]: array([ 88, 2, 56, 4, 8, 564, 77])
You can use np.concatenate -
np.concatenate(([88],a,[77]))
You can pass the list of indices to np.insert :
>>> np.insert(a,[0,5],[88,77])
array([ 88, 2, 56, 4, 8, 564, 77])
Or, if you don't know the length of your array, you can use a.size to specify the end of the array:
>>> np.insert(a,[0,a.size],[88,77])
array([ 88, 2, 56, 4, 8, 564, 77])
What about:
a = np.hstack([88, a, 77])
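As a quick sanity check, the sketch below runs the approaches suggested in this thread side by side and confirms they produce the same array. The np.r_ entry is an extra option not mentioned in the answers; it is included only as another known NumPy shorthand for concatenation.

```python
import numpy as np

a = np.array([2, 56, 4, 8, 564])
expected = np.array([88, 2, 56, 4, 8, 564, 77])

results = {
    "insert + append": np.insert(np.append(a, [77]), 0, 88),
    "concatenate": np.concatenate(([88], a, [77])),
    "insert with index list": np.insert(a, [0, a.size], [88, 77]),
    "hstack": np.hstack([88, a, 77]),
    "r_ (not from the thread)": np.r_[88, a, 77],
}

for name, result in results.items():
    print(name, np.array_equal(result, expected))
```

Note that the indices passed to np.insert refer to positions in the original array, which is why a.size (and not a.size + 1) places 77 at the end.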

Python array/matrix dimension

I create two matrices
import numpy as np
arrA = np.zeros((9000,3))
arrB = np.zeros((9000,6))
I want to concatenate pieces of those matrices.
But when I try to do:
arrC = np.hstack((arrA, arrB[:,1]))
I get an error:
ValueError: all the input arrays must have same number of dimensions
I guess it's because np.shape(arrB[:,1]) is equal to (9000,) instead of (9000,1), but I cannot figure out how to resolve it.
Could you please comment on this issue?
You could preserve dimensions by passing a list of indices, not an index:
>>> arrB[:,1].shape
(9000,)
>>> arrB[:,[1]].shape
(9000, 1)
>>> out = np.hstack([arrA, arrB[:,[1]]])
>>> out.shape
(9000, 4)
This is easier to see visually.
Assume:
>>> arrA=np.arange(9000*3).reshape(9000,3)
>>> arrA
array([[ 0, 1, 2],
[ 3, 4, 5],
[ 6, 7, 8],
...,
[26991, 26992, 26993],
[26994, 26995, 26996],
[26997, 26998, 26999]])
>>> arrB=np.arange(9000*6).reshape(9000,6)
>>> arrB
array([[ 0, 1, 2, 3, 4, 5],
[ 6, 7, 8, 9, 10, 11],
[ 12, 13, 14, 15, 16, 17],
...,
[53982, 53983, 53984, 53985, 53986, 53987],
[53988, 53989, 53990, 53991, 53992, 53993],
[53994, 53995, 53996, 53997, 53998, 53999]])
If you index a single column of arrB with an integer, you get a 1D array that looks more like a row:
>>> arrB[:,1]
array([ 1, 7, 13, ..., 53983, 53989, 53995])
What you need is a 2D column with the same number of rows as arrA:
>>> arrB[:,[1]]
array([[ 1],
[ 7],
[ 13],
...,
[53983],
[53989],
[53995]])
Then hstack works as expected:
>>> arrC=np.hstack((arrA, arrB[:,[1]]))
>>> arrC
array([[ 0, 1, 2, 1],
[ 3, 4, 5, 7],
[ 6, 7, 8, 13],
...,
[26991, 26992, 26993, 53983],
[26994, 26995, 26996, 53989],
[26997, 26998, 26999, 53995]])
An alternate form is to specify -1 in one dimension and the number of rows or cols desired as the other in .reshape():
>>> arrB[:,1].reshape(-1,1) # one col
array([[ 1],
[ 7],
[ 13],
...,
[53983],
[53989],
[53995]])
>>> arrB[:,1].reshape(-1,6) # 6 cols
array([[ 1, 7, 13, 19, 25, 31],
[ 37, 43, 49, 55, 61, 67],
[ 73, 79, 85, 91, 97, 103],
...,
[53893, 53899, 53905, 53911, 53917, 53923],
[53929, 53935, 53941, 53947, 53953, 53959],
[53965, 53971, 53977, 53983, 53989, 53995]])
>>> arrB[:,1].reshape(2,-1) # 2 rows
array([[ 1, 7, 13, ..., 26983, 26989, 26995],
[27001, 27007, 27013, ..., 53983, 53989, 53995]])
There is more on array shaping and stacking here
I would try something like this:
np.vstack((arrA.transpose(), arrB[:,1])).transpose()
There are several ways of making your selection from arrB a (9000,1) array:
np.hstack((arrA,arrB[:,[1]]))
np.hstack((arrA,arrB[:,1][:,None]))
np.hstack((arrA,arrB[:,1].reshape(9000,1)))
np.hstack((arrA,arrB[:,1].reshape(-1,1)))
The first uses the concept of indexing with an array or list, the second adds a new axis (None is the same as np.newaxis), and the last two use reshape. These are all basic numpy array manipulation tasks.
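Another option, not mentioned in the answers above, is np.column_stack, which treats 1D inputs as columns, so the (9000,) slice can be passed as-is. The sketch below uses the same arange-based sample arrays as the earlier answer and checks the result against the hstack form.

```python
import numpy as np

arrA = np.arange(9000 * 3).reshape(9000, 3)
arrB = np.arange(9000 * 6).reshape(9000, 6)

# column_stack turns the 1D slice arrB[:, 1] into a column before stacking.
arrC = np.column_stack((arrA, arrB[:, 1]))
print(arrC.shape)  # (9000, 4)

# Same result as explicitly keeping the column 2D for hstack.
same = np.array_equal(arrC, np.hstack((arrA, arrB[:, [1]])))
print(same)  # True
```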
