Differences between X and X.T - python

I have some code here:
import numpy as np
import matplotlib.pyplot as plt
#height (cm)
X = np.array([[147, 150, 153, 158, 163, 165, 168, 170, 173, 175, 178, 180, 183]])
print(X.T)
print("=======================")
print(X)
Can anyone explain what T means and what the differences between X and X.T are?

.T is an attribute of a numpy array that returns the transpose of the array.

The significant differences are seen in the shape and strides attributes:
In [64]: X = np.array([[1,2,3,4]])
In [65]: X
Out[65]: array([[1, 2, 3, 4]])
In [66]: X.T
Out[66]:
array([[1],
       [2],
       [3],
       [4]])
In [67]: X.shape
Out[67]: (1, 4)
In [68]: X.T.shape
Out[68]: (4, 1)
In [69]: X.strides
Out[69]: (32, 8)
In [70]: X.T.strides
Out[70]: (8, 32)
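One more difference worth knowing is that X.T does not copy the data. A minimal sketch, assuming a small integer array (the name XT is only for illustration), suggesting that the transpose is a view on the same memory with shape and strides swapped:

```python
import numpy as np

X = np.array([[1, 2, 3, 4]])
XT = X.T  # transpose: shape (1, 4) becomes (4, 1)

print(X.shape, XT.shape)

# The transpose is a view: no data is copied, only shape and strides are swapped.
print(np.shares_memory(X, XT))

# Writing through the view therefore changes the original array too.
XT[0, 0] = 99
print(X)
```

If an independent transposed array is needed, X.T.copy() gives one.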

The .T attribute returns the array transpose. Note that in your case, you declared almost a vector (a real vector would be one-dimensional, but yours has two dimensions because of the double brackets [[]] used in the definition of X):
import numpy as np
x = np.array([[147, 150, 153, 158, 163, 165, 168, 170, 173, 175, 178, 180]])
print("Line vector:")
print(x)
print("Column vector:")
print(x.T)
Line vector:
[[147 150 153 158 163 165 168 170 173 175 178 180]]
Column vector:
[[147]
 [150]
 [153]
 [158]
 [163]
 [165]
 [168]
 [170]
 [173]
 [175]
 [178]
 [180]]
Note that in this case, there is no need to use double square brackets [[]]. If you define x with single brackets, the transpose is identical to the original (because there is no second dimension to transpose with):
import numpy as np
x = np.array([147, 150, 153, 158, 163, 165, 168, 170, 173, 175, 178, 180])
print("Line vector:")
print(x)
print("Column vector:")
print(x.T)
Line vector:
[147 150 153 158 163 165 168 170 173 175 178 180]
Column vector:
[147 150 153 158 163 165 168 170 173 175 178 180]
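If you do want a column vector from a 1-D array, the second axis has to be added explicitly. A short sketch with a shortened version of the data above, using reshape or np.newaxis (the names col, col2 and row are illustrative):

```python
import numpy as np

x = np.array([147, 150, 153])   # 1-D array, shape (3,)

print(x.T.shape)                # unchanged: a 1-D array has only one axis

col = x.reshape(-1, 1)          # explicit column vector, shape (3, 1)
col2 = x[:, np.newaxis]         # same result using np.newaxis
row = x[np.newaxis, :]          # explicit row vector, shape (1, 3)

print(col.shape, col2.shape, row.shape)
```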

Related

How to turn an array of indices into a binary 3D numpy array?

I have an array of vertices from an STL file which I converted to a 2D numpy array. Here's some of it as an example:
print(vertices.shape)
(67748, 3)
I need to turn these into a 3D binary array where each element = 1 where the index is given by the vertices array.
Minimal reproducible example (expected output) using a 5 x 3 vertices array instead of 67748 x 3:
verts = np.array([[ 77, 239,  83],
                  [100, 237,  88],
                  [100, 149,  94],
                  [100, 220, 128],
                  [100, 145,  86]])
voxels = np.zeros((256,256,256)).astype(int)
voxels[77,239,83] = 1
voxels[100,149,94] = 1
voxels[100,237,88] = 1
voxels[100,220,128] = 1
voxels[100,145, 86] = 1
You can use np.put and np.ravel_multi_index (here voxel_array and vertices stand for voxels and verts from the question):
np.put(voxel_array,
       np.ravel_multi_index(vertices.T, voxel_array.shape),
       1)
And then:
np.where(voxel_array)
Out[]:
(array([ 77, 100, 100, 100, 100], dtype=int64),
array([239, 145, 149, 220, 237], dtype=int64),
array([ 83, 86, 94, 128, 88], dtype=int64))
Answer from Michael Szczesny in the comments:
for x, y, z in vertices: voxel_array[x, y, z] = 1
This works just as well as Daniel F's but is easier to understand. Both take 0.0 seconds to run on a 256 x 256 x 256 array.
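Another option, sketched here under the assumption that every row of the vertex array is a valid index into the voxel grid, is to index directly with a tuple of the three coordinate columns. The uint8 dtype is a choice made here to keep the example small in memory; the question used int.

```python
import numpy as np

verts = np.array([[ 77, 239,  83],
                  [100, 237,  88],
                  [100, 149,  94],
                  [100, 220, 128],
                  [100, 145,  86]])

# uint8 is an assumption made for this sketch; any integer or bool dtype works
voxels = np.zeros((256, 256, 256), dtype=np.uint8)

# verts.T has shape (3, n); as a tuple it provides one index array per axis
voxels[tuple(verts.T)] = 1

print(voxels.sum())
```

Here .T turns the (n, 3) list of points into the three per-axis index arrays that NumPy indexing expects.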

Sum of positive arrays yields negative results

I try to sum together three positive arrays, however, the result yields an array that has negative values. How is this possible?
#Example of an image
img=np.array(([[[246, 240, 243],[240, 239, 239],
[243, 242, 244]],[[ 241, 240, 240],
[241, 243, 246],[ 239, 239, 239]],
[[249, 249, 250],[ 33, 33, 34],
[249, 249, 249]],[[ 33, 33, 33],
[250, 250, 249],[ 34, 34, 34]]]), dtype=np.uint8)
#Creating three positive arrays from image
#Image type converted to np.int16 as otherwise values remain between 0-255
R=abs((img[:,:,0].astype(np.int16)-255)**2)
G=abs((img[:,:,1].astype(np.int16)-255)**2)
B=abs((img[:,:,2].astype(np.int16)-255)**2)
print(R, G, B)
[[ 81 225 144]
[ 196 196 256]
[ 36 16252 36]
[16252 25 16695]] [[ 225 256 169]
[ 225 144 256]
[ 36 16252 36]
[16252 25 16695]] [[ 144 256 121]
[ 225 81 256]
[ 25 16695 36]
[16252 36 16695]]
#Adding three positive arrays together
R+G+B
array([[ 450, 737, 434],
[ 646, 421, 768],
[ 97, -16337, 108],
[-16780, 86, -15451]], dtype=int16)
I thought it had something to do with the abs() function I am applying, however, the results separately clearly show they are referenced correctly and positive?
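The negative values are most likely integer overflow rather than anything to do with abs(): np.int16 can only hold values up to 32767, and the sum of three entries around 16000 exceeds that. In fact (33 - 255)**2 = 49284 already does not fit, so entries such as 16252 are themselves wrapped values. A minimal sketch, assuming a wider type such as np.int32 is acceptable, using one row of the example image:

```python
import numpy as np

# one row of the example image, kept as uint8
img = np.array([[[249, 249, 250], [33, 33, 34], [249, 249, 249]]], dtype=np.uint8)

# int32 is wide enough: the largest possible sum is 3 * 255**2 = 195075
R = (img[:, :, 0].astype(np.int32) - 255) ** 2
G = (img[:, :, 1].astype(np.int32) - 255) ** 2
B = (img[:, :, 2].astype(np.int32) - 255) ** 2

total = R + G + B
print(total)
print(np.iinfo(np.int16).max)  # largest value int16 can represent
```

Since the squares are never negative, abs() is not needed once the dtype is wide enough.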

How to slice a 3D array from a 2D array without nested loops?

I have four matrices, partially created from each other:
A is a 3D matrix that represents a stack of grayscale images and is of shape (n, h, w)
B is a 3D matrix that also represents a stack of images, where each slice is individually calculated from the corresponding slice in A and is also of shape (n, h, w)
C is a 2D matrix, containing the index with the maximum value of B in z direction and is of shape (h, w)
D is a 2D matrix, where a value from A is copied from a certain slice, which is indicated by the value in C at position (x, y).
A minimum example implemented with loops would look as follows:
import numpy as np
A = np.random.randint(0, 255, size=(3, 4, 5))
B = np.random.randint(0, 255, size=(3, 4, 5))
C = np.argmax(B, axis=0)
D = np.zeros(C.shape, dtype=int)
for y in range(C.shape[0]):
    for x in range(C.shape[1]):
        D[y, x] = A[C[y, x], y, x]
> A
array([[[ 24, 84, 155, 8, 147],
[ 25, 4, 49, 195, 57],
[ 93, 76, 233, 83, 177],
[ 70, 211, 201, 132, 239]],
[[177, 144, 247, 251, 207],
[242, 148, 28, 40, 105],
[181, 28, 132, 94, 196],
[166, 121, 72, 14, 14]],
[[ 55, 254, 140, 142, 14],
[112, 28, 85, 112, 145],
[ 16, 72, 16, 248, 179],
[160, 235, 225, 14, 211]]])
> B
array([[[246, 14, 55, 163, 161],
[ 3, 152, 128, 104, 203],
[ 43, 145, 59, 169, 242],
[106, 169, 31, 222, 240]],
[[ 41, 26, 239, 25, 65],
[ 47, 252, 205, 210, 138],
[194, 64, 135, 127, 101],
[ 63, 208, 179, 137, 59]],
[[112, 156, 183, 23, 253],
[ 35, 6, 233, 42, 100],
[ 66, 119, 102, 217, 64],
[ 82, 67, 135, 6, 8]]])
> C
array([[0, 2, 1, 0, 2],
[1, 1, 2, 1, 0],
[1, 0, 1, 2, 0],
[0, 1, 1, 0, 0]])
> D
array([[ 24, 254, 247, 8, 14],
[242, 148, 85, 40, 57],
[181, 76, 132, 248, 177],
[ 70, 121, 72, 132, 239]])
My question is: how can I slice A with C efficiently, eliminating the nested for-loops? My initial idea was to expand C to a 3D boolean mask, where only the positions [c, y, x] are set to True, and then simply multiply it elementwise with A and take the sum over the z-axis. But I can't think of a Pythonic implementation without loops (and I probably wouldn't need to create a boolean mask anymore if I knew how to write that).
The closest already implemented function I found is np.choose(), but it only takes 32 elements for C.
The standard approach here is to use np.take_along_axis() in conjunction with np.expand_dims() (the core idea is presented also in the np.argmax() documentation):
np.take_along_axis(A, np.expand_dims(C, axis=0), axis=0).squeeze()
Comparing the proposed approach with the explicit loops and the np.ogrid()-based approaches one would get:
import numpy as np
def take_by_axis_loop_fix(arr, pos):
    result = np.zeros(pos.shape, dtype=int)
    for i in range(pos.shape[0]):
        for j in range(pos.shape[1]):
            result[i, j] = arr[pos[i, j], i, j]
    return result
def take_by_axis_ogrid_fix(arr, pos):
    i, j = np.ogrid[:pos.shape[0], :pos.shape[1]]
    return arr[pos[i, j], i, j]
def take_by_axis_np(arr, pos, axis=0):
    return np.take_along_axis(arr, np.expand_dims(pos, axis=axis), axis=axis).squeeze()
def take_by_axis_ogrid(arr, pos, axis=0):
    ij = tuple(np.ogrid[tuple(slice(None, d, None) for d in pos.shape)])
    ij = ij[:axis] + (pos[ij],) + ij[axis:]
    return arr[ij]
A_ = np.random.randint(0, 255, size=(300, 400, 500))
B_ = np.random.randint(0, 255, size=(300, 400, 500))
C_ = np.argmax(B_, axis=0)
funcs = take_by_axis_loop_fix, take_by_axis_ogrid_fix, take_by_axis_ogrid, take_by_axis_np
for func in funcs:
    print(func.__name__, np.all(func(A_, C_) == take_by_axis_loop_fix(A_, C_)))
    %timeit func(A_, C_)
    print()
# take_by_axis_loop_fix True
# 10 loops, best of 3: 114 ms per loop
# take_by_axis_ogrid_fix True
# 100 loops, best of 3: 5.94 ms per loop
# take_by_axis_ogrid True
# 100 loops, best of 3: 5.54 ms per loop
# take_by_axis_np True
# 100 loops, best of 3: 3.34 ms per loop
indicating this to be the most efficient approach proposed so far.
Note also that the np.take_along_axis()-based and the take_by_axis_ogrid() approaches would work essentially unchanged for inputs with higher dimensionality, unlike the _fix approaches.
Particularly, take_by_axis_ogrid() is the axis-agnostic version of take_by_axis_ogrid_fix() which is, essentially, nth's answer.
y, x = np.ogrid[:C.shape[0],:C.shape[1]]
D = A[C[y, x], y, x]
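For completeness, the boolean-mask idea from the question can also be written without loops by comparing C against the slice indices with broadcasting. This is only a sketch (the names mask and D_mask are made up here); it builds a full (n, h, w) temporary, so it is likely slower and more memory-hungry than np.take_along_axis():

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.integers(0, 255, size=(3, 4, 5))
B = rng.integers(0, 255, size=(3, 4, 5))
C = np.argmax(B, axis=0)

# mask[k, y, x] is True exactly where k == C[y, x]
mask = np.arange(A.shape[0])[:, None, None] == C

# multiplying by the mask zeroes every slice except the selected one
D_mask = (A * mask).sum(axis=0)

# reference result from the np.take_along_axis() approach
D_ref = np.take_along_axis(A, np.expand_dims(C, axis=0), axis=0).squeeze(axis=0)
print(np.array_equal(D_mask, D_ref))
```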

Product of two equal numpy arrays is different

I'm facing a very strange problem with arrays in Python and numpy. First of all, what I'm trying to achieve is:
1) Get an MxN matrix from KxTxN matrix
2) Transpose this matrix and calculate product of this transposed matrix and the original one
What I get is somewhat strange; here comes the code:
First of all, I have read an image with the help of cv2 and got a K by T by 3 matrix (a field of RGB points). Then I cut a small window from it and reshape this window into an M by N matrix:
def clipSubwindowFromImage(img, i, j, winSize):
    winI = img[i - winSize: i + winSize + 1, j - winSize : j + winSize + 1, : ]
    res = np.vstack((winI[:,::3,:].reshape(winI.shape[1],3), winI[:,1::3,:].reshape(winI.shape[1],3), winI[:,2::3,:].reshape(winI.shape[1],3)))
    return res
So far so good. Say we had winSize = 1, i = 1, j = 1 and got this 9x3 matrix as a result:
>> subWin = clipSubwindowFromImage(background12x12b, 1, 1, 1)
>> [[201 199 187]
[216 219 198]
[226 228 207]
[243 241 228]
[240 244 221]
[233 235 213]
[239 238 220]
[238 240 216]
[233 235 211]]
Then I just want to get the product in question, like this :
>>r1 = subWin.T.dot(subWin)
>>[[197 234 89]
[234 65 163]
[ 89 163 105]]
Well, it's not right, the right result should be :
>>[[477125 479466 438361]
[479466 481857 440483]
[438361 440483 402793]]
But if I initialize subWin manually like this :
>>subWin = np.array([[201, 199, 187], [216, 219, 198], [226, 228, 207], [243, 241, 228], [240, 244, 221], [233, 235, 213],[239, 238, 220], [238, 240, 216],[233, 235, 211]])
I get right result.
I can't get it, subWin is the SAME array in both cases (I checked it). Any ideas?
As @Aguy said, your problem comes from the data type of your array. The dot product of a uint8 array with another uint8 array gives an array that is also uint8, so the values overflow in your case. Here's an example that shows the effect of overflow on your values:
import numpy as np
a = np.array([[201, 199, 187], [216, 219, 198], [226, 228, 207], [243, 241, 228], [240, 244, 221], [233, 235, 213],[239, 238, 220], [238, 240, 216],[233, 235, 211]])
b = a.T.dot(a)
print(b.dtype)
print(b)
print("overflowed uint8 :")
print(b.astype(np.uint8))
Gives:
>>> int64
>>> [[477125 479466 438361]
>>> [479466 481857 440483]
>>> [438361 440483 402793]]
>>> overflowed uint8 :
>>> [[197 234 89]
>>> [234 65 163]
>>> [ 89 163 105]]
Just change the data-type of one array to something more suitable in your dot product and you're good to go :
r1 = subWin.T.dot(subWin.astype(np.uint32))
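To see the effect directly, here is a small sketch that builds the same 9x3 window as uint8 (presumably the dtype cv2 returned; the names sub_win, wrapped and correct are made up for this example) and compares the two products:

```python
import numpy as np

sub_win = np.array([[201, 199, 187], [216, 219, 198], [226, 228, 207],
                    [243, 241, 228], [240, 244, 221], [233, 235, 213],
                    [239, 238, 220], [238, 240, 216], [233, 235, 211]],
                   dtype=np.uint8)

# both operands are uint8, so the result is uint8 and values wrap modulo 256
wrapped = sub_win.T.dot(sub_win)

# widening one operand promotes the whole product to uint32
correct = sub_win.T.dot(sub_win.astype(np.uint32))

print(wrapped.dtype, correct.dtype)
print(wrapped)
print(correct)
```

Checking subWin.dtype is a quick way to tell two arrays that print identically apart.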

bounded sum or difference of arrays in numpy

I want to add or subtract two arrays in numpy but the result has to be bounded for each element. If I restrict the type (e.g. uint8), any sum that exceeds the range overflows (i.e. starts from zero again) and any difference that falls below it underflows (i.e. starts from 255 again). This is not what I want; I want to stop at 0/255 (in my example).
Is there any way to do this without accessing each element?
Thank you in advance.
You can use a mask.
Example: addition not exceeding 255:
import numpy as np
# create example data where the sum exceeds 255
a = np.arange(118,130,dtype = np.uint8)
b = a.copy()
# compute in uint16 so the sum cannot wrap around
res = np.add(a,b, dtype = np.uint16)
mask = res > 255
res[mask] = 255
res = np.uint8(res)
Results are:
>>> print a
array([118, 119, 120, 121, 122, 123, 124, 125, 126, 127, 128, 129], dtype=uint8)
>>> print b
array([118, 119, 120, 121, 122, 123, 124, 125, 126, 127, 128, 129], dtype=uint8)
>>> print mask
array([False, False, False, False, False, False, False, False, False, False, True, True], dtype=bool)
>>> print res
array([236, 238, 240, 242, 244, 246, 248, 250, 252, 254, 255, 255], dtype=uint8)
The mask only works correctly as a numpy boolean array; assigning through it (res[mask] = 255) then modifies res in place. See the NumPy indexing documentation.
You can work with OpenCV if you have the cv2 library:
import cv2
import numpy as np
x=np.uint8([250])
y=np.uint8([10])
print(cv2.add(x,y)) # 250 + 10 = 260 => 255
Answer :
[[255]]
As pointed out by jkalden, it's possible to use the add and subtract functions of NumPy with a dtype wider than uint8, but instead of going through a mask you can use the np.where function:
a = np.arange(118,130,dtype = np.uint8)
b = a.copy()
sum = np.add(a,b, dtype = np.int16)
uint8_sum = np.where(sum>255, 255, sum).astype(np.uint8)
Result:
>>> print(uint8_sum)
array([236, 238, 240, 242, 244, 246, 248, 250, 252, 254, 255, 255],
dtype=uint8)
In the same way it's possible to perform the subtraction:
a = np.arange(118,130,dtype = np.uint8)
b = a.copy()[::-1]
diff = np.subtract(a,b, dtype = np.int16)
uint8_diff = np.where(diff<0, 0, diff).astype(np.uint8)
Result:
>>> print(uint8_diff)
array([0, 0, 0, 0, 0, 0, 1, 3, 5, 7, 9, 11],
dtype=uint8)
Simplified one-liners using np.where as suggested by Aelius, without an extra variable or extra type conversions:
Bounded add:
summ = np.where(np.add(a,b, dtype = np.int16)>255,255,a+b)
Bounded subtract:
subb = np.where(np.subtract(a,b, dtype = np.int16)<0,0,a-b)
Example:
import numpy as np
a = np.arange(118,135,dtype = np.uint8)
b = a.copy()
summ = np.where(np.add(a,b, dtype = np.int16)>255,255,a+b)
>>> print(summ)
[236 238 240 242 244 246 248 250 252 254 255 255 255 255 255 255 255]
subb = np.where(np.subtract(a,b[::-1], dtype = np.int16)<0,0,a-b[::-1])
>>> print(subb)
[ 0 0 0 0 0 0 0 0 0 2 4 6 8 10 12 14 16]
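A further variant, sketched here with np.clip (which the answers above do not use), computes in a wider type and clamps both ends in one step, so the same expression serves for addition and subtraction. The names bounded_sum and bounded_diff are illustrative:

```python
import numpy as np

a = np.arange(118, 130, dtype=np.uint8)
b = a.copy()

# widen to int16 first so the intermediate result cannot wrap around,
# then clamp to the uint8 range and convert back
bounded_sum = np.clip(a.astype(np.int16) + b, 0, 255).astype(np.uint8)
bounded_diff = np.clip(a.astype(np.int16) - b[::-1], 0, 255).astype(np.uint8)

print(bounded_sum)
print(bounded_diff)
```

Only one operand needs widening, because NumPy promotes the uint8 operand to int16 for the arithmetic.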
