I am going to generate my train and test datasets from an image representing volume values. The pixel values range from -25 to 75. I want to ignore the negative values in the preprocessing step. Could anyone tell me how I should treat negative values? Is there any way to set the negative values to zero or no-data without changing the positive pixel values?
I can't advise on whether this should be done, but if you want to turn all your negative values into 0, you can use tf.maximum:
import tensorflow as tf
x = tf.random.uniform((10, 10), -25, 75, dtype=tf.int32)
<tf.Tensor: shape=(10, 10), dtype=int32, numpy=
array([[ 57, -11, 48, 43, 29, 21, 15, 42, -9, 12],
[ 18, 67, -9, -21, 6, 27, 50, -1, 72, 51],
[ 2, 22, 70, 49, 50, -10, 67, 4, 59, -10],
[-13, 39, 60, -20, -15, -17, 51, 73, -23, 21],
[ 28, 8, 48, 66, -13, -3, 44, 35, 23, 45],
[-24, 30, 16, 25, 34, -13, 24, 49, 50, -10],
[-24, 25, -1, 35, 67, 45, 27, 6, 65, 4],
[ 20, -5, 41, -14, -10, 40, 21, 69, 13, 14],
[ 53, -2, 6, 0, -13, 28, 11, -11, 29, 17],
[ 15, 40, 61, 56, 3, 56, 12, -12, 19, 0]])>
Here's the magic:
tf.maximum(x, 0)
<tf.Tensor: shape=(10, 10), dtype=int32, numpy=
array([[57, 0, 48, 43, 29, 21, 15, 42, 0, 12],
[18, 67, 0, 0, 6, 27, 50, 0, 72, 51],
[ 2, 22, 70, 49, 50, 0, 67, 4, 59, 0],
[ 0, 39, 60, 0, 0, 0, 51, 73, 0, 21],
[28, 8, 48, 66, 0, 0, 44, 35, 23, 45],
[ 0, 30, 16, 25, 34, 0, 24, 49, 50, 0],
[ 0, 25, 0, 35, 67, 45, 27, 6, 65, 4],
[20, 0, 41, 0, 0, 40, 21, 69, 13, 14],
[53, 0, 6, 0, 0, 28, 11, 0, 29, 17],
[15, 40, 61, 56, 3, 56, 12, 0, 19, 0]])>
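For the "no-data" half of the question, here is a minimal sketch. It assumes the image is available as a NumPy array before it is turned into a tensor, and that NaN is an acceptable no-data marker. The array name img and its values are made up for illustration.

```python
import numpy as np

# Hypothetical 2x3 "image" with values in the range from the question.
img = np.array([[57, -11, 48],
                [-9, 0, 75]])

# Option 1: clamp negatives to zero (the NumPy counterpart of tf.maximum).
clamped = np.maximum(img, 0)

# Option 2: mark negatives as no-data. NaN requires a float dtype.
masked = np.where(img < 0, np.nan, img.astype(float))

print(clamped)
print(masked)
```

Either way the non-negative pixels keep their values. Whether zero or NaN is the better choice depends on how the downstream model treats those pixels.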
Suppose I have an array with shape (3, 4, 5) and want to slice along the second axis with an index array [2, 1, 0].
I could not explain what I want to do in text, so please refer to the code and figure below:
>>> src = np.arange(3*4*5).reshape(3,4,5)
>>> index = [2,1,0]
>>> src
array([[[ 0, 1, 2, 3, 4],
[ 5, 6, 7, 8, 9],
[10, 11, 12, 13, 14],
[15, 16, 17, 18, 19]],
[[20, 21, 22, 23, 24],
[25, 26, 27, 28, 29],
[30, 31, 32, 33, 34],
[35, 36, 37, 38, 39]],
[[40, 41, 42, 43, 44],
[45, 46, 47, 48, 49],
[50, 51, 52, 53, 54],
[55, 56, 57, 58, 59]]])
>>> # what I need is:
array([[[10, 11, 12, 13, 14]], # slice the 2nd row (index[0])
[[25, 26, 27, 28, 29]], # 1st row (index[1])
[[40, 41, 42, 43, 44]]]) # 0th row (index[2])
src[np.arange(src.shape[0]), [2, 1, 0]]
# src[np.arange(src.shape[0]), [2, 1, 0], :]
array([[10, 11, 12, 13, 14],
[25, 26, 27, 28, 29],
[40, 41, 42, 43, 44]])
We need to compute the indices for axis=0:
>>> np.arange(src.shape[0])
array([0, 1, 2])
And we already have the indices for axis=1. We then take a full slice along axis=2 to extract our cross-section.
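The indexing above returns shape (3, 5), while the question asks for (3, 1, 5). A sketch of two ways to keep the middle axis is shown here. The names flat, kept and kept2 are only illustrative.

```python
import numpy as np

src = np.arange(3 * 4 * 5).reshape(3, 4, 5)
index = np.array([2, 1, 0])

# Advanced indexing pairs block i with row index[i] and drops axis 1.
flat = src[np.arange(src.shape[0]), index]   # shape (3, 5)

# Re-insert the dropped axis to get the (3, 1, 5) layout.
kept = flat[:, None, :]

# Same result with take_along_axis; the index array must have src.ndim dimensions.
kept2 = np.take_along_axis(src, index[:, None, None], axis=1)

print(kept.shape)
```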
You could do:
import numpy as np
arr = np.array([[[0, 1, 2, 3, 4],
[5, 6, 7, 8, 9],
[10, 11, 12, 13, 14],
[15, 16, 17, 18, 19]],
[[20, 21, 22, 23, 24],
[25, 26, 27, 28, 29],
[30, 31, 32, 33, 34],
[35, 36, 37, 38, 39]],
[[40, 41, 42, 43, 44],
[45, 46, 47, 48, 49],
[50, 51, 52, 53, 54],
[55, 56, 57, 58, 59]]])
first, second = zip(*enumerate([2, 1, 0]))
result = arr[first, second, :]
print(result)
Output
[[10 11 12 13 14]
[25 26 27 28 29]
[40 41 42 43 44]]
I wonder if there is a built-in operation which would free my code from Python-loops.
The problem is this: I have two matrices A and B. A has N rows and B has N columns. I would like to multiply every row i of A with the corresponding column i of B (using NumPy broadcasting). The resulting matrix would form layer i of the output, so my result would be a 3-dimensional array.
Is such an operation available in NumPy?
One way to express your requirement directly is by using np.einsum():
>>> A = np.arange(12).reshape(3, 4)
>>> B = np.arange(15).reshape(5, 3)
>>> np.einsum('...i,j...->...ij', A, B)
array([[[ 0, 0, 0, 0, 0],
[ 0, 3, 6, 9, 12],
[ 0, 6, 12, 18, 24],
[ 0, 9, 18, 27, 36]],
[[ 4, 16, 28, 40, 52],
[ 5, 20, 35, 50, 65],
[ 6, 24, 42, 60, 78],
[ 7, 28, 49, 70, 91]],
[[ 16, 40, 64, 88, 112],
[ 18, 45, 72, 99, 126],
[ 20, 50, 80, 110, 140],
[ 22, 55, 88, 121, 154]]])
This uses the Einstein summation convention.
For further discussion, see chapter 3 of Vectors, Pure and Applied: A General Introduction to Linear Algebra by T. W. Körner. In it, the author cites an amusing passage from Einstein's letter to a friend:
"I have made a great discovery in mathematics; I have suppressed the summation sign every time that the summation must be made over an index which occurs twice..."
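The same product can also be written with explicit subscripts instead of ellipses, which may be easier to read. This is a sketch: the subscript string 'ij,ki->ijk' is my own spelling of the operation, and it is checked against a layer-by-layer reference built with np.outer.

```python
import numpy as np

A = np.arange(12).reshape(3, 4)
B = np.arange(15).reshape(5, 3)

# out[i, j, k] = A[i, j] * B[k, i]; no index is repeated and dropped, so nothing is summed.
out = np.einsum('ij,ki->ijk', A, B)

# Reference built one layer at a time: layer i is the outer product
# of row i of A with column i of B.
ref = np.stack([np.outer(A[i], B[:, i]) for i in range(A.shape[0])])

print(out.shape)
```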
Yes, in its simplest form you just add length-1 dimensions (with None) so that NumPy broadcasts along the rows of A and columns of B:
>>> import numpy as np
>>> A = np.arange(12).reshape(3, 4) # 3 rows, 4 columns
>>> B = np.arange(15).reshape(5, 3) # 5 rows, 3 columns
>>> res = A[None, ...] * B[..., None]
>>> res
array([[[ 0, 0, 0, 0],
[ 4, 5, 6, 7],
[ 16, 18, 20, 22]],
[[ 0, 3, 6, 9],
[ 16, 20, 24, 28],
[ 40, 45, 50, 55]],
[[ 0, 6, 12, 18],
[ 28, 35, 42, 49],
[ 64, 72, 80, 88]],
[[ 0, 9, 18, 27],
[ 40, 50, 60, 70],
[ 88, 99, 110, 121]],
[[ 0, 12, 24, 36],
[ 52, 65, 78, 91],
[112, 126, 140, 154]]])
The result has a shape of (5, 3, 4) and you can easily move the axis around if you want a different shape. For example using np.moveaxis:
>>> np.moveaxis(res, (0, 1, 2), (2, 0, 1)) # 0 -> 2 ; 1 -> 0, 2 -> 1
array([[[ 0, 0, 0, 0, 0],
[ 0, 3, 6, 9, 12],
[ 0, 6, 12, 18, 24],
[ 0, 9, 18, 27, 36]],
[[ 4, 16, 28, 40, 52],
[ 5, 20, 35, 50, 65],
[ 6, 24, 42, 60, 78],
[ 7, 28, 49, 70, 91]],
[[ 16, 40, 64, 88, 112],
[ 18, 45, 72, 99, 126],
[ 20, 50, 80, 110, 140],
[ 22, 55, 88, 121, 154]]])
With a shape of (3, 4, 5).
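If the (3, 4, 5) layout is what you want in the end, the new axes can be placed so that broadcasting produces it directly. This is a sketch of that variant, compared against the two-step route. The names direct and moved are illustrative.

```python
import numpy as np

A = np.arange(12).reshape(3, 4)
B = np.arange(15).reshape(5, 3)

# A[:, :, None] has shape (3, 4, 1); B.T[:, None, :] has shape (3, 1, 5).
# Broadcasting the two gives shape (3, 4, 5) in one step.
direct = A[:, :, None] * B.T[:, None, :]

# The two-step route for comparison: broadcast to (5, 3, 4), then move the axes.
res = A[None, ...] * B[..., None]
moved = np.moveaxis(res, (0, 1, 2), (2, 0, 1))

print(direct.shape)
```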
I have a dataframe, reproduced partly as such:
import pandas as pd
import numpy as np
tab = pd.DataFrame(np.array([[ 46, 39, 25, 29, 21, 12, 33, 32, 70, 109, 144, 158, 161,
184, 163, 113, 117, 82, 76, 88, 77, 76, 64, 35],
[ 39, 33, 29, 29, 26, 14, 25, 33, 60, 83, 126, 117, 111,
148, 141, 104, 92, 75, 78, 74, 63, 67, 52, 39],
[ 30, 27, 14, 11, 20, 17, 21, 31, 48, 62, 83, 78, 88,
90, 80, 67, 53, 61, 47, 54, 50, 48, 35, 26],
[ 30, 24, 19, 15, 17, 10, 12, 18, 34, 69, 88, 79, 109,
95, 89, 82, 53, 46, 53, 57, 39, 41, 26, 29],
[ 37, 31, 18, 12, 30, 13, 15, 19, 51, 61, 74, 81, 77,
100, 96, 74, 60, 57, 42, 48, 43, 40, 29, 25],
[ 14, 8, 14, 11, 13, 7, 9, 15, 42, 49, 50, 44, 53,
42, 31, 31, 30, 27, 33, 25, 27, 17, 20, 17],
[ 10, 15, 6, 10, 15, 11, 7, 18, 28, 43, 49, 37, 41,
33, 37, 32, 26, 28, 19, 24, 19, 19, 13, 18],
[ 9, 9, 8, 12, 7, 11, 4, 8, 14, 15, 23, 30, 29,
34, 25, 39, 22, 20, 15, 23, 12, 19, 14, 13],
[ 0, 3, 4, 1, 1, 0, 3, 4, 4, 5, 3, 5, 6,
7, 3, 3, 6, 4, 2, 3, 3, 2, 2, 2],
[ 3, 0, 1, 0, 0, 0, 1, 1, 4, 8, 2, 4, 7,
2, 2, 9, 3, 5, 1, 5, 2, 0, 4, 1]]), index =
['Stadsdeel Zuid', 'Stadsdeel West', 'Stadsdeel Nieuw-West',
'Stadsdeel Centrum', 'Stadsdeel Oost', 'Stadsdeel Noord',
'Wijk 00 Amstelveen', 'Stadsdeel Zuidoost', 'Wijk 00',
'Wijk 00 Aalsmeer'])
and I created a heatmap as such
import seaborn as sns
ax = sns.heatmap(tab, linewidths=.5, robust=True, annot_kws={'size': 14})
ax.tick_params(labelsize=14)
ax.figure.set_size_inches((12, 10))
I would like the colormap to be anchored to the min-max values of each row, so that rows with lower values are also clearly visible. (In reality the table contains many more rows with low values, which the heatmap barely shows color-wise.)
How can I achieve this?
I would normalize the tab rows by the maximum value in each row with:
tab_n = tab.div(tab.max(axis=1), axis=0)
where tab_n is the normalized tab with values in the range [0, 1]. Hope that helps. Plotting tab_n instead of tab should give a heatmap in which the low-valued rows are visible too.
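Dividing by the row maximum only pins the top of each row to 1. If each row should span the full range from 0 to 1, as the question's "min-max values per row" suggests, a min-max variant could look like the sketch below. The small frame demo is made up and stands in for tab, and the guard for constant rows is my own addition.

```python
import pandas as pd

# Small made-up table standing in for `tab`.
demo = pd.DataFrame([[10, 20, 30], [1, 2, 5]], index=["high", "low"])

row_min = demo.min(axis=1)
# Range of each row; a zero range (constant row) is replaced by 1 to avoid 0/0.
row_range = (demo.max(axis=1) - row_min).replace(0, 1)

# Subtract each row's minimum, then divide by that row's range.
demo_mm = demo.sub(row_min, axis=0).div(row_range, axis=0)
print(demo_mm)
```

The result can be passed to sns.heatmap in place of the original table. Note that the colors then show position within each row, not absolute counts.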
How would one (efficiently) do the following:
x = np.arange(49)
x2 = np.reshape(x, (7,7))
x2
array([[ 0, 1, 2, 3, 4, 5, 6],
[ 7, 8, 9, 10, 11, 12, 13],
[14, 15, 16, 17, 18, 19, 20],
[21, 22, 23, 24, 25, 26, 27],
[28, 29, 30, 31, 32, 33, 34],
[35, 36, 37, 38, 39, 40, 41],
[42, 43, 44, 45, 46, 47, 48]])
From here I want to roll a couple of things.
I want to roll 0,7,14,21 etc so 14 comes to top.
Then the same with 4,11,18,25 etc so 39 comes to top.
Result should be:
x2
array([[14, 1, 2, 3, 39, 5, 6],
[21, 8, 9, 10, 46, 12, 13],
[28, 15, 16, 17, 4, 19, 20],
[35, 22, 23, 24, 11, 26, 27],
[42, 29, 30, 31, 18, 33, 34],
[ 0, 36, 37, 38, 25, 40, 41],
[ 7, 43, 44, 45, 32, 47, 48]])
I looked up numpy.roll, here and google but couldn't find how one would do this.
For horizontal rolls, I could do:
x3 = np.roll(x2[0], 3, axis=0)
x3
array([4, 5, 6, 0, 1, 2, 3])
But how do I return the full array with this roll change as a new copy?
Roll with a negative shift:
x2[:, 0] = np.roll(x2[:, 0], -2)
Roll with a positive shift:
x2[:, 4] = np.roll(x2[:, 4], 2)
gives:
>>> x2
array([[14, 1, 2, 3, 39, 5, 6],
[21, 8, 9, 10, 46, 12, 13],
[28, 15, 16, 17, 4, 19, 20],
[35, 22, 23, 24, 11, 26, 27],
[42, 29, 30, 31, 18, 33, 34],
[ 0, 36, 37, 38, 25, 40, 41],
[ 7, 43, 44, 45, 32, 47, 48]])
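The assignments above modify x2 in place. Since the question also asks for the result as a new copy, here is a small sketch that leaves the input untouched. The helper name roll_columns and its dict argument are hypothetical, not part of NumPy.

```python
import numpy as np

def roll_columns(arr, shifts):
    """Return a copy of arr with selected columns rolled.

    shifts maps a column index to a shift, using np.roll's sign
    convention (negative moves values towards the top).
    """
    out = arr.copy()
    for col, shift in shifts.items():
        out[:, col] = np.roll(arr[:, col], shift)
    return out

x2 = np.arange(49).reshape(7, 7)
rolled = roll_columns(x2, {0: -2, 4: 2})
print(rolled)
```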
Here's a way to roll multiple columns in one go with advanced-indexing -
# Params
cols = [0,4] # Columns to be rolled
dirn = [2,-2] # Offset with direction as sign
n = x2.shape[0]
x2[:,cols] = x2[np.mod(np.arange(n)[:,None] + dirn,n),cols]
Sample run -
In [45]: x2
Out[45]:
array([[ 0, 1, 2, 3, 4, 5, 6],
[ 7, 8, 9, 10, 11, 12, 13],
[14, 15, 16, 17, 18, 19, 20],
[21, 22, 23, 24, 25, 26, 27],
[28, 29, 30, 31, 32, 33, 34],
[35, 36, 37, 38, 39, 40, 41],
[42, 43, 44, 45, 46, 47, 48]])
In [46]: cols = [0,4,5] # Columns to be rolled
...: dirn = [2,-2,4] # Offset with direction as sign
...: n = x2.shape[0]
...: x2[:,cols] = x2[np.mod(np.arange(n)[:,None] + dirn,n),cols]
...:
In [47]: x2 # Three columns rolled
Out[47]:
array([[14, 1, 2, 3, 39, 33, 6],
[21, 8, 9, 10, 46, 40, 13],
[28, 15, 16, 17, 4, 47, 20],
[35, 22, 23, 24, 11, 5, 27],
[42, 29, 30, 31, 18, 12, 34],
[ 0, 36, 37, 38, 25, 19, 41],
[ 7, 43, 44, 45, 32, 26, 48]])
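Note that the sign of dirn is the opposite of np.roll's shift: an offset d here pulls values up by d rows, which matches np.roll(column, -d). The sketch below checks this on the sample data. The names rows, fancy and looped are only illustrative.

```python
import numpy as np

x2 = np.arange(49).reshape(7, 7)
cols = [0, 4, 5]
dirn = [2, -2, 4]
n = x2.shape[0]

# Row index matrix of shape (n, len(cols)): entry [i, k] is the source row
# that ends up in output row i of column cols[k].
rows = np.mod(np.arange(n)[:, None] + dirn, n)
fancy = x2.copy()
fancy[:, cols] = x2[rows, cols]

# The same operation one column at a time, with the sign flipped for np.roll.
looped = x2.copy()
for c, d in zip(cols, dirn):
    looped[:, c] = np.roll(x2[:, c], -d)

print(np.array_equal(fancy, looped))
```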
You have to overwrite the column
e.g.:
x2[:,0] = np.roll(x2[:,0], 3)
Here is a useful function for rolling a 2D array in all 4 directions (up, down, left, right):
def image_shift_roll(img, x_roll, y_roll):
    img_roll = img.copy()
    img_roll = np.roll(img_roll, -y_roll, axis=0)  # Positive y rolls up
    img_roll = np.roll(img_roll, x_roll, axis=1)   # Positive x rolls right
    return img_roll
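The two rolls described above can also be done in one call, because np.roll accepts tuples for shift and axis. This is a sketch with a made-up 3x4 array and made-up offsets.

```python
import numpy as np

img = np.arange(12).reshape(3, 4)
x_roll, y_roll = 1, 1  # hypothetical offsets: one step right, one step up

# One call: shift axis 0 by -y_roll (up) and axis 1 by x_roll (right).
both = np.roll(img, shift=(-y_roll, x_roll), axis=(0, 1))

# The same result from two successive single-axis rolls.
stepwise = np.roll(np.roll(img, -y_roll, axis=0), x_roll, axis=1)

print(both)
```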
Let L be a list of, say, 55 items:
L = list(range(55))
for i in range(6):
    print(L[10*i:10*(i+1)])
The printed list will have 10 items for i = 0, 1, 2, 3 , 4, but for i = 5, it will have 5 items only.
Is there a quick method for auto zero-padding L[50:60] so that it is 10 items-long ?
Using NumPy:
>>> a = np.arange(55)
>>> a.resize(60)
>>> a.reshape(6, 10)
array([[ 0, 1, 2, 3, 4, 5, 6, 7, 8, 9],
[10, 11, 12, 13, 14, 15, 16, 17, 18, 19],
[20, 21, 22, 23, 24, 25, 26, 27, 28, 29],
[30, 31, 32, 33, 34, 35, 36, 37, 38, 39],
[40, 41, 42, 43, 44, 45, 46, 47, 48, 49],
[50, 51, 52, 53, 54, 0, 0, 0, 0, 0]])
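a.resize changes the array in place. If the original should stay untouched, np.pad is one alternative. The sketch below computes the number of missing items instead of hard-coding 60. The names width, missing and rows are illustrative.

```python
import numpy as np

a = np.arange(55)
width = 10

# Zeros needed to reach the next multiple of `width` (0 if already aligned).
missing = -len(a) % width

# Pad only at the end; `a` itself keeps its 55 items.
padded = np.pad(a, (0, missing), mode="constant", constant_values=0)
rows = padded.reshape(-1, width)

print(rows)
```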
>>> L = list(range(55))
>>> for i in range(6):
...     print((L[10*i:10*(i+1)] + [0]*10)[:10])
...
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
[10, 11, 12, 13, 14, 15, 16, 17, 18, 19]
[20, 21, 22, 23, 24, 25, 26, 27, 28, 29]
[30, 31, 32, 33, 34, 35, 36, 37, 38, 39]
[40, 41, 42, 43, 44, 45, 46, 47, 48, 49]
[50, 51, 52, 53, 54, 0, 0, 0, 0, 0]
This might look like wizardry; also note that it creates tuples, not lists.
from itertools import zip_longest
L = range(55)
list_size = 10
padded = list(zip_longest(*[iter(L)] * list_size, fillvalue=0))
for l in padded:
    print(l)
For an explanation about the zip + iter trick see the documentation here
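To see why the trick works, here is a smaller sketch with 7 items in chunks of 3. It uses the Python 3 name zip_longest (the function is called izip_longest in Python 2). The variable names are illustrative.

```python
from itertools import zip_longest

items = list(range(7))
size = 3

it = iter(items)
# [it] * size is a list holding three references to the SAME iterator,
# so each output tuple pulls `size` consecutive items from it.
chunks = list(zip_longest(*[it] * size, fillvalue=0))
print(chunks)  # [(0, 1, 2), (3, 4, 5), (6, 0, 0)]

# Convert to lists if tuples are not wanted.
as_lists = [list(c) for c in chunks]
```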
You can also build the intelligence into your objects. I have left out corner cases; this just illustrates the point.
class ZeroList(list):
    def __getitem__(self, index):
        if isinstance(index, slice):
            # Python 3 passes slices to __getitem__ (__getslice__ was removed).
            items = super().__getitem__(index)
            stop = len(self) if index.stop is None else index.stop
            numzeros = stop - len(self)
            if numzeros <= 0:
                return items
            return items + [0] * numzeros
        if index >= len(self):
            return 0
        return super().__getitem__(index)
>>> l = ZeroList(range(55))
>>> l[40:50]
[40, 41, 42, 43, 44, 45, 46, 47, 48, 49]
>>> l[50:60]
[50, 51, 52, 53, 54, 0, 0, 0, 0, 0]