In TensorFlow, SAME padding aims to produce an output the same size as the input, given a stride of 1, by zero-padding the input as needed. For an odd-sized kernel, say 5x5, it places the kernel's center (2,2) onto the first input pixel (0,0) and starts convolving, so 2 pixels of zero padding are needed along both the x and y axes.
What if an even-sized kernel, say 6x6, is used instead? It has no pixel at its exact center. How does SAME padding handle this? For example, according to Image convolution with even-sized kernel, the convention in the general image-processing literature is to place one more pixel before the zero, like -3 -2 -1 0 1 2 in this case, so three pixels fall in the padding area. I referred to the TensorFlow documentation for this but could not find a clarifying answer.
Like you say, the documentation does not seem to specify it clearly. Looking at the source of the 2D convolution kernel (conv_ops.cc), a comment explains:
// Total padding on rows and cols is
// Pr = (R' - 1) * S + (Kr - 1) * Dr + 1 - R
// Pc = (C' - 1) * S + (Kc - 1) * Dc + 1 - C
// where (R', C') are output dimensions, (R, C) are input dimensions, S
// is stride, (Dr, Dc) are dilations, (Kr, Kc) are filter dimensions.
// We pad Pr/2 on the left and Pr - Pr/2 on the right, Pc/2 on the top
// and Pc - Pc/2 on the bottom. When Pr or Pc is odd, this means
// we pad more on the right and bottom than on the top and left.
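Plugging the question's numbers into that formula (R = C = 10, Kr = Kc = 6, S = 1, Dr = Dc = 1, and R' = C' = 10 since SAME at stride 1 preserves size) gives Pr = Pc = 9 + 5 + 1 - 10 = 5, i.e. 2 pixels of padding on the top/left and 3 on the bottom/right. A small sketch of that arithmetic (the helper name is mine, not TensorFlow's):

def same_padding_1d(in_size, k, stride=1, dilation=1):
    # SAME output size at the given stride
    out_size = (in_size + stride - 1) // stride
    # Total padding per the conv_ops.cc formula, clamped at zero
    total = max((out_size - 1) * stride + (k - 1) * dilation + 1 - in_size, 0)
    # Smaller half before (top/left), larger half after (bottom/right)
    return total // 2, total - total // 2

print(same_padding_1d(10, 6))  # (2, 3)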
So with an even-sized kernel you get one extra column of padding on the right and one extra row on the bottom. We can check with an example:
import tensorflow as tf
input_ = tf.ones((1, 10, 10, 1), dtype=tf.float32)
kernel = tf.ones((6, 6, 1, 1), dtype=tf.float32)
conv = tf.nn.conv2d(input_, kernel, [1, 1, 1, 1], 'SAME')
with tf.Session() as sess:
    print(sess.run(conv)[0, :, :, 0])
Output:
[[16. 20. 24. 24. 24. 24. 24. 20. 16. 12.]
[20. 25. 30. 30. 30. 30. 30. 25. 20. 15.]
[24. 30. 36. 36. 36. 36. 36. 30. 24. 18.]
[24. 30. 36. 36. 36. 36. 36. 30. 24. 18.]
[24. 30. 36. 36. 36. 36. 36. 30. 24. 18.]
[24. 30. 36. 36. 36. 36. 36. 30. 24. 18.]
[24. 30. 36. 36. 36. 36. 36. 30. 24. 18.]
[20. 25. 30. 30. 30. 30. 30. 25. 20. 15.]
[16. 20. 24. 24. 24. 24. 24. 20. 16. 12.]
[12. 15. 18. 18. 18. 18. 18. 15. 12. 9.]]
Indeed, it does look like extra zeros are added to the right and bottom sides.
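As a sanity check (a sketch reusing the tensors above, not something from the docs), padding explicitly with 2 pixels on the top/left and 3 on the bottom/right and then convolving with VALID should reproduce the SAME output:

padded = tf.pad(input_, [[0, 0], [2, 3], [2, 3], [0, 0]])
conv_valid = tf.nn.conv2d(padded, kernel, [1, 1, 1, 1], 'VALID')
with tf.Session() as sess:
    print(sess.run(conv_valid)[0, :, :, 0])  # identical to the SAME result above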
I have this file which has an array of data written to it:
[[[ 32. 28. 28. ... 24. 24. 24.]
[ 30. 29. 29. ... 24. 24. 24.]
[ 29. 29. 28. ... 24. 24. 24.]
...
[137. 138. 129. ... 34. 34. 34.]
[140. 139. 128. ... 31. 34. 34.]
[136. 135. 122. ... 30. 30. 33.]]
[[ 40. 40. 40. ... 33. 33. 33.]
[ 38. 38. 37. ... 33. 33. 33.]
[ 37. 37. 37. ... 33. 33. 33.]
...
[140. 137. 132. ... 41. 43. 42.]
[139. 136. 129. ... 42. 43. 43.]
[140. 139. 133. ... 40. 42. 43.]]
[[ 10. 8. 7. ... 4. 4. 4.]
[ 8. 7. 7. ... 4. 4. 4.]
[ 7. 6. 6. ... 4. 4. 4.]
...
[101. 103. 94. ... 12. 13. 13.]
[105. 104. 92. ... 12. 13. 13.]
[ 99. 99. 99. ... 9. 10. 11.]]]
I do not know how to read from this file and use it within my code. Any help would be great! I have this within my code so far:
# Read and pre-process input images
n, c, h, w = net.inputs[input_blob].shape
images = np.ndarray(shape=(n, c, h, w))
for i in range(n):
    image = cv2.imread(args.input[i])
    if image.shape[:-1] != (h, w):
        log.warning("Image {} is resized from {} to {}".format(args.input[i], image.shape[:-1], (h, w)))
        image = cv2.resize(image, (w, h))
    # Swapping Red and Blue channels
    #image[:, :, [0, 2]] = image[:, :, [2, 0]]
    # Change data layout from HWC to CHW
    image = image.transpose((2, 0, 1))
    images[i] = image
    eoim = image
eoim16 = eoim.astype(np.float16)
val = []
preprocessed_image_path = 'C:/Users/Owner/Desktop/Ubotica/IOD/cloud_detect/'
formated_image_file = "output_patch_fp"
f = open(preprocessed_image_path + "/" + formated_image_file + ".txt", 'r')
val = f
print(f)
print(val)
# divide by 255 to get value in range 0->1 if necessary (depends on input pixel format)
if(eoim16.max()>1.0):
    eoim16 = np.divide(eoim16,255)
print(eoim16)
#f.close()
#print(val)
#val = np.reshape(val, (3,512,512))
eoim16 = np.ndarray(shape=(c, h, w))
#res = val
# calling the instance method using the object cloudDetector
res = cloudDetector.infer(eoim16)
res = res[out_blob]
But when I try to print out val and f (just to see if the data matches and is actually being read within my code), nothing appears. Is there any way to solve this so that my array is read into val and I can use the data within my code? Much appreciated!
Try using the eval function. It takes a string and interprets it as Python code.
a = eval(fileData)
print(a)
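For that to work, the file contents first have to be read into a string; a minimal sketch building on the question's f (note that eval only succeeds if the file holds valid Python literals, i.e. comma-separated values, which the numpy-style dump shown above is not):

fileData = f.read()  # eval takes a string, not a file object
a = eval(fileData)
print(a)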
I was trying to understand the workings of the fast_knn function of the impyute library, so I tried to execute it line by line. Here it is:
import numpy as np
from scipy.spatial import KDTree
def shepards(distances, power=2):
    return to_percentage(1/np.power(distances, power))

def to_percentage(vec):
    return vec/np.sum(vec)
data_temp = np.arange(25).reshape((5, 5)).astype(float)
data_temp[0][2] = np.nan
k=4
eps=0
p=2
distance_upper_bound=np.inf
leafsize=10
idw_fn=shepards
init_impute_fn=mean  # impyute's mean imputation (fills NaNs with column means), e.g. from impyute import mean
nan_xy = np.argwhere(np.isnan(data_temp))
data_temp_c = init_impute_fn(data_temp)
kdtree = KDTree(data_temp_c, leafsize=leafsize)
for x_i, y_i in nan_xy:
    distances, indices = kdtree.query(data_temp_c[x_i], k=k+1, eps=eps,
                                      p=p, distance_upper_bound=distance_upper_bound)
    # Will always return itself in the first index. Delete it.
    distances, indices = distances[1:], indices[1:]
    # Add small constant to distances to avoid division by 0
    distances += 1e-3
    weights = idw_fn(distances)
    # Assign missing value the weighted average of `k` nearest neighbours
    data_temp[x_i][y_i] = np.dot(weights, [data_temp_c[ind][y_i] for ind in indices])
data_temp
This outputs:
array([[ 0. , 1. , 10.06569379, 3. , 4. ],
[ 5. , 6. , 7. , 8. , 9. ],
[10. , 11. , 12. , 13. , 14. ],
[15. , 16. , 17. , 18. , 19. ],
[20. , 21. , 22. , 23. , 24. ]])
whereas the function itself has a different output. The code:
from impyute import fast_knn
import numpy as np
data_temp = np.arange(25).reshape((5, 5)).astype(float)
data_temp[0][2] = np.nan
fast_knn(data_temp, k=4)
and the output:
array([[ 0. , 1. , 16.78451885, 3. , 4. ],
[ 5. , 6. , 7. , 8. , 9. ],
[10. , 11. , 12. , 13. , 14. ],
[15. , 16. , 17. , 18. , 19. ],
[20. , 21. , 22. , 23. , 24. ]])
There seem to be discrepancies between the code on the GitHub repository and the installed library's source code (the two are not in sync). The following is the installed library's source code:
def fast_knn(data, k=3, eps=0, p=2, distance_upper_bound=np.inf, leafsize=10, **kwargs):
    null_xy = find_null(data)
    data_c = mean(data)
    kdtree = KDTree(data_c, leafsize=leafsize)
    for x_i, y_i in null_xy:
        distances, indices = kdtree.query(data_c[x_i], k=k+1, eps=eps,
                                          p=p, distance_upper_bound=distance_upper_bound)
        # Will always return itself in the first index. Delete it.
        distances, indices = distances[1:], indices[1:]
        weights = distances/np.sum(distances)
        # Assign missing value the weighted average of `k` nearest neighbours
        data[x_i][y_i] = np.dot(weights, [data_c[ind][y_i] for ind in indices])
    return data
The weights are computed in a different manner (not using the shepards function). Hence, the difference in outputs.
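To make the difference concrete, here is a small illustrative sketch (the example distances are mine) comparing the two weightings on the same distance vector:

import numpy as np

d = np.array([1.0, 2.0, 3.0, 4.0])   # example neighbour distances

# master branch: inverse-distance-squared (Shepard) weights
inv = 1 / np.power(d + 1e-3, 2)
w_master = inv / inv.sum()           # nearest neighbour weighted most

# release/0.0.8 branch: weights proportional to the distances
w_release = d / d.sum()              # farthest neighbour weighted most

print(w_master, w_release)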
You probably read the code on the current master branch of impyute, but the installed package is likely v0.0.8 (the most recent release), whose code lives on the release/0.0.8 branch.
The difference in the definition of fast_knn is shown below.
On the current master branch:
# Will always return itself in the first index. Delete it.
distances, indices = distances[1:], indices[1:]
# Add small constant to distances to avoid division by 0
distances += 1e-3
weights = idw_fn(distances)
On release/0.0.8 branch:
# Will always return itself in the first index. Delete it.
distances, indices = distances[1:], indices[1:]
weights = distances/np.sum(distances)
If you use the code from the release/0.0.8 branch, you will get the same result as the installed impyute package gives.
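A quick way to confirm which code you are actually running (assuming the package exposes __version__, as most PyPI packages do):

import impyute
print(impyute.__version__)  # 0.0.8 corresponds to the release/0.0.8 branch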
Given a vector, for example [1,2,3,4,5], how do you upsample it with linear interpolation to a certain length, such as 45, in Python?
If it is linear, there should be a constant increase or decrease between consecutive elements; in your case it is one. So take the difference between two elements, then repeatedly add it to the last element as many times as needed.
a = [1, 2, 3, 4, 5]
num_add = 45 - len(a)
b = a[1] - a[0]
for z in range(num_add):
    a.append(b + a[-1])
This extends the list by the constant step until len(a) == 45.
Well, I interpreted your list of [1, 2, 3, 4, 5] as simply an example. If you want a script that will actually interpolate the series you give it, try this:
from scipy.optimize import curve_fit
import numpy as np
# Line equation - doesn't have to be linear
def lin_eq(x, m, b):
    return x*m + b
# Your actual data
std_y = np.array([1, 2, 3, 4, 5])
# Index of data
std_x = np.arange(1, len(std_y) + 1)
popt, pcov = curve_fit(lin_eq, std_x, std_y)
top = 45
# Index of projected data
proj_x = np.arange(1, top + 1)
# Interpolated data
proj_y = lin_eq(proj_x, *popt)
print(proj_y)
[ 1. 2. 3. 4. 5. 6. 7. 8. 9. 10. 11. 12. 13. 14. 15.
16. 17. 18. 19. 20. 21. 22. 23. 24. 25. 26. 27. 28. 29. 30.
31. 32. 33. 34. 35. 36. 37. 38. 39. 40. 41. 42. 43. 44. 45.]
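If the input were not perfectly linear, numpy's built-in np.interp would resample the actual values by piecewise-linear interpolation; a minimal sketch (note this keeps the output within the original value range rather than extrapolating beyond it):

import numpy as np

a = [1, 2, 3, 4, 5]
target_len = 45
# 45 evenly spaced positions over the original index range
new_x = np.linspace(0, len(a) - 1, target_len)
upsampled = np.interp(new_x, np.arange(len(a)), a)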
I have a 601x350x200x146 numpy float64 array which, by my calculation (601·350·200·146 elements × 8 bytes), takes about 46 GB of memory. The output of free -m tells me I have about 100 GB of free memory, so it fits fine. However, when integrating with
result = np.trapz(large_arr, axis=3)
I get a memory error. I understand that this is because of the intermediate arrays that numpy.trapz has to create to perform the integration. But I'm looking to see if there's a way around it, or at least a way to minimize the extra use of memory.
I have read about memory errors and I know of things to avoid this: one is placing a gc.collect() call before the integration. I tried this and it didn't work.
The other one is using the *= operators such as writing arr*=a instead of arr=arr*a, which I can't really do here. So I don't know what else to try.
Does anyone know of a way to do this operation without raising a memory error?
You can reproduce the error with:
arr = np.ones((601, 350, 200, 146), dtype=np.float64)
arr = np.trapz(arr, axis=3)
although you'll have to scale down the size to match your memory size.
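A quick way to estimate the footprint before allocating anything (no array is created here):

import numpy as np

shape = (601, 350, 200, 146)
gib = np.prod(shape) * np.dtype(np.float64).itemsize / 2**30
print("%.1f GiB" % gib)  # ~45.8 GiB for the array alone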
numpy.trapz provides some convenience, but the actual calculation is very simple. To avoid large temporary arrays, just implement it yourself:
In [37]: x.shape
Out[37]: (2, 4, 4, 10)
Here's the result of numpy.trapz(x, axis=3):
In [38]: np.trapz(x, axis=3)
Out[38]:
array([[[ 43. , 48.5, 46.5, 67. ],
[ 35.5, 39.5, 52.5, 35. ],
[ 44.5, 47.5, 34.5, 39.5],
[ 54. , 40. , 46.5, 50.5]],
[[ 42. , 60. , 55.5, 51. ],
[ 51.5, 40. , 52. , 42.5],
[ 48.5, 43. , 32. , 36.5],
[ 42.5, 38. , 38. , 45. ]]])
Here's the calculation written to use no large intermediate arrays. (The slice x[:,:,:,1:-1] does not copy the data associated with the array.)
In [48]: 0.5*(x[:,:,:,0] + 2*x[:,:,:,1:-1].sum(axis=3) + x[:,:,:,-1])
Out[48]:
array([[[ 43. , 48.5, 46.5, 67. ],
[ 35.5, 39.5, 52.5, 35. ],
[ 44.5, 47.5, 34.5, 39.5],
[ 54. , 40. , 46.5, 50.5]],
[[ 42. , 60. , 55.5, 51. ],
[ 51.5, 40. , 52. , 42.5],
[ 48.5, 43. , 32. , 36.5],
[ 42.5, 38. , 38. , 45. ]]])
If x has shape (m, n, p, q), the few temporary arrays that are generated in that expression all have shape (m, n, p).
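If even the (m, n, p)-shaped temporaries are a concern, the sum can be accumulated one slice at a time; a minimal sketch, assuming unit sample spacing along the integration axis (matching np.trapz's default):

def trapz_lowmem(x):
    # Trapezoid rule along the last axis:
    # 0.5*x[...,0] + x[...,1] + ... + x[...,-2] + 0.5*x[...,-1]
    result = 0.5 * (x[..., 0] + x[..., -1])
    for i in range(1, x.shape[-1] - 1):
        result += x[..., i]  # in-place: never materializes an (m, n, p, q) temporary
    return result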
I checked out "gradient descent using python and numpy" but it didn't solve my problem.
I'm trying to get familiar with image-processing and I want to generate a few test arrays to mess around with in Python.
Is there a method (like np.arange) to create an m x n array where the inner entries form some type of gradient?
I did an example of a naive method for generating the desired output.
Excuse my general use of the term gradient; I'm using it in its simple sense of a smooth transition in color.
#!/usr/bin/python
import numpy as np
import matplotlib.pyplot as plt
#Set up parameters
m = 15
n = 10
A_placeholder = np.zeros((m,n))
V_m = np.arange(0,m).astype(np.float32)
V_n = np.arange(0,n).astype(np.float32)
#Iterate through combinations
for i in range(m):
    m_i = V_m[i]
    for j in range(n):
        n_j = V_n[j]
        A_placeholder[i,j] = m_i * n_j #Some combination
#Relabel
A_gradient = A_placeholder
A_placeholder = None
#Print data
print(A_gradient)
#[[  0.   0.   0.   0.   0.   0.   0.   0.   0.   0.]
# [  0.   1.   2.   3.   4.   5.   6.   7.   8.   9.]
# [  0.   2.   4.   6.   8.  10.  12.  14.  16.  18.]
# [  0.   3.   6.   9.  12.  15.  18.  21.  24.  27.]
# [  0.   4.   8.  12.  16.  20.  24.  28.  32.  36.]
# [  0.   5.  10.  15.  20.  25.  30.  35.  40.  45.]
# [  0.   6.  12.  18.  24.  30.  36.  42.  48.  54.]
# [  0.   7.  14.  21.  28.  35.  42.  49.  56.  63.]
# [  0.   8.  16.  24.  32.  40.  48.  56.  64.  72.]
# [  0.   9.  18.  27.  36.  45.  54.  63.  72.  81.]
# [  0.  10.  20.  30.  40.  50.  60.  70.  80.  90.]
# [  0.  11.  22.  33.  44.  55.  66.  77.  88.  99.]
# [  0.  12.  24.  36.  48.  60.  72.  84.  96. 108.]
# [  0.  13.  26.  39.  52.  65.  78.  91. 104. 117.]
# [  0.  14.  28.  42.  56.  70.  84.  98. 112. 126.]]
#Show Image
plt.imshow(A_gradient)
plt.show()
I've tried np.gradient but it didn't give me the desired output.
#print np.gradient(np.array([V_m,V_n]))
#Traceback (most recent call last):
# File "Untitled.py", line 19, in <module>
# print np.gradient(np.array([V_m,V_n]))
# File "/Users/Mu/anaconda/lib/python2.7/site-packages/numpy/lib/function_base.py", line 1458, in gradient
# out[slice1] = (y[slice2] - y[slice3])
#ValueError: operands could not be broadcast together with shapes (10,) (15,)
A_placeholder[i,j] = m_i * n_j
Any operation like that can be expressed in numpy using broadcasting:
A = np.arange(m)[:, None] * np.arange(n)[None, :]
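For instance, the whole double loop in the question collapses to a single line (np.outer would work just as well here):

import numpy as np

m, n = 15, 10
A = np.arange(m)[:, None] * np.arange(n)[None, :]
# equivalently: A = np.outer(np.arange(m), np.arange(n))
print(A)  # same values as A_gradient above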