Reading an Array File in python

I have this file which has an array of data written to it:
[[[ 32. 28. 28. ... 24. 24. 24.]
[ 30. 29. 29. ... 24. 24. 24.]
[ 29. 29. 28. ... 24. 24. 24.]
...
[137. 138. 129. ... 34. 34. 34.]
[140. 139. 128. ... 31. 34. 34.]
[136. 135. 122. ... 30. 30. 33.]]
[[ 40. 40. 40. ... 33. 33. 33.]
[ 38. 38. 37. ... 33. 33. 33.]
[ 37. 37. 37. ... 33. 33. 33.]
...
[140. 137. 132. ... 41. 43. 42.]
[139. 136. 129. ... 42. 43. 43.]
[140. 139. 133. ... 40. 42. 43.]]
[[ 10. 8. 7. ... 4. 4. 4.]
[ 8. 7. 7. ... 4. 4. 4.]
[ 7. 6. 6. ... 4. 4. 4.]
...
[101. 103. 94. ... 12. 13. 13.]
[105. 104. 92. ... 12. 13. 13.]
[ 99. 99. 99. ... 9. 10. 11.]]]
I do not know how to read from this file and use it within my code. Any help would be great! I have this within my code so far:
# Read and pre-process input images
n, c, h, w = net.inputs[input_blob].shape
images = np.ndarray(shape=(n, c, h, w))
for i in range(n):
    image = cv2.imread(args.input[i])
    if image.shape[:-1] != (h, w):
        log.warning("Image {} is resized from {} to {}".format(args.input[i], image.shape[:-1], (h, w)))
        image = cv2.resize(image, (w, h))
    # Swapping Red and Blue channels
    #image[:, :, [0, 2]] = image[:, :, [2, 0]]
    # Change data layout from HWC to CHW
    image = image.transpose((2, 0, 1))
    images[i] = image
eoim = image
eoim16 = eoim.astype(np.float16)
val = []
preprocessed_image_path = 'C:/Users/Owner/Desktop/Ubotica/IOD/cloud_detect/'
formated_image_file = "output_patch_fp"
f = open(preprocessed_image_path + "/" + formated_image_file + ".txt", 'r')
val = f
print(f)
print(val)
# divide by 255 to get value in range 0->1 if necessary (depends on input pixel format)
if(eoim16.max()>1.0):
    eoim16 = np.divide(eoim16,255)
print(eoim16)
#f.close()
#print(val)
#val = np.reshape(val, (3,512,512))
eoim16 = np.ndarray(shape=(c, h, w))
#res = val
# calling the instance method using the object cloudDetector
res = cloudDetector.infer(eoim16)
res = res[out_blob]
But when I try to print out val and f (just to see if the data matches and is actually being read within my code), nothing appears. Is there any way to solve this so that my array is read into val and I can use the data within my code? Much appreciated!

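Try using the eval function. It takes a string and interprets it as Python code, so the text must be valid Python syntax. Read the file's contents into a string first (val = f only copies the file object, not the data):
fileData = f.read()
a = eval(fileData)
print(a)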
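If the file holds the plain output of print(array), as in the question, the values are separated by spaces rather than commas, so eval would raise a SyntaxError on it. Below is a minimal sketch of an alternative, assuming the file was written without "..." truncation and that the target shape is known. parse_printed_array is a hypothetical helper name, not a NumPy function, and the sample string stands in for the result of f.read().

```python
import numpy as np

def parse_printed_array(text, shape):
    """Rebuild an array from the text that print(array) produced (hypothetical helper)."""
    if "..." in text:
        # NumPy elided values when printing; they are not in the file at all
        raise ValueError("array text is truncated with '...'")
    # Brackets only encode the shape, so drop them and split on whitespace
    tokens = text.replace("[", " ").replace("]", " ").split()
    return np.array([float(t) for t in tokens]).reshape(shape)

# Stand-in for the result of f.read() on a small, untruncated file
sample = "[[[ 1. 2.]\n  [ 3. 4.]]\n [[ 5. 6.]\n  [ 7. 8.]]]"
val = parse_printed_array(sample, (2, 2, 2))
print(val.shape)
```

A file that really contains "..." cannot be recovered this way, because the elided values were never written. Where the writing side can be changed, np.save and np.load avoid text parsing altogether.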

Related

Even sized kernels with SAME padding in Tensorflow

In TensorFlow, SAME padding aims to produce an output of the same size as the input, given a stride of 1, by padding the input with zeros as appropriate. For an odd-sized kernel such as 5x5, it puts the center of the kernel (2,2) onto the first pixel of the input (0,0) and starts to convolve. In both the x and y coordinates, 2 pixels of zero padding are then needed.
What if an even-sized kernel, for example 6x6, is used instead? It won't have a pixel's center as its actual center. How does SAME padding handle this? For example, according to "Image convolution with even-sized kernel", the convention in the general image processing literature is to place one more pixel before the zero, like -3 -2 -1 0 1 2 in this case. Three pixels will be hit in the padding area. I referred to the TensorFlow documentation for this, but could not find a clarifying answer.

Like you say, the documentation does not seem to specify it clearly. Looking at the source of the 2D convolution kernel (conv_ops.cc), a comment explains:
// Total padding on rows and cols is
// Pr = (R' - 1) * S + (Kr - 1) * Dr + 1 - R
// Pc = (C' - 1) * S + (Kc - 1) * Dc + 1 - C
// where (R', C') are output dimensions, (R, C) are input dimensions, S
// is stride, (Dr, Dc) are dilations, (Kr, Kc) are filter dimensions.
// We pad Pr/2 on the left and Pr - Pr/2 on the right, Pc/2 on the top
// and Pc - Pc/2 on the bottom. When Pr or Pc is odd, this means
// we pad more on the right and bottom than on the top and left.
So it seems you would get one extra padding at the right column and bottom row with even-sized kernels. We can look at one example:
import tensorflow as tf
input_ = tf.ones((1, 10, 10, 1), dtype=tf.float32)
kernel = tf.ones((6, 6, 1, 1), dtype=tf.float32)
conv = tf.nn.conv2d(input_, kernel, [1, 1, 1, 1], 'SAME')
with tf.Session() as sess:
    print(sess.run(conv)[0, :, :, 0])
Output:
[[16. 20. 24. 24. 24. 24. 24. 20. 16. 12.]
[20. 25. 30. 30. 30. 30. 30. 25. 20. 15.]
[24. 30. 36. 36. 36. 36. 36. 30. 24. 18.]
[24. 30. 36. 36. 36. 36. 36. 30. 24. 18.]
[24. 30. 36. 36. 36. 36. 36. 30. 24. 18.]
[24. 30. 36. 36. 36. 36. 36. 30. 24. 18.]
[24. 30. 36. 36. 36. 36. 36. 30. 24. 18.]
[20. 25. 30. 30. 30. 30. 30. 25. 20. 15.]
[16. 20. 24. 24. 24. 24. 24. 20. 16. 12.]
[12. 15. 18. 18. 18. 18. 18. 15. 12. 9.]]
Indeed, it does look like extra zeros are added to the right and bottom sides.
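The padding formula from the source comment can be worked through by hand for this case. The sketch below is a plain NumPy re-implementation for illustration, not TensorFlow's own code; same_padding is a hypothetical helper name, and the output size is assumed to be ceil(input / stride).

```python
import math
import numpy as np

def same_padding(in_size, kernel, stride=1, dilation=1):
    """Return (pad_before, pad_after) along one dimension (hypothetical helper)."""
    out_size = math.ceil(in_size / stride)
    total = max((out_size - 1) * stride + (kernel - 1) * dilation + 1 - in_size, 0)
    before = total // 2
    # When total is odd, the extra element goes after (right / bottom)
    return before, total - before

pad = same_padding(10, 6)
print(pad)  # (2, 3)

# Reproduce the convolution of ones with a 6x6 kernel of ones
padded = np.pad(np.ones((10, 10)), (pad, pad), mode="constant")
out = np.array([[padded[i:i + 6, j:j + 6].sum() for j in range(10)]
                for i in range(10)])
print(out[0, 0], out[9, 9])  # 16.0 9.0
```

With 2 zeros before and 3 after, the top-left window covers a 4x4 block of ones (16) and the bottom-right window a 3x3 block (9), matching the corners of the output printed above.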

Fast way to apply function to each row of a numpy array

Suppose I have some nearest neighbor classifier. For a new observation, it computes the distance between the new observation and all observations in the "known" data set. It returns the class label of the known observation that has the smallest distance to the new observation.
import numpy as np
known_obs = np.random.randint(0, 10, 40).reshape(8, 5)
new_obs = np.random.randint(0, 10, 80).reshape(16, 5)
labels = np.random.randint(0, 2, 8).reshape(8, )
def my_dist(x1, known_obs, axis=0):
    return (np.square(np.linalg.norm(x1 - known_obs, axis=axis)))
def nn_classifier(n, known_obs, labels, axis=1, distance=my_dist):
    return labels[np.argmin(distance(n, known_obs, axis=axis))]
def classify_batch(new_obs, known_obs, labels, classifier=nn_classifier, distance=my_dist):
    return [classifier(n, known_obs, labels, distance=distance) for n in new_obs]
print(classify_batch(new_obs, known_obs, labels, nn_classifier, my_dist))
For performance reasons I would like to avoid the for loop in the classify_batch function. Is there a way to use numpy operations to apply the nn_classifier function to each row of new_obs?
I already tried apply_along_axis but as often mentioned it is convenient but not fast.

The key to avoiding the loop is to express the action on the (16,8) array of 'distances'. The labels[] and argmin steps just cloud the issue.
If I set labels = np.arange(8), then this
arr = np.array([my_dist(n, known_obs, axis=1) for n in new_obs])
print(arr)
print(np.argmin(arr, axis=1))
produces the same thing. It still has a list comprehension, but we are closer to 'source'.
[[ 32. 115. 22. 116. 162. 86. 161. 117.]
[ 106. 31. 142. 164. 92. 106. 45. 103.]
[ 44. 135. 94. 18. 94. 50. 87. 135.]
[ 11. 92. 57. 67. 79. 43. 118. 106.]
[ 40. 67. 126. 98. 50. 74. 75. 175.]
[ 78. 61. 120. 148. 102. 128. 67. 191.]
[ 51. 48. 57. 133. 125. 35. 110. 14.]
[ 47. 28. 93. 91. 63. 49. 32. 88.]
[ 61. 86. 23. 141. 159. 85. 146. 22.]
[ 131. 70. 155. 149. 129. 127. 44. 138.]
[ 97. 138. 87. 117. 223. 77. 130. 122.]
[ 151. 78. 211. 161. 131. 115. 46. 164.]
[ 13. 50. 31. 69. 59. 43. 80. 40.]
[ 131. 108. 157. 161. 207. 85. 102. 146.]
[ 39. 106. 67. 23. 61. 67. 70. 88.]
[ 54. 51. 74. 68. 42. 86. 35. 65.]]
[2 1 3 0 0 1 7 1 7 6 5 6 0 5 3 6]
With
print((new_obs[:,None,:] - known_obs[None,:,:]).shape)
I get a (16,8,5) array. So can I apply the linalg.norm on the last axis?
This seems to do the trick
np.square(np.linalg.norm(diff, axis=-1))
So together:
diff = (new_obs[:,None,:] - known_obs[None,:,:])
dist = np.square(np.linalg.norm(diff, axis=-1))
idx = np.argmin(dist, axis=1)
print(idx)
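Putting the labels back in gives a loop-free version of classify_batch. This is a minimal sketch: the function names are illustrative, the random data is regenerated with a seeded generator, and the squared distance is computed as a sum of squares instead of squaring the norm, which gives the same argmin.

```python
import numpy as np

rng = np.random.default_rng(0)
known_obs = rng.integers(0, 10, size=(8, 5))
new_obs = rng.integers(0, 10, size=(16, 5))
labels = rng.integers(0, 2, size=8)

def classify_batch_vectorized(new_obs, known_obs, labels):
    # (16, 1, 5) - (1, 8, 5) broadcasts to (16, 8, 5)
    diff = new_obs[:, None, :] - known_obs[None, :, :]
    # Squared Euclidean distance of every new row to every known row: (16, 8)
    dist = np.sum(diff * diff, axis=-1)
    # Index of the nearest known row per new row, mapped to its label
    return labels[np.argmin(dist, axis=1)]

def classify_batch_loop(new_obs, known_obs, labels):
    # Reference version with one Python-level iteration per new row
    return np.array([labels[np.argmin(np.sum((n - known_obs) ** 2, axis=1))]
                     for n in new_obs])

vectorized = classify_batch_vectorized(new_obs, known_obs, labels)
print(vectorized)
```

The intermediate diff array holds len(new_obs) * len(known_obs) * n_features values, so for large inputs memory use grows accordingly.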

Building histogram from a dict without having to iterate over the keys

I have a dict containing NumPy arrays of varying length:
MyDict = {0:array([[ 15. , 3.89678216],
[ 36. , 9.49245167],
[ 53. , 3.82997799],
[ 83. , 5.25727272],
[ 86. , 8.76663208]]),
1:array([[ 4. , 4.1171155 ],
[ 16. , 12.68122196],
[ 31. , 8.64805222],
[ 37. , 6.07202959]]),
2:array([]),...,
90:array([[ 1. , 1. ],
[ 24. , 8.14221573],
[ 27. , 7.36309862]])}
I would like to obtain a histogram of all the values in the dict. The solution I have now is to iterate over the keys in the dict and fill a NumPy array with a histogram of fixed length for each key:
for KeysElements in MyDict.keys():
    hist,bins = numpy.histogram(np.asarray(MyDict[KeysElements])[:,1],50)
    numpy_hist[KeysElements,:] = hist
I then sum all the histograms over the first dimension of the NumPy array to obtain the histogram of all the keys of the initial dict:
Total_hist = numpy.sum(numpy_hist,axis=0)
The problem with this solution is that I do not know how to handle the bins, which change on each iteration. So my question is: is there a way to achieve this without having to build histograms in a loop?
Thanks for any advice or links.
Greg

You don't seem to use the MyDict index values or the 0th values in the 2nd axis of your np arrays. If this is the case then you could add all the numpy arrays together and do the histogram on that
import numpy as np
MyDict = {0:np.array([[ 15. , 3.89678216],
[ 36. , 9.49245167],
[ 53. , 3.82997799],
[ 83. , 5.25727272],
[ 86. , 8.76663208]]),
1:np.array([[ 4. , 4.1171155 ],
[ 16. , 12.68122196],
[ 31. , 8.64805222],
[ 37. , 6.07202959]]),
2:np.array([]),
90:np.array([[ 1. , 1. ],
[ 24. , 8.14221573],
[ 27. , 7.36309862]])}
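np_array = np.array([]).reshape(0,2)
for i in MyDict:
    a = MyDict[i]
    # Skip empty arrays, which do not have the (k, 2) shape
    if len(a.shape) == 2 and a.shape[1] == 2:
        np_array = np.append(np_array, a, axis=0)
# Histogram only column 1, the values the question histograms
print(np.histogram(np_array[:, 1], 50))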
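The changing bins mentioned in the question come from letting np.histogram pick its own range for each key. A minimal sketch of one way around that, assuming shortened sample data and illustrative variable names: compute one set of bin edges from the pooled values and pass it to every call, so that per-key histograms can be summed.

```python
import numpy as np

my_dict = {
    0: np.array([[15.0, 3.9], [36.0, 9.5], [53.0, 3.8]]),
    1: np.array([[4.0, 4.1], [16.0, 12.7]]),
    2: np.array([]),  # empty entry, skipped below
}

# Keep only the arrays that have the (k, 2) shape
arrays = [a for a in my_dict.values() if a.ndim == 2 and a.shape[1] == 2]

# Pool column 1 of every array and derive 50 shared bins from it
values = np.concatenate([a[:, 1] for a in arrays])
edges = np.linspace(values.min(), values.max(), 51)

# Histogram of everything at once
pooled_hist, _ = np.histogram(values, bins=edges)

# Per-key histograms on the same edges add up to the pooled histogram
per_key = [np.histogram(a[:, 1], bins=edges)[0] for a in arrays]
total_hist = np.sum(per_key, axis=0)
print(np.array_equal(total_hist, pooled_hist))
```

If only the overall histogram is needed, the pooled call alone is enough and the per-key step can be dropped.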

"m x n" dimensional gradient-style array in Python

I checked out
gradient descent using python and numpy
but it didn't solve my problem.
I'm trying to get familiar with image-processing and I want to generate a few test arrays to mess around with in Python.
Is there a method (like np.arange) to create a m x n array where the inner entries form some type of gradient?
I did an example of a naive method for generating the desired output.
Excuse my loose use of the term gradient; I'm using it in its simple meaning of a smooth transition in color.
#!/usr/bin/python
import numpy as np
import matplotlib.pyplot as plt
#Set up parameters
m = 15
n = 10
A_placeholder = np.zeros((m,n))
V_m = np.arange(0,m).astype(np.float32)
V_n = np.arange(0,n).astype(np.float32)
#Iterate through combinations
for i in range(m):
    m_i = V_m[i]
    for j in range(n):
        n_j = V_n[j]
        A_placeholder[i,j] = m_i * n_j #Some combination
#Relabel
A_gradient = A_placeholder
A_placeholder = None
#Print data
print(A_gradient)
#[[ 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
# [ 0. 1. 2. 3. 4. 5. 6. 7. 8. 9.]
# [ 0. 2. 4. 6. 8. 10. 12. 14. 16. 18.]
# [ 0. 3. 6. 9. 12. 15. 18. 21. 24. 27.]
# [ 0. 4. 8. 12. 16. 20. 24. 28. 32. 36.]
# [ 0. 5. 10. 15. 20. 25. 30. 35. 40. 45.]
# [ 0. 6. 12. 18. 24. 30. 36. 42. 48. 54.]
# [ 0. 7. 14. 21. 28. 35. 42. 49. 56. 63.]
# [ 0. 8. 16. 24. 32. 40. 48. 56. 64. 72.]
# [ 0. 9. 18. 27. 36. 45. 54. 63. 72. 81.]
# [ 0. 10. 20. 30. 40. 50. 60. 70. 80. 90.]
# [ 0. 11. 22. 33. 44. 55. 66. 77. 88. 99.]
# [ 0. 12. 24. 36. 48. 60. 72. 84. 96. 108.]
# [ 0. 13. 26. 39. 52. 65. 78. 91. 104. 117.]
# [ 0. 14. 28. 42. 56. 70. 84. 98. 112. 126.]]
#Show Image
plt.imshow(A_gradient)
plt.show()
I've tried np.gradient but it didn't give me the desired output.
#print np.gradient(np.array([V_m,V_n]))
#Traceback (most recent call last):
# File "Untitled.py", line 19, in <module>
# print np.gradient(np.array([V_m,V_n]))
# File "/Users/Mu/anaconda/lib/python2.7/site-packages/numpy/lib/function_base.py", line 1458, in gradient
# out[slice1] = (y[slice2] - y[slice3])
#ValueError: operands could not be broadcast together with shapes (10,) (15,)

A_placeholder[i,j] = m_i * n_j
Any operation like that can be expressed in numpy using broadcasting
A = np.arange(m)[:, None] * np.arange(n)[None, :]
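A slightly fuller sketch of the same idea, using the question's m = 15 and n = 10. The variable names are illustrative, and the additive variant is an extra example of "some combination" that is not in the original question.

```python
import numpy as np

m, n = 15, 10

# Column vector of row indices (m, 1) and row vector of column indices (1, n)
rows = np.arange(m, dtype=np.float32)[:, None]
cols = np.arange(n, dtype=np.float32)[None, :]

# Broadcasting expands both operands to (m, n) before combining them
A_product = rows * cols  # same values as the nested loop in the question
A_sum = rows + cols      # a diagonal gradient, as another combination

# For the product specifically, np.outer gives the same result
A_outer = np.outer(np.arange(m), np.arange(n))

print(A_product.shape)
print(A_product[14, 9])
```

Any element-wise combination of a row quantity and a column quantity can be written this way, which is why the nested loop is not needed.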

Getting all points of a given connected component rapidly

Scikit-Image has quite a few methods available for blob detection:
Laplacian of Gaussian (LoG)
Difference of Gaussian (DoG)
Determinant of Hessian (DoH)
All three return an array that contains a single point within the bounds of the found components:
>>> from skimage import data, feature
>>> img = data.coins()
>>> feature.blob_doh(img)
array([[ 121. , 271. , 30. ],
[ 123. , 44. , 23.55555556],
[ 123. , 205. , 20.33333333],
[ 124. , 336. , 20.33333333],
[ 126. , 101. , 20.33333333],
[ 126. , 153. , 20.33333333],
[ 156. , 302. , 30. ],
[ 185. , 348. , 30. ],
[ 192. , 212. , 23.55555556],
[ 193. , 275. , 23.55555556],
[ 195. , 100. , 23.55555556],
[ 197. , 44. , 20.33333333],
[ 197. , 153. , 20.33333333],
[ 260. , 173. , 30. ],
[ 262. , 243. , 23.55555556],
[ 265. , 113. , 23.55555556],
[ 270. , 363. , 30. ]])
I'd like to use that information to produce lists that contains the coordinates of all the points in a given component.
I could just iterate through the whole image myself, starting with the seeds, and collect all the points in a dict keyed by the point provided by blob detection, but I imagine it would be rather slow unless I'm using Cython (more than willing to be wrong about this, as I'm fairly new to Python). More truthfully, I simply think there is probably a better way than just doing it myself.
