Getting all points of a given connected component rapidly - python

Scikit-Image has quite a few methods available for blob detection:
Laplacian of Gaussian (LoG)
Difference of Gaussian (DoG)
Determinant of Hessian (DoH)
All three return an array with one row per detected blob: a single point inside the blob (its center) plus a size estimate:
>>> from skimage import data, feature
>>> img = data.coins()
>>> feature.blob_doh(img)
array([[ 121. , 271. , 30. ],
[ 123. , 44. , 23.55555556],
[ 123. , 205. , 20.33333333],
[ 124. , 336. , 20.33333333],
[ 126. , 101. , 20.33333333],
[ 126. , 153. , 20.33333333],
[ 156. , 302. , 30. ],
[ 185. , 348. , 30. ],
[ 192. , 212. , 23.55555556],
[ 193. , 275. , 23.55555556],
[ 195. , 100. , 23.55555556],
[ 197. , 44. , 20.33333333],
[ 197. , 153. , 20.33333333],
[ 260. , 173. , 30. ],
[ 262. , 243. , 23.55555556],
[ 265. , 113. , 23.55555556],
[ 270. , 363. , 30. ]])
I'd like to use that information to produce lists containing the coordinates of all the points in a given component.
I could iterate through the whole image myself, starting from those seeds, and collect all the points in a dict keyed by the point provided by blob detection, but I imagine that would be rather slow unless I used Cython (I'm more than willing to be wrong about this, as I'm fairly new to Python). More to the point, I suspect there is a better way than doing it myself.
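One possible alternative (a sketch, not from the original post; it assumes a simple Otsu threshold is enough to separate the coins from the background): label the connected components of a binary mask with skimage.measure.label and read each component's pixel coordinates from regionprops, instead of hand-rolling the flood fill:
from skimage import data, measure, filters

img = data.coins()
mask = img > filters.threshold_otsu(img)   # binary foreground mask (assumed sufficient here)
labels = measure.label(mask)               # integer label per connected component
coords_by_label = {r.label: r.coords       # r.coords is an (N, 2) array of (row, col) points
                   for r in measure.regionprops(labels)}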

Related

why I cannot reshape or resize my numpy array

I have the following output for an array a:
[ 1. 3. 5. 7. 9. 11. 13. 15. 17. 19. 21. 23. 25. 27.
29. 31. 33. 35. 37. 39. 41. 43. 45. 47. 97. 99. 101. 103.
105. 107. 109. 111. 113. 115. 117. 119. 121. 123. 125. 127. 129. 131.
133. 135. 137. 139. 141. 143.]
I want to reshape it to the below
[[1. 3. 5. 7. 9. 11. 13. 15.]
[17. 19. 21. 23. 25. 27. 29. 31.]
[33. 35. 37. 39. 41. 43. 45. 47.]
[97. 99. 101. 103. 105. 107. 109. 111.]
[113. 115. 117. 119. 121. 123. 125. 127.]
[129. 131. 133. 135. 137. 139. 141. 143.]]
I tried a.resize(6, 8), but it gives me this error: "resize only works on single-segment arrays".
Also, when I try a.reshape(6, 8), it gives me back the same array.
I don't understand the reason for this, as I have tested it on another array and it worked well.
Try a.reshape((8, 6)); notice the double parentheses.
a = np.array([1., 3., 5., 7., 9., 11., 13., 15., 17., 19., 21., 23., 25., 27.,
29., 31., 33., 35., 37., 39., 41., 43., 45., 47., 97., 99., 101., 103.,
105., 107., 109., 111., 113., 115., 117., 119., 121., 123., 125., 127., 129., 131.,
133., 135., 137., 139., 141., 143.])
print(a.reshape((8, 6)))
out:
[[ 1. 3. 5. 7. 9. 11.]
[ 13. 15. 17. 19. 21. 23.]
[ 25. 27. 29. 31. 33. 35.]
[ 37. 39. 41. 43. 45. 47.]
[ 97. 99. 101. 103. 105. 107.]
[109. 111. 113. 115. 117. 119.]
[121. 123. 125. 127. 129. 131.]
[133. 135. 137. 139. 141. 143.]]
Note that for the output you requested, the shape should be (6, 8):
a.reshape((6, 8))
out:
[[ 1. 3. 5. 7. 9. 11. 13. 15.]
[ 17. 19. 21. 23. 25. 27. 29. 31.]
[ 33. 35. 37. 39. 41. 43. 45. 47.]
[ 97. 99. 101. 103. 105. 107. 109. 111.]
[113. 115. 117. 119. 121. 123. 125. 127.]
[129. 131. 133. 135. 137. 139. 141. 143.]]
You can read about NumPy's reshape here: reshape documentation
Try
b = a.reshape((8,6))
and keep in mind two things for future use of similar methods:
The reshape method takes the new shape as a tuple, in this case (8, 6). (NumPy's ndarray.reshape also accepts the dimensions as separate integers, so a.reshape(8, 6) works as well.) Always pay attention to the expected arguments; you can check them by hovering over a function in PyCharm and most editors.
In NumPy, many methods do not modify the given object but instead return a new value for you to use.
It is healthy to always check the documentation for that, in order to avoid catastrophic heartbreaks, trust me.
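A minimal sketch of point 2 (using a throwaway array, not the one from the question): reshape returns a new array, so the result has to be assigned, otherwise nothing appears to happen:
import numpy as np

a = np.arange(48, dtype=float)
a.reshape(6, 8)        # returns a new, reshaped array; a itself is unchanged
b = a.reshape(6, 8)    # assign the result to keep it
print(b.shape)         # (6, 8)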

Python - how to store multiple lists in one list?

This is the output of multiple lists and I want to store them in one list or array:
(array([[ 1.52494154e+11, 1.52811638e+11, 1.52565040e+11, ...,
1.47778892e+11, 1.46781213e+11, 1.46678951e+11],
[ 7.69589176e+10, 7.73638333e+10, 7.76935891e+10, ...,
7.48498747e+10, 7.40088248e+10, 7.40343108e+10],
[ 6.32683585e+04, 1.58170271e+06, 6.11287648e+06, ...,
5.06690834e+05, 3.31360693e+05, 7.04757400e+05],
...,
[ 7.79589127e+05, 8.09843763e+04, 2.52907491e+05, ...,
2.48520301e+05, 2.11734697e+05, 2.50917758e+05],
[ 9.41199946e+05, 4.98371406e+05, 1.29328139e+06, ...,
2.56729806e+05, 3.45253951e+05, 3.51932417e+05],
[ 4.36846676e+05, 1.24123764e+06, 9.20694394e+05, ...,
8.35807658e+04, 8.36986905e+05, 3.57807267e+04]]), array([ 0. , 3.90625, 7.8125 , 11.71875, 15.625 ,
19.53125, 23.4375 , 27.34375, 31.25 , 35.15625,
39.0625 , 42.96875, 46.875 , 50.78125, 54.6875 ,
58.59375, 62.5 , 66.40625, 70.3125 , 74.21875,
78.125 , 82.03125, 85.9375 , 89.84375, 93.75 ,
97.65625, 101.5625 , 105.46875, 109.375 , 113.28125,
117.1875 , 121.09375, 125. , 128.90625, 132.8125 ,
136.71875, 140.625 , 144.53125, 148.4375 , 152.34375,
156.25 , 160.15625, 164.0625 , 167.96875, 171.875 ,
175.78125, 179.6875 , 183.59375, 187.5 , 191.40625,
195.3125 , 199.21875, 203.125 , 207.03125, 210.9375 ,
214.84375, 218.75 , 222.65625, 226.5625 , 230.46875,
234.375 , 238.28125, 242.1875 , 246.09375, 250. ,
253.90625, 257.8125 , 261.71875, 265.625 , 269.53125,
273.4375 , 277.34375, 281.25 , 285.15625, 289.0625 ,
292.96875, 296.875 , 300.78125, 304.6875 , 308.59375,
312.5 , 316.40625, 320.3125 , 324.21875, 328.125 ,
332.03125, 335.9375 , 339.84375, 343.75 , 347.65625,
351.5625 , 355.46875, 359.375 , 363.28125, 367.1875 ,
371.09375, 375. , 378.90625, 382.8125 , 386.71875,
390.625 , 394.53125, 398.4375 , 402.34375, 406.25 ,
410.15625, 414.0625 , 417.96875, 421.875 , 425.78125,
429.6875 , 433.59375, 437.5 , 441.40625, 445.3125 ,
449.21875, 453.125 , 457.03125, 460.9375 , 464.84375,
468.75 , 472.65625, 476.5625 , 480.46875, 484.375 ,
488.28125, 492.1875 , 496.09375, 500. ]), array([ 1.28000000e-01, 2.56000000e-01, 3.84000000e-01, ...,
1.41529600e+03, 1.41542400e+03, 1.41555200e+03]), <matplotlib.image.AxesImage object at 0x000002161A78F898>)
This is my code. These multiple lists come from the spectrogram of three-axis sensor data; to calculate the spectrogram I first calculated the magnitude of the three axes. What I want is to save the spectrogram output in a more efficient way, as a text file, so I can use it as input to another model.
import numpy as np
import matplotlib.pyplot as plt

dataset = np.loadtxt("trainingdatasetMAG.txt", delimiter=",")
X = dataset[:,0:6]
Y = dataset[:,6]
fake_size = 1415684
time = np.arange(fake_size)/1000  # 1 kHz
base_freq = 2 * np.pi * 100
magnitude = dataset[:,5]
plt.title('xyz_magnitude')
ls = plt.specgram(magnitude, Fs=1000)
It sounds like you want to flatten the list. Please see this answer, repeated here.
newlist = [item for sublist in mainlist for item in sublist]
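If the goal is to save the spectrogram for use in another model, one possible approach (a sketch, not part of the original answer; the file names are placeholders and magnitude is the array from the code above) is to unpack the tuple returned by plt.specgram and save the numeric parts directly:
import numpy as np
import matplotlib.pyplot as plt

spectrum, freqs, t, im = plt.specgram(magnitude, Fs=1000)  # same call as above, unpacked
np.savetxt("spectrogram.txt", spectrum)                    # 2-D array: frequency bins x time bins
np.savez("spectrogram.npz", spectrum=spectrum, freqs=freqs, t=t)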

Avoid memory errors when integrating large array with Numpy

I have a 601x350x200x146 numpy float64 array which, according to my calculations, takes about 22.3 GB of memory. The output of free -m tells me I have about 100 GB of free memory, so it fits fine. However, when integrating with
result = np.trapz(large_arr, axis=3)
I get a memory error. I understand that this is because of the intermediate arrays that numpy.trapz has to create to perform the integration. But I'm looking to see if there's a way around it, or at least a way to minimize the extra use of memory.
I have read about memory errors and I know of a few ways to avoid them: one is placing a gc.collect() call before the integration. I tried this and it didn't work.
The other is using in-place operators, such as writing arr *= a instead of arr = arr * a, which I can't really do here. So I don't know what else to try.
Does anyone know of a way to do this operation without raising a memory error?
You can reproduce the error with:
arr = np.ones((601,350,200,146), dtype=np.float64)
arr=np.trapz(arr, axis=3)
although you'll have to scale down the size to match your memory size.
numpy.trapz provides some convenience, but the actual calculation is very simple. To avoid large temporary arrays, just implement it yourself:
In [37]: x.shape
Out[37]: (2, 4, 4, 10)
Here's the result of numpy.trapz(x, axis=3):
In [38]: np.trapz(x, axis=3)
Out[38]:
array([[[ 43. , 48.5, 46.5, 67. ],
[ 35.5, 39.5, 52.5, 35. ],
[ 44.5, 47.5, 34.5, 39.5],
[ 54. , 40. , 46.5, 50.5]],
[[ 42. , 60. , 55.5, 51. ],
[ 51.5, 40. , 52. , 42.5],
[ 48.5, 43. , 32. , 36.5],
[ 42.5, 38. , 38. , 45. ]]])
Here's the calculation written to use no large intermediate arrays. (The slice x[:,:,:,1:-1] does not copy the data associated with the array.)
In [48]: 0.5*(x[:,:,:,0] + 2*x[:,:,:,1:-1].sum(axis=3) + x[:,:,:,-1])
Out[48]:
array([[[ 43. , 48.5, 46.5, 67. ],
[ 35.5, 39.5, 52.5, 35. ],
[ 44.5, 47.5, 34.5, 39.5],
[ 54. , 40. , 46.5, 50.5]],
[[ 42. , 60. , 55.5, 51. ],
[ 51.5, 40. , 52. , 42.5],
[ 48.5, 43. , 32. , 36.5],
[ 42.5, 38. , 38. , 45. ]]])
If x has shape (m, n, p, q), the few temporary arrays that are generated in that expression all have shape (m, n, p).
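If even the (m, n, p) temporaries are too large, a further option (a sketch with a hypothetical helper name, assuming unit spacing along the integration axis) is to apply the same expression in chunks along the first axis, so each temporary only has shape (chunk, n, p):
import numpy as np

def trapz_last_axis_chunked(arr, chunk=16):
    # trapezoid rule with unit spacing along the last axis, computed block
    # by block along the first axis to keep temporaries small
    out = np.empty(arr.shape[:-1], dtype=arr.dtype)
    for start in range(0, arr.shape[0], chunk):
        block = arr[start:start + chunk]
        out[start:start + chunk] = 0.5 * (block[..., 0]
                                          + 2 * block[..., 1:-1].sum(axis=-1)
                                          + block[..., -1])
    return out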

Building histogram from a dict without having to iterate over the keys

I have a dict containing numpy arrays of varying lengths:
MyDict = {0:array([[ 15. , 3.89678216],
[ 36. , 9.49245167],
[ 53. , 3.82997799],
[ 83. , 5.25727272],
[ 86. , 8.76663208]]),
1:array([[ 4. , 4.1171155 ],
[ 16. , 12.68122196],
[ 31. , 8.64805222],
[ 37. , 6.07202959]]),
2:array([]),...,
90:array([[ 1. , 1. ],
[ 24. , 8.14221573],
[ 27. , 7.36309862]])}
I would like to obtain a histogram of all the values in the dict. The solution I have now is to iterate over the keys of the dict and fill a numpy array with a histogram of fixed length:
for KeysElements in MyDict.keys():
    hist, bins = numpy.histogram(np.asarray(MyDict[KeysElements])[:,1], 50)
    numpy_hist[KeysElements,:] = hist
I then sum up all the histograms over the first dimension of the numpy array to obtain the histogram over all the keys of the initial dict:
Total_hist = numpy.sum(numpy_hist,axis=0)
The problem with this solution is that I do not know how to handle the bins, which change on each iteration, so my question is: is there a way to achieve this without having to build histograms in a loop?
Thanks for any advice or links.
Greg
You don't seem to use the MyDict index values or the 0th values in the 2nd axis of your np arrays. If that is the case, then you could concatenate all the numpy arrays and do the histogram on that:
import numpy as np
MyDict = {0:np.array([[ 15. , 3.89678216],
[ 36. , 9.49245167],
[ 53. , 3.82997799],
[ 83. , 5.25727272],
[ 86. , 8.76663208]]),
1:np.array([[ 4. , 4.1171155 ],
[ 16. , 12.68122196],
[ 31. , 8.64805222],
[ 37. , 6.07202959]]),
2:np.array([]),
90:np.array([[ 1. , 1. ],
[ 24. , 8.14221573],
[ 27. , 7.36309862]])}
np_array = np.array([]).reshape(0,2)
for i in MyDict:
    a = MyDict[i]
    if len(a.shape) == 2 and a.shape[1] == 2:
        np_array = np.append(np_array, a, axis=0)
# histogram only the value column, matching the [:,1] used in the question
print(np.histogram(np_array[:, 1], 50))
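An equivalent way to do it without the explicit loop (still assuming only the second column matters): concatenate the value columns of all non-empty arrays and histogram once:
values = np.concatenate([a[:, 1] for a in MyDict.values() if a.size])
hist, bins = np.histogram(values, 50)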

"m x n" dimensional gradient-style array in Python

I checked out
gradient descent using python and numpy
but it didn't solve my problem.
I'm trying to get familiar with image-processing and I want to generate a few test arrays to mess around with in Python.
Is there a method (like np.arange) to create an m x n array where the inner entries form some type of gradient?
I did an example of a naive method for generating the desired output.
Excuse my general use of the term gradient; I'm using it in its simple meaning of a smooth transition in color.
#!/usr/bin/python
import numpy as np
import matplotlib.pyplot as plt
#Set up parameters
m = 15
n = 10
A_placeholder = np.zeros((m,n))
V_m = np.arange(0,m).astype(np.float32)
V_n = np.arange(0,n).astype(np.float32)
#Iterate through combinations
for i in range(m):
    m_i = V_m[i]
    for j in range(n):
        n_j = V_n[j]
        A_placeholder[i,j] = m_i * n_j  #Some combination
#Relabel
A_gradient = A_placeholder
A_placeholder = None
#Print data
print A_gradient
#[[ 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
[ 0. 1. 2. 3. 4. 5. 6. 7. 8. 9.]
[ 0. 2. 4. 6. 8. 10. 12. 14. 16. 18.]
[ 0. 3. 6. 9. 12. 15. 18. 21. 24. 27.]
[ 0. 4. 8. 12. 16. 20. 24. 28. 32. 36.]
[ 0. 5. 10. 15. 20. 25. 30. 35. 40. 45.]
[ 0. 6. 12. 18. 24. 30. 36. 42. 48. 54.]
[ 0. 7. 14. 21. 28. 35. 42. 49. 56. 63.]
[ 0. 8. 16. 24. 32. 40. 48. 56. 64. 72.]
[ 0. 9. 18. 27. 36. 45. 54. 63. 72. 81.]
[ 0. 10. 20. 30. 40. 50. 60. 70. 80. 90.]
[ 0. 11. 22. 33. 44. 55. 66. 77. 88. 99.]
[ 0. 12. 24. 36. 48. 60. 72. 84. 96. 108.]
[ 0. 13. 26. 39. 52. 65. 78. 91. 104. 117.]
[ 0. 14. 28. 42. 56. 70. 84. 98. 112. 126.]]
#Show Image
plt.imshow(A_gradient)
plt.show()
I've tried np.gradient but it didn't give me the desired output.
#print np.gradient(np.array([V_m,V_n]))
#Traceback (most recent call last):
# File "Untitled.py", line 19, in <module>
# print np.gradient(np.array([V_m,V_n]))
# File "/Users/Mu/anaconda/lib/python2.7/site-packages/numpy/lib/function_base.py", line 1458, in gradient
# out[slice1] = (y[slice2] - y[slice3])
#ValueError: operands could not be broadcast together with shapes (10,) (15,)
A_placeholder[i,j] = m_i * n_j
Any operation like that can be expressed in NumPy using broadcasting:
A = np.arange(m)[:, None] * np.arange(n)[None, :]
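As a quick check (a sketch using the same m and n as in the question), the broadcast product reproduces the loop-built table, and np.outer builds the same multiplication table:
import numpy as np

m, n = 15, 10
A = np.arange(m)[:, None] * np.arange(n)[None, :]
assert np.array_equal(A, np.outer(np.arange(m), np.arange(n)))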
