I have been stumped on this problem for a while now and was wondering if anyone would be able to help. So let's say I have a binary image as shown below and I would like to count the black elements (zero). The problem is I want to know the number of elements associated with 'background' and 'trapezoid' in the middle individually, so output two values. What would be the easiest way to approach this? I have been trying to do it without using a mask but is that even possible? I have the numpy and scipy libraries if that helps.
You can use two functions from scipy.ndimage: label and find_objects (older SciPy releases also expose them under scipy.ndimage.measurements).
First, invert the array, because the label function treats zero as the background.
inverted = 1 - binary_image_array
Then you call label to find the different regions:
labeled_array, num_features = scipy.ndimage.label(inverted)
So, for this particular array, where you already know there are exactly two black blobs, you have the two regions in labeled_array.
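To get the two pixel counts the question asks for, one option is to count how often each label occurs. The following is only a sketch: the small test array (a white ring on a black background) is made up for illustration and stands in for the image in the question.

```python
import numpy as np
from scipy import ndimage

# Hypothetical stand-in for the binary image: a white ring (1)
# on a black background (0) with a black hole in the middle.
binary_image_array = np.zeros((10, 10), dtype=int)
binary_image_array[3:7, 3:7] = 1
binary_image_array[4:6, 4:6] = 0

# label() treats zero as background, so invert to label the black pixels.
inverted = 1 - binary_image_array
labeled_array, num_features = ndimage.label(inverted)

# bincount gives the number of pixels per label; index 0 holds the
# white pixels (background of the inverted image), so drop it.
counts = np.bincount(labeled_array.ravel())[1:]
print(num_features, counts)
```

Each entry of counts is the size of one black region, so the outer background and the enclosed area are reported separately.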
Obviously, the scipy approach is a good answer.
I was thinking that you might be able to work with numpy.cumsum and numpy.diff to find an enclosed area.
The cumulative sum will be zero while you are in the black area, then increase by one for every pixel in the white area, be stable again while you traverse the enclosed area, then start increasing again, etc.
The second order difference then finds places where the jumps occur, and you are left with a "classified" map. No guarantee that this generalizes, just an idea.
import numpy
a = numpy.zeros((10, 10))
a[3:7, 3:7] = 1
a[4:6, 4:6] = 0
y = numpy.cumsum(a, axis=0)
x = numpy.cumsum(a, axis=1)
yy = numpy.diff(y, n=2, axis=0)
xx = numpy.diff(x, n=2, axis=1)
numpy.dot(xx, yy)
array([[ 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
[ 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
[ 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
[ 0., 0., 0., 2., 2., 2., 2., 0., 0., 0.],
[ 0., 0., 0., 2., 4., 4., 2., 0., 0., 0.],
[ 0., 0., 0., 2., 4., 4., 2., 0., 0., 0.],
[ 0., 0., 0., 2., 2., 2., 2., 0., 0., 0.],
[ 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
[ 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
[ 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.]])
I have the following problem that I wanted to solve using opencv or scikit-image.
Suppose I have a "map" in the following form:
1 is ground 0 is water
map = np.array([
[ 0., 0., 0., 0., 0., 0.],
[ 0., 0., 0., 0., 0., 0.],
[ 0., 1., 1., 1., 0., 0.],
[ 0., 0., 1., 0., 1., 0.],
[ 0., 0., 0., 0., 1., 0.],
[ 0., 0., 0., 0., 0., 0.]])
1. How many islands are there in the map, considering 4 neighbors? In this example there are 2.
2. Given an (i,j) position, return the number of ground neighbors.
example: (2,2) -> 4
Solving question no. 1 with scikit-image: the measure module will be your friend. Please check out the documentation for it.
import numpy as np
from skimage import measure
import matplotlib.pyplot as plt
img = np.array([ [ 0., 0., 0., 0., 0., 0.],
[ 0., 0., 0., 0., 0., 0.],
[ 0., 1., 1., 1., 0., 0.],
[ 0., 0., 1., 0., 1., 0.],
[ 0., 0., 0., 0., 1., 0.],
[ 0., 0., 0., 0., 0., 0.]])
imglabeled,island_count = measure.label(img,background=0,return_num=True,connectivity=1)
plt.imshow(imglabeled)
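The labeled image can also be used for the second question, if "number of ground neighbors" is read as the size of the island that contains (i,j). That reading is an assumption, although it matches the example (2,2) -> 4. The helper island_size_at below is a hypothetical name, not part of scikit-image.

```python
import numpy as np
from skimage import measure

img = np.array([[0, 0, 0, 0, 0, 0],
                [0, 0, 0, 0, 0, 0],
                [0, 1, 1, 1, 0, 0],
                [0, 0, 1, 0, 1, 0],
                [0, 0, 0, 0, 1, 0],
                [0, 0, 0, 0, 0, 0]])

# connectivity=1 means 4-neighborhood: diagonal cells do not connect.
labeled, island_count = measure.label(
    img, background=0, return_num=True, connectivity=1)

# sizes[k] is the number of cells carrying label k (label 0 is water).
sizes = np.bincount(labeled.ravel())

def island_size_at(i, j):
    """Size of the island containing (i, j), or 0 if the cell is water."""
    label_id = labeled[i, j]
    return int(sizes[label_id]) if label_id != 0 else 0

print(island_count)          # number of islands
print(island_size_at(2, 2))  # size of the island containing (2, 2)
```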
I have:
mask = mask_model(input_spectrogram)
mask = torch.round(mask).float()
torch.set_printoptions(precision=4)
print(mask.size(), input_spectrogram.size(), masked_input.size())
print(mask[0][-40:], mask[0].size())
This prints:
tensor([1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 0., 0., 0., 0., 0., 0., 0.,
0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.,
0., 0., 0., 0.], grad_fn=<SliceBackward>) torch.Size([400])
But I want it to print 1.0000 instead of 1.. Why doesn't my set_printoptions call do that, even after I converted to float()?
Unfortunately, this is simply how PyTorch displays tensor values. Your code works fine: if you do print(mask * 1.1), you can see that PyTorch does indeed print four decimal places once the tensor values can no longer be represented as integers.
How can I remove the NaN rows from the array below using indices (since I will need to remove the same rows from a different array)?
array([[[nan, 0., 0., 0.],
[ 0., 0., 0., 0.],
[ 0., 0., 0., 0.]],
[[ 0., 0., 0., 0.],
[ 0., nan, 0., 0.],
[ 0., 0., 0., 0.]]])
I get the indices of the rows to be removed by using the command
a[np.isnan(a).any(axis=2)]
But using what I would normally use on a 2D array does not produce the desired result, losing the array structure.
a[~np.isnan(a).any(axis=2)]
array([[0., 0., 0., 0.],
[0., 0., 0., 0.],
[0., 0., 0., 0.],
[0., 0., 0., 0.]])
How can I remove the rows I want using the indices obtained from my first command?
You need to reshape:
a[~np.isnan(a).any(axis=2)].reshape(a.shape[0], -1, a.shape[2])
But be aware that every 2D subarray must contain the same number of NaN rows; otherwise the remaining rows cannot be reshaped into a new 3D array.
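Since the same rows have to be removed from a second array, it may help to store the boolean mask once and reuse it. A minimal sketch, assuming the second array (called b here, a made-up name) has the same shape as a:

```python
import numpy as np

a = np.zeros((2, 3, 4))
a[0, 0, 0] = np.nan
a[1, 1, 1] = np.nan

# Hypothetical second array with the same shape as a.
b = np.arange(24).reshape(2, 3, 4)

# keep[i, j] is True when row j of subarray i contains no NaN.
keep = ~np.isnan(a).any(axis=2)

# Boolean indexing flattens the first two axes, so restore the 3D shape.
a_clean = a[keep].reshape(a.shape[0], -1, a.shape[2])
b_clean = b[keep].reshape(b.shape[0], -1, b.shape[2])

print(a_clean.shape, b_clean.shape)
```

The reshape only works because each subarray loses exactly one row here.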
So let's say I have a (4,10) array initialized to zeros, and I have an input array in the form [2,7,0,3]. The input array will modify the zeros matrix to look like this:
[[0,0,1,0,0,0,0,0,0,0],
[0,0,0,0,0,0,0,1,0,0],
[1,0,0,0,0,0,0,0,0,0],
[0,0,0,1,0,0,0,0,0,0]]
I know I can do that by looping over the input array and setting one entry per row, with something like matrix[i][input_target[i]] = 1, but I tried to do it without a loop using something like:
matrix[:, input_target] = 1, but that sets those columns to 1 in every row instead of one entry per row.
Apparently the way to do it is:
matrix[range(input_target.shape[0]), input_target] = 1. The question is: why does this work, while using the colon does not?
Thanks!
You only wish to update one column for each row. Therefore, with advanced indexing you must explicitly provide those row identifiers:
A = np.zeros((4, 10))
A[np.arange(A.shape[0]), [2, 7, 0, 3]] = 1
Result:
array([[ 0., 0., 1., 0., 0., 0., 0., 0., 0., 0.],
[ 0., 0., 0., 0., 0., 0., 0., 1., 0., 0.],
[ 1., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
[ 0., 0., 0., 1., 0., 0., 0., 0., 0., 0.]])
Using a colon for the row indexer will tell NumPy to update all rows for the specified columns:
A[:, [2, 7, 0, 3]] = 1
array([[ 1., 0., 1., 1., 0., 0., 0., 1., 0., 0.],
[ 1., 0., 1., 1., 0., 0., 0., 1., 0., 0.],
[ 1., 0., 1., 1., 0., 0., 0., 1., 0., 0.],
[ 1., 0., 1., 1., 0., 0., 0., 1., 0., 0.]])
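The difference can be checked by counting how many entries each form sets. This is a small sketch; the variable names paired and sliced are made up for the comparison.

```python
import numpy as np

rows = np.arange(4)
cols = np.array([2, 7, 0, 3])

# Two index arrays are paired element by element:
# (0, 2), (1, 7), (2, 0), (3, 3) -> one entry per row.
paired = np.zeros((4, 10))
paired[rows, cols] = 1

# A slice combined with an index array selects every row
# for each listed column -> 4 rows x 4 columns.
sliced = np.zeros((4, 10))
sliced[:, cols] = 1

print(int(paired.sum()), int(sliced.sum()))
```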
I am doing audio analysis in Python. My end goal is to get a list of frequencies and their respective volumes, like { frequency : volume (0.0 - 1.0) }.
I have my audio data as a list of frames with values between -1.0 and +1.0. I used NumPy's Fourier transform on this list, numpy.fft.fft(), but the resulting data makes no sense to me.
I do understand that the Fourier transform converts from the time domain to the frequency domain, but not quite how it works mathematically. That's why I don't quite understand the results.
What do the values in the list that numpy.fft.fft() returns mean? How do I work with it/interpret it?
What would the max/min values of the Fourier transform performed on a list as described above be?
How can I get to my end goal of a dictionary in the form { frequency : volume (0.0 - 1.0) }?
Thank you. Sorry if my lack of understanding of the fourier transform made you facepalm.
Consider the FFT of a single period of a sine wave:
>>> t = np.linspace(0, 2*np.pi, 100)
>>> x = np.sin(t)
>>> f = np.fft.rfft(x)
>>> np.round(np.abs(f), 0)
array([ 0., 50., 1., 0., 0., 0., 0., 0., 0., 0., 0.,
0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.,
0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.,
0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.,
0., 0., 0., 0., 0., 0., 0.])
The FFT returns an array of complex numbers that give the amplitude and phase of each frequency component. Assuming you're only interested in the amplitude, I've used np.abs to get the magnitude for each frequency and rounded it to the nearest integer using np.round(__, 0). You can see the spike at index 1, indicating that a sine wave with period equal to the number of samples was found.
Now make the wave a bit more complex:
>>> x = np.sin(t) + np.sin(3*t) + np.sin(5*t)
>>> f = np.fft.rfft(x)
>>> np.round(np.abs(f), 0)
array([ 0., 50., 1., 50., 0., 48., 4., 2., 2., 1., 1.,
1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 0.,
0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.,
0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.,
0., 0., 0., 0., 0., 0., 0.])
We now see spikes at indices 1, 3 & 5, corresponding to our input: sine waves with periods of n, n/3 and n/5 (where n is the number of input samples).
EDIT
Here's a good conceptual explanation of the Fourier transform: http://betterexplained.com/articles/an-interactive-guide-to-the-fourier-transform/
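To get from the magnitudes above to the { frequency : volume } dictionary asked for in the question, the bin indices have to be converted to frequencies and the magnitudes rescaled. The following is a rough sketch, not the only way to do it: the sample rate and the test signal are made-up assumptions, and dividing by n/2 is just one common normalization, chosen so that a pure sine of amplitude 1.0 maps to 1.0.

```python
import numpy as np

sample_rate = 1000  # Hz, assumed; use the real rate of your audio
n = 1000            # number of samples (one second here)
t = np.arange(n) / sample_rate

# Hypothetical test signal: 50 Hz at amplitude 0.5, 120 Hz at 0.25.
x = 0.5 * np.sin(2 * np.pi * 50 * t) + 0.25 * np.sin(2 * np.pi * 120 * t)

spectrum = np.fft.rfft(x)
freqs = np.fft.rfftfreq(n, d=1 / sample_rate)  # frequency of each bin in Hz

# rfft keeps only the non-negative frequencies, so double the magnitudes
# to recover sine amplitudes, then undo the doubling for the bins that
# have no negative-frequency twin (DC, and Nyquist when n is even).
amplitude = 2 * np.abs(spectrum) / n
amplitude[0] /= 2
if n % 2 == 0:
    amplitude[-1] /= 2

volume = {float(f): float(a) for f, a in zip(freqs, amplitude)}
loudest = freqs[np.argmax(amplitude)]
print(loudest, amplitude[50], amplitude[120])
```

Frequencies that do not fall exactly on a bin are smeared over neighboring bins, so real audio will not give values this clean.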