Generate pixel density map (heatmap) from image with numpy array manipulation - python

The specific problem I am trying to solve is:
I have a binary image (binary map) that I want to generate a heatmap (density map) for. My idea is to get the 2D array of this image, let's say it is 12x12,
a = np.random.randint(20, size=(12, 12))
and index and process it with a fixed-size submatrix (let's say 3x3), so that for every submatrix a pixel percentage value is calculated (nonzero pixels / total pixels).
submatrix = a[0:3, 0:3]
pixel_density = np.count_nonzero(submatrix) / submatrix.size
At the end, all the percentage values will make up a new 2D array (a smaller, 4x4 density array) that represents the density estimation of the original image. Lower resolution is fine because the data it will be compared to has a lower resolution as well.
I am not sure how to do that with numpy, especially the indexing part. Also, if there is a better way of generating a heatmap like that, please let me know as well.
Thank you!

Maybe a 2-D convolution? Basically this sweeps the b matrix (which is just 1s, below) across the a matrix and sums the pixels under it at each position, so it does the summation you were looking for. This link has a visual demo of convolution near the bottom.
import numpy as np
from scipy import signal
a = np.random.randint(2, size=(12, 12))
b = np.ones((4,4))
signal.convolve2d(a,b, 'valid') / b.sum()
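Note that a 'valid' convolution slides the window one pixel at a time, so a 4x4 kernel on a 12x12 array gives a 9x9 result rather than the 4x4 grid of non-overlapping 3x3 blocks described in the question. If non-overlapping blocks are what is wanted, a reshape-based reduction is one possible alternative. The following is only a sketch: the random input, the variable names and the block size are assumptions for illustration, and it requires the image dimensions to be exact multiples of the block size.

```python
import numpy as np

rng = np.random.default_rng(0)
a = rng.integers(0, 2, size=(12, 12))  # stand-in binary image (assumption)

block = 3  # hypothetical block size; must divide both image dimensions
rows, cols = a.shape

# Split each axis into (number of blocks, block size), so that
# blocks[i, r, j, c] == a[block * i + r, block * j + c].
blocks = a.reshape(rows // block, block, cols // block, block)

# Count nonzero pixels inside each block and divide by the block area.
density = np.count_nonzero(blocks, axis=(1, 3)) / block**2

print(density.shape)  # (4, 4)
```

Each entry of density equals np.count_nonzero(submatrix) / submatrix.size for the corresponding 3x3 submatrix.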

Related

Is it possible to reverse an image if we did a dot product between noise and the image?

I did dot product of the image with a noise.
import numpy as np
np.random.seed(100)
x = grayscale.shape[0]
y = grayscale.shape[1]
noise = np.random.rand(x,y)
noise_dot_img = grayscale.dot(noise)
plt.imshow(noise_dot_img, cmap = "gray")
Image with noise
Original image
Apologies for the horrible formatting but stack overflow doesn't support latex.
For two vectors a and b, the dot product is a dot b = ab^T. For 2-D arrays you can just drop the transpose, since the dot product between matrices is defined as matrix multiplication in numpy: A dot B = AB.
If A is your original image and B is the noise matrix, you can reverse it by multiplying your final image matrix with the inverse of B (if it has one), since matrix multiplication is associative.
So to get your original matrix: A = (A dot B) * B^-1
EDIT: for clarity here is some example code:
import numpy as np
A = np.random.randint(10, size=(3, 3))
B = np.random.randint(10, size=(3, 3))
image_with_noise = A.dot(B)
noise_inverse = np.linalg.inv(B)
recreated_image = np.matmul(image_with_noise, noise_inverse)
I think you should share some more information about what exactly you are trying to achieve here.
In any case, you actually can get your image back in this specific example, by inverting the noise matrix and multiplying the noisy image by it:
inv = np.linalg.inv(noise)
restored_img = noise_dot_img @ inv
However, there are a lot of things that need explaining. Overall, this is not really how we tackle this problem, since we almost never know the "noise" matrix. This is why signal processing exists. Also, in this example you are dealing with a square image. Otherwise, we would not be able to find the inverse (and we would have to use the pseudo-inverse). That said, one should always be careful before deciding to invert matrices.
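As a rough illustration of the caution about inverting matrices, the sketch below recovers the image with np.linalg.solve instead of forming the inverse explicitly, and uses np.linalg.pinv for a non-square case. All arrays here are made up; in particular, the identity added to the noise matrix is an assumption that keeps it well conditioned, and the non-square case only works because the made-up noise matrix has full row rank.

```python
import numpy as np

rng = np.random.default_rng(100)

# Square case. The added identity is an assumption that keeps the
# noise matrix invertible and well conditioned.
image = rng.random((4, 4))
noise = rng.random((4, 4)) + 4 * np.eye(4)
mixed = image @ noise

# Solve X @ noise = mixed for X by transposing: noise.T @ X.T = mixed.T
restored = np.linalg.solve(noise.T, mixed.T).T
print(np.allclose(restored, image))

# Non-square case: a 3x5 image times a 5x7 noise matrix. When the noise
# matrix has full row rank, its pseudo-inverse acts as a right inverse.
wide_image = rng.random((3, 5))
wide_noise = rng.random((5, 7))
wide_mixed = wide_image @ wide_noise
wide_restored = wide_mixed @ np.linalg.pinv(wide_noise)
print(np.allclose(wide_restored, wide_image))
```

If the noise matrix is singular or badly conditioned, neither approach will give back the original image reliably.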

Extract mask from 3D RGB image using a 1D Boolean array

I have a 3D image which is a numpy array of shape (1314, 489, 3) and looks as follows:
Now I want to calculate the mean RGB color value of the mask (the cob without the black background). Calculating the RGB value for the whole image is easy:
print(np.mean(colormaskcutted, axis=(0, 1)))
>>[186.18434633 88.89164511 46.32022921]
But now I want this mean RGB color value only for the cob. I have a 1D boolean mask array for the mask with this shape, where one value corresponds to all of the 3 color channel values: (1314, 489)
I tried slicing the image array for the mask, as follows:
print(np.mean(colormaskcutted[boolean[:,:,0]], axis=(0, 1)))
>>124.57794089613752
But this returned only one value instead of 3 values for the RGB color.
How can I filter the 3D numpy image for a 1D boolean mask so that the mean RGB color calculation can be performed?
If your question is limited to computing the mean, you don't necessarily need to subset the image. You can simply do, e.g.
np.sum(colormaskcutted*boolean[:,:,None], axis = (0,1))/np.sum(boolean)
P.S. I've played around with indexing, you can amend your original approach as follows:
np.mean(colormaskcutted[boolean,:], axis = 0)
P.P.S. Can't resist some benchmarking. So, the summation approach takes 15.9s (1000 iterations, dimensions like in the example, old computer); the advanced indexing approach is slightly longer, at 17.7s. However, the summation can be optimized further. Using count_nonzero as per Mad Physicist's suggestion marginally improves the time to 15.3s. We can also use tensordot to skip creating a temporary array:
np.tensordot(colormaskcutted, boolean, axes = [[0,1], [0,1]])/np.count_nonzero(boolean)
This cuts the time to 4.5s.
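To check that the three approaches agree, here is a small self-contained sketch. The image and mask are random stand-ins with hypothetical names, and the image is converted to float64 on the assumption that this avoids any integer overflow when multiplying and summing.

```python
import numpy as np

rng = np.random.default_rng(0)

# Small stand-ins for the real data: an (H, W, 3) image and an (H, W) mask.
image = rng.integers(0, 256, size=(6, 5, 3)).astype(np.float64)
mask = rng.random((6, 5)) > 0.5
mask[0, 0] = True  # make sure at least one pixel is selected

n_selected = np.count_nonzero(mask)

# 1. Boolean indexing: image[mask] has shape (n_selected, 3).
mean_indexing = image[mask].mean(axis=0)

# 2. Zero out the unselected pixels, sum, and divide by the pixel count.
mean_sum = (image * mask[:, :, None]).sum(axis=(0, 1)) / n_selected

# 3. Contract both spatial axes against the mask in a single step.
mean_tensordot = np.tensordot(image, mask, axes=([0, 1], [0, 1])) / n_selected

print(mean_indexing, mean_sum, mean_tensordot)
```

All three results have shape (3,), one mean per color channel.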

3D local averages and using 3D convolution

I'm new to Python and am far more familiar with Matlab. If my question is ill-suited for this forum, don't hesitate to point it out.
I'm trying to compute local averages very quickly. It's like reducing the number of pixels in an image by averaging multiple pixels for each new pixel, except I'm doing it in 3D.
Imagine a 1000x1000x6 array. I'm dividing this array into multiple tiny arrays of 10x10x3. I then want to calculate the mean of each of those tiny arrays and put the means back together to build my new array.
The way I did it in Matlab was with convn(array,seed,'valid'), which is a multi-dimensional convolution function.
What would be the easiest way to do it in python?
Thanks
RMT
I think the closest thing to convn that you can find is SciPy's convolve. Below is an example:
import numpy as np
from scipy.ndimage import convolve
M = np.random.random((1000, 1000, 6))
seed = np.ones((3, 3, 3)) / 27.  # uniform 3x3x3 kernel whose weights sum to 1, so each output is a local mean
N = convolve(M, seed, mode='constant', cval=0)
The mode='constant', cval=0 is just zero-padding.
Not sure if that's what you need, but that's a start
Doc: https://docs.scipy.org/doc/scipy-0.14.0/reference/generated/scipy.ndimage.filters.convolve.html
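If the goal is strictly the non-overlapping 10x10x3 blocks described in the question (rather than a sliding window), a reshape followed by a mean over the block axes may be enough. This is a sketch under that assumption; it uses a smaller random array to keep it quick, the names are made up, and every dimension must be an exact multiple of the block size.

```python
import numpy as np

rng = np.random.default_rng(0)
M = rng.random((100, 100, 6))  # smaller stand-in for the 1000x1000x6 array

bx, by, bz = 10, 10, 3  # block size along each axis
nx, ny, nz = M.shape[0] // bx, M.shape[1] // by, M.shape[2] // bz

# Split every axis into (number of blocks, block size), then average
# over the three block-size axes.
blocks = M.reshape(nx, bx, ny, by, nz, bz)
block_means = blocks.mean(axis=(1, 3, 5))

print(block_means.shape)  # (10, 10, 2)
```

For the full 1000x1000x6 array the same code would give a 100x100x2 result.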

Calculating 3D pixel variance from 4D array

Let there be some 4D array [x,y,z,k] comprised of k 3D images [x,y,z].
Is there any way to calculate the variance of each individual pixel in 3D from the 4D array?
E.g. I have a 10x10x10x5 array and would like to return a 10x10x10 variance array; the variance is calculated for each pixel (or voxel, really) along k
If this doesn't make sense, let me know and I'll try explaining better.
Currently, my code is:
tensors = []
while error > threshold:
    for _ in range(5): #arbitrary
        new_tensor = foo(bar) #always returns array of same size
        tensors.append(new_tensor)
tensors = np.stack(tensors, axis = 3)
#tensors.shape
And I would like to calculate a variance array for tensors
There is a simple way to do that if you're using numpy:
variance = tensors.var(axis=3)
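A quick way to convince yourself that this does the per-voxel calculation is to compare one entry against the variance of the k values at that position. The sketch below uses random arrays as stand-ins for the real images; the ddof=1 variant is shown only as an option, in case the sample variance is wanted instead of the population variance that var computes by default.

```python
import numpy as np

rng = np.random.default_rng(0)

# Five made-up 10x10x10 images stacked along a new last axis.
images = [rng.random((10, 10, 10)) for _ in range(5)]
tensors = np.stack(images, axis=3)  # shape (10, 10, 10, 5)

# Population variance of each voxel across the 5 images.
variance = tensors.var(axis=3)

# Sample variance (divides by k - 1 instead of k).
sample_variance = tensors.var(axis=3, ddof=1)

print(variance.shape)  # (10, 10, 10)
print(np.isclose(variance[0, 0, 0], np.var(tensors[0, 0, 0, :])))
```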

python hcluster, distance matrix and condensed distance matrix

I'm using the module hcluster to calculate a dendrogram from a distance matrix. My distance matrix is an array of arrays generated like this:
import hcluster
import numpy as np
mols = (..a list of molecules)
distMatrix = np.zeros((10, 10))
for i in range(0,10):
    for j in range(0,10):
        sim = OETanimoto(mols[i],mols[j]) # a function to calculate similarity between molecules
        distMatrix[i][j] = 1 - sim
I then use the command distVec = hcluster.squareform(distMatrix) to convert the matrix into a condensed vector and calculate the linkage matrix with vecLink = hcluster.linkage(distVec).
All this works fine but if I calculate the linkage matrix using the distance matrix and not the condensed vector matLink = hcluster.linkage(distMatrix) I get a different linkage matrix (the distances between the nodes are a lot larger and topology is slightly different)
Now I'm not sure whether this is because hcluster only works with condensed vectors or whether I'm making mistakes on the way there.
Thanks for your help!
I knocked up a quick random example similar to yours and experienced the same problem.
In the docstring it does say :
Performs hierarchical/agglomerative clustering on the
condensed distance matrix y. y must be a :math:`{n \choose 2}` sized
vector where n is the number of original observations paired
in the distance matrix.
However, having had a quick look at the code, it seems like the intent is for it to work with both vector-shaped and matrix-shaped input:
In hierarchy.py there is a switch based upon the shape of the matrix.
It seems however that the key bit of info is in the function linkage's docstring:
- Q : ndarray
A condensed or redundant distance matrix. A condensed
distance matrix is a flat array containing the upper
triangular of the distance matrix. This is the form that
``pdist`` returns. Alternatively, a collection of
:math:`m` observation vectors in n dimensions may be passed as
a :math:`m` by :math:`n` array.
So I think that the interface doesn't allow the passing of a distance matrix.
Instead it thinks you are passing it m observation vectors in n dimensions.
Hence the difference in result?
Does that seem reasonable?
Otherwise, just take a look at the code itself; I'm sure you'll be able to debug it and figure out why your examples are different.
Cheers
Matt
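The effect can be reproduced with a short sketch. It uses scipy.cluster.hierarchy rather than hcluster, on the assumption that the two behave the same way here, and the observations are made-up random points instead of molecules. Passing the square matrix makes linkage treat each row as an observation vector, so the merge distances come out different from those obtained with the condensed vector.

```python
import warnings

import numpy as np
from scipy.cluster.hierarchy import linkage
from scipy.spatial.distance import pdist, squareform

rng = np.random.default_rng(0)
points = rng.random((10, 3))             # made-up observations
dist_matrix = squareform(pdist(points))  # redundant 10x10 distance matrix

# Condensed form: the 10 * 9 / 2 = 45 upper-triangular entries.
dist_vec = squareform(dist_matrix)
link_from_vec = linkage(dist_vec)

# The square matrix is interpreted as 10 observation vectors in 10
# dimensions, so distances between its rows are clustered instead.
with warnings.catch_warnings():
    warnings.simplefilter("ignore")  # SciPy may warn about this input
    link_from_matrix = linkage(dist_matrix)

# Column 2 of a linkage matrix holds the merge distances.
print(np.allclose(link_from_vec[:, 2], link_from_matrix[:, 2]))
```

So converting with squareform before calling linkage, as in the question, looks like the right thing to do.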
