Using this function:
import numpy as np

def blockshaped(arr, nrows, ncols):
    '''
    Return an array of shape (n, nrows, ncols) where
    n * nrows * ncols = arr.size

    If arr is a 2D array, the returned array should look like n subblocks with
    each subblock preserving the "physical" layout of arr.
    '''
    h, w = arr.shape
    assert h % nrows == 0, "{} rows is not evenly divisible by {}".format(h, nrows)
    assert w % ncols == 0, "{} cols is not evenly divisible by {}".format(w, ncols)
    return (arr.reshape(h//nrows, nrows, -1, ncols)
               .swapaxes(1, 2)
               .reshape(-1, nrows, ncols))
I was able to divide my image into blocks of 16x16 pixels each.
What I want to do is calculate the density of black pixels in each block.
I know that pixel values range from 0 to 255.
I wanted to do black_density = number_of_zeros / 16, but I'm not sure that's right.
Well, if you want to know the amount of black in each block of your image, you can simply do:
np.sum(blockshaped(img, 16, 16).reshape(-1, 16*16) <= black_threshold, axis=1)
If you are uncertain about the threshold, you could try Otsu's method, either per block or on the whole image, depending on what you really want.
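If what you want is a density (a fraction between 0 and 1) rather than a raw count, divide by the block area; a minimal sketch, assuming a grayscale img and a black_threshold of your choosing:

import numpy as np

blocks = blockshaped(img, 16, 16).reshape(-1, 16 * 16)   # one row per 16x16 block
# Fraction of "black" pixels (values <= black_threshold) in each block.
black_density = np.sum(blocks <= black_threshold, axis=1) / (16 * 16)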
SQDIFF is defined as in the OpenCV documentation (I believe they omit the channels):

R(x, y) = sum over x', y' of (T(x', y') - I(x + x', y + y'))^2

which in plain NumPy should be:
import numpy as np
import cv2 as cv

A = np.arange(27, dtype=np.float32)
A = A.reshape(3, 3, 3)                    # The "image"
B = np.ones([2, 2, 3], dtype=np.float32)  # The window (template)
rw, rh = A.shape[0] - B.shape[0] + 1, A.shape[1] - B.shape[1] + 1  # End result size

result = np.zeros([rw, rh])
for i in range(rw):
    for j in range(rh):
        w = A[i:i + B.shape[0], j:j + B.shape[1]]
        res = B - w
        result[i, j] = np.sum(res ** 2)

cv_result = cv.matchTemplate(A, B, cv.TM_SQDIFF)  # this result is the same as the simple for loops
assert np.allclose(cv_result, result)
This is a comparatively slow solution. I have read about sliding_window_view but cannot get it right.
# This will fail with these large arrays but is ok for smaller ones
A = np.random.rand(1028, 1232, 3).astype(np.float32)
B = np.random.rand(248, 249, 3).astype(np.float32)
locations = np.lib.stride_tricks.sliding_window_view(A, B.shape)
sqdiff = np.sum((B - locations) ** 2, axis=(-1,-2, -3, -4)) # This will fail with normal sized images
The last line fails with a MemoryError even though the result easily fits in memory. How can I reproduce the cv2.matchTemplate result in this faster way?
As a last resort, you may perform the computation in tiles, instead of computing "all at once".
np.lib.stride_tricks.sliding_window_view returns a view of the data, so it doesn't consume a lot of RAM.
The expression B - locations can't be a view; it requires RAM for storing an array of shape (781, 984, 1, 248, 249, 3) of float32 elements.
The total RAM for storing B - locations is 781*984*1*248*249*3*4 = 569,479,908,096 bytes.
To avoid storing all of B - locations in RAM at once, we may compute sqdiff in tiles, where each "tile" computation requires much less RAM.
A simple tiling scheme uses every row as a tile: loop over the rows of sqdiff and compute the output row by row.
Example:
sqdiff = np.zeros((locations.shape[0], locations.shape[1]), np.float32)  # Allocate an array for storing the result.

# Compute sqdiff row by row instead of computing it all at once.
for i in range(sqdiff.shape[0]):
    sqdiff[i, :] = np.sum((B - locations[i, :, :, :, :, :]) ** 2, axis=(-1, -2, -3, -4))
Executable code sample:
import numpy as np
import cv2

A = np.random.rand(1028, 1232, 3).astype(np.float32)
B = np.random.rand(248, 249, 3).astype(np.float32)

locations = np.lib.stride_tricks.sliding_window_view(A, B.shape)

cv_result = cv2.matchTemplate(A, B, cv2.TM_SQDIFF)  # this result is the same as the simple for loops

#sqdiff = np.sum((B - locations) ** 2, axis=(-1, -2, -3, -4))  # This will fail with normal sized images

sqdiff = np.zeros((locations.shape[0], locations.shape[1]), np.float32)  # Allocate an array for storing the result.

# Compute sqdiff row by row instead of computing it all at once.
for i in range(sqdiff.shape[0]):
    sqdiff[i, :] = np.sum((B - locations[i, :, :, :, :, :]) ** 2, axis=(-1, -2, -3, -4))

assert np.allclose(cv_result, sqdiff)
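If the row-by-row loop turns out to be slow, a possible variation (not part of the answer above) is to process a small chunk of rows per iteration; RAM use grows roughly linearly with the chunk size (about 0.7 GB per output row for these shapes), so keep it small:

chunk = 2  # tuning knob: output rows computed per iteration
sqdiff = np.zeros((locations.shape[0], locations.shape[1]), np.float32)
for i in range(0, sqdiff.shape[0], chunk):
    sqdiff[i:i + chunk, :] = np.sum((B - locations[i:i + chunk]) ** 2, axis=(-1, -2, -3, -4))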
I know the solution is a bit disappointing... But it is the only generic solution I could find.
The SQDIFF

R[k, l] = sum over m, n of (T[m, n] - I[k + m, l + n])^2

is equivalent to

R = 1_[k, l] ⋆ T^2 - 2 (I ⋆ T) + I^2 ⋆ 1_[m, n]

where the 'star' operation is a cross-correlation, 1_[m, n] is an all-ones window the size of the template, and 1_[k, l] is an all-ones window the size of the image.
You can compute the cross-correlation terms using 'scipy.signal.correlate' and find the matches by looking for local minima in the square difference map.
You might want to do some non-minimum suppression too.
This solution will require orders of magnitude less memory to store.
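A rough sketch of that decomposition, assuming the A (image) and B (template) arrays from the question and using scipy.signal.fftconvolve for the correlation terms (flipping the kernel turns convolution into cross-correlation); expect small floating-point differences from the direct computation:

import numpy as np
from scipy.signal import fftconvolve

def sqdiff_fft(A, B):
    # (T - I)^2 expands to T^2 - 2*T*I + I^2; each term becomes a correlation.
    sum_T2 = np.sum(B ** 2)                                      # constant term
    cross = fftconvolve(A, B[::-1, ::-1, ::-1], mode='valid')    # I correlated with T, channels summed
    sum_I2 = fftconvolve(A ** 2, np.ones_like(B), mode='valid')  # windowed sum of I^2
    return (sum_T2 - 2.0 * cross + sum_I2)[..., 0]

# sqdiff_fft(A, B) should be close to cv2.matchTemplate(A, B, cv2.TM_SQDIFF),
# up to FFT round-off.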
For more help, please post a reproducible example with an image and template that are valid for the algorithm. Using noise will result in meaningless outputs.
I need to divide a 2D matrix into a set of 2D patches with a certain stride, then multiply every patch by its center element and sum the elements of each patch.
It feels not unlike a convolution where a separate kernel is used for every element of the matrix.
The elements of the result matrix are calculated like this: each patch is summed and multiplied by its center element, e.g. result[0][0] = (1 + 2 + 5 + 6) * 1 = 14.
The result should look like this:
[[ 14  36  66  48]
 [150 204 266 160]
 [414 500 594 336]
 [351 406 465 256]]
Here's a solution I came up with:
import numpy as np

window_shape = (2, 2)
stride = 1
# Matrix
m = np.arange(1, 17).reshape((4, 4))
# Pad it once per axis to make sure the number of views
# equals the number of elements
m_padded = np.pad(m, (0, 1))
# This function divides the array into `windows`, from:
# https://stackoverflow.com/questions/45960192/using-numpy-as-strided-function-to-create-patches-tiles-rolling-or-sliding-w#45960193
w = window_nd(m_padded, window_shape, stride)
ww, wh, *_ = w.shape
w = w.reshape((ww * wh, 4))  # the first two dimensions multiplied give the number of rows
# Tile each center element for element-wise multiplication
m_tiled = np.tile(m.ravel(), (4, 1)).transpose()
result = (w * m_tiled).sum(axis = 1).reshape(m.shape)
In my view it's not very efficient, as a few arrays are allocated in the intermediate steps.
What is a better or more efficient way to accomplish this?
Try scipy.signal.convolve
import numpy as np
from scipy.signal import convolve

window_shape = (2, 2)
stride = 1

# Matrix
m = np.arange(1, 17).reshape((4, 4))

# Pad it once per axis to make sure the number of views
# equals the number of elements
m_padded = np.pad(m, (0, 1))

output = convolve(m_padded, np.ones(window_shape), 'valid') * m
print(output)
Output:
array([[ 14.,  36.,  66.,  48.],
       [150., 204., 266., 160.],
       [414., 500., 594., 336.],
       [351., 406., 465., 256.]])
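If you want to stay within NumPy, here is a sketch of the same computation with np.lib.stride_tricks.sliding_window_view (NumPy 1.20+), which avoids the external window_nd helper and the reshaping steps:

import numpy as np

m = np.arange(1, 17).reshape((4, 4))
m_padded = np.pad(m, (0, 1))

# Sum each (2, 2) window of the padded matrix, then multiply by the
# corresponding element of the original matrix.
windows = np.lib.stride_tricks.sliding_window_view(m_padded, (2, 2))
output = windows.sum(axis=(-1, -2)) * m
print(output)  # same values as the convolve-based result above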
I need to insert 3-dimensional arrays into a new variable.
I'm trying to do that by creating a 4-dimensional array and storing each 3-dimensional array along the fourth dimension.
Sample code:
from python_speech_features import mfcc
import soundfile as sf
import numpy as np

X = np.zeros((0, 0, 0, 0), float)  # 4-dimensional - (0, 0, 0, 0)
for ii in range(1000):
    data, fs = sf.read(curfile[ii])
    sig = mfcc(data, fs, winstep=winstep, winlen=winlen, nfft=1024)  # shape (49, 13)
    sig = sig[:, :, np.newaxis]  # add a third dimension - (49, 13, 1)
    X[:, :, :, ii] = sig
Error:
IndexError: index 0 is out of bounds for axis 3 with size 0
Can someone help me with this problem?
You are not creating the array the right way. You cannot insert values along an axis that has zero length; at least specify some length for each axis:

X = np.zeros((10, 10, 10, 1000), float)
print(X.shape)
# (10, 10, 10, 1000)

Now you can set values at whatever index you want, simply by:

X[:, :, :, 2] = 1
# this sets every value at index 2 along the last axis to 1
Either use np.stack (I think it is the best way of doing it) or create the initial array with its final size:
np.zeros((49, 13, 1, 1000), float)
In your case, the np.stack route would look like the sketch below.
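A minimal sketch of the np.stack approach, reusing curfile, winstep and winlen from the question (treat those names as assumptions):

import numpy as np
import soundfile as sf
from python_speech_features import mfcc

feats = []
for ii in range(1000):
    data, fs = sf.read(curfile[ii])
    sig = mfcc(data, fs, winstep=winstep, winlen=winlen, nfft=1024)  # (49, 13)
    feats.append(sig[:, :, np.newaxis])                             # (49, 13, 1)

X = np.stack(feats, axis=3)  # (49, 13, 1, 1000)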
I want to create an image date_set which includes 176 small images (128*128*3) from one big image (1408, 2048, 3).
I do the following:
Step 1: load the big image and convert it to a numpy array: a (1408, 2048, 3) 3D array.
Step 2: cut it into 176 pieces: a (176, 128, 128, 3) 4D array.
Step 3: I don't know how to save the 176 images from the 4D array in this step. Could anyone help me solve this problem?
Thanks very much!
from astropy.io import fits
from astropy.utils.data import download_file
from matplotlib.pyplot import imread  # any imread that returns a numpy array works
import numpy as np

image_file = download_file('https://data.sdss.org/sas/dr12/boss/photoObj/frames/301/1035/3/frame-irg-001035-3-0011.jpg', cache=True)
image = imread(image_file)

def blockshaped(arr, nrows, ncols, c):
    """
    Return an array of shape (n, nrows, ncols, c) where
    n * nrows * ncols * c = arr.size

    If arr is a 2D array, the returned array should look like n subblocks with
    each subblock preserving the "physical" layout of arr.
    """
    h, w = arr.shape[:2]
    return (arr.reshape(h//nrows, nrows, -1, ncols, c)
               .swapaxes(1, 2)
               .reshape(-1, nrows, ncols, c))

a = image[:1408, :]
b = blockshaped(a, 128, 128, 3)
b.shape  # (176, 128, 128, 3)
Here's a possible way to do it.

import numpy as np
import scipy.misc

images = np.zeros((176, 128, 128, 3))  # replace this with your (176, 128, 128, 3) array of blocks
for i in range(len(images)):
    scipy.misc.imsave('date_set_' + str(i) + '.jpg', images[i, :, :, :])
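Note that scipy.misc.imsave has been removed from recent SciPy releases; here is a sketch with imageio instead, assuming b holds the (176, 128, 128, 3) blocks from the question:

import imageio
import numpy as np

for i in range(b.shape[0]):
    # imwrite expects integer pixel data for JPEG, so cast if b is float.
    imageio.imwrite('date_set_' + str(i) + '.jpg', b[i].astype(np.uint8))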
How can I downscale raster data of size 4*6 to size 2*3 using the 'mode', i.e. the most common value within each 2*2 block of pixels?
import numpy as np
data=np.array([
[0,0,1,1,1,1],
[1,0,0,1,1,1],
[1,0,1,1,0,1],
[1,1,0,1,0,0]])
The result should be:
result = np.array([
[0,1,1],
[1,1,0]])
Please refer to this thread for a full explanation. The following code will calculate your desired result.
import numpy as np
from sklearn.feature_extraction.image import extract_patches

data = np.array([
    [0, 0, 1, 1, 1, 1],
    [1, 0, 0, 1, 1, 1],
    [1, 0, 1, 1, 0, 1],
    [1, 1, 0, 1, 0, 0]])

patches = extract_patches(data, patch_shape=(2, 2), extraction_step=(2, 2))
most_frequent_number = ((patches > 0).sum(axis=-1).sum(axis=-1) > 2).astype(int)
print(most_frequent_number)
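Note that extract_patches was removed in newer scikit-learn versions; here is a pure-NumPy sketch of the same majority trick with sliding_window_view (NumPy 1.20+):

import numpy as np

data = np.array([
    [0, 0, 1, 1, 1, 1],
    [1, 0, 0, 1, 1, 1],
    [1, 0, 1, 1, 0, 1],
    [1, 1, 0, 1, 0, 0]])

# Non-overlapping 2x2 patches: take every second window along each axis.
patches = np.lib.stride_tricks.sliding_window_view(data, (2, 2))[::2, ::2]
most_frequent_number = ((patches > 0).sum(axis=(-1, -2)) > 2).astype(int)
print(most_frequent_number)
# [[0 1 1]
#  [1 1 0]]

Like the original, the > 2 comparison only works for binary data; for general values use a proper mode, e.g. scipy.stats.mode.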
Here's one way to go,
from itertools import product
from numpy import empty, argmax, bincount

res = empty((data.shape[0] // 2, data.shape[1] // 2))
for j, k in product(range(res.shape[0]), range(res.shape[1])):
    subvec = data[2*j:2*j+2, 2*k:2*k+2].flatten()
    res[j, k] = argmax(bincount(subvec))
This works as long as the input data contains an integer number of 2x2 blocks.
Notice that a block like [[0, 0], [1, 1]] will give 0 as the result, because argmax returns the index of the first occurrence only. Use res[j, k] = subvec.max() - argmax(bincount(subvec)[::-1]) if you want such 2x2 blocks to count as 1.
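A quick check of that tie behaviour:

from numpy import array, argmax, bincount

subvec = array([0, 0, 1, 1])                          # a tied block like [[0, 0], [1, 1]]
print(argmax(bincount(subvec)))                       # 0 -- first occurrence wins
print(subvec.max() - argmax(bincount(subvec)[::-1]))  # 1 -- the reversed variant counts it as 1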
There appears to be more than one statistic you wish to collect about each block. Using toblocks (below) you can apply various computations to the last axis of blocks to obtain the desired statistics:
import numpy as np
import scipy.stats as stats

def toblocks(arr, nrows, ncols):
    h, w = arr.shape
    blocks = (arr.reshape(h // nrows, nrows, -1, ncols)
                 .swapaxes(1, 2)
                 .reshape(h // nrows, w // ncols, nrows * ncols))
    return blocks
data=np.array([
[0,0,1,1,1,1],
[1,0,0,1,1,1],
[1,0,1,1,0,1],
[1,1,0,1,0,0]])
blocks = toblocks(data, 2, 2)
vals, counts = stats.mode(blocks, axis=-1)
vals = vals.squeeze()
print(vals)
# [[ 0. 1. 1.]
# [ 1. 1. 0.]]