I'm trying to de-noise an image which looks like this:
How can the white noisy pixels be removed? Initially I thought a median filter should suffice, but it doesn't look like it. Moreover, the image is RGB. Any thoughts?
I use Python.
(img: https://www.dropbox.com/s/aiyxkswtiyji7cw/exampledenoise.png?dl=0)
Median blur, thresholding, comparisons, boolean expressions.
Unless you know why this picture was damaged and how, this is just guesswork.
In the PNG the defects all have an exact value of 255. Defects happen in dark areas of the picture. I'm guessing that something caused an underflow in those pixels, wrapping them from 0 or -1 to 255.
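To see why an underflow would land exactly on 255, here's a quick check of uint8 wraparound in NumPy (a side demonstration, not part of the fix):

import numpy as np
a = np.array([0], dtype=np.uint8)
print(a - 1)  # [255] -- values below zero wrap around

With that hypothesis, the repair: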
import cv2 as cv

# using my own "imshow" shim for jupyter notebook, which handles boolean arrays
im = cv.imread("exampledenoise.png")
# imshow((im == 255).any(axis=2))
# estimating local brightness... so we DON'T "fix" bright spots
ksize = 13 # least value that appears to suppress all defects
med = cv.medianBlur(im, ksize=ksize)
# imshow(med)
# imshow(med >= 64) # candidate areas/channels.
# true = values likely explained by overexposure (don't fix)
# false = values likely explained by defect
fixup = (med < 64) & (im == 255)
assert len(fixup.shape) == 3 # fix individual channels
# imshow(fixup.any(axis=2)) # gonna fix values in these pixels
output = im.copy()
output[fixup] = 0 # assuming the damage was from some underflow, so 0 is appropriate
# imshow(output)
I have the following problem. In the image below, a so-called teardrop-like DNA is shown.
In red, a trace yielded by skimage.morphology.skeletonize is shown.
The bright yellow region is a molecular complex called "Streptavidin", which I need to identify.
In essence, my problem revolves around two steps:
I need to identify the brightest spot in the image. For that, I apply a Gaussian filter to the image and subsequently try a thresholding algorithm (for example, skimage.filters.threshold_yen).
Then, I use skimage.measure.regionprops.centroid to identify the Streptavidin.
However, this approach is prone to errors: since I need to handle not only this single image but other images as well, I cannot use a global threshold. For some images Yen thresholding works fine; for others it doesn't yield a result at all. So my question here would be: is there an alternative way to identify the centroid of the Streptavidin complex, given that the various images display different intensity values, i.e. a global approach is difficult to implement?
I then want to delete all trace points (in red) that lie within the Streptavidin complex. Here, I wanted to use a boolean mask representing the Streptavidin complex and then tell the program to delete all points that lie within the complex.
But again, this is very dependent on the actual boolean mask I get from thresholding, so I would like to think of another way to do it.
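For concreteness, here is a minimal sketch of one alternative I'm considering: anchoring the threshold to each image's own peak intensity, assuming the Streptavidin blob is always the brightest structure (the function name and the 0.5 fraction are placeholders, not tested values):

import numpy as np
from skimage import filters, measure

def strep_centroid(mol_filtered, rel_thresh=0.5):
    # smooth, then threshold relative to this image's own peak
    img = filters.gaussian(mol_filtered, sigma=1.)
    labels = measure.label(img >= rel_thresh * img.max())
    # keep the connected region that contains the global maximum
    peak_label = labels[np.unravel_index(img.argmax(), img.shape)]
    region = next(r for r in measure.regionprops(labels) if r.label == peak_label)
    return region.centroid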
Overall, here is my code for reference:
# excerpt from a method; `mol_filtered`, `self.anal_pars` and `mol_pars`
# come from the surrounding class
import copy
from skimage import filters, measure, morphology

img_gaussian = filters.gaussian(mol_filtered, sigma=1.)
# Get threshold (use Yen threshold)
if self.anal_pars['strep_thresh'] is None:
    thresh = filters.threshold_yen(img_gaussian)
    # If Yen thresholding fails:
    if thresh < 0.8:
        thresh = filters.threshold_minimum(img_gaussian)
else:
    thresh = self.anal_pars['strep_thresh']

# Split the image according to the selected threshold
img_bw = copy.deepcopy(img_gaussian)
img_bw[img_bw < thresh] = 0
img_bw[img_bw != 0] = 1

# Label the regions that lie above the threshold
img_labelled = morphology.label(img_bw)

# Check if the thresholding yielded more than one area:
if img_labelled.max() != 1:
    # Remove possible artefacts (happens for the Yen filter)
    img_bw = morphology.remove_small_objects(img_bw.astype(bool))
    img_labelled = morphology.label(img_bw)  # relabel image

for region in measure.regionprops(img_labelled):
    strep_index = region.centroid
    mol_pars['strep_index'] = strep_index

# Add the streptavidin area to the mol_pars dictionary
img_strep = copy.deepcopy(img_bw)
mol_pars['img_strep'] = img_strep
So this seems quite messy and inefficient; surely there is a faster way to do it?
Thanks for your help
I'm trying to detect the pips on some dice for a hobby project.
The dice are nicely cropped out, and there are more or less good photos of them.
The photos are then grayscaled, denoised with a median filter, and converted to np.array so that cv2 accepts them.
Now I try to count the pips and... well... SimpleBlobDetector finds blobs everywhere depending on the exact parameter settings, or no blobs at all, and never the most obvious pips on top. Especially if I set "circularity" above 0.7 or activate "inertia", it finds nothing at all.
What's wrong here? Do I have to preprocess my picture further, or is it in the parameters below?
I tried exchanging the filter with a Gaussian blur, and I also tried posterizing the image to get fairly large areas with exactly the same grayscale value. Nothing helps.
I have tried many variants by now; the current one produces the picture below.
import cv2
import numpy as np
from PIL import Image

blob_params = cv2.SimpleBlobDetector_Params()
blob_params.minDistBetweenBlobs = 1
blob_params.filterByCircularity = True
blob_params.minCircularity = 0.5
blob_params.minThreshold = 1
blob_params.thresholdStep = 1
blob_params.maxThreshold = 255
#blob_params.filterByInertia = False
#blob_params.minInertiaRatio = 0.2
blob_params.minArea = 6
blob_params.maxArea = 10000

detector = cv2.SimpleBlobDetector_create(blob_params)
keypoints = detector.detect(die_np)  # detect blobs... except it doesn't work

image = cv2.drawKeypoints(die_np,
                          keypoints,
                          np.array([]),
                          (255, 255, 0),
                          cv2.DRAW_MATCHES_FLAGS_DRAW_RICH_KEYPOINTS)
im = Image.fromarray(image)
im
Okay, and the solution is:
Some parameters are already set before you even start: the defaults are optimized for finding dark, round blobs on a light background. That is not the case in my image! This took me by surprise. After printing out every single parameter and correcting everything that did not fit my use case, I was finally able to use this function. Now it easily finds the dice pips on several different kinds of dice that I've tried.
This would have been easier if the OpenCV parameter list were a little more accessible, for example as a dict, or better commented.
I'll leave my code here. It contains a comprehensive list of the parameters this function uses, remarks in the comments on how to use some of them, and a wrapper function to make the detector usable. I'll also leave a link to the SimpleBlobDetector documentation, as it is somewhat difficult to find on the net.
Coming from a C/C++ background, this function is totally not pythonic. It does not accept every image format, as would be normal in a Python context; instead it wants a very specific data type. In Python, a function would at least tell you what it wants, but this one keeps quiet so as not to spoil the riddle, then goes on to throw an exception without explanation.
So I wrote a wrapper that at least accepts NumPy arrays in color and black/white formats, as well as PIL.Image. That opens up the usability of this function a little.
OpenCV, if you're reading this: in Python, readable documentation is normally provided for each function in '''triple quotes'''. Emphasis on "readable".
import cv2
import numpy as np
from PIL import Image

def blobize(im, blob_params):
    '''
    Takes an image and tries to find blobs on this image.

    Parameters
    ----------
    im : np.ndarray, single channel. If several color channels are present,
        the first channel is taken. Alternatively you can provide a
        PIL.Image, which is then converted to "L" and used directly.
    blob_params : a cv2.SimpleBlobDetector_Params() parameter list

    Returns
    -------
    blobbed_im : a greyscale image with the found blobs circled in red
    keypoints : an OpenCV keypoint list
    '''
    if Image.isImageType(im):
        im = np.array(im.convert("L"))
    if isinstance(im, np.ndarray):
        if len(im.shape) >= 3 and im.shape[2] > 1:
            im = im[:, :, 0]
    detector = cv2.SimpleBlobDetector_create(blob_params)
    try:
        keypoints = detector.detect(im)
    except cv2.error:
        keypoints = None
    if keypoints:
        blobbed_im = cv2.drawKeypoints(im, keypoints, np.array([]), (255, 0, 0),
                                       cv2.DRAW_MATCHES_FLAGS_DRAW_RICH_KEYPOINTS)
    else:
        blobbed_im = im
    return blobbed_im, keypoints
blob_params = cv2.SimpleBlobDetector_Params()

# Images are converted to many binary b/w layers. blobColor = 0 then
# searches for dark blobs, blobColor = 255 for bright blobs. Or set the
# filter to False; then it finds bright and dark blobs, both.
blob_params.filterByColor = False
blob_params.blobColor = 0

# Extracted blobs have an area between minArea (inclusive) and maxArea (exclusive).
blob_params.filterByArea = True
blob_params.minArea = 3.    # highly dependent on image resolution and dice size
blob_params.maxArea = 400.  # float! Highly dependent on image resolution.

# 0.7 could be rectangular, too; 1 is round. minCircularity is left at 0
# because the dots are not always round, e.g. when they are damaged.
blob_params.filterByCircularity = True
blob_params.minCircularity = 0.
blob_params.maxCircularity = 3.4028234663852886e+38  # "infinity" (max float32)

blob_params.filterByConvexity = False
blob_params.minConvexity = 0.
blob_params.maxConvexity = 3.4028234663852886e+38

blob_params.filterByInertia = True  # a second way to find round blobs
blob_params.minInertiaRatio = 0.55  # 1 is round, 0 is anything
blob_params.maxInertiaRatio = 3.4028234663852886e+38  # infinity again

blob_params.minThreshold = 0      # threshold value at which to start filtering
blob_params.maxThreshold = 255.0  # threshold value at which to stop
blob_params.thresholdStep = 5     # step size between those thresholds

# Avoid overlapping blobs; must be bigger than 0. Highly dependent on image resolution!
blob_params.minDistBetweenBlobs = 3.0

# If the same blob center is found at different threshold values (within
# minDistBetweenBlobs), a counter for that blob is (basically) increased.
# If the counter is >= minRepeatability, the blob is considered stable and
# produces a KeyPoint; otherwise it is discarded.
blob_params.minRepeatability = 2
res, keypoints = blobize(im2, blob_params)
I am really new to OpenCV and a beginner at Python.
I have this image:
I want to somehow apply proper thresholding to keep nothing but the 6 digits.
The bigger picture is that I intend to try to perform manual OCR on the image for each digit separately, using the k-nearest neighbours algorithm on a per-digit level (kNearest.findNearest).
The problem is that I cannot clean up the digits sufficiently, especially the '7' digit which has this blue-ish watermark passing through it.
The steps I have tried so far are the following:
I am reading the image from disk
# IMREAD_UNCHANGED is -1
image = cv2.imread(sys.argv[1], cv2.IMREAD_UNCHANGED)
Then I keep only the blue channel, effectively converting the image to a single channel; in the blue channel the blue-ish watermark around digit '7' becomes nearly invisible against the background:

image = image[:,:,0]
# opened with -1, which means "as is",
# so the blue channel is the first in BGR
Then I'm multiplying it a bit to increase contrast between the digits and the background:
image = cv2.multiply(image, 1.5)
Finally I perform Binary+Otsu thresholding:
_,thressed1 = cv2.threshold(image,0,255,cv2.THRESH_BINARY+cv2.THRESH_OTSU)
As you can see, the end result is pretty good, except for the digit '7', which has kept a lot of noise.
How can I improve the end result? Please include an example result image where possible; it is easier to understand than code snippets alone.
You can try median-blurring the gray image with two different kernel sizes (such as 3 and 51), dividing the two blurred results, and thresholding that. Something like this:
#!/usr/bin/python3
# 2018/09/23 17:29 (CST)
# Happy Mid-Autumn Festival
import cv2
import numpy as np
fname = "color.png"
bgray = cv2.imread(fname)[...,0]
blured1 = cv2.medianBlur(bgray,3)
blured2 = cv2.medianBlur(bgray,51)
divided = np.ma.divide(blured1, blured2).data
normed = np.uint8(255*divided/divided.max())
th, threshed = cv2.threshold(normed, 100, 255, cv2.THRESH_OTSU)
dst = np.vstack((bgray, blured1, blured2, normed, threshed))
cv2.imwrite("dst.png", dst)
The result:
Why not just keep values in the image that are above a certain threshold?
Like this:
import cv2
import numpy as np
img = cv2.imread("./a.png")[:,:,0] # the last readable image
# everything below the threshold becomes black, everything else white
new_img = np.where(img < 100, 0, 255).astype(np.uint8)
cv2.imwrite("./b.png", new_img)
Looks great:
You could probably play with the threshold even more and get better results.
It doesn't seem easy to completely remove the annoying stamp.
What you can do is flatten the background intensity by
computing a lowpass image (Gaussian filter, morphological closing); the filter size should be a little larger than the character size;
dividing the original image by the lowpass image.
Then you can use Otsu.
As you see, the result isn't perfect.
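For reference, a minimal sketch of this flatten-then-Otsu pipeline with OpenCV (the filename and the 51x51 kernel are placeholders; the kernel just has to be a little larger than the characters):

import cv2

img = cv2.imread("digits.png")[:,:,0]            # blue channel, as in the question
lowpass = cv2.GaussianBlur(img, (51, 51), 0)     # background estimate
flattened = cv2.divide(img, lowpass, scale=255)  # divide original by lowpass
_, binarized = cv2.threshold(flattened, 0, 255,
                             cv2.THRESH_BINARY + cv2.THRESH_OTSU)
cv2.imwrite("flattened_otsu.png", binarized)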
I tried a slightly different approach than Yves', on the blue channel:
Apply median filter (r=2):
Use Edge detection (e.g. Sobel operator):
Automatic thresholding (Otsu)
Closing of the image
This approach seems to make the output a little less noisy. However, one has to address the holes in the numbers. This can be done by detecting black contours which are completely surrounded by white pixels and simply filling them with white.
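A rough sketch of that pipeline (kernel sizes and the filename are guesses, not the exact values used for the pictures above):

import cv2

img = cv2.imread("digits.png")[:,:,0]   # blue channel
img = cv2.medianBlur(img, 5)            # median filter (r=2 -> 5x5 kernel)
# Sobel gradient magnitude as the edge image
gx = cv2.Sobel(img, cv2.CV_32F, 1, 0)
gy = cv2.Sobel(img, cv2.CV_32F, 0, 1)
edges = cv2.convertScaleAbs(cv2.magnitude(gx, gy))
# automatic thresholding (Otsu), then closing of the image
_, bw = cv2.threshold(edges, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)
kernel = cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (3, 3))
closed = cv2.morphologyEx(bw, cv2.MORPH_CLOSE, kernel)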
I use OpenCV and Python and I want to remove the small connected object from my image.
I have the following binary image as input:
The image is the result of this code:
dilation = cv2.dilate(dst,kernel,iterations = 2)
erosion = cv2.erode(dilation,kernel,iterations = 3)
I want to remove the objects highlighted in red:
How can I achieve this using OpenCV?
How about with connectedComponentsWithStats (doc):
import cv2
import numpy as np

# Find all of the connected components (white blobs in your image).
# im_with_separated_blobs is an image where each detected blob has a
# different pixel value, ranging from 1 to nb_blobs - 1.
nb_blobs, im_with_separated_blobs, stats, _ = cv2.connectedComponentsWithStats(im)

# stats (and the silenced output centroids) gives some information about
# the blobs; see the docs for more information. Here, we're interested
# only in the size of the blobs, contained in the last column of stats.
sizes = stats[:, -1]

# The following lines take out the background, which is also considered
# a component; for most applications this is not the expected output.
# You may also keep the results as they are by commenting out these two
# lines (you'll then have to update the range in the for loop below).
sizes = sizes[1:]
nb_blobs -= 1

# Minimum size of particles we want to keep (number of pixels).
# Here it's a fixed value, but you can set it however you want,
# e.g. the mean of the sizes, or a nominal value.
min_size = 150

# output image with only the kept components
im_result = np.zeros_like(im_with_separated_blobs)

# for every component in the image, keep it only if it's above min_size
for blob in range(nb_blobs):
    if sizes[blob] >= min_size:
        # see description of im_with_separated_blobs above
        im_result[im_with_separated_blobs == blob + 1] = 255
Output:
In order to remove objects automatically you need to locate them in the image.
From the image you provided I see nothing that distinguishes the 7 highlighted items from others.
You have to tell your computer how to recognize objects you don't want. If they look the same, this is not possible.
If you have multiple images where the objects always look like that you could use template matching techniques.
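If you go that route, a minimal cv2.matchTemplate sketch could look like this (both filenames and the 0.8 similarity cutoff are placeholders; the template would be a crop of one unwanted object):

import cv2
import numpy as np

img = cv2.imread("binary_input.png", cv2.IMREAD_GRAYSCALE)
template = cv2.imread("unwanted_object.png", cv2.IMREAD_GRAYSCALE)
res = cv2.matchTemplate(img, template, cv2.TM_CCOEFF_NORMED)
h, w = template.shape
# erase every location that matches the template well enough
for y, x in zip(*np.where(res >= 0.8)):
    img[y:y + h, x:x + w] = 0
cv2.imwrite("cleaned.png", img)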
Also the closing operation doesn't make much sense to me.
For isolated or unconnected blobs: try this (you can set noise_removal_threshold to whatever you like, e.g. make it relative to the largest contour, or use a nominal value like 100 or 25).

import cv2
import numpy as np

# find the contours of all blobs in the binary image first (OpenCV 4 signature)
contours, _ = cv2.findContours(img, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
noise_removal_threshold = 25  # minimum area to keep
mask = np.zeros_like(img)
for contour in contours:
    area = cv2.contourArea(contour)
    if area > noise_removal_threshold:
        cv2.fillPoly(mask, [contour], 255)
Removing small connected components by area is called area opening. OpenCV does not have this as a function, it can be implemented as shown in other answers. But most other image processing packages will have an area opening function.
For example using scikit-image:
import skimage
import imageio.v3 as iio
img = iio.imread('cQMZm.png')[:,:,0]
out = skimage.morphology.area_opening(img, area_threshold=150, connectivity=2)
For example using DIPlib:
import diplib as dip
out = dip.AreaOpening(img, filterSize=150, connectivity=2)
PS: The DIPlib implementation is noticeably faster. Disclaimer: I'm an author of DIPlib.
I am downloading satellite pictures like this
(source: u0553130 at home.chpc.utah.edu)
Some images, like this one, are mostly black, and I don't want to save those.
How can I use python to check if the image is more than 50% black?
You're dealing with gifs which are mostly grayscale by the look of your example image, so you might expect most of the RGB components to be equal.
Using PIL:
from PIL import Image

im = Image.open('im.gif').convert('L')  # greyscale, so each pixel is a single value
pixels = im.getdata()  # get the pixels as a flattened sequence
black_thresh = 50
nblack = 0
for pixel in pixels:
    if pixel < black_thresh:
        nblack += 1
n = len(pixels)
if (nblack / float(n)) > 0.5:
    print("mostly black")
Adjust your threshold for "black" between 0 (pitch black) and 255 (bright white) as appropriate.
The thorough way is to count the pixels using something like PIL, as given in the other answers.
However, if they're all compressed images, you may be able to check the file size, as images with lots of plain-colour areas should compress a lot more than ones with variation like the cloud cover.
With some tests, you could at least find a heuristic of which images with lots of cloud you know you can instantly discard without expensive looping over their pixels. Others closer to 50% can be checked pixel by pixel.
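As a sketch of that pre-filter (the byte cutoff is something you'd have to calibrate against images you've already classified):

import os

SIZE_CUTOFF = 20000  # bytes; calibrate on known mostly-black vs. cloudy images

def probably_black(path):
    # cheap pre-check: heavily compressed (small) files are likely mostly black
    return os.path.getsize(path) < SIZE_CUTOFF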
Additionally, when iterating over the pixels, you don't need to count all the black pixels, and then check if at least 50% are black. Instead, stop counting and discard as soon as you know at least 50% are black.
A second optimisation: if you know the images are generally mostly cloudy rather than mostly black, go the other way. Count the number of non-black pixels, and stop and keep the images as soon as that crosses 50%.
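An early-exit version of the pixel loop, building on the PIL answer above:

from PIL import Image

im = Image.open('im.gif').convert('L')
pixels = im.getdata()
black_thresh = 50
limit = len(pixels) // 2  # more than this many black pixels -> discard
nblack = 0
mostly_black = False
for pixel in pixels:
    if pixel < black_thresh:
        nblack += 1
        if nblack > limit:  # stop as soon as the verdict is certain
            mostly_black = True
            break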
Load the image.
Read each pixel and increment a counter if the pixel equals (0, 0, 0).
If the counter <= (image.width * image.height) / 2, save the image.
Or check whether a pixel is "almost black" instead, by treating it as black if its R (or G or B) component is less than, say, 15.
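A direct translation of that pseudocode with PIL ('im.gif' and 'kept.gif' are placeholder names):

from PIL import Image

im = Image.open('im.gif').convert('RGB')
ALMOST_BLACK = 15  # per-channel cutoff for "almost black"
nblack = sum(1 for r, g, b in im.getdata()
             if r < ALMOST_BLACK and g < ALMOST_BLACK and b < ALMOST_BLACK)
if nblack <= (im.width * im.height) / 2:
    im.save('kept.gif')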
Utilizing your test image, the most common color has an RGB value of (1, 1, 1). This is very black, but not exactly black. My answer utilizes the PIL library, webcolors and a generous helping of code from this answer.
from PIL import Image
import webcolors

def closest_color(requested_color):
    min_colors = {}
    for key, name in webcolors.css3_hex_to_names.items():
        r_c, g_c, b_c = webcolors.hex_to_rgb(key)
        rd = (r_c - requested_color[0]) ** 2
        gd = (g_c - requested_color[1]) ** 2
        bd = (b_c - requested_color[2]) ** 2
        min_colors[(rd + gd + bd)] = name
    return min_colors[min(min_colors.keys())]

def get_color_name(requested_color):
    try:
        closest_name = actual_name = webcolors.rgb_to_name(requested_color)
    except ValueError:
        closest_name = closest_color(requested_color)
        actual_name = None
    return actual_name, closest_name

if __name__ == '__main__':
    lt = Image.open('test.gif').convert('RGB').getcolors()
    lt.sort(key=lambda tup: tup[0], reverse=True)
    actual_name, closest_name = get_color_name(lt[0][1])
    print(lt[0], actual_name, closest_name)
Output:
(531162, (1, 1, 1)) None black
In this case, you'd be interested in the closest_name variable. The first element (lt[0]) shows you the most common RGB value. It doesn't have a defined web color name, hence the None for actual_name.
Explanation:
This opens the file you've provided, converts it to RGB, and then runs PIL's getcolors method on the image. The result is a list of tuples in the format (count, RGB_color_value). I then sort the list in reverse order and pass the most common RGB color value to get_color_name (it is now the first tuple in the list, and the RGB value is the second element of that tuple).