I'm trying to determine whether an image is blocky (pixelated).
I've heard of the 2D Fourier transform with numpy or scipy, but it is a bit complicated.
The goal is to measure how much of the image consists of blocky zones caused by bad compression, like this (img a):
I have no idea if this would work - but, something you could try is to get the nearest neighbors around a pixel. The pixellated squares will be a visible jump in RGB values around a region.
You can find the nearest neighbors for every pixel in an image with something like
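def get_neighbors(x, y, img):
    """Return the pixels in the 3x3 window centred on img[x][y]."""
    ops = [-1, 0, +1]
    pixels = []
    for opy in ops:
        for opx in ops:
            nx, ny = x + opx, y + opy
            # Skip positions outside the image; a negative index would
            # otherwise wrap around silently instead of raising.
            if 0 <= nx < len(img) and 0 <= ny < len(img[0]):
                pixels.append(img[nx][ny])
    return pixels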
This will give you the nearest pixels in a region of your source image.
To use it, you'd do something like
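import operator
import numpy as np
import imageio  # stands in for scipy.misc.imread, which was removed in SciPy 1.2

def detect_pixellated(fp):
    # Signed ints, so that pixel differences cannot wrap around as they would with uint8
    img = imageio.imread(fp).astype(int)
    width, height = np.shape(img)[0:2]
    # Pixel change to detect edge
    threshold = 20
    edge_count = 0
    # Interior pixels only, so that every pixel has all 9 neighbors
    for x in range(1, width - 1):
        for y in range(1, height - 1):
            neighbors = get_neighbors(x, y, img)
            # Neighbors come in this order:
            # 6 7 8
            # 3 4 5
            # 0 1 2
            center = neighbors[4]
            del neighbors[4]
            for neighbor in neighbors:
                # Per-channel absolute difference (assumes a colour image)
                diffs = map(operator.abs, map(operator.sub, neighbor, center))
                possibleEdge = all(diff > threshold for diff in diffs)
                # Counting the hits is one way to use the result
                if possibleEdge:
                    edge_count += 1
    return edge_count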
After further thought though, use OpenCV and do edge detection and get contour sizes. That would be significantly easier and more robust.
If you scan through the image line by line it's a bit easier, because then you deal with linear graphs instead of 2D image graphs, which is always simpler.
Solution:
Scan a line across the pixels, put the line in an array if that is faster to access for computations, and then run algorithms on the line(s) to determine the blockiness:
1/ Run through every pixel in your line and compare it to the previous pixel by subtracting one value from the other. Make an array of these difference values. If large jumps in pixel values occur at regular intervals, it's blocky. If there are large jumps in values combined with small jumps in values, it's blocky... You can assume that if there are many equal pixel differences, it's blocky, especially if you repeat the analysis at 2 and 4 neighbour pixel intervals, and on multiple lines.
You can also make graphs of the differences between pixels 3, 5 and 10 pixels apart, to get additional information on gradient changes along the sampled lines. If the pixel differences between direct neighbours and between 5th neighbours have a similar ratio, that also indicates unsmooth colors.
There can be many algorithms, including a fast Fourier transform on a linear graph, same as for audio, that you would use on line(s) from the picture, and that is simpler than a 2D image algorithm.
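The first step above (differences to the previous pixel, then looking for large jumps at regular intervals) could be sketched with NumPy roughly as follows. The function name line_blockiness, the jump_threshold of 20 and the two synthetic lines are assumptions made for illustration; they are not part of the answer.

```python
import numpy as np

def line_blockiness(line, jump_threshold=20):
    """Return (regularity, spacing) for one scan line.

    regularity is the share of large jumps that repeat at the most common
    spacing; spacing is that most common distance in pixels.
    """
    line = np.asarray(line, dtype=float)
    diffs = np.abs(np.diff(line))                   # difference to the previous pixel
    jumps = np.flatnonzero(diffs > jump_threshold)  # positions of the large jumps
    if len(jumps) < 3:
        return 0.0, 0                               # too few jumps to judge regularity
    spacings = np.diff(jumps)                       # distance between consecutive jumps
    values, counts = np.unique(spacings, return_counts=True)
    best = counts.argmax()
    return counts[best] / len(spacings), int(values[best])

# Synthetic lines: flat blocks 8 pixels wide versus a smooth ramp
blocky = np.repeat(np.arange(8) * 30, 8)
smooth = np.linspace(0, 255, 64)
print(line_blockiness(blocky))
print(line_blockiness(smooth))
```

A regularity close to 1 with a spacing of several pixels hints at blocks of that width. A spacing of 1 more likely means fine texture, so it is worth checking several lines (and columns) before drawing a conclusion.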
Problem statement: after successfully getting the bounding box around the object in YOLO, I wanted to separate the background from the object itself.
My solution: I have an RGB-D camera that returns a depth map as well as the image (the image is what is given to YOLO). Using the depth map, I made a simple function to get the depths (rounded) and how many pixels have each value:
def GetAllDepthsSortedMeters(depth_image_ocv):
    _depth = depth_image_ocv[np.isfinite(depth_image_ocv)]
    _depth= -np.sort(-depth_image_ocv)[:int(len(_depth)/2)]
    _depth= np.round(_depth,1)
    unique, counts = np.unique(_depth, return_counts=True)
    return dict(zip(counts, unique))
Plotting them, I noticed that there are dominant peaks and the rest lie around them. After some filtering I was able to successfully get those peaks each time.
#get the values of depths and their number of occurences
counts,values = GetKeysAndValues(_depths)
#find the peaks of depths in those values
peaks = find_peaks_cwt(counts, widths=np.ones(counts.shape)*2)-1
Using those peaks, I was able to segment the required object from the background by checking which peak each pixel's value is closest to, and making a mask for each peak (and the pixels around it).
def GetAcceptedMasks(h,w,depth_map,depths_of_accepted_peaks,accepted_masks):
    prev=None
    prev_index=None
    for pos in product(range(h), range(w)):
        pixel = depth_map.item(pos)
        if ( (prev is not None) and (round(prev,1) == round(pixel,1)) ):
            accepted_masks[prev_index][pos[0],pos[1]]= 255
        else:
            _temp_array = abs(depths_of_accepted_peaks-pixel)
            _min = np.amin(_temp_array)
            _ind = np.where( _temp_array == _min )[0][0]
            accepted_masks[_ind][pos[0],pos[1]]= 255
            prev_index = _ind
            prev = pixel
    return accepted_masks
After passing the image through YOLOv3 and applying the filtering and depth segmentation, it takes 0.8 s, which is far from optimal.
This is mostly the result of the function above. Any help would be amazing. Thank you.
These are the masks I get at the end:
Mask1-Of-Closest-Depth
Mask2-Of-2nd-Closest-Depth
Mask3-Of-3rd-Closest-Depth
Edit:
Example of distance:
[0.60000002 1.29999995 1.89999998]
Example of DepthMap when show with imshow:
Example of Depth Map
Here's a way to do it.
Make an array of floats the same height and width as your image, and with the final dimension equal to the number of unique depths you want to identify
At each pixel location, calculate the distance to each of the three desired depths and store in the final dimension
Use np.argmin(..., axis=2) to select the nearest depth of the three
I am not at a computer to test, and your image is not your actual image but rather a picture of it with window decorations and title bar and different values, but something like this:
import cv2
import numpy as np
# Load the image as greyscale float - so we can store positive and negative distances
im = cv2.imread('depth.png', cv2.IMREAD_GRAYSCALE).astype(float)
# Make list of the desired depths
depths = [255, 181, 125]
# Make array with distance to each depth
d2each = np.zeros((im.shape[0], im.shape[1], len(depths)), dtype=float)
for i in range(len(depths)):
    d2each[...,i] = np.abs(im - depths[i])
# Now let Numpy choose nearest of three distances
mask = np.argmin(d2each, axis=2)
Another way is to range-test the distances. Load the image as above:
# Make mask of pixels matching first distance
d0 = np.logical_and(im>100, im<150)
# Make mask of pixels matching second distance
d1 = np.logical_and(im>180, im<210)
# Make mask of pixels matching third distance
d2 = im >= 210
Those masks will be logical (i.e. True/False), but if you want to make them black and white, just cast them to uint8 and multiply by 255, e.g. mask0 = d0.astype(np.uint8) * 255
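The nearest-depth idea can also be written without any Python loop, using broadcasting, and applied directly to a depth map in metres with peak depths like the ones in the question. This is only a sketch: masks_from_peaks is a made-up name, the tiny depth array is invented test data, and non-finite depth values would need separate handling.

```python
import numpy as np

def masks_from_peaks(depth_map, peak_depths):
    """Return one 0/255 mask per peak; each pixel goes to the peak nearest in depth."""
    depth_map = np.asarray(depth_map, dtype=float)
    peaks = np.asarray(peak_depths, dtype=float)
    # Shape (h, w, n_peaks): distance from every pixel to every peak
    dist = np.abs(depth_map[..., np.newaxis] - peaks)
    nearest = np.argmin(dist, axis=2)
    # Compare the label image against each peak index to get the masks
    peak_ids = np.arange(len(peaks))[:, np.newaxis, np.newaxis]
    masks = nearest[np.newaxis, ...] == peak_ids
    return masks.astype(np.uint8) * 255

depth = np.array([[0.62, 0.58, 1.31],
                  [1.88, 1.25, 1.95]])
masks = masks_from_peaks(depth, [0.6, 1.3, 1.9])
print(masks[0])
```

The result has shape (number of peaks, h, w), so masks[0] plays the role of the mask of the closest depth.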
Another approach could be to use K-means clustering.
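For illustration, a bare-bones one-dimensional K-means on depth values might look like the sketch below. The function kmeans_1d, its quantile-based initialisation and the sample depths are assumptions of this sketch; in practice a library implementation such as scikit-learn's KMeans or cv2.kmeans would be the more usual choice.

```python
import numpy as np

def kmeans_1d(values, k, n_iter=20):
    """Plain Lloyd iterations on scalar values; returns (centres, labels)."""
    values = np.asarray(values, dtype=float).ravel()
    # Deterministic start: spread the initial centres over the quantiles of the data
    centres = np.quantile(values, (np.arange(k) + 0.5) / k)
    for _ in range(n_iter):
        # Assign every value to its nearest centre
        labels = np.argmin(np.abs(values[:, None] - centres[None, :]), axis=1)
        for c in range(k):
            members = values[labels == c]
            if members.size:          # an empty cluster keeps its previous centre
                centres[c] = members.mean()
    labels = np.argmin(np.abs(values[:, None] - centres[None, :]), axis=1)
    return centres, labels

depths = np.array([0.58, 0.60, 0.62, 1.28, 1.30, 1.32, 1.88, 1.90, 1.92])
centres, labels = kmeans_1d(depths, k=3)
print(centres, labels)
```

Reshaping labels back to the image shape gives the same kind of label image as the argmin approach, without having to pick the depths by hand.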
I am trying to find all the circular particles in the attached image. This is the only image I have (along with its inverse).
I have read this post and yet I can't use hsv values for thresholding. I have tried using Hough Transform.
circles = cv2.HoughCircles(img, cv2.HOUGH_GRADIENT, dp=0.01, minDist=0.1, param1=10, param2=5, minRadius=3,maxRadius=6)
and using the following code to plot
names =[circles]
for nums in names:
    color_img = cv2.imread(path)
    blue = (211,211,211)
    for x, y, r in nums[0]:
        # HoughCircles returns floats; cv2.circle needs integer coordinates
        cv2.circle(color_img, (int(x), int(y)), int(r), blue, 1)
    plt.figure(figsize=(15,15))
    plt.title("Hough")
    plt.imshow(color_img, cmap='gray')
The following code was to plot the mask:
for masks in names:
    black = np.zeros(img_gray.shape)
    for x, y, r in masks[0]:
        cv2.circle(black, (int(x), int(y)), int(r), 255, -1) # -1 to draw filled circles
    plt.imshow(black, cmap='gray')
Yet I am only able to get the following mask, which is fairly poor.
This is an image of what is considered a particle and what is not.
One simple approach involves slightly eroding the image, to separate touching circular objects, then doing a connected component analysis and discarding all objects larger than some chosen threshold, and finally dilating the image back so the circular objects are approximately of the original size again. We can do this dilation on the labelled image, such that you retain the separated objects.
I'm using DIPlib because I'm most familiar with it (I'm an author).
import diplib as dip
a = dip.ImageRead('6O0Oe.png')
a = a(0) > 127 # the PNG is a color image, but OP's image is binary,
# so we binarize here to simulate OP's condition.
separation = 7 # tweak these two parameters as necessary
size_threshold = 500
b = dip.Erosion(a, dip.SE(separation))
b = dip.Label(b, maxSize=size_threshold)
b = dip.Dilation(b, dip.SE(separation))
Do note that the image we use here seems to be a zoomed-in screen grab rather than the original image OP is dealing with. If so, the parameters must be made smaller to identify the smaller objects in the smaller image.
My approach is based on a simple observation: most of the particles in your image have approximately the same perimeter, and the "not particles" have a greater perimeter than them.
First, have a look at the RANSAC algorithm and how it finds inliers and outliers. It is usually applied to 2D data, but in our case we will apply the idea to 1D data.
In your case, I call the correct particles inliers and the incorrect particles outliers.
The data we work on will be the perimeters of these particles. To get them, find the contours in this image and compute the perimeter of each contour. Refer to this for information about contours.
Now we have the data, knowledge of the RANSAC algorithm, and the simple observation mentioned above. In this data we have to find the densest, most compact cluster, which will contain all the inliers; everything else will be outliers.
Now let's assume the inliers are in the range 40-60 and the outliers are beyond 60. Let's define a threshold value T = 0. We say that for each point in the data, the inliers for that point are in the range (value of that point - T, value of that point + T).
First iterate over all the points in the data, count the number of inliers of each point for this T, and store this information. Find the maximum number of inliers possible for this value of T. Now increment T by 1 and again find the maximum number of inliers possible for that T. Repeat these steps, incrementing T one by one.
There will be a range of values of T for which the maximum number of inliers stays the same. These inliers are the particles in your image, and the particles with a perimeter greater than these inliers are the outliers, i.e. the "not particles" in your image.
I have tried this algorithm on my test cases, which are similar to yours, and it works. I am always able to determine the outliers. I hope it works for you too.
One last thing: I see that the boundaries of your particles are irregular and not smooth. If this doesn't work for you on this image, try to smooth them and run the algorithm again.
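As a rough sketch of the counting scheme described above (not RANSAC proper), the perimeters could be processed like this. It assumes the perimeters have already been measured, for example with cv2.arcLength on each contour. The function name, the inclusive range test, the max_t limit and the sample perimeters are all assumptions of this sketch, and max_t should stay below the gap between particles and "not particles", otherwise the final "everything is an inlier" plateau can win.

```python
import numpy as np

def perimeter_inliers(perimeters, max_t=50):
    """Boolean mask of the inliers, chosen at the longest plateau of the best inlier count."""
    p = np.asarray(perimeters, dtype=float)
    gaps = np.abs(p[:, None] - p[None, :])       # pairwise perimeter differences
    best_counts, best_centres = [], []
    for t in range(max_t + 1):
        counts = (gaps <= t).sum(axis=1)         # inliers of every point for this T
        best_counts.append(counts.max())
        best_centres.append(counts.argmax())
    # Longest run of consecutive T values that share the same maximum inlier count
    run_start, best_start, best_len = 0, 0, 1
    for t in range(1, len(best_counts) + 1):
        if t == len(best_counts) or best_counts[t] != best_counts[run_start]:
            if t - run_start > best_len:
                best_start, best_len = run_start, t - run_start
            run_start = t
    centre = best_centres[best_start]
    return gaps[centre] <= best_start            # points within T of the best centre

perimeters = [42, 45, 48, 50, 52, 55, 58, 110, 150]
inliers = perimeter_inliers(perimeters)
print(inliers)
```

Contours whose entry in the mask is False and whose perimeter is larger than the inliers would then be treated as "not particles".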
We need to detect whether the images produced by our tunable lens are blurred or not.
We want to find a proxy measure for blurriness.
My current thinking is to first apply Sobel along the x direction, because the jumps or the stripes are mostly along this direction, then compute the marginal means along the x direction, and finally compute the standard deviation of these marginal means.
We expect this standard deviation to be bigger for a clear image and smaller for a blurred one, because clear images should have larger or more frequent jumps in pixel values.
But we get the opposite results. How could we improve this blurriness measure?
def sobel_image_central_std(PATH):
    # use the blue channel
    img = cv2.imread(PATH)[:,:,0]
    # extract the central part of the image
    hh, ww = img.shape
    hh2 = hh // 2
    ww2 = ww // 2
    hh4 = hh // 4
    ww4 = ww // 4
    img_center = img[hh4:(hh2+hh4), ww4:(ww2+ww4)]
    # Sobel operator
    sobelx = cv2.Sobel(img_center, cv2.CV_64F, 1, 0, ksize=3)
    x_marginal = sobelx.mean(axis = 0)
    plt.plot(x_marginal)
    return(x_marginal.std())
Blur #1
Blur #2
Clear #1
Clear #2
In general:
Is there a way to detect if an image is blurry?
You can combine this calculation with your other question, where you are searching for the central angle.
Once you have the angle (and the center, which may be outside of the image) you can make an axis transformation to remove the circular component of the cone. Instead you get x (radius) and y (angle), where y runs along the circular arcs.
Maybe you can get the center of the image from the camera set-up.
Then you don't need to calculate it from the intersection of the edges of the central angle. Or just do it manually once, if it is fixed for all images.
Look at polar coordinate systems.
Due to the shape of the cone the image will be denser at the peak, but this should be a fixed factor. It will probably bias the result when calculating the blurriness along the transformed image, though.
So what you could do to correct this is create a synthetic cone image with circular lines and apply the transformation to it. Again, this requires some trial and error.
But it should deliver a mask that you could use to correct the "blurriness bias".
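A minimal sketch of such an axis transformation, assuming the center is already known and using nearest-neighbour sampling only, is given below. The function to_polar, its parameters and the synthetic ring image are hypothetical; OpenCV's cv2.warpPolar does a similar job with interpolation.

```python
import numpy as np

def to_polar(img, centre, n_radii, n_angles, angle_range=(0.0, 2 * np.pi)):
    """Resample img so that x is the radius and y is the angle around centre."""
    cy, cx = centre
    radii = np.arange(n_radii)
    angles = np.linspace(angle_range[0], angle_range[1], n_angles, endpoint=False)
    a, r = np.meshgrid(angles, radii, indexing="ij")   # rows = angle, columns = radius
    ys = np.rint(cy + r * np.sin(a)).astype(int)
    xs = np.rint(cx + r * np.cos(a)).astype(int)
    inside = (ys >= 0) & (ys < img.shape[0]) & (xs >= 0) & (xs < img.shape[1])
    out = np.zeros(a.shape, dtype=img.dtype)           # samples outside the image stay 0
    out[inside] = img[ys[inside], xs[inside]]
    return out

# Synthetic "circular lines": every pixel stores its rounded distance to the centre
yy, xx = np.mgrid[0:41, 0:41]
rings = np.rint(np.hypot(yy - 20, xx - 20))
polar = to_polar(rings, centre=(20, 20), n_radii=20, n_angles=90)
print(polar.shape)
```

In the transformed image the circular arcs become straight lines along the y axis, so a one-directional measure such as Sobel along x can be applied to it. Restricting angle_range to the central angle keeps only the cone. The center may lie outside the image; samples that fall outside are simply left at 0.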
I am trying to warp a 640x360 image via the OpenCV remap function (in Python 2.7). The steps executed are the following:
Generate a curve and store its x and y coordinates in two separate arrays, curve_x and curve_y. I am attaching the generated curve as an image (using pyplot):
Load image via the opencv imread function
original = cv2.imread('C:\\Users\\User\\Desktop\\alaskan-landscaps3.jpg')
Execute a nested for loop so that each pixel is shifted upwards in proportion to the height of the curve at that point. For each pixel I calculate a warping factor by dividing the distance between the curve's y coordinate and the "ceiling" (360) by the height of the image. The factor is then multiplied with the distance between the pixel's y-coordinate and the "ceiling" in order to find the new distance that the pixel must have from the "ceiling" (it will be shorter since we have an upward shift). Finally I subtract this new distance from the "ceiling" to obtain the new y-coordinate for the pixel. I thought of this formula in order to ensure that all entries in the map_y array used in the remap function will be within the area of the original image.
for i in range(0, y_size):
    for j in range(0,x_size):
        map_y[i][j]= y_size-((y_size - i) * ((y_size - curve_y[j]) / y_size))
        map_x[i][j]=j
Then using the remap function
warped=cv2.remap(original,map_x,map_y,cv2.INTER_LINEAR)
The resulting image appears to be warped somewhat along the curve's path but it is cropped - I am attaching both the original and resulting image
I know I must be missing something but I can't figure out where the mistake is in my code - I don't understand why since all y-coordinates in map_y are between 0-360 the top third of the image has disappeared following the remapping
Any pointers or help will be appreciated. Thanks
[EDIT:] I have edited my function as follows:
#array to store previous y-coordinate, used as a counter during mapping process
floor_y=np.zeros((x_size),np.float32)
#for each row and column of picture
for i in range(0, y_size):
    for j in range(0,x_size):
        #calculate distance between top of the curve at given x coordinate and top
        height_above_curve = (y_size-1) - curve_y_points[j]
        #calculate a mapping factor, using total height of picture and distance above curve
        mapping_factor = (y_size-1)/height_above_curve
        # if there was no curve at given x-coordinate then do not change the pixel coordinate
        if(curve_y_points[j]==0):
            map_y[i][j]=j
        #if this is the first time the column is traversed, save the curve y-coordinate
        elif (floor_y[j]==0):
            #the pixel is translated upwards according to the height of the curve at that point
            floor_y[j]=i+curve_y_points[j]
            map_y[i][j]=i+curve_y_points[j] # new coordinate saved
        # use a modulo operation to only translate each nth pixel where n is the mapping factor.
        # the idea is that in order to fit all pixels from the original picture into a new smaller space
        #(because the curve squashes the picture upwards) a number of pixels must be removed
        elif ((math.floor(i % mapping_factor))==0):
            #increment the "floor" counter so that the next group of pixels from the original image
            #are mapped 1 pixel higher up than the previous group in the new picture
            floor_y[j]=floor_y[j]+1
            map_y[i][j]=floor_y[j]
        else:
            #for pixels that must be skipped map them all to the last pixel actually translated to the new image
            map_y[i][j]=floor_y[j]
        #all x-coordinates remain unchanged as we only translate pixels upwards
        map_x[i][j] = j
#printout function to test mappings at x=383
for j in range(0, 360):
    print('At x=383,y='+str(j)+'for curve_y_points[383]='+str(curve_y_points[383])+' and floor_y[383]='+str(floor_y[383])+' mapping is:'+str(map_y[j][383]))
The bottom line is that now the higher part of the image should not receive mappings from the lowest part, so overwriting of pixels should not take place. Yet I am still getting a hugely exaggerated upwards warping effect in the picture which I cannot explain (see new image below). The top of the curved part is at around y=140 in the original picture, yet now it is very close to the top, i.e. y around 300. There is also the question of why I am not getting a blank space at the bottom for the pixels below the curve.
I'm thinking that maybe there is also something going on with the order of rows and columns in the map_y array?
I don't think the image is being cropped. Rather, the values are "crowded" in the top-middle pixels, so that they get overwritten. Consider the following example with a simple function on a checkerboard.
import numpy as np
import cv2
import matplotlib.pyplot as plt
y_size=200
x_size=200
x=np.linspace(0,x_size,x_size+1)
y=(-(x-x_size/2)*(x-x_size/2))/x_size+x_size
plt.plot(x,y)
The function looks like this:
Then let's produce an image with a regular pattern.
test=np.zeros((x_size,y_size),dtype=np.float32)
for i in range(0, y_size):
    for j in range(0,x_size):
        if i%2 and j%2:
            test[i][j]=255
cv2.imwrite('checker.png',test)
Now let's apply your shift function to that pattern:
map_y=np.zeros((x_size,y_size),dtype=np.float32)
map_x=np.zeros((x_size,y_size),dtype=np.float32)
for i in range(0, y_size):
    for j in range(0,x_size):
        map_y[i][j]= y_size-((y_size - i) * ((y_size - y[j]) / y_size))
        map_x[i][j]=j
warped=cv2.remap(test,map_x,map_y,cv2.INTER_LINEAR)
cv2.imwrite('warped.png',warped)
If you notice, because of the shift, more than one value corresponds to the top-middle areas, which makes it look like it is cropped. But if you check the top left and right corners of the image, notice that the values are sparser, so the "cropping" effect is much less pronounced there. I hope the simple example helps to better understand what is going on.
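A complementary way to look at the result: cv2.remap works as a reverse lookup, i.e. every destination pixel reads from the source position given by the maps, dst(x, y) = src(map_x(x, y), map_y(x, y)). The sketch below emulates that lookup with nearest-neighbour sampling in plain NumPy on a tiny image whose pixels store their own row number. The helper remap_nearest and the four-column curve are made up for illustration.

```python
import numpy as np

def remap_nearest(src, map_x, map_y):
    """Nearest-neighbour emulation of the lookup dst[i, j] = src[map_y[i, j], map_x[i, j]]."""
    ys = np.rint(map_y).astype(int)
    xs = np.rint(map_x).astype(int)
    inside = (ys >= 0) & (ys < src.shape[0]) & (xs >= 0) & (xs < src.shape[1])
    dst = np.zeros(map_y.shape, dtype=src.dtype)   # out-of-range lookups stay 0
    dst[inside] = src[ys[inside], xs[inside]]
    return dst

y_size, x_size = 8, 4
src = np.repeat(np.arange(y_size)[:, None], x_size, axis=1)  # each pixel stores its row number
curve_y = np.array([0.0, 2.0, 4.0, 2.0])                     # hypothetical curve height per column
rows = np.arange(y_size, dtype=float)[:, None]
# Same formula as in the question, evaluated for all rows and columns at once
map_y = y_size - (y_size - rows) * ((y_size - curve_y) / y_size)
map_x = np.repeat(np.arange(x_size, dtype=float)[None, :], y_size, axis=0)
dst = remap_nearest(src, map_x, map_y)
print(dst[0])          # source rows that end up in the top row of the output
print(map_y.min(axis=0))
```

With this formula the top output row reads from source row curve_y[j], so in a column where the curve is high the source rows above the curve are never sampled at all, which would match the impression that the top of the picture has disappeared.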
What would be the approach to trim an image that's been input using a scanner and therefore has a large white/black area?
The entropy solution seems problematic and overly computationally intensive. Why not edge detect?
I just wrote this Python code to solve this same problem for myself. My background was a dirty white-ish, so the criteria that I used were darkness and color. I simplified these criteria by just taking the smallest of the R, G or B values for each pixel, so that black or saturated red both stood out the same. I also used the average of a set number of the darkest pixels for each row or column. Then I started at each edge and worked my way in until I crossed a threshold.
Here is my code:
#these values set how sensitive the bounding box detection is
threshold = 200 #the average of the darkest values must be _below_ this to count (0 is darkest, 255 is lightest)
obviousness = 50 #how many of the darkest pixels to include (1 would mean a single dark pixel triggers it)
import numpy as np
from PIL import Image
def find_line(vals):
    #implement edge detection once, use many times
    for i,tmp in enumerate(vals):
        tmp.sort()
        average = float(sum(tmp[:obviousness]))/len(tmp[:obviousness])
        if average <= threshold:
            return i
    return i #i is left over from failed threshold finding, it is the bounds
def getbox(img):
    #get the bounding box of the interesting part of a PIL image object
    #this is done by getting the darkest of the R, G or B value of each pixel
    #and finding where the edge gets dark/colored enough
    #returns a tuple of (left,upper,right,lower)
    width, height = img.size #for making a 2d array
    retval = [0,0,width,height] #values will be disposed of, but this is a black image's box
    pixels = list(img.getdata())
    vals = [] #store the value of the darkest color
    for pixel in pixels:
        vals.append(min(pixel)) #the darkest of the R,G or B values
    #make 2d array
    vals = np.array([vals[i * width:(i + 1) * width] for i in range(height)])
    #start with upper bounds
    forupper = vals.copy()
    retval[1] = find_line(forupper)
    #next, do lower bounds
    forlower = vals.copy()
    forlower = np.flipud(forlower)
    retval[3] = height - find_line(forlower)
    #left edge, same as before but rotate the data so left edge is top edge
    forleft = vals.copy()
    forleft = np.swapaxes(forleft,0,1)
    retval[0] = find_line(forleft)
    #and right edge is bottom edge of rotated array
    forright = vals.copy()
    forright = np.swapaxes(forright,0,1)
    forright = np.flipud(forright)
    retval[2] = width - find_line(forright)
    if retval[0] >= retval[2] or retval[1] >= retval[3]:
        print("error, bounding box is not legit")
        return None
    return tuple(retval)
if __name__ == '__main__':
    image = Image.open('cat.jpg')
    box = getbox(image)
    print("result is: ", box)
    result = image.crop(box)
    result.show()
For starters, here is a similar question. Here is a related question. And another related question.
Here is just one idea; there are certainly other approaches. I would select an arbitrary crop edge and then measure the entropy* on either side of the line, then proceed to re-select the crop line (probably using something like a bisection method) until the entropy of the cropped-out portion falls below a defined threshold. As I see it, you may need to resort to a brute-force root-finding method, as you will not have a good indication of when you have cropped too little. Then repeat for the remaining 3 edges.
*I recall discovering that the entropy method in the referenced website was not completely accurate, but I could not find my notes (I'm sure it was in a SO post, however.)
Edit:
Other criteria for the "emptiness" of an image portion (other than entropy) might be contrast ratio or contrast ratio on an edge-detect result.
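To make the entropy idea a little more concrete, here is a small sketch for the left edge only. It uses the Shannon entropy of the 8-bit intensity histogram and a plain linear scan instead of bisection (the brute-force route mentioned above). The function names, the threshold of 0.5 bits and the synthetic image are assumptions of this sketch.

```python
import numpy as np

def shannon_entropy(gray):
    """Entropy in bits of the 8-bit intensity histogram of a region."""
    hist = np.bincount(np.asarray(gray, dtype=np.uint8).ravel(), minlength=256)
    p = hist[hist > 0] / hist.sum()
    return float(-(p * np.log2(p)).sum())

def find_left_crop(gray, threshold=0.5):
    """Widest left strip (in columns) whose entropy stays at or below the threshold."""
    crop = 0
    for col in range(1, gray.shape[1]):
        if shannon_entropy(gray[:, :col]) > threshold:
            break
        crop = col
    return crop

# Synthetic scan: 8 blank white columns followed by varied content
gray = np.full((10, 20), 255, dtype=np.uint8)
gray[:, 8:] = (np.arange(10)[:, None] * 20 + np.arange(12)[None, :]) % 256
print(find_left_crop(gray))
```

The other three edges can be handled by flipping or transposing the array first, and the entropy function could be swapped for a contrast-based measure as suggested in the edit.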