I need to add space between two lines by using OpenCV or PIL.
If the lines vary "sufficiently" in their length, then the following approach might be useful:
Inverse binarize the image using cv2.threshold.
Dilate the image with a horizontal line kernel using cv2.dilate to emphasize the lines.
Sum all pixels row-wise using np.sum, and calculate the absolute differences between the rows using np.diff.
There will be "steps" in the differences between the rows, which resemble the step between the lines. Set up a threshold and find the proper indices using np.where.
Insert white lines in the original image before the found indices using np.insert. In the below example, the index was chosen manually. Work has to be done to properly automatize this: Exclude "steps" to "background", find "steps" between multiple lines.
Here comes a code snippet:
import cv2
from matplotlib import pyplot as plt
import numpy as np
from skimage import io # Only needed for web grabbing images, use cv2.imread for local images
# Read and binarize image
image = cv2.cvtColor(io.imread('https://i.stack.imgur.com/56g7s.jpg'), cv2.COLOR_RGB2GRAY)
_, image_bin = cv2.threshold(image, 128, 255, cv2.THRESH_BINARY_INV)
# Dilate rows by using horizontal line as kernel
kernel = np.ones((1, 51), np.uint8)
image_dil = cv2.dilate(image_bin, kernel)
# Sum pixels row-wise, and calculate absolute differences between the rows
row_sum = np.sum(image_dil / 255, 1, dtype=np.int32)
row_sum_diff = np.abs(np.diff(row_sum))
# Just for visualization: Summed row-wise pixels
plt.plot(row_sum)
plt.show()
# Find "steps" in the differences between the rows
step_thr = 100
step_idx = np.where(row_sum_diff > step_thr)[0]
# Insert n lines before desired index; simple hard-coding here, more work needs to be done for multiple lines
n_lines = 5
image_mod = np.insert(image, step_idx[1] + 1, 255 * np.ones((n_lines, image.shape[1]), np.uint8), axis=0)
# Result visualization
cv2.imshow('image', image)
cv2.imshow('image_dil', image_dil)
cv2.imshow('image_mod', image_mod)
cv2.waitKey(0)
cv2.destroyAllWindows()
The dilated, inverse binarized image:
The visualization of the "steps":
The final output with n = 5 inserted white lines:
As you can see, the result isn't perfect, but that's due to the original image. In the corresponding row, you have parts of the first and second line. So, a proper distinction between these two isn't possible. One might add a very small morphological closing to the output to get rid of these artifacts.
Hope that helps!
Related
I've been researching and trying a couple functions to get what I want and I feel like I might be overthinking it.
One version of my code is below. The sample image is here.
My end goal is to find the angle (yellow) of the approximated line with respect to the frame (green line) Final
I haven't even got to the angle portion of the program yet.
The results I was obtaining from the below code were as follows. Canny Closed Small Removed
Anybody have a better way of creating the difference and establishing the estimated line?
Any help is appreciated.
import cv2
import numpy as np
pX = int(512)
pY = int(768)
img = cv2.imread('IMAGE LOCATION', cv2.IMREAD_COLOR)
imgS = cv2.resize(img, (pX, pY))
aimg = cv2.imread('IMAGE LOCATION', cv2.IMREAD_GRAYSCALE)
# Blur image to reduce noise and resize for viewing
blur = cv2.medianBlur(aimg, 5)
rblur = cv2.resize(blur, (384, 512))
canny = cv2.Canny(rblur, 120, 255, 1)
cv2.imshow('canny', canny)
kernel = np.ones((2, 2), np.uint8)
#fringeMesh = cv2.dilate(canny, kernel, iterations=2)
#fringeMesh2 = cv2.dilate(fringeMesh, None, iterations=1)
#cv2.imshow('fringeMesh', fringeMesh2)
closing = cv2.morphologyEx(canny, cv2.MORPH_CLOSE, kernel)
cv2.imshow('Closed', closing)
nb_components, output, stats, centroids = cv2.connectedComponentsWithStats(closing, connectivity=8)
#connectedComponentswithStats yields every separated component with information on each of them, such as size
sizes = stats[1:, -1]; nb_components = nb_components - 1
min_size = 200 #num_pixels
fringeMesh3 = np.zeros((output.shape))
for i in range(0, nb_components):
if sizes[i] >= min_size:
fringeMesh3[output == i + 1] = 255
#contours, _ = cv2.findContours(fringeMesh3, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_NONE)
#cv2.drawContours(fringeMesh3, contours, -1, (0, 255, 0), 1)
cv2.imshow('final', fringeMesh3)
#cv2.imshow("Natural", imgS)
#cv2.imshow("img", img)
cv2.imshow("aimg", aimg)
cv2.imshow("Blur", rblur)
cv2.waitKey()
cv2.destroyAllWindows()
You can fit a straight line to the first white pixel you encounter in each column, starting from the bottom.
I had to trim your image because you shared a screen grab of it with a window decoration, title and frame rather than your actual image:
import cv2
import math
import numpy as np
# Load image as greyscale
im = cv2.imread('trimmed.jpg', cv2.IMREAD_GRAYSCALE)
# Get index of first white pixel in each column, starting at the bottom
yvals = (im[::-1,:]>200).argmax(axis=0)
# Make the x values 0, 1, 2, 3...
xvals = np.arange(0,im.shape[1])
# Fit a line of the form y = mx + c
z = np.polyfit(xvals, yvals, 1)
# Convert the slope to an angle
angle = np.arctan(z[0]) * 180/math.pi
Note 1: The value of z (the result of fitting) is:
array([ -0.74002694, 428.01463745])
which means the equation of the line you are looking for is:
y = -0.74002694 * x + 428.01463745
i.e. the y-intercept is at row 428 from the bottom of the image.
Note 2: Try to avoid JPEG format as an intermediate format in image processing - it is lossy and changes your pixel values - so where you have thresholded and done your morphology you are expecting values of 255 and 0, JPEG will lossily alter those values and you end up testing for a range or thresholding again.
Your 'Closed' image seems to quite clearly segment the two regions, so I'd suggest you focus on turning that boundary into a line that you can do something with. Connected components analysis and contour detection don't really provide any useful information here, so aren't necessary.
One quite simple approach to finding the line angle is to find the first white pixel in each row. To get only the rows that are part of your diagonal, don't include rows where that pixel is too close to either side (e.g. within 5%). That gives you a set of points (pixel locations) on the boundary of your two types of grass.
From there you can either do a linear regression to get an equation for the straight line, or you can get two points by averaging the x values for the top and bottom half of the rows, and then calculate the gradient angle from that.
An alternative approach would be doing another morphological close with a very large kernel, to end up with just a solid white region and a solid black region, which you could turn into a line with canny or findContours. From there you could either get some points by averaging, use the endpoints, or given a smooth enough result from a large enough kernel you could detect the line with hough lines.
I'm currently trying to apply an activation heatmap to a photo.
Currently, I have the original photo, as well as a mask of probabilities. I multiply the probabilities by 255 and then round down to the nearest integer. I'm then using cv2.applyColorMap with COLORMAP.JET to apply the colormap to the image with an opacity of 25%.
img_cv2 = cv2.cvtColor(np_img, cv2.COLOR_RGB2BGR)
heatmapshow = np.uint8(np.floor(mask * 255))
colormap = cv2.COLORMAP_JET
heatmapshow = cv2.applyColorMap(np.uint8(heatmapshow - 255), colormap)
heatmap_opacity = 0.25
image_opacity = 1.0 - heatmap_opacity
heatmap_arr = cv2.addWeighted(heatmapshow, heatmap_opacity, img_cv2, image_opacity, 0)
This current code successfully produces a heatmap. However, I'd like to be able to make two changes.
Keep the opacity at 25% For all values above a certain threshold (Likely > 0, but I'd prefer more flexibility), but then when the mask is below that threshold, reduce the opacity to 0% for those cells. In other words, if there is very little activation, I want to preserve the color of the original image.
If possible I'd also like to be able to specify a custom colormap, since the native ones are pretty limited, though I might be able to get away without this if I can do the custom opacity thing.
I read on Stackoverflow that you can possibly trick cv2 into not overlaying any color with NaN values, but also read that only works for floats and not ints, which complicates things since I'm using int8. I'm also concerned that this functionality could change in the future as I don't believe this is intentional design purposefully built into cv2.
Does anyone have a good way of accomplishing these goals? Thanks!
With regard to your second question:
Here is how to create a simple custom two color gradient color map in Python/OpenCV.
Input:
import cv2
import numpy as np
# load image as grayscale
img = cv2.imread('lena_gray.png', cv2.IMREAD_GRAYSCALE)
# convert to 3 equal channels
img = cv2.merge((img, img, img))
# create 1 pixel red image
red = np.full((1, 1, 3), (0,0,255), np.uint8)
# create 1 pixel blue image
blue = np.full((1, 1, 3), (255,0,0), np.uint8)
# append the two images
lut = np.concatenate((red, blue), axis=0)
# resize lut to 256 values
lut = cv2.resize(lut, (1,256), interpolation=cv2.INTER_LINEAR)
# apply lut
result = cv2.LUT(img, lut)
# save result
cv2.imwrite('lena_red_blue_lut_mapped.png', result)
# display result
cv2.imshow('RESULT', result)
cv2.waitKey(0)
cv2.destroyAllWindows()
Result of colormap applied to image:
With regard to your first question:
You are blending the heat map image with the original image using a constant "opacity" value. You can replace the single opacity value with an image. You just have to do the addWeighted manually as heatmap * opacity_img + original * (1-opacity_img) where your opacity image is float in the range 0 to 1. Then clip and convert back to uint8. If your opacity image is binary, then you can use cv2.bitWiseAnd() in place of multiply.
I need to get the information in the below barcode with the Python pyzbar library, but it does not recognize it. Should I make any improvement before using pyzbar?
this is the code:
from pyzbar.pyzbar import decode
import cv2
def barcodeReader(image):
gray_img = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
barcodes = decode(gray_img)
barcode = barcodeReader("My_image")
print (barcode)
Result: []
You could try to reconstruct the barcode by:
Inverse binarizing the image with cv2.threshold, such that you get white lines on black background.
Counting all non-zero pixels along the rows using np.count_nonzero.
Getting all indices, where the count exceeds a pre-defined threshold, let's say 100 here.
On a new, all white image, drawing black lines at the found indices.
Here's some code:
import cv2
import numpy as np
from skimage import io # Only needed for web grabbing images, use cv2.imread for local images
# Read image from web, convert to grayscale, and inverse binary threshold
image = cv2.cvtColor(io.imread('https://i.stack.imgur.com/D8Jk7.jpg'), cv2.COLOR_RGB2GRAY)
_, image_thr = cv2.threshold(image, 128, 255, cv2.THRESH_BINARY_INV)
# Count non-zero pixels along the rows; get indices, where count exceeds certain threshold (here: 100)
row_nz = np.count_nonzero(image_thr, axis=0)
idx = np.argwhere(row_nz > 100)
# Generate new image, draw lines at found indices
image_new = np.ones_like(image_thr) * 255
image_new[35:175, idx] = 0
cv2.imshow('image_thr', image_thr)
cv2.imshow('image_new', image_new)
cv2.waitKey(0)
cv2.destroyAllWindows()
The inverse binarized image:
The reconstructed image:
I'm not sure, if the result is a valid barcode. To improve the solution you could get rid of the numbers beforehand. Also, play around with the threshold.
Hope that helps!
You can follow below approach:
Using morphological operation detect vertical lines and stored the xmin ymin, xmax and ymax of vertical image.
sort all xmin values and grouped them based distance.
do same excercise ymin and ymax and grouped them.
consider smallest pixels values from larger group of xmin and ymin larger group respectively.
consider largest values from larger group of xmax and ymax larger grop respectively.
you will get exact xmin,ymin,xmax,ymax of barcode.
I am really new to opencv and a beginner to python.
I have this image:
I want to somehow apply proper thresholding to keep nothing but the 6 digits.
The bigger picture is that I intend to try to perform manual OCR to the image for each digit separately, using the k-nearest neighbours algorithm on a per digit level (kNearest.findNearest)
The problem is that I cannot clean up the digits sufficiently, especially the '7' digit which has this blue-ish watermark passing through it.
The steps I have tried so far are the following:
I am reading the image from disk
# IMREAD_UNCHANGED is -1
image = cv2.imread(sys.argv[1], cv2.IMREAD_UNCHANGED)
Then I'm keeping only the blue channel to get rid of the blue watermark around digit '7', effectively converting it to a single channel image
image = image[:,:,0]
# openned with -1 which means as is,
# so the blue channel is the first in BGR
Then I'm multiplying it a bit to increase contrast between the digits and the background:
image = cv2.multiply(image, 1.5)
Finally I perform Binary+Otsu thresholding:
_,thressed1 = cv2.threshold(image,0,255,cv2.THRESH_BINARY+cv2.THRESH_OTSU)
As you can see the end result is pretty good except for the digit '7' which has kept a lot of noise.
How to improve the end result? Please supply the image example result where possible, it is better to understand than just code snippets alone.
You can try to medianBlur the gray(blur) image with different kernels(such as 3, 51), divide the blured results, and threshold it. Something like this:
#!/usr/bin/python3
# 2018/09/23 17:29 (CST)
# (中秋节快乐)
# (Happy Mid-Autumn Festival)
import cv2
import numpy as np
fname = "color.png"
bgray = cv2.imread(fname)[...,0]
blured1 = cv2.medianBlur(bgray,3)
blured2 = cv2.medianBlur(bgray,51)
divided = np.ma.divide(blured1, blured2).data
normed = np.uint8(255*divided/divided.max())
th, threshed = cv2.threshold(normed, 100, 255, cv2.THRESH_OTSU)
dst = np.vstack((bgray, blured1, blured2, normed, threshed))
cv2.imwrite("dst.png", dst)
The result:
Why not just keep values in the image that are above a certain threshold?
Like this:
import cv2
import numpy as np
img = cv2.imread("./a.png")[:,:,0] # the last readable image
new_img = []
for line in img:
new_img.append(np.array(list(map(lambda x: 0 if x < 100 else 255, line))))
new_img = np.array(list(map(lambda x: np.array(x), new_img)))
cv2.imwrite("./b.png", new_img)
Looks great:
You could probably play with the threshold even more and get better results.
It doesn't seem easy to completely remove the annoying stamp.
What you can do is flattening the background intensity by
computing a lowpass image (Gaussian filter, morphological closing); the filter size should be a little larger than the character size;
dividing the original image by the lowpass image.
Then you can use Otsu.
As you see, the result isn't perfect.
I tried a slightly different approach then Yves on the blue channel:
Apply median filter (r=2):
Use Edge detection (e.g. Sobel operator):
Automatic thresholding (Otsu)
Closing of the image
This approach seems to make the output a little less noisy. However, one has to address the holes in the numbers. This can be done by detecting black contours which are completely surrounded by white pixels and simply filling them with white.
I have two images, one with only background and the other with background + detectable object (in my case its a car). Below are the images
I am trying to remove the background such that I only have car in the resulting image. Following is the code that with which I am trying to get the desired results
import numpy as np
import cv2
original_image = cv2.imread('IMG1.jpg', cv2.IMREAD_COLOR)
gray_original = cv2.cvtColor(original_image, cv2.COLOR_BGR2GRAY)
background_image = cv2.imread('IMG2.jpg', cv2.IMREAD_COLOR)
gray_background = cv2.cvtColor(background_image, cv2.COLOR_BGR2GRAY)
foreground = np.absolute(gray_original - gray_background)
foreground[foreground > 0] = 255
cv2.imshow('Original Image', foreground)
cv2.waitKey(0)
The resulting image by subtracting the two images is
Here is the problem. The expected resulting image should be a car only.
Also, If you take a deep look in the two images, you'll see that they are not exactly same that is, the camera moved a little so background had been disturbed a little. My question is that with these two images how can I subtract the background. I do not want to use grabCut or backgroundSubtractorMOG algorithm right now because I do not know right now whats going on inside those algorithms.
What I am trying to do is to get the following resulting image
Also if possible, please guide me with a general way of doing this not only in this specific case that is, I have a background in one image and background+object in the second image. What could be the best possible way of doing this. Sorry for such a long question.
I solved your problem using the OpenCV's watershed algorithm. You can find the theory and examples of watershed here.
First I selected several points (markers) to dictate where is the object I want to keep, and where is the background. This step is manual, and can vary a lot from image to image. Also, it requires some repetition until you get the desired result. I suggest using a tool to get the pixel coordinates.
Then I created an empty integer array of zeros, with the size of the car image. And then I assigned some values (1:background, [255,192,128,64]:car_parts) to pixels at marker positions.
NOTE: When I downloaded your image I had to crop it to get the one with the car. After cropping, the image has size of 400x601. This may not be what the size of the image you have, so the markers will be off.
Afterwards I used the watershed algorithm. The 1st input is your image and 2nd input is the marker image (zero everywhere except at marker positions). The result is shown in the image below.
I set all pixels with value greater than 1 to 255 (the car), and the rest (background) to zero. Then I dilated the obtained image with a 3x3 kernel to avoid losing information on the outline of the car. Finally, I used the dilated image as a mask for the original image, using the cv2.bitwise_and() function, and the result lies in the following image:
Here is my code:
import cv2
import numpy as np
import matplotlib.pyplot as plt
# Load the image
img = cv2.imread("/path/to/image.png", 3)
# Create a blank image of zeros (same dimension as img)
# It should be grayscale (1 color channel)
marker = np.zeros_like(img[:,:,0]).astype(np.int32)
# This step is manual. The goal is to find the points
# which create the result we want. I suggest using a
# tool to get the pixel coordinates.
# Dictate the background and set the markers to 1
marker[204][95] = 1
marker[240][137] = 1
marker[245][444] = 1
marker[260][427] = 1
marker[257][378] = 1
marker[217][466] = 1
# Dictate the area of interest
# I used different values for each part of the car (for visibility)
marker[235][370] = 255 # car body
marker[135][294] = 64 # rooftop
marker[190][454] = 64 # rear light
marker[167][458] = 64 # rear wing
marker[205][103] = 128 # front bumper
# rear bumper
marker[225][456] = 128
marker[224][461] = 128
marker[216][461] = 128
# front wheel
marker[225][189] = 192
marker[240][147] = 192
# rear wheel
marker[258][409] = 192
marker[257][391] = 192
marker[254][421] = 192
# Now we have set the markers, we use the watershed
# algorithm to generate a marked image
marked = cv2.watershed(img, marker)
# Plot this one. If it does what we want, proceed;
# otherwise edit your markers and repeat
plt.imshow(marked, cmap='gray')
plt.show()
# Make the background black, and what we want to keep white
marked[marked == 1] = 0
marked[marked > 1] = 255
# Use a kernel to dilate the image, to not lose any detail on the outline
# I used a kernel of 3x3 pixels
kernel = np.ones((3,3),np.uint8)
dilation = cv2.dilate(marked.astype(np.float32), kernel, iterations = 1)
# Plot again to check whether the dilation is according to our needs
# If not, repeat by using a smaller/bigger kernel, or more/less iterations
plt.imshow(dilation, cmap='gray')
plt.show()
# Now apply the mask we created on the initial image
final_img = cv2.bitwise_and(img, img, mask=dilation.astype(np.uint8))
# cv2.imread reads the image as BGR, but matplotlib uses RGB
# BGR to RGB so we can plot the image with accurate colors
b, g, r = cv2.split(final_img)
final_img = cv2.merge([r, g, b])
# Plot the final result
plt.imshow(final_img)
plt.show()
If you have a lot of images you will probably need to create a tool to annotate the markers graphically, or even an algorithm to find markers automatically.
The problem is that you're subtracting arrays of unsigned 8 bit integers. This operation can overflow.
To demonstrate
>>> import numpy as np
>>> a = np.array([[10,10]],dtype=np.uint8)
>>> b = np.array([[11,11]],dtype=np.uint8)
>>> a - b
array([[255, 255]], dtype=uint8)
Since you're using OpenCV, the simplest way to achieve your goal is to use cv2.absdiff().
>>> cv2.absdiff(a,b)
array([[1, 1]], dtype=uint8)
I recommend using OpenCV's grabcut algorithm. You first draw a few lines on the foreground and background, and keep doing this until your foreground is sufficiently separated from the background. It is covered here: https://docs.opencv.org/trunk/d8/d83/tutorial_py_grabcut.html
as well as in this video: https://www.youtube.com/watch?v=kAwxLTDDAwU