For clarity, I am a beginner in Python. I'm writing a script that replaces the eyes in a series of images in a folder with a robotic eye. I'm stuck on the seamlessClone function, which keeps throwing the error below.
The script is incomplete (it only clones one eye so far), but everything else should be working. I've been stuck on this for 6 hours and thought I would ask on here.
I've tried checking my filenames and paths, checking whether the image file was corrupt by printing it, and swapping in image files with different dimensions and file types (png, jpg).
I've also tried converting every NumPy array (cv2_image, Eye) to a 32-bit array to see if that was the issue, but to no avail.
# Import
from PIL import Image, ImageDraw, ImageFilter
from statistics import mean
import face_recognition
import cv2
import glob
import numpy as np
# Open Eye Images
Eye = cv2.imread('eye.jpg')
# Loop Through Images
for filename in glob.glob('images/*.jpg'):
    cv2_image = cv2.imread(filename)
    image = face_recognition.load_image_file(filename)
    face_landmarks_list = face_recognition.face_landmarks(image)
    for facemarks in face_landmarks_list:
        # Get Eye Data
        eyeLPoints = facemarks['left_eye']
        eyeRPoints = facemarks['right_eye']
        npEyeL = np.array(eyeLPoints)
        npEyeR = np.array(eyeRPoints)
        # Create Mask
        mask = np.zeros(cv2_image.shape, cv2_image.dtype)
        mask.fill(0)
        poly = np.array([eyeLPoints])
        cv2.fillPoly(mask, [poly], (255, 255, 255))
        # Get Eye Image Centers
        npEyeL = np.array(eyeLPoints)
        eyeLCenter = npEyeL.mean(axis=0).astype("int")
        x1 = eyeLCenter[0]
        x2 = eyeLCenter[1]
        print(x1, x2)
        # Get Head Rotation (No code yet)
        # Apply Seamless Clone To Main Image
        saveImage = cv2.seamlessClone(Eye, cv2_image, mask, (x1, x2), cv2.NORMAL_CLONE)
        # Output and Save
        cv2.imwrite('output/output.png', saveImage)
Here is the full error:
Traceback (most recent call last):
  File "stack.py", line 45, in <module>
    saveImage = cv2.seamlessClone(Eye, cv2_image, mask, (x1, x2), cv2.NORMAL_CLONE)
cv2.error: OpenCV(4.1.1) /io/opencv/modules/core/src/matrix.cpp:466: error: (-215:Assertion failed) 0 <= roi.x && 0 <= roi.width && roi.x + roi.width <= m.cols && 0 <= roi.y && 0 <= roi.height && roi.y + roi.height <= m.rows in function 'Mat'
The result I'm expecting is for the eye image to be cloned onto the original image, but this error keeps being thrown, preventing me from completing the script. If there is any hint as to what's going on, I feel the culprit is the "Eye" file, but I could be wrong. Any help would be appreciated.
There are two sets of keypoints here: one for the eye in the destination (face) image, and one for the eye in the source (eye) image.
You are using the wrong set of keypoints to make the mask for the cv2.seamlessClone() function: the mask has to outline the eye in the source image, so you should be building it from the source keypoints, not the destination ones. seamlessClone cuts the masked region out of the source and pastes it around the centre point you pass; with a mask built from the destination's keypoints, the ROI it computes does not fit inside the images, which is exactly what the assertion is complaining about.
This is the result. You can see there should also be a resize step to match the sizes of the eyes (a sketch of that follows the code below):
This is the code that I used:
import face_recognition
import cv2
import numpy as np
# Open Eye Images
eye = cv2.imread('eye.jpg')
# Open Face image
face = cv2.imread('face.jpeg')
# Get Keypoints
image = face_recognition.load_image_file('face.jpeg')
face_landmarks_list = face_recognition.face_landmarks(image)
for facemarks in face_landmarks_list:
    # Get Eye Data
    eyeLPoints = facemarks['left_eye']
    eyeRPoints = facemarks['right_eye']
    npEyeL = np.array(eyeLPoints)
    # These points define the contour of the eye in the EYE image
    poly_left = np.array([(51, 228), (100, 151), (233, 102), (338, 110), (426, 160), (373, 252), (246, 284), (134, 268)], np.int32)
    # Create a mask for the eye
    src_mask = np.zeros(face.shape, face.dtype)
    cv2.fillPoly(src_mask, [poly_left], (255, 255, 255))
    cv2.imwrite('src_mask.png', src_mask)
    # Find where the eye should go
    center, r = cv2.minEnclosingCircle(npEyeL)
    center = tuple(np.array(center, int))
    # Clone seamlessly.
    output = cv2.seamlessClone(eye, face, src_mask, center, cv2.NORMAL_CLONE)
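As a rough sketch of that missing resize step (my addition — the width-based scale and the interpolation choice are assumptions, not part of the code above), you could scale the source eye to the width of the detected eye before cloning:

# Scale the source eye so its width matches the detected eye's width,
# measuring both widths from the keypoints used above.
eye_w = npEyeL[:, 0].max() - npEyeL[:, 0].min()        # eye width in the face image
src_w = poly_left[:, 0].max() - poly_left[:, 0].min()  # eye width in the eye image
scale = eye_w / src_w
eye_scaled = cv2.resize(eye, None, fx=scale, fy=scale, interpolation=cv2.INTER_AREA)
# The mask (and poly_left) would need the same scaling before seamlessClone.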
Related
I have this image and I want to read the text on it but pytesseract returns blank
import os
os.environ["KMP_DUPLICATE_LIB_OK"]="TRUE"
import cv2
import numpy as np
import math
from scipy import ndimage
import easyocr
import pytesseract
img = cv2.imread('cikti.jpg')
scale_percent = 220 # percent of original size
width = int(img.shape[1] * scale_percent / 100)
height = int(img.shape[0] * scale_percent / 100)
dim = (width, height)
# resize image
img = cv2.resize(img, dim, interpolation = cv2.INTER_AREA)
cv2.imshow('img', img)
cv2.waitKey(0)
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
edges = cv2.Canny(gray, 50, 150, apertureSize=3)
cv2.imshow('edges', edges)
cv2.waitKey(0)
angles = []
lines = cv2.HoughLinesP(edges, 1, math.pi / 180.0, 90)
for [[x1, y1, x2, y2]] in lines:
    # cv2.line(img, (x1, y1), (x2, y2), (255, 0, 0), 3)
    angle = math.degrees(math.atan2(y2 - y1, x2 - x1))
    if angle != 0:
        angles.append(angle)
print(angles)
median_angle = np.median(angles)
img = ndimage.rotate(img, median_angle)
print(median_angle)
filiter = np.array([[-1, -1, -1],
                    [-1,  9, -1],
                    [-1, -1, -1]])
cv2.imshow('filitird', img)
cv2.waitKey(0)
reader = easyocr.Reader(['tr'])
ocr_result = reader.readtext(img,)
print(ocr_result)
cv2.imshow('result', img)
k = cv2.waitKey(0)
cv2.destroyAllWindows()
Here is the code I wrote. It may be because of the long shooting distance, but enlarging the picture did not solve my problem. What should I do?
I was able successfully to read this image with tesseract by doing the following:
cropping out the pink border
reducing to grayscale (binarising)
running tesseract with --psm 8 (see this question )
I don't know if the cropping is necessary, but I couldn't get any output at all with any page segmentation mode before binarising.
I did the processing manually here, but you will likely want to automate it. A good trick for setting thresholds is to look at the standard deviation of the image in question and use that to scale your thresholds, rather than picking some absolute value and having it fail on you.
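A minimal sketch of that trick (my addition — gray is assumed to be your grayscale array, and the 0.5 multiplier is a tuning constant, not a universal value):

import numpy as np
# Scale the binarisation threshold by the image's own spread instead of
# hard-coding an absolute cut-off.
thresh = gray.mean() - 0.5 * gray.std()
binary = ((gray > thresh) * 255).astype(np.uint8)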
Here's the image I got working:
And the run:
$ tesseract img3.png img3 --psm 8 txt
Tesseract Open Source OCR Engine v4.1.1 with Leptonica
$ cat img3.txt
47 F02 43
I've not tried with pytesseract, but you should be able to set the same thing.
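Something like this should pass the same flag (untested — but --psm is a standard tesseract option that pytesseract forwards through its config parameter):

import pytesseract
text = pytesseract.image_to_string(binarised_img, config='--psm 8')  # binarised_img: your preprocessed image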
EasyOCR was able to read the image immediately, albeit inaccurately, when I tried it with the web service.
Update: grayscaling
This is a whole subject in itself. You might want to start with this tutorial from the opencv docs. There are basically two approaches: trying properly to binarise the image (convert it to two-colour pixels, on or off) and just grayscaling it. Somewhere in between is 'posterising', where you reduce the number of tones (binarising is a special case of posterising, where the number of tones is 2).
I normally handle grayscaling with the inbuilt function in PIL (pillow); I've had good results with a quick-and-dirty sort-of binarisation algorithm where I first normalise the brightness and contrast of an image and then apply a skewing function like
def filter_point(point: int) -> int:
    if point < THRESH:
        return round(point / 1.2)
    else:
        return round(point * 2)
This drives most pixels to fully white/black but leaves some intermediate values in place. It's a poor solution in that it depends on three magic numbers (THRESH, 1.2 and 2), but in my application (preparing scanned PDFs for human reading) I got better results than with automated thresholding or posterisation.
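Applied with PIL it might look like this (a sketch — THRESH and the filenames are placeholders, and the min() clamp is mine, since the ×2 branch can exceed 255):

from PIL import Image, ImageOps

THRESH = 128  # one of the three magic numbers; tune per scan
im = ImageOps.autocontrast(Image.open('scan.png').convert('L'))  # normalise brightness/contrast
im = im.point(lambda p: min(255, filter_point(p)))               # apply the skew, clamped to 8 bits
im.save('binarised.png')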
Thus, sadly, the answer is going to be 'play with it'. I'd suggest you start out with an image editor and see what the bare minimum is that you can do to the image to get tesseract to work: perhaps just grayscaling (which you do earlier in the code anyway) will be enough; do you need to crop it, etc.? Not drawing that pink box is going to help. I provided a very crude example filter to demonstrate that pixels are just numbers and you can do your image processing that way, but you are much better off using built-in methods if you possibly can.
import os
os.environ["KMP_DUPLICATE_LIB_OK"]="TRUE"
import cv2 as cv
import numpy as np
import easyocr
img = cv.imread('result.png',0)
th2 = cv.adaptiveThreshold(img, 255, cv.ADAPTIVE_THRESH_MEAN_C, cv.THRESH_BINARY, 29, 10)
cv.imshow("ADAPTIVE_THRESH_MEAN_C", th2)
cv.waitKey(0)
cv.destroyAllWindows()
reader = easyocr.Reader(['tr'])
ocr_result = reader.readtext(th2,)
print(ocr_result)
It worked like this.
Image before OCR:
Result:
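For reference, cv.adaptiveThreshold with ADAPTIVE_THRESH_MEAN_C computes a separate threshold for every pixel: the mean of its 29×29 neighbourhood (the blockSize argument, which must be odd) minus the constant 10. That per-pixel threshold is what lets it cope with uneven lighting far better than a single global threshold.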
I've been trying to get some masking to work on my pictures, but I guess there has to be an easier way to do this:
a) I have a BW picture (photo) showing numbers from a display, "test.png" (1000x300 px)
b) I want to copy (only) the black pixels and paste them into the same image
c) When pasting, I want the paste to be offset from its original place by 20px (both x/y)
I try running the code below but get an error:
import cv2
test = Image.open('test.png')
np = Image.new('1', (1000, 300), 255)
mask = np.bitwise_and(test, np.roll(test, 20, (0,1)))
mask.save('mask.png')
I get AttributeError: 'Image' object has no attribute 'bitwise_and'
The AttributeError occurs because you named your new PIL Image np, so np is an Image rather than the NumPy module. If image is a logical (0/1) NumPy array, then:
res=np.bitwise_and(image, np.roll(image, 20, (0,1)))
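A complete version of that idea, converting the PIL image to a NumPy array first (a sketch using the filenames from your question; the 128 cut-off is an assumption):

from PIL import Image
import numpy as np

im = np.array(Image.open('test.png').convert('L')) > 128  # logical image: True = white
shifted = np.roll(im, 20, axis=(0, 1))                    # shift 20 px down and right
# AND keeps white only where both copies are white, so the result is black
# wherever either the original or the shifted copy is black - i.e. the
# black pixels are duplicated at a 20 px offset.
mask = np.bitwise_and(im, shifted)
Image.fromarray(mask).save('mask.png')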
You can do it like this:
from PIL import Image
import numpy as np
# Load image and make Numpy array
im = Image.open('image.png').convert('L')
na = np.array(im)
# Get x,y coordinates of black pixels
Y, X = np.where(na==0)
# Make pixels 30 across and 30 down from them black
na[Y+30, X+30] = 0
# Convert back to PIL Image and save
Image.fromarray(na).save('result.png')
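One caveat (my addition, not part of the original answer): Y+30 and X+30 can run past the bottom or right edge and raise an IndexError, so you may want to keep only the in-bounds coordinates first:

keep = (Y + 30 < na.shape[0]) & (X + 30 < na.shape[1])
na[Y[keep] + 30, X[keep] + 30] = 0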
I am trying to draw a shape and then check whether a point is inside it or not. I thought using cv2.polylines() to draw the shape and cv2.pointPolygonTest() to test should work, but I'm getting an error which is not very informative.
Traceback (most recent call last):
  File "C:\Users\XXX\Desktop\Heatmap\cvtest.py", line 32, in <module>
    dist = cv2.pointPolygonTest(cv2.polylines(img,[pts],True,(0, 255, 0), 2), (52,288), False)
cv2.error: OpenCV(4.1.0) C:\projects\opencv-python\opencv\modules\imgproc\src\geometry.cpp:103: error: (-215:Assertion failed) total >= 0 && (depth == CV_32S || depth == CV_32F) in function 'cv::pointPolygonTest'
I am guessing the shape created with cv2.polylines() is not a contour. What would be the correct way to do it, then? My current code:
import cv2
import numpy as np
img = cv2.imread('image.png')
pts = np.array([[18,306],[50,268],[79,294],[165,328],[253,294],[281,268],[313,306],[281,334],[270,341],[251,351],[230,360],[200,368],[165,371],[130,368],[100,360],[79,351],[50,334],[35,323]], np.int32)
pts = pts.reshape((-1,1,2))
dist = cv2.pointPolygonTest(cv2.polylines(img,[pts],True,(0, 255 , 0), 2), (52,288), False)
#print(dist)
cv2.imshow('test', img)
cv2.waitKey()
cv2.destroyAllWindows()
polylines is not the right input; it is used to draw a shape onto an image (docs).
pointPolygonTest instead needs the contour points themselves as input (docs).
dist = cv2.pointPolygonTest(pts, (52,288), False) will return 1.0, meaning inside the contour.
Note that you can perform a pointPolygonTest without an image. But if you want to draw the results, you can use this code as a starter:
import cv2
import numpy as np
#create background
img = np.zeros((400,400),dtype=np.uint8)
# define shape
pts = np.array([[18,306],[50,268],[79,294],[165,328],[253,294],[281,268],[313,306],[281,334],[270,341],[251,351],[230,360],[200,368],[165,371],[130,368],[100,360],[79,351],[50,334],[35,323]], np.int32)
pts = pts.reshape((-1,1,2))
# draw shape
cv2.polylines(img,[pts],True,(255), 2)
# draw point of interest
cv2.circle(img,(52,288),1,(127),3)
# perform pointPolygonTest
dist = cv2.pointPolygonTest(pts, (52,288), False)
print(dist)
# show image
cv2.imshow('test', img)
cv2.waitKey()
cv2.destroyAllWindows()
Result:
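As a side note, passing True as the last argument makes pointPolygonTest return the signed distance to the nearest contour edge (positive inside, negative outside) instead of just +1/0/-1:

dist = cv2.pointPolygonTest(pts, (52, 288), True)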
I would like to know how to write a function in Python 3 using OpenCV which takes in an image and a threshold and returns either 'dark' or 'light' after heavily blurring it and reducing quality (the faster the better). This might sound vague, but anything that just works will do.
You could try this :
import imageio
import numpy as np
f = imageio.imread(filename, as_gray=True)
def img_estim(img, thrshld):
    is_light = np.mean(img) > thrshld
    return 'light' if is_light else 'dark'
print(img_estim(f, 127))
Personally, I would not bother writing any Python, or loading up OpenCV for such a simple operation. If you absolutely have to use Python, please just disregard this answer and select a different one.
You can just use ImageMagick at the command-line in your Terminal to get the mean brightness of an image as a percentage, where 100 means "fully white" and 0 means "fully black", like this:
convert someImage.jpg -format "%[fx:int(mean*100)]" info:
Alternatively, you can use libvips which is less common, but very fast and very lightweight:
vips avg someImage.png
The vips answer is on a scale of 0..255 for 8-bit images.
Note that both these methods will work for many image types, from PNG, through GIF, JPEG and TIFF.
However, if you really want Python/OpenCV code, I note none of the existing answers do that - some use a different library, some are incomplete, some do superfluous blurring and some read video cameras for some unknown reason, and none handle more than one image or errors. So, here's what you actually asked for:
#!/usr/bin/env python3
import cv2
import sys
import numpy as np
# Iterate over all arguments supplied
for filename in sys.argv[1:]:
    # Load image as greyscale
    im = cv2.imread(filename, cv2.IMREAD_GRAYSCALE)
    if im is None:
        print(f'ERROR: Unable to load {filename}')
        continue
    # Calculate mean brightness as percentage
    meanpercent = np.mean(im) * 100 / 255
    classification = "dark" if meanpercent < 50 else "light"
    print(f'{filename}: {classification} ({meanpercent:.1f}%)')
Sample Output
OpenCVBrightOrDark.py g*png nonexistant
g30.png: dark (30.2%)
g80.png: light (80.0%)
ERROR: Unable to load nonexistant
You could try this, considering image is a grayscale image -
def brightness_label(image):
    blur = cv2.blur(image, (5, 5))  # kernel size depending upon image size
    # cv2.mean() returns a 4-tuple, so take the first (grayscale) channel;
    # a pixel's value in grayscale ranges over 0-255, and 127 lies midway
    if cv2.mean(blur)[0] > 127:
        return 'light'  # (127 - 255) denotes a light image
    else:
        return 'dark'   # (0 - 127) denotes a dark image
Refer to these -
Smoothing, Mean, Thresholding
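Usage would look like this (filename hypothetical):

import cv2
img = cv2.imread('photo.jpg', cv2.IMREAD_GRAYSCALE)
print(brightness_label(img))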
import numpy as np
import cv2
def algo_findDark(image):
    blur = cv2.blur(image, (5, 5))
    mean = np.mean(blur)
    if mean > 85:
        return 'light'
    else:
        return 'dark'

cam = cv2.VideoCapture(0)

while True:
    check, frame = cam.read()
    frame_gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    ans = algo_findDark(frame_gray)
    font = cv2.FONT_HERSHEY_SIMPLEX
    cv2.putText(frame, ans, (10, 450), font, 3, (0, 0, 255), 2, cv2.LINE_AA)
    cv2.imshow('video', frame)
    key = cv2.waitKey(1)
    if key == 27:
        break

cam.release()
cv2.destroyAllWindows()
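Note that check is never inspected: if the camera fails to deliver a frame, cv2.cvtColor will throw. A minimal guard (my addition) right after cam.read() avoids that:

if not check:  # no frame grabbed
    break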
I am working on lane lines detection. My current working strategy is:
defining a region of interest where lane lines could be
Warping the image to get a bird eye view
Converting the image to YUV color space
Normalizing the Y channel
Fitting the second order polynomial and sliding window approach
Everything works fine, but where there are shadows the algorithm does not work.
I have tried adaptive thresholding and Otsu thresholding, but without success.
Source Image without Shadow:
Processed Source Image without Shadow:
Source Image with Shadow:
Processed Source Image with Shadow:
In the second image you can see that the shadowed area is not detected. Shadows pull the image values down, so I tried thresholding the image with lower values than before; the new image can be found here:
This technique does not work, as it introduces a lot of noise.
Currently I am trying background subtraction and shadow removal techniques, but they are not working either. I have been stuck on this problem for the last two to three weeks.
Any help will really be appreciated...
import cv2
import matplotlib.pyplot as plt
import numpy as np
from helper_functions import undistort, threshholding, unwarp, sliding_window_polyfit
from helper_functions import polyfit_using_prev_fit, calc_curv_rad_and_center_dist, increase_brightness_img, contrast_img
from Lane_Lines_Finding import RoI
img = cv2.imread('./test_images/new_test.jpg')
new =undistort(img)
new = cv2.cvtColor(new, cv2.COLOR_RGB2BGR)
#new = threshholding(new)
h,w = new.shape[:2]
# define source and destination points for transform
imshape = img.shape
vertices = np.array([[(257, 670),
                      (590, 446),
                      (722, 440),
                      (1150, 650)]], dtype=np.int32)
p1 = (170, 670)
p2 = (472, 475)
p3 = (745, 466)
p4 = (1050, 650)
vertices = np.array([[p1,
                      p2,
                      p3,
                      p4]], dtype=np.int32)
masked_edges = RoI(new, vertices)
#masked_edges = cv2.cvtColor(masked_edges, cv2.COLOR_RGB2BGR)
src = np.float32([(575, 464),
                  (707, 464),
                  (258, 682),
                  (1049, 682)])
dst = np.float32([(450, 0),
                  (w - 450, 0),
                  (450, h),
                  (w - 450, h)])
warp_img, M, Minv = unwarp(masked_edges, src, dst)
warp_img = increase_brightness_img(warp_img)
warp_img = contrast_img(warp_img)
YUV = cv2.cvtColor(warp_img, cv2.COLOR_RGB2YUV)
Y, U, V = cv2.split(YUV)
Y_equalized = cv2.equalizeHist(Y)
YUV = cv2.merge((Y, U, V))
thresh_min = 253
thresh_max = 255
binary = np.zeros_like(Y)
binary[(Y_equalized >= thresh_min) & (Y_equalized <= thresh_max)] = 1
kernel_opening = np.ones((3, 3), np.uint8)
opening = cv2.morphologyEx(binary, cv2.MORPH_OPEN, kernel_opening)
kernel = np.ones((7, 7), np.uint8)
dilation = cv2.dilate(opening, kernel, iterations=3)