I'm a first-year CS student and I know only a little Python. For a project, I need to use OpenCV to detect several traffic signs. After searching the web a little, I decided to use a Haar cascade classifier. I followed this tutorial: haar-cascade
I trained it on this sign: left-sign
Everything was fine up to this point. However, my code (trained with 3000 positive and 1500 negative JPGs, finished at 8 stages) detects both right and left signs. It needs to recognize them separately, because my aim is to command my robot to turn left or turn right.
Here is my code:
import numpy as np
import cv2

ok_cascade = cv2.CascadeClassifier('new_kocum.xml')
cap = cv2.VideoCapture(0)

while 1:
    ret, img = cap.read()
    gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
    oks = ok_cascade.detectMultiScale(gray, 3, 5)
    for (x, y, w, h) in oks:
        cv2.rectangle(img, (x, y), (x + w, y + h), (255, 255, 0), 2)
        font = cv2.FONT_HERSHEY_SIMPLEX
        cv2.putText(img, 'ok', (x - w, y - h), font, 0.5, (11, 255, 255), 2, cv2.LINE_AA)
    cv2.imshow('img', img)
    k = cv2.waitKey(30) & 0xff
    if k == 27:
        break
cap.release()
cv2.destroyAllWindows()
Here is the right sign: right-sign
So my question: is it possible to fix this just by changing the code? If it is easier, which method should I use to detect these signs?
The correct way: add a lot of right-sign images to the negative set.
The best way: don't use a Haar cascade.
The simplest way: train a second classifier (for example, naive Bayes) to tell left and right signs apart after your cascade has run. Features: correlation between images, Hu moments, etc.
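To make the Hu-moment idea concrete, here is a minimal sketch (my own illustration, not the exact method above): the seventh Hu moment flips its sign under mirror reflection, which is exactly the relationship between the left and right signs, so even this single feature can separate the two classes. The Otsu binarization and the sign convention are assumptions you would calibrate on real detections:

import cv2

def mirror_feature(roi_gray):
    # binarize the detected sign region, then compute the Hu moments
    _, bw = cv2.threshold(roi_gray, 0, 255, cv2.THRESH_BINARY | cv2.THRESH_OTSU)
    hu = cv2.HuMoments(cv2.moments(bw)).flatten()
    # hu[6] changes sign between an image and its mirror image
    return hu[6]

# inside the detection loop above, after the cascade fires:
# roi = gray[y:y + h, x:x + w]
# direction = 'left' if mirror_feature(roi) > 0 else 'right'  # calibrate the sign

Alternatively, feed the full Hu vector (plus a correlation score against each template) into a small classifier such as naive Bayes, as suggested above.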
I'm working on a project that requires detecting people, and due to the complexity of the system I decided to use movement detection.
I ran into some problems, and upon asking on Stack Overflow, this answer seemed the best.
So I implemented the algorithm in the following steps:
Compute saliency on the input video
Apply k-means clustering
Apply background subtraction
Apply morphological transformations
Here is the code:
import cv2
import time
import numpy as np

cap = cv2.VideoCapture(0)

# I wanted to try different background subtractors to get the best result.
fgbg = cv2.createBackgroundSubtractorMOG2()
fgbg1 = cv2.bgsegm.createBackgroundSubtractorMOG()

h = cap.get(4)  # frame height
w = cap.get(3)  # frame width
frameArea = h * w
areaTH = frameArea / 150

while cap.isOpened():
    # time.sleep(0.05)
    _, frame = cap.read()
    cv2.imshow("frame", frame)
    image = frame

    ################ Implementing saliency ########################
    saliency = cv2.saliency.StaticSaliencySpectralResidual_create()
    (success, saliencyMap) = saliency.computeSaliency(image)
    saliencyMap = (saliencyMap * 255).astype("uint8")
    # cv2.imshow("Image", image)
    # cv2.imshow("Output", saliencyMap)

    saliency = cv2.saliency.StaticSaliencyFineGrained_create()
    (success, saliencyMap) = saliency.computeSaliency(image)
    saliencyMap = (saliencyMap * 255).astype("uint8")
    threshMap = cv2.threshold(saliencyMap.astype("uint8"), 0, 255,
                              cv2.THRESH_BINARY | cv2.THRESH_OTSU)[1]
    # show the images
    # cv2.imshow("Image", image)
    # cv2.imshow("saliency", saliencyMap)
    # cv2.imshow("Thresh", threshMap)
    kouts = saliencyMap
    # cv2.imshow("kouts", kouts)

    ############## Implementing k-means clustering #######################
    clusters = 12
    # the saliency map is single-channel, so cluster pixel intensities
    z = kouts.reshape((-1, 1))
    # convert to np.float32
    z = np.float32(z)
    # define criteria and accuracy
    criteria = (cv2.TERM_CRITERIA_EPS + cv2.TERM_CRITERIA_MAX_ITER, 5, 1.0)
    # apply k-means
    ret, label, center = cv2.kmeans(z, clusters, None, criteria, 10,
                                    cv2.KMEANS_RANDOM_CENTERS)
    # convert the float32 centers back to uint8 and rebuild the image
    center = np.uint8(center)
    res = center[label.flatten()]
    kouts = res.reshape(kouts.shape)
    cv2.imshow('clustered image', kouts)

    ############ Applying background subtraction #######################
    fgmask = fgbg.apply(kouts)
    fgmask1 = fgbg1.apply(kouts)
    cv2.imshow('fg', fgmask)
    cv2.imshow('fgmask1', fgmask1)
    # as I said earlier, I wanted to find the best background subtractor

    ######################### Morphological transformations #####################
    # Below I tried various techniques to get the best possible result
    kernel = np.ones((5, 5), np.uint8)
    erosion = cv2.erode(fgmask1, kernel, iterations=1)
    cv2.imshow('erosion', erosion)
    dilation = cv2.dilate(fgmask1, kernel, iterations=1)
    cv2.imshow('dilation', dilation)
    gradient = cv2.morphologyEx(fgmask1, cv2.MORPH_GRADIENT, kernel)
    cv2.imshow("gradient", gradient)
    opening = cv2.morphologyEx(fgmask1, cv2.MORPH_OPEN, kernel)
    closing = cv2.morphologyEx(fgmask1, cv2.MORPH_CLOSE, kernel)
    cv2.imshow('opening', opening)
    cv2.imshow('closing', closing)

    ######### Detection of contours ##################
    # OpenCV 4.x returns (contours, hierarchy)
    contours0, hierarchy = cv2.findContours(erosion, cv2.RETR_EXTERNAL,
                                            cv2.CHAIN_APPROX_SIMPLE)
    for cnt in contours0:
        area = cv2.contourArea(cnt)
        if area > areaTH and area < frameArea * 0.50:
            M = cv2.moments(cnt)
            x, y, f, g = cv2.boundingRect(cnt)
            img = cv2.rectangle(frame, (x, y), (x + f, y + g), (0, 255, 0), 2)
    cv2.imshow('Original', frame)

    k = cv2.waitKey(1) & 0xff
    if k == ord('q'):
        break

cap.release()
cv2.destroyAllWindows()
I tried this algorithm on this video, but there was still a lot of noise in the output. I previously thought the problem might be the quality of the video, but when I used cv2.VideoCapture(0) the problem persisted: the code doesn't seem to remove the noise, and the situation I'm working in sometimes has a lot of noise.
Any suggestions, pointers to where I went wrong, or a different approach to the problem would be welcome.
Thanks in advance.
I spent some time seeing whether anything more could be done with noise reduction, but I believe you have already tried many of the known techniques in OpenCV. My opinion is to approach your problem using neural networks, as they will be more accurate at detecting the objects.
I created a Colab notebook to illustrate this:
https://colab.research.google.com/drive/1rBrcu46sfo0F7fsQf4BC9hKoXTk_wNBk?usp=sharing
Even with this simple approach it's possible to detect objects: persons and clothing. You could set a criterion that only considers the top 10 items, since a bus entrance limits how many people can enter at the same time.
This is not a final solution, because I am using a general-purpose detector. It can be improved for your application by training the network on your own video input. Labeling will be required, but I believe this will give you the most accurate results.
There is also the challenge of telling the people already inside the bus from the ones entering. For that you can track the rectangles. There is an excellent example using dlib: https://www.pyimagesearch.com/2018/10/22/object-tracking-with-dlib/
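As a rough sketch of that tracking idea (the function and variable names here are my own, not the tutorial's code), the dlib tracker is seeded with one detection box and then only updated on later frames:

import dlib

def track_box(frames_rgb, box):
    # frames_rgb: a sequence of RGB frames; box: (x1, y1, x2, y2) from a detector
    tracker = dlib.correlation_tracker()
    x1, y1, x2, y2 = box
    tracker.start_track(frames_rgb[0], dlib.rectangle(x1, y1, x2, y2))
    for frame in frames_rgb[1:]:
        tracker.update(frame)  # follow the person without re-running the detector
    pos = tracker.get_position()
    return int(pos.left()), int(pos.top()), int(pos.right()), int(pos.bottom())

Running the (expensive) detector only every N frames and letting the tracker fill the gaps is the usual way to keep this real-time; counting entries then becomes a matter of checking when a tracked box crosses the door line.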
I am trying to create a program that takes an image as input (I grab it with ImageGrab from PIL) and detects some known symbols in it, along with their locations. The good news is that I am pretty sure I don't need neural networks, because I know the exact shape and size of each symbol. The problem is that I have no idea how many of them there will be, nor what the background color behind each symbol is. Some of the symbols are numbers; I have an image of each digit 0-9, but there may be numbers of up to 3 digits. I think I will be able to work out which digits belong to the same number from their locations, but let's leave that for later. Right now, I have converted the image to grayscale and displayed it using cv2.
Do you have any idea how I can do this with OpenCV, or with some other library?
I also need it to be fast, hopefully 10 frames per second.
This is my current code (modified from sentdex's "Python plays GTA" code, at the very bottom of the page):
import numpy as np
from PIL import ImageGrab
import cv2

def screen_record():
    global printscreen
    while True:
        image = ImageGrab.grab(bbox=(20, 270, 430, 685))
        printscreen = np.array(image)
        grayscale_image = cv2.cvtColor(printscreen, cv2.COLOR_BGR2GRAY)
        cv2.imshow('window', grayscale_image)
        if cv2.waitKey(25) & 0xFF == ord('q'):
            cv2.destroyAllWindows()
            break
        if cv2.waitKey(25) & 0xFF == ord('w'):
            image.save("screen_shot.png")
            print("Saved current window as image")

screen_record()
EDIT: I managed to get somewhere with OpenCV's template matching, though only with the digit 2 for now. I found a nice tutorial here. My problem is when there is not exactly one match for the template, meaning either no number 2s or more than one. When there are none, it seems to pick something random in the image, and when there is more than one, only one of them is detected. Is it possible to apply it differently to match my needs?
So, I have a solution to my problem.
For all of those who reach this page in the future looking for help, here are the steps to recognize templates in images:
Create two images: the one in which you want to detect symbols, and your template.
Then load whichever images you need using OpenCV, and copy this function:
def locate_symbol(x, template):
    w, h = template.shape[::-1]  # template width and height
    res = cv2.matchTemplate(x, template, cv2.TM_SQDIFF_NORMED)
    min_val, max_val, min_loc, max_loc = cv2.minMaxLoc(res)
    min_thresh = 0.45
    match_locations = np.where(res <= min_thresh)
    return w, h, match_locations
and use these lines to draw bounding boxes on the image:
w, h, locs = locate_symbol(grayscale_image, filter_num2)
for (x, y) in zip(locs[1], locs[0]):
    cv2.rectangle(printable_image, (x, y), (x + w, y + h), [255, 0, 0], 2)
then you can display everything with cv2.imshow().
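One refinement, since np.where usually returns a cluster of near-duplicate hits around every true match: cv2.groupRectangles can merge them into one box per symbol. A minimal sketch using the names from the snippet above (the duplication trick keeps isolated matches alive at groupThreshold=1; eps is a guess to tune):

rects = [[x, y, w, h] for (x, y) in zip(locs[1], locs[0])]
rects += rects  # duplicate so a single true match survives groupThreshold=1
merged, weights = cv2.groupRectangles(rects, groupThreshold=1, eps=0.5)
for (x, y, rw, rh) in merged:
    cv2.rectangle(printable_image, (x, y), (x + rw, y + rh), [255, 0, 0], 2)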
I'm trying to extract some text from a video stream coming from my camera using OpenCV and pytesseract. I crop the frame to get a small image. I tried different kinds of image processing to make it work: I inverted the image values, blurred it, binarized it, but none of these works with tesseract. The data I want to extract has the form 'float/float'. Here is an example of the small image:
It seems the characters are not separated, and this is the maximum resolution I can get from my camera. I then tried to filter by color, but got no result, because it is video and the background is always moving.
I will try any suggested Python module that can do the job.
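For reference, a minimal sketch of the kind of whitelisted pytesseract call that fits a fixed 'float/float' readout; the file name, page-segmentation mode and whitelist are assumptions to tune:

import cv2
import pytesseract

crop = cv2.imread('readout.png')  # hypothetical: the small cropped image
gray = cv2.cvtColor(crop, cv2.COLOR_BGR2GRAY)
# treat the crop as one text line and only allow digits, '.' and '/'
config = '--psm 7 -c tessedit_char_whitelist=0123456789./'
print(pytesseract.image_to_string(gray, config=config))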
This is not as trivial as it seems. I generated a 32x32 PNG image for every character and added white noise to it. The background of the video is moving, and characters like 8 and 6 are not very different.
Here is my code for the moment:
import cv2
import time
import numpy as np

cap = cv2.VideoCapture("rtsp:...")
time.sleep(2)

# load one template per character
templates = {}
w = []
h = []
for i in range(0, 11):
    templates["template_" + str(i)] = cv2.imread(str(i) + '.bmp', 0)
    tmp_w, tmp_h = templates["template_" + str(i)].shape[::-1]
    w.append(tmp_w)
    h.append(tmp_h)

threshold = 0.70
while True:
    les_points = [[], [], [], [], [], [], [], [], [], [], []]
    ret, frame = cap.read()
    if frame is None:
        break
    crop_image = frame[38:70, 11:364]
    gray = cv2.cvtColor(crop_image, cv2.COLOR_BGR2GRAY)
    for i in range(0, 11):
        res = cv2.matchTemplate(gray, templates["template_" + str(i)],
                                cv2.TM_CCOEFF_NORMED)
        loc = np.where(res >= threshold)
        for pt in zip(*loc[::-1]):
            les_points[i].append(pt[0])
            cv2.rectangle(crop_image, pt, (pt[0] + w[i], pt[1] + h[i]),
                          (0, i * 10, 255), 2)
    print(les_points)
    cv2.imshow('normal', crop_image)
    # 'p' and 'm' adjust the matching threshold at runtime
    if cv2.waitKey(1) & 0xFF == ord('p'):
        threshold = threshold + 0.01
        print(threshold)
    if cv2.waitKey(1) & 0xFF == ord('m'):
        threshold = threshold - 0.01
        print(threshold)
    if cv2.waitKey(1) & 0xFF == ord('q'):
        break
I am also experimenting with splitting the image into cells of exactly the same size as the characters in the templates, but this is not giving good results either.
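For reference, a minimal sketch of that fixed-cell splitting, assuming every character occupies a 32x32 cell (the template size mentioned above) and using the gray crop from the loop:

# gray is the cropped grayscale frame from the loop above
CHAR_W, CHAR_H = 32, 32  # assumed cell size, matching the templates
cells = [gray[0:CHAR_H, x:x + CHAR_W]
         for x in range(0, gray.shape[1] - CHAR_W + 1, CHAR_W)]
# each cell can then be matched against the 11 templates individually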
I am trying to make a program capable of identifying a road in a scene, and I proceeded by using morphological filtering and the watershed algorithm. However, the program produces either mediocre or bad results. It seems to do okay (though not well enough) if the road takes up most of the scene. In other pictures, however, the sky gets segmented instead (the watershed runs along the clouds).
I tried to see whether I could perform more image processing to improve the results, but this is the best I have so far, and I don't know how to move forward.
How can I improve my program?
Code:
import numpy as np
import cv2
from matplotlib import pyplot as plt
import imutils
def invert_img(img):
    img = (255 - img)
    return img
#img = cv2.imread('images/coins_clustered.jpg')
img = cv2.imread('images/road_4.jpg')
img = imutils.resize(img, height = 300)
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
ret, thresh = cv2.threshold(gray,0,255,cv2.THRESH_BINARY_INV+cv2.THRESH_OTSU)
thresh = invert_img(thresh)
# noise removal
kernel = np.ones((3,3), np.uint8)
opening = cv2.morphologyEx(thresh,cv2.MORPH_OPEN,kernel, iterations = 4)
# sure background area
sure_bg = cv2.dilate(opening,kernel,iterations=3)
#sure_bg = cv2.morphologyEx(sure_bg, cv2.MORPH_TOPHAT, kernel)
# Finding sure foreground area
dist_transform = cv2.distanceTransform(opening, cv2.DIST_L2, 5)
ret, sure_fg = cv2.threshold(dist_transform,0.7*dist_transform.max(),255,0)
# Finding unknown region
sure_fg = np.uint8(sure_fg)
unknown = cv2.subtract(sure_bg,sure_fg)
# Marker labelling
ret, markers = cv2.connectedComponents(sure_fg)
# Add one to all labels so that sure background is not 0, but 1
markers = markers+1
# Now, mark the region of unknown with zero
markers[unknown==255] = 0
'''
imgray = cv2.cvtColor(img,cv2.COLOR_BGR2GRAY)
imgray = cv2.GaussianBlur(imgray, (5, 5), 0)
img = cv2.Canny(imgray,200,500)
'''
markers = cv2.watershed(img,markers)
img[markers == -1] = [255,0,0]
cv2.imshow('background',sure_bg)
cv2.imshow('foreground',sure_fg)
cv2.imshow('threshold',thresh)
cv2.imshow('result',img)
cv2.waitKey(0)
For a start, segmentation problems are hard. The more general you want the solution to be, the harder it gets. Road segmentation is a well-known problem, and I'm sure you can find many papers that tackle it from various directions.
Something that helps me get ideas for computer vision problems is to think about what makes something so easy for me to detect and so hard for the computer.
For example, let's look at the road in your images. What makes it unique compared to the background?
It has a distinct gray color.
It always has two shoulder lines in white.
It is always in the bottom section of the image.
It always has a separation line in the middle (yellow/white).
It is pretty smooth.
It is wider at the bottom, vanishing toward the horizon.
Now that we have found some unique features, we need ways to quantify them, so that they become as obvious to the algorithm as they are to us.
Work on the RGB (or even better, HSV) image; don't convert it to grayscale at the start and lose all the color data. Look for the gray area!
Again, look for white regions (inside the gray ones). You can try edge detection in the specific orientation of the shoulder lines. You are looking for lines that span about half the height of the image, etc.
Delete the upper half of the image. It is unlikely you will ever have road there, and you will get rid of a lot of noise in your algorithm.
See 2.
Check the local standard deviation, or some other smoothness feature.
If we find some shape, check whether it fits what we expect.
I know these are just ideas and I don't claim they are easy to implement, but if you want to improve your algorithm you must give it more "knowledge", just as you have.
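To make ideas 1 and 3 concrete, a minimal sketch (the saturation and value bounds are guesses that would need tuning per scene):

import cv2

img = cv2.imread('images/road_4.jpg')
bottom = img[img.shape[0] // 2:, :]  # idea 3: discard the upper half

hsv = cv2.cvtColor(bottom, cv2.COLOR_BGR2HSV)
# idea 1: "gray" pixels have low saturation and mid-range brightness
mask = cv2.inRange(hsv, (0, 0, 60), (180, 60, 220))
road_candidates = cv2.bitwise_and(bottom, bottom, mask=mask)
cv2.imshow('road candidates', road_candidates)
cv2.waitKey(0)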
Exploit some domain knowledge; in other words, make some simplifying assumptions. Even basic things like "the camera's not upside down" and "the pavement has a uniform hue" will improve the common case.
If you can treat crossroads as a special case, then finding the edges of the roadway may be a simpler and more useful task than finding the roadway itself.
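A hedged sketch of that edge-first idea, combining Canny with a probabilistic Hough transform (all thresholds below are rough guesses):

import cv2
import numpy as np

img = cv2.imread('images/road_4.jpg')
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
edges = cv2.Canny(gray, 100, 200)
lines = cv2.HoughLinesP(edges, 1, np.pi / 180, threshold=80,
                        minLineLength=60, maxLineGap=10)
if lines is not None:
    for x1, y1, x2, y2 in lines[:, 0]:
        cv2.line(img, (x1, y1), (x2, y2), (255, 0, 0), 2)  # candidate roadway edges
cv2.imshow('road edges', img)
cv2.waitKey(0)

Long, near-vertical lines converging toward a vanishing point are good roadway-edge candidates; crossroads would still need the special-case handling mentioned above.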
I need to perform edge detection on medical images using OpenCV and Python. Which edge detector will be best suited for my work? I have tried the Canny edge detector. I want to find the edges of the medical images and then compute histogram matching between two images.
Thanks in advance :)
Can you post the images you're working on? That would help.
Also, you can try this code. It lets you change the parameters of the Canny filter, threshold 1 and threshold 2, so you will get an overall idea of how to apply a Canny filter to an image.
import cv2
import numpy as np

def nothing(x):
    pass

# image window
cv2.namedWindow('image')

# load the image
img = cv2.imread('leo-messi-pic.jpg', 0)  # load your image with the proper path

# create trackbars for the two thresholds
cv2.createTrackbar('th1', 'image', 0, 255, nothing)
cv2.createTrackbar('th2', 'image', 0, 255, nothing)

while True:
    # get the current positions of the trackbars
    th1 = cv2.getTrackbarPos('th1', 'image')
    th2 = cv2.getTrackbarPos('th2', 'image')
    # apply Canny
    edges = cv2.Canny(img, th1, th2)
    # show the result
    cv2.imshow('image', edges)
    # press ESC to stop
    k = cv2.waitKey(1) & 0xFF
    if k == 27:
        break

cv2.destroyAllWindows()
As far as histogram comparison is concerned, you can find all the histogram-related cv2 APIs here:
http://docs.opencv.org/modules/imgproc/doc/histograms.html
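For the histogram-matching part, a minimal comparison sketch (the file names are placeholders):

import cv2

img1 = cv2.imread('image1.png', 0)  # placeholder paths
img2 = cv2.imread('image2.png', 0)

hist1 = cv2.calcHist([img1], [0], None, [256], [0, 256])
hist2 = cv2.calcHist([img2], [0], None, [256], [0, 256])
cv2.normalize(hist1, hist1)
cv2.normalize(hist2, hist2)

# with the correlation metric, 1.0 means identical histograms
print(cv2.compareHist(hist1, hist2, cv2.HISTCMP_CORREL))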
Hope it helps.