The problem
I'm working with a camera that posts a snapshot to the web every 5 seconds or so. The camera is monitoring a line of people. I'd like my script to be able to tell me how long the line of people is.
What I've tried
At first, I thought I could do this using BackgroundSubtractorMOG, but this is just producing a black image. Here's my code for that, modified to use an image instead of a video capture:
import numpy as np
import cv2
frame = cv2.imread('sample.jpg')
fgbg = cv2.BackgroundSubtractorMOG()
fgmask = fgbg.apply(frame)
cv2.imshow('frame', fgmask)
cv2.waitKey()
Next, I looked at foreground extraction on an image, but this is interactive and doesn't suit my use case of needing the script to tell me how long the line of people is.
I also tried to use peopledetect.py, but since the image of the line is from an elevated position, that script doesn't detect any people.
I'm brand new to OpenCV, so any help is greatly appreciated. I can supply any additional details upon request.
Note:
I'm not so much looking for someone to solve the overall problem, as I am just trying to figure out a way to separate out the people from the background. However, I am open to approaching the problem a different way if you think you have a better solution.
EDIT: Here's a sample image as requested:
I figured it out! #QED helped me get there. Basically, you can't do this with just one image. You need AT LEAST 2 frames to compare so the algorithm can tell what's different (foreground) and what's the same (background). So I took 2 frames and looped through them to "train" the algorithm. Here's my code:
import numpy as np
import cv2

fgbg = cv2.BackgroundSubtractorMOG()

# "Train" the subtractor on two consecutive frames
for i in range(1, 3):
    print('img%d.jpg' % i)
    frame = cv2.imread('img%d.jpg' % i)
    fgmask = fgbg.apply(frame)
    cv2.imshow('frame', fgmask)
    k = cv2.waitKey(30) & 0xff
    if k == 27:
        break

cv2.destroyAllWindows()
And here's the result from 2 consecutive images!
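For what it's worth, cv2.BackgroundSubtractorMOG is the OpenCV 2.4 API; in OpenCV 3+ the equivalent is cv2.createBackgroundSubtractorMOG2() (or cv2.bgsegm.createBackgroundSubtractorMOG() from opencv-contrib). The two-frame idea itself can be illustrated without OpenCV at all, with plain frame differencing in NumPy - a minimal sketch on synthetic frames, with an arbitrary threshold:

```python
import numpy as np

def foreground_mask(background, frame, threshold=25):
    """Pixels that differ from the background by more than
    `threshold` (in any channel) are marked as foreground (255)."""
    diff = np.abs(frame.astype(np.int16) - background.astype(np.int16))
    return (diff.max(axis=-1) > threshold).astype(np.uint8) * 255

# Synthetic example: a flat grey background and a frame where a bright
# square (a stand-in for a person) has appeared
background = np.full((100, 100, 3), 128, dtype=np.uint8)
frame = background.copy()
frame[40:60, 40:60] = 255

mask = foreground_mask(background, frame)
print(mask[50, 50], mask[0, 0])  # 255 0
```

Anything the mask marks white differs from the background frame; counting the white blobs (e.g. with cv2.connectedComponents) would be one rough way to estimate how many people are in the line.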
Related
I'm trying to make this OpenCV Python video-capture script stream from multiple cameras (my laptop cam and USB cams). I succeeded (with help from YouTube), but every time I add a camera I have to edit the code: another VideoCapture line, another frame, and another cv2.imshow. I want to rewrite the capture code so that it streams from as many cameras as are detected, using a loop, without adding a line for every new camera. I'm obviously new here, so accept my apologies if the solution is too simple.
This is the code that lets me stream multiple cameras, though it needs an extra line for each camera.
import urllib.request
import time
import numpy as np
import cv2

video_capture_0 = cv2.VideoCapture(0)
video_capture_1 = cv2.VideoCapture(1)

while True:
    # Capture frame-by-frame
    ret0, frame0 = video_capture_0.read()
    ret1, frame1 = video_capture_1.read()

    if ret0:
        # Display the resulting frame
        cv2.imshow('Cam 0', frame0)

    if ret1:
        # Display the resulting frame
        cv2.imshow('Cam 1', frame1)

    if cv2.waitKey(1) & 0xFF == ord('q'):
        break

# When everything is done, release the captures
video_capture_0.release()
video_capture_1.release()
cv2.destroyAllWindows()
I tried making a list (camlist = [i for i in range(100)]) and feeding it to a for loop that creates a VideoCapture for each index, but it turned into a mess, so I deleted the code; it didn't seem effective anyway.
If you want to work with many cameras, first keep them in a list; then you can use a for loop to read a frame from each one. Keep the frames in a list as well, so you can use another for loop to display them, and finally a for loop to release the cameras.
import cv2

#video_captures = [cv2.VideoCapture(x) for x in range(2)]
video_captures = [
    cv2.VideoCapture(0),
    #cv2.VideoCapture(1),
    cv2.VideoCapture('https://imageserver.webcamera.pl/rec/krupowki-srodek/latest.mp4'),
    cv2.VideoCapture('https://imageserver.webcamera.pl/rec/krakow4/latest.mp4'),
    cv2.VideoCapture('https://imageserver.webcamera.pl/rec/warszawa/latest.mp4'),
]

while True:
    results = []
    for cap in video_captures:
        ret, frame = cap.read()
        results.append([ret, frame])

    for number, (ret, frame) in enumerate(results):
        if ret:
            cv2.imshow(f'Cam {number}', frame)

    if cv2.waitKey(1) & 0xFF == ord('q'):
        break

for cap in video_captures:
    cap.release()

cv2.destroyAllWindows()
And now, even if you add a new camera to the list, you don't have to change the rest of the code.
But with many cameras it is better to run each camera in a separate thread - while still keeping everything in lists.
A few days ago there was a related question: How display multi videos with threading using tkinter in python?
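The per-camera-thread idea can be sketched with a small wrapper class that keeps only the latest frame. This is a sketch, not tied to any particular capture API: CameraThread is an illustrative name, and any object with read() and release() methods works, including cv2.VideoCapture.

```python
import threading

class CameraThread:
    """Continuously reads frames from one capture object in a background
    thread, keeping only the most recent frame."""

    def __init__(self, cap):
        self.cap = cap
        self.frame = None
        self.running = True
        self._lock = threading.Lock()
        self._thread = threading.Thread(target=self._update, daemon=True)
        self._thread.start()

    def _update(self):
        while self.running:
            ret, frame = self.cap.read()
            if ret:
                with self._lock:
                    self.frame = frame

    def read(self):
        """Return the latest frame (or None if nothing arrived yet)."""
        with self._lock:
            return self.frame

    def stop(self):
        self.running = False
        self._thread.join()
        self.cap.release()

# Usage with OpenCV (not run here):
# import cv2
# cameras = [CameraThread(cv2.VideoCapture(i)) for i in range(2)]
# ...call cam.read() and cv2.imshow in the main thread...
# for cam in cameras:
#     cam.stop()
```

Keeping cv2.imshow in the main thread matters: OpenCV's GUI functions are generally not safe to call from worker threads.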
I'm challenging myself to automate some things in a game called Pokemon TCG Online.
Since I don't know anything about reverse engineering, I'm trying to use computer vision to identify objects and perform tasks.
The GUI of the game is always the same, so I don't have to deal with color variance and similar issues. My first thought was to use template matching, but I'm having a problem with false positives.
The other two alternatives I found were a Haar cascade (I found a "bot" for another game that uses one) or training a neural network to recognize every element.
Before I go deep into one approach, I would like to find the best way, to avoid wasting time on something that won't work. Also, I don't want to use a sledgehammer to crack a nut, so I'm looking for a simple and elegant way to do it.
My first approach was Python and OpenCV, since both are simple to use, but I'm open to other tools. I know how to use YOLO in Python, but I have only managed to install it on Linux, and the game runs on Windows.
Thank you very much.
The code I'm using:
import cv2
import numpy as np
import pyautogui
from PIL import ImageGrab

fourcc = cv2.VideoWriter_fourcc('X', 'V', 'I', 'D')  # you can use other codecs as well
vid = cv2.VideoWriter('record.avi', fourcc, 8, (1440, 900))
jogar = cv2.imread("jogar.png", 0)

while True:
    img = ImageGrab.grab(bbox=(0, 0, 1000, 1000))  # x, y, w, h
    img_np = np.array(img)
    img_npGray = cv2.cvtColor(img_np, cv2.COLOR_BGR2GRAY)
    #frame = cv2.cvtColor(img_np, cv2.COLOR_BGR2GRAY)
    vid.write(img_np)
    cv2.imshow("frame", img_npGray)
    res = cv2.matchTemplate(img_npGray, jogar, cv2.TM_SQDIFF)
    threshold = 0.9
    loc = np.where(res >= threshold)
    # pyautogui.moveTo(loc)
    print(loc)
    key = cv2.waitKey(1)
    if key == 27:
        break

vid.release()
cv2.destroyAllWindows()
I said in my comment that the tutorials in the official docs are great. And they are. But you have to do some searching for the sample images. Many of them are here, including the Messi picture used for the template matching tutorial.
This code works. With TM_SQDIFF, the best match is found as a minimum, so you probably want the single best match from cv2.minMaxLoc rather than a threshold.
import cv2
import numpy as np
screenshot = cv2.imread("screenshot.png", 0)
template = cv2.imread("template.png", 0)
res = cv2.matchTemplate(screenshot, template, cv2.TM_SQDIFF)
# threshold = 0.1
# loc = np.where (res >= threshold)
min_val, max_val, min_loc, max_loc = cv2.minMaxLoc(res)
print(min_loc)
which gives
(389, 412)
Screenshot:
Template
I'm trying to write a program that cuts my gameplay clips for me. In order to achieve what I want I need to detect a red "X" (indicates a kill) in the middle of the frame, which is a 12px * 12px region, so I need full quality. While I was debugging I realized that the frames read by OpenCV seemed lower quality than what I see in a video player.
Here is an example:
This is the "X" I've cut from a video player
This is the "X" that OpenCV read in
Both are from the same frame.
Here is the code I was debugging with:
import cv2
import numpy as np

vid = cv2.VideoCapture(path)
success, frame = vid.read()

while success:
    cv2.imshow("Frame", frame)
    # Pause video
    if cv2.waitKey(1) & 0xFF == ord('s'):
        cv2.waitKey(0)
    # Quit video
    if cv2.waitKey(1) & 0xFF == ord('q'):
        break
    success, frame = vid.read()

cv2.destroyAllWindows()
I've also checked the width and the height and they seem to be the same as the original video. I'm using OpenCV version 4.2.0. What could cause this issue?
Any help is appreciated.
If you are reading frames from a file and want to change their resolution, you probably want the cv2.resize method as described here. This needs to be done inside the loop, right after you read in the frame. For example:

b = cv2.resize(frame, (1280, 720), fx=0, fy=0, interpolation=cv2.INTER_CUBIC)
I hope this helps.
I figured it out. The only thing I had to do was convert the picture to JPEG (for some reason). I used this function to do the conversion in memory:

def convertToJpeg(img):
    result, encoded = cv2.imencode('.jpg', img, [cv2.IMWRITE_JPEG_QUALITY, 100])
    return cv2.imdecode(encoded, 1)
Thanks for all the help!
Disclaimer : I have no experience in computer vision or image processing, but I need to process videos to obtain data for machine learning.
I wish to read a greyscale movie (I made it using greyscale images) - frame by frame using moviepy. For further processing, I need greyscale frames. Here is my code:
from moviepy.editor import VideoFileClip

clip = VideoFileClip('movie.mp4')
count = 1
for frames in clip.iter_frames():
    print(frames.shape)
    count += 1
print(count)
The frame shapes come out to be (360L, 480L, 3L) while I was expecting (360L, 480L). And this puzzles me. Is there a way to get the "expected" shape? Python OpenCV ideas may work too, but I would prefer moviepy.
If you are dealing with videos and images, OpenCV is your friend:
import cv2
from moviepy.editor import VideoFileClip

clip = VideoFileClip('movie.mp4')
count = 1
for frames in clip.iter_frames():
    gray_frames = cv2.cvtColor(frames, cv2.COLOR_RGB2GRAY)
    print(frames.shape)
    print(gray_frames.shape)
    count += 1
print(count)
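If you would rather stay within moviepy/NumPy and skip OpenCV, the standard BT.601 luma weights give an equivalent single-channel frame. This is a sketch; it assumes frames arrive as RGB, which is what iter_frames yields:

```python
import numpy as np

def to_grayscale(frame):
    """Convert an (H, W, 3) RGB frame to (H, W) using ITU-R BT.601 luma weights."""
    return np.dot(frame[..., :3], [0.299, 0.587, 0.114]).astype(np.uint8)

frame = np.full((360, 480, 3), 200, dtype=np.uint8)  # a flat grey RGB frame
gray = to_grayscale(frame)
print(gray.shape)  # (360, 480)
```

These are the same weights cv2.cvtColor uses, so for machine-learning preprocessing the two approaches should be interchangeable up to rounding.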
We're doing a project in school where we need to do basic image processing. Our goal is to use every video frame from the Raspberry Pi camera and do real-time image processing.
We've tried to invoke raspistill from our Python program, but so far nothing has worked. The goal of our project is to design an RC car which follows a blue/red/whatever coloured line with the help of image processing.
We thought it would be a good idea to write a Python program which does all the necessary image processing, but we currently struggle with getting the captured images into the Python program. Is there a way to do this with picamera, or should we try a different way?
For anyone curious, this is how our program currently looks
while True:
    #camera = picamera.PiCamera()
    #camera.capture('image1.jpg')
    img = cv2.imread('image1.jpg')
    width = img.shape[1]
    height = img.shape[0]
    height = height - 1
    for x in range(0, width):
        if x >= 0 and x < (width // 2):
            blue = img.item(height, x, 0)
            green = img.item(height, x, 1)
            red = img.item(height, x, 2)
            if red > green and red > blue:
OpenCV already contains functions to process live camera data.
This OpenCV documentation provides a simple example:
import numpy as np
import cv2

cap = cv2.VideoCapture(0)

while True:
    # Capture frame-by-frame
    ret, frame = cap.read()

    # Our operations on the frame come here
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)

    # Display the resulting frame
    cv2.imshow('frame', gray)
    if cv2.waitKey(1) & 0xFF == ord('q'):
        break

# When everything is done, release the capture
cap.release()
cv2.destroyAllWindows()
Of course, you do not want to show the image; all your processing can be done in that loop.
Remember to sleep a few hundred milliseconds each iteration so the Pi does not overheat as much.
Edit:
"how exactly would I go about it though. I used "img = cv2.imread('image1.jpg')" all the time. What do I need to use instead to get the "img" variable right here? What do I use? And what is ret, for? :)"
ret indicates whether the read was successful; exit the program if it is not.
The read frame is nothing other than your img = cv2.imread('image1.jpg'), so your detection code should work exactly the same.
The only difference is that your image does not need to be saved and reopened. Also, for debugging purposes, you can save the recorded image, like:
import cv2, time

cap = cv2.VideoCapture(0)
ret, frame = cap.read()
if ret:
    # imwrite needs a file extension to pick the output format
    cv2.imwrite(time.strftime("%Y%m%d-%H%M%S") + ".jpg", frame)
cap.release()
You can use picamera to acquire images.
To make it "real time", you can acquire data every X milliseconds. You need to choose X depending on the power of your hardware (and the complexity of the OpenCV algorithm).
Here's an example (from http://picamera.readthedocs.io/en/release-1.10/api_camera.html#picamera.camera.PiCamera.capture_continuous) of how to capture 60 images, one per second, using picamera:
import time
import picamera

with picamera.PiCamera() as camera:
    camera.start_preview()
    try:
        for i, filename in enumerate(
                camera.capture_continuous('image{counter:02d}.jpg')):
            print(filename)
            time.sleep(1)
            if i == 59:
                break
    finally:
        camera.stop_preview()
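The "acquire data every X milliseconds" pacing mentioned above can be sketched in pure Python. run_paced and process are illustrative names, and process stands in for whatever capture-plus-OpenCV step you actually run:

```python
import time

def run_paced(process, interval_ms, iterations):
    """Call process() roughly once every interval_ms milliseconds."""
    period = interval_ms / 1000.0
    for _ in range(iterations):
        start = time.monotonic()
        process()
        # Sleep off whatever time remains in this period
        elapsed = time.monotonic() - start
        if elapsed < period:
            time.sleep(period - elapsed)

calls = []
run_paced(lambda: calls.append(time.monotonic()), 50, 5)
print(len(calls))  # 5
```

If process() ever takes longer than the period, this loop simply runs flat out; a real implementation might log the overrun or drop frames instead.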