Face detection inside an MP4 or YUV video file in Linux? - python

I am stuck finding help for my next project. My use case is as follows:
1) Read frames from an MP4 file.
2) Detect faces inside the frames.
3) Store or display the final output.
The same use case should then be executed with a YUV420P (raw) video.
I am very new to the OpenCV platform, but I am quite familiar with GStreamer and Linux interface programming.
Please help me find any reference (example) for this.

Since you are familiar with GStreamer, there is an OpenCV-based facedetect element: https://gstreamer.freedesktop.org/data/doc/gstreamer/head/gst-plugins-bad-plugins/html/gst-plugins-bad-plugins-facedetect.html.
Reading and writing raw video data should then be trivial for you, I guess? ;-)
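A rough sketch of what that could look like from Python, assuming gst-plugins-bad was built with OpenCV support so the facedetect element is available (the file name and sink are placeholders):

# Rough sketch: decode an MP4, run GStreamer's OpenCV-based facedetect
# element on it and display the result.
import gi
gi.require_version('Gst', '1.0')
from gi.repository import Gst

Gst.init(None)
pipeline = Gst.parse_launch(
    'filesrc location=input.mp4 ! decodebin ! videoconvert ! '
    'facedetect ! videoconvert ! autovideosink'
)
pipeline.set_state(Gst.State.PLAYING)

# Block until an error or end-of-stream, then shut the pipeline down.
bus = pipeline.get_bus()
bus.timed_pop_filtered(Gst.CLOCK_TIME_NONE,
                       Gst.MessageType.ERROR | Gst.MessageType.EOS)
pipeline.set_state(Gst.State.NULL)

For the raw YUV420P case, something like rawvideoparse (configured with the correct width, height and format) in place of decodebin should do.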

There is a very similar post on OpenCV's forum. Basically you have to build OpenCV from source and enable the extra features for working with additional video formats. For that you need ffmpeg (which you will probably also have to build from source), GStreamer (if you want to use it) and FourCC support, which allows you to specify the codec you are using via CV_FOURCC(...) (the list of abbreviations can be found here).
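For the reading/detection part itself, a minimal sketch of the read-detect-display loop could look like this (assuming an OpenCV build that can open MP4 files; the file name is a placeholder, though haarcascade_frontalface_default.xml ships with OpenCV):

# Minimal sketch: read frames from an MP4, detect faces with a Haar cascade,
# draw boxes around them and display the result.
import cv2

face_cascade = cv2.CascadeClassifier(
    cv2.data.haarcascades + 'haarcascade_frontalface_default.xml')

cap = cv2.VideoCapture('input.mp4')
while True:
    ok, frame = cap.read()
    if not ok:
        break
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    for (x, y, w, h) in face_cascade.detectMultiScale(gray, 1.3, 5):
        cv2.rectangle(frame, (x, y), (x + w, y + h), (0, 255, 0), 2)
    cv2.imshow('faces', frame)
    if cv2.waitKey(1) & 0xFF == ord('q'):
        break

cap.release()
cv2.destroyAllWindows()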

Related

Python OpenCV image streaming without decoding and re-encoding?

Is it possible in Python with the latest cv2 to bind the MJPEG output from the camera directly to a stream, without having to do source -> cv2.read() -> numpy array -> cv2.imencode(".jpg") -> mjpeg? I am looking to do source -> mjpeg in a Pythonic way.
Latency is a major issue, so any advice, including options beyond cv2, would be appreciated.
No. OpenCV is not a media library. Its video I/O is not intended or made for this.
I would advise using PyAV, which is the only proper Python wrapper around ffmpeg's libraries that I know of. PyAV comes with a few examples to give you a feel for how it works.
The basic problem then is how to use ffmpeg's libraries to enumerate the available video devices, query their modes, select the one you want, and move packets around.
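A rough sketch of the packet-level route with PyAV, assuming a Linux V4L2 camera that can deliver MJPEG natively (the device path and options are assumptions):

# Rough sketch: ask the camera for MJPEG and pass the compressed packets
# straight through, with no decode/re-encode step in between.
import av

container = av.open('/dev/video0', format='v4l2',
                    options={'input_format': 'mjpeg',
                             'video_size': '1280x720',
                             'framerate': '30'})
stream = container.streams.video[0]

for packet in container.demux(stream):
    if packet.dts is None:        # skip the flushing packets demux emits at EOF
        continue
    jpeg_bytes = bytes(packet)    # one complete JPEG frame, still compressed
    # hand jpeg_bytes to your MJPEG-over-HTTP writer / socket here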

Capturing and manipulating a webcam feed and exposing it as a "virtual webcam" - in Python, on Windows

The final goal would be to capture the regular webcam feed, manipulate it in some way (blur face, replace background, ...) and then output the result in some way so that the manipulated feed can be chosen as input for whatever application expects a webcam (Discord, Teams, ...).
I am working on a Windows machine and would prefer to do this in Python. This combination has me lost, at the moment.
Capturing and manipulating the feed is easy with https://pypi.org/project/opencv-python/.
Exposing the manipulated feed as a webcam, however, seems overly complicated.
Apparently, on Linux there are Python libraries just offering that functionality, but they do not work on Windows. Everything that sounded like it could hint towards a good solution went directly into C++ country. There are programs which basically do what I want, e.g. webcamoid (https://webcamoid.github.io/) and I could hack together a solution which captures and processes the feed via Python, then uses webcamoid to record the output and feed it into a virtual webcam. But I'd much prefer to do the whole thing in one.
I have been searching around a bit and found these questions on stackoverflow on the topic:
Using OpenCV Output as Webcam (uses C++ but also gives a Python solution - however, pyfakewebcam does not work on Windows)
How do I stream to a new video source? (not really answered, just links to other question)
How to simulate a webcam device (more C++ hints, links to msdn's Writing a Custom Media Source)
Artificial webcam on windows (basically what I want, but in C++ again)
Writing a virtual webcam? (more explanation on how this might work in C++)
I am getting the strong impression that I need C++ for this or have to work on Linux. However, lacking both a Linux machine and any setup or experience in C++ programming, this seems like a large amount of work for the "toy project" this was supposed to be. But maybe I am just missing an obvious library or functionality somewhere?
Hence, the question is: Is there a way to expose a "webcam" stream via Python on Windows?
And, one last idea: what if I used a Docker container with a Linux Python environment to implement the functionality I want? Could that container then stream a "virtual webcam" to the host?
You can do this by using pyvirtualcam.
First, you need to install it using pip:
pip install pyvirtualcam
Then go to This Link and download the zip file from the latest release.
Unzip it and navigate to \bin\[your system's bitness].
Open a Command Prompt in that directory and run
regsvr32 /n /i:1 "obs-virtualsource.dll"
This will register a fake camera on your computer,
and if you want to unregister the camera, run this command:
regsvr32 /u "obs-virtualsource.dll"
Now you can send frames to the camera using pyvirtualcam.
This is a sample:
import pyvirtualcam
import numpy as np

with pyvirtualcam.Camera(width=1280, height=720, fps=30) as cam:
    while True:
        frame = np.zeros((cam.height, cam.width, 4), np.uint8)  # RGBA
        frame[:, :, :3] = cam.frames_sent % 255  # grayscale animation
        frame[:, :, 3] = 255
        cam.send(frame)
        cam.sleep_until_next_frame()

How to write an AVI file larger than 2 GB with OpenCV?

I'm using OpenCV with Python, but I can actually switch to C++, so if that matters please answer the question with it in mind.
I'm writing an .avi file (joining multiple AVI files into one) using
cv2.VideoWriter([filename, fourcc, fps, frameSize[, isColor]])
but I recently found out that I can't write an .avi file larger than 2 GB with it. It is even mentioned there: "Due to this OpenCV for video containers supports only the avi extension, its first version. A direct limitation of this is that you cannot save a video file larger than 2 GB."
But right now I've got no time to learn a new library like ffmpeg; I need to do this very fast.
How can I write this file using C++ or Python with knowledge of OpenCV, or at least with the input part using
cv::Mat
as frames?
This limitation was removed in OpenCV 3.0 with the introduction of support for other container formats such as .mkv, which do support video files larger than 2 GB.
See Does OpenCV 3.0 Still Has Limits On VideoWriter Size?.
NOTE: The documentation and examples haven't been updated yet, so maybe this should be considered experimental.
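A minimal sketch of that route, assuming OpenCV >= 3.0 built with FFmpeg support (the codec and file names are placeholders):

# Minimal sketch: copy frames from an AVI into an MKV container, which is not
# subject to the old 2 GB AVI limitation.
import cv2

cap = cv2.VideoCapture('part1.avi')
w = int(cap.get(cv2.CAP_PROP_FRAME_WIDTH))
h = int(cap.get(cv2.CAP_PROP_FRAME_HEIGHT))
fps = cap.get(cv2.CAP_PROP_FPS) or 25.0

fourcc = cv2.VideoWriter_fourcc(*'X264')
writer = cv2.VideoWriter('joined.mkv', fourcc, fps, (w, h))

while True:
    ok, frame = cap.read()
    if not ok:
        break
    writer.write(frame)

cap.release()
writer.release()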
You have answered your own question, but I'm afraid it isn't the answer you want.
From your link:
As you can see things can get really complicated with videos. However, OpenCV is mainly a computer vision library, not a video stream, codec and write one. Therefore, the developers tried to keep this part as simple as possible. Due to this OpenCV for video containers supports only the avi extension, its first version. A direct limitation of this is that you cannot save a video file larger than 2 GB. Furthermore you can only create and expand a single video track inside the container. No audio or other track editing support here. Nevertheless, any video codec present on your system might work. If you encounter some of these limitations you will need to look into more specialized video writing libraries such as FFMpeg or codecs as HuffYUV, CorePNG and LCL.
What this paragraph says is that the developers of OpenCV made a design choice: you cannot write video files larger than 2 GB using OpenCV, for the specific reason that it is a computer vision library, not a video tool.
Unfortunately, if you want to write videos larger than 2 GB you are going to need to learn to use FFmpeg or something similar (it isn't that hard and has good bindings to OpenCV).
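One common pattern for the FFmpeg route keeps your existing OpenCV reading code and simply pipes the raw frames into an ffmpeg subprocess, which does the encoding and muxing. A rough sketch, with all file names and codec flags as assumptions:

# Rough sketch: read frames with OpenCV and pipe the raw pixels into an
# ffmpeg subprocess, which can write files of any size.
import subprocess
import cv2

cap = cv2.VideoCapture('part1.avi')
w = int(cap.get(cv2.CAP_PROP_FRAME_WIDTH))
h = int(cap.get(cv2.CAP_PROP_FRAME_HEIGHT))
fps = cap.get(cv2.CAP_PROP_FPS) or 25.0

ffmpeg = subprocess.Popen(
    ['ffmpeg', '-y',
     '-f', 'rawvideo', '-pix_fmt', 'bgr24',           # what we feed on stdin
     '-s', '{}x{}'.format(w, h), '-r', str(fps), '-i', '-',
     '-c:v', 'libx264', 'joined.mkv'],                # what ffmpeg writes out
    stdin=subprocess.PIPE)

while True:
    ok, frame = cap.read()    # numpy array here, cv::Mat on the C++ side
    if not ok:
        break
    ffmpeg.stdin.write(frame.tobytes())

ffmpeg.stdin.close()
ffmpeg.wait()
cap.release()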

Python: get audio/video frames separately from a video file

What I'm trying to do:
Hi!
I'm trying to store the video and audio information from a video file. I would like to store the video frames and the audio frames separately, in different variables.
My intention is to manage video files and do some operations on the audio and video frame lists, but to do what I'm planning I need to store these audio/video frames separately. I've read a lot of questions on Stack Overflow about Python and audio/video handling.
Most people recommend using OpenCV or ffmpeg to manage videos. I saw some scripts using these libraries to get video (only video) frames, but none of them get the audio; most of them just grab video frames and save them as RGB images. I also checked some scripts where people get audio frames from an mp3 file, but I'm not sure whether the same can be done with a video file.
The most important thing to me is to know the best way to manage video and audio separately. I'm not asking anyone to write my code, just to point me in a good direction.
One of the things I'm trying to do is to send this information via a socket, but as I said, I need the audio and video frames in separate variables (yes, I'm thinking about a streaming app, but that's not the only thing I'm trying to do).
I know I should give more information, and maybe show some code, but I don't have any concrete code; I tried some things, but I was never able to separate audio and video. I know that each format has its own encoding, and in the end I decided to use mp4 as the video format, but I don't know whether this is the best choice for what I'm trying to do.
Summary:
Is OpenCV the best way to manage video and audio separately?
Which is the easiest way to separate video and audio frames? Is it possible at all?
Which documentation should I read to learn about video/audio handling?
I would like to do things with my own code, relying as little as possible on OpenCV or other libraries.
My basic idea is to get a "list" of audio frames and video frames and then do some operations on them, but right now I can't find the best way to manage a video using Python. I even wonder whether it is possible to manage a video as raw data.
I need to know which library is best for managing videos in Python; for me, the best library will be the one that lets me manage videos most "freely".
I've already checked:
I've read too many questions on this topic; the most recent ones are:
How to extract audio from video file
Split audio video separately from given video using MLT
Embed audio video in python gui

MPEG-1 video writing with Python

I am trying to use the OpenCV VideoWriter object with MPEG-1 encoding to create videos. I am aiming at writing only two images to that video and, with MPEG-1 encoding, I would like to know how much the first image helps in compressing the second one; in other words, compare the file size before and after writing the second image. My questions are:
Is there any way to perform this process using OpenCV?
Is there a way to avoid writing to disk and just get the size of the compressed video (after adding the second image)?
Are there any other good alternatives for reaching my goals?
I suggest you learn the GStreamer framework, which has Python bindings available.
http://gstreamer.freedesktop.org/modules/gst-python.html
It works best on Linux platforms; some OS X support is available.
GStreamer provides "sane", but very powerful and very complex, APIs for procedural video and audio generation.
See also:
GStreamer: status of Python bindings and encoding video with mixed audio
Alternatively, you can write the frames out as raw image files and assemble them into a video with the ffmpeg command. That might work on Microsoft Windows platforms too.
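A rough sketch of that last route, tailored to the size-comparison experiment in the question (frame rate, codec and file names are assumptions; run it in an empty working directory):

# Rough sketch: dump frames as PNGs, let the ffmpeg CLI do the MPEG-1
# encoding, and compare the resulting file sizes with one vs. two frames.
import os
import subprocess
import cv2

def encode(frames, out_name):
    for i, f in enumerate(frames):
        cv2.imwrite('frame_{:03d}.png'.format(i), f)
    subprocess.run(['ffmpeg', '-y', '-framerate', '25',
                    '-i', 'frame_%03d.png',
                    '-c:v', 'mpeg1video', out_name], check=True)
    return os.path.getsize(out_name)

img1 = cv2.imread('first.png')
img2 = cv2.imread('second.png')

size_one = encode([img1], 'one_frame.mpg')
size_two = encode([img1, img2], 'two_frames.mpg')
print('the second frame added roughly', size_two - size_one, 'bytes')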
