What i'm trying to do:
Hi!
I'm trying to store the Video and Audio information from a video file. I would like to store video frames and audio frames separately in different variables.
My intention is to manage video/files and do some actions with the audio and video frame list, but to do what I'm plannign to do I need to store this audio/video frames separately. I've read a lot of questions in StackOverflow about python and audio/video managing.
Most people recommend to use OpenCV or ffmpeg to manage videos. I saw some scripts using these libraries to get video(only video) frames, but none of them are getting audio, most of them are just getting video frames and save them as RGB images. I also check some scripts where people get audio frames from a mp3 file, but I'm not sure if you can do that in a video file
Most important thing to me is to know the best way to manage video and audio separately. I'm not looking for people to do my code, just asking to point me in a good direction.
One of the things I'm trying to do is to send this information via socket, but as I said I need the audio and video frames to be in separated variables (yes, i'm wondering about an stream app, but that's not the only thing I'm trying to do)
I know I should give more information, and maybe show some code, but I don't have any concret code I tried some things, but I've never been capable to separate audio and video. I know that each format has his own encryption, and at the end I decided to use "mp4" as video format but I don't know neither if this is the best format for what I'm trying to do.
Resume:
Is openCV the best way to manage video and audio separately ?
Wich is the easiest way to separate video and audio frames ? Is it possible ?
Wich is the best documentation I should read to learn about video/audio management ?
I would like to do the things with my own code, and use in the less way possible openCV or other libraries.
My "basic" idea is to get a "list" of audio and video frames, and then I would like to do some operations, but right now I can't find the best way for me to manage a vide using python. I even wonder if could be possible to manage a video as raw data
I need to know wich is the best library to manage videos using python, for me the best library, will be the one that allows me to manage the videos more "freely"
I've already checked:
I've read too many questions on this theme, the most recent are :
How to extract audio from video file
Split audio video separately from given video using MLT
Embed audio video in python gui
Related
Using python - I want to be able to sample in 2 second chunks, the audio that is being played through the output audio device on the computer.
My goal is to be able to detect a specific noise that comes through the speakers of a PC using Python, and this is the first step in doing so.
I have taken a look at the sounddevice docs but can't seem to determine the correct way to achieve this behaviour, the documentation doesn't appear to cover this.
Please can somebody help out?
It appears I am looking for some form of audio lookback recording in python.
I recently copied a bunch of audio files, which are feedback left during a phone call.
The vast majority of them are mp3, but a small percentage are files ending in a .ul extension, which I believe is ULAW.
I have tried to play them in Audacity and VLC, but get garbled sounds. I suspect they are corrupted, but I'd like to confirm that by attempting to convert them to another audio format.
Would anyone be able to recommend a library to do that?
I know Python has the audioop module but I do not know enough to start messing with the audio data.
I'm using OpenCV with Python, but actually can switch to C++, so if it's matter please answer question considering it.
I'm writing .avi file(joining multiple avi files into one) using
cv2.VideoWriter([filename, fourcc, fps, frameSize[, isColor]])
but recently found out that I can't write .avi file larger than 2 GB with it. It even mentioned there: Due to this OpenCV for video containers supports only the avi extension, its first version. A direct limitation of this is that you cannot save a video file larger than 2 GB.
But right now I've got no time to learn new library like ffmpeg, I need to do it very fast.
How can I write this file, using C++ or Python with knowledge of OpenCV, or at least with input part - using
cv::Mat
as frames
This limitation was removed in OpenCV 3.0, due to the introduction of new file formats such as .mkv, who do support video files larger than 2GB.
See Does OpenCV 3.0 Still Has Limits On VideoWriter Size?.
NOTE: The documentation and examples weren't updated yet, so maybe this should be considered experimental.
You have answered your own question but I'm afriad it isn't the answer you want.
From your link
As you can see things can get really complicated with videos. However, OpenCV is mainly a computer vision library, not a video stream, codec and write one. Therefore, the developers tried to keep this part as simple as possible. Due to this OpenCV for video containers supports only the avi extension, its first version. A direct limitation of this is that you cannot save a video file larger than 2 GB. Furthermore you can only create and expand a single video track inside the container. No audio or other track editing support here. Nevertheless, any video codec present on your system might work. If you encounter some of these limitations you will need to look into more specialized video writing libraries such as FFMpeg or codecs as HuffYUV, CorePNG and LCL.
What this paragraph says is that the developers of OpenCV made a design choice that says you cannot write video files larger than 2Gb using OpenCV for the specific reason that it is a computer vision library not a video tool.
Unfortunately if you want to write videos larger than 2Gb you are going to need to learn to use FFMPEG or something similar (It isn't that hard and has good bindings to OpenCV)
I use Windows 7. All I want to do is create raw audio and stream it to a speaker. After that, I want to create classes that can generate sine progressions (basically, a tone that slowly gets more and more shrill). After that, I want to put my raw audio into audio codecs and containers like .WAV and .MP3 without going insane. How might I be able to achieve this in Python without using dependencies that don't come with a standard install?
I looked up a great deal of files, descriptions, and related questions from here and all over the internet. I read about PCM and ADPCM, as well as A/D Converters. Where I get lost is somewhere between the ratio of byte input --> Kbps output, and all that stuff.
Really, all I want is for somebody to please be able to point me in the right direction to learn the audio formats precisely, and how to use them in Python (but first I want to start with raw audio).
This questions really has 2 parts:
How do I generate audio signals
How do I play audio signals through the speakers.
I wrote a simple wrapper around the python std lib's wave module, called pydub, which you can look at (on github) as a point of reference for how to manipulate raw audio data.
I generally just export the audio data to a file and then play it using VLC player. IMHO there's no reason to write a bunch of code to playback audio unless you're making a synthesizer or a game or some other realtime app.
Anyway, I hope that helps you get started :)
I am trying to use the Opencv VideoWriter object with the mpeg-1 encoding to create videos, I am aiming at writing only two images on that video, using mpeg-1 encoding, I would like to know how much the first image that I wrote first helps in compressing the second image. In other words find the file size before writing the 2nd image and after. My questions are:
Is there any way to perform this process using Opencv?
Is there a way to avoid writing on disks and just have the information of the size of the compreesed video( after adding the second image)?
Is there any other good alternatives reach my goals?
I suggest you learn GStreamer framework which has Python bindings available.
http://gstreamer.freedesktop.org/modules/gst-python.html
It works best on Linux platforms, some OSX support is available.
GStreamer provides "sane", but very powerful and very complex, APIs for procedural video and audio generation.
See also:
GStreamer: status of Python bindings and encoding video with mixed audio
Alternative you can write out frames to raw image images files and parse them to a video using ffmpeg command. Might work on Microsoft Windows platforms too.