using ffmpeg with python, Input buffer exhausted before END element found - python

When I run this from command line everything is fine
ffmpeg -i input.mp4 -f mp3 -ab 320000 -vn output.mp3
But when I call the same from python
subprocess.call(['ffmpeg', '-i', 'input.mp4', '-f', 'mp3', '-ab', '320000', '-vn', 'output.mp3'])
After several seconds converting I'm getting this error
[aac @ 0x7fb3d803e000] decode_band_types: Input buffer exhausted before END element found
Error while decoding stream #0:1: Invalid data found when processing input
[mov,mp4,m4a,3gp,3g2,mj2 @ 0x7fb3d8000000] stream 1, offset 0x80011d: partial file
input.mp4: Invalid data found when processing input
Any ideas?

You need to add the -dn and -ignore_unknown options, plus -sn if subtitles are causing the encoding failure.
-dn disables data streams in the output.
-sn disables subtitle streams in the output.
-ignore_unknown ignores input streams of unknown type (SCTE 35, 128 data).
The -dn, -sn and -ignore_unknown options work irrespective of the input streams.
That will solve your problem.
There are other options if you want to preserve the data and subtitle streams.
-c:d copy copies the data streams.
-c:s copy copies the subtitle streams.
You can use the -copy_unknown option to carry the unknown streams into your output.
Your final code would look like this:
subprocess.call(['ffmpeg', '-i', 'input.mp4', '-f', 'mp3', '-ab', '320000', '-vn', '-sn', '-dn', '-ignore_unknown', 'output.mp3'])
NOTE: -copy_unknown option only works with ffmpeg 4.x version or above.

Note - this error is not a FATAL error as defined by ffmpeg, meaning you could catch the error in code and the conversion will continue and the mp4 will be playable. It will be missing the 'failed' information, but most likely this was invisible to the user.
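To act on the note above, one option is to run the command with subprocess.run and inspect the return code and stderr yourself. The following is only a minimal sketch: run_capture is a hypothetical helper name, and the demo uses a stand-in child process (the Python interpreter) so it runs without ffmpeg installed. For the real case you would pass the ffmpeg argument list from the answer.

```python
import subprocess
import sys

def run_capture(cmd):
    """Run cmd (a list of arguments) and return (return_code, stderr_text)."""
    result = subprocess.run(
        cmd,
        capture_output=True,   # collect stdout and stderr instead of printing them
        text=True,             # decode bytes to str
        errors="replace",      # do not fail on undecodable bytes in the log
    )
    return result.returncode, result.stderr

# Stand-in for ffmpeg: a child process that writes to stderr and exits with 1.
fake_cmd = [
    sys.executable, "-c",
    "import sys; print('partial file', file=sys.stderr); sys.exit(1)",
]
code, err = run_capture(fake_cmd)
if code != 0:
    print("command failed with code", code, "and message:", err.strip())
```

With the real command, the caller can decide whether a non-zero code is acceptable, for example when the output file was still written.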

Related

Using os.system() to convert audio files sample rate

I have started working on an NLP project, and at the start of this, I need to downsample the audio files. To do this I have found one script that can do it automatically, but though I can use it to downsample my audio I'm struggling to understand how it's working.
def convert_audio(audio_path, target_path, remove=False):
    """This function sets the audio `audio_path` to:
        - 16000Hz Sampling rate
        - one audio channel ( mono )
    Params:
        audio_path (str): the path of audio wav file you want to convert
        target_path (str): target path to save your new converted wav file
        remove (bool): whether to remove the old file after converting
    Note that this function requires ffmpeg installed in your system."""
    os.system(f"ffmpeg -i {audio_path} -ac 1 -ar 16000 {target_path}")
    # os.system(f"ffmpeg -i {audio_path} -ac 1 {target_path}")
    if remove:
        os.remove(audio_path)
This is the code that's giving me trouble. I don't understand how the 4th line from the bottom works; I believe that is the line that resamples the audio files.
The repo this is inside of:
https://github.com/x4nth055/pythoncode-tutorials/
If anyone has information on how this is done I'd love to know, or if there are better ways to downsample audio files! Thanks
Have you ever used ffmpeg? The docs clearly describe the options (though you may need some audio background to understand them):
-ac[:stream_specifier] channels (input/output,per-stream)
Set the number of audio channels. For output streams it is set by default to the number of input audio channels. For input streams this option only makes sense for audio grabbing devices and raw demuxers and is mapped to the corresponding demuxer options.
-ar[:stream_specifier] freq (input/output,per-stream)
Set the audio sampling frequency. For output streams it is set by default to the frequency of the corresponding input stream. For input streams this option only makes sense for audio grabbing devices and raw demuxers and is mapped to the corresponding demuxer options.
The documentation for os.system explains:
Execute the command (a string) in a subshell...on Windows, the return value is that returned by the system shell after running command. The shell is given by the Windows environment variable COMSPEC: it is usually cmd.exe, which returns the exit status of the command run; on systems using a non-native shell, consult your shell documentation.
For a better understanding, I suggest printing the command first:
cmd_str = f"ffmpeg -i {audio_path} -ac 1 -ar 16000 {target_path}"
print(cmd_str) # then you can copy paste to cmd/bash and run
os.system(cmd_str)
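As for better ways to call it: a common alternative to os.system is to pass the arguments as a list to subprocess.run, which avoids problems with spaces in file names because no shell parses the string. The sketch below assumes the same -ac and -ar options as the original function; build_resample_cmd is a hypothetical helper name introduced here for illustration.

```python
import subprocess

def build_resample_cmd(audio_path, target_path, rate=16000, channels=1):
    """Return the ffmpeg argument list for downmixing and resampling."""
    return [
        "ffmpeg",
        "-i", str(audio_path),   # input file
        "-ac", str(channels),    # number of output audio channels
        "-ar", str(rate),        # output sampling frequency in Hz
        str(target_path),        # output file
    ]

def convert_audio(audio_path, target_path):
    """Run ffmpeg; raises CalledProcessError if it exits with a non-zero code."""
    subprocess.run(build_resample_cmd(audio_path, target_path), check=True)

# Printing the list shows exactly which arguments ffmpeg would receive.
print(build_resample_cmd("my audio.wav", "out.wav"))
```

Each list element reaches ffmpeg as one argument, so "my audio.wav" needs no quoting.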

No such filter '"split': ffplay with ffmpeg in Python

I am trying to visualize the YUV histograms of a video, overlaid on the video, using ffmpeg from Python. The code that I use is the following:
subprocess.call(['ffplay','video.mp4','-vf','"split=2[a][b],[b]histogram,format=yuva444p[hh],[a][hh]overlay"'])
But when I execute the code, this error shows up:
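It is a bit strange, because if I run the same line in the command window it works with no problem.
Remove the double quotes around the filter. When you pass a list, subprocess hands each element to the program as a single argument and no shell strips the quotes, so they end up as part of the filter name. Arguments with special characters like [, ], = need no extra quoting.
The following command should work: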
subprocess.call(['ffplay','video.mp4','-vf','split=2[a][b],[b]histogram,format=yuva444p[hh],[a][hh]overlay'])
To see the actual command line, you can add the -report argument and check the log file.
subprocess.call(['ffplay','video.mp4','-vf','split=2[a][b],[b]histogram,format=yuva444p[hh],[a][hh]overlay', '-report'])
This executes:
ffplay video.mp4 -vf "split=2[a][b],[b]histogram,format=yuva444p[hh],[a][hh]overlay" -report
The above command has the correct syntax.
subprocess.call(['ffplay','video.mp4','-vf','"split=2[a][b],[b]histogram,format=yuva444p[hh],[a][hh]overlay"', '-report'])
This executes:
ffplay video.mp4 -vf "\"split=2[a][b],[b]histogram,format=yuva444p[hh],[a][hh]overlay\"" -report
As you can see, the extra quotes were escaped as \" and passed through literally, and this is the cause of your error.
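The quote stripping that a shell performs can be illustrated with the standard shlex module. This is only a sketch: shlex follows POSIX shell rules, which is an assumption here since the Windows command prompt parses quotes somewhat differently, but the outcome for this command is the same.

```python
import shlex

# The command exactly as it would be typed at a prompt, quotes included.
command_line = (
    'ffplay video.mp4 -vf '
    '"split=2[a][b],[b]histogram,format=yuva444p[hh],[a][hh]overlay"'
)

# shlex.split removes the quotes the way a POSIX shell would,
# leaving the filter graph as one plain argument.
args = shlex.split(command_line)
print(args)
```

The last element of args contains no double quotes, which is the form the list passed to subprocess.call should have.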

EOF in scipy.io.wavfile.read

I was trying to get data of a wav file using scipy.io.wavfile.read but it always returns this error message: ValueError: Unexpected end of file.
I went through all the related questions on this site (I guess). But none of them worked. I have also tried writing filename as r'Mozart 40 Allegro.wav'.
import scipy.io.wavfile
sample,data=scipy.io.wavfile.read('Mozart 40 Allegro.wav')
print(data)
Note: Others have mentioned that my wav file may be corrupt, so I downloaded a sample wav file. And this was the result. WavFileWarning: Chunk (non-data) not understood, skipping it.
WavFileWarning)
But is there any way to get the wav file I require which is not corrupt and doesn't give the second error message I mentioned?
Thank You
Thanks. Initially I used an online converter, but those do a very bad job of keeping the file intact in the precise format; VLC can handle such errors, but stricter readers can't. Always use sox for conversion and similar tasks, and don't forget to include the required extra files (the LAME files) if you are working with mp3.
I had similar problems with some files that weren't created with proper headers. To solve this, I first converted the file from wav to wav with ffmpeg, which writes fresh metadata for the wav file.
The command is more or less:
ffmpeg -i "Mozart 40 Allegro.wav" -f wav -acodec pcm_s16le -ar 22050 -ac 1 "Mozart 40 Allegro_.wav"
The newly created file should then have proper metadata, so it should no longer raise the error when it is opened in Python:
sample,data=scipy.io.wavfile.read('Mozart 40 Allegro_.wav')
Use underscore for spaces:
sample,data=scipy.io.wavfile.read('Mozart_40_Allegro.wav')
Try:
import soundfile as sf
data, samplerate = sf.read("file")
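Before or after converting, it can help to check whether the header is readable at all. A minimal sketch using the standard wave module follows; describe_wav is a hypothetical helper name. Note that wave only understands uncompressed PCM files and raises wave.Error or EOFError for files it cannot parse, so a failure here does not prove the file is corrupt.

```python
import wave

def describe_wav(path):
    """Return basic header fields of a PCM wav file as a dict."""
    with wave.open(path, "rb") as wav:
        return {
            "channels": wav.getnchannels(),
            "sample_width_bytes": wav.getsampwidth(),
            "sample_rate": wav.getframerate(),
            "frames": wav.getnframes(),
        }

# Usage, with the file name from the question:
# print(describe_wav("Mozart 40 Allegro_.wav"))
```

If the converted file was written with -acodec pcm_s16le -ar 22050 -ac 1, the expected header is one channel, two bytes per sample and a 22050 Hz sample rate.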

using find in subprocess.call() gives error while the command executes properly from command prompt

C:\Windows\System32> ffmpeg -i D:\devaraj\KPIX_test.ts -vf "blackframe" -an -f null - 2>&1|find "Parsed" > D:\devaraj\info.txt
This works fine and writes the file info.txt.
subprocess.call('ffmpeg' ,'-i', 'D:\devaraj\KPIX_test.ts' ,'-vf', '"blackframe"', 'D:\devaraj\KPIX_textfinal.mp3', '- 2>&1>','|','find', '"Parsed"', '>' ,'D:\devaraj\info.txt', 'shell=True')
gives the error "buffer size must be integer",
whereas
subprocess.call('ffmpeg -i D:\devaraj\KPIX_test.ts -vf "blackframe" -an -f null - 2>&1|find "Parsed" > D:\devaraj\info.txt', shell=True)
gives an error
'find' is not recognized as an internal or external command,
operable program or batch file.
Any help would be appreciated from the bottom of my heart.
You should use native Python methods to get the filtered output of this ffmpeg command:
ffmpeg -i D:\devaraj\KPIX_test.ts -vf "blackframe" -an -f null - 2>&1|find "Parsed"
To do this you'd normally use check_output, but this particular command is known to provide the required information while exiting with a non-zero return code (using run from Python 3.5 would work, though).
So I'll use Popen instead. The command is passed as a list, without the redirections and filters, and then all the output is read from the process's standard output:
p = subprocess.Popen(["ffmpeg","-i",r"D:\devaraj\KPIX_test.ts",
    "-vf","blackframe","-an","-f","null","-"],stdout=subprocess.PIPE,stderr=subprocess.STDOUT)
output = p.stdout.read()
You don't need shell=True, and stderr=subprocess.STDOUT merges the error and output streams into the output variable.
Now output contains the output of the ffmpeg command. Let's decode it (to get a string), split it into lines and check whether the string is in each line:
for line in output.decode().splitlines(): # python 2: output.splitlines()
    if "Parsed" in line:
        print(line.rstrip()) # or store it in a file, string, whatever
For a process that outputs a lot more text, it would be better to iterate over p.stdout instead of reading the full contents (it uses less memory and allows real-time echo to the console).
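The line-by-line variant mentioned above could look roughly like this. It is a sketch rather than a drop-in replacement: matching_lines is a hypothetical helper name, and the demo uses the Python interpreter as a stand-in child process so it runs without ffmpeg.

```python
import subprocess
import sys

def matching_lines(cmd, needle):
    """Yield lines from the combined stdout/stderr of cmd that contain needle."""
    with subprocess.Popen(
        cmd,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,  # merge stderr into stdout, like 2>&1
        text=True,                 # decode to str line by line
        errors="replace",          # tolerate undecodable bytes
    ) as proc:
        for line in proc.stdout:   # reads as the child produces output
            if needle in line:
                yield line.rstrip()

# Stand-in child: one matching line on stdout, one other line on stderr.
child = [
    sys.executable, "-c",
    "import sys; print('[Parsed_blackframe_0] frame:1'); "
    "print('other', file=sys.stderr)",
]
for hit in matching_lines(child, "Parsed"):
    print(hit)
```

For the real case, cmd would be the ffmpeg argument list used in the Popen call above.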

Stream a video output via an ffmpeg pipe into a Python script for analysis. How to pipe into Python?

I'm working on a script, in conjunction with other libraries, which requires a frame or image in RGB24 format. For improved compatibility I have decided to allow an external pipe to stream frames into this program. Changing the device or source within the code every time can become tedious, and using a parser to simply specify the source leads to syntax errors. Example:
ffmpeg -f dshow -i video="OEM Device" a.mpg
works exactly as you would expect. However, in a subprocess in Python
pipe = sp.Popen('ffmpeg -f dshow -i video="OEM Device" a.mpg'.split(),...
Edit: I have tried to split manually. 'video="OEM Device"' didn't work inside Python either.
This leads to 'Invalid argument "OEM', with OEM and Device separated into two different arguments. I have tried the alternative name as well.
This led me to believe that " is the problem,
which led me to pipe the video stream into Python via the terminal.
ffmpeg -i a.mpg -f image2pipe -vcodec rawvideo -pix_fmt rgb24 - |python myscript.py
This is what I have in the Script.
import subprocess as sp
import numpy
import sys
import os
pipe = sp.Popen('ffmpeg -f rawvideo -pix_fmt rgb24 -an -vcodec rawvideo -i - -f image2pipe -pix_fmt rgb24 -an -vcodec rawvideo -'.split(), stdin=sys.stdin, stderr=sp.PIPE, stdout=sp.PIPE)
# Assuming 720x576 resolution
raw_img = pipe.stdout.read(720*576*3)
image = numpy.frombuffer(raw_img, dtype='uint8')
img_load = image.reshape(576, 720, 3)
I know the above pipe is not needed and can probably be replaced by the following (which I have tried):
raw_img = sys.stdin.read(720*576*3)
Either way I do get output, but it causes
image.reshape(576,720,3)
to receive data of irregular size, never the required 720x576 that I specified. I have to admit this is my first time using pipes with Python. As I understand it, stderr is suppressed because I specified image2pipe.
How can I get ffmpeg to give Python the required dimensions, or give subprocess a syntax that allows " in the command without splitting the values or causing syntax errors?
Instead of writing a string and then .split()-ing it, just pass a proper array to start with:
.Popen(['ffmpeg', '-f', 'dshow', '-i', 'video="OEM Device"', 'a.mpg'], ...)
The command you are calling needs to see video="OEM Device" as a single element in its args array, so you need to pass it as a single element to the Popen args array.
Courtesy of @Grisha Levit: the answer was simply to remove the double quotes. Pass a proper array with the device name unquoted:
.Popen(['ffmpeg', '-f', 'dshow', '-i', 'video=OEM Device', 'a.mpg'], ...)
The command you are calling needs to see 'video=OEM Device' as a single element in its args array, so you need to pass it as a single element to the Popen args array.
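On the frame-reading part of the question: in Python 3, sys.stdin is a text stream, so raw video bytes have to be read from sys.stdin.buffer, and each frame must be exactly width * height * 3 bytes for rgb24. The sketch below shows one way to cut a byte stream into fixed-size frames; read_frames is a hypothetical helper name, and the 720x576 size is the asker's assumption. The demo uses io.BytesIO in place of a real pipe.

```python
import io

def read_frames(stream, width, height, channels=3):
    """Yield complete raw frames (bytes) from a binary stream."""
    frame_size = width * height * channels
    while True:
        chunk = stream.read(frame_size)
        if len(chunk) < frame_size:
            # End of stream, or a trailing partial frame: stop here.
            break
        yield chunk

# Demo with a fake stream: two complete 2x2 frames plus 5 stray bytes.
fake = io.BytesIO(bytes(2 * 2 * 3 * 2 + 5))
frames = list(read_frames(fake, 2, 2))
print(len(frames), "frames of", len(frames[0]), "bytes")

# In the script from the question this would be used roughly as:
#   for raw in read_frames(sys.stdin.buffer, 720, 576):
#       image = numpy.frombuffer(raw, dtype="uint8").reshape(576, 720, 3)
```

Because only full frames are yielded, the reshape to (height, width, 3) always receives the right number of bytes.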
