Using os.system() to convert audio files sample rate - python

I have started working on an NLP project, and as a first step I need to downsample the audio files. I found a script that does this automatically, but although I can use it to downsample my audio, I'm struggling to understand how it works.
def convert_audio(audio_path, target_path, remove=False):
    """This function sets the audio `audio_path` to:
        - 16000Hz Sampling rate
        - one audio channel ( mono )
    Params:
        audio_path (str): the path of audio wav file you want to convert
        target_path (str): target path to save your new converted wav file
        remove (bool): whether to remove the old file after converting
    Note that this function requires ffmpeg installed in your system."""
    os.system(f"ffmpeg -i {audio_path} -ac 1 -ar 16000 {target_path}")
    # os.system(f"ffmpeg -i {audio_path} -ac 1 {target_path}")
    if remove:
        os.remove(audio_path)
This is the code that's giving me trouble. I don't understand how the 4th line from the bottom works; I believe that is the line that resamples the audio files.
The repo this is from:
https://github.com/x4nth055/pythoncode-tutorials/
If anyone has information on how this is done, or knows of better ways to downsample audio files, I'd love to know. Thanks!

Have you ever used ffmpeg? The docs clearly describe the options (though you may need some audio background to understand them):
-ac[:stream_specifier] channels (input/output,per-stream)
    Set the number of audio channels. For output streams it is set by default to the number of input audio channels. For input streams this option only makes sense for audio grabbing devices and raw demuxers and is mapped to the corresponding demuxer options.
-ar[:stream_specifier] freq (input/output,per-stream)
    Set the audio sampling frequency. For output streams it is set by default to the frequency of the corresponding input stream. For input streams this option only makes sense for audio grabbing devices and raw demuxers and is mapped to the corresponding demuxer options.
The documentation for os.system explains:
Execute the command (a string) in a subshell...on Windows, the return
value is that returned by the system shell after running command. The
shell is given by the Windows environment variable COMSPEC: it is
usually cmd.exe, which returns the exit status of the command run; on
systems using a non-native shell, consult your shell documentation.
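In your script, the f-string just builds the shell command ffmpeg -i <audio_path> -ac 1 -ar 16000 <target_path>, and os.system runs it: ffmpeg reads the input, mixes it down to one channel (-ac 1), resamples it to 16000 Hz (-ar 16000), and writes the result to the target path. For a better understanding, I suggest printing the command first: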
cmd_str = f"ffmpeg -i {audio_path} -ac 1 -ar 16000 {target_path}"
print(cmd_str) # then you can copy paste to cmd/bash and run
os.system(cmd_str)
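As for better ways: a minimal sketch of the same conversion using subprocess.run with an argument list instead of a shell string, so that paths containing spaces or shell metacharacters reach ffmpeg intact. The helper name build_ffmpeg_cmd and the sample_rate/channels parameters are additions for illustration, not part of the original script, and ffmpeg is still assumed to be installed and on the PATH.

```python
import os
import subprocess


def build_ffmpeg_cmd(audio_path, target_path, sample_rate=16000, channels=1):
    """Return the ffmpeg argument list (hypothetical helper, not from the repo)."""
    return [
        "ffmpeg",
        "-i", str(audio_path),     # input file
        "-ac", str(channels),      # number of output audio channels (1 = mono)
        "-ar", str(sample_rate),   # output sampling rate in Hz
        str(target_path),          # output file
    ]


def convert_audio(audio_path, target_path, remove=False):
    """Downsample to 16000 Hz mono; raises CalledProcessError if ffmpeg fails."""
    subprocess.run(build_ffmpeg_cmd(audio_path, target_path), check=True)
    if remove:
        # Only reached when ffmpeg exited with status 0.
        os.remove(audio_path)
```

Unlike the os.system version, a failed conversion raises an exception here, so the original file is not removed when ffmpeg fails.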

Related

How do I integrate the FFMPEG commands with Python Script?

I am trying to extract the frames when the scene changes in an .mp4 video.
The package that I am using is FFMPEG.
FFMPEG predominantly works on the CLI and I am trying to integrate it with Python3.x
The command I am using in the CLI is:
ffmpeg -i {0} -vf "select=gt(scene\,0.5), scale=640:360" -vsync vfr frame%d.png
The output comes out just fine with the CLI execution.
But I want to use the same command in a Python script. How do I do that, and what should the code be?
Being an amateur in the field, currently grappling with this!
You could execute that command from Python via the subprocess module, of course, but it would be better to use a library like https://github.com/kkroening/ffmpeg-python
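For the subprocess route, a minimal sketch: passing the arguments as a list means no shell is involved, so the filter expression needs no shell quoting, but the backslash before the comma is still required because ffmpeg's own filter parser uses it. The function names and parameters are placeholders chosen for this example; the filter and options are the ones from the question.

```python
import subprocess


def scene_frames_cmd(video_path, threshold=0.5, pattern="frame%d.png"):
    """Build the ffmpeg arguments for the scene-change command in the question."""
    # "\\," is a literal backslash + comma: it escapes the comma for ffmpeg's filter parser.
    vf = f"select=gt(scene\\,{threshold}), scale=640:360"
    return ["ffmpeg", "-i", video_path, "-vf", vf, "-vsync", "vfr", pattern]


def extract_scene_frames(video_path):
    """Run ffmpeg; raises subprocess.CalledProcessError on a non-zero exit status."""
    subprocess.run(scene_frames_cmd(video_path), check=True)
```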
I would recommend PyAV. It's a proper wrapper around ffmpeg's libraries.
The other packages mentioned use the "subprocess" approach, which is limited and inefficient. These libraries may be more convenient than plain ffmpeg APIs.
Thanks for the help!
This is the snippet of code I'm currently using and it gives the results as I require.
I have added a functionality for timestamp generation of the frames in addition to the frame formation using scene change detection
# Call FFmpeg from the script.
# Frames are written to the same folder as the videos.
# `path` must already be set to that folder and must end with a path separator.
import os
for filename in os.listdir(path):
    file = filename.split('.')[0]  # file name without the .mp4 extension
    input_file = path + filename
    # In a normal (non-raw) string, the filter's "\," has to be written as "\\,".
    cmd = (
        f'ffmpeg -i "{input_file}" -filter_complex '
        f'"select=gt(scene\\,0.2), scale=640:360, metadata=print:file=time_{file}.txt" '
        f'-vsync vfr "{path}{file}_frame%d.jpg"'
    )
    os.system(cmd)
print("Done")  # This takes time: it loops over all the videos

.raw to .wav via pydub(AudioSegment) sounds noisy

I want to convert a .raw audio file to a .wav audio file, so I use the code below with pydub's AudioSegment:
final = AudioSegment.from_file('input.raw', format='raw', frame_rate=8000, channels=1, sample_width=1).export('result.wav', format='wav')
However, the output file 'result.wav' sounds very noisy. I'm not actually sure whether 'input.raw' itself has clear sound (it was captured from the RTP packets of a VoIP phone call).
So my question is: should the output (.wav) file sound clear if the input (.raw) file is not corrupted? I'm wondering what the problem is: a corrupted file, or incorrect code?
I ran into a similar issue when I was attempting to convert PCMU RAW audio to WAV format. I reached out to the author of pydub via an issue on GitHub, and here was his response:
pydub assumes any file is a raw wave if the filename ends with raw.
And also doesn't have a way to inject the -ar 8000 into the conversion
command (to tell ffmpeg that the audio is at 8000 samples per second)
So the workaround is to open the file manually and explicitly tell pydub what the format of the file is, like so:
from pydub import AudioSegment
# open the file ourselves so that pydub doesn't try to inspect the file name
with open('input.raw', 'rb') as raw_audio_f:
    # explicitly tell pydub the format for your file
    # use ffmpeg -formats | grep PCM to figure out what string value to use
    sound = AudioSegment.from_file(raw_audio_f, format="mulaw")
    # override the default sample rate with the rate we know is correct
    sound.frame_rate = 8000
    # then export it (export() defaults to mp3, so name the format explicitly)
    sound.export('result.wav', format='wav')
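To see why the format matters, here is a rough sketch that decodes the bytes by hand using only the standard library, not pydub. It assumes the .raw file really contains G.711 mu-law (PCMU) samples at 8000 Hz mono, which is common for VoIP RTP payloads but is not confirmed for the file in the question; the function names are made up for this example. Each mu-law byte is a companded sample that expands to 16 bits, so reading the same bytes as plain 8-bit linear PCM produces the wrong waveform, which is heard as noise.

```python
import struct
import wave


def mulaw_byte_to_pcm16(u_val):
    """Expand one G.711 mu-law byte to a signed 16-bit linear sample."""
    u_val = ~u_val & 0xFF                  # mu-law bytes are stored bit-inverted
    t = ((u_val & 0x0F) << 3) + 0x84       # 4-bit mantissa plus the bias (0x84)
    t <<= (u_val & 0x70) >> 4              # scale by the 3-bit segment (exponent)
    return (0x84 - t) if (u_val & 0x80) else (t - 0x84)


def mulaw_to_pcm16(raw_bytes):
    """Decode a bytes object of mu-law samples to a list of 16-bit integers."""
    return [mulaw_byte_to_pcm16(b) for b in raw_bytes]


def mulaw_file_to_wav(raw_path, wav_path, frame_rate=8000):
    """Write a mono 16-bit WAV file from a headerless mu-law file."""
    with open(raw_path, "rb") as f:
        samples = mulaw_to_pcm16(f.read())
    with wave.open(wav_path, "wb") as w:
        w.setnchannels(1)
        w.setsampwidth(2)                  # 2 bytes = 16-bit samples
        w.setframerate(frame_rate)
        w.writeframes(struct.pack(f"<{len(samples)}h", *samples))
```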

EOF in scipy.io.wavfile.read

I was trying to get data of a wav file using scipy.io.wavfile.read but it always returns this error message: ValueError: Unexpected end of file.
I went through all the related questions on this site (I guess). But none of them worked. I have also tried writing filename as r'Mozart 40 Allegro.wav'.
import scipy.io.wavfile
sample,data=scipy.io.wavfile.read('Mozart 40 Allegro.wav')
print(data)
Note: Others have mentioned that my wav file may be corrupt, so I downloaded a sample wav file, and this was the result:
WavFileWarning: Chunk (non-data) not understood, skipping it.
  WavFileWarning)
But is there any way to get the wav file I require which is not corrupt and doesn't give the second error message I mentioned?
Thank You
Thanks: Initially I used some online converter but they do a very bad job in keeping the file intact with the precise format, vlc can handle such errors but these can't. Always use sox to convert and other stuff and don't forget to include the required extra files (lame files) if you are working with mp3.
I had similar problems with some files that weren't created with proper headers. To solve it, I first converted the file from wav to wav with ffmpeg, which rewrites the metadata for the wav file.
The steps to follow should be more or less:
ffmpeg -i "Mozart 40 Allegro.wav" -f wav -acodec pcm_s16le -ar 22050 -ac 1 "Mozart 40 Allegro_.wav"
The newly created file should then have proper metadata, so it should no longer raise the error when it is opened in Python:
sample,data=scipy.io.wavfile.read('Mozart 40 Allegro_.wav')
Use underscore for spaces:
sample,data=scipy.io.wavfile.read('Mozart_40_Allegro.wav')
Try:
import soundfile as sf
data, samplerate = sf.read("file")  # sf.read returns (data, samplerate)
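To check whether a file's header is the problem before converting it, one option is to list the chunks in the WAV container yourself. The sketch below uses only the standard library, and list_wav_chunks is a name chosen for this example. A WAV file is a RIFF container: a 12-byte header followed by chunks, each with a 4-byte ID and a 4-byte little-endian size. Extra chunks besides "fmt " and "data" are the kind of thing that can trigger the "Chunk (non-data) not understood" warning, while a declared size that runs past the end of the file points to truncation.

```python
import struct


def list_wav_chunks(path):
    """Return [(chunk_id, declared_size, is_complete), ...] for a RIFF/WAVE file."""
    chunks = []
    with open(path, "rb") as f:
        header = f.read(12)
        if len(header) < 12 or header[:4] != b"RIFF" or header[8:12] != b"WAVE":
            raise ValueError("not a RIFF/WAVE file")
        f.seek(0, 2)                       # jump to the end to learn the file size
        file_size = f.tell()
        pos = 12
        while pos + 8 <= file_size:
            f.seek(pos)
            chunk_id, size = struct.unpack("<4sI", f.read(8))
            is_complete = pos + 8 + size <= file_size
            chunks.append((chunk_id.decode("ascii", "replace"), size, is_complete))
            pos += 8 + size + (size & 1)   # chunk bodies are padded to an even length
    return chunks
```

For example, print(list_wav_chunks('Mozart 40 Allegro.wav')) shows which chunks the file contains and whether the data chunk is cut short.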

using ffmpeg with python, Input buffer exhausted before END element found

When I run this from command line everything is fine
ffmpeg -i input.mp4 -f mp3 -ab 320000 -vn output.mp3
But when I call the same from python
subprocess.call(['ffmpeg', '-i', 'input.mp4', '-f', 'mp3', '-ab', '320000', '-vn', 'output.mp3'])
After several seconds converting I'm getting this error
[aac @ 0x7fb3d803e000] decode_band_types: Input buffer exhausted before END element found
Error while decoding stream #0:1: Invalid data found when processing input
[mov,mp4,m4a,3gp,3g2,mj2 @ 0x7fb3d8000000] stream 1, offset 0x80011d: partial file
input.mp4: Invalid data found when processing input
Any ideas?
You need to add the -dn and -ignore_unknown options, plus -sn if subtitles are causing the encoding failure.
-dn disables data streams in the output.
-sn disables subtitle streams in the output.
-ignore_unknown ignores unknown streams (SCTE 35, 128 data).
The -dn, -sn and -ignore_unknown options work irrespective of the input streams.
That will solve your problem.
There are other options if you want to preserve the data and subtitle streams:
-c:d copy copies the data streams.
-c:s copy copies the subtitle streams.
You can use the -copy_unknown option to carry the unknown streams into your output.
Your final code would look like this:
subprocess.call(['ffmpeg', '-i', 'input.mp4', '-f', 'mp3', '-ab', '320000', '-vn', '-sn', '-dn', '-ignore_unknown', 'output.mp3'])
NOTE: -copy_unknown option only works with ffmpeg 4.x version or above.
Note - this error is not a FATAL error as defined by ffmpeg, meaning you could catch the error in code and the conversion will continue and the mp4 will be playable. It will be missing the 'failed' information, but most likely this was invisible to the user.
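A minimal sketch of catching such errors from Python: subprocess.run can capture ffmpeg's diagnostics (which it writes to stderr) together with the exit status, so the script can decide what to do with the output file. The helper name run_and_report is made up for this example.

```python
import subprocess


def run_and_report(cmd):
    """Run a command given as a list; return (succeeded, stderr_text)."""
    result = subprocess.run(cmd, capture_output=True, text=True)
    # returncode 0 means the program reported success; stderr holds its log.
    return result.returncode == 0, result.stderr
```

For the command above this could be called as ok, log = run_and_report(['ffmpeg', '-i', 'input.mp4', '-f', 'mp3', '-ab', '320000', '-vn', 'output.mp3']), after which log can be searched for messages such as "Input buffer exhausted". Whether ffmpeg exits with a non-zero status for a particular decoding error is something to check against your own files.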

capture screenshot/frame of a video file

Is there a way to capture a single frame of a video file in Python?
It could also be done from the command line. I'm using HandBrakeCLI to convert the videos,
but I would need some screenshots of them too.
Thank you
You should first check out PyFFmpeg.
PyFFmpeg is a wrapper around FFmpeg's libavcodec, libavformat and libavutil libraries whose main purpose is to provide access to individual frames of video files of various formats (including MPEG and DIVX encoded videos). It also provides access to audio data.
It is also possible using ffmpeg, so call that using subprocess. A simple search will give you the command required to extract a frame from a video file. Just call that command using subprocess and that should do it.
>>> import subprocess
>>> import shlex # to split the command that follows
>>> command = 'ffmpeg -i sample.avi' # your command goes here
>>> subprocess.call(shlex.split(command))
A similar procedure applies to HandBrakeCLI or whatever else you might use. Just call the appropriate command.
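As a concrete sketch of such a command: placing -ss before -i seeks to a timestamp in the input, and -frames:v 1 stops after writing one video frame. The function names, the default timestamp and the output file name are placeholders chosen for this example.

```python
import subprocess


def frame_grab_cmd(video_path, image_path, timestamp="00:00:05"):
    """Build an ffmpeg argument list that saves one frame as an image."""
    return [
        "ffmpeg",
        "-ss", timestamp,      # seek to this position in the input
        "-i", video_path,
        "-frames:v", "1",      # write a single video frame
        image_path,
    ]


def grab_frame(video_path, image_path, timestamp="00:00:05"):
    """Run ffmpeg and return its exit status (0 means success)."""
    return subprocess.call(frame_grab_cmd(video_path, image_path, timestamp))
```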
