I need to convert a .3gp audio file, recorded on the Android front end, into .wav audio on the Python Flask back end for further processing. Is there a suggested method or library for converting .3gp audio into .wav format?
audiofile = flask.request.files['file']
filename = werkzeug.utils.secure_filename(audiofile.filename)
audiofile.save('Audio/' + filename)
This is the code I'm using now; it receives the audio file as .3gp. I need to convert it into .wav format.
Update: You can also do it using ffmpeg
Method 1:
https://github.com/adaptlearning/adapt_authoring/wiki/Installing-FFmpeg#installing-ffmpeg-in-ubuntu
bash
ffmpeg -i path/to/3gp.3gp path/to/wav.wav
or
python (which runs bash command)
import os
os.system('ffmpeg -i path/to/3gp.3gp path/to/wav.wav')
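A minimal sketch of Method 1 applied to the upload code from the question, assuming the ffmpeg executable is installed and on the PATH. The helper names `wav_path_for` and `convert_to_wav` are hypothetical. Passing the arguments as a list to subprocess.run, instead of a single string to os.system, avoids shell quoting problems with uploaded file names.

```python
import subprocess
from pathlib import Path


def wav_path_for(src_path):
    """Return the path of a .wav file next to `src_path` (same name, new suffix)."""
    return str(Path(src_path).with_suffix(".wav"))


def convert_to_wav(src_path):
    """Convert `src_path` (e.g. a saved .3gp upload) to .wav with ffmpeg.

    Assumes ffmpeg is on the PATH. Raises subprocess.CalledProcessError
    if ffmpeg exits with a non-zero status.
    """
    dst_path = wav_path_for(src_path)
    # -y overwrites dst_path if it already exists
    subprocess.run(["ffmpeg", "-y", "-i", src_path, dst_path], check=True)
    return dst_path
```

In the Flask handler this would be called after audiofile.save(...), for example convert_to_wav('Audio/' + filename).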
Method 2:
Convert .3gp to .mp3 then .mp3 to .wav
Use https://pypi.org/project/ftransc/ to convert .3gp to .mp3. Currently there is no python API for that so either use
bash
ftransc -f mp3 filename.3gp   # give the destination; check the help output
OR
python
os.system('ftransc -f mp3 filename.3gp')
Then use pydub https://github.com/jiaaro/pydub#installation to convert .mp3 to .wav
from pydub import AudioSegment
newAudio = AudioSegment.from_mp3('path/to/mp3')
newAudio.export('path/to/destination.wav', format="wav")
Related
I have started working on an NLP project, and at the start of this, I need to downsample the audio files. To do this I have found one script that can do it automatically, but though I can use it to downsample my audio I'm struggling to understand how it's working.
import os

def convert_audio(audio_path, target_path, remove=False):
    """This function sets the audio `audio_path` to:
        - 16000Hz Sampling rate
        - one audio channel ( mono )
    Params:
        audio_path (str): the path of audio wav file you want to convert
        target_path (str): target path to save your new converted wav file
        remove (bool): whether to remove the old file after converting
    Note that this function requires ffmpeg installed in your system."""
    os.system(f"ffmpeg -i {audio_path} -ac 1 -ar 16000 {target_path}")
    # os.system(f"ffmpeg -i {audio_path} -ac 1 {target_path}")
    if remove:
        os.remove(audio_path)
This is the code that's giving me trouble. I don't understand how the 4th line from the bottom works; I believe that is the line that resamples the audio files.
The repo this is inside of :
https://github.com/x4nth055/pythoncode-tutorials/
If anyone has information on how this is done, or knows of better ways to downsample audio files, I'd love to know. Thanks!
Have you ever used ffmpeg? The docs describe these options clearly (though some audio expertise may be needed to understand them):
-ac[:stream_specifier] channels (input/output,per-stream)
    Set the number of audio channels. For output streams it is set by default to the number of input audio channels. For input streams this option only makes sense for audio grabbing devices and raw demuxers and is mapped to the corresponding demuxer options.
-ar[:stream_specifier] freq (input/output,per-stream)
    Set the audio sampling frequency. For output streams it is set by default to the frequency of the corresponding input stream. For input streams this option only makes sense for audio grabbing devices and raw demuxers and is mapped to the corresponding demuxer options.
Explanations for os.system
Execute the command (a string) in a subshell...on Windows, the return
value is that returned by the system shell after running command. The
shell is given by the Windows environment variable COMSPEC: it is
usually cmd.exe, which returns the exit status of the command run; on
systems using a non-native shell, consult your shell documentation.
For a better understanding, I suggest printing the command:
cmd_str = f"ffmpeg -i {audio_path} -ac 1 -ar 16000 {target_path}"
print(cmd_str) # then you can copy paste to cmd/bash and run
os.system(cmd_str)
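As a hedged alternative to os.system, the same command can be built as an argument list and run with subprocess.run, which raises an exception on failure instead of returning a status code that is easy to ignore. `build_resample_cmd` and `resample` are illustrative names, not part of the original script; shlex.join (Python 3.8+) produces the printable form of the command.

```python
import shlex
import subprocess


def build_resample_cmd(audio_path, target_path, rate=16000, channels=1):
    """Build the ffmpeg argument list: -ac sets channels, -ar sets the sample rate."""
    return ["ffmpeg", "-i", audio_path,
            "-ac", str(channels), "-ar", str(rate), target_path]


def resample(audio_path, target_path):
    cmd = build_resample_cmd(audio_path, target_path)
    print(shlex.join(cmd))           # copy-pasteable form of the command
    subprocess.run(cmd, check=True)  # raises CalledProcessError on failure
```

Because the paths are separate list items, file names containing spaces need no manual quoting.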
I am trying to extract the frames when the scene changes in an .mp4 video.
The package that I am using is FFMPEG.
FFMPEG predominantly works on the CLI and I am trying to integrate it with Python3.x
The command I am using in the CLI is:
ffmpeg -i {0} -vf "select=gt(scene\,0.5), scale=640:360" -vsync vfr frame%d.png
The output comes out just fine with the CLI execution.
But I want to use the same command in a Python script. How do I do that, and what should the code look like?
Being an amateur in the field, currently grappling with this!
You could execute that command from Python via the subprocess module, of course, but it would be better to use a library like https://github.com/kkroening/ffmpeg-python
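A minimal sketch of the subprocess route, assuming ffmpeg is on the PATH. The function names and default values are illustrative. The filter string is the one from the question; because the arguments are passed as a list, no shell is involved, but the backslash before the comma is still needed so that ffmpeg does not treat the comma inside gt(...) as a filter separator.

```python
import subprocess


def build_scene_cmd(video_path, out_pattern="frame%d.png", threshold=0.5):
    """Build the ffmpeg arguments that save one frame per detected scene change."""
    # "\\," in Python source is the two characters "\," that ffmpeg expects
    vf = f"select=gt(scene\\,{threshold}),scale=640:360"
    return ["ffmpeg", "-i", video_path, "-vf", vf, "-vsync", "vfr", out_pattern]


def extract_scene_frames(video_path, out_pattern="frame%d.png", threshold=0.5):
    # check=True raises CalledProcessError if ffmpeg fails
    subprocess.run(build_scene_cmd(video_path, out_pattern, threshold), check=True)
```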
I would recommend PyAV. It's a proper wrapper around ffmpeg's libraries.
The other packages mentioned use the "subprocess" approach, which is limited and inefficient. These libraries may be more convenient than the plain ffmpeg APIs.
Thanks for the help!
This is the snippet of code I'm currently using and it gives the results as I require.
I have added a functionality for timestamp generation of the frames in addition to the frame formation using scene change detection
# FFmpeg call through a script.
# Change the location after "-vsync vfr" in the command to where the frames are to be stored.
# The location should be the same as where the videos are located.
import os

path = 'path/to/videos'  # placeholder: directory that holds the videos

for filename in os.listdir(path):
    file = os.path.splitext(filename)[0]  # file name without the .mp4 extension
    input_file = os.path.join(path, filename)
    frames = os.path.join(path, f'{file}_frame%d.jpg')
    # Build the command inside the loop so each video gets its own frame and timestamp files
    cmd = (f'ffmpeg -i "{input_file}" -filter_complex '
           f'"select=gt(scene\\,0.2), scale=640:360, metadata=print:file=time_{file}.txt" '
           f'-vsync vfr "{frames}"')
    os.system(cmd)
print("Done")  # takes time: loops over all the videos
I want to convert a .raw audio file to a .wav audio file, so I use the code below with pydub's AudioSegment:
final = AudioSegment.from_file('input.raw', format='raw', frame_rate=8000, channels=1, sample_width=1).export('result.wav', format='wav')
However, the output file 'result.wav' sounds very noisy. I'm not sure whether 'input.raw' itself has clear sound (it was extracted from the RTP packets of a VoIP phone call).
So my question is: will the output (.wav) file sound clear if the input (.raw) file is not corrupted? I'm wondering what the problem is: a corrupted file, or incorrect code?
I ran into a similar issue when I was attempting to convert PCMU raw audio to WAV format. I reached out to the author of pydub via this issue on GitHub, and here was his response:
pydub assumes any file is a raw wave if the filename ends with raw.
And also doesn't have a way to inject the -ar 8000 into the conversion
command (to tell ffmpeg that the audio is at 8000 samples per second)
So the workaround is to open the file manually and explicitly tell pydub what the format of the file is like so:
from pydub import AudioSegment

# open the file ourselves so that pydub doesn't try to inspect the file name
with open('input.raw', 'rb') as raw_audio_f:
    # explicitly tell pydub the format of your file
    # use ffmpeg -formats | grep PCM to figure out what string value to use
    sound = AudioSegment.from_file(raw_audio_f, format="mulaw")
    # override the default sample rate with the rate we know is correct
    sound.frame_rate = 8000
    # then export it, naming the format explicitly (pydub's export defaults to mp3)
    sound.export('result.wav', format='wav')
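To illustrate what the conversion involves, here is a standard-library sketch that decodes the bytes directly instead of going through pydub and ffmpeg. It assumes the input is headerless, mono G.711 mu-law (PCMU) audio at 8000 Hz; `ulaw_to_pcm16` and `ulaw_file_to_wav` are hypothetical helper names. Each mu-law byte is a companded value rather than a linear sample, which is a likely reason the audio sounds noisy when it is read as plain 8-bit PCM.

```python
import struct
import wave


def ulaw_to_pcm16(byte):
    """Decode one G.711 mu-law byte to a signed 16-bit linear sample."""
    byte = ~byte & 0xFF                 # mu-law bytes are stored bit-inverted
    sign = byte & 0x80
    exponent = (byte >> 4) & 0x07
    mantissa = byte & 0x0F
    magnitude = (((mantissa << 3) + 0x84) << exponent) - 0x84
    return -magnitude if sign else magnitude


def ulaw_file_to_wav(raw_path, wav_path, frame_rate=8000):
    """Write headerless mono mu-law audio as a 16-bit PCM WAV file."""
    with open(raw_path, "rb") as f:
        samples = [ulaw_to_pcm16(b) for b in f.read()]
    with wave.open(wav_path, "wb") as out:
        out.setnchannels(1)
        out.setsampwidth(2)             # 16-bit samples
        out.setframerate(frame_rate)
        out.writeframes(struct.pack(f"<{len(samples)}h", *samples))
```

For example, ulaw_file_to_wav('input.raw', 'result.wav') writes an 8000 Hz mono WAV file.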
I was trying to get data of a wav file using scipy.io.wavfile.read but it always returns this error message: ValueError: Unexpected end of file.
I went through all the related questions on this site (I guess). But none of them worked. I have also tried writing filename as r'Mozart 40 Allegro.wav'.
import scipy.io.wavfile
sample,data=scipy.io.wavfile.read('Mozart 40 Allegro.wav')
print(data)
Note: Others have mentioned that my wav file may be corrupt, so I downloaded a sample wav file. And this was the result. WavFileWarning: Chunk (non-data) not understood, skipping it.
WavFileWarning)
But is there any way to get the wav file I require which is not corrupt and doesn't give the second error message I mentioned?
Thank You
Thanks. Initially I used some online converters, but they do a very bad job of keeping the file intact in the precise format; VLC can handle such errors, but these can't. Always use sox for conversion and similar tasks, and don't forget to include the required extra files (LAME files) if you are working with MP3.
I have had similar problems with files that weren't created with proper headers. To solve it, I first converted the file from WAV to WAV with ffmpeg, which recreates the metadata for the WAV file.
Then the steps to follow should be more or less:
ffmpeg -i "Mozart 40 Allegro.wav" -f wav -acodec pcm_s16le -ar 22050 -ac 1 "Mozart 40 Allegro_.wav"
The newly created file should then have the proper metadata, so it should no longer raise the error when it is opened in Python:
sample,data=scipy.io.wavfile.read('Mozart 40 Allegro_.wav')
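To check whether a file's header really is the problem, the RIFF chunks can be listed with the standard library before re-encoding anything. This is a diagnostic sketch, not part of scipy; `list_wav_chunks` is a hypothetical helper. It relies only on the RIFF layout: a 12-byte "RIFF"/"WAVE" header followed by chunks, each with a 4-byte ID and a little-endian 32-bit size. A chunk whose declared size exceeds the bytes actually present indicates a truncated file, and chunk IDs other than "fmt " and "data" are the kind of extra chunk the skipped-chunk warning refers to.

```python
import struct


def list_wav_chunks(path):
    """Return (chunk_id, declared_size, bytes_present) for each RIFF chunk."""
    chunks = []
    with open(path, "rb") as f:
        header = f.read(12)
        if len(header) < 12 or header[:4] != b"RIFF" or header[8:12] != b"WAVE":
            raise ValueError("not a RIFF/WAVE file")
        while True:
            head = f.read(8)
            if len(head) < 8:
                break                   # no further complete chunk header
            chunk_id, size = struct.unpack("<4sI", head)
            body = f.read(size)
            chunks.append((chunk_id.decode("ascii", "replace"), size, len(body)))
            if len(body) < size:
                break                   # chunk is truncated: file ends early
            if size % 2:
                f.read(1)               # chunks are padded to an even length
    return chunks
```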
Use underscore for spaces:
sample,data=scipy.io.wavfile.read('Mozart_40_Allegro.wav')
Try:
import soundfile as sf
data, samplerate = sf.read("file")  # sf.read returns the samples and the sample rate
I have 100 uncompressed .mov video files and I want to convert all of them to .sgi image sequences.
I have a list of all the .mov file paths.
How can I convert .mov (video) to .sgi (image sequence) using Python and FFmpeg?
You can use ffmpeg to convert the video to SGI images with this command:
ffmpeg -i inputVideo outputFrames_%04d.sgi
-replace inputVideo your input file path and name
-replace outputFrames with output file path and name
-replace '4' in _%04d with the number of digits you want for sequential image file naming.
Now, one way to process your files from Python is to launch ffmpeg as a subprocess and provide the command you want ffmpeg to execute:
import subprocess as sp
cmd='ffmpeg -i inputVideo outputFrames_%04d.sgi'
sp.call(cmd,shell=True)
Remember to use a double backslash (\\) in file paths in the cmd command string (at least for me on Windows).
If you want to loop over 100 movie files, write a loop that concatenates the command string with the appropriate input and output file names.
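The loop described above could look like the following sketch, assuming ffmpeg is on the PATH. `sgi_command` and `convert_all` are illustrative names, and the output naming scheme (the movie's base name followed by a frame counter) is an assumption. Building the paths with pathlib and passing the arguments as a list avoids both the backslash escaping and the string concatenation.

```python
import subprocess
from pathlib import Path


def sgi_command(mov_path, out_dir, digits=4):
    """Build the ffmpeg command that writes <name>_0001.sgi, ... into out_dir."""
    mov_path = Path(mov_path)
    pattern = Path(out_dir) / f"{mov_path.stem}_%0{digits}d.sgi"
    return ["ffmpeg", "-i", str(mov_path), str(pattern)]


def convert_all(mov_paths, out_dir):
    """Run ffmpeg once per movie; stops with CalledProcessError on the first failure."""
    Path(out_dir).mkdir(parents=True, exist_ok=True)
    for mov_path in mov_paths:
        subprocess.run(sgi_command(mov_path, out_dir), check=True)
```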