Why does audioop cause sound loss for some files? - Python

I am decoding a WAV file and trying to convert it to pcm_s16le, so I used the audioop.adpcm2lin() function, but it causes sound loss for some WAV files. How can I fix it?
import audioop
import struct
import wave

with open("game_music.wav", "rb") as wb:
    types = struct.unpack("4s", wb.read(4))[0]
    print(types)
    temp = wb.read(2)
    temp = wb.read(2)
    channels = struct.unpack("H", wb.read(2))[0]
    print(channels)
    temp = wb.read(2)
    sample_rate = struct.unpack("H", wb.read(2))[0]
    print(sample_rate)
    num_samples = struct.unpack("I", wb.read(4))[0]
    print(num_samples)
    audio_data = audioop.adpcm2lin(wb.read(), 2, None)[0]

with wave.open("game_musics.wav", "w") as wav:
    wav.setnchannels(channels)
    wav.setsampwidth(2)
    wav.setframerate(sample_rate)
    wav.writeframesraw(audio_data)
For example, this file works fine:
https://drive.google.com/file/d/1XAlZZ6bkSRDsnsch6fL2u708xB7xscTa/view?usp=sharing
but this file comes out corrupted:
https://drive.google.com/file/d/1IC8p1TIyEaIi5mwMTWvflCUIhaBaR1BR/view?usp=sharing
If anyone wants to try fixing this, you can use this file to test; it is the original version of the corrupted file:
https://drive.google.com/file/d/1OyuwxMSYxiyh1rMFDJ1_k0bRp_hx5CpX/view?usp=sharing
Thanks for helping.
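Without the files it's hard to say for certain, but one thing worth double-checking is the header parsing: the reads above don't line up with the canonical RIFF layout (the sample rate, for instance, is a 4-byte field), so for some files the channel count or sample rate may be mis-read, which sounds like corruption on playback. A stdlib-only sketch of explicit fmt-chunk parsing, assuming a standard little-endian WAV; the synthetic header at the end is made up purely for illustration:

```python
import struct

def parse_wav_header(data: bytes):
    """Walk the RIFF chunks of a canonical WAV and return the fmt fields."""
    assert data[:4] == b"RIFF" and data[8:12] == b"WAVE", "not a RIFF/WAVE file"
    pos = 12
    while pos + 8 <= len(data):
        chunk_id, chunk_size = struct.unpack_from("<4sI", data, pos)
        if chunk_id == b"fmt ":
            fmt, channels, rate, byte_rate, block_align, bits = \
                struct.unpack_from("<HHIIHH", data, pos + 8)
            return {"format": fmt,          # 1 = PCM, 0x11 = IMA ADPCM
                    "channels": channels,
                    "sample_rate": rate,
                    "block_align": block_align,
                    "bits_per_sample": bits}
        pos += 8 + chunk_size + (chunk_size & 1)  # chunks are word-aligned
    raise ValueError("no fmt chunk found")

# Minimal synthetic header for illustration: mono IMA ADPCM at 22050 Hz
hdr = (b"RIFF" + struct.pack("<I", 36) + b"WAVE" +
       b"fmt " + struct.pack("<IHHIIHH", 16, 0x11, 1, 22050, 11289, 512, 4))
print(parse_wav_header(hdr))
```

If the game files turn out to use a non-standard header, the same chunk-walking approach at least makes the mismatch visible instead of silently reading the wrong fields.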

Related

How to compress a WAV file in Python?

I have converted MP3 files to WAV format, but how can I compress the WAV files to a very small size, less than or about equal to the MP3 size, without changing the file format?
from pydub import AudioSegment
import os

# source and destination folders
src_folder = "D:/projects/data/mp3"
dst_folder = "D:/projects/data/wav"

# get all audio files
files = os.listdir(src_folder)

for name in files:
    # file name without the extension
    wav_name = name.replace(".mp3", "")
    try:
        # convert mp3 to wav
        sound = AudioSegment.from_mp3("{}/{}".format(src_folder, name))
        sound.export("{}/{}.wav".format(dst_folder, wav_name), format="wav")
    except Exception as e:
        pass
s1.export("output.mp3", format='mp3', parameters=["-ac", "2", "-ar", "8000"])
This line of code (where s1 is an AudioSegment) reduced my audio to half its previous size. Hope this is helpful to someone.
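Another way to shrink a WAV while keeping the .wav container is to lower the sample rate. Below is a naive stdlib-only sketch that halves the frame rate by simply dropping every other frame (no anti-alias filtering, so quality suffers; ffmpeg's resampler, as used via the parameters above, does this properly). The file names are made up for the demo:

```python
import wave

def halve_frame_rate(src, dst):
    """Naive downsample: keep every other frame (no anti-alias filtering)."""
    with wave.open(src, "rb") as r:
        params = r.getparams()
        frames = r.readframes(params.nframes)
    fw = params.sampwidth * params.nchannels          # bytes per frame
    kept = b"".join(frames[i:i + fw] for i in range(0, len(frames), 2 * fw))
    with wave.open(dst, "wb") as w:
        w.setnchannels(params.nchannels)
        w.setsampwidth(params.sampwidth)
        w.setframerate(params.framerate // 2)
        w.writeframes(kept)

# Demo on a generated 16 kHz mono file
with wave.open("demo_in.wav", "wb") as w:
    w.setnchannels(1)
    w.setsampwidth(2)
    w.setframerate(16000)
    w.writeframes(bytes(3200))                        # 0.1 s of silence

halve_frame_rate("demo_in.wav", "demo_out.wav")
with wave.open("demo_out.wav", "rb") as w:
    print(w.getframerate(), w.getnframes())           # -> 8000 800
```

Halving the rate halves the file size; the trade-off is losing all content above the new Nyquist frequency.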

Mozilla DeepSpeech: How to generate an SRT file from multiple segmented audio files?

I've been following this guide on generating an SRT subtitle file from video/audio files using Mozilla DeepSpeech.
I've been able to split the audio .wav file, with the silent portions removed, into multiple segmented .wav files based on the guide, using the pyAudioAnalysis library.
Segmented audio files
However, I'm currently having difficulty understanding how to read through the multiple-segmented files and generate a subtitle .srt file using Mozilla DeepSpeech. I've attached an image of the segmented audio file above.
As for my current code, most of it is similar to the guide I'm following, but the guide doesn't explain the functions well enough.
SilenceRemoval Function
from pyAudioAnalysis import audioBasicIO as aIO
from pyAudioAnalysis import audioSegmentation as aS

def silenceRemoval(input_file, smoothing_window=1.0, weight=0.2):
    print("Running silenceRemoval function\n")
    [fs, x] = aIO.read_audio_file(input_file)
    segmentLimits = aS.silence_removal(x, fs, 0.05, 0.05, smoothing_window, weight)
    for i, s in enumerate(segmentLimits):
        strOut = "{0:s}_{1:.3f}-{2:.3f}.wav".format(input_file[0:-4], s[0], s[1])
        # wavfile.write(strOut, fs, x[int(fs * s[0]):int(fs * s[1])])
        write_file("audio", strOut, ".wav", x[int(fs * s[0]):int(fs * s[1])], fs)
    print("\nsilenceRemoval function completed")
Writing .wav file into multiple segments
import os
import scipy.io.wavfile as wavfile

def write_file(output_file_path, input_file_name, name_attribute, sig, fs):
    """
    Write a mono wave file.
    Args:
        - output_file_path (str) : path to save the resulting wave file to.
        - input_file_name (str) : name of the processed wave file.
        - name_attribute (str) : attribute to add to the output file name.
        - sig (array) : signal/audio array.
        - fs (int) : sampling rate.
    """
    # set up the output file name
    fname = os.path.basename(input_file_name).split(".wav")[0] + name_attribute
    fpath = os.path.join(output_file_path, fname)
    wavfile.write(filename=fpath, rate=fs, data=sig)
    print("Writing data to " + fpath + ".")
main() calling the functions
video_name = "Videos\MIB_Sample.mp4"
audio_name = video_name + ".wav"

# DeepSpeech Model and Scorer
ds = Model("deepspeech-0.9.3-models.pbmm")
scorer = ds.enableExternalScorer("deepspeech-0.9.3-models.scorer")

def main():
    # Extract audio from the input video file
    extractAudio(video_name, audio_name)
    print("Splitting on silent parts in audio file")
    silenceRemoval(audio_name)
    generateSRT(audio_name)
generateSRT() function
def generateSRT(audio_file_name):
    command = ["deepspeech", "--model", ds,
               "--scorer", scorer,
               "--audio", audio_file_name]
    try:
        ret = sp.call(command, shell=True)
        print("generating subtitles")
    except Exception as e:
        print("Error: ", str(e))
        exit(1)
I'm currently trying to generate subtitles from the single extracted audio file, but I'm facing this error:
Error: expected str, bytes or os.PathLike object, not Model
I'd appreciate any help on how to loop through the folder of segmented audio files, read each one, and generate an SRT file with Mozilla DeepSpeech, outputting it to another folder. Thank you!
I'm going to address the specific error you are encountering here; the blog post that you linked to is a good guide for the end to end process of creating .srt files using DeepSpeech.
In your code here:
command = ["deepspeech", "--model", ds,
           "--scorer", scorer,
           "--audio", audio_file_name]
you are invoking the deepspeech binary from the command line, and passing the model as an argument, using the variable ds. If you invoke deepspeech from the command line, it expects a file path to where the model file (the .pbmm file) is.
This is why you are receiving the error:
Error: expected str, bytes or os.PathLike object, not Model
because the deepspeech binary is expecting a file path, not a model object. Try replacing ds with the file path to the model file, rather than making ds a Model.
For more information on how to call deepspeech from the command line, see this page in the documentation.
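To also cover the looping part of the question, here is a sketch (stdlib only, helper names hypothetical) that builds one CLI invocation per segmented file, passing file paths rather than Model objects; each command can then be run with subprocess.run:

```python
import glob
import os
import subprocess

def build_deepspeech_command(model_path, scorer_path, audio_path):
    # Paths are plain strings here, never Model objects
    return ["deepspeech", "--model", model_path,
            "--scorer", scorer_path,
            "--audio", audio_path]

model_path = "deepspeech-0.9.3-models.pbmm"      # file names from the question
scorer_path = "deepspeech-0.9.3-models.scorer"

# "audio" is the output folder used by write_file() above
segments = sorted(glob.glob(os.path.join("audio", "*.wav")))
commands = [build_deepspeech_command(model_path, scorer_path, seg)
            for seg in segments]
# for cmd in commands:
#     subprocess.run(cmd, check=True)            # run each segment through DeepSpeech
```

Each run would still need its transcript captured and timestamped to assemble the final .srt; the linked blog post covers that assembly step.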

How to decode base64 to WAV format using Python?

I have audio recordings saved on a server as base64, in webm format, and want to decode them with Python into a WAV file. I tried both suggested approaches from a similar question found here: How to decode base64 String directly to binary audio format. But I'm facing different problems with each suggestion:
The version using file.write resulted in a WAV file that I could play with the VLC player and that contained the expected content, but I got an error message when I tried to read it with MATLAB or Python, saying "unknown format" or "missing riff".
fin = open(dirName + file, "r")
b64_str = fin.read()
fin.close()

# decode base64 string back to the original binary sound data
decodedData = base64.b64decode(b64_str)

webmfile = (outdir + file.split('.')[0] + ".webm")
wavfile = (outdir + file.split('.')[0] + ".wav")

with open(webmfile, 'wb') as wm:
    wm.write(decodedData)
with open(webmfile, 'rb') as wm:
    webmdata = wm.read()
with open(wavfile, 'wb') as file:
    file.write(webmdata)
The version using writeframes after setting the parameters results in a file I can read with MATLAB or Python, but it does not contain the expected content and is much shorter than expected.
with wave.open(wavfile, 'wb') as wav:
    wav.setparams((1, 2, 48000, 0, 'NONE', 'NONE'))
    wav.writeframes(webmdata)
Any ideas on how to solve this? The file itself is fine; converting it with an online converter worked.
In case someone has the same problem at some point, here is the solution which worked for me:
The following code creates a webm file from the base64 str:
import base64

decodedData = base64.b64decode(b64_str)
webmfile = (outdir + file.split('.')[0] + ".webm")

with open(webmfile, 'wb') as file:
    file.write(decodedData)
And for the conversion I used ffmpy:
from ffmpy import FFmpeg

ff = FFmpeg(
    executable='C:/Program Files/ffmpeg-2020/bin/ffmpeg.exe',
    inputs={file: None},
    outputs={outfile: '-c:a pcm_f32le'})
ff.cmd   # show the command line that will be run
ff.run()
After those two steps, I was able to read the resulting wav file with matlab or any other program.

How to generate a WAV with G.711 a-law from an MP3 file using the pydub library?

I am trying to generate a WAV file with G.711 a-law companding from an MP3 file using the pydub library. The WAV file is generated, but it is not resampled to 8 kHz. I have tried the following code:
from pydub import AudioSegment

from_path = '/home/nikhil/Music/m1.mp3'  # source mp3 file
to_path = '/home/nikhil/Music/m1.wav'    # resulting file
from_format = 'mp3'
to_format = 'wav'
params = ["-acodec", "pcm_alaw", "-ar", "8000"]
AudioSegment.from_file(from_path, from_format).export(to_path, format=to_format, parameters=params)
Can someone help me?
I was looking over the code in the export method and realized that ffmpeg is not used when the output format is "wav".
Since wav is handled internally, pydub just writes the in-memory version of the audio directly to disk (this was done to make ffmpeg an optional dependency; if you only need wav support, you don't need to install it).
I have two ideas that may let you work around this issue:
Use a different format kwarg, like "pcm". I'm not sure if this will work, and I don't have ffmpeg on my current machine to test, but definitely worth a try.
from_path = '/home/nikhil/Music/m1.mp3'  # source mp3 file
to_path = '/home/nikhil/Music/m1.wav'    # resulting file
from_format = 'mp3'
to_format = 'pcm'
params = ["-acodec", "pcm_alaw", "-ar", "8000"]
AudioSegment.from_file(from_path, from_format).export(to_path, format=to_format, parameters=params)
Use pydub's internal mechanism for resampling to 8kHz: Again, I can't test this right at the moment...
from_path = '/home/nikhil/Music/m1.mp3'  # source mp3 file
to_path = '/home/nikhil/Music/m1.wav'    # resulting file

seg = AudioSegment.from_mp3(from_path)
seg = seg.set_frame_rate(8000)
seg.export(to_path, format="wav")
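Whichever route you take, it's worth confirming the exported file's rate afterwards. A stdlib-only sketch of such a check (this works for the plain-PCM export of the second idea; Python's wave module cannot open a-law WAVs), with the demo file generated purely for illustration:

```python
import wave

def wav_summary(path):
    """Return (channels, sample_width_bytes, frame_rate) of a PCM WAV file."""
    with wave.open(path, "rb") as w:
        return w.getnchannels(), w.getsampwidth(), w.getframerate()

# Demo: write a short 8 kHz mono PCM file, then read its parameters back
with wave.open("check_8k.wav", "wb") as w:
    w.setnchannels(1)
    w.setsampwidth(2)
    w.setframerate(8000)
    w.writeframes(b"\x00\x00" * 800)  # 0.1 s of silence

print(wav_summary("check_8k.wav"))  # -> (1, 2, 8000)
```

For the a-law export of the first idea, ffprobe (or any media inspector) would be needed instead of wave.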

MP3 with PyAudio

import pyaudio
import wave

chunk = 1024

wf = wave.open('yes.mp3', 'rb')
p = pyaudio.PyAudio()

stream = p.open(
    format=p.get_format_from_width(wf.getsampwidth()),
    channels=wf.getnchannels(),
    rate=wf.getframerate(),
    output=True)

data = wf.readframes(chunk)
while data:  # readframes returns bytes, so test truthiness rather than != ''
    stream.write(data)
    data = wf.readframes(chunk)

stream.close()
p.terminate()
No matter how I write this, trying multiple methods, I keep getting the following error in the terminal:
raise Error, 'file does not start with RIFF id'
I would use pyglet, but media and all the other modules aren't detected even though I'm able to import pyglet.
Any help?
You're using wave to open a file that is not a WAV; you're opening an MP3 file. The wave module can only open WAV files, so you need to convert the MP3 to WAV first. Alternatively, here's how you can use pyglet to play an MP3 file:
import pyglet
music = pyglet.resource.media('music.mp3')
music.play()
pyglet.app.run()
It would be much simpler than the method you're trying. What errors are you getting with pyglet?
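The RIFF check that wave performs can also be done up front, before handing the file to wave.open. A minimal stdlib sketch; the file written here is a stand-in with MP3-style header bytes, purely for illustration:

```python
def looks_like_wav(path):
    """True if the file begins with the RIFF id that wave.open requires."""
    with open(path, "rb") as f:
        return f.read(4) == b"RIFF"

# MP3 files usually start with an "ID3" tag or an 0xFF frame-sync byte,
# never "RIFF", which is exactly why wave.open raises the error above.
with open("fake.mp3", "wb") as f:
    f.write(b"ID3\x04" + b"\x00" * 64)

print(looks_like_wav("fake.mp3"))  # -> False
```

Sniffing the header first gives a clearer error message than letting wave raise mid-setup.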
