I am trying to read the data from a WAV file using scipy.io.wavfile.read, but it always fails with this error message: ValueError: Unexpected end of file.
I went through all the related questions on this site (I think), but none of the answers worked. I have also tried writing the filename as r'Mozart 40 Allegro.wav'.
import scipy.io.wavfile
sample,data=scipy.io.wavfile.read('Mozart 40 Allegro.wav')
print(data)
Note: Others have mentioned that my wav file may be corrupt, so I downloaded a sample wav file. And this was the result. WavFileWarning: Chunk (non-data) not understood, skipping it.
WavFileWarning)
Is there any way to get the WAV file I need in a form that is not corrupt and does not trigger the second message above?
Thank You
Thanks. Initially I used an online converter, but those do a very bad job of keeping the file intact in the exact format; VLC can cope with such errors, but these readers can't. Always use sox for conversion and similar tasks, and don't forget to include the required extra files (the LAME files) if you are working with mp3.
I have had similar problems with files that were not created with proper headers. To solve this, I first converted the file from WAV to WAV with ffmpeg, which rewrites the header and metadata of the WAV file.
The command is roughly:
ffmpeg -i "Mozart 40 Allegro.wav" -f wav -acodec pcm_s16le -ar 22050 -ac 1 "Mozart 40 Allegro_.wav"
The newly created file should then have the proper metadata, so it should no longer raise the error when it is opened in Python:
sample,data=scipy.io.wavfile.read('Mozart 40 Allegro_.wav')
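If you want to see why a file is rejected before converting it, one option is to walk its RIFF chunks by hand. The sketch below is only an illustration using the standard struct module; list_riff_chunks is a made-up helper name, not part of scipy. It reports each chunk's declared size next to the number of bytes actually present, so a truncated data chunk (a plausible cause of "Unexpected end of file") or an extra non-data chunk stands out.

```python
import struct


def list_riff_chunks(blob):
    """Return [(chunk_id, declared_size, bytes_available)] for RIFF/WAVE bytes.

    Hypothetical diagnostic helper: it only walks chunk headers and does not
    decode any audio.
    """
    if len(blob) < 12:
        raise ValueError("file too short for a RIFF header")
    riff, _riff_size, wave_id = struct.unpack_from("<4sI4s", blob, 0)
    if riff != b"RIFF" or wave_id != b"WAVE":
        raise ValueError("not a RIFF/WAVE file")

    chunks = []
    pos = 12
    while pos + 8 <= len(blob):
        chunk_id, size = struct.unpack_from("<4sI", blob, pos)
        start = pos + 8
        # bytes of this chunk that are really in the file
        available = max(0, min(size, len(blob) - start))
        chunks.append((chunk_id.decode("ascii", "replace"), size, available))
        # chunk bodies are padded to an even number of bytes
        pos = start + size + (size & 1)
    return chunks
```

Call it as list_riff_chunks(open('Mozart 40 Allegro.wav', 'rb').read()). A chunk whose available size is smaller than its declared size means the file was cut short.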
Use underscore for spaces:
sample,data=scipy.io.wavfile.read('Mozart_40_Allegro.wav')
Try:
import soundfile as sf
# sf.read returns a (data, samplerate) tuple
audio, samplerate = sf.read("file")
I have started working on an NLP project, and at the start of this, I need to downsample the audio files. To do this I have found one script that can do it automatically, but though I can use it to downsample my audio I'm struggling to understand how it's working.
import os

def convert_audio(audio_path, target_path, remove=False):
    """This function sets the audio `audio_path` to:
        - 16000Hz Sampling rate
        - one audio channel ( mono )
    Params:
        audio_path (str): the path of audio wav file you want to convert
        target_path (str): target path to save your new converted wav file
        remove (bool): whether to remove the old file after converting
    Note that this function requires ffmpeg installed in your system."""
    os.system(f"ffmpeg -i {audio_path} -ac 1 -ar 16000 {target_path}")
    # os.system(f"ffmpeg -i {audio_path} -ac 1 {target_path}")
    if remove:
        os.remove(audio_path)
This is the code that's giving me trouble. I don't understand how the 4th line from the bottom works; I believe that is the line that resamples the audio files.
The repo this is inside of :
https://github.com/x4nth055/pythoncode-tutorials/
If anyone has information on how this is done, I'd love to know, or if there are better ways to downsample audio files! Thanks
Have you ever used ffmpeg? The docs clearly show the options (though you may need some audio expertise to understand them):
-ac[:stream_specifier] channels (input/output,per-stream)
    Set the number of audio channels. For output streams it is set by default to the number of input audio channels. For input streams this option only makes sense for audio grabbing devices and raw demuxers and is mapped to the corresponding demuxer options.
-ar[:stream_specifier] freq (input/output,per-stream)
    Set the audio sampling frequency. For output streams it is set by default to the frequency of the corresponding input stream. For input streams this option only makes sense for audio grabbing devices and raw demuxers and is mapped to the corresponding demuxer options.
The documentation for os.system explains:
Execute the command (a string) in a subshell...on Windows, the return
value is that returned by the system shell after running command. The
shell is given by the Windows environment variable COMSPEC: it is
usually cmd.exe, which returns the exit status of the command run; on
systems using a non-native shell, consult your shell documentation.
For a better understanding, I suggest printing the command:
cmd_str = f"ffmpeg -i {audio_path} -ac 1 -ar 16000 {target_path}"
print(cmd_str) # then you can copy paste to cmd/bash and run
os.system(cmd_str)
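One caveat with the f-string version: the command goes through the shell, so a path containing spaces (such as 'Mozart 40 Allegro.wav') is split into several arguments. A possible alternative, sketched here under the assumption that ffmpeg is on your PATH, is to pass the arguments as a list to subprocess.run. The helper name build_ffmpeg_command is made up for this example; the -i, -ac and -ar flags are the same ones the original script uses.

```python
import subprocess


def build_ffmpeg_command(audio_path, target_path, sample_rate=16000, channels=1):
    """Build the ffmpeg argument list (hypothetical helper, no shell quoting needed)."""
    return [
        "ffmpeg",
        "-i", audio_path,          # input file
        "-ac", str(channels),      # number of output audio channels
        "-ar", str(sample_rate),   # output sampling frequency in Hz
        target_path,
    ]


def convert_audio(audio_path, target_path):
    """Resample to 16 kHz mono; raises CalledProcessError if ffmpeg fails."""
    subprocess.run(build_ffmpeg_command(audio_path, target_path), check=True)
```

Because each list element reaches ffmpeg as exactly one argument, file names with spaces work without extra quoting, and check=True turns a failed conversion into an exception instead of a silently ignored exit status.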
I want to convert a .raw audio file to a .wav audio file, so I use the code below with pydub's AudioSegment:
final = AudioSegment.from_file('input.raw', format='raw', frame_rate=8000, channels=1, sample_width=1).export('result.wav', format='wav')
However, the output file 'result.wav' sounds very noisy. I'm not actually sure that 'input.raw' itself contains clear sound (it was extracted from the RTP packets of a VoIP phone call).
So my question is: will the output (.wav) file sound clear as long as the input (.raw) file is not corrupted? I'm wondering what the problem is: a corrupted file, or incorrect code?
I ran into a similar issue when I was attempting to convert PCMU RAW audio to WAV format. I reached out to the author of pydub via this issue on GitHub, and here was his response:
pydub assumes any file is a raw wave if the filename ends with raw.
And also doesn't have a way to inject the -ar 8000 into the conversion
command (to tell ffmpeg that the audio is at 8000 samples per second)
So the workaround is to open the file manually and explicitly tell pydub what the format of the file is like so:
from pydub import AudioSegment

# open the file ourselves so that pydub doesn't try to inspect the file name
with open('input.raw', 'rb') as raw_audio_f:
    # explicitly tell pydub the format for your file
    # use ffmpeg -formats | grep PCM to figure out which string value to use
    sound = AudioSegment.from_file(raw_audio_f, format="mulaw")
    # override the default sample rate with the rate we know is correct
    sound.frame_rate = 8000
    # then export it (export() defaults to mp3, so name the format explicitly)
    sound.export('result.wav', format='wav')
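To illustrate why reading the bytes as plain 8-bit PCM sounds noisy: if the RTP payload really is G.711 mu-law (PCMU), each byte is a companded code, not a linear sample. The following is a minimal standard-library sketch of the decoding step, assuming mu-law input at 8000 Hz mono. The function names are invented for this example and this is not how pydub or ffmpeg implement it internally.

```python
import io
import struct
import wave

BIAS = 0x84  # constant used by the G.711 mu-law expansion


def ulaw_to_linear(byte):
    """Expand one mu-law byte to a signed 16-bit linear sample."""
    u = ~byte & 0xFF                      # mu-law bytes are stored inverted
    t = ((u & 0x0F) << 3) + BIAS          # mantissa
    t <<= (u & 0x70) >> 4                 # exponent (segment)
    return BIAS - t if u & 0x80 else t - BIAS


def ulaw_to_wav_bytes(raw, sample_rate=8000):
    """Decode mu-law bytes and wrap them in a 16-bit mono WAV container."""
    pcm = struct.pack("<%dh" % len(raw), *(ulaw_to_linear(b) for b in raw))
    buf = io.BytesIO()
    with wave.open(buf, "wb") as wav:
        wav.setnchannels(1)
        wav.setsampwidth(2)               # 16-bit samples
        wav.setframerate(sample_rate)
        wav.writeframes(pcm)
    return buf.getvalue()
```

For example, open('result.wav', 'wb').write(ulaw_to_wav_bytes(open('input.raw', 'rb').read())) would write the decoded audio. If the result is still noisy, the payload may be A-law or another codec rather than mu-law.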
I am using the Google Cloud API for speech-to-text.
Below is my Python code:
from google.cloud import speech_v1p1beta1 as speech
import os
os.environ["GOOGLE_APPLICATION_CREDENTIALS"]="C:\\Users\\chetan.patil\\Speech Recognition-db71b5de7c80.json" #Specified key
client=speech.SpeechClient()
speech_file="Chetan_Recording_20Secflac.flac" #import file
with open(speech_file,'rb') as audio_file:
    content=audio_file.read()
audio=speech.types.RecognitionAudio(content=content)
config=speech.types.RecognitionConfig(encoding=speech.enums.RecognitionConfig.AudioEncoding.LINEAR16,
                                      language_code='en_US',enable_speaker_diarization=True,audio_channel_count=1,
                                      sample_rate_hertz=44100)
response = client.recognize(config, audio)
When I run the last line of code, it gives the error "400 Specify FLAC encoding to match file header".
I also tried with a .wav file, and then it gives the error "400 Must use single channel (mono) audio, but WAV header indicates 2 channels".
Can anyone please help me with this?
Removing the entire encoding configuration also seems to work. I mean dropping the encoding=speech.enums.RecognitionConfig.AudioEncoding.LINEAR16 from the config settings since this can be inferred from the headers of the audio file.
When I run the last line of code, it gives the error "400 Specify FLAC encoding to match file header"
You need speech.enums.RecognitionConfig.AudioEncoding.FLAC to process FLAC files
I also tried with a .wav file, and then it gives the error "400 Must use single channel (mono) audio, but WAV header indicates 2 channels"
The wav file should be mono indeed, looks like you tried a stereo file.
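Before sending a WAV file, it can help to check what its header actually says, so that audio_channel_count and sample_rate_hertz match it. This is a small sketch with the standard wave module, assuming an uncompressed PCM WAV; describe_wav is a hypothetical helper name and takes the same bytes that the question reads into content.

```python
import io
import wave


def describe_wav(content):
    """Return (channels, sample_rate_hz, sample_width_bytes) from WAV bytes."""
    with wave.open(io.BytesIO(content), "rb") as wav:
        return wav.getnchannels(), wav.getframerate(), wav.getsampwidth()
```

If the first value is 2, the file is stereo and has to be converted to mono first (for example with ffmpeg's -ac 1, as in the command-line examples elsewhere on this page) or the config has to be changed to match.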
When I run this from command line everything is fine
ffmpeg -i input.mp4 -f mp3 -ab 320000 -vn output.mp3
But when I call the same from python
subprocess.call(['ffmpeg', '-i', 'input.mp4', '-f', 'mp3', '-ab', '320000', '-vn', 'output.mp3'])
After several seconds of converting, I get this error:
[aac @ 0x7fb3d803e000] decode_band_types: Input buffer exhausted before END element found
Error while decoding stream #0:1: Invalid data found when processing input
[mov,mp4,m4a,3gp,3g2,mj2 @ 0x7fb3d8000000] stream 1, offset 0x80011d: partial file
input.mp4: Invalid data found when processing input
Any ideas?
You need to add the -dn and -ignore_unknown options, plus -sn if subtitles are causing the encoding failure.
-dn disables data streams in the output.
-sn disables subtitle streams in the output.
-ignore_unknown ignores input streams of unknown type (SCTE 35, 128 data).
The -dn, -sn and -ignore_unknown options work irrespective of the input streams.
That will solve your problem.
There are other options if you want to preserve the data and subtitle streams.
-c:d copy copies the data streams.
-c:s copy copies the subtitle streams.
You can use the -copy_unknown option to carry the unknown streams over into your output.
Your final code would look like below.
subprocess.call(['ffmpeg', '-i', 'input.mp4', '-f', 'mp3', '-ab', '320000', '-vn', '-sn', '-dn', '-ignore_unknown', 'output.mp3'])
NOTE: -copy_unknown option only works with ffmpeg 4.x version or above.
Note - this error is not a FATAL error as defined by ffmpeg, meaning you could catch the error in code and the conversion will continue and the mp4 will be playable. It will be missing the 'failed' information, but most likely this was invisible to the user.
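As a rough sketch of "catching the error in code": subprocess.call never raises on a non-zero exit status, so one way to handle this is to capture the exit code and stderr and decide yourself whether the output is usable. The helper name run_tolerant is invented for this example, and whether a given ffmpeg message is harmless is a judgment you still have to make.

```python
import subprocess


def run_tolerant(cmd):
    """Run cmd and return (ok, stderr_text) instead of raising on failure."""
    result = subprocess.run(cmd, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
    # ffmpeg writes its log, including decode errors, to stderr
    return result.returncode == 0, result.stderr.decode("utf-8", "replace")
```

You could call it with the same argument list as above, e.g. ok, log = run_tolerant(['ffmpeg', '-i', 'input.mp4', '-f', 'mp3', '-ab', '320000', '-vn', 'output.mp3']), and then check both ok and whether output.mp3 exists and plays.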
So here's the problem. I have sample.gz file which is roughly 60KB in size. I want to decompress the first 2000 bytes of this file. I am running into CRC check failed error, I guess because the gzip CRC field appears at the end of file, and it requires the entire gzipped file to decompress. Is there a way to get around this? I don't care about the CRC check. Even if I fail to decompress because of bad CRC, that is OK. Is there a way to get around this and unzip partial .gz files?
The code I have so far is
import gzip
import time
import StringIO
file = open('sample.gz', 'rb')
mybuf = MyBuffer(file)
mybuf = StringIO.StringIO(file.read(2000))
f = gzip.GzipFile(fileobj=mybuf)
data = f.read()
print data
The error encountered is
  File "gunzip.py", line 27, in ?
    data = f.read()
  File "/usr/local/lib/python2.4/gzip.py", line 218, in read
    self._read(readsize)
  File "/usr/local/lib/python2.4/gzip.py", line 273, in _read
    self._read_eof()
  File "/usr/local/lib/python2.4/gzip.py", line 309, in _read_eof
    raise IOError, "CRC check failed"
IOError: CRC check failed
Also is there any way to use zlib module to do this and ignore the gzip headers?
The issue with the gzip module is not that it can't decompress the partial file. The error occurs only at the end, when it tries to verify the checksum of the decompressed content. (The original checksum is stored at the end of the compressed file, so the verification will never, ever work with a partial file.)
The key is to trick gzip into skipping the verification. The answer by caesar0301 does this by modifying the gzip source code, but it's not necessary to go that far; simple monkey patching will do. I wrote this context manager to temporarily replace gzip.GzipFile._read_eof while I decompress the partial file:
import contextlib
import gzip

# Python 2 only: in Python 3.5+ this method lives on gzip._GzipReader instead
@contextlib.contextmanager
def patch_gzip_for_partial():
    """
    Context manager that replaces gzip.GzipFile._read_eof with a no-op.
    This is useful when decompressing partial files, something that won't
    work if GzipFile does its checksum comparison.
    """
    _read_eof = gzip.GzipFile._read_eof
    gzip.GzipFile._read_eof = lambda *args, **kwargs: None
    try:
        yield
    finally:
        # restore the original method even if decompression raises
        gzip.GzipFile._read_eof = _read_eof
An example usage:
from cStringIO import StringIO
with patch_gzip_for_partial():
    # the first positional argument of GzipFile is a filename, so pass fileobj by keyword
    decompressed = gzip.GzipFile(fileobj=StringIO(compressed)).read()
It seems that you need to look into the Python zlib library instead.
The GZIP format relies on zlib, but introduces a file-level compression concept along with CRC checking, and this appears to be what you do not want/need at the moment.
See for example these code snippets from Doug Hellmann.
Edit: the code on Doug Hellmann's site only shows how to compress or decompress with zlib. As indicated above, GZIP is "zlib with an envelope", and you'll need to decode the envelope before getting to the compressed data per se. Here's more info on how to go about it; it's really not that complicated:
See RFC 1952 for details about the GZIP format.
This format starts with a 10-byte header, followed by optional, uncompressed elements such as the file name or a comment, followed by the DEFLATE-compressed data, itself followed by a CRC-32 of the uncompressed data and its length (a true CRC-32, not the Adler-32 checksum used by the zlib format).
By using Python's struct module, parsing the header should be relatively simple.
The compressed sequence (or its first few thousand bytes, since that is what you want to do) can then be decompressed with Python's zlib module, as shown in the examples above. Because it is a raw DEFLATE stream with no zlib header, pass a negative wbits value.
Possible problems to handle: there may be more than one member in the GZIP file, and the second member may start within the block of a few thousand bytes we wish to decompress.
Sorry to provide neither a simple procedure nor a ready-to-go snippet; however, decoding the file with the indications above should be relatively quick and simple.
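The steps described above could look roughly like this in Python 3. Treat it as a sketch only: it assumes a single gzip member and that the whole header falls within the partial buffer, and the function names are invented for this example. The header layout follows RFC 1952, and the negative wbits value tells zlib to expect a raw DEFLATE stream with no header or trailer, so no CRC check is attempted.

```python
import gzip
import struct
import zlib

# FLG bits from RFC 1952
FHCRC, FEXTRA, FNAME, FCOMMENT = 2, 4, 8, 16


def skip_gzip_header(buf):
    """Return the offset of the first DEFLATE byte in a gzip member."""
    if len(buf) < 10:
        raise ValueError("buffer too short for a gzip header")
    magic, method, flags, _mtime, _xfl, _os = struct.unpack("<HBBIBB", buf[:10])
    if magic != 0x8B1F or method != 8:
        raise ValueError("not a gzip/DEFLATE stream")
    pos = 10
    if flags & FEXTRA:                      # 2-byte length + extra field
        (xlen,) = struct.unpack("<H", buf[pos:pos + 2])
        pos += 2 + xlen
    if flags & FNAME:                       # zero-terminated file name
        pos = buf.index(b"\x00", pos) + 1
    if flags & FCOMMENT:                    # zero-terminated comment
        pos = buf.index(b"\x00", pos) + 1
    if flags & FHCRC:                       # 2-byte header CRC
        pos += 2
    return pos


def decompress_partial(buf):
    """Decompress as much as possible from the start of a gzip file."""
    start = skip_gzip_header(buf)
    inflater = zlib.decompressobj(-zlib.MAX_WBITS)  # raw DEFLATE
    return inflater.decompress(buf[start:])
```

With this, decompress_partial(open('sample.gz', 'rb').read(2000)) returns whatever could be decoded from the first 2000 compressed bytes. As another answer on this page points out, the number of output bytes depends on the data.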
I can't see any possible reason why you would want to decompress the first 2000 compressed bytes. Depending on the data, this may uncompress to any number of output bytes.
Surely you want to uncompress the file, and stop when you have uncompressed as much of the file as you need, something like:
f = gzip.GzipFile(fileobj=open('postcode-code.tar.gz', 'rb'))
data = f.read(4000)
print data
AFAIK, this won't cause the whole file to be read. It will only read as much as is necessary to get the first 4000 bytes.
I also encountered this problem when using my Python script to read compressed files that were generated by the gzip tool under Linux, where the original files had been lost.
By reading the implementation of Python's gzip.py, I found that gzip.GzipFile offers methods similar to those of a file object and uses Python's zlib module to compress and decompress data. It also has a _read_eof() method that checks the CRC of each file.
But in some situations, such as processing a stream or a .gz file without a correct CRC (my problem), an IOError("CRC check failed") is raised by _read_eof(). Therefore, I modified the gzip module to disable the CRC check, and the problem finally disappeared.
def _read_eof(self):
    pass
https://github.com/caesar0301/PcapEx/blob/master/live-scripts/gzip_mod.py
I know it's a brute-force solution, but it saves a lot of time compared with rewriting the low-level methods yourself using the zlib module, such as reading data chunk by chunk from the compressed file and extracting it line by line, most of which is already implemented in the gzip module.
Jamin