How to take the middle time range from a .wav file in Python - python

I need to extract exactly 8 seconds of audio data from the middle of a .wav file that is 0:27 long.
What I have so far takes the middle 9 seconds by dividing the .wav file into 3 parts and keeping the middle one, but that gives 9 s and I need 8 s.
Also, how do I find the number of bits in that numpy array?
import scipy.io.wavfile
import pyaudio
import numpy as np

(sample_rate, data) = scipy.io.wavfile.read('Track48.wav')
CHANNELS = 2
p = pyaudio.PyAudio()

# split the file into thirds and keep the middle third (~9 s, but I need 8 s)
nine_sec = len(data) // 3
eight_sec = 2 * len(data) // 3

stream = p.open(format=pyaudio.paInt16,
                channels=CHANNELS,
                rate=44100,
                output=True)

cuted_data = data[nine_sec:eight_sec]
newdata = cuted_data.astype(np.int16).tobytes()  # tostring() is deprecated, use tobytes()
stream.write(newdata)
print(cuted_data)
Thank you for your help.

You can use pydub to slice the middle 8 seconds very easily.
Details on pydub are here
You can install it with pip install pydub
I had a wav file of 348 seconds duration, whose middle 8 seconds are sliced below.
>>> song.duration_seconds
348.05551020408166
You can also use different file formats such as wav, mp3, m4a, ogg, etc. for import (converting them to AudioSegments) and export.
Source Code
from pydub import AudioSegment
from pydub.playback import play

song = AudioSegment.from_wav("music.wav")

# slice the middle eight seconds of audio
midpoint = song.duration_seconds // 2
left_four_seconds = (midpoint - 4) * 1000   # pydub works in milliseconds
right_four_seconds = (midpoint + 4) * 1000  # pydub works in milliseconds
eight_sec_slice = song[left_four_seconds:right_four_seconds]

# play the slice
play(eight_sec_slice)

# or save it to a file
eight_sec_slice.export("eight_sec_slice.wav", format="wav")
As you can see, the length of the middle 8-second slice is exactly as desired.
>>> eight_sec_slice.duration_seconds
8.0

I know this question is old, but someone may like the solution I want to offer, which uses numpy only. No need for pydub.
import scipy.io.wavfile as wavfile

fs, data = wavfile.read("Track48.wav")
# total number of samples N
N = data.shape[0]
# convert seconds to samples: number_of_samples = time * rate
eight_secs_in_samples = float(fs) * 8
midpoint_sample = N // 2  # midpoint of the samples
# subtract 4 seconds (half the slice) from the midpoint
left_side = midpoint_sample - (eight_secs_in_samples // 2)
# add 4 seconds (half the slice) to the midpoint
right_side = midpoint_sample + (eight_secs_in_samples // 2)
# the middle 8 seconds of samples is therefore:
mid8secs = data[int(left_side):int(right_side)]  # this range contains the required samples
# save the file
wavfile.write("eightSecSlice.wav", fs, mid8secs)

Related

How can I pad a wav file to a specific length?

I am using wave files to train a deep learning model. They have different lengths, so I want to pad all of them to 16 seconds using Python.
If I understood correctly, the question asks to pad every file to a single given length. Therefore the solution is slightly different:
from pydub import AudioSegment
pad_ms = 1000  # the fixed target length you want (in milliseconds)
audio = AudioSegment.from_wav('your-wav-file.wav')
assert pad_ms >= len(audio), "Audio is already longer than the target padded length"
silence = AudioSegment.silent(duration=pad_ms - len(audio))
padded = audio + silence  # append silence after the audio
padded.export('padded-file.wav', format='wav')
This answer differs from the following one in that it pads every clip to the same total length, whereas the other adds the same fixed amount of silence at the end of each clip.
Using pydub:
from pydub import AudioSegment
pad_ms = 1000 # milliseconds of silence needed
silence = AudioSegment.silent(duration=pad_ms)
audio = AudioSegment.from_wav('your-wav-file.wav')
padded = audio + silence # Adding silence after the audio
padded.export('padded-file.wav', format='wav')
Note that AudioSegment objects are immutable, so audio + silence returns a new segment rather than modifying audio in place.
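A tiny sketch of my own (not part of the original answer) illustrating that point:
from pydub import AudioSegment

audio = AudioSegment.silent(duration=500)           # 500 ms of silence
padded = audio + AudioSegment.silent(duration=500)  # '+' returns a NEW AudioSegment

print(len(audio))   # 500  -- the original segment is unchanged
print(len(padded))  # 1000 -- the concatenated copy is longer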
You can use Librosa. The librosa.util.fix_length function pads the audio with silence by appending zeros to the end of the numpy array containing the audio data:
import numpy as np
from librosa import load
from librosa.util import fix_length

file_path = 'dir/audio.wav'
sf = 44100  # sampling frequency of the wav file
required_audio_size = 5  # e.g. a 2-second clip needs to be padded to 5 seconds

audio, sf = load(file_path, sr=sf, mono=True)  # mono=True converts stereo audio to mono
padded_audio = fix_length(audio, size=required_audio_size * sf)  # array size is required_audio_size * sampling frequency

print('Array length before padding', np.shape(audio))
print('Audio length before padding in seconds', np.shape(audio)[0] / sf)
print('Array length after padding', np.shape(padded_audio))
print('Audio length after padding in seconds', np.shape(padded_audio)[0] / sf)
Output:
Array length before padding (88200,)
Audio length before padding in seconds 2.0
Array length after padding (220500,)
Audio length after padding in seconds 5.0
That said, after looking through a number of similar questions, pydub.AudioSegment seems to be the go-to solution.

How to write pyaudio output into audio file?

I currently have the following code, which produces a sine wave of varying frequencies using the pyaudio module:
import pyaudio
import numpy as np
p = pyaudio.PyAudio()
volume = 0.5
fs = 44100
duration = 1
f = 440
samples = (np.sin(2 * np.pi * np.arange(fs * duration) * f / fs)).astype(np.float32).tobytes()
stream = p.open(format=pyaudio.paFloat32,
                channels=1,
                rate=fs,
                output=True)
stream.write(samples)
However, instead of playing the sound, is there any way to make it so that the sound is written into an audio file?
Add this import at the top of your code:
from scipy.io.wavfile import write
Also, add this at the bottom of your code, where s is the float sample array (the signal before it is converted to bytes). This worked for me.
scaled = np.int16(s / np.max(np.abs(s)) * 32767)
write('test.wav', 44100, scaled)
Using scipy.io.wavfile.write as suggested by @h lee produced the desired results:
import numpy
from scipy.io.wavfile import write
volume = 1
sample_rate = 44100
duration = 10
frequency = 1000
samples = (numpy.sin(2 * numpy.pi * numpy.arange(sample_rate * duration)
                     * frequency / sample_rate)).astype(numpy.float32)
write('test.wav', sample_rate, samples)
Another example can be found on the documentation: https://docs.scipy.org/doc/scipy/reference/generated/scipy.io.wavfile.write.html
Handle your audio input as a numpy array as I did here in the second answer, but instead of just processing the frames and sending the data back to PyAudio, save each frame in a new output_array. When the processing is done, you can write that output_array to a .wav or .mp3 file.
If you do that, however, the sound will still play. If you don't want to play the sound you have two options: either use blocking mode, or, if you want to stick with non-blocking mode and the callbacks, do the following (see the sketch after these steps):
Remove output=True so that it falls back to the default of False.
Add an input=True parameter.
In your callback do not return ret_data; return None instead.
Keep count of the frames you've processed so that when you're done you return paComplete as the second value of the returned tuple.
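Here is a minimal, untested sketch of that non-blocking approach; the file name, frame limit, and the wave-module export are my own assumptions, not part of the original answer:
import time
import wave
import pyaudio

RATE = 44100
CHUNK = 1024
MAX_FRAMES = int(RATE / CHUNK * 5)  # stop after roughly 5 seconds

output_array = []  # collected input frames end up here

def callback(in_data, frame_count, time_info, status):
    output_array.append(in_data)
    if len(output_array) >= MAX_FRAMES:
        return None, pyaudio.paComplete  # done: no output data, signal completion
    return None, pyaudio.paContinue      # keep recording, play nothing back

p = pyaudio.PyAudio()
stream = p.open(format=pyaudio.paInt16, channels=1, rate=RATE,
                input=True, frames_per_buffer=CHUNK,
                stream_callback=callback)

stream.start_stream()
while stream.is_active():
    time.sleep(0.1)

stream.stop_stream()
stream.close()

# write the collected frames to a .wav file
with wave.open('output.wav', 'wb') as wf:
    wf.setnchannels(1)
    wf.setsampwidth(p.get_sample_size(pyaudio.paInt16))
    wf.setframerate(RATE)
    wf.writeframes(b''.join(output_array))

p.terminate()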

Why does tone change when I change the block size?

import pyaudio
import numpy as np

RATE = 44100
block = 64

pa = pyaudio.PyAudio()
stream = pa.open(format=pyaudio.paFloat32,
                 channels=1,
                 rate=RATE,
                 output=True)

while True:
    x = np.arange(block, dtype=np.float32)
    output = np.cos(2 * np.pi * 2000 * x / 44100)
    output = output.tobytes()
    stream.write(output)
I want to play a cosine wave with 2000Hz frequency and 64 block size. Why does tone change when I change the block size? It should be fixed tone with certain frequency whatever the block size is, shouldn't it?
Thank you for your reply.
I'm not sure what you are trying to achieve with your calculation. For a 2 kHz sound you need 2000 sine waves every second (44100 samples), i.e. one sine wave every ~22 samples or 0.5 ms. In your code, x restarts at 0 for every block, so the phase resets every `block` samples and the audible period follows the block size instead of the intended frequency. The best way to find such formulas is to grab pen and paper and work out what you actually want (how to combine frequency, sampling rate, and desired block length). One possible way is below, but try to understand the math behind it (untested):
import pyaudio
import numpy as np

RATE = 44100
FREQUENCY = 2000

pa = pyaudio.PyAudio()
stream = pa.open(format=pyaudio.paFloat32,
                 channels=1,
                 rate=RATE,
                 output=True)

sample_len = 4000.0
wave_len = float(RATE) / FREQUENCY  # ~22 samples per wave
# x goes from 0 to 1 for approx. index 0..wave_len-1, from 1 to 2 for wave_len..2*wave_len-1, ...
x = np.arange(sample_len, dtype=np.float32) / wave_len
# 0..1 -> 0..1..0..-1..0; 1..2 -> 0..1..0..-1..0
# yes, I prefer sin over cos
output = np.sin(2 * np.pi * x)
output = output.tobytes()

# no need to recreate the pattern every cycle
while True:
    stream.write(output)
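Another way to fix the original code while keeping the small block size, as a sketch of my own (not from the answer above), is to carry the phase across blocks instead of restarting x at 0 on every iteration:
import pyaudio
import numpy as np

RATE = 44100
FREQUENCY = 2000
BLOCK = 64

pa = pyaudio.PyAudio()
stream = pa.open(format=pyaudio.paFloat32,
                 channels=1,
                 rate=RATE,
                 output=True)

phase = 0.0
step = 2 * np.pi * FREQUENCY / RATE  # phase advance per sample
while True:
    # continue the wave from where the previous block ended
    phases = phase + step * np.arange(BLOCK, dtype=np.float32)
    stream.write(np.sin(phases).astype(np.float32).tobytes())
    phase = (phase + step * BLOCK) % (2 * np.pi)  # keep the accumulator small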

Play a part of a .wav file in python

Is it possible to play a certain part of a .wav file in Python?
I'd like to have a function play(file, start, length) that plays the audiofile file from start seconds and stops playing after length seconds. Is this possible, and if so, what library do I need?
This is possible and easy in Python.
PyAudio is a nice library that you can use to play your audio!
First you need to decode the audio file (wav, mp3, etc.); this step converts the audio data into numbers (short int or float32).
Then you need to convert the seconds into the equivalent sample positions to cut the signal at the points of interest: multiply your frame rate by the number of seconds you want.
Here is one simple example for wav files:
import pyaudio
import numpy as np
import wave
import struct

File = 'ederwander.wav'
start = 12   # seconds
length = 7   # seconds
chunk = 1024

spf = wave.open(File, 'rb')
signal = spf.readframes(-1)
signal = np.frombuffer(signal, dtype=np.int16)  # np.fromstring is deprecated

p = pyaudio.PyAudio()
stream = p.open(format=p.get_format_from_width(spf.getsampwidth()),
                channels=spf.getnchannels(),
                rate=spf.getframerate(),
                output=True)

# convert seconds to sample positions and cut out the part of interest
pos = spf.getframerate() * length
signal = signal[start * spf.getframerate():(start * spf.getframerate()) + pos]

# play the slice chunk by chunk
inc = 0
sig = signal[inc:inc + chunk]
while len(sig) > 0:
    data = struct.pack("%dh" % len(sig), *list(sig))
    stream.write(data)
    inc = inc + chunk
    sig = signal[inc:inc + chunk]

stream.close()
p.terminate()
I know that this is a rather old question, but I just needed the exact same thing and for me ederwander's example seems a little bit too complicated.
Here is my shorter (and commented) solution:
import pyaudio
import wave
# set desired values
start = 7
length = 3
# open wave file
wave_file = wave.open('myWaveFile.wav', 'rb')
# initialize audio
py_audio = pyaudio.PyAudio()
stream = py_audio.open(format=py_audio.get_format_from_width(wave_file.getsampwidth()),
                       channels=wave_file.getnchannels(),
                       rate=wave_file.getframerate(),
                       output=True)
# skip unwanted frames
n_frames = int(start * wave_file.getframerate())
wave_file.setpos(n_frames)
# write desired frames to audio buffer
n_frames = int(length * wave_file.getframerate())
frames = wave_file.readframes(n_frames)
stream.write(frames)
# close and terminate everything properly
stream.close()
py_audio.terminate()
wave_file.close()

Generating sine wave sound in Python

I need to generate a sine wave sound in Python, and I need to be able to control frequency, duration, and relative volume. By 'generate' I mean that I want it to play though the speakers immediately, not save to a file.
What is the easiest way to do this?
Version with numpy:
import time
import numpy as np
import pyaudio
p = pyaudio.PyAudio()
volume = 0.5 # range [0.0, 1.0]
fs = 44100 # sampling rate, Hz, must be integer
duration = 5.0 # in seconds, may be float
f = 440.0 # sine frequency, Hz, may be float
# generate samples, note conversion to float32 array
samples = (np.sin(2 * np.pi * np.arange(fs * duration) * f / fs)).astype(np.float32)
# per @yahweh's comment, explicitly convert to a bytes sequence
output_bytes = (volume * samples).tobytes()
# for paFloat32 sample values must be in range [-1.0, 1.0]
stream = p.open(format=pyaudio.paFloat32,
                channels=1,
                rate=fs,
                output=True)
# play. May repeat with different volume values (if done interactively)
start_time = time.time()
stream.write(output_bytes)
print("Played sound for {:.2f} seconds".format(time.time() - start_time))
stream.stop_stream()
stream.close()
p.terminate()
Version without numpy:
import array
import math
import time
import pyaudio
p = pyaudio.PyAudio()
volume = 0.5 # range [0.0, 1.0]
fs = 44100 # sampling rate, Hz, must be integer
duration = 5.0 # in seconds, may be float
f = 440.0 # sine frequency, Hz, may be float
# generate samples, note conversion to float32 array
num_samples = int(fs * duration)
samples = [volume * math.sin(2 * math.pi * k * f / fs) for k in range(0, num_samples)]
# per @yahweh's comment, explicitly convert to a bytes sequence
output_bytes = array.array('f', samples).tobytes()
# for paFloat32 sample values must be in range [-1.0, 1.0]
stream = p.open(format=pyaudio.paFloat32,
                channels=1,
                rate=fs,
                output=True)
# play. May repeat with different volume values (if done interactively)
start_time = time.time()
stream.write(output_bytes)
print("Played sound for {:.2f} seconds".format(time.time() - start_time))
stream.stop_stream()
stream.close()
p.terminate()
ivan-onys gave an excellent answer, but there is a little addition to it:
this script will produce a sound 4 times shorter than expected, because PyAudio's write method needs float32 data as a byte string, but when you pass a numpy array to this method it converts the whole array to a string as a single entity. Therefore you have to convert the data in the numpy array to a byte sequence yourself, like this:
samples = (np.sin(2*np.pi*np.arange(fs*duration)*f/fs)).astype(np.float32).tobytes()
and you have to change this line as well:
stream.write(samples)
One of the more consistent and easy-to-install ways to deal with sound in Python is the Pygame multimedia library.
I'd recommend using it - there is the pygame.sndarray submodule that allows you to manipulate numbers in a data vector, which becomes a high-level sound object that can be played in the pygame.mixer module.
The documentation on the pygame.org site should be enough for using the sndarray module.
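A minimal sketch of that idea (my own illustration, assuming a 16-bit mono mixer; the frequency and duration values are arbitrary):
import numpy as np
import pygame
import pygame.sndarray

fs = 44100
pygame.mixer.init(frequency=fs, size=-16, channels=1)  # 16-bit signed, mono

duration = 2.0   # seconds
f = 440.0        # Hz
t = np.arange(int(fs * duration))
samples = (0.5 * np.sin(2 * np.pi * f * t / fs) * 32767).astype(np.int16)

sound = pygame.sndarray.make_sound(samples)  # array layout must match the mixer settings
sound.play()
pygame.time.wait(int(duration * 1000))       # block until playback has finished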
Today for Python 3.5+ the best way is to install the packages recommended by the developer.
http://people.csail.mit.edu/hubert/pyaudio/
For Debian do
sudo apt-get install python3-all-dev portaudio19-dev
before trying to install pyaudio
The script from ivan_onys produces a signal that is four times shorter than intended. If a TypeError is returned when volume is a float, try adding .tobytes() to the following line instead.
stream.write((volume*samples).tobytes())
@mm_: float32 is 32 bits, and 8 bits = 1 byte, so each float32 sample is 4 bytes. When the samples are passed to stream.write as a raw float32 array, the byte count (and hence the duration) ends up divided by 4. Converting with .tobytes() corrects for that quartering of the sample count.
In the Bregman Lab toolbox you have a set of functions that do exactly what you want. This Python module is a little bit buggy, but you can adapt its code to write your own functions.
