I have a csv file of data, recorded on a data acquisition center with frequency of 500Hz, and I am trying to convert it to wav format. I have trie to Python and simply feed the numbers (as 16bit integers to the wave package), and it didn't work. How should I construct a wav file from simply a stream of numbers?
I've tried the following code, which includes normalization, and I set the dtype to be float32 so that it would use 32-bit floating-point format according to the documentation here, it just is not generating any sounds.
#!/usr/bin/env python3
# -*- coding: utf-8 -*-
import numpy as np
import scipy.io.wavfile
from numpy import *
csv_array = np.loadtxt('trimmed.csv', delimiter=',', dtype=float32)
min = np.amin(csv_array)
max = np.amax(csv_array)
med = (max + min) / 2
def f(x):
return (x - med) * (1 - (-1)) / (max - min)
f = np.vectorize(f)
wav_array = f(csv_array)
scipy.io.wavfile.write('output.wav', 500, csv_array)
The problem is with your sampling rate. Try re-sampling the data to something like 44100 Hz (see code below). I do not know what effects re-sampling will have on your data.
import numpy as np
from scipy.io import wavfile
from scipy.signal import resample
data = np.random.uniform(-1, 1, 500)
data_resampled = resample(data, 44100)
wavfile.write('output.wav', 44100, data_resampled)
Try playing around with the rate argument in sipy.io.wavfile.write. As the rate lowers, the frequency of the sound lowers.
Related
I want to create "heart rate monitor" effect from a 2D array in numpy and want the tone to reflect the values in the array.
You can use the write function from scipy.io.wavfile to create a wav file which you can then play however you wish. Note that the array must be integers, so if you have floats, you might want to scale them appropriately:
import numpy as np
from scipy.io.wavfile import write
rate = 44100
data = np.random.uniform(-1, 1, rate) # 1 second worth of random samples between -1 and 1
scaled = np.int16(data / np.max(np.abs(data)) * 32767)
write('test.wav', rate, scaled)
If you want Python to actually play audio, then this page provides an overview of some of the packages/modules.
For the people coming here in 2016 scikits.audiolab doesn't really seem to work anymore. I was able to get a solution using sounddevice.
import numpy as np
import sounddevice as sd
fs = 44100
data = np.random.uniform(-1, 1, fs)
sd.play(data, fs)
in Jupyter the best option is:
from IPython.display import Audio
wave_audio = numpy.sin(numpy.linspace(0, 3000, 20000))
Audio(wave_audio, rate=20000)
In addition, you could try scikits.audiolab. It features file IO and the ability to 'play' arrays. Arrays don't have to be integers. To mimick dbaupp's example:
import numpy as np
import scikits.audiolab
data = np.random.uniform(-1,1,44100)
# write array to file:
scikits.audiolab.wavwrite(data, 'test.wav', fs=44100, enc='pcm16')
# play the array:
scikits.audiolab.play(data, fs=44100)
I had some problems using scikit.audiolabs, so I looked for some other options for this task. I came up with sounddevice, which seems a lot more up-to-date. I have not checked if it works with Python 3.
A simple way to perform what you want is this:
import numpy as np
import sounddevice as sd
sd.default.samplerate = 44100
time = 2.0
frequency = 440
# Generate time of samples between 0 and two seconds
samples = np.arange(44100 * time) / 44100.0
# Recall that a sinusoidal wave of frequency f has formula w(t) = A*sin(2*pi*f*t)
wave = 10000 * np.sin(2 * np.pi * frequency * samples)
# Convert it to wav format (16 bits)
wav_wave = np.array(wave, dtype=np.int16)
sd.play(wav_wave, blocking=True)
PyGame has the module pygame.sndarray which can play numpy data as audio. The other answers are probably better, as PyGame can be difficult to get up and running. Then again, scipy and numpy come with their own difficulties, so maybe it isn't a large step to add PyGame into the mix.
http://www.pygame.org/docs/ref/sndarray.html
Another modern and convenient solution is to use pysoundfile, which can read and write a wide range of audio file formats:
import numpy as np
import soundfile as sf
data = np.random.uniform(-1, 1, 44100)
sf.write('new_file.wav', data, 44100)
Not sure of the particulars of how you would produce the audio from the array, but I have found mpg321 to be a great command-line audio player, and could potentially work for you.
I use it as my player of choice for Anki, which is written in python and has libraries that could be a great starting place for interfacing your code/arrays with audio.
Check out:
anki.sound.py
customPlayer.py
I currently have the following code, which produces a sine wave of varying frequencies using the pyaudio module:
import pyaudio
import numpy as np
p = pyaudio.PyAudio()
volume = 0.5
fs = 44100
duration = 1
f = 440
samples = (np.sin(2 * np.pi * np.arange(fs * duration) * f /
fs)).astype(np.float32).tobytes()
stream = p.open(format = pyaudio.paFloat32,
channels = 1,
rate = fs,
output = True)
stream.write(samples)
However, instead of playing the sound, is there any way to make it so that the sound is written into an audio file?
Add this code at the top of your code.
from scipy.io.wavfile import write
Also, add this code at the bottom of your code.
This worked for me.
scaled = numpy.int16(s/numpy.max(numpy.abs(s)) * 32767)
write('test.wav', 44100, scaled)
Using scipy.io.wavfile.write as suggested by #h lee produced the desired results:
import numpy
from scipy.io.wavfile import write
volume = 1
sample_rate = 44100
duration = 10
frequency = 1000
samples = (numpy.sin(2 * numpy.pi * numpy.arange(sample_rate * duration)
* frequency / sample_rate)).astype(numpy.float32)
write('test.wav', sample_rate, samples)
Another example can be found on the documentation: https://docs.scipy.org/doc/scipy/reference/generated/scipy.io.wavfile.write.html
Handle your audio input as a numpy array like I did here in the second answer, but instead of just processing the frames and sending the data back to PyAudio, save each frame in a new output_array. Then when the processing is done you can use that output_array to write it to .wav or .mp3 file.
If you do that, however, the sound would still play. If you don't want to play the sound you have two options, either using the blocking mode or, if you wanna stick with the non blocking mode and the callbacks, do the following:
Erase output=True so that it is False as default.
Add a input=True parameter.
In your callback do not return ret_data, instead return None.
Keep count of the frames you've processed so that when you're done you return paComplete as second value of the returned tuple.
I generate a sound wave on frequency 440hz as expected in pyaudio, but even though I am using the same sample array to save a wav file, it does not save the same sound and I canĀ“t figure out why
Here is the code:
import wave
import numpy as np
import pyaudio
p = pyaudio.PyAudio()
volume = 0.5 # range [0.0, 1.0]
fs = 44100 # sampling rate, Hz, must be integer
duration = 2.0 # in seconds, may be float
f = 440.0 # sine frequency, Hz, may be float
channels = 1
# open stream (2)
stream = p.open(format=pyaudio.paFloat32,
channels=channels,
rate=fs,
output=True)
def get_value(i):
return np.sin(f * np.pi * float(i) / float(fs))
samples = np.array([get_value(a) for a in range(0, fs)]).astype(np.float32)
for i in range(0, int(duration)):
stream.write(samples, fs)
wf = wave.open("test.wav", 'wb')
wf.setnchannels(channels)
wf.setsampwidth(3)
wf.setframerate(fs)
wf.setnframes(int(fs * duration))
wf.writeframes(samples)
wf.close()
# stop stream (4)
stream.stop_stream()
stream.close()
# close PyAudio (5)
p.terminate()
https://gist.github.com/badjano/c727b20429295e2695afdbc601f2334b
I think the main problem is that you are using the float32 data type which is not supported by the wave module.
You can use int16 or int32 or you can use 24-bit integers with some manual conversion.
Since you are using wf.setsampwidth(3), I assume you want to use 24-bit data?
I've written a little tutorial about the wave module (including how to handle 24-bit data) and an overview about different modules for handling sound files.
You may also be interested in my tutorial about creating a simple signal.
Since you are already using NumPy, I recommend using a library that supports NumPy arrays out-of-the-box and does all the conversions for you.
My personal preference would be to use the soundfile module, but I'm quite biased.
For playback, I would also recommend using a library that supports NumPy. Here my suggestion is the sounddevice module, but I'm very biased here as well.
If you want to follow my suggestions, your code might become something like that (including the handling of volume and fixing a missing factor of 2 in the sinus' argument):
from __future__ import division
import numpy as np
import sounddevice as sd
import soundfile as sf
volume = 0.5 # range [0.0, 1.0]
fs = 44100 # sampling rate, Hz
duration = 2.0 # in seconds
f = 440.0 # sine frequency, Hz
t = np.arange(int(duration * fs)) / fs
samples = volume * np.sin(2 * np.pi * f * t)
sf.write('myfile.wav', samples, fs, subtype='PCM_24')
sd.play(samples, fs)
sd.wait()
UPDATE:
If you want to keep using PyAudio, that's fine.
But you'll have to manually convert the floating point array (with values from -1.0 to 1.0) to integers in the appropriate range, depending on the data type you want to use.
The first link I mentioned above contains the file utility.py which has a function float2pcm() to do just that.
Here's an abbreviated version of that function:
def float2pcm(sig, dtype='int16'):
i = np.iinfo(dtype)
abs_max = 2 ** (i.bits - 1)
offset = i.min + abs_max
return (sig * abs_max + offset).clip(i.min, i.max).astype(dtype)
I am trying to write a Python script that can demodulate an FSK modulated audio file and return the data encoded in the audio. The data being transmitted is GPS NMEA strings which are embedded as the audio channel in video files. Basically, text is encoded with FSK modulation, and I am trying to retrieve the text using Python. The device I am using to encode the data can also decode it, so I have been able to generate the correct output, but I need to be able to do it using software.
I have done some background reading to introduce myself to signal processing and FSK, and I have looked at example scripts (e.g. this one and minimodem).
I managed to write a Python script that runs successfully, although the output is incorrect. The correct output derived from the encoding/decoding device has 8,280 raw binary (0 and 1) characters, the Python output has 1,344,786. I think I am missing a symbol synchronizer, but I'm not sure how this works.
My question now is: how can I add symbol synchronization to the script and/or symbol timing? Are there better examples or explanations of how to do FSK demodulation in Python? I would appreciate any feedback or direction. Thank you.
Here's my script so far:
from scipy.io.wavfile import read
import numpy as np
import wave
import matplotlib.pyplot as plt
import scipy.signal as signal
from scipy.signal import blackman, butter
from scipy.fftpack import fft, rfft, rfftfreq, irfft
import scipy.signal.signaltools as sigtool
import binascii
# Read in data; 'wav' allows getting paramters, 'wav1' is actual signal data
wavfile = 'Sample4_160224_mono.wav'
wavfile1 = open(wavfile, 'r')
wav = wave.open(wavfile, 'r')
wav_1 = read(wavfile1)
params = wav.getparams()
N = params[3] #Sample size
wav1 = read(wavfile1)
wav2 = wav1[1][0:N]
duration = float(params[3] / params[2])
n_samples = len(wav2)
Fs = params[2]
nyq = 0.5 * Fs #Nyquist rate
Fbit = (params[2]*params[0]*16)/100
print "Fbit", Fbit
# Windowing function
w = blackman(n_samples)
print "W is", w
# FFT
wfft = rfft(wav2 * w)
wfft_norm = wfft/N
wfft_norm = abs(wfft_norm[range(N/2)])
# Working with frequencies...
freqs = rfftfreq(len(wfft_norm))
index = np.argmax(np.abs(wfft)) #Returns the index of the maximum absolute value of the windowed FFT
freq = freqs[index] #Returns the frequency from the above index
freq_range = [freq - 0.01, freq + 0.01]
freq_in_Hz = abs(freq * params[2]) #Converts the Hz
freq_range_Hz = [abs(freq_range[0] * params[2]), abs(freq_range[1] * params[2])]
# Differentiator
diff = np.diff(wav2)
# Envelope detector
env = np.abs(sigtool.hilbert(diff))
print "ENV", len(env)
# Low-pass filter
h = signal.firwin(numtaps = 10, cutoff = freq_range[1], nyq = nyq)
filt = signal.lfilter(h, 1, env)
# Signal's mean
mean = np.mean(filt)
#Do some crazy stuff to get binary **maybe wrong**
rx_data = []
sampled_signal = env[Fs/Fbit/2:params[3]+1:]
for bit in sampled_signal:
if bit > mean:
rx_data.append(int(1))
else:
rx_data.append(int(0))
# Save raw binary output
rx_data1 = ''.join(map(str, (rx_data)))
outfile1 = open('FSK_wav6_output_binary.txt', 'w')
outfile1.write(rx_data1)
outfile1.close()
Seems that you use multiple channles and the sound you need is embedded in one of them.
So far I have found few problems in your scripts:
Nyquist rate is not a half rate of your sound. It is the rate which could sample the original sound wave, and should be at least 2 times bigger than the sound sampling rate. Hence,
nyq = 0.5 * Fs
is wrong.
If you take advantage of the noiseless sound to demodulate, then the Differentiator can be omitted.
For the low-pass filter:
h = signal.firwin(numtaps = 10, cutoff = freq_range[1], nyq = nyq)
the cutoff frequency is your data sample rate, please read this.
filt is the final signal which can extract the specific data you desire.
How to choose points in sampled_signal to recreate the original signal actually depends on the ratio between the original signal rate and the sampling rate. Just like the first link you provided, assuming the data were written in 11025 Hz and the sampling or recording rate is 44100 Hz, then the code you gave:
sampled_signal = env[Fs/Fbit/2:params[3]+1:]
should be:
sampled_signal = filt[Fs/Fbit*2:params[3]:Fs/Fbit*4]
where Fs/Fbit*2 is the beginning, params[3] is the ending, Fs/Fbit*4 is the step length.
The correct output derived from the encoding/decoding device has 8,280 raw binary (0 and 1) characters, the Python output has 1,344,786.
It is normal, because of different sample rates, you can add some special characters acting like a start-sign and end-sign in you text, and try to find them, then you might find the data with correct lenght you need.
I want to create "heart rate monitor" effect from a 2D array in numpy and want the tone to reflect the values in the array.
You can use the write function from scipy.io.wavfile to create a wav file which you can then play however you wish. Note that the array must be integers, so if you have floats, you might want to scale them appropriately:
import numpy as np
from scipy.io.wavfile import write
rate = 44100
data = np.random.uniform(-1, 1, rate) # 1 second worth of random samples between -1 and 1
scaled = np.int16(data / np.max(np.abs(data)) * 32767)
write('test.wav', rate, scaled)
If you want Python to actually play audio, then this page provides an overview of some of the packages/modules.
For the people coming here in 2016 scikits.audiolab doesn't really seem to work anymore. I was able to get a solution using sounddevice.
import numpy as np
import sounddevice as sd
fs = 44100
data = np.random.uniform(-1, 1, fs)
sd.play(data, fs)
in Jupyter the best option is:
from IPython.display import Audio
wave_audio = numpy.sin(numpy.linspace(0, 3000, 20000))
Audio(wave_audio, rate=20000)
In addition, you could try scikits.audiolab. It features file IO and the ability to 'play' arrays. Arrays don't have to be integers. To mimick dbaupp's example:
import numpy as np
import scikits.audiolab
data = np.random.uniform(-1,1,44100)
# write array to file:
scikits.audiolab.wavwrite(data, 'test.wav', fs=44100, enc='pcm16')
# play the array:
scikits.audiolab.play(data, fs=44100)
I had some problems using scikit.audiolabs, so I looked for some other options for this task. I came up with sounddevice, which seems a lot more up-to-date. I have not checked if it works with Python 3.
A simple way to perform what you want is this:
import numpy as np
import sounddevice as sd
sd.default.samplerate = 44100
time = 2.0
frequency = 440
# Generate time of samples between 0 and two seconds
samples = np.arange(44100 * time) / 44100.0
# Recall that a sinusoidal wave of frequency f has formula w(t) = A*sin(2*pi*f*t)
wave = 10000 * np.sin(2 * np.pi * frequency * samples)
# Convert it to wav format (16 bits)
wav_wave = np.array(wave, dtype=np.int16)
sd.play(wav_wave, blocking=True)
PyGame has the module pygame.sndarray which can play numpy data as audio. The other answers are probably better, as PyGame can be difficult to get up and running. Then again, scipy and numpy come with their own difficulties, so maybe it isn't a large step to add PyGame into the mix.
http://www.pygame.org/docs/ref/sndarray.html
Another modern and convenient solution is to use pysoundfile, which can read and write a wide range of audio file formats:
import numpy as np
import soundfile as sf
data = np.random.uniform(-1, 1, 44100)
sf.write('new_file.wav', data, 44100)
Not sure of the particulars of how you would produce the audio from the array, but I have found mpg321 to be a great command-line audio player, and could potentially work for you.
I use it as my player of choice for Anki, which is written in python and has libraries that could be a great starting place for interfacing your code/arrays with audio.
Check out:
anki.sound.py
customPlayer.py