I'm working on an EEG signal processing method for recognition of the P300 ERP.
At the moment, I'm training my classifier with a single vector of data that I get by averaging across preprocessed data from a chosen subset of the original 64 channels. I'm using the EEG values directly, not frequency features from an FFT. The method actually achieves quite solid performance, around 75% classification accuracy.
I would like to improve it by using ICA to clean up the EEG data a bit. I have read through a lot of tutorials and papers and I am still somewhat confused.
I'm implementing my method in Python, so I chose to use sklearn's FastICA.
from sklearn.decomposition import FastICA

# self.signal: preprocessed EEG, samples x channels
self.ica = FastICA(n_components=64, max_iter=300)
icaSignal = self.ica.fit_transform(self.signal)  # estimated sources, samples x components
From a 25256 samples x 64 channels matrix I get a matrix of estimated sources that is also 25256 x 64. The problem is that I'm not quite sure how to use this output.
Averaging those components and training a classifier the same way as with the raw signal reduces performance to less than 30%, so that is probably not the way.
Another approach I read about is rejecting some of the components at this point - the ones that represent eye blinks, muscle activity, etc. - based on their frequency content and some other heuristics. I'm also not quite confident about how to do that exactly.
After I reject some of the components, what is the next step? Should I average the ones that are left and feed the classifier with them, or should I try to reconstruct the EEG signal without them - and if so, how do I do that in Python? I wasn't able to find any information about that reconstruction step. It is probably much easier to do in MATLAB, so nobody bothered to write about it :(
Any suggestions? :)
Thank you very much!
I haven't used Python for ICA, but in terms of the steps, it shouldn't matter whether it's MATLAB or Python.
You are completely right that it's hard to reject ICA components. There is no widely accepted objective measure. There are certain patterns for eye blinks (high voltage in frontal channels) and for muscle artifacts (wide spectral coverage, because it's EMG, at peripheral channels). If you don't know where to get started, I recommend reading the help of a MATLAB plugin called EEGLAB. This UCSD group has some nice materials to help you start.
https://eeglab.org/
To answer your question on the ICA reconstruction: after rejecting some ICA components, you should reconstruct the original EEG without them, i.e. zero out the rejected sources and project the remaining ones back into channel space.
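With sklearn, one way to do that reconstruction is to zero out the columns of the estimated source matrix that correspond to the rejected components and map the rest back with inverse_transform. A minimal sketch, assuming the fitted FastICA object and source matrix from your snippet; the rejected indices here are purely hypothetical:

import numpy as np

# ica, sources: your self.ica and icaSignal from the question (shape: n_samples x n_components)
reject = [0, 7]                      # hypothetical indices of artifact components you decided to drop
sources_clean = sources.copy()
sources_clean[:, reject] = 0         # zero out the rejected components

# project the remaining components back into channel space (n_samples x n_channels)
cleaned_eeg = ica.inverse_transform(sources_clean)

cleaned_eeg can then be averaged over your channel subset and fed to the classifier exactly as before.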
I am trying to replicate this article, but its corresponding GitHub repo is written quite badly. In the article, a neural network is trained on manually corrupted audio signals. Unfortunately, the researchers did not provide the audio files, nor clean code that shows how they corrupted them. In the paper they write:
...for the noisy test set, the 100 utterances were corrupted with four unseen noise types (engine, white, street, and baby cry), at six SNR levels (-6 dB, 0 dB, 6 dB, 12 dB, 18 dB, and 24 dB); for the enhanced set, the utterances in the noisy set were enhanced by the enhancement model above.
Now to the question - is there a Python library (R/MATLAB libraries are fine as well) that takes as input a signal, the type of desired noise, and the SNR, and returns a corrupted signal? If not, where do I get engine or crying-baby noise recordings?
Thanks!
So, if someone runs into the same problem, here is what I did. First, I looked for databases that include real-life noises. Most of them cost money and offer a limited variety of environments (see the AURORA-2 corpus, the CHiME background noise data, or the NOISEX-92 database). Finally I found the DEMAND dataset, which includes multi-channel noise recordings from 16 different environments (office, car, road, etc.) and is freely available.
Now, before merging noise and signal, one has to verify that they share the same sampling rate (actually, it is not such a severe problem, as I understand from this discussion, but it is better to be on the safe side). If you are using Python, you can use librosa.resample to standardize the two. After that, you can add the two signals. When adding the noise, you may want to control the magnitude of each of the inputs (signal and noise). You can use the signal-to-noise ratio formula given below to find $a$, the multiplier by which you have to multiply your noise in order to get the desired signal-to-noise ratio (SNR):
$$\mathrm{SNR} = 20\log_{10}\!\left(\frac{\mathrm{RMS}_{\mathrm{signal}}}{a\cdot\mathrm{RMS}_{\mathrm{noise}}}\right)\quad\Longrightarrow\quad a = \frac{\mathrm{RMS}_{\mathrm{signal}}}{\mathrm{RMS}_{\mathrm{noise}}\cdot 10^{\mathrm{SNR}/20}}$$
where the desired SNR is given in dB and the two RMS values are calculated from your data.
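As a concrete sketch of that mixing step (my own example, not from the answer above; the file names are hypothetical and librosa is assumed for loading and resampling):

import numpy as np
import librosa

# hypothetical file names; replace with your clean utterance and a DEMAND noise recording
speech, sr_speech = librosa.load("clean_utterance.wav", sr=None)
noise, sr_noise = librosa.load("demand_noise.wav", sr=None)

# make sure both signals share the same sampling rate
if sr_noise != sr_speech:
    noise = librosa.resample(noise, orig_sr=sr_noise, target_sr=sr_speech)

# trim/tile the noise so it matches the speech length
noise = np.resize(noise, speech.shape)

# scale the noise to reach the desired SNR (in dB), using the formula above
target_snr_db = 6
rms_speech = np.sqrt(np.mean(speech ** 2))
rms_noise = np.sqrt(np.mean(noise ** 2))
a = rms_speech / (rms_noise * 10 ** (target_snr_db / 20))

noisy_speech = speech + a * noise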
In school we have to listen to intervals and chords and determine their names. I'm really into neural networks; that's why I want to create a neural network with Python which listens to the audio and gives me the name as output. I once learned that for music I need an LSTM. Do I also need an LSTM for this purpose, and how/where should I start? Can anybody teach me how to achieve my goal?
First of all, you need to define exactly the task you would like to solve: do you want to classify a whole piece of music/track, or do you want to classify segments of the piece/track? This will influence which architecture you need. I will briefly present an approach for each of those tasks.
Classifying a track: Recordings of music are time series, and for each of your recordings you need to have a label. Your first intuition of using LSTMs (or RNNs in general) is a good one. Just use your recording, transformed into a vector, as the input sequence for your LSTM network and let it output probabilities for each class. As already indicated by a comment, working in frequency space can be beneficial. However, just using the Fourier transform of the whole track will most likely lose important information, since the temporal frequency information is lost. Rather, use the Short-time Fourier Transform (STFT) or Mel-frequency cepstrum coefficients (MFCCs; here is a Python library to calculate them: libROSA). Very much oversimplified, those methods transform your time series into a kind of 'image', a two-dimensional frequency spectrum, and for image classification tasks Convolutional Neural Networks (CNNs) are the way to go.
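For instance, computing the STFT magnitude and MFCCs with libROSA might look roughly like this (a sketch with a hypothetical file name, not part of the original answer):

import numpy as np
import librosa

y, sr = librosa.load("track.wav", sr=None)                       # hypothetical recording
stft_mag = np.abs(librosa.stft(y, n_fft=2048, hop_length=512))   # 'image'-like spectrum, shape (1025, n_frames)
mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=20)               # MFCCs, shape (20, n_frames)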
Classifying segments: If you want to classify segments of your track, you need a label for each time-frame in your song. Let's say your song is 3 minutes long and you have a sampling frequency of 60 Hz; your vector representation of the song will then have 3*60*60 = 10800 time-frames, so for each of those entries you need to provide a class label (chord or whatever). Again you can use LSTMs: use your vector as the input sequence, let your network produce an output sequence of the same length as your song, and compare it to the class labels. You could also use the previously mentioned STFT or MFC coefficients as inputs and take advantage of the frequency information; then you will have a spectrum for each time-frame as input.
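A minimal Keras sketch of that per-frame setup, under my own assumptions (20-dim MFCC frames as inputs, a made-up number of chord classes), just to show the shape of such a model:

from tensorflow import keras

NUM_CLASSES = 24   # e.g. 12 major + 12 minor chords; adjust to your own label set

model = keras.Sequential([
    keras.layers.Input(shape=(None, 20)),            # variable-length sequence of 20-dim MFCC frames
    keras.layers.LSTM(64, return_sequences=True),    # one hidden state per time-frame
    keras.layers.TimeDistributed(keras.layers.Dense(NUM_CLASSES, activation="softmax")),
])
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy")
# model.fit(mfcc_sequences, frame_labels, ...)  # inputs shaped (batch, n_frames, 20), labels (batch, n_frames)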
I hope these broad ideas bring you one step closer to solving your task. For implementation details I'd like to point you to the Keras documentation and to the countless tutorials on the internet.
Disclaimer:
My knowledge of music theory is rather limited, so please take my answer with a grain of salt and feel free to correct me or ask for clarification. Have fun
I'm very new to signal processing. I have two sound recordings right now. Each was collected at a sample rate of 10 kHz for 2 seconds. I have imported this data into Python, and both sound_1 and sound_2 are numpy arrays now. The length of each is of course 20000 samples.
sound_1 contains a water-flow sound (which I'm interested in) plus environmental noise (which I'm not interested in), while sound_2 contains only environmental noise.
I'm looking for an algorithm (or package) which can help me determine the frequency range of this water-flow sound. I think if I can find out the frequency range, I can use an inverse Fourier transform to filter out the environmental noise.
However, my ultimate purpose is to extract the water-flow sound from the sound_1 data and eliminate the environmental noise. It would be great if there are other approaches.
I'm currently looking at this post: Python frequency detection
But I don't understand how they can find the frequency from only one sound signal. I think we need to compare at least two recordings (one containing the sound I am interested in, the other not), so we can find the difference.
Since sound_1 contains both water flow and environmental noise, there's no straightforward way of extracting the water flow. The Fourier transform will give you all frequencies in the signal, irrespective of the source.
The way to approach it is to get the frequencies of the environmental noise from sound_2 and then remove them from sound_1. Once that is done, you can extract the frequencies of interest from the already denoised sound_1.
One of the popular approaches to such noise reduction is spectral gating. Essentially, you first determine what the noise sounds like and then subtract its smoothed spectrum from your signal. Smoothing is crucial, as sound is a wave, a continuous entity. If you simply chop discrete frequencies out of the wave, you will get very poor results (the audio will sound unnatural and robotic). The amount of smoothing you apply determines how much noise is reduced (mind that it's never truly removed - you will always get some residue).
On to the concrete solution.
As you're new to the subject, I'd recommend first seeing how noise reduction works in software that does the work for you. Audacity is an excellent choice. I linked the manual for noise reduction, but there are plenty of tutorials out there.
Once you know what you want to get, you can either implement spectral gating yourself or use an existing package. Audacity has an excellent implementation in C++, but it may prove difficult for a newbie to port. I'd recommend going first with the noisereduce package, which is based on the Audacity implementation. If you use it, you will be done in a few lines.
Here's a snippet:
import noisereduce as nr
from scipy.io import wavfile

# load the recording that contains water flow + noise
rate, data = wavfile.read("sound_1.wav")
# load the recording that contains only noise (wavfile.read returns a (rate, data) tuple)
_, noisy_part = wavfile.read("sound_2.wav")
# perform noise reduction (noisereduce works on floating-point arrays; cast with .astype(float) if needed)
reduced_noise = nr.reduce_noise(audio_clip=data, noise_clip=noisy_part, verbose=True)
Now simply run an FFT on reduced_noise to discover the frequencies of the water flow.
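As a small sketch of that last step (assuming reduced_noise from the snippet above and the 10 kHz sample rate from the question):

import numpy as np

rate = 10000                                       # 10 kHz sample rate from the question
spectrum = np.abs(np.fft.rfft(reduced_noise))      # magnitude spectrum of the denoised signal
freqs = np.fft.rfftfreq(len(reduced_noise), d=1.0 / rate)

# the ten frequencies carrying the most energy after denoising
dominant = np.sort(freqs[np.argsort(spectrum)[-10:]])
print(dominant)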
Here's how I am using noisereduce. In this part I am determining the frequency statistics.
I have a time series and generate its spectrogram in Python with matplotlib.pyplot.specgram.
After I do some analysis and make changes, I need to convert the spectrogram back into a time series.
Is there any function in matplotlib or in another library that I can use directly? If not, could you please point me in the right direction?
Your warm help is appreciated.
Matplotlib is a library for plotting data. Generally, if you're trying to do any computation, you'd use a library suited for that.
numpy is a very popular library for numerical computation in Python. It just so happens that it has a fairly extensive set of fft and ifft methods.
I would check them out here and see if they can solve your problem.
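As a tiny illustration of that numpy route (my own example; note that going back from a magnitude-only spectrogram also requires phase, which the answers below address):

import numpy as np

x = np.random.randn(1024)             # stand-in for one frame of your time series
X = np.fft.rfft(x)                    # forward FFT (frequency domain)
x_back = np.fft.irfft(X, n=len(x))    # inverse FFT recovers the original frame
assert np.allclose(x, x_back)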
One thing commonly done (for example in the source separation community) is to reuse the phase of the original signal (before any transformation was applied to it). The result is much better than using zero or random phase, and not so far from algorithms that aim at reconstructing the phase information from scratch.
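A minimal sketch of that phase-reuse idea, assuming scipy.signal's stft/istft rather than matplotlib's specgram (the sample rate and window length are placeholders):

import numpy as np
from scipy import signal

fs = 44100                                         # placeholder sample rate
x = np.random.randn(fs)                            # stand-in for your original time series
f, t, Z = signal.stft(x, fs=fs, nperseg=1024)

magnitude = np.abs(Z)
phase = np.angle(Z)

# ... modify `magnitude` here (your analysis / changes) ...

Z_modified = magnitude * np.exp(1j * phase)        # reuse the original phase
_, x_reconstructed = signal.istft(Z_modified, fs=fs, nperseg=1024)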
A classic reconstruction algorithm is Griffin & Lim's, described in the paper "Signal estimation from modified short-time Fourier transform". It is an iterative algorithm, and each iteration requires a full STFT / inverse STFT, which makes it quite costly.
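If you only have a magnitude spectrogram, librosa ships an implementation of Griffin-Lim that can serve as a starting point (a sketch; the input signal here is a stand-in for your own data):

import numpy as np
import librosa

y = np.random.randn(22050)            # stand-in for your time series
S = np.abs(librosa.stft(y))           # magnitude-only spectrogram
y_reconstructed = librosa.griffinlim(S, n_iter=32)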
This problem is indeed an active area of research; a search for "STFT", "reconstruction" and "magnitude" will yield plenty of papers aiming to improve on Griffin & Lim in terms of signal quality and/or computational efficiency.
You can find a detailed discussion here: Thread on DSP Stack Exchange
Is anybody able to supply links, advice, or other forms of help with the following?
Objective - use Python to classify 10-second audio samples so that afterwards I can speak into a microphone and have Python pick out and play snippets (faded together) of the closest matches from a database.
My objective is not necessarily to have the closest match, and I don't care what the source of the audio samples is. So the result is probably of no use other than speaking in noise (for fun).
I would like the Python app to be able to find a specific match, for example by FFT, within the 10-second samples in the database. I guess the real-time sampling of the microphone will use a 100-millisecond buffer.
Any ideas? FFT? Which database? Something else?
In order to do this, you need three things:
Segmentation (decide how to make your audio samples)
Feature Extraction (decide what audio feature (e.g. FFT) you care about)
Distance Metric (decide what the "closest" sample is)
Segmentation: you currently describe using 10-second samples. I think you might get better results with shorter segments (closer to 100-1000 ms), in order to get something that follows the changes in the voice better.
Feature Extraction: you mention using the FFT. The zero crossing rate is surprisingly OK considering how simple it is. If you want to get fancier, MFCCs or the spectral centroid are probably the way to go.
Distance Metric: most people use the Euclidean distance, but there are also fancier ones like the Manhattan distance, cosine distance, and earth mover's distance.
For a database, if you have a small enough set of samples, you might try just loading everything into a k-d tree, so that you can do fast distance calculations and hold it all in memory.
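As a rough sketch of that in-memory setup (my own assumptions: librosa for MFCC features, scipy's cKDTree as the "database", hypothetical file names):

import numpy as np
import librosa
from scipy.spatial import cKDTree

def mfcc_vector(path):
    # one fixed-length feature vector per clip: the mean of 13 MFCCs over time
    y, sr = librosa.load(path)
    return librosa.feature.mfcc(y=y, sr=sr, n_mfcc=13).mean(axis=1)

db_paths = ["clip_001.wav", "clip_002.wav"]   # your 10-second samples
tree = cKDTree(np.array([mfcc_vector(p) for p in db_paths]))

# query with a short microphone buffer saved to disk
dist, idx = tree.query(mfcc_vector("mic_buffer.wav"))
print("closest match:", db_paths[idx])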
Good luck! It sounds like a fun project.
Try searching for algorithms on "music fingerprinting".
You could try some typical short-term feature extraction (e.g. energy, zero crossing rate, MFCCs, spectral features, chroma, etc.) and then model your segment through a vector of feature statistics. Then you could use a simple distance-based classifier (e.g. kNN) to retrieve the "closest" training samples from a manually labelled set, given an unknown "query".
Check out my library for several Python audio analysis functionalities: pyAudioAnalysis