I'm currently working on some code to transmit messages, files, and other data over lasers using audio transformation. My current code uses the hexlify function from the binascii module in Python to convert the data to binary, then emits one tone for a 1 and a different tone for a 0. In theory this works, albeit not the fastest way to encode/decode, but in testing a few problems show up.
The tones generated are not spot on; e.g. emitting 150 Hz can turn out as 145-155 Hz on the receiving end. This isn't a huge issue, as I can just widen the boundaries on the receiving end.
The real problem is that when I emit a tone, the computer on the receiving end may read it multiple times, or not read it at all, depending on the rate at which it samples the incoming audio. I have tried to play the tones at the same speed it samples, but that is very iffy.
In all, I have had a couple of successful runs with short messages, but it is very unreliable and inaccurate due to the issues mentioned above.
I have looked into this further, and a solution looks like it could involve BPSK (Binary Phase Shift Keying), although I'm not sure how to implement it. Any suggestions or code samples would be appreciated!
My code for the project can be found here, but the main files I'm working on are for binary decoding and encoding, which are here and here. I'm not an expert in Python, so please pardon me if anything I've said is wrong, my code isn't the best, or I've overlooked something basic.
Thanks! :-)
Take a look at GNU Radio!
http://gnuradio.org/redmine/projects/gnuradio/wiki
GNU Radio is a project to do, in software, as much as possible of radio signal transmission and reception. Because radio already uses phase shift keying, the GNU Radio folks have already solved the problem for you, and GNU Radio is already a Python project! The complicated DSP parts are written in C++ for speed, but wrapped for use from Python.
Here is a page discussing a project using Differential Binary Phase Shift Keying (DBPSK)/ Differential Quadrature Phase Shift Keying (DQPSK) to transmit binary data (in the example, a JPEG image). Python source code is available for download.
http://www.wu.ece.ufl.edu/projects/softwareRadio/
I see that your project is under the MIT license. GNU Radio is under GPLv3, which may be a problem for you. You need to figure out whether you can use GNU Radio without making your project a derived work, which would force you to change your license. It should be possible to make a standalone "sending daemon" and a standalone "receiving daemon", both of whose source code would be GPLv3, and have your MIT code talk to them over a socket or something.
By the way, one of my searches found this very clear explanation of how BPSK works:
http://cnx.org/content/m10280/latest/
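If you want to experiment with the idea before pulling in GNU Radio, here is a minimal numpy sketch of BPSK over an audio carrier (the carrier frequency, bit rate, and sample rate below are arbitrary choices, not values from your project). Note that it assumes the receiver is perfectly synchronised with the transmitter; recovering carrier phase and symbol timing from a real recording is exactly the hard part that GNU Radio's blocks handle for you.

import numpy as np

FS = 44100        # sample rate
FC = 1000         # carrier frequency in Hz
SPB = 441         # samples per bit (100 bits/s)

def modulate(bits):
    t = np.arange(len(bits) * SPB) / FS
    symbols = np.repeat(2 * np.array(bits) - 1, SPB)   # 0 -> -1, 1 -> +1
    return symbols * np.sin(2 * np.pi * FC * t)        # phase flips by 180 degrees between bits

def demodulate(signal):
    t = np.arange(len(signal)) / FS
    mixed = signal * np.sin(2 * np.pi * FC * t)        # coherent mixing with the carrier
    bits = []
    for i in range(0, len(mixed) - SPB + 1, SPB):
        bits.append(1 if mixed[i:i + SPB].sum() > 0 else 0)
    return bits

bits = [1, 0, 1, 1, 0, 0, 1]
print(demodulate(modulate(bits)))   # -> [1, 0, 1, 1, 0, 0, 1]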
Good luck!
In response to the first issue regarding the frequency:
Looking at your decoder, I see that your sample rate is 44100 and your chunk size is 2048. If I am reading this right, that means your FFT size is 2048, which puts your FFT bin size at about 21 Hz. Have you tried zero-padding your FFT? Zero-padding won't change the underlying frequency content, but it will give you a finer-grained spectrum to pick the peak from. I see you are using quadratic interpolation to improve your frequency estimate; I haven't used that technique, so I'm not familiar with how much improvement it gives. A balance between zero-padding and quadratic interpolation may get you better frequency accuracy.
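For illustration, here is a rough sketch of the zero-padding plus quadratic (parabolic) interpolation idea on a synthetic 150 Hz chunk (the sample rate and chunk size match the numbers above; the padding factor and window are illustrative choices):

import numpy as np

FS = 44100
CHUNK = 2048
PAD = 4                                     # zero-pad the FFT to 4x the chunk length

t = np.arange(CHUNK) / FS
chunk = np.sin(2 * np.pi * 150.0 * t)       # stand-in for one recorded chunk

windowed = chunk * np.hanning(CHUNK)
spectrum = np.abs(np.fft.rfft(windowed, CHUNK * PAD))

k = int(np.argmax(spectrum))                # coarse peak bin
# parabolic interpolation around the peak for a sub-bin estimate
a, b, c = spectrum[k - 1], spectrum[k], spectrum[k + 1]
delta = 0.5 * (a - c) / (a - 2 * b + c)
freq = (k + delta) * FS / (CHUNK * PAD)
print(freq)                                 # ~150 Hz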
Also, depending on the hardware doing the transmission and receiving, the frequency error might result from different clocks driving the A/D converters: one or both of the clocks may not be running at exactly 44100 Hz. Something like that would affect the frequency you see in your FFT output.
Let's say I have a piece of music, and I want to find patterns that repeat, so that I can cut out certain areas without it being audible.
What would be the best approach to this in Python?
I thought about generating a waveform and slicing it into images to find two similar ones, but I don't know where to start or whether it's even a good idea.
You can split the signal into buffers and compare them with an FFT. If the result of the FFT differs from the previous one by more than some specified value, you can mark a section boundary there. But this really depends on what kind of music you are running the algorithm on: for house music, for example, it could be hard to distinguish parts with an FFT alone, so you could instead derive the tempo of the track from the waveform of the percussion and measure the RMS value per section; if the RMS value changes, you have the next part. The most fun (and arguably most robust) solution would be to use a neural network, where the waveform is your input and the output is a list of timestamps of the sections, and there you go.
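If it helps, here is a rough sketch of that buffer-splitting idea in Python (the file name, frame size and threshold are illustrative choices you would need to tune for your material):

import numpy as np
from scipy.io import wavfile

fs, data = wavfile.read("track.wav")   # hypothetical file name
if data.ndim > 1:                      # mix down to mono if needed
    data = data.mean(axis=1)
data = data.astype(float)

FRAME = 4096
THRESH = 2.0                           # tune this for your material

frames = [data[i:i + FRAME] for i in range(0, len(data) - FRAME, FRAME)]
spectra = [np.abs(np.fft.rfft(f * np.hanning(FRAME))) for f in frames]

# difference between each frame's spectrum and the previous one, normalised by the median
diffs = np.array([np.sum(np.abs(spectra[i] - spectra[i - 1])) for i in range(1, len(spectra))])
diffs = diffs / (np.median(diffs) + 1e-12)

boundaries = [(i + 1) * FRAME / fs for i, d in enumerate(diffs) if d > THRESH]
print(boundaries)                      # candidate section changes, in seconds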
To complete Mateusz's answer, here is a post about using the Fourier transform to generate new features.
Other tools exist to split an audio file into patterns or parts, for example pyAudioAnalysis. An explanation is given here.
I've been trying to solve this for a couple of weeks now, but it seems like I'm not able to wrap my head around it. The task is pretty simple: I'm getting a signal in volts from a microphone, and in the end I want to know how loud it is out there in dB(A).
There are so many problems I don't even know where to start. Let's begin with my idea:
Convert the voltage signal into a signal in pascals [Pa].
Run an FFT on that signal so I know which frequencies I'm dealing with.
Then somehow apply the A-weighting to that, but since I'm handling my values in [Pa] I can't just multiply or add my A-weighting.
Do an iFFT to get back to my time signal.
Go from Pa to dB.
Calculate the RMS and I'm done. (Hopefully)
The main problem is the A-weighting. I really don't get how I can apply it to a live signal. And since the FFT produces complex values, I'm also a little confused by that.
Maybe you get the idea/problem/workflow and can help me get at least a little bit closer to the goal.
A little disclaimer: I am 100% new to the world of acoustics, so please explain it like you would to a little child :D. I'm programming in Python.
Thanks in advance for your time!
To give you a short answer: this task can be done in only a few steps, utilizing the waveform_analysis package and Parseval's theorem.
The simplest implementation I can come up with is:
Time-domain A-weighting filtering of the signal, using this library:
import waveform_analysis
weighted_signal = waveform_analysis.A_weight(signal, fs)
Take the RMS of the signal (utilizing that the power in the time domain equals the power in the frequency domain: Parseval's theorem):
import numpy as np
rms_value = np.sqrt(np.mean(np.abs(weighted_signal)**2))
Convert this amplitude to dB:
result = 20 * np.log10(rms_value)
This gives you the results in dB(A)FS, if you run these three snippets together.
To get the dB(A)Pa value, you need to know what 0 dBPa corresponds to in dBFS. This is usually done by having a calibrated source such as https://www.grasacoustics.com/products/calibration-equipment/product/756-42ag
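For illustration, assuming you have measured such a calibration value, applying it could look something like this (the -30.0 dBFS figure is purely hypothetical, and signal and fs are the same variables as in the snippets above):

import numpy as np
import waveform_analysis

calibration_dbfs_at_1pa = -30.0                    # hypothetical: what a 1 Pa (94 dB SPL) calibrator reads in dBFS
weighted_signal = waveform_analysis.A_weight(signal, fs)
rms_value = np.sqrt(np.mean(np.abs(weighted_signal) ** 2))
dbafs = 20 * np.log10(rms_value)                   # dB(A) relative to full scale
dba_spl = dbafs - calibration_dbfs_at_1pa + 94.0   # shift to dB(A) SPL using the calibration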
One flaw of this implementation is that the time signal is not pre-windowed. This is, on the other hand, not an issue for sufficiently long signals.
Hello Christian Fichtelberg, and welcome to Stack Overflow. I believe your question would be answered more easily on DSP StackExchange, but I will try to provide a quick and dirty answer.
In order to avoid taking the signal to the frequency domain and doing the multiplication there, you can implement the A-weighting filter in the time domain and perform the filtering as a convolution there. (Note that convolution in the time domain, where your signal "resides", is equivalent to multiplication in the frequency domain; if you are unfamiliar with this, please have a quick look at Wikipedia's convolution page.)
I won't go into the details of the possible pros and cons of each method (time-domain convolution vs. frequency-domain multiplication). You could search on DSP SE or look into a DSP textbook (such as Oppenheim's Digital Signal Processing, or the equivalent book by Proakis and Manolakis).
According to IEC 61672-1:2013, the digital filter should be "translated" from the analogue filter (a good way to do so is the bilinear transform). The proposed filter is a fairly "simple" IIR (Infinite Impulse Response) filter.
I will skip the implementation here, as it has been provided by others. Please find a MATLAB implementation, a Python implementation (most probably what you are seeking for your application), a quite "high-level" answer on DSP SE with some links, and information on designing filters for arbitrary sample rates, also on DSP SE.
Finally, I would like to mention that if you manage to create a ("smooth enough") polynomial approximation to the A-weighting curve, you could possibly multiply the spectrum by that polynomial in the frequency domain, changing only the magnitude, and then perform an iFFT to go back to the time domain. This should give an approximation to the A-weighted signal. Please note that this is NOT the correct way to do filtering, so treat it with caution (if you decide to try it at all) and only as a quick way to perform some checks.
I'm currently working on a tkinter Python school project whose sole purpose is to generate images from audio files. I'm going to pick audio properties and use them as values to generate unique abstract images, but I don't know which properties I can analyze to extract values from. So I'm looking for some guidance on which properties (audio frequency, amplitude, etc.) I can extract values from to generate the images with Python.
The question is very broad in its current form.
(Bear in mind audio is not my area of expertise, so do keep an eye out for the opinions of people working in audio/audiovisual/generative fields.)
You can go about it either way: figure out what kind of image(s) you'd like to create from audio and from there figure out which audio features to use, or the other way around: pick an audio feature you'd like to explore, then think about how you'd best, or most interestingly, represent it visually.
There's a distinction between a single image and a sequence of images.
For a single image, the simplest thing I can think of is drawing a grid of squares where a visual property of each square (e.g. its size, fill colour intensity, etc.) is mapped to the amplitude at that time; see the sketch below. The single image would visualise a whole track's amplitude pattern. Even with such a simple example there are many choices you can make: how often you sample, how you lay out the grid (cartesian, polar), and how each amplitude sample is visualised (different shapes, sizes, colours, etc.).
(Similar concept to CinemaRedux, simpler for audio only)
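As a starting point, here is a minimal sketch of that grid idea using only the standard wave module, numpy and tkinter (the path "track.wav" is hypothetical, it assumes a mono 16-bit PCM WAV, and the chunk size and grid layout are arbitrary choices): one grid square per chunk, with the square's size mapped to that chunk's RMS amplitude.

import wave
import numpy as np
import tkinter as tk

CHUNK = 4096          # samples per grid cell
COLS, CELL = 20, 24   # grid layout: 20 columns, 24 px per cell

with wave.open("track.wav", "rb") as wf:
    raw = wf.readframes(wf.getnframes())
samples = np.frombuffer(raw, dtype=np.int16).astype(float)

# per-chunk RMS, normalised to 0..1
chunks = [samples[i:i + CHUNK] for i in range(0, len(samples), CHUNK)]
rms = np.array([np.sqrt(np.mean(c ** 2)) for c in chunks if len(c)])
rms /= rms.max() or 1.0

root = tk.Tk()
canvas = tk.Canvas(root, width=COLS * CELL,
                   height=((len(rms) // COLS) + 1) * CELL, bg="white")
canvas.pack()
for i, r in enumerate(rms):
    x, y = (i % COLS) * CELL, (i // COLS) * CELL
    pad = (1.0 - r) * CELL / 2          # louder chunk -> bigger square
    canvas.create_rectangle(x + pad, y + pad,
                            x + CELL - pad, y + CELL - pad, fill="black")
root.mainloop()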
You can look into the field of data visualisation for inspiration.
Information is Beautiful is a great place to start.
If you want to generate a sequence of images, that goes into audiovisual territory (e.g. abstract animation, audio-reactive motion graphics, etc.).
Your question originally had the Processing tag, which I removed; however, you could be using Processing's Python Mode.
In terms of audio visualisation, one good example I can think of is Robert Hodgin's work; see Magnetosphere and the audio-generated landscape prototype. He is using frequency analysis (FFT) with a bit of smoothing/data massaging to amplify the elements useful for visualisation and dampen some of the noise.
(There are a few handy audio libraries for Processing, such as Minim and Beads; however, I assume you're interested in using plain Python, not Jython, which is what the official Processing Python mode uses.) Here is an answer on FFT analysis for visualisation (even though it's in Processing Java, the principles can be applied in Python).
Personally I've only used pyaudio so far, for basic audio tasks. I would assume you could use it for amplitude analysis, but for other, more complex tasks you might need something extra.
Doing a quick search, librosa pops up.
If what you want to achieve isn't clear yet, try prototyping first and start with the simplest audio analysis and visual elements you can think of (e.g. amplitude mapped to boxes over time). Constraints can be great for creativity, and the minimal approach could translate into cleaner, more minimal visuals.
You can then look into FFT, MFCC, onset/ beat detection, etc.
Another tool that could be useful for prototyping is Sonic Visualiser.
You can open a track and use some of the built-in feature extractors.
(You can even get away with exporting XML or CSV data from Sonic Visualiser, which you can load/parse in Python and use to render image(s).)
It uses a plugin system (similar to VST plugins in DAWs like Ableton Live, Apple Logic, etc.) called Vamp plugins. You can then use the VampPy Python wrapper if you need the data at runtime.
(You might also want to draw inspiration from other environments used for audiovisual artworks, like Pure Data + GEM, Max/MSP + Jitter, VVVV, etc.)
Time domain: zero-crossing rate, root mean square energy, etc. Frequency domain: spectral bandwidth, flux, rolloff, flatness, MFCCs, etc. Also tempo. You can use librosa for Python (https://librosa.org/doc/latest/index.html) to extract these from a .wav file; it implements the Fast Fourier Transform and framing for you. You can then apply statistics such as the mean and standard deviation to the vector of each of the above characteristics across the whole audio file, as sketched below.
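A minimal sketch of that kind of extraction with librosa might look like this (the file name is hypothetical; the calls are standard librosa feature functions, but check the docs for the version you have installed):

import numpy as np
import librosa

y, sr = librosa.load("track.wav", sr=None)   # hypothetical file name

features = {
    "zcr": librosa.feature.zero_crossing_rate(y),           # time domain
    "rms": librosa.feature.rms(y=y),                         # time domain
    "bandwidth": librosa.feature.spectral_bandwidth(y=y, sr=sr),
    "rolloff": librosa.feature.spectral_rolloff(y=y, sr=sr),
    "flatness": librosa.feature.spectral_flatness(y=y),
    "mfcc": librosa.feature.mfcc(y=y, sr=sr, n_mfcc=13),
}

# collapse each frame-wise feature to mean and standard deviation over the whole file
summary = {name: (float(np.mean(f)), float(np.std(f))) for name, f in features.items()}

tempo, _ = librosa.beat.beat_track(y=y, sr=sr)
summary["tempo"] = (float(tempo), 0.0)

for name, (mean, std) in summary.items():
    print(f"{name}: mean={mean:.4f}, std={std:.4f}")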
Providing an additional avenue for exploration: you have some tools to explore this qualitatively (as opposed to quantitatively, using metrics derived from the audio signal as suggested in the great answers above).
As you mention, the objective is to generate unique abstract images from sound, so I would suggest an interesting angle may be to apply some machine learning techniques and derive mood classification predictions from the source audio.
For instance, you could use the TensorFlow models in Essentia to predict the mood of the track and associate the images you select with the mood scores generated. I would suggest going well beyond this and using the tkinter image creation tools to create your own mappings to mood. Use pen and paper to develop your mapping strategy: are certain moods more angular or circular? Which colour mappings will you select, and why? You have a great deal of freedom to create these mappings, so start simple, as complexity builds naturally.
Using some simple mood predictions may be more useful for you as someone with more experience of the qualitative side of sound than of the quantitative side of an audio engineer. If documenting your mapping decisions and design process is a requirement of the task, this may be worth making central to the report you write.
Before I begin, I have to tell you that I have zero knowledge about DSP in Python.
I want to deconvolve two sound signals using Python so that I can extract the room impulse response; the input signal is a sine sweep and the output is a recording of it.
I wrote a piece of code but it didn't work. I've been trying for too long, really without results.
Can someone please help me with code that calculates the FFT of the input and the output, then computes h, the iFFT of their ratio, and plots it?
Deconvolution is a tough, ill-posed problem in the presence of noise and spatially-variant blurring. I assume you have a non-spatially-variant problem, since you are using FFTs, so you can use the restoration module from the skimage Python package (instead of programming the algorithm at a low level with FFTs).
Here you can study a code example with one of the methods implemented in the restoration module.
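If you first want to try the plain FFT-ratio approach the question describes (rather than skimage's restoration methods), a minimal sketch could look like the following; the file names are hypothetical, mono WAV files are assumed, and the small regularisation term is there to avoid dividing by near-zero bins:

import numpy as np
import matplotlib.pyplot as plt
from scipy.io import wavfile

fs, x = wavfile.read("sweep.wav")       # input sine sweep (hypothetical file name)
_,  y = wavfile.read("recording.wav")   # recorded room response to the sweep
if x.ndim > 1:
    x = x.mean(axis=1)
if y.ndim > 1:
    y = y.mean(axis=1)
x = x.astype(float)
y = y.astype(float)

n = len(x) + len(y) - 1                 # zero-pad so the division matches linear (not circular) convolution
X = np.fft.rfft(x, n)
Y = np.fft.rfft(y, n)

eps = 1e-8 * np.max(np.abs(X))          # regularisation to tame bins where X is tiny
H = Y * np.conj(X) / (np.abs(X) ** 2 + eps)
h = np.fft.irfft(H, n)                  # estimated room impulse response

t = np.arange(len(h)) / fs
plt.plot(t, h)
plt.xlabel("Time [s]")
plt.ylabel("Amplitude")
plt.title("Estimated impulse response")
plt.show()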
I recommend you read O'Leary et al.'s book if you want to learn more. The authors of this book also have more advanced books on this great topic.
I'm working on an EEG signal processing method for recognition of the P300 ERP.
At the moment, I'm training my classifier with a single vector of data that I get by averaging across preprocessed data from a chosen subset of the original 64 channels. I'm using the values from the EEG directly, not frequency features from an FFT. The method actually gets quite solid performance of around 75% accurate classification.
I would like to improve it by using ICA to clean the EEG data a bit. I've read through a lot of tutorials and papers and I am still kind of confused.
I'm implementing my method in Python, so I chose to use sklearn's FastICA.
from sklearn.decomposition import FastICA

# fit ICA on the (n_samples, n_channels) EEG matrix and get the estimated sources
self.ica = FastICA(n_components=64, max_iter=300)
icaSignal = self.ica.fit_transform(self.signal)
From a 25256 samples x 64 channels matrix I get a matrix of estimated sources that is also 25256 x 64. The problem is that I'm not quite sure how to use the output.
Averaging those components and training the classifier the same way as with the raw signal reduces performance to less than 30%, so that is probably not the way.
Another approach I read about is rejecting some of the components at this point: the ones that represent eye blinks, muscle activity, etc., based on their frequency content and some other heuristics. I'm also not quite confident about how to do that exactly.
After I reject some of the components, what is the next step? Should I try to average the ones that are left and feed the classifier with them, or should I try to reconstruct the EEG signal without them? If so, how do I do that in Python? I wasn't able to find any information about that reconstruction step. It is probably much easier to do in MATLAB, so nobody bothered to write about it :(
Any suggestions? :)
Thank you very much!
I haven't used Python for ICA, but in terms of the steps, it shouldn't matter whether it's MATLAB or Python.
You are completely right that it's hard to reject ICA components; there is no widely accepted objective measurement. There are certain patterns for eye blinks (high voltage in frontal channels) and for muscle artifacts (wide spectral coverage, because it's EMG, at peripheral channels). If you don't know where to get started, I recommend reading the help of a MATLAB plugin called EEGLAB. This UCSD group has some nice materials to help you start.
https://eeglab.org/
To answer your question on the ICA reconstruction: after rejecting some ICA components, you should reconstruct the original EEG without them.
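With sklearn's FastICA, that reconstruction can be done by zeroing the rejected sources and mapping the rest back to channel space with inverse_transform. A minimal sketch (the component indices in reject are hypothetical; pick them from your own inspection, and signal stands for your 25256 x 64 data matrix):

import numpy as np
from sklearn.decomposition import FastICA

ica = FastICA(n_components=64, max_iter=300)
sources = ica.fit_transform(signal)                  # (25256, 64) estimated sources

reject = [0, 7, 13]                                  # hypothetical eye-blink / muscle components
sources_clean = sources.copy()
sources_clean[:, reject] = 0.0                       # zero out the rejected components

cleaned_eeg = ica.inverse_transform(sources_clean)   # back to channel space, (25256, 64)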