I'd like to produce sounds that resemble audio from real instruments. The problem is that I have very little idea how to achieve that.
What I know so far about real instruments is that the sounds they produce are rarely clean. But how do I produce such unclean sounds?
So far I have the following. It produces quite a plain sound, and I'm not even sure it uses ALSA correctly.
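import numpy
from numpy.fft import fft, ifft
from numpy.random import random_sample
from alsaaudio import PCM, PCM_NONBLOCK, PCM_FORMAT_FLOAT_LE

# Open the default playback device: mono, 44.1 kHz, 32-bit float samples.
pcm = PCM()  # blocking mode; pass mode=PCM_NONBLOCK for non-blocking writes
pcm.setrate(44100)
pcm.setformat(PCM_FORMAT_FLOAT_LE)
pcm.setchannels(1)
pcm.setperiodsize(4096)

def sine_wave(x, freq=100):
    # Sample indices for period number x; float64 keeps the phase accurate.
    sample = numpy.arange(x*4096, (x+1)*4096, dtype=numpy.float64)
    sample *= numpy.pi * 2 / 44100
    sample *= freq
    # Convert to float32 to match PCM_FORMAT_FLOAT_LE.
    return numpy.sin(sample).astype(numpy.float32)

for x in range(1000):
    sample = sine_wave(x, 100)
    pcm.write(sample.tobytes())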
Sound synthesis is a complex topic which requires many years of study to master.
It is also not an entirely solved problem, although relatively recent developments (such as physical modelling synthesis) have made progress in imitating real-world instruments.
There are a number of options open to you. If you are sure that you want to explore synthesis further, then I suggest you start by learning about FM synthesis. It is relatively easy to learn and implement in software, at least in basic forms, and produces a wide range of interesting sounds. Also, check out the book "The Computer Music Tutorial" by Curtis Roads. It's a bible for all things computer music, and although it's a few years old it is the book of choice for learning the fundamentals.
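To make the FM suggestion a bit more concrete, here is a minimal sketch of two-operator FM synthesis with NumPy. The function name `fm_tone` and every parameter value (carrier frequency, modulator ratio, modulation index, decay rate) are illustrative assumptions, not settings taken from any real synthesizer.

```python
import numpy as np

RATE = 44100  # samples per second (assumed, matching the question's code)

def fm_tone(carrier_hz=220.0, ratio=2.0, index=5.0, seconds=1.0, decay=3.0):
    """Two-operator FM: a sine carrier whose phase is modulated by a second sine."""
    t = np.arange(int(RATE * seconds)) / RATE
    # Exponential decay envelope. It also scales the modulation index, so the
    # tone loses its upper partials as it fades.
    env = np.exp(-decay * t)
    modulator = np.sin(2 * np.pi * carrier_hz * ratio * t)
    signal = env * np.sin(2 * np.pi * carrier_hz * t + index * env * modulator)
    return signal.astype(np.float32)

tone = fm_tone()  # float32 samples in [-1, 1]
```

Changing `ratio` and `index` is where most of the timbre variation comes from: integer ratios tend to give harmonic tones, non-integer ratios give bell-like or metallic ones.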
If you want a quicker way to produce life-like sound, consider using sampling techniques: that is, record the instruments you want to reproduce (or use a pre-existing sample bank), and just play back the samples. It's a much more straightforward (and often more effective) approach.
Cheery, if you want to generate (from scratch) something that really sounds "organic", i.e. like a physical object, you're probably best off learning a bit about how these sounds are generated. For a solid introduction, you could have a look at a book such as Fletcher and Rossing's The Physics of Musical Instruments. There's lots of material on the web too; you might want to have a look at the primer James Clark has here
Having at least a skim over this sort of stuff will give you an idea of what you are up against. Modeling physical instruments accurately is very difficult!
If what you want is something that sounds physical, rather than something that sounds like instrument X, your job is a bit easier. You can build up frequencies quite easily and stack them together, add a little noise, and you'll get something that at least doesn't sound like a pure tone.
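As a rough sketch of that idea: stack a few harmonics of a base frequency, let the higher ones die away faster, and add a short burst of noise at the onset. The function name `additive_tone`, the 1/k amplitude weights and the decay constants are assumptions picked for illustration, not measurements of any instrument.

```python
import numpy as np

RATE = 44100  # samples per second (assumed)

def additive_tone(freq=220.0, seconds=1.0, n_harmonics=8, noise_level=0.02, seed=0):
    """Sum of decaying harmonics plus a brief noise burst at the start."""
    t = np.arange(int(RATE * seconds)) / RATE
    rng = np.random.default_rng(seed)
    signal = np.zeros_like(t)
    for k in range(1, n_harmonics + 1):
        # Harmonic k is quieter (1/k) and decays faster than the fundamental.
        signal += (1.0 / k) * np.exp(-1.5 * k * t) * np.sin(2 * np.pi * k * freq * t)
    # Noise that fades within a few tens of milliseconds, imitating an attack.
    signal += noise_level * rng.standard_normal(t.size) * np.exp(-20.0 * t)
    signal /= np.max(np.abs(signal))  # normalise to the range [-1, 1]
    return signal.astype(np.float32)

tone = additive_tone()
```

The result can be written to the sound card in chunks the same way as the sine wave in the question.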
Reading a bit about Fourier analysis in general will help, as will Frequency Modulation (FM) techniques.
Have fun!
I agree that this is very non-trivial and there's no set "right way", but you should consider starting with an existing MIDI SoundFont (or making your own).
As other people said, not a trivial topic at all. There are challenges both on the programming side of things (especially if you care about low latency) and on the synthesis side. A goldmine for sound synthesis is the page by Julius O. Smith, which covers a lot of synthesis techniques: http://ccrma-www.stanford.edu/~jos/
I am having trouble picking the right data structure/library. I lack experience in image processing and pattern recognition. The aim is to build a simple prototype that learns to recognize particular shapes in construction plans. I would be grateful for any pointers about the data structure, as I know it will be hard to switch later in the project, and I am not sure which one to pick.
The problem is that I plan to use some kind of neural network or algorithm later on, so the performance of processing the data structure may turn out to be my bottleneck.
I was thinking about NumPy / SciPy / PIL / Matplotlib.
I would be extremely grateful for the expertise of anyone who has tackled a similar problem.
If you're planning on using something like PyTorch later on (which would tie in to the sort of neural-network functionality you're pursuing), I'd become familiar with how NumPy operates, as it bridges pretty well into Torch data structures. If it helps, a lot of SciPy and Matplotlib functions work beautifully with NumPy structures right out of the box.
It's hard to tell exactly what you're looking for in these non-neural data structures; where are you worried about performance concerns and bottlenecks?
I'd recommend starting out with some PyTorch (or other deep learning framework) tutorials regarding image recognition and classification; it will get you closer to where you want to end up, and you'll be better able to make decisions about what your eventual program structure needs will be.
Before I begin, I have to tell you that I have zero knowledge of DSP in Python.
I want to deconvolve two sound signals using Python so that I can extract the room impulse response, the input signal being a sine sweep and the output a recording of it.
I wrote a piece of code but it didn't work. I've been trying for too long without any results.
Can someone please help me with code that calculates the FFT of the input and of the output, then calculates h as the inverse FFT of their ratio, and plots it?
Deconvolution is a tough, ill-posed problem in the presence of noise and spatially variant blurring. Since you are using FFTs, I assume your problem is not spatially variant, so you can use the restoration module from the scikit-image (skimage) Python package instead of programming the algorithm at a low level with FFTs.
Here you can study a code example that uses one of the methods implemented in the restoration module.
I recommend reading the book by O'Leary et al. if you want to learn more. All the authors of this book have more advanced books on this great topic.
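For reference, the FFT-ratio approach described in the question can be sketched with NumPy alone. This is a minimal sketch, not a definitive implementation: the function name `estimate_impulse_response` and the regularisation constant `eps` are assumptions, and the plain ratio Y/X is replaced by a regularised division so that frequencies where the sweep has almost no energy do not blow up. The "room" in the demo is a synthetic stand-in, not real data.

```python
import numpy as np

def estimate_impulse_response(x, y, eps=1e-3):
    """Estimate h such that y is approximately x convolved with h."""
    n = len(x) + len(y) - 1          # pad so circular convolution cannot wrap
    X = np.fft.rfft(x, n)
    Y = np.fft.rfft(y, n)
    # Regularised version of Y / X; eps is relative to the strongest bin of X.
    H = Y * np.conj(X) / (np.abs(X) ** 2 + eps * np.max(np.abs(X)) ** 2)
    return np.fft.irfft(H, n)

# Demo with a synthetic 1-second linear sweep and a made-up impulse response.
rate = 44100
t = np.arange(rate) / rate
f0, f1 = 20.0, 20000.0
sweep = np.sin(2 * np.pi * (f0 * t + 0.5 * (f1 - f0) * t ** 2))
room = np.zeros(2000)
room[0], room[441], room[1500] = 1.0, 0.5, 0.25   # direct sound plus two echoes
recording = np.convolve(sweep, room)
h_est = estimate_impulse_response(sweep, recording)
```

To plot the result, something like `matplotlib.pyplot.plot(h_est[:len(room)])` should do. With a real recording, the input and output also need to be at the same sample rate and roughly time-aligned.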
I'm currently working on some code to transmit messages, files and other data over lasers using audio transformation. My current code uses the hexlify function from the binascii module in Python to convert the data to binary, and then emits one tone for a 1 and a different tone for a 0. In theory this works, although it is not the fastest way to encode and decode, but testing has revealed a few problems.
The tones generated are not spot on, i.e. emitting 150 Hz can turn out to be 145-155 Hz on the receiving end. This isn't a huge issue, as I can just widen the boundaries on the receiving end.
The real problem is that if I emit a tone and it is played, the computer on the receiving end may read it multiple times or not read it at all, depending on the rate at which it samples the incoming audio. I have tried to play the tones at the same speed it samples, but that is very iffy.
In all, I have had a couple of successful runs using short messages, but this is very unreliable and inaccurate due to the above mentioned issues.
I have looked into this further and a solution to this looks like it could involve BPSK or Binary Phase Shift Keying, although I'm not sure how to implement this. Any suggestions or code samples would be appreciated!
My code for the project can be found here, but the main files I'm working on are for binary decoding and encoding, which are here and here. I'm not an expert in Python, so please pardon me if anything I've said is wrong, my code isn't the best, or if I've overlooked something basic.
Thanks! :-)
Take a look at GNU Radio!
http://gnuradio.org/redmine/projects/gnuradio/wiki
GNU Radio is a project to do as much as possible of radio signal transmission and reception in software. Because radio already uses phase shift keying, the GNU Radio guys have already solved the problem for you, and GNU Radio is already a Python project! The complicated DSP stuff is written in C++ for speed, but wrapped for use in Python.
Here is a page discussing a project using Differential Binary Phase Shift Keying (DBPSK)/ Differential Quadrature Phase Shift Keying (DQPSK) to transmit binary data (in the example, a JPEG image). Python source code is available for download.
http://www.wu.ece.ufl.edu/projects/softwareRadio/
I see that your project is under the MIT license. GNU Radio is under GPL3, which may be a problem for you. You need to figure out if you can use GNU Radio without needing to make your project into a derived work, thus forcing you to change your license. It should be possible to make a standalone "sending daemon" and a standalone "receiving daemon", both of whose source code would be GPL3, and then have your MIT code connect to them over a socket or something.
By the way, one of my searches found this very clear explanation of how BPSK works:
http://cnx.org/content/m10280/latest/
Good luck!
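For a feel of what BPSK does before reaching for GNU Radio, here is a minimal NumPy sketch: each bit flips the sign (a 180 degree phase shift) of a sine carrier, and the receiver multiplies by the same carrier and sums over each bit period. The carrier frequency, bit length and function names are assumptions chosen for illustration. The sketch also assumes the receiver already knows the carrier phase and where each bit starts, which a real link does not; recovering those is exactly what differential schemes such as DBPSK and timing recovery are for.

```python
import numpy as np

RATE = 44100            # samples per second (assumed)
CARRIER_HZ = 1470.0     # assumed carrier: exactly 30 samples per cycle
SAMPLES_PER_BIT = 300   # assumed bit length: 10 carrier cycles per bit

def bpsk_modulate(bits):
    """Map bits to +/-1 and multiply a sine carrier by that sign."""
    symbols = 2 * np.asarray(bits) - 1            # 0 -> -1, 1 -> +1
    baseband = np.repeat(symbols, SAMPLES_PER_BIT)
    t = np.arange(baseband.size) / RATE
    return baseband * np.sin(2 * np.pi * CARRIER_HZ * t)

def bpsk_demodulate(signal):
    """Coherent detection: mix with the carrier, integrate over each bit."""
    t = np.arange(signal.size) / RATE
    mixed = signal * np.sin(2 * np.pi * CARRIER_HZ * t)
    n_bits = signal.size // SAMPLES_PER_BIT
    per_bit = mixed[:n_bits * SAMPLES_PER_BIT].reshape(n_bits, SAMPLES_PER_BIT).sum(axis=1)
    return (per_bit > 0).astype(int)

bits = [1, 0, 1, 1, 0, 0, 1, 0]
rng = np.random.default_rng(0)
noisy = bpsk_modulate(bits) + 0.5 * rng.standard_normal(len(bits) * SAMPLES_PER_BIT)
decoded = bpsk_demodulate(noisy)
```

Because every bit occupies a fixed number of samples, the decoder counts samples rather than polling for tones, which avoids the "read twice or not at all" problem as long as the two sound cards' clocks stay close.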
In response to the first issue regarding the frequency:
Looking at your decoder, I see that your sample rate is 44100 and your chunk size is 2048. If I am reading this right, that means your FFT size is 2048, which puts your FFT bin width at about 21.5 Hz (44100 / 2048). Have you tried zero-padding your FFT? Zero-padding won't change the frequency content, and strictly speaking it interpolates the spectrum rather than adding true resolution, but it samples the spectrum on a finer grid, so the peak can be located more precisely. I do see you are using quadratic interpolation to improve your frequency estimate. I haven't used that technique, so I'm not familiar with the improvement you get from it. Maybe a balance between zero-padding and quadratic interpolation will get you better frequency accuracy.
Also, depending on the hardware doing the transmission and receiving, the frequency error might be a result of different clocks driving the A/D - One or both of the clocks are not at exactly 44100Hz. Something like that might affect the frequency you see on your FFT output.
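The bin-width arithmetic and the effect of zero-padding can be checked with a few lines of NumPy. This is a sketch under assumptions: the helper `peak_frequency`, the Hann window, the padding factor of 8 and the 160 Hz test tone (chosen because it falls between two bins of the unpadded FFT) are illustrative choices, not taken from the decoder in the question.

```python
import numpy as np

RATE = 44100
CHUNK = 2048

def peak_frequency(chunk, rate=RATE, pad_factor=1):
    """Frequency of the strongest FFT bin, optionally with zero-padding."""
    windowed = chunk * np.hanning(len(chunk))
    n_fft = len(chunk) * pad_factor       # rfft zero-pads the input to n_fft
    spectrum = np.abs(np.fft.rfft(windowed, n_fft))
    k = int(np.argmax(spectrum))
    return k * rate / n_fft

bin_width = RATE / CHUNK                  # about 21.5 Hz

t = np.arange(CHUNK) / RATE
tone = np.sin(2 * np.pi * 160.0 * t)
coarse = peak_frequency(tone)                  # snaps to the 21.5 Hz grid
fine = peak_frequency(tone, pad_factor=8)      # grid is 8 times finer
```

Here `fine` should land noticeably closer to 160 Hz than `coarse`, at the cost of a larger FFT per chunk.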
Possible Duplicate:
Learning efficient algorithms
I recently came across a problem that was solved by applying the correct algorithm: Calculating plugin dependencies
While I was eventually able to understand the logic of the prescribed algorithm, it was not an easy task for me. The only reason I was able to come up with code that worked was the logic example on the Wikipedia page.
Being entirely self taught, without any CS or math background, I'd like to at least get some practical foundation to being able to apply algorithms to solve problems.
That said, are there any great books or resources (something akin to 'algorithms for dummies') that don't expect you to have completed college Algebra 9 or Calculus 5 and that can teach the basics? I don't expect to ever be a wizard, just to expand my problem-solving tool-set a little bit.
An Amazon search turns up a bunch of books, but I'm hoping you guys can point me to the truly useful resources.
The only language I have any real experience with is Python (a tiny bit of C) so whatever I find needs to be language agnostic or centred around Python/C.
"Art of Computer Programming" by Donald Knuth is a Very Useful Book.
A great book is "Introduction to Algorithms" by Cormen, Leiserson, Rivest and Stein.
Probably not the easiest one but it is very good indeed.
I found the following sources useful:
"Analysis of Algorithms: An Active Learning Approach" by Jeffrey J. McConnell;
"Python Algorithms: Mastering Basic Algorithms in the Python Language" (Expert's Voice in Open Source) by Magnus Lie Hetland. This book seems to me to be very much like the previous one, but from a Python developer's point of view;
http://en.wikipedia.org/wiki/Structure_and_Interpretation_of_Computer_Programs
Steve Skiena's Algorithm Design Manual is very good. It doesn't assume very much background knowledge, and covers several important topics in algorithms.
Personally I found Algorithms and Complexity to be super helpful. I also don't have a CS degree or anything.
I am looking for references (tutorials, books, academic literature) on structuring unstructured text in a manner similar to the Google Calendar quick add button.
I understand this may come under the NLP category, but I am interested only in the process of going from something like "Levi jeans size 32 A0b293"
to: Brand: Levi, Size: 32, Category: Jeans, code: A0b293
I imagine it would be some combination of lexical parsing and machine learning techniques.
I am rather language agnostic, but if pushed I would prefer Python, MATLAB or C++ references.
Thanks
You need to provide more information about the source of the text (the web? user input?), the domain (is it just clothes?), the potential formatting and vocabulary...
Assuming the worst-case scenario, you need to start learning NLP. A very good free book is the documentation of NLTK: http://www.nltk.org/book . It is also a very good introduction to Python, and the software is free (for various usages). Be warned: NLP is hard. It doesn't always work. It is not fun at times. The state of the art is nowhere near where you imagine it is.
Assuming a better scenario (your text is semi-structured) - a good free tool is pyparsing. There is a book, plenty of examples and the resulting code is extremely attractive.
I hope this helps...
Possibly look at "Programming Collective Intelligence" by Toby Segaran. I seem to remember it addressing the basics of this in one chapter.
After some research I have found that this problem is commonly referred to as Information Extraction, and I have amassed a few papers and stored them in a Mendeley collection:
http://www.mendeley.com/research-papers/collections/3237331/Information-Extraction/
Also, as Tai Weiss noted, NLTK for Python is a good starting point, and this chapter of the book looks specifically at information extraction.
If you are only dealing with cases like the example you cited, you are better off using a manual rule-based approach that is 100% predictable and covers 90% of the cases it might encounter in production.
You could enumerate lists of all possible brands and categories and detect which is which in an input string, because there's usually very little overlap between these two lists.
The other two fields could easily be detected and extracted using regular expressions (1-3 digit numbers are always sizes, etc.).
Your problem domain doesn't seem big enough to warrant a more heavy duty approach such as statistical learning.
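A minimal sketch of that rule-based approach, using only the standard library. The brand and category lists are tiny hypothetical placeholders, and the rule for what counts as a product code (at least five alphanumeric characters mixing letters and digits) is an assumption made for this example.

```python
import re

BRANDS = {"levi", "wrangler", "lee"}          # hypothetical brand list
CATEGORIES = {"jeans", "jacket", "shirt"}     # hypothetical category list

SIZE_RE = re.compile(r"^\d{1,3}$")            # 1-3 digit numbers are sizes
# Assumed code format: 5+ alphanumerics with at least one digit and one letter.
CODE_RE = re.compile(r"^(?=.*\d)(?=.*[A-Za-z])[A-Za-z0-9]{5,}$")

def extract(text):
    """Classify each whitespace-separated token with simple lookup rules."""
    record = {}
    for token in text.split():
        low = token.lower()
        if low in BRANDS:
            record["Brand"] = token
        elif low in CATEGORIES:
            record["Category"] = token.capitalize()
        elif SIZE_RE.match(token):
            record["Size"] = int(token)
        elif CODE_RE.match(token):
            record["code"] = token
        # Anything else (e.g. the word "size") is ignored.
    return record

result = extract("Levi jeans size 32 A0b293")
# result == {"Brand": "Levi", "Category": "Jeans", "Size": 32, "code": "A0b293"}
```

Tokens that match none of the rules are silently dropped here; in production it may be worth logging them to see which cases the rules miss.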