Problems with Echo Nest Earworm analyzing small mp3s - python

Alright everyone, this one is super niche:
I am attempting to use the earworm.py code to analyze the timbre and pitch features of very short mp3s/tracks (as short as 1 second); however, the code returns no features and an empty graph.
The issue seems to stem from the function get_central(analysis, member='segments'): with short tracks, member = getattr(analysis, member) returns empty.
Why is this? Is there a quick fix I could use like changing "member='segments'" to something that is more fine-grained?
Is there a way to extract timbre and pitch features from such short tracks using EchoNest?
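For context, the features in question live on the analysis segments themselves; here is a minimal sketch of reading the per-segment timbre and pitch vectors with the old Echo Nest Remix API (which earworm.py builds on), where the filename is a placeholder and a configured Echo Nest API key is assumed:

    from echonest.remix import audio

    track = audio.LocalAudioFile("short_clip.mp3")  # placeholder filename
    for seg in track.analysis.segments:
        # each segment carries a 12-dimensional timbre vector and a
        # 12-dimensional pitch (chroma) vector
        print(seg.start, seg.duration, seg.timbre, seg.pitches)

If the track is so short that the analysis returns no segments at all, there is nothing for earworm.py to build its graph from, which would explain the empty result.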

Related

Find patterns in a piece of music

Let's say I have a piece of music, and I want to find patterns that repeat themselves, so that I can cut out certain sections without the cut being audible.
What would be the best approach in Python?
I thought about generating a waveform and then slicing it into images to find two similar ones, but I don't know where to start, or whether it's a good idea.
You can split the signal into buffers and compare them with an FFT. If the result of the FFT differs from the previous one by more than some specified value, you can mark the boundary of a part. But this really depends on what kind of music you are running the algorithm on: for house music, for example, it could be problematic to distinguish parts with an FFT, so you could instead acquire the tempo of the track from the waveform of the percussion and measure the RMS value; if the RMS value changes, you have the next part. The most fun (and arguably most robust) solution to this problem would be to use a neural network, where the waveform is your input and the output is a list of timestamps of the sections, and there you go.
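A rough sketch of the buffer-comparison idea, assuming a mono signal already loaded as a numpy array (for example via scipy.io.wavfile.read); the buffer size and threshold are arbitrary placeholders that would need tuning per track:

    import numpy as np

    def find_section_boundaries(signal, buffer_size=4096, threshold=50.0):
        """Split the signal into buffers, FFT each one, and flag a boundary
        whenever the spectrum differs strongly from the previous buffer."""
        boundaries = []
        prev_spectrum = None
        for start in range(0, len(signal) - buffer_size, buffer_size):
            buf = signal[start:start + buffer_size].astype(float)
            spectrum = np.abs(np.fft.rfft(buf))
            if prev_spectrum is not None and \
                    np.linalg.norm(spectrum - prev_spectrum) > threshold:
                boundaries.append(start)  # sample index where the change occurs
            prev_spectrum = spectrum
        return boundaries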
To complete Mateusz's answer, here is a post about using the Fourier transform to generate new features.
Other tools exist to split an audio file into patterns or parts, such as pyAudioAnalysis. An explanation is given here.
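For instance, a sketch using pyAudioAnalysis's segmentation module (this assumes a recent version of the library, and the function shown splits on silence rather than on musical patterns, so treat it only as a starting point):

    from pyAudioAnalysis import audioBasicIO
    from pyAudioAnalysis import audioSegmentation as aS

    # read the file as (sampling rate, signal array); the path is a placeholder
    fs, signal = audioBasicIO.read_audio_file("track.wav")

    # returns [start, end] pairs (in seconds) of the non-silent parts
    segments = aS.silence_removal(signal, fs, 0.020, 0.020,
                                  smooth_window=1.0, weight=0.3, plot=False)
    print(segments)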

Methods for Point of View Analysis using Python

Can anyone please point me to some techniques to do a Point of View Analysis on novel text?
I'm basically looking for methods to determine how many words were written from different characters' points of view in a novel, preferably using Python.
Something like this: Statistical Analysis of WoT
Maybe this could help you.
You first start by storing every word, then you count every occurrence.
If you also need to plot them in a graph, you could draw a histogram with matplotlib like here.
Hope it helps!
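A minimal sketch of that counting approach with the standard library and matplotlib; the input file and the top-20 cutoff are placeholders:

    from collections import Counter
    import matplotlib.pyplot as plt

    text = open("novel.txt").read().lower()      # placeholder input file
    counts = Counter(text.split())               # store every word, count occurrences
    words, freqs = zip(*counts.most_common(20))  # keep the 20 most frequent

    plt.bar(range(len(words)), freqs)
    plt.xticks(range(len(words)), words, rotation=90)
    plt.tight_layout()
    plt.show()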

Python: Create Nomograms from Data (using PyNomo)

I am working with Python 2.7. I want to create nomograms based on data for various variables in order to predict one variable. I have looked into and installed the PyNomo package.
However, from the documentation (here and here) and the examples, it seems that nomograms can only be made when you have equation(s) relating the variables, not from the data itself. For example, the examples here show how to use equations to create nomograms. What I want is to create a nomogram from the data and use it to predict things. How do I do that? In other words, how do I make the nomograph take data as input rather than a function? Is it even possible?
Any input would be helpful. If PyNomo cannot do it, please suggest some other package (in any language). For example, I am trying function nomogram from package rms in R, but not having luck with figuring out how to properly use it. I have asked a separate question for that here.
The term "nomogram" has become somewhat confused of late as it now refers to two entirely different things.
A classic nomogram performs a full calculation: you mark two scales, draw a straight line across the marks and read your answer from a third scale. This is the type of nomogram that PyNomo produces, and as you correctly say, you need a formula. Producing a nomogram like this from data is therefore definitely a two-step process: first fit an equation to the data, then hand that equation to PyNomo.
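For the first of those two steps, even a simple least-squares fit can supply the formula; here is a sketch with numpy, where the data, the polynomial degree and the variable names are all placeholders:

    import numpy as np

    x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])  # placeholder predictor data
    y = np.array([2.1, 3.9, 6.2, 8.1, 9.8])  # placeholder response data

    coeffs = np.polyfit(x, y, deg=1)          # y ~ coeffs[0]*x + coeffs[1]
    predict = np.poly1d(coeffs)

    # 'predict' is now a closed-form function of x, i.e. the kind of
    # equation PyNomo's block parameters expect.
    print(predict)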
The other use of the term (very popular recently) is to refer to regression nomograms. These are graphical depictions of regression models (usually logistic regression models). For these, a group of parallel predictor variables is depicted with a common scale on the bottom; for each predictor you read the 'score' from the scale and add the scores up. These types of nomograms have become very popular in the last few years, and that's what the rms package will draft. I haven't used it, but my understanding is that it works directly from the data.
Hope this is of some use! :-)

How to extract information?

Objective: I am trying to do a project on Natural Language Processing (NLP), where I want to extract information and represent it in graphical form.
Description:
I am considering news articles as input to my project.
Removing unwanted data from the input & putting it in a clean format.
Performing NLP & extracting information/knowledge.
Representing the information/knowledge in graphical form.
Is it possible?
If you want to use nltk, you can start here. It has some explanations about tokenizing, part-of-speech tagging, parsing and more.
Check this page for an example of named entity detection using nltk.
The graphical representation can be done using igraph or matplotlib.
Also, scikit-learn has great text feature extraction methods, in case you want to run some more sophisticated models.
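As a small illustration of that nltk pipeline (tokenize, tag, chunk), assuming the relevant nltk data packages (punkt, the POS tagger and the NE chunker) have been downloaded:

    import nltk

    sentence = "Barack Obama visited Berlin in 2013."  # placeholder input
    tokens = nltk.word_tokenize(sentence)
    tagged = nltk.pos_tag(tokens)       # part-of-speech tags
    tree = nltk.ne_chunk(tagged)        # subtrees labelled PERSON, GPE, etc.
    print(tree)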
The first step is to try to do this job yourself by hand with a pencil. Try it on not just one but a collection of news stories. You really do have to do this, not just think about it. Draw the graphics just as you'd want the computer to.
What this does is force you to create rules about how information is transformed into graphics. This is NOT always possible, so doing it by hand is a good test: if you can't do it, then you can't program a computer to do it.
Assuming you have found a paper-and-pencil method, what I like to do is work BACKWARDS. Your method starts with the text. No: start with the numbers you need to draw the graphic. Then think about where these numbers are in the stories and what words you have to look at to get them. Your job is now more like a hunting trip: you know the data is there, but how do you find it?
Sorry for the lack of details, but I don't know your exact problem; this approach works in every case, though. First learn to do the job yourself on paper, then work backwards from the output to the input.
If you try to design this software in the forward direction you get stuck quickly, because you can't possibly know what to do with your text when you don't know what you need. It's like pushing a rope: it doesn't work. Go to the other end and pull the rope. Do the graphic work FIRST, then pull the needed data from the news stories.

Machine Learning in Python - Get the best possible feature-combination for a label

My Question is as follows:
I know a little bit about ML in Python (using NLTK), and it works OK so far. I can get predictions given certain features. But I want to know: is there a way to display the best features to achieve a label? I mean the direct opposite of what I've been doing so far (putting in all the circumstances and getting a label for them).
I'll try to make my question clear with an example:
Let's say I have a database with Soccer games.
The Labels are e.g. 'Win', 'Loss', 'Draw'.
The Features are e.g. 'Windspeed', 'Rain or not', 'Daytime', 'Fouls committed' etc.
Now I want to know: Under which circumstances will a Team achieve a Win, Loss or Draw? Basically I want to get back something like this:
Best conditions for Win: Windspeed=0, No Rain, Afternoon, Fouls=0 etc
Best conditions for Loss: ...
Is there a way to achieve this?
My paint skills aren't the best!
All I know is the theory, so you'll have to look up the code yourself.
If you have only one case (the best for each situation), the diagram becomes something like this (it won't really be 2-D, but something like it):
Green (Win), Orange (Draw), Red (Lose)
Now if you want to predict whether the team wins, loses or draws, you have (at least) two models to choose from:
Linear regression: the separator is the perpendicular bisector of the line joining the two points.
K-nearest neighbours: calculate the distance from all the points, and classify the new point the same as its closest neighbour.
So, for example, if you have a new data point and have to classify it, here's how:
We have a new point with certain attributes.
We classify it by calculating which side of the line the point falls on (or how far it is from our benchmark situations).
Note: you will have to give some weight to each factor for more accuracy.
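To make the k-nearest-neighbours idea concrete, here is a toy sketch with scikit-learn; the feature encoding and the training rows are invented purely for illustration:

    from sklearn.neighbors import KNeighborsClassifier

    # windspeed, rain (0/1), hour of day, fouls committed -- made-up rows
    X = [[0, 0, 14, 0],
         [20, 1, 20, 7],
         [5, 0, 16, 3]]
    y = ["Win", "Loss", "Draw"]

    clf = KNeighborsClassifier(n_neighbors=1).fit(X, y)
    print(clf.predict([[2, 0, 15, 1]]))  # label of the nearest training point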
You could compute how representative each feature is for separating the classes via feature weighting. The most common method for feature selection (and therefore feature weighting) in text classification is chi^2. This measure will tell you which features are better. Based on this information you can analyse the specific values that are best for every case. I hope this helps.
Regards,
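A short sketch of that chi^2 feature weighting with scikit-learn; the data is a made-up stand-in (note that chi2 requires non-negative feature values):

    from sklearn.feature_selection import SelectKBest, chi2

    # same made-up soccer features as above: windspeed, rain, hour, fouls
    X = [[0, 0, 14, 0],
         [20, 1, 20, 7],
         [5, 0, 16, 3]]
    y = ["Win", "Loss", "Draw"]

    selector = SelectKBest(chi2, k=2).fit(X, y)
    print(selector.scores_)                    # higher = more class-separating
    print(selector.get_support(indices=True))  # indices of the top features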
Not sure if you have to do this in python, but if not, I would suggest Weka. If you're unfamiliar with it, here's a link to a set of tutorials: https://www.youtube.com/watch?v=gd5HwYYOz2U
Basically, you'd just need to write a program to extract your features and labels and then output a .arff file. Once you've generated the .arff file, you can feed it to Weka and run a myriad of different classifiers on it to figure out which model best fits your data. If necessary, you can then program that model to operate on your data. Weka has plenty of ways to analyze your results and to display them graphically. It's truly amazing.
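Generating the .arff file itself is simple enough to do by hand; here is a minimal sketch with made-up attributes matching the soccer example above:

    # made-up rows: windspeed, rain, result
    rows = [(0, "no", "win"), (20, "yes", "loss"), (5, "no", "draw")]

    with open("soccer.arff", "w") as f:
        f.write("@relation soccer\n")
        f.write("@attribute windspeed numeric\n")
        f.write("@attribute rain {yes,no}\n")
        f.write("@attribute result {win,loss,draw}\n")
        f.write("@data\n")
        for wind, rain, result in rows:
            f.write("%d,%s,%s\n" % (wind, rain, result))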
