Are there any Deep Learning literature/references where they performed clustering in structured data?
I know it can be done using k-means, GMMs, etc. But is there any chance that cluster analysis can be done using deep neural nets and the like? Thanks.
Neural networks can be used in a clustering pipeline. For example, you can use self-organizing maps (SOMs) for dimensionality reduction and k-means for clustering. Autoencoders also come to mind, but then again, that is compression / dimensionality reduction rather than clustering; the actual clustering is done by something else.
The problem with clustering is the missing optimization goal. The problem is not well-defined.
Deep learning refers to the depth of the neural nets and the huge number of parameters used to learn how to recognize features related to a certain object. Neural nets essentially need a loss function to learn, and that loss should be an equation to which we can apply calculus to estimate how much each parameter needs to be corrected to get a better result (basically forward propagation to predict and backward propagation to update the parameters). Such a loss function for clustering does not exist as of now, so we don't use neural nets for clustering. And if there are no neural nets, there is no deep learning.
If any part of that seems confusing, comment below.
To read more about clustering algorithms, have a look at https://towardsdatascience.com/the-5-clustering-algorithms-data-scientists-need-to-know-a36d136ef68
Again you will find no neural nets. :)
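For readers unfamiliar with the forward/backward mechanism mentioned above, here is a minimal supervised training step. PyTorch, the toy data, and the layer sizes are my own illustrative choices, not something from the original answer.

```python
import torch
import torch.nn as nn

# Toy supervised data: 100 samples with 4 features, one continuous target.
X = torch.randn(100, 4)
y = torch.randn(100, 1)

model = nn.Sequential(nn.Linear(4, 8), nn.ReLU(), nn.Linear(8, 1))
loss_fn = nn.MSELoss()                      # the differentiable loss the answer refers to
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)

for epoch in range(50):
    pred = model(X)                         # forward propagation: predict
    loss = loss_fn(pred, y)                 # compare prediction to the known target
    optimizer.zero_grad()
    loss.backward()                         # backward propagation: compute gradients
    optimizer.step()                        # update each parameter to reduce the loss
```

The answer's point is that plain clustering has no known target y, so there is no obvious loss like this to backpropagate.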
Yes.
If you do a little bit of literature research yourself you will find that people have repeatedly published clustering with deep neural networks.
Except that it doesn't seem to work anywhere but on MNIST data...
Two other potential methods:

- KMeans + autoencoder (a simple deep learning architecture: reduce the dimensionality of the data with an autoencoder, then cluster with k-means); see the sketch below.
- Deep Embedded Clustering algorithm (advanced deep learning).
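As a rough illustration of the k-means + autoencoder idea (the layer sizes, toy data, and number of clusters below are assumptions made for the sketch, not part of the original posts): train an autoencoder to reconstruct the input, then run k-means on the learned bottleneck representation.

```python
import torch
import torch.nn as nn
from sklearn.cluster import KMeans

# Toy structured data: 500 samples, 20 features (replace with your own table).
X = torch.randn(500, 20)

encoder = nn.Sequential(nn.Linear(20, 10), nn.ReLU(), nn.Linear(10, 3))
decoder = nn.Sequential(nn.Linear(3, 10), nn.ReLU(), nn.Linear(10, 20))
autoencoder = nn.Sequential(encoder, decoder)

optimizer = torch.optim.Adam(autoencoder.parameters(), lr=1e-3)
loss_fn = nn.MSELoss()

# Train the autoencoder to reconstruct its input (compression, not clustering).
for epoch in range(200):
    recon = autoencoder(X)
    loss = loss_fn(recon, X)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()

# The "real" clustering is still done by k-means, but on the 3-d embedding.
with torch.no_grad():
    embedding = encoder(X).numpy()
labels = KMeans(n_clusters=5, n_init=10).fit_predict(embedding)
```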
Related
Let's assume we're dealing with continuous features and responses. We fit a linear regression model (let's say first order) and after CV we get a reasonably good R² (let's say R² = 0.8).
Why do we go for other ML algorithms? I've read some research papers where they tried different ML algorithms and took the simple linear model as the baseline for comparison. In those papers the linear model outperformed the other algorithms, so what I have difficulty understanding is: why do we go for other ML algorithms then? Why can't we just be satisfied with the linear model, especially in the specific case where the other algorithms perform poorly?
The other question is what do they gain from presenting the other algorithms in their research papers if these algorithms performed poorly?
The best model for predictive problems with a continuous output is a regression model, especially if you build it with a neural network (polynomial or linear) and tune the hyperparameters to the problem.
Other ML algorithms such as decision trees or SVMs have classification as their main goal; on paper they can also do regression, but in practice they cannot really extrapolate to new values.
Still, in research people always try to find better ways to predict values than plain regression, just as in the classification world we started with logistic regression, then decision trees, and now have SVMs, ensemble models, and deep learning.
I think the answer is because you never know.
> especially in the specific case where other algorithms perform poorly?
You only know they performed poorly because someone tried those models. It's always worth trying various models.
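A quick way to act on "you never know" is to cross-validate a few candidate models against the linear baseline. The dataset and the particular candidate models below are illustrative assumptions; any set of candidates would do.

```python
from sklearn.datasets import make_regression
from sklearn.linear_model import LinearRegression
from sklearn.ensemble import RandomForestRegressor
from sklearn.svm import SVR
from sklearn.model_selection import cross_val_score

# Toy continuous data standing in for your features and response.
X, y = make_regression(n_samples=300, n_features=10, noise=10.0, random_state=0)

models = {
    "linear baseline": LinearRegression(),
    "random forest": RandomForestRegressor(n_estimators=200, random_state=0),
    "SVR (RBF kernel)": SVR(),
}

# Compare all candidates with the same 5-fold CV and the same R^2 metric.
for name, model in models.items():
    scores = cross_val_score(model, X, y, cv=5, scoring="r2")
    print(f"{name}: mean R^2 = {scores.mean():.3f}")
```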
Is it possible to use "reinforcement learning" or a feedback loop on a supervised model?
I have worked on a machine learning problem using a supervised learning model, more precisely a linear regression model, but I would like to improve the results by creating a feedback loop on the outputs of the prediction, i.e., telling the algorithm if it made mistakes on some examples.
As far as I know, this is basically how reinforcement learning works: the model learns from positive and negative feedback.
I found out that we can implement supervised learning and reinforcement learning algorithms using PyBrain, but I couldn't find a way to relate between both.
Most (or maybe all) iterative supervised learning methods already use a feedback loop on the outputs of the prediction. In fact, this feedback is very informative, since it provides the exact amount of error on each sample. Think, for example, of stochastic gradient descent, where you compute the error of each sample to update the model parameters.
In reinforcement learning the feedback signal (i.e., the reward) is much more limited than in supervised learning. Therefore, in the typical setup of adjusting some model parameters, if you have a set of input-output pairs (i.e., a training data set), it probably makes no sense to apply reinforcement learning.
If you are thinking of a more specific case/problem, you should be more specific in your question.
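To make the point about per-sample feedback concrete, here is a minimal sketch with scikit-learn's SGDRegressor, whose partial_fit updates the parameters from the prediction error on each new batch. The synthetic data is just for illustration.

```python
import numpy as np
from sklearn.linear_model import SGDRegressor

rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 5))
y = X @ np.array([1.0, -2.0, 0.5, 0.0, 3.0]) + rng.normal(scale=0.1, size=1000)

model = SGDRegressor(learning_rate="constant", eta0=0.01)

# Stream the data in small batches: each partial_fit call measures the
# prediction error on the batch and nudges the parameters to reduce it.
for start in range(0, len(X), 50):
    Xb, yb = X[start:start + 50], y[start:start + 50]
    model.partial_fit(Xb, yb)

print("learned coefficients:", model.coef_.round(2))
```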
Reinforcement Learning has been used to tune hyper-parameters and/or select optimal Supervised Learning Models. There's also a paper on it: "Learning to optimize with Reinforcement Learning".
Reading Pablo's answer, you may want to read up on "backpropagation". It may be what you are looking for.
R's package 'forecast' has a function nnetar, which uses feed-forward neural networks with a single hidden layer to predict in time series.
Now I am using Python to do a similar analysis. I want to use a neural network which does not need to be as complex as deep learning. Maybe 2 layers and a couple of nodes are good enough for my case.
So, does Python have a simple neural network model which can be used on time series like nnetar? If not, how should I deal with this problem?
Any NN model with one or more hidden layers is a multi-layer perceptron model, and in that case it is trivial to extend it to N layers, so any library you pick will support it. My guess is that you are avoiding a complex library like PyTorch/TensorFlow because of its size.
TensorFlow does have TF-Lite, which can work on smaller IoT devices.
scikit-learn has MLPRegressor, which can train NNs, if that is more to your liking.
You can always write your own model. There are plenty of examples for this that use NumPy and are plenty fast for CPU computation. (A single-hidden-layer NN, I am guessing, will be more memory-bound than compute-bound.)
Use another ML algorithm. Single-hidden-layer NNs will often not perform as well as other, simpler algorithms.
If there are other reasons for not using a standard library like TensorFlow/PyTorch, you should mention them.
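As a sketch of the MLPRegressor route (the lag length and the toy series are my own assumptions, roughly mimicking what nnetar does), build lagged features from the series and fit a small single-hidden-layer network:

```python
import numpy as np
from sklearn.neural_network import MLPRegressor

# Toy seasonal series; replace with your own data.
rng = np.random.default_rng(0)
series = np.sin(np.arange(200) * 2 * np.pi / 12) + 0.1 * rng.normal(size=200)

# Build lagged features: predict y[t] from the previous n_lags values,
# similar in spirit to nnetar's autoregressive inputs.
n_lags = 12
X = np.column_stack([series[i:len(series) - n_lags + i] for i in range(n_lags)])
y = series[n_lags:]

model = MLPRegressor(hidden_layer_sizes=(4,), max_iter=2000, random_state=0)
model.fit(X, y)

# One-step-ahead forecast from the last n_lags observations.
next_value = model.predict(series[-n_lags:].reshape(1, -1))
print("forecast:", next_value[0])
```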
I do apologise in advance if something similar has been posted, but from the research I've done I can't find anything specific.
I'm currently looking at http://scikit-learn.org and the content there looks great, but I'm confused about which type of model I should be using for my problem.
I want to be able to have 2 labels.
**Suspicious**
1hbn34uqrup7a13t
qmr30zoyswr21cdxolg
1qmqnbetqx
**Not-Suspicious**
cheesemix
reg526
animato12
What type of machine learning algorithm could I feed the data above into, so as to teach it what I'd class as suspicious through supervised learning?
I'm leaning towards classification, but there are so many models to choose from that I'm slightly lost.
The first step in such machine learning problems is to think about the "features". You can't use e.g. a linear classifier directly on these strings, so you have to extract some meaningful features that describe each string. In computer vision, these features are often edges, corner points, or SIFT features. You basically have two options:
1. Design features yourself.
2. Learn the features.
1) This is the "classical" machine learning approach: you manually design a list of representative features, which you can extract from your input data. In your case, you could start with e.g.
length of the string
number of different characters
number of special characters
something about the sorting?
...
That will give you a vector of numbers for each string. Now, you can use any of the classifiers from scikit-learn to classify the data. You can start choosing your algorithm with the help of this flowchart. You should start with a simple model, e.g. a linear model (e.g. linear SVM). If performance is not sufficient, use a more complex model (e.g. SVM with kernels), or rethink your choice of features.
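Here is a minimal sketch of option 1; the particular features and the LinearSVC choice are illustrative assumptions, not the only reasonable ones.

```python
import numpy as np
from sklearn.svm import LinearSVC

def string_features(s):
    """Hand-designed features: length, distinct characters, digit and letter counts."""
    return [
        len(s),
        len(set(s)),
        sum(c.isdigit() for c in s),
        sum(c.isalpha() for c in s),
    ]

suspicious = ["1hbn34uqrup7a13t", "qmr30zoyswr21cdxolg", "1qmqnbetqx"]
not_suspicious = ["cheesemix", "reg526", "animato12"]

X = np.array([string_features(s) for s in suspicious + not_suspicious])
y = np.array([1] * len(suspicious) + [0] * len(not_suspicious))  # 1 = suspicious

clf = LinearSVC().fit(X, y)
print(clf.predict([string_features("xk29vqpl3m")]))  # classify a new string
```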
2) This is the "modern" approach, which is gaining more and more popularity. Designing the features is a crucial step in 1) and it requires good knowledge of your data. Now, by using a deep neural network, you can feed your raw data (the string) into the network, and let the network learn such "features" itself. This, however, requires a large amount of labeled training data, and a lot of processing power (GPUs).
LSTM networks are today's state of the art in natural language processing and similar tasks. LSTMs would be well suited to your task, as the input can be of variable length.
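For completeness, here is a minimal sketch of option 2 with a character-level LSTM in PyTorch; the architecture, sizes, and tiny training loop are illustrative assumptions, not a tuned model.

```python
import torch
import torch.nn as nn

# Character-level LSTM classifier: each string becomes a sequence of
# character indices, the LSTM reads it, and a linear layer predicts the label.
chars = "abcdefghijklmnopqrstuvwxyz0123456789"
char_to_idx = {c: i for i, c in enumerate(chars)}

def encode(s):
    return torch.tensor([char_to_idx[c] for c in s.lower() if c in char_to_idx])

class CharLSTM(nn.Module):
    def __init__(self, vocab_size=len(chars), embed_dim=16, hidden_dim=32):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)
        self.lstm = nn.LSTM(embed_dim, hidden_dim, batch_first=True)
        self.out = nn.Linear(hidden_dim, 2)   # 2 classes: suspicious / not

    def forward(self, seq):
        emb = self.embed(seq).unsqueeze(0)    # (1, seq_len, embed_dim)
        _, (h, _) = self.lstm(emb)
        return self.out(h[-1])                # logits for the whole string

model = CharLSTM()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()

data = [("1hbn34uqrup7a13t", 1), ("cheesemix", 0), ("reg526", 0), ("1qmqnbetqx", 1)]
for epoch in range(100):
    for s, label in data:
        logits = model(encode(s))
        loss = loss_fn(logits, torch.tensor([label]))
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
```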
tl;dr: Either design features yourself and use a classifier of your choice, or dive into deep neural networks and let a network learn both the features and the classification.
I would like to ask if anyone has an idea or example of how to do support vector regression in Python with high-dimensional output (more than one dimension) using a Python binding of libsvm? I checked the examples and they all assume the output to be one-dimensional.
libsvm might not be the best tool for this task.
The problem you describe is called multivariate regression, and usually, for regression problems, SVMs are not necessarily the best choice.
You could try something like group lasso (http://www.di.ens.fr/~fbach/grouplasso/index.htm - matlab) or sparse group lasso (http://spams-devel.gforge.inria.fr/ - seems to have a python interface), which solve the multivariate regression problem with different types of regularization.
Support Vector Machines, as a mathematical framework, are formulated in terms of a single prediction variable. Hence most libraries implementing them will reflect this by using a single target variable in their API.
What you could do is train a single SVM model for each target dimension in your data.
- On the plus side, you can train them in parallel on a cluster, as each model is independent of the others.
- On the minus side, the sub-models share nothing and won't benefit from what each one discovers about the structure of the input data, and they may need a lot of memory to store, as they have no shared intermediate representation.
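A minimal sketch of the one-model-per-target approach, using scikit-learn's SVR and MultiOutputRegressor instead of the raw libsvm bindings (the toy data is an assumption): MultiOutputRegressor simply fits an independent SVR per output dimension.

```python
import numpy as np
from sklearn.svm import SVR
from sklearn.multioutput import MultiOutputRegressor

# Toy multivariate regression data: 10 input features, 3 output dimensions.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 10))
Y = np.column_stack([X[:, 0] * 2, X[:, 1] - X[:, 2], X.sum(axis=1)])

# Fits one independent SVR per column of Y; the models share nothing.
model = MultiOutputRegressor(SVR(kernel="rbf", C=1.0))
model.fit(X, Y)

print(model.predict(X[:2]))   # each row has 3 predicted targets
```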
Variants of SVMs can probably be devised in a multi-task learning setting to learn some common kernel-based intermediate representation suitable for reuse when predicting multi-dimensional targets; however, this is not implemented in libsvm AFAIK. Google "multi-task learning SVM" if you want to learn more.
Alternatively, multi-layer perceptrons (a kind of feed-forward neural network) can naturally deal with multi-dimensional outcomes and hence should be better at sharing intermediate representations of the data across targets, especially if they are deep enough, with the first layers pre-trained in an unsupervised manner using an autoencoder objective function.
You might want to have a look at http://deeplearning.net/tutorial/ for a nice introduction to various neural network architectures and practical tools and examples to implement them efficiently.