I'm relatively new to machine learning, so naturally I have a couple of questions that I hope you can help me with, or at least point me in the right direction. In an earlier project we collected data from people walking normally and walking with a stone in their shoe. We recorded acceleration and gyroscope signals. Based on this data I built a neural network that classifies the signals as normal or impaired walking, so there are two possible outputs.
Now my idea is this: using the same data, I want to build a network that predicts the weight of each participant (this was also recorded).
Based on this, my three questions are:
- What kind of network structure is most suitable for such a task? (Dense, CNN, LSTM,…)
- Before, the network basically had two options to choose from (normal or impaired walking), but now I have a continuous range of answers. How can this be approached?
- How can I make sure the network initializes with a sensible prediction?
I hope all the questions make sense. Any help will be much appreciated!
You can use whichever NN architecture you prefer:
If you work with sequences, use 1D convolutions or RNNs.
Since you are dealing with a regression problem, the output layer must be a single neuron without an activation function.
Take a look here to learn how to solve a regression problem with RNNs.
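A minimal sketch of the "single output neuron without activation" idea, written in plain NumPy so it runs without a deep learning framework. The data, the six summary features per window and the layer sizes are all made up for illustration. Setting the output bias to the mean target is one common way to make an untrained network start from a sensible prediction; it is an assumption added here, not something stated in the answer above.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical stand-in data: 200 windows, each summarized by 6 features
# (e.g. statistics of accelerometer and gyroscope axes), plus a weight in kg.
X = rng.normal(size=(200, 6))
true_w = np.array([4.0, -2.0, 1.0, 0.0, 3.0, -1.0])
y = 70.0 + X @ true_w + rng.normal(scale=0.5, size=200)

# One hidden tanh layer, then a single output neuron with NO activation.
hidden = 16
W1 = rng.normal(scale=0.1, size=(6, hidden))
b1 = np.zeros(hidden)
W2 = rng.normal(scale=0.1, size=(hidden, 1))
# Assumption: start the output bias at the mean target so the untrained
# network predicts roughly the average weight instead of a value near zero.
b2 = np.array([y.mean()])

def forward(inputs):
    h = np.tanh(inputs @ W1 + b1)
    return h, (h @ W2 + b2).ravel()   # linear output, one value per sample

def mse(pred, target):
    return float(np.mean((pred - target) ** 2))

_, initial_pred = forward(X)
initial_loss = mse(initial_pred, y)

lr = 0.01
for _ in range(2000):
    h, pred = forward(X)
    grad_out = 2.0 * (pred - y)[:, None] / len(y)   # dLoss/dPrediction
    grad_W2 = h.T @ grad_out
    grad_b2 = grad_out.sum(axis=0)
    grad_h = (grad_out @ W2.T) * (1.0 - h ** 2)     # back through tanh
    grad_W1 = X.T @ grad_h
    grad_b1 = grad_h.sum(axis=0)
    W2 -= lr * grad_W2
    b2 -= lr * grad_b2
    W1 -= lr * grad_W1
    b1 -= lr * grad_b1

_, final_pred = forward(X)
final_loss = mse(final_pred, y)
```

The loss is mean squared error, which is the usual counterpart to a linear output for a continuous target. In a framework you would express the same thing as a final dense layer with one unit and no activation.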
Related
I have been trying to understand this machine learning problem for many days now and it really confuses me, so I need some help.
I am trying to train a neural network whose input is an image and whose output is another image (not a very large one, 8x8 pixels). I also have an arbitrary "black box" function, fancy_algorithm(), that receives the network's input and prediction (the two images) and outputs a float that says how good the network's output was (i.e. it calculates a loss). My problem is that I want to train THIS neural network using the loss generated by the black-box algorithm. This is confusing me: I researched a lot and didn't find much about it. It seems like reinforcement learning, but I'm not sure, because there is no agent as such, yet there is still some kind of reinforcement signal.
In case you need more details to help me just ask. Thanks in advance!
Alright, question solved. This is a reinforcement learning problem. I can't use gradient-based optimization on my black-box loss function, which doesn't have any gradient. More details here: https://www.reddit.com/r/tensorflow/comments/gekotd/can_i_use_an_arbitrary_algorithm_as_a_loss/
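To illustrate what training without gradients can look like, here is a rough sketch using random-search hill climbing, one of the simplest gradient-free methods. It is not necessarily what the linked thread recommends. The body of fancy_algorithm below is a hypothetical stand-in (the real one is unknown), and the model is a single sigmoid layer on flattened 8x8 images, chosen only to keep the example short.

```python
import numpy as np

def fancy_algorithm(inp, pred):
    """Hypothetical black-box score, lower is better.

    Counts thresholded pixels that differ from the inverted input.
    It is piecewise constant, so it provides no useful gradient.
    """
    target = 1.0 - inp
    return float(np.sum((pred > 0.5) != (target > 0.5)))

rng = np.random.default_rng(0)
inputs = rng.random((32, 64))   # 32 flattened 8x8 "images" (made-up data)

def predict(params, x):
    W, b = params
    return 1.0 / (1.0 + np.exp(-(x @ W + b)))

def total_loss(params):
    preds = predict(params, inputs)
    return sum(fancy_algorithm(x, p) for x, p in zip(inputs, preds))

params = (rng.normal(scale=0.1, size=(64, 64)), np.zeros(64))
initial = total_loss(params)
best = initial

for step in range(300):
    # Propose a small random change to every parameter.
    candidate = tuple(p + rng.normal(scale=0.05, size=p.shape) for p in params)
    score = total_loss(candidate)
    # Keep the candidate only if the black-box loss did not get worse.
    if score <= best:
        params, best = candidate, score
```

The loop only ever calls the black box and compares scores, so it works with any scoring function. It scales poorly with the number of parameters, which is why evolution strategies or policy-gradient methods are usually preferred for larger networks.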
I am making a project in which I have to predict a plane trajectory.
I have 2 types of trajectory, the first one is the planned, and the second one is the real one that I recovered after the end of the flight.
The two trajectories are (x,y) points on a map, and I want to predict the real one from the planned one.
What kind of model would you use? I have heard about multivariate regression and recurrent neural networks, but I am not sure about either. I think multivariate regression is not appropriate, and an RNN includes time as a parameter, which I would rather not use at first.
Do you have any ideas?
Thank you
You could train single-target regression models and predict the x and y variables independently. The other way to go is to use multi-target regression methods. A commonly used method of this kind is Predictive Clustering Trees. You can read about various methods at https://towardsdatascience.com/regression-models-with-multiple-target-variables-8baa75aacd to start with. I hope it is somewhat helpful. :)
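A small sketch of the two options using ordinary least squares in NumPy, on made-up planned and real trajectories. For plain linear least squares the joint multi-target fit gives exactly the same coefficients as two independent single-target fits. Multi-target methods such as predictive clustering trees differ because they share structure across the targets.

```python
import numpy as np

rng = np.random.default_rng(0)

# Made-up data: 50 planned (x, y) points and slightly distorted "real" ones.
planned = rng.uniform(0, 100, size=(50, 2))
A_true = np.array([[1.02, 0.01], [-0.03, 0.98]])
offset = np.array([0.5, -1.0])
real = planned @ A_true + offset + rng.normal(scale=0.1, size=(50, 2))

# Design matrix with an intercept column.
X = np.hstack([planned, np.ones((50, 1))])

# Option 1: two single-target models, one per coordinate.
coef_x, *_ = np.linalg.lstsq(X, real[:, 0], rcond=None)
coef_y, *_ = np.linalg.lstsq(X, real[:, 1], rcond=None)

# Option 2: one multi-target fit that predicts (x, y) together.
coef_xy, *_ = np.linalg.lstsq(X, real, rcond=None)
pred = X @ coef_xy   # shape (50, 2): predicted real positions
```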
I'm implementing my first neural network for image classification.
I would like to know whether I should first find the best hyperparameters and then modify my neural network architecture (e.g. number of layers, dropout...), or fix the architecture first and then tune the hyperparameters?
First you should decide on an architecture and then play around with the hyperparameters. To compare different hyperparameters it is important to have the same base (architecture).
Of course you can also play around with the architecture (layers, nodes,...). But I think it is easier to search for an architecture online, because often the same or a similar problem has already been solved or described in a tutorial/blog.
Dropout is also a (training) hyperparameter and not part of the architecture!
The answer is, as always: it depends.
What are you trying to achieve?
If you're hoping to make the world's best image classifier by trial and error, then you might want to ask yourself whether you have more compute available than the people who have already done this. If you just want a really good classifier, there are several that come with tensorflow/keras and can be easily implemented. If you're goofing around and learning the coding, then I'd recommend trying different architectures, because that's going to teach you more functions. If you have a dataset you don't think existing solutions will be good at analysing, and you genuinely need the best network to classify it, then unfortunately it still depends...
How to decide:
Firstly decide on the rough order of magnitude for your overall parameter count (the literal number of parameters your model has). For a given number of parameters, architecture is likely to produce the biggest difference in results between representative hyperparameter choices (don't choke your network down to a single neuron in the middle and expect it to be representative of that architecture).
It's important to compare the rough performance per parameter so you're not giving an edge to the networks with greater overfitting capacity. You don't need to use all your training data or even train to completion; mostly you'll find the better networks learn faster and finish better (mostly). In the past I've done grid searches with multiple trials at each point using significantly reduced data, then optimised the architecture with the most potential by considering the gradients of the grid search. Fun fact: with sufficient time you can use gradient descent methods on hyperparameters to find local minima. You might well find that there are many similarly top-performing models, all of which you can tune until a clear winner emerges.
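A rough sketch of a grid search with several trials per grid point. The function train_and_score is a hypothetical placeholder that returns a made-up validation accuracy. In practice it would train the given architecture on a reduced subset of the data and evaluate it. The architecture names and learning rates are invented for illustration too.

```python
import itertools
import math
import random
import statistics

def train_and_score(architecture, learning_rate, seed):
    """Hypothetical placeholder for a short training run on reduced data.

    Returns a fake validation accuracy so the example runs on its own.
    """
    rng = random.Random(f"{architecture}-{learning_rate}-{seed}")
    base = {"small_cnn": 0.80, "wide_dense": 0.70}[architecture]
    penalty = 0.05 * abs(math.log10(learning_rate) + 3)   # best near 1e-3
    return base - penalty + rng.gauss(0, 0.01)

def grid_search(architectures, learning_rates, trials=3):
    results = {}
    for arch, lr in itertools.product(architectures, learning_rates):
        scores = [train_and_score(arch, lr, seed) for seed in range(trials)]
        # Average over trials so one lucky initialization does not decide.
        results[(arch, lr)] = statistics.mean(scores)
    return results

results = grid_search(["small_cnn", "wide_dense"], [1e-2, 1e-3, 1e-4])
best = max(results, key=results.get)
```

Averaging several trials per point matters because the difference between two hyperparameter settings is often smaller than the run-to-run noise of a single training.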
I do apologise in advance if something similar has been posted, but from the research I've done I can't find anything specific.
I'm currently looking at http://scikit-learn.org and the content there looks great, but I'm confused about which type of model I should be using for my problem.
I want to be able to have 2 labels.
**Suspicious**
1hbn34uqrup7a13t
qmr30zoyswr21cdxolg
1qmqnbetqx
**Not-Suspicious**
cheesemix
reg526
animato12
What type of machine learning algorithm could I feed the data above into, so as to teach it what I'd class as suspicious through supervised learning?
I'm leaning towards classification, but there are so many models to choose from that I'm slightly lost.
The first step in such machine learning problems is to think about the "features". You can't use e.g. a linear classifier directly on these strings. Thus, you have to extract some meaningful features that describe the string. In computer vision, these features are often edges, corner points, or SIFT features. You basically have two options:
Design features yourself.
Learn the features.
1) This is the "classical" machine learning approach: you manually design a list of representative features, which you can extract from your input data. In your case, you could start with e.g.
- length of the string
- number of different characters
- number of special characters
- something about the sorting?
- ...
That will give you a vector of numbers for each string. Now, you can use any of the classifiers from scikit-learn to classify the data. You can start choosing your algorithm with the help of this flowchart. You should start with a simple model, e.g. a linear model (e.g. linear SVM). If performance is not sufficient, use a more complex model (e.g. SVM with kernels), or rethink your choice of features.
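As a sketch of option 1, the function below turns a string into a fixed-length vector of numbers. The first, second and fourth features come from the list above. The digit count and the letter/digit switch count are extra guesses at what might make a string look random, not features the answer prescribes.

```python
def extract_features(s):
    """Turn a string into a fixed-length numeric feature vector."""
    digits = sum(ch.isdigit() for ch in s)
    special = sum(not ch.isalnum() for ch in s)
    # Number of places where a letter is followed by a digit or vice versa,
    # a rough proxy for "random-looking" strings (an assumption).
    switches = sum(a.isdigit() != b.isdigit() for a, b in zip(s, s[1:]))
    return [len(s), len(set(s)), digits, special, switches]

suspicious = ["1hbn34uqrup7a13t", "qmr30zoyswr21cdxolg", "1qmqnbetqx"]
not_suspicious = ["cheesemix", "reg526", "animato12"]

X = [extract_features(s) for s in suspicious + not_suspicious]
y = [1] * len(suspicious) + [0] * len(not_suspicious)
```

The resulting X and y have the shape that scikit-learn classifiers expect in their fit method, so they can be passed to a linear model as a first attempt. Six examples are of course far too few to train anything meaningful.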
2) This is the "modern" approach, which is gaining more and more popularity. Designing the features is a crucial step in 1) and it requires good knowledge of your data. Now, by using a deep neural network, you can feed your raw data (the string) into the network, and let the network learn such "features" itself. This, however, requires a large amount of labeled training data, and a lot of processing power (GPUs).
LSTM networks are today's state of the art in natural language processing and similar tasks. LSTMs would be well suited to your task, as the input can be of variable length.
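Before raw strings can go into a network such as an LSTM, each character has to be mapped to an integer. The sketch below shows one simple way to do that. The helper names are made up, and the padding to a fixed length is only there to make batching easier, since an LSTM itself can process sequences of different lengths.

```python
def build_vocab(strings):
    """Map each character seen in the training strings to an integer id."""
    chars = sorted(set("".join(strings)))
    return {ch: i + 1 for i, ch in enumerate(chars)}   # 0 is reserved for padding

def encode(s, vocab, max_len):
    """Encode a string as a list of ids, truncated or zero-padded to max_len."""
    unknown = len(vocab) + 1                            # id for unseen characters
    ids = [vocab.get(ch, unknown) for ch in s][:max_len]
    return ids + [0] * (max_len - len(ids))

vocab = build_vocab(["ab", "ca"])
example = encode("cab", vocab, 5)
```

The integer sequences would then typically go through an embedding layer followed by the recurrent layer and a final classification layer.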
tl;dr: Either design features yourself and use a classifier of your choice, or dive into deep neural networks and let a network learn both the features and the classification.
I have an audio data set in which each recording has a different length. There are some events in these recordings that I want to train and test on, but the events are placed randomly and their lengths differ, so it is really hard to build a machine learning system using this dataset. I thought about fixing a default length and building a multilayer NN; however, the lengths of the events are also different. Then I thought about using a CNN, the way it is used to recognise patterns or multiple humans in an image. The problem with that approach is that I am really struggling to understand how to work with the audio file.
So, my question: can anyone give me some tips on building a machine learning system that classifies different types of defined events, trained on a dataset in which these events occur randomly (one recording contains more than one event, and they differ from each other) and have different lengths?
I would really appreciate any help.
First, you need to annotate your events in the sound streams, i.e. specify bounds and labels for them.
Then, convert your sounds into sequences of feature vectors using signal framing. Typical choices are MFCCs or log-mel filterbank features (the latter correspond to a spectrogram of the sound). Having done this, you will have converted your sounds into sequences of fixed-size feature vectors that can be fed into a classifier. See this for a better explanation.
Since typical sounds have a longer duration than an analysis frame, you probably need to stack several contiguous feature vectors using sliding window and use these stacked frames as input to your NN.
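The framing and stacking steps can be sketched as follows in NumPy. For brevity this uses a plain log-magnitude spectrum per frame rather than MFCCs or log-mel filterbank features. The sample rate, frame length, hop and context size are illustrative assumptions, and the signal is random noise.

```python
import numpy as np

def frame_signal(signal, frame_len, hop):
    """Cut a 1-D signal into overlapping frames of equal length."""
    n_frames = 1 + (len(signal) - frame_len) // hop
    idx = np.arange(frame_len)[None, :] + hop * np.arange(n_frames)[:, None]
    return signal[idx]

def log_spectrum(frames):
    """Log-magnitude spectrum of each frame (a stand-in for log-mel features)."""
    windowed = frames * np.hanning(frames.shape[1])
    magnitude = np.abs(np.fft.rfft(windowed, axis=1))
    return np.log(magnitude + 1e-8)

def stack_frames(features, context):
    """Concatenate each frame with `context` neighbours on both sides."""
    width = 2 * context + 1
    n = features.shape[0]
    return np.stack([features[i:i + width].ravel() for i in range(n - width + 1)])

rng = np.random.default_rng(0)
signal = rng.normal(size=16000)                          # 1 s of noise at an assumed 16 kHz
frames = frame_signal(signal, frame_len=400, hop=160)    # 25 ms frames, 10 ms hop
features = log_spectrum(frames)
windows = stack_frames(features, context=5)              # 11 frames per window
```

Each row of windows is one fixed-size input vector for the network, regardless of how long the original recording was.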
Now you have a) input data and b) annotations for each analysis window. So, you can try to train a DNN, a CNN or an RNN to predict a sound class for each window. This task is known as spotting. I suggest reading Sainath, T. N., & Parada, C. (2015). Convolutional Neural Networks for Small-footprint Keyword Spotting. In Proceedings INTERSPEECH (pp. 1478–1482) and following its references for more details.
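One possible way to turn event annotations into a label per analysis window is sketched below. The rule used here (a window takes the label of the event that covers its centre, otherwise a background label) is an assumption, and the event names and timings are invented.

```python
def label_windows(n_windows, hop_s, window_s, events, background="none"):
    """Assign a class label to each analysis window.

    events is a list of (start_s, end_s, label) annotations in seconds.
    A window gets the label of the event that covers its centre.
    """
    labels = []
    for i in range(n_windows):
        center = i * hop_s + window_s / 2.0
        label = background
        for start, end, name in events:
            if start <= center < end:
                label = name
                break
        labels.append(label)
    return labels

events = [(0.5, 1.0, "door"), (1.5, 2.0, "glass")]
labels = label_windows(n_windows=5, hop_s=0.5, window_s=0.5, events=events)
```

With the window features as inputs and these labels as targets, the problem becomes ordinary per-window classification.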
You can use a recurrent neural network (RNN).
https://www.tensorflow.org/versions/r0.12/tutorials/recurrent/index.html
The input data is a sequence and you can put a label in every sample of the time series.
For example, an LSTM (a kind of RNN) is available in libraries like TensorFlow.