Multivariate time-series regression with neural networks - python

I have a dataset where X is a three-dimensional array of shape 760 (different ids/individuals) x 300000 (electrophysiological time-series samples) x 15 (number of different channels). I have a continuous, numerical y that is unique to each individual (N=760). I have to predict y from X (obviously...).
I'd like to use deep neural networks for this purpose, but I'm kind of lost in choosing the right model.
RNNs/LSTMs could be good, but as far as I can tell they are mostly used for forecasting the time series itself, not for regression on an external target. I'm also not sure whether a convolutional neural net could detect changes over time (e.g. after reshaping X into a 760 x 4500000 matrix).
Could you suggest some valid approaches for this?

You can opt for either CNNs or LSTMs.
CNNs can actually be used for time-series analysis and may provide better results than you would expect.
I provide below an example which illustrates how convolution can be applied to time series.
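A minimal sketch of the idea, assuming a Keras Conv1D stack (layer sizes here are illustrative, not tuned):

```python
import tensorflow as tf
from tensorflow.keras import layers, models

# Shapes from the question: 760 individuals, 300000 time steps, 15 channels,
# and one continuous target value per individual.
n_steps, n_channels = 300_000, 15

model = models.Sequential([
    layers.Input(shape=(n_steps, n_channels)),
    # 1D convolutions slide along the time axis; each filter learns a
    # temporal pattern that is shared across the whole series.
    layers.Conv1D(32, kernel_size=7, strides=4, activation="relu"),
    layers.MaxPooling1D(pool_size=4),
    layers.Conv1D(64, kernel_size=7, strides=4, activation="relu"),
    # Global pooling collapses the (very long) time axis into one
    # feature vector per individual.
    layers.GlobalAveragePooling1D(),
    layers.Dense(64, activation="relu"),
    # A single linear unit: this is a regression head, not a classifier.
    layers.Dense(1),
])
model.compile(optimizer="adam", loss="mse")

# X has shape (760, 300000, 15) and y has shape (760,):
# model.fit(X, y, epochs=10, batch_size=16, validation_split=0.2)
```

The strides and pooling are there mainly to tame the 300000-step axis; with series this long you would normally also consider downsampling the signal before it ever reaches the network.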
The alternative is to use LSTMs/GRUs. I do not add a description of them here since, compared to 1D CNNs, they are much better documented on the internet.
Two things to note here:
Both (CNNs and LSTMs) can be used for classification purposes as well.
Using neural networks is a good choice here, given that this is a multivariate regression problem.

Related

TF2 - Splitting Input Data & Using Different Pre-Trained Weights of a Layer

I have individually trained the same neural network architecture on a large number of different datasets (on the order of hundreds) to learn a unique non-linear function for each, i.e. I have basically learned a set of weights that describes the function for each dataset.
Now, I want to use these sets of weights as a pre-trained layer in another optimization problem. I know how to load a single saved model and employ it as a layer. However, what I will be doing is a group-wise optimization across the hundreds of different datasets, where I have pre-trained weights for each (from above).
So the setup is a batch of x datasets, each with n data points in d dimensions, i.e. the input data has shape [X, N, D]. A series of layers acts on all of this data; then, when it reaches the "pre-trained" layer, I wish to use different pre-trained weights for each dataset, i.e. [0,:,:] uses the weights learned from dataset 0 above, [1,:,:] the weights learned from dataset 1, and so on.
I then need to combine the outputs of all of this, since the loss function for this group-wise optimization is based on the variance across all datasets. So I don't believe I can trivially evaluate one set, calculate the loss, change the weights, rinse and repeat, and sum up at the end.
I doubt it is feasible to have massive duplicate branches, with x copies of the pre-trained NN layers, as the pre-trained architecture is already quite complex.
Is it possible to use a split layer, then a for-loop type approach in which I change the weights and pass the correct portion of data through, and then merge all the outputs? Or is there a better way of tackling this?
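For concreteness, here is roughly what I have in mind. This is only a sketch: make_pretrained_layer stands in for loading each saved model (e.g. with tf.keras.models.load_model), and all sizes are toy:

```python
import tensorflow as tf

X_DATASETS, N_POINTS, D_DIMS = 4, 32, 3  # toy sizes for [X, N, D]

shared_pre = tf.keras.layers.Dense(16, activation="relu")  # acts on all data
shared_post = tf.keras.layers.Dense(1)

def make_pretrained_layer():
    # Placeholder for one set of pre-trained weights; in reality this
    # would load a saved model/layer and freeze it.
    layer = tf.keras.layers.Dense(16, activation="relu")
    layer.trainable = False
    return layer

pretrained = [make_pretrained_layer() for _ in range(X_DATASETS)]
inputs = tf.random.normal((X_DATASETS, N_POINTS, D_DIMS))
optimizer = tf.keras.optimizers.Adam()

with tf.GradientTape() as tape:
    outputs = []
    for i in range(X_DATASETS):
        h = shared_pre(inputs[i])   # layers shared across all datasets
        h = pretrained[i](h)        # dataset-specific pre-trained weights
        outputs.append(shared_post(h))
    stacked = tf.stack(outputs, axis=0)  # (X, N, 1)
    # Group-wise loss based on the variance across datasets:
    loss = tf.reduce_mean(tf.math.reduce_variance(stacked, axis=0))

train_vars = shared_pre.trainable_variables + shared_post.trainable_variables
grads = tape.gradient(loss, train_vars)
optimizer.apply_gradients(zip(grads, train_vars))
```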
Any help much appreciated.

How to build a model to predict a graph (not an image) in a time series?

There is an adjacency-matrix dataset based on a time series. I would like to know if it is possible to build a neural network model to predict the matrix at time point t_n using the previous time-series data. In my opinion, traditional models such as CNNs may not fit the sparse matrix of a graph.
Maybe you should have a look at Graph Neural Networks (especially Spatial-Temporal Graph Networks). They use temporal information about graphs and their adjacency matrices to predict future node states, such as the values at the next step.
You can read this survey paper as a starting point and follow the works it cites from there.
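To give a flavor of the core operation, here is a minimal sketch of a single graph-convolution step (toy sizes; real spatial-temporal GNNs apply a step like this at every time frame and run a temporal model, such as a GRU, over the resulting node embeddings):

```python
import tensorflow as tf

# One graph-convolution step: H_next = relu(A_hat @ H @ W),
# where A_hat is a normalized adjacency matrix for one time step.
n_nodes, in_dim, out_dim = 5, 8, 16

A = tf.random.uniform((n_nodes, n_nodes))            # toy adjacency matrix
A_hat = A / tf.reduce_sum(A, axis=1, keepdims=True)  # simple row normalization
H = tf.random.normal((n_nodes, in_dim))              # node features
W = tf.Variable(tf.random.normal((in_dim, out_dim))) # learnable weights

H_next = tf.nn.relu(A_hat @ H @ W)  # propagated node states, (n_nodes, out_dim)
```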

Mixed linear model with Probabilistic Layers Regression

I'm interested in fitting a linear mixed model using the variational inference capabilities of TensorFlow Probability and Keras. However, I cannot find a straightforward answer on how to implement such an analysis. Using the regression example in TFP (see Case 3 here), I am able to grasp how to fit these models if we have only random variables in the model (the example is regression using a single feature). Following the radon example here, we have two features: floor (fixed) and county (random). My understanding is that the latter should only be passed to the DenseVariational layers, while the former can be passed to a regular Dense layer. So I guess I would have to jointly train two networks, one for the fixed and one for the random features, and somehow merge their outputs.
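To make that concrete, here is roughly the two-branch setup I have in mind (only a sketch; the posterior and prior follow the TFP regression tutorial, and all sizes are toy placeholders):

```python
import tensorflow as tf
import tensorflow_probability as tfp

tfd = tfp.distributions

def posterior(kernel_size, bias_size=0, dtype=None):
    # Mean-field normal posterior, as in the TFP regression tutorial.
    n = kernel_size + bias_size
    return tf.keras.Sequential([
        tfp.layers.VariableLayer(2 * n, dtype=dtype),
        tfp.layers.DistributionLambda(lambda t: tfd.Independent(
            tfd.Normal(loc=t[..., :n],
                       scale=1e-5 + tf.nn.softplus(t[..., n:])),
            reinterpreted_batch_ndims=1)),
    ])

def prior(kernel_size, bias_size=0, dtype=None):
    # Trainable normal prior with unit scale.
    n = kernel_size + bias_size
    return tf.keras.Sequential([
        tfp.layers.VariableLayer(n, dtype=dtype),
        tfp.layers.DistributionLambda(lambda t: tfd.Independent(
            tfd.Normal(loc=t, scale=1.0),
            reinterpreted_batch_ndims=1)),
    ])

n_train = 1000  # number of training rows (toy), scales the KL term

fixed_in = tf.keras.Input(shape=(1,), name="floor")    # fixed effect
random_in = tf.keras.Input(shape=(1,), name="county")  # random effect

fixed_out = tf.keras.layers.Dense(1)(fixed_in)              # plain Dense
random_out = tfp.layers.DenseVariational(
    1, posterior, prior, kl_weight=1 / n_train)(random_in)  # variational

output = tf.keras.layers.Add()([fixed_out, random_out])
model = tf.keras.Model([fixed_in, random_in], output)
model.compile(optimizer="adam", loss="mse")
```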
So my questions are:
(1) If these are fit jointly, can the same loss function be applied to both? I often see mean squared error used, while in VI the negative log-likelihood is used (I think this is equivalent to maximizing the evidence lower bound).
(2) Does the input need to be split beforehand and fed as input to two networks?

Neural network architecture recommendation

It's my first time working with neural networks, and I have been given the task of predicting some values of a dataset, so I could make good use of some help in deciding on the smartest architecture for the task. I'm working with Keras using TensorFlow as the backend.
I won't go into details, but basically I have performed lots of CFD simulations on similar but slightly different geometries to obtain a stress value on the surface of each geometry. All the geometries have the same connectivity and number of nodes, and I have the stress value for each of those nodes.
Simply put, I have an input matrix of [2500, 3, 300], where 2500 is the number of nodes in each geometry, 3 represents the x, y, z coordinates in space of each node on the mesh, and 300 is the total number of geometries. For the stress I have an output matrix of [2500, 300], where each of the 2500 entries is the stress value for a node and 300 again corresponds to the number of instances. I would like to train some kind of neural network so I can predict the stress values given the geometry.
I have been basing my architecture on the following paper, but I can't make use of the part in which convolutional networks are employed: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5805990/
The simplest approach I can think of is a fully connected network, but with my scarce knowledge of the subject I struggle to figure out a layer architecture that relates the 3D matrix of the geometry to the 2D output stress matrix.
Any suggestion is more than welcome. Thanks for your help!
Since I have been working on stress-value prediction using deep learning, I would recommend CNN models: their filters can learn the correlations between the parameters you are dealing with. That said, RNNs and their improved variants, such as LSTMs and GRUs, also perform well if you have sufficient data. Unfortunately, I can't point you to a paper, because this issue is still under study!
Another point worth making is that the way you reshape your data before feeding it to the model matters, especially when you are dealing with time-series-like data.
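For illustration, here is a minimal Keras sketch along these lines. The shapes follow the question but are moved to samples-first order, and kernel_size=1 treats every node independently, ignoring the mesh connectivity, so take it only as a starting point:

```python
import numpy as np
import tensorflow as tf
from tensorflow.keras import layers, models

# Keras expects samples first: the [2500, 3, 300] input becomes
# (300, 2500, 3) and the [2500, 300] output becomes (300, 2500).
X = np.random.rand(300, 2500, 3).astype("float32")  # placeholder geometries
y = np.random.rand(300, 2500).astype("float32")     # placeholder stresses

model = models.Sequential([
    layers.Input(shape=(2500, 3)),
    # Conv1D over the node axis; kernel_size=1 is a per-node MLP,
    # larger kernels would mix nodes that are adjacent in list order.
    layers.Conv1D(64, kernel_size=1, activation="relu"),
    layers.Conv1D(64, kernel_size=1, activation="relu"),
    layers.Conv1D(1, kernel_size=1),  # one stress value per node
    layers.Reshape((2500,)),          # (batch, 2500) to match y
])
model.compile(optimizer="adam", loss="mse")
model.fit(X, y, epochs=2, batch_size=16)
```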

Neural network regression with multi-valued (probabilistic) functions

I'm a bit of a beginner in the art of machine learning. Here is a rather conceptual question I've been wondering about:
Suppose I have a function X->Y, say y = x^2. Then, generating enough X->Y data, I can train a neural network to perform regression on the function and get x^2 for any input x. This is basically also what the Universal Approximation Theorem suggests.
Now, my question is: what if I want the inverse relation, Y->X? In this case, X is a multi-valued function of Y; for instance, for y > 0, x = ±sqrt(y). I can swap X and Y as input/output data to train the network, but for any given y there should be a 1/2 - 1/2 random chance of getting x = +sqrt(y) or x = -sqrt(y). Of course, if one trains it with mean squared error, the network won't know this is a multi-valued function; it will just follow SGD on the loss function and output x = 0, the average of the two branches, for any given y.
Therefore, I wonder if there is any way a neural network can model a multi-valued function. For instance, my guesses would be:
(1) The neural network could output a collection of, say, the top 2 possible values for x and be trained with cross-entropy. The problem is, if x is a vector or even a matrix (like a bitmap image) instead of a number, we don't know how many solutions x the equation y = f(x) has (it could very well be an infinite number, i.e. a continuous range), so a "list" of possible values and probabilities won't work; ideally the neural network should output values randomly and continuously distributed across the possible solutions for x.
(2) Perhaps this falls into the realm of probabilistic neural networks (PNNs)? Do PNNs model functions whose output follows a given probability distribution (continuous or discrete) over vectors? If so, is it possible to implement a PNN with popular frameworks like TensorFlow + Keras? (See the sketch below.)
(Also, note that this is different from a "multivariate" function, where X and Y can be multi-component vectors, which a traditional network can easily train on. The actual problem here is that the output could be a probability distribution over vectors, which a simple feed-forward network doesn't capture, since it has no inherent randomness.)
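For instance, something like the following is what I imagine for (2), using TensorFlow Probability's mixture layers (a rough sketch; a two-component Gaussian mixture head matches the ±sqrt(y) case):

```python
import tensorflow as tf
import tensorflow_probability as tfp

# A mixture-density-style network: the output layer parameterizes a
# two-component Gaussian mixture over x, so both branches +sqrt(y)
# and -sqrt(y) can be represented instead of averaged away.
n_components = 2
params_size = tfp.layers.MixtureNormal.params_size(
    n_components, event_shape=[1])

model = tf.keras.Sequential([
    tf.keras.layers.Dense(32, activation="relu", input_shape=(1,)),
    tf.keras.layers.Dense(params_size),
    tfp.layers.MixtureNormal(n_components, event_shape=[1]),
])

# Train by maximum likelihood: the loss is the negative log-probability
# of the observed x under the predicted mixture.
model.compile(optimizer="adam",
              loss=lambda x_true, dist: -dist.log_prob(x_true))

# Toy data for the inverse problem: given y = x^2, predict x.
x = tf.random.uniform((4096, 1), -2.0, 2.0)
y = x ** 2
model.fit(y, x, epochs=5, batch_size=128, verbose=0)

# Sampling from the predicted distribution yields values near +/- sqrt(y).
samples = model(tf.constant([[1.0]])).sample(10)
print(samples)
```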
Thank you for your kind help!
[Image: the forward function y = x^2, which a network can easily model by regression]
[Image: the inverse function x = ±sqrt(y); the network cannot capture the two-valued relation and outputs the average value x = 0 for every y]
Try reading the following paper:
https://onlinelibrary.wiley.com/doi/abs/10.1002/ecjc.1028
Mifflin's algorithm (or its more general version, SLQP-GS), which is mentioned in this paper, is available here, and the corresponding paper with a description is here.
