I have installed TensorFlow and followed the tutorial here
https://www.tensorflow.org/versions/0.6.0/tutorials/mnist/tf/index.html#tensorflow-mechanics-101
and built it successfully. I can get the evaluation result when both datasets have the same size, e.g. 1000x784 for the training set and 1000x784 for the test set.
But what if I want to test a single example, 1x784, and find out what the output is, using the model trained above?
I am new to TensorFlow and new to machine learning; I hope I have described my problem clearly.
It's not clear to me which part you're having trouble with, but I think what you're asking is how to use batch size 1000 for training, but only predict on a single input. I assume you already know how to predict on batches of size 1000.
If the first dimension of your model's input placeholder, which is usually the batch size, is set to be None, the size is inferred when you provide an input. So, if you change the 1000 to be None, you should then be able to pass an input of size 1 by 784 to make predictions.
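As a framework-free illustration of why this works, here is a minimal sketch in plain Python. The names `weights`, `biases`, and `predict` are hypothetical, and the random parameters only stand in for a trained model. The point is that the parameters depend on the 784 features and 10 classes, not on the number of rows, so the same function serves a large batch and a single example. In TensorFlow 1.x the equivalent step would be declaring the input as `tf.placeholder(tf.float32, shape=[None, 784])`.

```python
import math
import random

NUM_FEATURES = 784   # pixels per MNIST image
NUM_CLASSES = 10

random.seed(0)
# Hypothetical "trained" parameters: their shapes depend only on the
# feature and class counts, never on the batch size.
weights = [[random.gauss(0.0, 0.01) for _ in range(NUM_CLASSES)]
           for _ in range(NUM_FEATURES)]
biases = [0.0] * NUM_CLASSES


def softmax(logits):
    """Numerically stable softmax over one row of logits."""
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def predict(batch):
    """Return class probabilities for a batch with any number of rows."""
    outputs = []
    for row in batch:
        logits = [
            sum(x * weights[i][c] for i, x in enumerate(row)) + biases[c]
            for c in range(NUM_CLASSES)
        ]
        outputs.append(softmax(logits))
    return outputs


# A batch of 100 rows (kept small for speed) and a single 1 x 784 input.
big_batch = [[random.random() for _ in range(NUM_FEATURES)] for _ in range(100)]
single = [big_batch[0]]

probs_many = predict(big_batch)   # 100 rows of 10 probabilities
probs_one = predict(single)       # 1 row of 10 probabilities
```

The single-row call returns the same probabilities as the first row of the batched call, which is what you would expect from a model whose batch dimension is left unspecified.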
The solution you found, feeding a 1x784 input, is a great way to get quick feedback. However, for bigger networks that need a lot of time (on the order of hours) to train, your solution is not feasible.
TensorFlow has a newer feature called TensorFlow Serving: you give it a trained model and then interact with the model as a client.
See their repository for more information: https://github.com/tensorflow/serving
I'm new to machine learning, and I've been given a task where I'm asked to extract features from a data set with continuous data using representation learning (for example a stacked autoencoder).
Then I'm to combine these extracted features with the original features of the dataset and then use a feature selection technique to determine my final set of features that goes into my prediction model.
Could anyone point me to some resources or demos or sample code of how I could get started on this? I'm very confused on where to begin on this and would love some advice!
Okay, say you have an input of (1000 instances and 30 features). What I would do based on what you told us is:
Train an autoencoder: a neural network that compresses the input and then decompresses it, using your original input as the target. The compressed representation lies in the latent space and captures information about the input that is not directly interpretable by humans. You can build such networks in TensorFlow or PyTorch; I find TensorFlow easier and more straightforward, so it may suit you better. Here is a link to a variational autoencoder example that may do the job for you: https://keras.io/examples/generative/vae/. It uses Conv2D layers, so it performs really well on image data, but you can play around with the architecture. I cannot tell you more because you did not give more information about your dataset. The important thing, however, is the following:
Once your autoencoder is trained properly, and you need to verify this (it should adequately reconstruct the input), extract the latent representation mentioned above (you will find more in the link). That will be, let's say, 16 numbers, but you can play with the size. These 16 numbers were built to preserve information about your input. You said you wanted to combine these numbers with your input, so do that and you end up with 30 + 16 = 46 input features. The feature selection part is then about choosing the input features that are most useful for your model. That is less interesting; you can find more information at https://towardsdatascience.com/feature-selection-techniques-in-machine-learning-with-python-f24e7da3f36e. One way to select features is to train many models with different feature subsets. Remember, techniques such as PCA are for feature extraction, not selection. I cannot provide a demo that does the whole thing, but there are sources that can help. Remember, the encoder part of your autoencoder is supposed to return 16 numbers for each training example, and the autoencoder is trained only on your training data, with your training data as targets.
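The combination step (30 original features plus 16 latent features giving 46) can be sketched as follows. This is only an illustration in plain Python: `encode` is a hypothetical placeholder that uses a fixed random projection where a trained encoder would go, and the data is random stand-in data.

```python
import random

random.seed(0)
NUM_INSTANCES, NUM_ORIGINAL, NUM_LATENT = 1000, 30, 16

# Stand-in data: 1000 instances with 30 original features each.
original = [[random.random() for _ in range(NUM_ORIGINAL)]
            for _ in range(NUM_INSTANCES)]

# ASSUMPTION: a fixed random projection used only as a placeholder
# for the encoder half of a trained autoencoder.
projection = [[random.gauss(0.0, 1.0) for _ in range(NUM_LATENT)]
              for _ in range(NUM_ORIGINAL)]


def encode(row):
    """Placeholder encoder: maps 30 input features to 16 latent values."""
    return [sum(x * projection[i][j] for i, x in enumerate(row))
            for j in range(NUM_LATENT)]


# One latent vector per instance.
latent = [encode(row) for row in original]

# Append the latent values to the original features: 30 + 16 = 46 columns.
combined = [row + z for row, z in zip(original, latent)]
```

The `combined` rows are what you would then hand to a feature selection step.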
I've searched Stack Overflow as best I can, as well as the TensorFlow API documentation for Estimator.evaluate(), but haven't been able to find anything addressing this question.
I'm a student working on a research project with TensorFlow. I've been tracking accuracy with evaluate() and storing the returned value in a text file. My advising professor (who works with ML/NNs but not specifically Python and TensorFlow) wants to know if that accuracy value is specific to the batch of data it saw in the moment, or if it's the overall accuracy of that network from inception to that moment in time.
Can someone please clarify whether 'accuracy' is a measure of the accuracy for that given batch of data at the moment of evaluation OR is it a measure of all batches/data that it has seen up to and including that moment?
If it is NOT a measure of all batches, is there any way to find that from the network or do I need to be manually calculating it?
On how I've been building/training my network (in case it matters): I build the model at a slightly lower level than Keras (as in, I define the architecture in a method using tf.layers). I have also never explicitly run the network with tf.Session() (I've only run into trouble when I've tried, and past networks have functioned fine without it).
Estimator.evaluate() draws batches from input_fn and processes one batch per step, as described in the documentation: https://www.tensorflow.org/api_docs/python/tf/estimator/Estimator#evaluate. So what your input_fn returns matters here: if the test data is small, it can return the entire dataset at once; if the test data is large, it can return the data in batches. Either way, the value you get back describes only the data fed during that evaluate() call, not everything the network has seen since training began.
If your test dataset is small enough to fit in memory (RAM), input_fn can return all the test data at once, and you can pass it once and get the result,
e.g.
result = classifier.evaluate(test_inpf)
If your test data is too large to fit in memory, you can still get the accuracy over the entire test dataset: compute the accuracy on each batch (because input_fn now returns batches) and take the running average over all batches in your dataset.
For example, suppose your test dataset has 100 examples and the batch size is 10.
You evaluate accuracy for every batch of 10, which gives 10 accuracy values for the dataset. The average of these is the accuracy of the model over the entire dataset.
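The averaging described above can be sketched in plain Python. The function names and the toy predictions are hypothetical, and no TensorFlow call is involved. The sketch weights each batch by its size, which reduces to the simple mean when all batches are equally large, as in the 100-example, batch-size-10 case.

```python
def batch_accuracy(predictions, labels):
    """Fraction of predictions in one batch that match the labels."""
    correct = sum(1 for p, y in zip(predictions, labels) if p == y)
    return correct / len(labels)


def dataset_accuracy(batches):
    """Average per-batch accuracy, weighting each batch by its size."""
    total_correct = 0.0
    total_seen = 0
    for predictions, labels in batches:
        n = len(labels)
        total_correct += batch_accuracy(predictions, labels) * n
        total_seen += n
    return total_correct / total_seen


# Toy data: 100 examples, the first 80 predicted correctly.
labels = [i % 2 for i in range(100)]
predictions = [y if i < 80 else 1 - y for i, y in enumerate(labels)]

# Split into 10 batches of size 10.
batches = [(predictions[s:s + 10], labels[s:s + 10]) for s in range(0, 100, 10)]

per_batch = [batch_accuracy(p, y) for p, y in batches]  # 10 accuracy values
simple_mean = sum(per_batch) / len(per_batch)           # plain average
overall = dataset_accuracy(batches)                     # size-weighted average
```

If the last batch is smaller than the others, the size-weighted version is the one that matches the accuracy over the whole dataset.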
This is also a helpful tutorial on TensorFlow website
https://www.tensorflow.org/tutorials/estimators/linear
How can I use Long Short-Term Memory (LSTM) to predict a future value x(t+1) (out-of-sample prediction) based on a historical dataset? I have read and tried many web tutorials on forecasting and prediction using LSTM, but I am still far from a solution. What is the exact procedure for this prediction? Is it as simple as shifting the target array by n steps, where n is the number of future predictions, and then running the prediction? Or are there other techniques?
Please help or leave a suggestion.
Can you tell us which framework you are using? TensorFlow? PyTorch? And which web tutorials specifically?
Assuming you are using TensorFlow, you can copy and paste code from one of these, test that it works on the provided dataset, then modify the input encoding functions to fit your dataset, then run it on your dataset.
https://github.com/llSourcell/How-to-Predict-Stock-Prices-Easily-Demo (best)
https://github.com/sebastianheinz/stockprediction
https://github.com/talolard/MarketVectors/blob/master/preparedata.ipynb (you will have to replace fc layers with lstm, and fiddle with inputs)
In general, the procedure is something like this (assuming TensorFlow):
Download Dataset
Create a function to load batches of data
Create a function to encode batch of data (normalization, other transforms)
Create an LSTM layer to receive the series of inputs.
Create an output layer (usually fully connected) to take the last LSTM state and predict an output of your desired size.
Create a tf session to wire everything together, and hit run.
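The data-preparation steps in the procedure above (loading and encoding the data) are where the "shift the target" idea from the question comes in. Here is a minimal, framework-free sketch; `normalize`, `make_windows`, and the toy series are hypothetical names and data, and the LSTM itself is not shown. Each training input is a window of past values and its target is the value that immediately follows.

```python
def normalize(series):
    """Min-max scale a series to [0, 1]; a constant series maps to 0.0."""
    lo, hi = min(series), max(series)
    span = (hi - lo) or 1.0
    return [(v - lo) / span for v in series]


def make_windows(series, window):
    """Build (inputs, targets): each input holds `window` consecutive
    values and its target is the next value, i.e. x(t+1)."""
    inputs, targets = [], []
    for start in range(len(series) - window):
        inputs.append(series[start:start + window])
        targets.append(series[start + window])
    return inputs, targets


# Toy history standing in for a real dataset.
history = [float(v) for v in range(10)]
scaled = normalize(history)

# Training pairs for a model that looks 3 steps back.
inputs, targets = make_windows(scaled, window=3)

# For an out-of-sample forecast, feed the most recent window; the
# model's output is the prediction for the step after the history ends.
latest_window = scaled[-3:]
```

In a real setup the scaling statistics would be computed on the training portion only and reused for later data.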
Some conceptual questions to ask about which network to use:
How many inputs map to how many outputs? See these excellent lecture slides by Karpathy: http://cs231n.stanford.edu/slides/2016/winter1516_lecture10.pdf
How far back do you consider the stock prices, e.g. {t-100 ... t} or {t-10 ... t}? This may dictate the size of the hidden layers.
What other information do you think is relevant to the model? Does stock A influence stock B? In that case you may have 2 LSTMs, each outputting a state to your fully connected layer.
I came to the point where I deployed my trained model done with Keras and based on Tensorflow. I have Tensorflow-Serving running and when I feed the input with the test data I get exactly the output as expected. Wonderful!
However, in the real world (deployed scenario) I need to pass a new data set to the model that the model has never seen before. And in the training/testing setup I did categorization and one-hot encoding. So I need to transform the submitted data-set first. This I might be able to do.
I also did normalization (StandardScaler from sklearn), and I have no clue what the best practice is here. To normalize, it seems I would need to run through the training data again together with the newly submitted dataset.
I believe this can be solved in an elegant way. Any ideas?
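One common approach, offered here as an assumption rather than something stated in this thread, is to compute the scaling statistics once on the training data, store them alongside the model, and apply those same statistics to every new sample. The sketch below uses plain Python with hypothetical helper names; with sklearn's StandardScaler this corresponds to calling fit on the training data once and only transform afterwards.

```python
import json
import statistics


def fit_scaler(rows):
    """Compute per-column mean and population standard deviation."""
    columns = list(zip(*rows))
    means = [statistics.mean(c) for c in columns]
    # Fall back to 1.0 for constant columns to avoid division by zero.
    stds = [statistics.pstdev(c) or 1.0 for c in columns]
    return {"mean": means, "std": stds}


def transform(rows, scaler):
    """Standardize rows using previously stored statistics."""
    return [[(v - m) / s
             for v, m, s in zip(row, scaler["mean"], scaler["std"])]
            for row in rows]


# Training time: fit once and persist the statistics (here as JSON text).
train = [[1.0, 10.0], [2.0, 20.0], [3.0, 30.0]]
scaler = fit_scaler(train)
saved = json.dumps(scaler)

# Serving time: restore the statistics and scale the new, unseen sample
# without touching the training data again.
restored = json.loads(saved)
new_sample = [[2.0, 40.0]]
scaled_new = transform(new_sample, restored)
```

The same idea applies to the one-hot encoding: the category list learned at training time is stored and reused, so new data is encoded consistently.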
I'm using Keras to build a convolutional neural net to perform regression from microscopic images to 2D label data (for counting). I'm looking into training the network on smaller patches of the microscopic data (where the patches are the size of the receptive field). The problem is, the fit() method requires validation data to be of the same size as the input. Instead, I'm hoping to be able to validate on entire images (not patches) so that I can validate on my entire validation set and compare the results to other methods I've used so far.
One solution I found was to alternate between fit() and evaluate() each epoch. However, I was hoping to be able to observe these results using TensorBoard. Since evaluate() doesn't take callbacks, this solution isn't ideal. Does anybody have a good way of validating on full-resolution images while training on patches?
You could use fit_generator instead of fit and provide a different generator for the validation set. As long as the rest of your network is agnostic to the image size (e.g., it uses only fully convolutional layers), you should be fine.
You need to make sure that your network input has shape (None, None, 3), which means your network accepts an input color image of arbitrary size.
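To see why a purely convolutional network does not care about image size, here is a small plain-Python sketch (not Keras code; `conv2d_valid` and the toy images are hypothetical). The same kernel, standing in for learned weights, is applied to a small training patch and to a larger full image; only the output size changes.

```python
def conv2d_valid(image, kernel):
    """Single-channel 2D convolution (cross-correlation) without padding."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [[sum(image[r + i][c + j] * kernel[i][j]
                 for i in range(kh) for j in range(kw))
             for c in range(out_w)]
            for r in range(out_h)]


# One 3x3 kernel standing in for weights learned on patches.
kernel = [[1.0, 0.0, -1.0] for _ in range(3)]

# An 8x8 training patch and a 24x32 full-resolution image.
patch = [[float(r + c) for c in range(8)] for r in range(8)]
full_image = [[float(r + c) for c in range(32)] for r in range(24)]

patch_out = conv2d_valid(patch, kernel)       # 6 x 6 output
full_out = conv2d_valid(full_image, kernel)   # 22 x 30 output
```

A dense layer, by contrast, has a weight matrix tied to a fixed input length, which is why the rest of the network must stay convolutional for this to work.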