Closed. This question needs details or clarity. It is not currently accepting answers.
Closed yesterday.
I am facing an accuracy issue with my code. I am using a CNN as the model with LDA for feature reduction on the NSL-KDD dataset for my MPhil research, but the accuracy is 94%. I don't know what the problem is; I want to reach 98%.
These are the layers I have used:
I have a question regarding the model.fit method from the scikit-learn library and overfitting.
Does the generic sklearn method model.fit(X, y) return the score after fitting the model on the specified training data?
Also, is it overfitting when performance on the test data degrades as more training data is used to learn the model?
model.fit(X, y) doesn't return the score; it returns the fitted estimator itself, so if you assign it to a variable you get the model with all of its learned parameters. You can get the score by calling model.score(X, y).
Overfitting, in simple words, is excess variance in your model, which makes it fail to generalize to unseen data. There are ways to reduce overfitting, such as feature engineering, normalization, regularization, and ensemble methods.
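To make the fit/score distinction concrete, here is a minimal sketch using a toy dataset (the data and estimator are placeholders, not the asker's actual setup):

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

# Hypothetical toy data standing in for the real training set
X, y = make_classification(n_samples=200, n_features=5, random_state=0)

model = LogisticRegression()
fitted = model.fit(X, y)   # fit() returns the estimator itself, not a score
print(fitted is model)     # True: same object, now holding learned parameters
print(model.score(X, y))   # mean accuracy on the data you pass in
```

Note that model.score(X, y) on the training data measures training accuracy only; use a held-out test set to check for overfitting.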
I created an image dataset of 62,992 images at 128x128 px resolution, containing characters, numbers, and symbols in four font styles. How do I train on this dataset and create a pre-trained model for my OCR? Can you please help me?
If you want to train a custom OCR, I would advise studying the TensorFlow Attention OCR implementation (in the tensorflow/models research repository).
I've used this implementation in various projects, and it gave satisfactory results in converting images to text.
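Attention OCR is a sequence model; since your dataset appears to be single-character images, the training loop itself can be illustrated more simply. This is a minimal sketch using scikit-learn on synthetic stand-in data (all shapes, sample counts, and labels here are placeholders for your real arrays):

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier

# Synthetic stand-in for the real dataset: flattened 128x128 grayscale
# character images with integer class labels. Replace with your own data.
rng = np.random.default_rng(0)
n_samples, n_classes = 200, 10
X = rng.random((n_samples, 128 * 128)).astype(np.float32)
y = rng.integers(0, n_classes, size=n_samples)

X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# A small neural network classifier; a CNN would normally do better on images
clf = MLPClassifier(hidden_layer_sizes=(64,), max_iter=5, random_state=0)
clf.fit(X_train, y_train)
print(clf.score(X_test, y_test))
```

On random noise the score is meaningless; the point is the shape of the pipeline (flatten images, split, fit, evaluate), which carries over to a real CNN setup.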
I am taking an online machine learning course and I am curious about what to do next with the model I made. How can I save the test and predicted tables, and how can I use the trained model on my newest data? I'm using Python.
Saving an ML model using pickle:
import pickle

# save the model to disk
filename = 'model.sav'
pickle.dump(model, open(filename, 'wb'))

# load the model from disk
loaded_model = pickle.load(open(filename, 'rb'))
result = loaded_model.predict(new_point)
Here new_point is a 2D numpy array ([[...]]); for example, two points with three features each:
[[x11, x12, x13],
 [x21, x22, x23]]
For more information visit https://machinelearningmastery.com/save-load-machine-learning-models-python-scikit-learn/
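Put together as a self-contained round trip (the dataset, estimator, and filename below are illustrative placeholders):

```python
import pickle
import numpy as np
from sklearn.linear_model import LogisticRegression

# Toy training data standing in for the course dataset
X = np.array([[0.0, 0.0], [1.0, 1.0], [0.0, 1.0], [1.0, 0.0]])
y = np.array([0, 1, 0, 1])

model = LogisticRegression().fit(X, y)

# Save the fitted model to disk, then load it back
with open('model.sav', 'wb') as f:
    pickle.dump(model, f)
with open('model.sav', 'rb') as f:
    loaded_model = pickle.load(f)

# new_point: 2D array, one row per sample, one column per feature
new_point = np.array([[0.2, 0.9]])
print(loaded_model.predict(new_point))
```

The loaded model carries all of its learned parameters, so it can predict on new data without retraining.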
How do you determine the optimal number of iterations when training a neural network?
One way of doing it is to split your training data into a train set and a validation set. During training, the error on the training set should decrease steadily. The error on the validation set will decrease too, but at some point it will start to increase again; at this point the network starts to overfit the training data. That means the model is adapting to random variations in the data rather than learning the true regularities. You should retain the model with the overall lowest validation error. This is called early stopping.
Alternatively, you can use dropout. With a high enough dropout probability, you can essentially train for as long as you want, and overfitting will not be a significant issue.
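Early stopping is available off the shelf in scikit-learn's MLPClassifier; a minimal sketch on synthetic data (the dataset and hyperparameters are illustrative only):

```python
from sklearn.datasets import make_classification
from sklearn.neural_network import MLPClassifier

# Synthetic stand-in for a real training set
X, y = make_classification(n_samples=500, n_features=20, random_state=0)

# Hold out 10% of the training data as a validation set; training stops when
# the validation score has not improved for n_iter_no_change epochs.
clf = MLPClassifier(hidden_layer_sizes=(32,),
                    early_stopping=True,
                    validation_fraction=0.1,
                    n_iter_no_change=5,
                    max_iter=500,
                    random_state=0)
clf.fit(X, y)
print(clf.n_iter_)  # iterations actually run, typically well under max_iter
```

Deep learning frameworks offer the same idea via callbacks (e.g. an early-stopping callback monitoring validation loss).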
I have a Python machine learning script that learns from firewall log data, using a Random Forest classifier. If I run it as a service, is it possible to train the model again each day, as the previous day's firewall logs are received?
Only the sklearn estimators that implement the partial_fit method support incremental training on new data, and unfortunately Random Forest isn't one of them, so you will have to refit on the full dataset every time you add data.
If you insist on sticking with Random Forest, one option is to reduce the number of features (based on the feature importances of the classifier you currently have) to speed up refitting.
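If refitting from scratch becomes too slow, another route is to switch to an estimator that supports incremental learning via partial_fit, such as SGDClassifier. A sketch with synthetic "daily" batches (the feature count, batch size, and labels are placeholders for real firewall-log features):

```python
import numpy as np
from sklearn.linear_model import SGDClassifier

rng = np.random.default_rng(0)
classes = np.array([0, 1])  # all classes must be declared on the first call

clf = SGDClassifier(random_state=0)

# Simulate a new batch of firewall-log features arriving each day
for day in range(3):
    X_batch = rng.random((100, 8))
    y_batch = rng.integers(0, 2, size=100)
    clf.partial_fit(X_batch, y_batch, classes=classes)

print(clf.predict(rng.random((2, 8))))
```

Each daily run updates the existing model in place instead of retraining on the full history, which is exactly the service pattern described in the question.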