Is there any built-in MSMOTE - python

I am trying to deal with data imbalance in a small dataset. I just found an article discussing SMOTE and MSMOTE here
It seems that MSMOTE can overcome the shortcomings of SMOTE, so I really want to try it. The MSMOTE paper was published in 2009, but I could not find any library implementing MSMOTE in R or Python.
Do you know whether there is a built-in MSMOTE implementation I could try? I'm fine with any programming language.

You can use the "imbalanced-learn" package in Python.
This is the link

This is an old question, but I am answering for future reference.
Here is a Python library with multiple SMOTE variants.
In particular, it includes MSMOTE: https://smote-variants.readthedocs.io/en/latest/oversamplers.html?highlight=msmote#msmote
import smote_variants
# X: feature matrix, y: class labels (NumPy arrays)
oversampler = smote_variants.MSMOTE()
X_samp, y_samp = oversampler.sample(X, y)
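
For readers who want to see the idea without installing anything, here is a rough, dependency-free sketch of the MSMOTE scheme as it is commonly described: each minority sample is labelled "safe", "border" or "noise" from its k nearest neighbours, and synthetic points are interpolated accordingly. The function names, the toy data and the choice of "nearest minority neighbour" for border samples are my own simplifying assumptions, not the paper's reference implementation and not the code used by smote_variants.

```python
import math
import random


def k_nearest(idx, X, k):
    """Indices of the k nearest neighbours of X[idx] (Euclidean), nearest first."""
    dists = sorted((math.dist(X[idx], X[j]), j) for j in range(len(X)) if j != idx)
    return [j for _, j in dists[:k]]


def msmote_sketch(X, y, minority_label, n_new, k=3, seed=0):
    """Return n_new synthetic minority samples (hypothetical helper, simplified)."""
    rng = random.Random(seed)
    candidates = []  # (sample index, kind, minority neighbours ordered by distance)
    for i, label in enumerate(y):
        if label != minority_label:
            continue
        minority_neigh = [j for j in k_nearest(i, X, k) if y[j] == minority_label]
        if not minority_neigh:
            continue  # "noise": every neighbour is majority, so nothing is generated
        kind = "safe" if len(minority_neigh) == k else "border"
        candidates.append((i, kind, minority_neigh))
    if not candidates:
        return []
    synthetic = []
    for _ in range(n_new):
        i, kind, minority_neigh = rng.choice(candidates)
        # safe: interpolate towards a random neighbour; border: towards the nearest one
        j = rng.choice(minority_neigh) if kind == "safe" else minority_neigh[0]
        gap = rng.random()
        synthetic.append([a + gap * (b - a) for a, b in zip(X[i], X[j])])
    return synthetic


# Toy data: a tight minority cluster near the origin, plus one minority outlier
# at (5, 5) that is surrounded by majority points and should be treated as noise.
X = [[0.0, 0.0], [0.1, 0.0], [0.0, 0.1], [0.1, 0.1], [5.0, 5.0],
     [5.0, 5.1], [5.1, 5.0], [4.9, 5.0], [5.0, 4.9], [3.0, 3.0]]
y = [1, 1, 1, 1, 1, 0, 0, 0, 0, 0]
synthetic = msmote_sketch(X, y, minority_label=1, n_new=6, k=3)
print(synthetic)
```

On this toy data every synthetic point should fall inside the minority cluster, because the outlier is skipped as noise. For real work, a maintained library such as the one linked above is the safer choice.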

Related

Reweighting (rake) in Python

I'm looking for a Python library to replace the rake function from "survey", an R library (https://www.rdocumentation.org/packages/survey/versions/4.0/topics/rake)
I found and tried Quantipy, but the quality of its weights is poor compared to the weights generated with R on the same dataset.
I also found PandaSurvey, but it does not seem to work correctly (and the documentation is very poor).
I am surprised not to find much on Google on this subject, since raking is an essential function if you work with polls. Given that Python is a data science language, this is surprising. But maybe I missed something.
Thank you very much!
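
In case it helps while looking for a library: raking is iterative proportional fitting, so a basic version can be sketched in plain Python. The snippet below is only an illustration under my own assumptions (the function names `rake` and `weighted_share`, the toy respondents and the target shares are all made up). It does not reproduce the R function's handling of survey design objects, and it assumes every category in the targets appears in the sample.

```python
def rake(rows, targets, max_iter=100, tol=1e-9):
    """Adjust weights until weighted marginal shares match the targets.

    rows:    list of dicts, one per respondent, e.g. {"sex": "M", "age": "young"}
    targets: {variable: {category: target share}}, shares summing to 1 per variable
    """
    weights = [1.0] * len(rows)
    for _ in range(max_iter):
        max_shift = 0.0
        for var, shares in targets.items():
            total = sum(weights)
            # Current weighted total per category of this variable.
            current = {}
            for row, w in zip(rows, weights):
                current[row[var]] = current.get(row[var], 0.0) + w
            # Scale each respondent so this variable's margins hit the targets.
            for i, row in enumerate(rows):
                factor = shares[row[var]] * total / current[row[var]]
                weights[i] *= factor
                max_shift = max(max_shift, abs(factor - 1.0))
        if max_shift < tol:
            break  # no variable needed a noticeable adjustment
    return weights


def weighted_share(rows, weights, var, category):
    """Weighted share of respondents whose `var` equals `category`."""
    hit = sum(w for row, w in zip(rows, weights) if row[var] == category)
    return hit / sum(weights)


# Toy sample of 10 respondents (made-up data).
rows = ([{"sex": "M", "age": "young"}] * 3 + [{"sex": "M", "age": "old"}] * 1
        + [{"sex": "F", "age": "young"}] * 2 + [{"sex": "F", "age": "old"}] * 4)
targets = {"sex": {"M": 0.5, "F": 0.5}, "age": {"young": 0.4, "old": 0.6}}
weights = rake(rows, targets)
print(weighted_share(rows, weights, "sex", "M"),
      weighted_share(rows, weights, "age", "young"))
```

Comparing the output of a sketch like this with R on the same data can also help tell whether a quality difference comes from the algorithm itself or from extras such as weight trimming or convergence settings.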

What's the name of this matrix or table?

I hope you are all doing fine.
I am having a conceptual problem: I don't know the name of this table, nor how to extract it using scikit-learn. Even knowing the correct terminology for this table would help a lot, and if someone can tell me which scikit-learn function to use, that would be awesome.
I have googled it a lot, e.g. using terms like "aggregated table" and "classification report", but couldn't find this type of table.
Thanks for your time!
Happy coding!
You can use the eli5 package in Python.
ELI5 is a Python package that helps to debug machine learning classifiers and explain their predictions.
In this specific case, you can use the eli5.show_weights() function for your classifier. Note that it works for classifiers in sklearn and also in sklearn-crfsuite.
Sorry for the late reply, but I found the answer after searching and discussing with my peers. This is a custom matrix used to compare algorithms on the basis of feature extraction techniques. Thanks #OmG for taking the time to answer this question.

Is there a Python library that can learn the parameters of a given dynamic Bayesian network?

As the title says, what I need is a Python library, not Matlab's BNT.
BNT is quite powerful, but most of the time I use Python to clean data, and I recently found that using two different languages for one task usually makes the problem much more complex. So I want a Python library that can fit the parameters of DBNs.
Thank you very much.

How can I build a recommendation model using Python's scikit-learn?

I'm studying statistical learning these days using Python's pandas and scikit-learn libraries, and they're fantastic tools for me.
With them I have been able to learn classification, regression and also clustering.
But I cannot find out how to get started with them when I want to build a recommendation model. For example, suppose I have a customer purchase dataset that contains date, product name, product maker, price, order device, etc.
What type of problem is recommendation? Classification, regression, or something else?
In fact, I found that there are well-known algorithms, such as collaborative filtering, for solving this problem.
If so, can I use those algorithms with scikit-learn, or do I have to learn another ML library?
Regards
Scikit-learn does not offer any recommendation system tools. You can take a look at Mahout, which makes it really easy to get started, or at Spark.
However, recommendation is a problem in its own right in the machine learning world. It can be regression if you are trying to predict the rating that a user would give to a movie, for instance, or classification if you want to know whether a user will like the movie or not (a binary choice).
The important thing is that recommendation uses tools and algorithms dedicated to this problem, like item-based or content-based recommendation. These concepts are actually quite simple to understand, and implementing a small recommendation engine yourself might be the best way to learn them.
I recommend the book Mahout in Action, which is a great introduction to recommendation concepts.
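
To give an idea of how small such a do-it-yourself engine can be, here is a minimal item-based collaborative filtering sketch in plain Python. Everything in it (the `ratings` data, the function names, the use of cosine similarity and a similarity-weighted average) is an illustrative assumption, not how Mahout or Spark implement it.

```python
import math


def item_vectors(ratings):
    """Turn {user: {item: rating}} into {item: {user: rating}}."""
    vectors = {}
    for user, items in ratings.items():
        for item, rating in items.items():
            vectors.setdefault(item, {})[user] = rating
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse rating vectors."""
    common = set(u) & set(v)
    if not common:
        return 0.0
    dot = sum(u[k] * v[k] for k in common)
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    return dot / (norm_u * norm_v)


def recommend(ratings, user, top_n=2):
    """Rank unseen items by a similarity-weighted average of the user's ratings."""
    vectors = item_vectors(ratings)
    seen = ratings[user]
    scores = {}
    for candidate in vectors:
        if candidate in seen:
            continue
        num = den = 0.0
        for item, rating in seen.items():
            sim = cosine(vectors[candidate], vectors[item])
            num += sim * rating
            den += sim
        if den > 0:  # skip items with no overlap with anything the user rated
            scores[candidate] = num / den
    return sorted(scores, key=scores.get, reverse=True)[:top_n]


# Made-up ratings on a 1-5 scale.
ratings = {
    "alice": {"laptop": 5, "mouse": 4},
    "bob": {"laptop": 5, "mouse": 5, "keyboard": 4},
    "carol": {"novel": 5, "bookmark": 4},
    "dave": {"novel": 4, "bookmark": 5, "lamp": 5},
    "erin": {"laptop": 4, "keyboard": 5},
}
print(recommend(ratings, "alice"))
```

With purchase data instead of ratings, one common simplification is to use 1 for "bought" and leave everything else out, which turns the same code into a "customers who bought this also bought" recommender.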
How about Crab (https://github.com/python-recsys/crab), which is a Python framework for building recommender engines, integrated with the world of scientific Python packages (numpy, scipy, matplotlib)?
I have not used this framework; I just found it. It seems there is only version 0.1, and Crab hasn't been updated for years, so I doubt that it is well documented. In any case, if you decide to try Crab, please give us some feedback afterwards :)

Documentation for libsvm in python

Is there any good documentation for libsvm in Python, with a few non-trivial examples, that explains what each of the flags means and how a model can be trained and tested from end to end?
(There is no official documentation for libsvm. The 'official documentation' provided for libsvm is just a paper on how SVMs work and does not contain any usage instructions for the module. Hence, please link any useful Python documentation / example code for libsvm here.)
If you have already downloaded libSVM, you will find some useful documentation inside two files:
./libsvm-3.xx/README, the file in the top directory, which covers the C/C++ API and also documents the binary executables svm-predict, svm-scale and svm-train
./libsvm-3.xx/python/README, which deals with the Python interfaces (svm and svmutil) and which I think is what you are looking for. However, the example there is quite naive, although it is a good starting point.
Let me suggest that, if you want to work with libSVM in Python, you use the scikit-learn package: it implements SVM using libSVM underneath, is much easier to use and better documented, and lets you control the same libSVM parameters.
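
As a rough end-to-end illustration of the scikit-learn route, the sketch below trains and tests an RBF-kernel SVM on the iris dataset that ships with scikit-learn. The dataset, the split and the parameter values are arbitrary choices for the example, not recommendations. The C, kernel and gamma arguments play the role of libSVM's -c, -t and -g options, although scikit-learn's default gamma="scale" is not the same as libSVM's default.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Load a small bundled dataset and hold out 30% of it for testing.
X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0, stratify=y
)

# Scale the features first (the job svm-scale does on the command line),
# then fit an SVM with an RBF kernel.
model = make_pipeline(StandardScaler(), SVC(kernel="rbf", C=1.0, gamma="scale"))
model.fit(X_train, y_train)

# Mean accuracy on the held-out test set.
accuracy = model.score(X_test, y_test)
print(f"test accuracy: {accuracy:.3f}")
```

Putting the scaler and the classifier in one pipeline means the scaling learned on the training data is reused unchanged on the test data.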
I think you might be approaching this the wrong way. You seem to expect to use LIBSVM as if it were ls: just run man ls to get the parameters and view the results. SVMs are more complicated than that.
The authors of LIBSVM publish a document (not a scientific paper!) called A Practical Guide to Support Vector Classification. You need to read and understand everything the authors explain there. The appendix to that guide gives multiple examples on many datasets, showing how to train and how to search for parameters (all of which is very important).
There is a README file in the python directory of the LIBSVM distribution. If you understand Python and you have read the practical guide, you should be able to use it. If not, you should probably start from the command-line examples to learn SVMs, or start with something easier (not SVMs!) to learn Python. After reading and understanding that, you should be able to use all the examples from the appendix and invoke them from Python.
Once you've tried this, you should be up and running in no time. If not, this is a great place to ask specific questions about problems you run into.
