I'm looking for a Python library to replace the rake function from the R "survey" package (https://www.rdocumentation.org/packages/survey/versions/4.0/topics/rake).
I have found and tried Quantipy, but the quality of the weights is poor compared to the weights generated with R on the same dataset.
I have found PandaSurvey, but it does not seem to work correctly (and its documentation is very sparse).
I am surprised not to find much on Google on this subject, since raking is an essential function if you work with polls, and Python is a data-science language. But maybe I missed it.
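For reference, the core of what rake does is iterative proportional fitting, which is simple enough to sketch by hand. Here is a minimal toy version in pandas (my own sketch, assuming df holds the categorical variables and targets maps each column name to a dict of desired marginal proportions; it is not a drop-in replacement for survey::rake):

import pandas as pd

def rake(df, targets, max_iter=50, tol=1e-6):
    # Start with uniform weights, one per respondent.
    weights = pd.Series(1.0, index=df.index)
    for _ in range(max_iter):
        max_change = 0.0
        for col, marg in targets.items():
            # Current weighted share of each category in this column.
            current = weights.groupby(df[col]).sum() / weights.sum()
            # Scale each respondent's weight so the weighted margins
            # match the target proportions for this column.
            factor = df[col].map(pd.Series(marg) / current)
            weights = weights * factor
            max_change = max(max_change, (factor - 1).abs().max())
        if max_change < tol:  # stop once all margins have converged
            break
    return weights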
Thank you very much!
I hope you are all doing fine.
I am having a conceptual problem: I don't know the name of this table, and I don't know how to extract it using scikit-learn. Even knowing the correct terminology for this table would help a lot, and if someone can tell me which scikit-learn function to use, that would be awesome.
I have googled a lot, e.g. using terms like "aggregated table" and "classification report", but couldn't find this type of table.
Thanks for your time!
Happy coding!
You can use the eli5 package in Python.
ELI5 is a Python package which helps to debug machine learning classifiers and explain their predictions.
For this specific case, you can use the eli5.show_weights() function on your classifier. Note that it works for classifiers from sklearn and sklearn-crfsuite as well.
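For example (a minimal sketch; the LogisticRegression model and the iris feature names are placeholders for your own classifier and features):

import eli5
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression

# Placeholder model: any fitted sklearn classifier works here.
X, y = load_iris(return_X_y=True)
clf = LogisticRegression(max_iter=1000).fit(X, y)

# In a Jupyter notebook this renders an HTML table of per-class feature weights.
eli5.show_weights(clf, feature_names=["sepal_len", "sepal_wid", "petal_len", "petal_wid"])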
Sorry for the late reply, but I found out after searching and discussing with my peers: this is a custom matrix used for comparing algorithms on the basis of feature-extraction techniques. Thanks #OmG for taking the time to answer this question.
I am trying to deal with data imbalance within a small dataset. I just found an article talking about SMOTE and MSMOTE here.
It seems that MSMOTE can overcome the shortcomings of SMOTE, so I really want to try it. The MSMOTE paper was published in 2009, but I could not find any library implementing MSMOTE in R or Python.
Does anyone know whether there is any built-in MSMOTE I could try? I'm fine with whatever programming language...
You can use "imbalanced-learn" package in Python.
This is the link
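Note that imbalanced-learn does not include MSMOTE itself, but for reference its oversamplers all follow the same pattern (a minimal sketch; X and y stand in for your own imbalanced data, and fit_resample requires version 0.4 or later):

from imblearn.over_sampling import SMOTE

# Oversample the minority class until the classes are balanced.
sm = SMOTE(random_state=42)
X_resampled, y_resampled = sm.fit_resample(X, y)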
This is an old question, but for future reference.
Here is a library with multiple variants of SMOTE in Python.
In particular, it includes MSMOTE: https://smote-variants.readthedocs.io/en/latest/oversamplers.html?highlight=msmote#msmote
import smote_variants

oversampler = smote_variants.MSMOTE()
X_samp, y_samp = oversampler.sample(X, y)
As the title says, what I need is a Python library, not MATLAB's BNT.
BNT is quite strong, but most of the time I use Python to clean data, and I have recently found that using two different languages for one task usually makes the problem much more complex. So I want a Python library that can fit the parameters of DBNs (dynamic Bayesian networks).
Thank you very much.
Is there any good documentation for libsvm in Python, with a few non-trivial examples, that explains what each of the flags means and how data can be trained and tested end to end?
(There is no official documentation for libsvm: the 'official documentation' provided for libsvm is just a paper on how SVMs work and does not contain any usage instructions for the module. Hence, please link any useful Python documentation / example code for libsvm here.)
If you have already downloaded libSVM, you will find some useful documentation inside two files:
./libsvm-3.xx/README in the top directory, which covers the C/C++ API and also documents the binary executables svm-predict, svm-scale and svm-train;
./libsvm-3.xx/python/README, which deals with the Python interfaces (svm and svmutil) and is, I think, what you are looking for. The examples there are quite naive, but they are a good beginning.
Let me also suggest that if you want to work with libSVM in Python, the scikit-learn package implements SVM using libSVM underneath; it is much easier to use, better documented, and lets you control the same parameters as libSVM.
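For example, a minimal sketch of the scikit-learn route (the iris data is just a stand-in for your own dataset; C, kernel and gamma correspond to libSVM's -c, -t and -g flags):

from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# SVC wraps libSVM under the hood; these arguments mirror its flags.
clf = SVC(C=1.0, kernel="rbf", gamma="scale")
clf.fit(X_train, y_train)
print(clf.score(X_test, y_test))  # accuracy on the held-out split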
I think you might be approaching this the wrong way. You seem to be expecting to use LIBSVM as if it was ls: just do man ls to get the parameters and view the results. SVMs are more complicated than that.
The authors of LIBSVM publish a document (not a scientific paper!) called: A Practical Guide to Support Vector Classification. You need to read and understand all that the authors explain there. The appendix to that guide gives multiple examples on many datasets and how to train and how to search for parameters (all things that are very important).
There is a README file in the python directory of the LIBSVM distribution. If you understand Python and have read the practical guide, you should be able to use it. If not, you should probably start from the command-line examples to learn SVMs, or start with something easier (not SVMs!) to learn Python. After reading and understanding all that, you should be able to use the examples from the appendix and invoke them from Python.
Once you've tried this you should be up and running in no time. If not, this is a great place to ask specific questions about problems you run into.
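For instance, the parameter search over C and gamma that the guide's appendix recommends can be run from Python with scikit-learn's GridSearchCV instead of libSVM's grid.py script (a sketch; X_train and y_train stand in for your own scaled training data):

from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVC

# Exponential grids over C and gamma, as the practical guide suggests.
param_grid = {"C": [2 ** k for k in range(-5, 16, 2)],
              "gamma": [2 ** k for k in range(-15, 4, 2)]}
search = GridSearchCV(SVC(), param_grid, cv=5)
search.fit(X_train, y_train)
print(search.best_params_)  # the (C, gamma) pair with the best CV accuracy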
I am using Latent Dirichlet Allocation with a corpus of news data from six different sources. I am interested in topic evolution, emergence, and want to compare how the sources are alike and different from each other over time. I know that there are a number of modified LDA algorithms such as the Author-Topic model, Topics Over Time, and so on.
My issue is that very few of these alternate model specifications are implemented in any standard format. A few are available in Java, but most exist as conference papers only. What is the best way to go about implementing some of these algorithms on my own? I am fairly proficient in R and JAGS, and can stumble around in Python when given long enough. I am willing to write the code, but I don't really know where to start, and I don't know C or Java. Can I build a model in JAGS or Python just from the formulas in the manuscript? If so, can someone point me to an example of doing this? Thanks.
My friend's response is below, pardon the language please.
First I wrote a Python implementation of the collapsed Gibbs sampler seen here (http://www.pnas.org/content/101/suppl.1/5228.full.pdf+html) and fleshed out here (http://cxwangyi.files.wordpress.com/2012/01/llt.pdf). This was slow as balls.
Then I used a Python wrapper of a C implementation of this paper (http://books.nips.cc/papers/files/nips19/NIPS2006_0511.pdf), which is fast as f*ck, but the results are not as great as one would see with NMF.
But the NMF implementations I've seen, with scikits, and even with the scipy sparse-compatible, recently released NIMFA library, all blow the f*ck up on any sizable corpus. My new white whale is a sliced, distributed implementation of the thing. This'll be non-trivial.
In Python, do you know of PyMC? It's flexible in specifying both the model and the fitting algorithm.
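For example, here is a minimal sketch in PyMC3 of a Dirichlet-multinomial, the basic building block of LDA-style models (toy counts, not a full topic model):

import numpy as np
import pymc3 as pm

# Hypothetical word counts over a 4-word vocabulary.
counts = np.array([10, 3, 1, 6])

with pm.Model():
    # Dirichlet prior over the word probabilities.
    p = pm.Dirichlet("p", a=np.ones(4))
    pm.Multinomial("obs", n=counts.sum(), p=p, observed=counts)
    # NUTS by default; the sampler itself is swappable.
    trace = pm.sample(1000)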
Also, when starting with R and JAGS, there is this tutorial on "Using JAGS in R with the rjags Package" together with a collection of examples.