I'm a college student (Economics) and I want to program some monetary models using neural networks. I want those models to be able to predict future values of some variables from economic data, but I really don't know how to "model" the program itself. Is there a good Python module for that? I mean, a module for neural networks and a module for economic analysis?
P.S.: I'm using Python 3.x, but I can switch to 2.7.x if needed...
There is also PyBrain. I haven't worked with any of these libraries yet, but I've had some time to look through the documentation. PyBrain seems to have the simplest interface relative to the feature set it offers.
EDIT
I now (Dec 2010) have some practical experience with PyBrain and like it very much.
I've played with ffnet a little. PS - It was a pain to install.
"Feed-forward neural network for python"
http://pypi.python.org/pypi/ffnet/0.6
For large neural networks, you might want to consider GPU-accelerated libraries.
Our own library CUV comes to mind, as does Theano, for example. CUV has Python bindings; Theano actually generates C++/CUDA code.
Google yields at least four different Python neural network implementations; in particular, bpnn.py looks good just for its simplicity.
Or were you looking for an explanation of neural networks?
pyfann is fast and well documented: http://leenissen.dk/fann/wp/
I'm working on a machine learning project and I need to make some predictions.
I have data from a solar panel together with the corresponding weather conditions,
and I need to predict the panel's energy efficiency for the following days.
I searched around a bit and found some information about neural network libraries like Keras. I installed it, but I just don't know how to make it work for my situation. I'm a beginner in machine learning; I've learned a lot, but it's mostly theory with little practice, so I'm really lost.
If someone could tell me how to approach this, or at least give me a search trail to follow, I'd be grateful!
Thanks a lot for the support!
To use Keras/TensorFlow or other such libraries, you need to know how to code in Python and have some understanding of neural networks. To begin with, you could have a look at KNIME (https://www.knime.org/), which provides similar functionality without requiring any coding. That might help you understand what happens when you apply a given kind of algorithm. Once you have a fair idea, you might want to try Keras/TensorFlow.
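If you do get to the Keras/TensorFlow stage, a minimal regression sketch for this kind of forecasting problem might look like the following. The file name and column names ("solar_data.csv", "temperature", "cloud_cover", "irradiance", "energy_output") are hypothetical placeholders for your actual dataset.

```python
# Minimal sketch: predict a continuous energy value from weather features.
# Column and file names below are made-up placeholders.
import pandas as pd
from sklearn.model_selection import train_test_split
from keras.models import Sequential
from keras.layers import Dense

data = pd.read_csv("solar_data.csv")                     # hypothetical file
X = data[["temperature", "cloud_cover", "irradiance"]].values
y = data["energy_output"].values

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2)

model = Sequential([
    Dense(32, activation="relu", input_shape=(X.shape[1],)),
    Dense(16, activation="relu"),
    Dense(1)                                             # single continuous output
])
model.compile(optimizer="adam", loss="mse")
model.fit(X_train, y_train, epochs=50, batch_size=16, validation_split=0.1)

print(model.evaluate(X_test, y_test))                    # MSE on held-out data
```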
As the title says, what I need is a Python library, not Matlab's BNT.
BNT is quite strong, but most of the time I use Python to clean data, and I've recently found that using two different languages for one task usually makes the problem much more complex. So I want a Python library that can fit the parameters of DBNs.
Thank you very much.
I'm currently learning statistical learning using Python's pandas and scikit-learn libraries, and they're fantastic tools for me.
With them I've been able to learn classification, regression, and also clustering.
But I can't figure out how to get started with them when I want to build a recommendation model. For example, suppose I have a customer purchase dataset containing date, product name, product maker, price, ordering device, etc.
What type of problem is recommendation? Classification, regression, or something else?
I did find that there are well-known algorithms, such as collaborative filtering, for solving this kind of problem.
If so, can I use those algorithms with scikit-learn, or do I have to learn other ML libraries?
Regards
Scikit-learn does not offer any recommendation-system tools. You can take a look at Mahout, which makes it really easy to get started, or at Spark.
However, recommendation is a problem in its own right in the machine learning world. It can be framed as regression if you are trying to predict the rating a user would give to a movie, for instance, or as classification if you want to know whether a user will like a movie or not (a binary choice).
The important thing is that recommendation uses tools and algorithms dedicated to this problem, such as item-based or content-based recommendation. These concepts are actually quite simple to understand, and implementing a little recommendation engine yourself might be the best way to learn.
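To give a concrete feel for what "implementing a little recommendation engine yourself" can look like, here is a minimal item-based collaborative filtering sketch using cosine similarity; the tiny ratings matrix is made up for illustration.

```python
# Small sketch of item-based collaborative filtering with cosine similarity.
# The ratings matrix is a toy example (rows = users, columns = items, 0 = unrated).
import numpy as np

ratings = np.array([
    [5, 3, 0, 1],
    [4, 0, 0, 1],
    [1, 1, 0, 5],
    [0, 1, 5, 4],
], dtype=float)

def cosine_sim(a, b):
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return a.dot(b) / denom if denom else 0.0

n_items = ratings.shape[1]
item_sim = np.array([[cosine_sim(ratings[:, i], ratings[:, j])
                      for j in range(n_items)] for i in range(n_items)])

def predict(user, item):
    """Score an unrated item as a similarity-weighted average of the user's ratings."""
    rated = ratings[user] > 0
    weights = item_sim[item, rated]
    if weights.sum() == 0:
        return 0.0
    return ratings[user, rated].dot(weights) / weights.sum()

print(predict(0, 2))   # predicted rating of user 0 for item 2
```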
I recommend the book Mahout in Action, which is a great introduction to recommendation concepts.
How about Crab (https://github.com/python-recsys/crab)? It is a Python framework for building recommender engines, integrated with the world of scientific Python packages (numpy, scipy, matplotlib).
I have not used this framework; I just found it. It seems there is only a version 0.1, and Crab hasn't been updated in years, so I doubt it is well documented. In any case, if you decide to try Crab, please give us feedback afterwards :)
Sorry if this all seems nooby and unclear, but I'm currently learning NetLogo to model agent-based collective behavior and would love to hear some advice on alternative software choices. My main concern is that I'd very much like to take advantage of PyCUDA since, from what I understand, it enables parallel computation. However, does that mean I still have to write the numerical script in one environment and implement the visuals in yet another?
If so, my questions are:
What numerical package should I use? PyEvolve, DEAP, or something else? It appears that PyEvolve is no longer being developed and that DEAP is just a wrapper around the outdated(?) EAP.
Graphics-wise, I find mayavi2 and vtk promising. The problem is, none of the numerical packages seems to bind to these readily. Is there no better alternative than to save the numerical output to a data file and feed it into, say, mayavi2?
Another option is to generate the data via NetLogo and feed it into a graphing package from (2). Is there any disadvantage to doing this?
Thank you so much for shedding light on this confusion.
You almost certainly do not want to use CUDA unless you are running into a significant performance problem. In general, CUDA is best used for solving floating-point linear algebra problems. If you are looking for a framework built around parallel computations, I'd look towards OpenCL, which can take advantage of GPUs if needed.
In terms of visualization, I'd strongly suggest targeting a specific data interchange format and then letting some other program do the rendering for you. The only reason I'd use something like VTK is if you need more control over the visualization process or you are looking for a real-time solution.
Probably the best choice for visualization would be to use an intermediate format and do it in another program. But for performance, I'd rather configure a JVM for a cluster and run NetLogo on it. I haven't tried it yet, but I'm seriously thinking about trying NetLogo on a Beowulf-style cluster.
By the way, there is an ABM platform called Repast that is said to have a Python interface, if you're planning to implement your code in Python.
I'm planning to implement a document ranker that uses neural networks. How can one rate a document by taking into consideration the ratings of similar articles? Are there any good Python libraries for doing this? Can anyone recommend a good book on AI with Python code?
EDIT
I'm planning to build a recommendation engine that would make recommendations based on similar users as well as on data clustered by tags. Users would be given the chance to vote on articles. There will be about a hundred thousand articles. Documents would be clustered based on their tags. Given a keyword, articles would be fetched based on their tags and passed through a neural network for ranking.
The problem you are trying to solve is called "collaborative filtering".
Neural Networks
One state-of-the-art family of neural network methods is Deep Belief Networks and Restricted Boltzmann Machines. For a fast Python implementation for a GPU (CUDA), see here. Another option is PyBrain.
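To give a rough sense of what a Restricted Boltzmann Machine involves (this is not the GPU implementation linked above), here is a minimal numpy sketch of a binary RBM trained with one-step contrastive divergence; the layer sizes and toy data are made up.

```python
# Minimal sketch of a binary RBM trained with CD-1 in numpy.
# Layer sizes, learning rate, and the toy binary data are illustrative only.
import numpy as np

rng = np.random.default_rng(0)
n_visible, n_hidden, lr = 6, 3, 0.1
W = rng.normal(0, 0.1, size=(n_visible, n_hidden))
b_v = np.zeros(n_visible)      # visible biases
b_h = np.zeros(n_hidden)       # hidden biases

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

data = rng.integers(0, 2, size=(20, n_visible)).astype(float)   # toy binary data

for epoch in range(100):
    for v0 in data:
        # positive phase: sample hidden units given visible data
        p_h0 = sigmoid(v0 @ W + b_h)
        h0 = (rng.random(n_hidden) < p_h0).astype(float)
        # negative phase: one Gibbs step back to visible, then to hidden
        p_v1 = sigmoid(h0 @ W.T + b_v)
        p_h1 = sigmoid(p_v1 @ W + b_h)
        # CD-1 parameter updates
        W += lr * (np.outer(v0, p_h0) - np.outer(p_v1, p_h1))
        b_v += lr * (v0 - p_v1)
        b_h += lr * (p_h0 - p_h1)
```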
Academic papers on your specific problem:
This is probably the state-of-the-art of neural networks and collaborative filtering (of movies):
Salakhutdinov, R., Mnih, A., and Hinton, G. Restricted Boltzmann Machines for Collaborative Filtering. In Proceedings of the 24th International Conference on Machine Learning, 2007. PDF
A Hopfield network implemented in Python:
Huang, Z., Chen, H., and Zeng, D. Applying associative retrieval techniques to alleviate the sparsity problem in collaborative filtering. ACM Transactions on Information Systems (TOIS), 22(1), 116-142, 2004, ACM. PDF
A thesis on collaborative filtering with Restricted Boltzmann Machines (they say Python is not practical for the job):
G. Louppe. Collaborative filtering: Scalable approaches using restricted Boltzmann machines. Master's thesis, Université de Liège, 2010. PDF
Neural networks are not currently the state-of-the-art in collaborative filtering, and they are not the simplest, most widespread solution. Regarding your comment that you want to use NNs because you have too little data: neural networks don't have an inherent advantage or disadvantage in that case. Therefore, you might want to consider simpler machine learning approaches.
Other Machine Learning Techniques
The best methods today mix k-Nearest Neighbors and Matrix Factorization.
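To make the matrix factorization idea concrete, here is a tiny sketch that factorizes a made-up ratings matrix with stochastic gradient descent; the hyperparameters are arbitrary illustration values.

```python
# Tiny sketch of matrix factorization for collaborative filtering via SGD.
# The ratings matrix and hyperparameters are illustrative only.
import numpy as np

rng = np.random.default_rng(0)
ratings = np.array([           # rows = users, cols = items, 0 = unknown
    [5, 3, 0, 1],
    [4, 0, 0, 1],
    [1, 1, 0, 5],
    [0, 1, 5, 4],
], dtype=float)

n_users, n_items = ratings.shape
k, lr, reg = 2, 0.01, 0.05     # latent factors, learning rate, regularization
P = rng.normal(0, 0.1, (n_users, k))    # user factors
Q = rng.normal(0, 0.1, (n_items, k))    # item factors

for epoch in range(500):
    for u in range(n_users):
        for i in range(n_items):
            if ratings[u, i] == 0:
                continue
            err = ratings[u, i] - P[u] @ Q[i]
            p_u = P[u].copy()
            P[u] += lr * (err * Q[i] - reg * P[u])
            Q[i] += lr * (err * p_u - reg * Q[i])

predicted = P @ Q.T            # predicted ratings for every user/item pair
print(np.round(predicted, 1))
```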
If you are locked on Python, take a look at pysuggest (a Python wrapper for the SUGGEST recommendation engine) and PyRSVD (primarily aimed at applications in collaborative filtering, in particular the Netflix competition).
If you are open to trying other open source technologies, look at: Open Source collaborative filtering frameworks and http://www.infoanarchy.org/en/Collaborative_Filtering.
Packages
If you're not committed to neural networks, I've had good luck with SVM, and k-means clustering might also be helpful. Both of these are provided by Milk. It also does Stepwise Discriminant Analysis for feature selection, which will definitely be useful to you if you're trying to find similar documents by topic.
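I'm not certain of Milk's current API, so purely as an illustration of the same two techniques (SVM classification and k-means clustering), here is a sketch using scikit-learn, which was mentioned earlier in this thread; the tiny document corpus and labels are made up.

```python
# Illustration of SVM classification and k-means clustering on documents,
# using scikit-learn in place of Milk. The toy corpus and labels are made up.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.svm import LinearSVC
from sklearn.cluster import KMeans

docs = ["interest rates and inflation", "neural networks for ranking",
        "central bank policy", "deep learning models"]
labels = ["econ", "ml", "econ", "ml"]             # toy class labels

X = TfidfVectorizer().fit_transform(docs)         # bag-of-words features

clf = LinearSVC().fit(X, labels)                  # SVM classifier
print(clf.predict(X[:1]))

km = KMeans(n_clusters=2, n_init=10).fit(X)       # unsupervised clustering
print(km.labels_)
```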
God help you if you choose this route, but the ROOT framework has a powerful machine learning package called TMVA that provides a large number of classification methods, including SVM, NN, and Boosted Decision Trees (also possibly a good option). I haven't used it, but pyROOT provides python bindings to ROOT functionality. To be fair, when I first used ROOT I had no C++ knowledge and was in over my head conceptually too, so this might actually be amazing for you. ROOT has a HUGE number of data processing tools.
(NB: I've also written a fairly accurate document language identifier using chi-squared feature selection and cosine matching. Obviously your problem is harder, but consider that you might not need very hefty tools for it.)
Storage vs Processing
You mention in your question that:
...articles would be fetched based on their tags and passed through a neural network for ranking.
Just as another NB, one thing you should know about machine learning is that processes like training and evaluating tend to take a while. You should probably consider ranking all documents for each tag only once (assuming you know all the tags) and storing the results. For machine learning generally, it's much better to use more storage than more processing.
Now to your specific case. You don't say how many tags you have, so let's assume you have 1000, for roundness. If you store the results of your ranking for each doc on each tag, that gives you 100 million floats to store. That's a lot of data, and calculating them all will take a while, but retrieving them is very fast. If instead you recalculate the ranking for each document on demand, you have to do 1000 passes of it, one for each tag. Depending on the kind of operations you're doing and the size of your docs, that could take a few seconds to a few minutes. If the process is simple enough that you can wait for your code to do several of these evaluations on demand without getting bored, then go for it, but you should time this process before making any design decisions / writing code you won't want to use.
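As a sketch of the "compute once, store, then just retrieve" idea, here is one way to precompute a documents-by-tags score matrix with numpy; rank_documents() is a hypothetical stand-in for whatever ranking model you end up training, and the sizes come from the rough numbers above.

```python
# Sketch of precomputing and storing per-tag document scores instead of
# recomputing them on demand. rank_documents() is a hypothetical placeholder.
import numpy as np

n_docs, n_tags = 100_000, 1_000

def rank_documents(tag_id):
    """Hypothetical: return a score for every document under this tag."""
    return np.random.rand(n_docs).astype(np.float32)

# 100,000 x 1,000 float32 scores is roughly 400 MB; a memmap avoids
# holding it all in memory at once.
scores = np.lib.format.open_memmap("tag_scores.npy", mode="w+",
                                    dtype=np.float32, shape=(n_docs, n_tags))
for tag in range(n_tags):
    scores[:, tag] = rank_documents(tag)          # compute once, store
scores.flush()

# At query time, retrieval is a cheap column slice plus a sort.
top_for_tag_42 = np.argsort(scores[:, 42])[::-1][:10]
```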
Good luck!
If I understand correctly, your task is something related to collaborative filtering. There are many possible approaches to this problem; I suggest you read the Wikipedia page to get an overview of the main approaches you can choose from.
For your project, I can suggest looking at a Python-based intro to neural networks with a simple backprop NN implementation and a classification example. This is not "the" solution, but perhaps you can build your system out of that example without needing a bigger framework.
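In the same spirit as that intro, a minimal one-hidden-layer network trained with backpropagation can be written in a few dozen lines of numpy; the XOR data below is just a toy classification problem for illustration.

```python
# Minimal one-hidden-layer neural network trained with backpropagation.
# XOR is used as a toy classification problem for illustration.
import numpy as np

rng = np.random.default_rng(0)
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([[0], [1], [1], [0]], dtype=float)

W1 = rng.normal(0, 1, (2, 4)); b1 = np.zeros(4)    # input -> hidden
W2 = rng.normal(0, 1, (4, 1)); b2 = np.zeros(1)    # hidden -> output

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

lr = 0.5
for epoch in range(5000):
    # forward pass
    h = sigmoid(X @ W1 + b1)
    out = sigmoid(h @ W2 + b2)
    # backward pass (squared-error loss, sigmoid derivatives)
    d_out = (out - y) * out * (1 - out)
    d_h = (d_out @ W2.T) * h * (1 - h)
    # gradient descent updates
    W2 -= lr * h.T @ d_out;  b2 -= lr * d_out.sum(axis=0)
    W1 -= lr * X.T @ d_h;    b1 -= lr * d_h.sum(axis=0)

print(np.round(out.ravel(), 2))   # should approach [0, 1, 1, 0]
```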
You might want to check out PyBrain.
The FANN library also looks promising.
I'm not really sure a neural network is the best way to solve this. I think a Euclidean distance score or Pearson correlation score combined with item-based or user-based filtering would be a good start.
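For reference, here is a small sketch of the Pearson correlation similarity between two users over a made-up ratings dictionary, in the style popularized by the book recommended below.

```python
# Sketch of a Pearson-correlation user similarity over a small, made-up
# ratings dictionary (user -> {item: rating}).
from math import sqrt

ratings = {
    "alice": {"article_a": 4, "article_b": 3, "article_c": 5},
    "bob":   {"article_a": 5, "article_b": 1, "article_c": 4},
    "carol": {"article_b": 4, "article_c": 2},
}

def pearson(prefs, u1, u2):
    shared = [item for item in prefs[u1] if item in prefs[u2]]
    n = len(shared)
    if n == 0:
        return 0.0
    x = [prefs[u1][i] for i in shared]
    y = [prefs[u2][i] for i in shared]
    sum_x, sum_y = sum(x), sum(y)
    sum_x2, sum_y2 = sum(v * v for v in x), sum(v * v for v in y)
    sum_xy = sum(a * b for a, b in zip(x, y))
    num = sum_xy - sum_x * sum_y / n
    den = sqrt((sum_x2 - sum_x ** 2 / n) * (sum_y2 - sum_y ** 2 / n))
    return num / den if den else 0.0

print(pearson(ratings, "alice", "bob"))
```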
An excellent book on the topic is Programming Collective Intelligence by Toby Segaran.