As it currently stands, this question is not a good fit for our Q&A format. We expect answers to be supported by facts, references, or expertise, but this question will likely solicit debate, arguments, polling, or extended discussion. If you feel that this question can be improved and possibly reopened, visit the help center for guidance.
Closed 10 years ago.
I am a decent C/C++ programmer, but I don't know much about web development. I am interested in Twitter/social data mining. Which is the better tool: RoR or Django? I am at level zero in both Ruby and Python, but Python's syntax seemed easier to understand and learn. The main question is: which tool has better mining-related APIs?
Thanks!!
They both have everything you need, but I think Python does better here. Python has a very interesting text mining library called NLTK, plus NumPy/SciPy for analytical computations, which can get you performance close to C. For pure data mining I'd suggest Python + pandas (pandas is really well written and fast, and as far as I know there is no Ruby equivalent) or Python + some R code called through rpy. If your data mining code needs symbolic math, you can use SymPy (slower because it's written in Python, but very complete) or Theano (much faster but with fewer features; it can even run your code on the GPU through CUDA).
If you are merely collecting data from Twitter, you don't need an MVC framework like Django or RoR. You can actually use C++ libraries to collect data from Twitter, store it in a database, build the indexes and so on, and then use C or C++ to run the data mining tasks against your data. Or you can perform the analysis on the go.
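The collect, store, index, and mine steps above can be sketched without any web framework. This is a minimal illustration in Python using only the standard library; the same steps apply in C or C++. The sample tweets, the table layout, and the function names `store_tweets` and `top_terms` are all assumptions made for the example, and fetching from the Twitter API is left out entirely.

```python
import re
import sqlite3
from collections import Counter

# Hypothetical sample data standing in for tweets fetched from an API.
SAMPLE_TWEETS = [
    (1, "alice", "Learning Python for data mining"),
    (2, "bob", "Data mining with C++ is fast"),
    (3, "alice", "Python makes text mining easy"),
]


def store_tweets(conn, tweets):
    """Create the table and a per-user index, then insert the tweets."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS tweets "
        "(id INTEGER PRIMARY KEY, user TEXT, text TEXT)"
    )
    conn.execute("CREATE INDEX IF NOT EXISTS idx_tweets_user ON tweets (user)")
    conn.executemany("INSERT OR REPLACE INTO tweets VALUES (?, ?, ?)", tweets)
    conn.commit()


def top_terms(conn, n=3):
    """Count lowercase word occurrences across all stored tweets."""
    counts = Counter()
    for (text,) in conn.execute("SELECT text FROM tweets"):
        counts.update(re.findall(r"[a-z+]+", text.lower()))
    return counts.most_common(n)


conn = sqlite3.connect(":memory:")  # pass a file path to keep the data
store_tweets(conn, SAMPLE_TWEETS)
print(top_terms(conn))
```

A real pipeline would replace `SAMPLE_TWEETS` with whatever the collection step returns and would likely use a proper tokenizer instead of the simple regular expression.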
If you want to build your own web interface to present your work, or the like, Django and RoR are both very good frameworks that are easy to pick up.
This is not a real question, please read the faq
Closed 10 years ago.
I have huge tables of data that I need to manipulate (sort, calculate new quantities, select specific rows according to some conditions, and so on). So far I have been using spreadsheet software for this, but it is really time consuming and I am trying to find a more efficient way to do it.
I use Python, but I could not figure out how to use it for this kind of work. Can anybody suggest something to use? SQL?
This is a very general question, but there are multiple things that you can do to possibly make your life easier.
1. CSV: CSV files are very useful if you are storing data that is organized in columns and you want easy-to-read text files.
2. SQLite3: SQLite3 is a database system that does not require a server (it uses a file instead), and you interact with it just like any other database system. However, it is not recommended for very large projects that handle massive amounts of data.
3. MySQL: MySQL is a database system that requires a server, but it can be tuned for very large projects as well as small ones.
There are many other kinds of systems, though, so I suggest you search around and find the one that fits best. If you want to experiment with SQLite3 or CSV, I believe both the sqlite3 and csv modules ship in the standard library with Python 2.7 and 3.x.
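As a rough illustration of the CSV option, the sketch below uses the standard `csv` module to do the three things the question asks about: compute a new quantity, select rows by a condition, and sort. The column names and the sample data are made up for the example, and `io.StringIO` stands in for a real file opened with `open("data.csv", newline="")`.

```python
import csv
import io

# Hypothetical sample data; a real script would read from a file instead.
RAW = """name,mass_kg,volume_m3
iron,7870,1.0
water,1000,1.0
oak,1500,2.0
"""


def load_rows(handle):
    """Read CSV rows and add a computed 'density' column to each one."""
    rows = []
    for row in csv.DictReader(handle):
        mass = float(row["mass_kg"])
        volume = float(row["volume_m3"])
        rows.append(
            {
                "name": row["name"],
                "mass_kg": mass,
                "volume_m3": volume,
                "density": mass / volume,
            }
        )
    return rows


rows = load_rows(io.StringIO(RAW))

# Select rows matching a condition, then sort by the new quantity.
dense = [r for r in rows if r["density"] >= 1000]
dense.sort(key=lambda r: r["density"], reverse=True)

for r in dense:
    print(r["name"], r["density"])
```

This approach loads every row into memory, so it suits tables that fit comfortably in RAM.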
You will probably appreciate the sqlite3 module in the Python standard library:
http://docs.python.org/library/sqlite3.html
You get a SQL database that's stored in a file on disk, with no need to configure a separate database server. It's not appropriate for multiple clients accessing at once, but for a single-threaded analysis application like yours, it's a good fit.
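A minimal sketch of that workflow, assuming a made-up `measurements` table: the SQL query calculates a new quantity, filters rows, and sorts them in one step. The table name, columns, and values are illustrative only.

```python
import sqlite3

# ":memory:" keeps the example self-contained; pass a file path to persist data.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE measurements (sample TEXT, distance_m REAL, time_s REAL)"
)
conn.executemany(
    "INSERT INTO measurements VALUES (?, ?, ?)",
    [("a", 100.0, 20.0), ("b", 50.0, 5.0), ("c", 30.0, 10.0)],
)

# Compute a derived column, keep only rows meeting a condition, and sort.
query = """
    SELECT sample, distance_m / time_s AS speed
    FROM measurements
    WHERE time_s >= ?
    ORDER BY speed DESC
"""
results = conn.execute(query, (10.0,)).fetchall()

for sample, speed in results:
    print(sample, speed)
```

Spreadsheet data exported as CSV can be loaded into such a table with the `csv` module and `executemany`.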
Closed 10 years ago.
The requirement is to develop an HTML-based Facebook app. It would not be content based like a newspaper site,
but would mostly have user-generated data, aggregated and presented from a database + memcache.
The app would contain 4-5 pages at most, each with a different purpose.
We decided to write the app in Python instead of PHP, and tried to evaluate Django.
However, we found that Django is not as flexible as CodeIgniter is in PHP, i.e. it imposes more restrictions and rules instead of letting you do what you want to do.
CodeIgniter is a minimalistic PHP MVC framework, which we would have chosen had we been developing in PHP.
Can you please suggest a flexible and minimalistic Python-based web framework? I have heard of Pylons, CherryPy, and web.py, but I know nothing about their usage and structure.
Pyramid and Flask are both good options. Personally, I think where Pyramid shines is in its flexibility in routing requests to view functions. You can do route-based dispatch, which is similar to how Django does it, though it's not full-on regex matching. And if you are willing to use resources/traversal, you can do some really crazy things with access control lists.
You may not need that stuff, and you are free not to use it. But it does scale up nicely to a super complex application. And it runs on Python 3, which I don't think Flask does yet, though it will eventually.
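To make "routing requests to view functions" concrete, here is a rough sketch of the idea using only the standard library's WSGI helpers. It is not Pyramid's or Flask's API; the `route` decorator, the `ROUTES` table, and the `call` helper are hypothetical names invented for this illustration, and real frameworks add pattern matching, request objects, and much more.

```python
from wsgiref.util import setup_testing_defaults

# Maps an exact URL path to the view function that handles it.
ROUTES = {}


def route(path):
    """Decorator that registers a view function for an exact path."""
    def register(view):
        ROUTES[path] = view
        return view
    return register


@route("/")
def home(environ):
    return "home page"


@route("/about")
def about(environ):
    return "about page"


def app(environ, start_response):
    """WSGI application: look up the view for the path and call it."""
    view = ROUTES.get(environ.get("PATH_INFO", "/"))
    if view is None:
        status, body = "404 Not Found", "not found"
    else:
        status, body = "200 OK", view(environ)
    start_response(status, [("Content-Type", "text/plain; charset=utf-8")])
    return [body.encode("utf-8")]


def call(path):
    """Invoke the app in-process with a fake request, for demonstration."""
    environ = {}
    setup_testing_defaults(environ)
    environ["PATH_INFO"] = path
    captured = {}

    def start_response(status, headers):
        captured["status"] = status

    body = b"".join(app(environ, start_response))
    return captured["status"], body.decode("utf-8")


print(call("/about"))
print(call("/missing"))
```

The same `app` callable could be served with `wsgiref.simple_server.make_server` for local experiments.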
Based on my experience, I would recommend Django:
Developed by a fast-moving online-news operation, Django was designed to handle two challenges: the intensive deadlines of a newsroom and the stringent requirements of the experienced Web developers who wrote it. It lets you build high-performing, elegant Web applications quickly.
It is really easy to learn, and you will be able to develop those features after going through the official walkthrough.
Check out Flask. It's a very clever micro-framework with quite an active community.
You will not regret it ;)
For the fastest development you may dive into Django. But Django is probably not the fastest solution. Flask is lighter. Also you can try Pyramid.
Closed 10 years ago.
So I'm here, and I've pretty much got a full chain of languages for pretty much any purpose:
PHP
HTML
Javascript
Java
CSS
SQL
And now I'm wondering what I should learn next as an alternative or addition.
I know there isn't a single best language, but I hope I can at least get some expert opinions from people using these languages on what to expect.
My main focus is web development, and there are some technologies that have been growing really fast lately:
node.js (yes, basically JavaScript, but I think there's a lot to learn before JS can be used on the server side)
ruby
clojure
And there are some that have been around for quite some years now:
perl
python
But those are only the ones I've seen so far. Which of these languages/technologies do you recommend, and why? What are the benefits? Or have I missed the ultimate star among them all?
I was in a similar "what language next" conundrum and picked Ruby. I read tons of Ruby vs Python articles, and finally decided to build a simple app in each. I used Ruby on Rails and Python's Django framework. I really liked how Rails uses the MVC pattern. It helps me stick to better coding practices.
Also, I found a good IDE (RubyMine). When you're using a tool all day, a good IDE is helpful for getting to know a new framework.
There is no ultimate star; each language has its own pros and cons.
Most of the frameworks for the languages are almost the same and you can do pretty much all that you need with what you know. It all depends on your needs and current project.
I am a web developer who uses PHP, and I still haven't had any problems, except maybe with making a true Singleton like you can in Java (because of the lifespan of the script).
Python is cool. I like it because it has many libraries and useful tools, and the syntax is convenient.
I think a good idea for you now would be to pick up an MVC framework (Kohana, CodeIgniter, Yii, etc.) and learn it by using it, because experience with frameworks pays off for more complex web applications.
I think you should learn about NoSQL databases and their design, as this is the way of the future for high-traffic, in-depth web applications.
I would suggest doing some research in the design and implementation of:
Apache Hadoop
Cassandra
MongoDB
couchDB
BigTable
and perhaps even check out the wiki:
http://en.wikipedia.org/wiki/NoSQL
This is the "cloud" tech used by Facebook, Twitter, Google, etc. It is pretty impressive, but it requires quite a different approach from traditional databases (RDBMS).
This was the natural progression for me when I was hired at my current job to take my dev "skills" to the cloud :) (By the way, if you are good at NoSQL implementations and call them "cloud" solutions, you can make a lot more money; it's an emerging market for mainstream consumers.)
Closed 10 years ago.
I want to ask you what programming language I should use to develop a horizontally scalable database. I don't care too much about performance.
Currently, I only know PHP and Python, but I wonder if Python is good for scalability.
Or is this even possible in Python?
The reason I'm not using an existing system is that I need deep insight into the system, and there is no database out there that can store indexes the way I want. (It's a mix of non-relational, sparse free multidimensional, and graph design.)
EDIT:
I already have most of the core code written in Python, and I have investigated ways to improve adding data for that type of database design, which limits the use of other databases even more.
EDIT 2:
Forgot to note, the database tables are several hundred gigabytes.
The development of a scalable database is language independent. I cannot say much about PHP, but I can tell you good things about Python: it's easy to read, easy to learn, etc. In my opinion it makes the code much cleaner than other languages do.
Between PHP and Python, definitely Python. Where I work, the entire system is written in Python and it scales quite well.
P.S.: Do take a look at MongoDB though.
You're looking for MongoDB.
MongoDB has some excellent Python drivers. It is a joy to work with.
Since this is clearly a request for "opinion", I thought I'd offer my $.02
We looked at MongoDB 12 months ago, and started to really like it, but for one issue. MongoDB limits the largest database to the amount of physical RAM installed on the MongoDB server. For our tests, this meant we were limited to 4 GB databases. This didn't fit our needs, so we walked away (too bad really, because Mongo looked great).
We moved back to home turf, and went with PostgreSQL for our project. It is an exceptional system, with lots to like.
But we've kept an eye on the NoSQL crowd ever since, and it looks like Riak is doing some really interesting work.
(fyi -- it's also possible the MongoDB project has resolved the DB size issue -- we haven't kept up with that project).
Closed 10 years ago.
I am a newbie coder at a startup, and I am implementing search over the documents in a directory on a web host.
I am comparing Lucene/Solr, Whoosh, Sphinx, and Xapian. Whoosh is native Python, but I want your opinions on it too. Which of these has:
mature, easy-to-use, easy-to-install interfaces for Python? (Whoosh is a no-brainer)
the least chance of crashes, bottlenecks, and other failures
the best documented interface (I'm not reading PHP docs because the Python docs are sparse)
the easiest path to getting up and running (only one has a quick-start tutorial)
Speaking for Apache Solr, Python has several Solr clients, which I've collected based on feedback from our customers at Websolr:
Haystack is very popular, and designed for seamless integration within Django apps. If you're developing a Django app, Haystack is for you.
Sunburnt looks to be more generic than Haystack, and is also very well documented. If you're doing plain ol' Python, Sunburnt is worth a look.
Other Python Solr clients that I've found, which seem a bit lower level...
solrpy
pysolr (I know, right?)
Insol
Some more details about how your app is built (in particular, is it a Django app?) would help narrow things down from here. Good luck finding the best fit for your app!
Use Whoosh if you don't need the speed or extra features of the alternatives. It's great, with a nice API and good documentation. My second choice would probably be Xapian, which is fast and has a fairly decent API. They are all fairly mature products. If you don't know what you really need, I'd just go with Whoosh for now.
If you want quick Python integration, try IndexTank. You can be up and running in 2 minutes, and it's free.
Among the other alternatives, I'd go with Solr (provided you want to host the search servers yourself, or sign up for Websolr).
Disclaimer: I work at IndexTank.