RegEx performance in Objective-C vs Python - python

This question might seem vague, sorry. Does anybody have experience writing RegEx with Objective-C and Python? I am wondering about the performance of one vs the other? Which is faster in terms of 1. runtime speed, and 2. memory consumption? I have a Mac OS application that is always running in the background, and I'd like my app to index some text files that are being saved, and then save the result... I could write a regex method in my app in Obj-C, or I could potentially write a separate app using Perl or Python (just a beginner in Python).
(Thanks, I got some good info from some of you already. Boo to those who downvoted; I am here to learn, and I might have some stupid questions time to time - part of the deal.)

If you’re looking for raw speed, neither of those two would be a very good choice. For execution speed, you’d choose Perl. For how quickly you could code it up, either Python or Perl alike would easily beat the time to write it in Objective C, just as both would easily beat a Java solution. High-level languages that take less time to code up are always a win if all you’re measuring is time-to-solution compared with solutions that take many more lines of code.
As far as actual run-time performance goes, Perl’s regexes are written in very tightly coded C, and are known to be the fastest and most flexible regexes available. The regex optimizer does a lot of very clever things to the compiled regex program, such as applying an Aho–Corasick start-point optimization for finding the start of an alternation trie, running in O(1) time. Nobody else does that. Heck, I don’t think anybody else but Perl even bothers to optimize alternations into tries, which is the thing that takes you from O(n) to O(1), because the compiler spent more time doing something smart so that the interpreter runs much faster. Perl regexes also offer substantial improvements in debugging and profiling. They’re also more flexible than Python’s, but the debugging alone is enough to tip the balance.
The only exception on performance matters is with certain pathological patterns that degenerate when run under any recursive backtracker, whether Perl’s, Java’s, or Python’s. Those can be addressed by using the highly recommended RE2 library, written by Russ Cox, as a replacement plugin. I know it’s available as a transparent replacement regex engine for Perl, and I’m pretty sure I remember seeing that it was also available for Python, too.
On the other hand, if you really want to use Python but just want a more expressive and robust regex library, particularly one that is well-behaved on Unicode, then you want to use Matthew Barnett’s regex module, available for both Python2 and Python3. Besides conforming to tr18’s level-1 compliance requirements (that’s the standards doc on Unicode regexes), it also has all kinds of other clever features, some of which are completely sui generis. If you’re a regex connoisseur, it’s very much worth checking out.

In my Mac OS application I will be doing some text processing, and I was wondering if doing that in Python would be faster.
It will be faster in terms of development time, almost certainly. For nearly all software projects, development time dominates runtime as a measure of success.
If you mean runtime, then you're almost certainly doing premature optimization, unless you've shown that slow code will cause unbearable/noticeable user-interface slowdown.
Premature optimization is the root of all evil.
-- Donald Knuth

Related

Does the Python 3 interpreter have a JIT feature?

I found that when I ask something more to Python, python doesn't use my machine resource at 100% and it's not really fast, it's fast if compared to many other interpreted languages, but when compared to compiled languages i think that the difference is really remarkable.
Is it possible to speedup things with a Just In Time (JIT) compiler in Python 3?
Usually a JIT compiler is the only thing that can improve performances in interpreted languages, so i'm referring to this one, if other solutions are available i would love to accept new answers.
First off, Python 3(.x) is a language, for which there can be any number of implementations. Okay, to this day no implementation except CPython actually implements those versions of the language. But that will change (PyPy is catching up).
To answer the question you meant to ask: CPython, 3.x or otherwise, does not, never did, and likely never will, contain a JIT compiler. Some other Python implementations (PyPy natively, Jython and IronPython by re-using JIT compilers for the virtual machines they build on) do have a JIT compiler. And there is no reason their JIT compilers would stop working when they add Python 3 support.
But while I'm here, also let me address a misconception:
Usually a JIT compiler is the only thing that can improve performances in interpreted languages
This is not correct. A JIT compiler, in its most basic form, merely removes interpreter overhead, which accounts for some of the slow down you see, but not for the majority. A good JIT compiler also performs a host of optimizations which remove the overhead needed to implement numerous Python features in general (by detecting special cases which permit a more efficient implementation), prominent examples being dynamic typing, polymorphism, and various introspective features.
Just implementing a compiler does not help with that. You need very clever optimizations, most of which are only valid in very specific circumstances and for a limited time window. JIT compilers have it easy here, because they can generate specialized code at run time (it's their whole point), can analyze the program easier (and more accurately) by observing it as it runs, and can undo optimizations when they become invalid. They can also interact with interpreters, unlike ahead of time compilers, and often do it because it's a sensible design decision. I guess this is why they are linked to interpreters in people's minds, although they can and do exist independently.
There are also other approaches to make Python implementation faster, apart from optimizing the interpreter's code itself - for example, the HotPy (2) project. But those are currently in research or experimentation stage, and are yet to show their effectiveness (and maturity) w.r.t. real code.
And of course, a specific program's performance depends on the program itself much more than the language implementation. The language implementation only sets an upper bound for how fast you can make a sequence of operations. Generally, you can improve the program's performance much better simply by avoiding unnecessary work, i.e. by optimizing the program. This is true regardless of whether you run the program through an interpreter, a JIT compiler, or an ahead-of-time compiler. If you want something to be fast, don't go out of your way to get at a faster language implementation. There are applications which are infeasible with the overhead of interpretation and dynamicness, but they aren't as common as you'd think (and often, solved by calling into machine code-compiled code selectively).
The only Python implementation that has a JIT is PyPy. Byt - PyPy is both a Python 2 implementation and a Python 3 implementation.
The Numba project should work on Python 3. Although it is not exactly what you asked, you may want to give it a try:
https://github.com/numba/numba/blob/master/docs/source/doc/userguide.rst.
It does not support all Python syntax at this time.
You can try the pypy py3 branch, which is more or less python compatible, but the official CPython implementation has no JIT.
This will best be answered by some of the remarkable Python developer folks on this site.
Still I want to comment: When discussing speed of interpreted languages, I just love to point to a project hosted at this location: Computer Language Benchmarks Game
It's a site dedicated to running benchmarks. There are specified tasks to do. Anybody can submit a solution in his/her preferred language and then the tests compare the runtime of each solution. Solutions can be peer reviewed, are often further improved by others, and results are checked against the spec. In the long run this is the most fair benchmarking system to compare different languages.
As you can see from indicative summaries like this one, compiled languages are quite fast compared to interpreted languages. However, the difference is probably not so much in the exact type of compilation, it's the fact that Python (and the others in the graph slower than python) are fully dynamic. Objects can be modified on the fly. Types can be modified on the fly. So some type checking has to be deferred to runtime, instead of compile time.
So while you can argue about compiler benefits, you have to take into account that there are different features in different languages. And those features may come at an intrinsic price.
Finally, when talking about speed: Most often it's not the language and the perceived slowness of a language that's causing the issue, it's a bad algorithm. I never had to switch languages because one was too slow: When there's a speed issue in my code, I fix the algorithm. However, if there are time-consuming, computational intensive loops in your code it is usually worth the while to recompile those. A prominent example are libraries coded in C used by scripting languages (Perl XS libs, or e.g. numpy/scipy for Python, lapack/blas are examples of libs available with bindings for many scripting languages)
If you mean JIT as in Just in time compiler to a Bytecode representation then it has such a feature(since 2.2). If you mean JIT to machine code, then no. Yet the compilation to byte code provides a lot of performance improvement. If you want it to compile to machine code, then Pypy is the implementation you're looking for.
Note: pypy doesn't work with Python 3.x
If you are looking for speed improvements in a block of code, then you may want to have a look to rpythonic, that compiles down to C using pypy. It uses a decorator that converts it in a JIT for Python.

PyPy and CPython: are big performance increases planned?

While I know projects promising large speed gains can result in let downs, I don't see much in the way of a roadmap for speeding up CPython and/or PyPy.
Is there something planned that promises a huge boost in speed for the core interpreter (e.g. --with-computed-gotos) in either of them? How about their standard libraries (e.g. Decimal in C, IO in C)?
I know HotPy(2) has an outline of a plan for speeding CPython up, but it sounds like an one-man project without much traction in core CPython.
PyPy has some information about where performance isn't great, but I can find no big goals for speedup in the docs.
So, are there known targets that could bring big performance improvement for Python implementations?
I'll answer the part about PyPy. I can't speak for CPython, but I think there are performance improvements that are being worked on (don't quote me on this though).
There is no project plan, since it's really not working that way. All the major parts (like "JIT" or "Garbage Collection") has been essentially done, however that completely does not mean everything is fast. There are definitely things that are slow and we generally improve on a case by case basis - submit a bug report if you think something is too slow. I have quite a few performance improvements on my plate that would definitely help twisted, but I have no idea about others.
Big things that are being worked on that might be worth mentioning:
Improved frames, that should help recursion and function calls that are not inlined (for example that contain loops)
Better string implementations for various kinds of usages, like concatenation, slicing etc.
Faster tracing
More compact tuples and objects, storing unwrapped results
Can I promise when how or how much it'll speed up things? Absolutely not, but on average we manage to have 10-30% speed improvements release-to-release, which is usually every 4 months or so, so I guess some stuff will get faster, but without you giving me a crystal ball or a time machine, I won't tell you for sure.
Cheers,
fijal
Your comments belie a lot of confusion...
PyPy and Python have currently very different performance capabilities.
Pypy is currently more than 5x faster than CPython on average.
HotPy has nothing to do with CPython. It's a one-man project and it's a whole new VM (not yet released, so I can't say anything about it's performance).
At the moment, there's a lot of activity in the PyPy project and they are improving it day by day.
There's a numpy port in a very advanced stage of development, they are improving ctypes, Cython compatibility, and soon there will be a complete Python3 implementation.
I believe PyPy is currently on pair with the V8 JavaScript engine and similar projects in terms of performance.
If speed and Python is what you want, pay attention to this project.
The answer is that PyPy is the plan to speed up CPython. PyPy aims to be an extremely conformant python interpreter which is highly optimized. The project has collected together all of the benchmarks they could find, and runs all of them for each build of pypy, to ensure against performance regressions. Check it out: http://speed.pypy.org/
I believe that by the time that the performance of cpython won't cut it anymore (for web dev work), pypy will be completely ready for prime-time. Raymond Hettinger (a core python dev) has called PyPy "python with the optimizations turned on".

Next step after PHP: Perl or Python? [closed]

Closed. This question is opinion-based. It is not currently accepting answers.
Want to improve this question? Update the question so it can be answered with facts and citations by editing this post.
Closed 1 year ago.
Improve this question
It might seem it has been asked numerous times, but in fact it hasn't. I did my research, and now I'm eager to hear others' opinions.
I have experience with PHP 5, both with functional and object oriented programming methods. I created a few feature-minimalistic websites.
Professionals may agree about PHP not being a programming language that encourages good development habits. (I believe it's not the task of the tool, but this doesn't matter.) Furthermore, its performance is also controversial and often said to be poor compared to competitors.
In the 42nd podcast at Stack Overflow blog a developer from Poland asked what language he should learn in order to improve his skills. Jeff and Joel suggested that every one of them would help, altough there are specific ones that are better in some ways.
Despite they made some great points, it didn't help me that much.
From a beginner point of view, there are not one may not see (correction suggested by S. Lott) many differences between Perl & Python. I would like You to emphasize their strenghts and weaknesses and name a few unique services.
Of course, this wouldn't be fair as I could also check both of them. So here's my wishlist and requirements to help You help me.
First of all, I'd like to follow OOP structures and use it fundamentally. I partly planned a multiuser CMS using MySQL and XML, so the greater the implementations are, the better. Due to its foreseen nature, string manipulation will be used intensively.
If there aren't great differences, comparisons should probably mention syntax and other tiny details that don't matter in the first place.
So, here's my question: which one should I try first -- Perl || Python?
Conclusion
Both Perl and Python have their own fans, which is great. I'd like to say I'm grateful for all participation -- there is no trace of any flame war.
I accepted the most valued answer, although there are many great mini-articles below. As suggested more often, I will go with Python first. Then I'll try Perl later on. Let me see which one fits my mind better.
During the development of my special CMS, I'm going to ask more regarding programming doubts -- because developers now can count on each other! Thank you.
Edit: There were some people suggesting to choose Ruby or Java instead. Java has actually disappointed me. Maybe it has great features, maybe it hasn't. I wouldn't enjoy using it.
In addition, I was told to use Ruby. So far, most of the developers I communicate with have quite bad opinion about Ruby. I'll see it myself, but that's the last element on my priority list.
Perl is a very nice language and CPAN has a ton of mature modules that will save you a lot of time. Furthermore, Perl is really moving forwards nowadays with a lot of interesting projects (unlike what uninformed fanboys like to spread around). Even a Perl 6 implementation is by now releasing working Perl 6.
I you want to do OO, I would recommend Moose.
Honestly, the "majority" of my programming has been in Perl and PHP and I recently decided to do my latest project in Python, and I must admit it is very nice to program with. I was hesitant of the whole no curly braces thing as that's what I've always done, but it is really very clean. At the end of the day, though, you can make good web applications with all 3, but if you are dead-set on dropping PHP to try something new I would recommend Python and the Django framework.
I'd go with Perl. Not everyone will agree with me here, but it's a great language well suited to system administration work, and it'll expose you to some more functional programming constructs. It's a great language for learning how to use the smallest amount of code for a given task, as well.
For the usage scenario you mentioned though, I think PHP may be your best bet still. Python does have some great web frameworks, however, so if you just want to try out a new language for developing web applications, Python might be your bet.
I have no experience with Python. I vouch strongly to learn Perl, not out of attrition, but because there is a TON to learn in the platform. The key concepts of Perl are: Do What I Mean (DWIM) and There's More Than One Way To Do It (TMTOWTDI). This means, hypothetically there's often no wrong way to approach a problem if the problem is adequately solved.
Start with learning the base language of Perl, then extend yourself to learning the key Perl modules, like IO::File, DBI, HTML::Template, XML::LibXML, etc. etc. search.cpan.org will be your resource. perlmonks.org will be your guide. Just about everything useful to do will likely have a module published.
Keep in mind that Perl is a dynamic and loosely structured language. Perl is not the platform to enforce draconian OOP standards, but for good reason. You'll find the language extremely flexible.
Where is Perl used? System Admins use it heavily, as already mentioned. You can still do excellent web apps either by simple CGI or MVC framework.
I haven't worked with Python much, but I can tell why I didn't like about Perl when I used it.
OO support feels tacked on. OO in perl is very different from OO support in the other languages I've used (which include things like PHP, Java, and C#)
TMTOWTDI (There's More Than One Way To Do It). Good idea in theory, terrible idea in practice as it reduces code readability.
Perl uses a lot of magic symbols.
Perl doesn't support named function arguments, meaning that you need to dig into the #_ array to get the arguments passed to a function (or rather, a sub as perl doesn't have the function keyword). This means you'll see a lot of things like the example below (moved 'cause SO doesn't like code in numbered lists)
Having said all that, I'd look into Python. Unless you want to go with something heavier-weight like C++ or C#/Java.
Oh, before I forgot: I wanted to put an example for 4 above, but SO doesn't like putting code in numbered lists:
sub mySub {
#extremely common to see in Perl, as built-ins operators operate on the $_ scalar or #_ array implicitly
my $arg1 = shift;
my $arg2 = shift;
}
I recently made the step from Perl over to Python, after a couple of Perl-only years. Soon thereafter I discovered I had started to read through all kinds of Python-code just as it were any other easy to read text — something I've never done with Perl. Having to delve into third-party Perl code has always been kind of a nightmare for me, so this came as a very very nice surprise!
For me, this means Python-code is much easier to maintain, which on the long run makes Python much more attractive than Perl.
Python is clean and elegant, and the fact that LOTS of C APIs have been wrapped, gives you powerful hooks to much. I also like the "Zen of Python".
Beautiful is better than ugly.
Explicit is better than implicit.
Simple is better than complex.
Complex is better than complicated.
Flat is better than nested.
Sparse is better than dense.
Readability counts.
Special cases aren't special enough to break the rules.
Although practicality beats purity.
Errors should never pass silently.
Unless explicitly silenced.
In the face of ambiguity, refuse the temptation to guess.
There should be one-- and preferably only one --obvious way to
do it.
Although that way may not be obvious at first unless you're Dutch.
Now is better than never.
Although never is often better than right now.
If the implementation is hard to explain, it's a bad idea.
If the implementation is easy to explain, it may be a good idea.
Namespaces are one honking great idea -- let's do more of those!
As a Perl programmer, I would normally say Perl. But coming from PHP, I think Perl is too similar and you won't actually get that much out of it. (Not because there isn't a lot to learn, but you are likely to program in Perl using the same style as you program in PHP.)
I'd suggest something completely different: Haskell (suggested by Joel), Lisp, Lua, JavaScript or C. Any one of these would make you a better programmer by opening up new ways of looking at the world.
But there's no reason to stop learning PHP in the meantime.
For a good look at the dark side of these languages, I heartily recommend: What are five things you hate about your favorite language?
I suggest going through a beginner tutorial of each and decide for yourself which fits you better. You'll find you can do what you need to do in either:
Python Tutorial (Python Classes)
Perl Tutorial (Perl Classes)
(Couldn't find a single 'official' perl tutorial, feel free to suggest one)
In my experience python provides a cleaner, more straight-forward experience.
My issues with perl:
'use strict;', Taint, Warnings? - Ideally these shouldn't be needed.
Passing variables: #; vs. $, vs shift
Scoping my, local, ours? (The local defintion seems to particularly point out some confusion with perl, "You really probably want to be using my instead, because local isn't what most people think of as "local".".)
In general with my perl skills I still find my self referencing documentation for built-in features. Where as in python I find this less so. (I've worked in both roughly the same amount of time, but my general programming expereince has grown with time. In other words, I'd probably be a better perl programmer now)
If your a unix command line guru though, perl may come more naturally to you. Or, if your using it mainly as a replacement or extension to command line admin tasks, it may suit your needs fine. In my opinion perl is "faster on the draw" at the command line than python is.
Why isn't there Ruby on your list? Maybe you should give it a try.
"I'd like to follow OOP structure..." advocates for Python or, even more so if you're open, Ruby. On the other hand, in terms of existing libraries, the order is probably Perl > Python >> Ruby. In terms of your career, Perl on your resume is unlikely to make you stand out, while Python and Ruby may catch the eye of a hiring manager.
As a PHP programmer, you are probably going to see all 3 as somewhat "burdensome" to get a Web page up. All have good solutions for Web frameworks, but none is quite as focussed on rendering a Web page as is PHP.
I think that Python is quite likely to be a better choice for you than Perl. It has many good resources, a large community (although not as large as Perl, probably), "stands out" a little on a resume, and has a good reputation.
If those 2 are your only choices, I would choose Python.
Otherwise you should learn javascript.
No I mean really learn it...
If you won't be doing web development with this language, either of them would do. If you are, you may find that doing web development in perl is a bit more complicated, since all of the frameworks require more knowledge of the language.
You can do nice things in both, but my opinion is that perl allows more rapid development. Also, perl's regexes rock!
Every dynamic language is from same family. It does not matter Which is the tool you work with it matter how you do..
PHP VS PYTHON OT PERL OR RUBY? Stop it
As many comments mentioned python is cleaner well sometime whose curly brackets are use full to. You just have to practice.

Scripting language choice for initial performance [closed]

Closed. This question is opinion-based. It is not currently accepting answers.
Want to improve this question? Update the question so it can be answered with facts and citations by editing this post.
Closed 9 years ago.
Improve this question
I have a small lightweight application that is used as part of a larger solution. Currently it is written in C but I am looking to rewrite it using a cross-platform scripting language. The solution needs to run on Windows, Linux, Solaris, AIX and HP-UX.
The existing C application works fine but I want to have a single script I can maintain for all platforms. At the same time, I do not want to lose a lot of performance but am willing to lose some.
Startup cost of the script is very important. This script can be called anywhere from every minute to many times per second. As a consequence, keeping it's memory and startup time low are important.
So basically I'm looking for the best scripting languages that is:
Cross platform.
Capable of XML parsing and HTTP Posts.
Low memory and low startup time.
Possible choices include but are not limited to: bash/ksh + curl, Perl, Python and Ruby. What would you recommend for this type of a scenario?
Lua is a scripting language that meets your criteria. It's certainly the fastest and lowest memory scripting language available.
Because of your requirement for fast startup time and a calling frequency greater than 1Hz I'd recommend either staying with C and figuring out how to make it portable (not always as easy as a few ifdefs) or exploring the possibility of turning it into a service daemon that is always running. Of course this depends on how
Python can have lower startup times if you compile the module and run the .pyc file, but it is still generally considered slow. Perl, in my experience, in the fastest of the scripting languages so you might have good luck with a perl daemon.
You could also look at cross platform frameworks like gtk, wxWidgets and Qt. While they are targeted at GUIs they do have low level cross platform data types and network libraries that could make the job of using a fast C based application easier.
"called anywhere from every minute to many times per second. As a consequence, keeping it's memory and startup time low are important."
This doesn't sound like a script to me at all.
This sounds like a server handling requests that arrive from every minute to several times a second.
If it's a server, handling requests, start-up time doesn't mean as much as responsiveness. In which case, Python might work out well, and still keep performance up.
Rather than restarting, you're just processing another request. You get to keep as much state as you need to optimize performance.
When written properly, C should be platform independant and would only need a recompile for those different platforms. You might have to jump through some #ifdef hoops for the headers (not all systems use the same headers), but most normal (non-win32 API) calls are very portable.
For web access (which I presume you need as you mention bash+curl), you could take a look at libcurl, it's available for all the platforms you mentioned, and shouldn't be that hard to work with.
With execution time and memory cost in mind, I doubt you could go any faster than properly written C with any scripting language as you would lose at least some time on interpreting the script...
I concur with Lua: it is super-portable, it has XML libraries, either native or by binding C libraries like Expat, it has a good socket library (LuaSocket) plus, for complex stuff, some cURL bindings, and is well known for being very lightweight (often embedded in low memory devices), very fast (one of the fastest scripting languages), and powerful. And very easy to code!
It is coded in pure Ansi C, and lot of people claim it has one of the best C biding API (calling C routines from Lua, calling Lua code from C...).
If Low memory and low startup time are truly important you might want to consider doing the work to keep the C code cross platform, however I have found this is rarely necessary.
Personally I would use Ruby or Python for this type of job, they both make it very easy to make clear understandable code that others can maintain (or you can maintain after not looking at it for 6 months). If you have the control to do so I would also suggest getting the latest version of the interpreter, as both Ruby and Python have made notable improvements around performance recently.
It is a bit of a personal thing. Programming Ruby makes me happy, C code does not (nor bash scripting for anything non-trivial).
As others have suggested, daemonizing your script might be a good idea; that would reduce the startup time to virtually zero. Either have a small C wrapper that connects to your daemon and transmits the request back and forth, or have the daemon handle requests directly.
It's not clear if this is intended to handle HTTP requests; if so, Perl has a good HTTP server module, bindings to several different C-based XML parsers, and blazing fast string support. (If you don't want to daemonize, it has a good, full-featured CGI module; if you have full control over the server it's running on, you could also use mod_perl to implement your script as an Apache handler.) Ruby's strings are a little slower, but there are some really good backgrounding tools available for it. I'm not as familiar with Python, I'm afraid, so I can't really make any recommendations about it.
In general, though, I don't think you're as startup-time-constrained as you think you are. If the script is really being called several times a second, any decent interpreter on any decent operating system will be cached in memory, as will the source code of your script and its modules. Result: the startup times won't be as bad as you might think.
Dagny:~ brent$ time perl -MCGI -e0
real 0m0.610s
user 0m0.036s
sys 0m0.022s
Dagny:~ brent$ time perl -MCGI -e0
real 0m0.026s
user 0m0.020s
sys 0m0.006s
(The parameters to the Perl interpreter load the rather large CGI module and then execute the line of code '0;'.)
Python is good. I would also check out The Computer Languages Benchmarks Game website:
http://shootout.alioth.debian.org/
It might be worth spending a bit of time understanding the benchmarks (including numbers for startup times and memory usage). Lots of languages are compared such as Perl, Python, Lua and Ruby. You can also compare these languages against benchmarks in C.
I agree with others in that you should probably try to make this a more portable C app instead of porting it over to something else since any scripting language is going to introduce significant overhead from a startup perspective, have a much larger memory footprint, and will probably be much slower.
In my experience, Python is the most efficient of the three, followed by Perl and then Ruby with the difference between Perl and Ruby being particularly large in certain areas. If you really want to try porting this to a scripting language, I would put together a prototype in the language you are most comfortable with and see if it comes close to your requirements. If you don't have a preference, start with Python as it is easy to learn and use and if it is too slow with Python, Perl and Ruby probably won't be able to do any better.
Remember that if you choose Python, you can also extend it in C if the performance isn't great. Heck, you could probably even use some of the code you have right now. Just recompile it and wrap it using pyrex.
You can also do this fairly easily in Ruby, and in Perl (albeit with some more difficulty). Don't ask me about ways to do this though.
Can you instead have it be a long-running process and answer http or rpc requests?
This would satisfy the latency requirements in almost any scenario, but I don't know if that would break your memory footprint constraints.
At first sight, it's sounds like over engineering, as a rule of thumb I suggest fixing only when things are broken.
You have an already working application. Apparently you want to want to call the feature provided from few more several sources. It looks like the description of a service to me (maybe easier to maintain).
Finally you also mentioned that this is part of a larger solution, then you may want to reuse the language, facilities of the larger solutions. From the description you gave (xml+http) it seems quite an usual application that can be written in any generalist language (maybe a web container in java?).
Some libraries can help you to make your code portable:
Boost,
Qt
more details may trigger more ideas :)
Port your app to Ruby. If your app is too slow, profile it and rewrite the those parts in C.

Is Python good for big software projects (not web based)? [closed]

Closed. This question is opinion-based. It is not currently accepting answers.
Want to improve this question? Update the question so it can be answered with facts and citations by editing this post.
Closed 2 years ago.
Improve this question
Right now I'm developing mostly in C/C++, but I wrote some small utilities in Python to automatize some tasks and I really love it as language (especially the productivity).
Except for the performances (a problem that could be sometimes solved thanks to the ease of interfacing Python with C modules), do you think it is proper for production use in the development of stand-alone complex applications (think for example to a word processor or a graphic tool)?
What IDE would you suggest? The IDLE provided with Python is not enough even for small projects in my opinion.
We've used IronPython to build our flagship spreadsheet application (40kloc production code - and it's Python, which IMO means loc per feature is low) at Resolver Systems, so I'd definitely say it's ready for production use of complex apps.
There are two ways in which this might not be a useful answer to you :-)
We're using IronPython, not the more usual CPython. This gives us the huge advantage of being able to use .NET class libraries. I may be setting myself up for flaming here, but I would say that I've never really seen a CPython application that looked "professional" - so having access to the WinForms widget set was a huge win for us. IronPython also gives us the advantage of being able to easily drop into C# if we need a performance boost. (Though to be honest we have never needed to do that. All of our performance problems to date have been because we chose dumb algorithms rather than because the language was slow.) Using C# from IP is much easier than writing a C Extension for CPython.
We're an Extreme Programming shop, so we write tests before we write code. I would not write production code in a dynamic language without writing the tests first; the lack of a compile step needs to be covered by something, and as other people have pointed out, refactoring without it can be tough. (Greg Hewgill's answer suggests he's had the same problem. On the other hand, I don't think I would write - or especially refactor - production code in any language these days without writing the tests first - but YMMV.)
Re: the IDE - we've been pretty much fine with each person using their favourite text editor; if you prefer something a bit more heavyweight then WingIDE is pretty well-regarded.
You'll find mostly two answers to that – the religous one (Yes! Of course! It's the best language ever!) and the other religious one (you gotta be kidding me! Python? No... it's not mature enough). I will maybe skip the last religion (Python?! Use Ruby!). The truth, as always, is far from obvious.
Pros: it's easy, readable, batteries included, has lots of good libraries for pretty much everything. It's expressive and dynamic typing makes it more concise in many cases.
Cons: as a dynamic language, has way worse IDE support (proper syntax completion requires static typing, whether explicit in Java or inferred in SML), its object system is far from perfect (interfaces, anyone?) and it is easy to end up with messy code that has methods returning either int or boolean or object or some sort under unknown circumstances.
My take – I love Python for scripting, automation, tiny webapps and other simple well defined tasks. In my opinion it is by far the best dynamic language on the planet. That said, I would never use it any dynamically typed language to develop an application of substantial size.
Say – it would be fine to use it for Stack Overflow, which has three developers and I guess no more than 30k lines of code. For bigger things – first your development would be super fast, and then once team and codebase grow things are slowing down more than they would with Java or C#. You need to offset lack of compilation time checks by writing more unittests, refactorings get harder cause you never know what your refacoring broke until you run all tests or even the whole big app, etc.
Now – decide on how big your team is going to be and how big the app is supposed to be once it is done. If you have 5 or less people and the target size is roughly Stack Overflow, go ahead, write in Python. You will finish in no time and be happy with good codebase. But if you want to write second Google or Yahoo, you will be much better with C# or Java.
Side-note on C/C++ you have mentioned: if you are not writing performance critical software (say massive parallel raytracer that will run for three months rendering a film) or a very mission critical system (say Mars lander that will fly three years straight and has only one chance to land right or you lose $400mln) do not use it. For web apps, most desktop apps, most apps in general it is not a good choice. You will die debugging pointers and memory allocation in complex business logic.
In my opinion python is more than ready for developing complex applications. I see pythons strength more on the server side than writing graphical clients. But have a look at http://www.resolversystems.com/. They develop a whole spreadsheet in python using the .net ironpython port.
If you are familiar with eclipse have a look at pydev which provides auto-completion and debugging support for python with all the other eclipse goodies like svn support. The guy developing it has just been bought by aptana, so this will be solid choice for the future.
#Marcin
Cons: as a dynamic language, has way
worse IDE support (proper syntax
completion requires static typing,
whether explicit in Java or inferred
in SML),
You are right, that static analysis may not provide full syntax completion for dynamic languages, but I thing pydev gets the job done very well. Further more I have a different development style when programming python. I have always an ipython session open and with one F5 I do not only get the perfect completion from ipython, but object introspection and manipulation as well.
But if you want to write second Google
or Yahoo, you will be much better with
C# or Java.
Google just rewrote jaiku to work on top of App Engine, all in python. And as far as I know they use a lot of python inside google too.
I really like python, it's usually my language of choice these days for small (non-gui) stuff that I do on my own.
However, for some larger Python projects I've tackled, I'm finding that it's not quite the same as programming in say, C++. I was working on a language parser, and needed to represent an AST in Python. This is certainly within the scope of what Python can do, but I had a bit of trouble with some refactoring. I was changing the representation of my AST and changing methods and classes around a lot, and I found I missed the strong typing that would be available to me in a C++ solution. Python's duck typing was almost too flexible and I found myself adding a lot of assert code to try to check my types as the program ran. And then I couldn't really be sure that everything was properly typed unless I had 100% code coverage testing (which I didn't at the time).
Actually, that's another thing that I miss sometimes. It's possible to write syntactically correct code in Python that simply won't run. The compiler is incapable of telling you about it until it actually executes the code, so in infrequently-used code paths such as error handlers you can easily have unseen bugs lurking around. Even code that's as simple as printing an error message with a % format string can fail at runtime because of mismatched types.
I haven't used Python for any GUI stuff so I can't comment on that aspect.
Python is considered (among Python programmers :) to be a great language for rapid prototyping. There's not a lot of extraneous syntax getting in the way of your thought processes, so most of the work you do tends to go into the code. (There's far less idioms required to be involved in writing good Python code than in writing good C++.)
Given this, most Python (CPython) programmers ascribe to the "premature optimization is the root of all evil" philosophy. By writing high-level (and significantly slower) Python code, one can optimize the bottlenecks out using C/C++ bindings when your application is nearing completion. At this point it becomes more clear what your processor-intensive algorithms are through proper profiling. This way, you write most of the code in a very readable and maintainable manner while allowing for speedups down the road. You'll see several Python library modules written in C for this very reason.
Most graphics libraries in Python (i.e. wxPython) are just Python wrappers around C++ libraries anyway, so you're pretty much writing to a C++ backend.
To address your IDE question, SPE (Stani's Python Editor) is a good IDE that I've used and Eclipse with PyDev gets the job done as well. Both are OSS, so they're free to try!
[Edit] #Marcin: Have you had experience writing > 30k LOC in Python? It's also funny that you should mention Google's scalability concerns, since they're Python's biggest supporters! Also a small organization called NASA also uses Python frequently ;) see "One coder and 17,000 Lines of Code Later".
Nothing to add to the other answers, besides that if you choose python you must use something like pylint which nobody mentioned so far.
One way to judge what python is used for is to look at what products use python at the moment. This wikipedia page has a long list including various web frameworks, content management systems, version control systems, desktop apps and IDEs.
As it says here - "Some of the largest projects that use Python are the Zope application server, YouTube, and the original BitTorrent client. Large organizations that make use of Python include Google, Yahoo!, CERN and NASA. ITA uses Python for some of its components."
So in short, yes, it is "proper for production use in the development of stand-alone complex applications". So are many other languages, with various pros and cons. Which is the best language for your particular use case is too subjective to answer, so I won't try, but often the answer will be "the one your developers know best".
Refactoring is inevitable on larger codebases and the lack of static typing makes this much harder in python than in statically typed languages.
And as far as I know they use a lot of python inside google too.
Well i'd hope so, the maker of python still works at google if i'm not mistaken?
As for the use of Python, i think it's a great language for stand-alone apps. It's heavily used in a lot of Linux programs, and there are a few nice widget sets out there to aid in the development of GUI's.
Python is a delight to use. I use it routinely and also write a lot of code for work in C#. There are two drawbacks to writing UI code in Python. one is that there is not a single ui framework that is accepted by the majority of the community. when you write in c# the .NET runtime and class libraries are all meant to work together. With Python every UI library has at's own semantics which are often at odds with the pythonic mindset in which you are trying to write your program. I am not blaming the library writers. I've tried several libraries (wxwidgets, PythonWin[Wrapper around MFC], Tkinter), When doing so I often felt that I was writing code in a language other than Python (despite the fact that it was python) because the libraries aren't exactly pythonic they are a port from another language be it c, c++, tk.
So for me I will write UI code in .NET (for me C#) because of the IDE & the consistency of the libraries. But when I can I will write business logic in python because it is more clear and more fun.
I know I'm probably stating the obvious, but don't forget that the quality of the development team and their familiarity with the technology will have a major impact on your ability to deliver.
If you have a strong team, then it's probably not an issue if they're familiar. But if you have people who are more 9 to 5'rs who aren't familiar with the technology, they will need more support and you'd need to make a call if the productivity gains are worth whatever the cost of that support is.
I had only one python experience, my trash-cli project.
I know that probably some or all problems depends of my inexperience with python.
I found frustrating these things:
the difficult of finding a good IDE for free
the limited support to automatic refactoring
Moreover:
the need of introduce two level of grouping packages and modules confuses me.
it seems to me that there is not a widely adopted code naming convention
it seems to me that there are some standard library APIs docs that are incomplete
the fact that some standard libraries are not fully object oriented annoys me
Although some python coders tell me that they does not have these problems, or they say these are not problems.
Try Django or Pylons, write a simple app with both of them and then decide which one suits you best. There are others (like Turbogears or Werkzeug) but those are the most used.

Categories