Are there programmatic tools for Perl to Python conversion? [closed] - python

Closed. This question does not meet Stack Overflow guidelines. It is not currently accepting answers.
We don’t allow questions seeking recommendations for books, tools, software libraries, and more. You can edit the question so it can be answered with facts and citations.
Closed 8 years ago.
In my new job more people are using Python than Perl, and I have a very useful API that I wrote myself and I'd like to make available to my co-workers in Python.
I thought that a compiler that compiled Perl code into Python code would be really useful for such a task. Before trying to write something that parsed Perl (or at least, the subset of Perl that I've used in defining my API), I came across bridgekeeper from a consultancy.
It's almost certainly not worth the money for me to engage a consultancy to translate this API, but that's a really interesting tool.
Does anyone know of a compiler that will parse (or try to parse!) Perl5 code and compile it into Python? If there isn't such a thing, how should I start writing a simple compiler that parses my object-oriented Perl code and turns it into Python? Is there an ANTLR or YACC grammar that I can use as a starting point?
Edit: I found perl.y, which might be a starting point if I were to roll my own compiler.

James,
I recommend that you just rewrite the module in Python, for several reasons:
Parsing Perl is DARN HARD. Unless this is an important and desirable exercise for you, you'll find yourself spending much more time on the translation than on useful work.
By rewriting it, you'll have a great chance to practice Python. Learning is best done by doing, and having a task you really need done is a great boon.
Finally, Python and Perl have quite different philosophies. To get a more Pythonic API, it's best to just rewrite it in Python.

I think you should rewrite your code. The quality of the results of a parsing effort depends on your Perl coding style.
I think the quote below sums up the theoretical side very well.
From the Wikipedia article on Perl:
Perl has a Turing-complete grammar because parsing can be affected by run-time code executed during the compile phase.[25] Therefore, Perl cannot be parsed by a straight Lex/Yacc lexer/parser combination. Instead, the interpreter implements its own lexer, which coordinates with a modified GNU bison parser to resolve ambiguities in the language.
It is often said that "Only perl can parse Perl," meaning that only the Perl interpreter (perl) can parse the Perl language (Perl), but even this is not, in general, true. Because the Perl interpreter can simulate a Turing machine during its compile phase, it would need to decide the Halting Problem in order to complete parsing in every case. It's a long-standing result that the Halting Problem is undecidable, and therefore not even Perl can always parse Perl. Perl makes the unusual choice of giving the user access to its full programming power in its own compile phase. The cost in terms of theoretical purity is high, but practical inconvenience seems to be rare.
Other programs that undertake to parse Perl, such as source-code analyzers and auto-indenters, have to contend not only with ambiguous syntactic constructs but also with the undecidability of Perl parsing in the general case. Adam Kennedy's PPI project focused on parsing Perl code as a document (retaining its integrity as a document), instead of parsing Perl as executable code (which not even Perl itself can always do). It was Kennedy who first conjectured that, "parsing Perl suffers from the 'Halting Problem'."[26], and this was later proved.[27]

Starting in 5.10, you can compile perl with the experimental Misc Attribute Decoration enabled and set the PERL_XMLDUMP environment variable to a filename to get an XML dump of the parse tree (including comments - very helpful for language translators). Though as the doc says, this is a work in progress.

I never tried it and it seems unmaintained, but maybe PyPerl is an option?
How big is this API? If it really is this useful, why don't you rewrite it in Python? Writing an automatic converter will probably take longer than rewriting the API.
And even if you manage to automatically rewrite it, the resulting code probably won't be very pythonic anyway.
Be sure to check out the answers by weismat and eliben

As much as it might be fun to convert it to or rewrite it in python, I wouldn't make either of those my first choice. Then you'd be stuck with a forked code base. Any modifications you make will have to be duplicated.
Write some sort of wrapper for your API that you can access from outside of Perl. One possibility is a RESTful interface. Another, if you don't want to deal with networking issues, is to create a set of command line tools that access the API (possibly passing information as JSON). Then you can write an easy python library which accesses the wrapper API using httplib2 or subprocess (depending on how you've implemented the wrapper).
You'll still have to update the Python API whenever the interface changes, but now it's only for interface changes.
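As a rough sketch of the command-line variant, the Python side could look something like the following. It assumes the wrapper reads a JSON document on stdin and prints a JSON document on stdout; the function name call_wrapped_api and that calling convention are made up for illustration and are not part of any existing tool.

```python
import json
import subprocess


def call_wrapped_api(command, payload, timeout=30):
    """Send `payload` as JSON to a command-line wrapper and decode its JSON reply.

    `command` is the argv list of the wrapper. The JSON-in, JSON-out
    convention is an assumption of this sketch.
    """
    result = subprocess.run(
        command,
        input=json.dumps(payload),  # JSON request goes to the wrapper's stdin
        capture_output=True,        # collect stdout and stderr
        text=True,                  # work with str rather than bytes
        timeout=timeout,
        check=True,                 # raise CalledProcessError on a non-zero exit
    )
    return json.loads(result.stdout)
```

A call might then look like call_wrapped_api(["perl", "my_api_cli.pl", "lookup"], {"id": 42}), where my_api_cli.pl is a hypothetical script name.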

You could try writing a parser with PPI, dump it to some intermediary form and write Python mechanically from there. Hard, but doable. Useful? Er....
Or you could port your code to Perl 6 and wait for Pynie to be ready enough to allow direct calls from Python to Perl 6 within the same runtime! It's not that far away after all. Too bad Ponie's dead though.

https://perthon.sourceforge.net could probably work? While it is still in alpha, I see a lot of potential.

Related

What's the best tool to parse log files? [closed]

Closed 7 years ago.
I use grep to parse through my trading apps logs, but it's limited in the sense that I need to visually trawl through the output to see what happened etc.
I'm wondering if Perl is a better option? Any good resources to learn log and string parsing with Perl?
I'd also believe that Python would be good for this. Perl vs Python vs 'grep on linux'?
In the end, it really depends on how much semantics you want to identify, whether your logs fit common patterns, and what you want to do with the parsed data.
If you can use regular expressions to find what you need, you have tons of options. Perl is a popular language and has very convenient native RE facilities. I personally feel a lot more comfortable with Python and find that the little added hassle for doing REs is not significant.
If you want to do something smarter than RE matching, or want to have a lot of logic, you may be more comfortable with Python or even with Java/C++/etc. For instance, it is easy to read line-by-line in Python and then apply various predicate functions and reactions to matches, which is great if you have a ruleset you would like to apply.
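To make the ruleset idea concrete, here is a minimal sketch of reading lines and applying (predicate, reaction) pairs. The rules, the log wording ("ERROR", "timeout after ...ms") and the function names are invented for the example; a real ruleset would depend on your log format.

```python
import re


def make_rules(counts):
    """Build a hypothetical ruleset: each rule pairs a predicate with a reaction."""
    def count(key):
        def react(line):
            counts[key] = counts.get(key, 0) + 1
        return react

    return [
        # Plain substring test as a predicate.
        (lambda line: "ERROR" in line, count("errors")),
        # A compiled regex's search method also works as a predicate.
        (re.compile(r"timeout after \d+ms").search, count("timeouts")),
    ]


def apply_rules(lines, rules):
    """Run every rule against every line; a line may trigger several reactions."""
    for line in lines:
        for predicate, reaction in rules:
            if predicate(line):
                reaction(line)
```

Because an open file is iterable line by line, apply_rules(open("app.log"), make_rules(counts)) would process a file without loading it all into memory (app.log being a placeholder name).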
All scripting languages are good candidates: Perl, Python, Ruby, PHP, and AWK are all fine for this. Using any one of these languages is better than peering at the logs by eye, even when the logs are still small.
Wearing Ruby Slippers to Work is an example of doing this in Ruby, written in Why's inimitable style. Here's a basic example in Perl. I suggest you choose one of these languages and start cracking.
A big advantage Perl has over Python when parsing text is the ability to use regular expressions directly as part of the language syntax. For example:
if ($line =~ m/^Regex/) {
    # ... code goes here
}
Perl also assigns capture groups directly to $1, $2, etc, making it very simple to work with. Depending on the format and structure of the logfiles you're trying to parse, this could prove to be quite useful (or, if it can be parsed as a fixed width file or using simpler techniques, not very useful at all).
It's all just syntactic sugar, really, and other languages also allow you to use regular expressions and capture groups (indeed, the linked article shows how to do it in Python). You just have to write a bit more code and pass around objects to do it.
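For comparison, a rough Python counterpart of the Perl snippet might look like this. The log layout (date, level, message) and the names LINE_RE and parse_line are assumptions made up for the example.

```python
import re

# Hypothetical log layout: "2009-03-01 ERROR some message".
LINE_RE = re.compile(r"^(\d{4}-\d{2}-\d{2}) (\w+) (.*)$")


def parse_line(line):
    """Return the captured fields as a dict, or None if the line does not match."""
    match = LINE_RE.match(line)
    if match is None:
        return None
    # groups() plays the role of Perl's $1, $2, $3.
    date, level, message = match.groups()
    return {"date": date, "level": level, "message": message}
```

The match object is the extra thing you pass around; the captures live on it rather than in global variables.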
There's a Perl program called Log_Analysis that does a lot of analysis and preprocessing for you.
Learning a programming language will let you take your log analysis abilities to another level.
Any dynamic or "scripting" language like Perl, Ruby or Python will do the job. What you should use really depends on external factors. Among the things you should consider:
Does your workplace already use a suitable language?
Do you know anyone who can mentor you in a suitable language?
Try each language a little and see which one fits you better.
Personally, for the above task I would use Perl. YMMV.
Several reasons to like Perl:
Powerful one-liners - if you need to do a real quick, one-off job, Perl offers some really great short-cuts. See perlrun -n for one example
Multi-paradigm language - Perl has support for imperative, functional and object-oriented programming methodologies.
Sigils - those leading punctuation characters on variables like $foo or @bar. They are a bit like Hungarian notation without being so annoying.
Moose - an incredible new OOP system that provides powerful new OO techniques for code composition and reuse.
Strictures - the use strict pragma catches many errors that other dynamic languages gloss over at compile time. I miss it terribly when I use Python or PHP.
Self-discipline - Perl gives you the freedom to write and do what you want, when you want. This means that you have to learn to write clean code or you will hurt. Fortunately, there are tools to help a beginner. Perl::Critic does lint-like analysis of code for best practices.
I find this list invaluable when dealing with any job that requires one to parse with python.
I wouldn't use Perl for parsing large or complex logs, mainly for readability reasons. (Perl's speed has also been lacking for me on big jobs, but that's probably down to my Perl code, which I must improve.)
However if grep suits your needs perfectly for now - there really is no reason to get bogged down in writing a full blown parser. Simplest solution is usually the best, and grep is a fine tool.
Another possible interpretation of your question is "Are there any tools that make log monitoring easier?", and to answer that I would suggest you have a look at Splunk or maybe Log4view.
On Linux, you can use just the shell (bash, ksh, etc.) to parse log files if they are not too big. The other tools to go for are usually grep and awk. For more programming power, awk is usually used. If you have big files to parse, try awk.
Of course, Perl or Python or practically any other languages with file reading and string manipulation capabilities can be used as well.
Try Nagios Log Monitoring.
The reason this tool is the best for your purpose is this:
It requires no installation of foreign packages. Which means, there's no need to install any perl dependencies or any silly packages that may make you nervous.
There is little to no learning curve. You don't need to learn any programming languages to use it. All you need to do is know exactly what you want to do with the logs you have in mind, and read the pdf that comes with the tool.
If the log you want to parse is in a syslog format, you can use a command like this:
./NagiosLogMonitor 10.20.40.50:5444 logrobot autofig /opt/jboss/server.log 60m 'INFO' '.' 1 2 -show
Even if your log is not in a recognized format, it can still be monitored efficiently with the following command:
./NagiosLogMonitor 10.20.40.50:5444 logrobot autonda /opt/jboss/server.log 60m 'INFO' '.' 1 2 jbosslogs -ndshow
To parse a log for specific strings, replace the 'INFO' string with the patterns you want to watch for in the log. If you want to search for multiple patterns, specify them like this 'INFO|ERROR|fatal'.
If efficiency and simplicity (and safe installs) are important to you, this Nagios tool is the way to go.

Next step after PHP: Perl or Python? [closed]

Closed. This question is opinion-based. It is not currently accepting answers.
Want to improve this question? Update the question so it can be answered with facts and citations by editing this post.
Closed 1 year ago.
It might seem it has been asked numerous times, but in fact it hasn't. I did my research, and now I'm eager to hear others' opinions.
I have experience with PHP 5, both with functional and object oriented programming methods. I created a few feature-minimalistic websites.
Professionals may agree about PHP not being a programming language that encourages good development habits. (I believe it's not the task of the tool, but this doesn't matter.) Furthermore, its performance is also controversial and often said to be poor compared to competitors.
In the 42nd podcast at the Stack Overflow blog, a developer from Poland asked what language he should learn in order to improve his skills. Jeff and Joel suggested that every one of them would help, although there are specific ones that are better in some ways.
Although they made some great points, it didn't help me that much.
From a beginner's point of view, one may not see (correction suggested by S. Lott) many differences between Perl & Python. I would like You to emphasize their strengths and weaknesses and name a few unique services.
Of course, this wouldn't be fair as I could also check both of them. So here's my wishlist and requirements to help You help me.
First of all, I'd like to follow OOP structures and use it fundamentally. I partly planned a multiuser CMS using MySQL and XML, so the greater the implementations are, the better. Due to its foreseen nature, string manipulation will be used intensively.
If there aren't great differences, comparisons should probably mention syntax and other tiny details that don't matter in the first place.
So, here's my question: which one should I try first -- Perl || Python?
Conclusion
Both Perl and Python have their own fans, which is great. I'd like to say I'm grateful for all participation -- there is no trace of any flame war.
I accepted the most valued answer, although there are many great mini-articles below. As suggested more often, I will go with Python first. Then I'll try Perl later on. Let me see which one fits my mind better.
During the development of my special CMS, I'm going to ask more regarding programming doubts -- because developers now can count on each other! Thank you.
Edit: There were some people suggesting to choose Ruby or Java instead. Java has actually disappointed me. Maybe it has great features, maybe it hasn't. I wouldn't enjoy using it.
In addition, I was told to use Ruby. So far, most of the developers I communicate with have quite bad opinion about Ruby. I'll see it myself, but that's the last element on my priority list.
Perl is a very nice language and CPAN has a ton of mature modules that will save you a lot of time. Furthermore, Perl is really moving forwards nowadays with a lot of interesting projects (unlike what uninformed fanboys like to spread around). Even a Perl 6 implementation is by now releasing working Perl 6.
If you want to do OO, I would recommend Moose.
Honestly, the "majority" of my programming has been in Perl and PHP and I recently decided to do my latest project in Python, and I must admit it is very nice to program with. I was hesitant of the whole no curly braces thing as that's what I've always done, but it is really very clean. At the end of the day, though, you can make good web applications with all 3, but if you are dead-set on dropping PHP to try something new I would recommend Python and the Django framework.
I'd go with Perl. Not everyone will agree with me here, but it's a great language well suited to system administration work, and it'll expose you to some more functional programming constructs. It's a great language for learning how to use the smallest amount of code for a given task, as well.
For the usage scenario you mentioned though, I think PHP may be your best bet still. Python does have some great web frameworks, however, so if you just want to try out a new language for developing web applications, Python might be your bet.
I have no experience with Python. I vouch strongly to learn Perl, not out of attrition, but because there is a TON to learn in the platform. The key concepts of Perl are: Do What I Mean (DWIM) and There's More Than One Way To Do It (TMTOWTDI). This means, hypothetically there's often no wrong way to approach a problem if the problem is adequately solved.
Start with learning the base language of Perl, then extend yourself to learning the key Perl modules, like IO::File, DBI, HTML::Template, XML::LibXML, etc. etc. search.cpan.org will be your resource. perlmonks.org will be your guide. Just about everything useful to do will likely have a module published.
Keep in mind that Perl is a dynamic and loosely structured language. Perl is not the platform to enforce draconian OOP standards, but for good reason. You'll find the language extremely flexible.
Where is Perl used? System Admins use it heavily, as already mentioned. You can still do excellent web apps either by simple CGI or MVC framework.
I haven't worked with Python much, but I can tell you what I didn't like about Perl when I used it.
1. OO support feels tacked on. OO in Perl is very different from OO support in the other languages I've used (which include things like PHP, Java, and C#).
2. TMTOWTDI (There's More Than One Way To Do It). Good idea in theory, terrible idea in practice as it reduces code readability.
3. Perl uses a lot of magic symbols.
4. Perl doesn't support named function arguments, meaning that you need to dig into the @_ array to get the arguments passed to a function (or rather, a sub, as Perl doesn't have the function keyword). This means you'll see a lot of things like the example below (moved 'cause SO doesn't like code in numbered lists).
Having said all that, I'd look into Python. Unless you want to go with something heavier-weight like C++ or C#/Java.
Oh, before I forgot: I wanted to put an example for 4 above, but SO doesn't like putting code in numbered lists:
sub mySub {
    # extremely common to see in Perl, as built-in operators operate on the $_ scalar or @_ array implicitly
    my $arg1 = shift;
    my $arg2 = shift;
}
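For contrast, a minimal sketch of the same sub in Python, where the parameters are declared by name in the signature. The name my_sub, the default value and the return value are invented for the example.

```python
def my_sub(arg1, arg2="default"):
    """Parameters are named in the signature; nothing is shifted off an array."""
    return f"{arg1}:{arg2}"


# Arguments can be passed by position or by name, in any order when named.
print(my_sub("a"))
print(my_sub(arg2="b", arg1="a"))
```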
I recently made the step from Perl over to Python, after a couple of Perl-only years. Soon thereafter I discovered I had started to read through all kinds of Python-code just as it were any other easy to read text — something I've never done with Perl. Having to delve into third-party Perl code has always been kind of a nightmare for me, so this came as a very very nice surprise!
For me, this means Python code is much easier to maintain, which in the long run makes Python much more attractive than Perl.
Python is clean and elegant, and the fact that LOTS of C APIs have been wrapped gives you powerful hooks into a great deal. I also like the "Zen of Python".
Beautiful is better than ugly.
Explicit is better than implicit.
Simple is better than complex.
Complex is better than complicated.
Flat is better than nested.
Sparse is better than dense.
Readability counts.
Special cases aren't special enough to break the rules.
Although practicality beats purity.
Errors should never pass silently.
Unless explicitly silenced.
In the face of ambiguity, refuse the temptation to guess.
There should be one-- and preferably only one --obvious way to do it.
Although that way may not be obvious at first unless you're Dutch.
Now is better than never.
Although never is often better than right now.
If the implementation is hard to explain, it's a bad idea.
If the implementation is easy to explain, it may be a good idea.
Namespaces are one honking great idea -- let's do more of those!
As a Perl programmer, I would normally say Perl. But coming from PHP, I think Perl is too similar and you won't actually get that much out of it. (Not because there isn't a lot to learn, but you are likely to program in Perl using the same style as you program in PHP.)
I'd suggest something completely different: Haskell (suggested by Joel), Lisp, Lua, JavaScript or C. Any one of these would make you a better programmer by opening up new ways of looking at the world.
But there's no reason to stop learning PHP in the meantime.
For a good look at the dark side of these languages, I heartily recommend: What are five things you hate about your favorite language?
I suggest going through a beginner tutorial of each and decide for yourself which fits you better. You'll find you can do what you need to do in either:
Python Tutorial (Python Classes)
Perl Tutorial (Perl Classes)
(Couldn't find a single 'official' perl tutorial, feel free to suggest one)
In my experience python provides a cleaner, more straight-forward experience.
My issues with perl:
'use strict;', Taint, Warnings? - Ideally these shouldn't be needed.
Passing variables: @_ vs. $_ vs. shift
Scoping: my, local, our? (The local definition seems to particularly point out some confusion with Perl: "You really probably want to be using my instead, because local isn't what most people think of as "local".")
In general, with my Perl skills I still find myself referencing documentation for built-in features, whereas in Python I find this less so. (I've worked in both for roughly the same amount of time, but my general programming experience has grown with time. In other words, I'd probably be a better Perl programmer now.)
If you're a Unix command line guru, though, Perl may come more naturally to you. Or, if you're using it mainly as a replacement or extension to command line admin tasks, it may suit your needs fine. In my opinion Perl is "faster on the draw" at the command line than Python is.
Why isn't there Ruby on your list? Maybe you should give it a try.
"I'd like to follow OOP structure..." advocates for Python or, even more so if you're open, Ruby. On the other hand, in terms of existing libraries, the order is probably Perl > Python >> Ruby. In terms of your career, Perl on your resume is unlikely to make you stand out, while Python and Ruby may catch the eye of a hiring manager.
As a PHP programmer, you are probably going to see all 3 as somewhat "burdensome" to get a Web page up. All have good solutions for Web frameworks, but none is quite as focussed on rendering a Web page as is PHP.
I think that Python is quite likely to be a better choice for you than Perl. It has many good resources, a large community (although not as large as Perl, probably), "stands out" a little on a resume, and has a good reputation.
If those 2 are your only choices, I would choose Python.
Otherwise you should learn javascript.
No I mean really learn it...
If you won't be doing web development with this language, either of them would do. If you are, you may find that doing web development in perl is a bit more complicated, since all of the frameworks require more knowledge of the language.
You can do nice things in both, but my opinion is that perl allows more rapid development. Also, perl's regexes rock!
Every dynamic language is from the same family. It does not matter which tool you work with; what matters is how you use it.
PHP vs. Python or Perl or Ruby? Stop it.
As many comments mentioned, Python is cleaner, but sometimes those curly brackets are useful too. You just have to practice.

Is there a library that will detect the source code language of a block of code? [closed]

Closed. This question does not meet Stack Overflow guidelines. It is not currently accepting answers.
Questions asking us to recommend or find a tool, library or favorite off-site resource are off-topic for Stack Overflow as they tend to attract opinionated answers and spam. Instead, describe the problem and what has been done so far to solve it.
Closed 8 years ago.
Writing a python script and it needs to find out what language a block of code is written in. I could easily write this myself, but I'd like to know if a solution already exists.
Pygments is insufficient and unreliable.
Pygments can guess too. Here is an example from the documentation:
>>> from pygments.lexers import guess_lexer, guess_lexer_for_filename
>>> guess_lexer('#!/usr/bin/python\nprint "Hello World!"')
<pygments.lexers.PythonLexer>
>>> guess_lexer_for_filename('test.py', 'print "Hello World!"')
<pygments.lexers.PythonLexer>
I guess you should try what this very site uses: google-code-prettify (from this question)
[EDIT]J.F. Sebastian pointed me to Pygments (see this answer)
This can be a little difficult to do reliably. For example, what language is the following:
print("blah");
The most reliable way (aside from having the user select the correct language, of course) is to check whether the first line starts with #! ("hashbang"): whatever comes after it is the interpreter for the scripting language.
That will work reliably for a lot of scripting languages (including python, shell scripting, perl, ruby etc etc..), but not for compiled languages..
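A minimal sketch of that hashbang check, assuming the code block arrives as a string. The function name is made up, and it deliberately ignores edge cases such as extra options passed to env.

```python
def interpreter_from_shebang(source):
    """Return the interpreter named on a leading "#!" line, or None if absent."""
    lines = source.splitlines()
    if not lines or not lines[0].startswith("#!"):
        return None
    parts = lines[0][2:].split()
    if not parts:
        return None
    # Keep only the program name, e.g. "/usr/bin/python" -> "python".
    name = parts[0].rsplit("/", 1)[-1]
    # "#!/usr/bin/env perl" names the real interpreter in the next word.
    if name == "env" and len(parts) > 1:
        name = parts[1]
    return name
```

Note that the result is an interpreter name such as "python" or "sh", which still has to be mapped to a language.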
You could look for unique syntax stylings, or specific keywords and weight each one towards a specific language. For example $#somevar is probably Perl. somevar.each do |another| ..... end is probably ruby.. but this would end up being a lot of work, and will not always work (especially with short code blocks)
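The weighting idea could be sketched roughly as follows. The patterns and weights in HINTS are invented for illustration and are nowhere near a complete or tuned set; this is not how any particular library does it.

```python
import re

# Hypothetical hints: (regex, weight) pairs per language.
HINTS = {
    "perl": [(r"\$#\w+", 3), (r"\bmy\s+[\$@%]\w+", 2)],
    "ruby": [(r"\.each\s+do\s+\|\w+\|", 3), (r"^\s*end\s*$", 1)],
    "python": [(r"^\s*def\s+\w+\(.*\):", 3), (r"^\s*import\s+\w+", 1)],
}


def guess_language(code):
    """Return the language with the highest weighted hint score, or None."""
    scores = {}
    for language, hints in HINTS.items():
        scores[language] = sum(
            weight * len(re.findall(pattern, code, re.MULTILINE))
            for pattern, weight in hints
        )
    best = max(scores, key=scores.get)
    # No hint matched at all: refuse to guess.
    return best if scores[best] > 0 else None
```

Short snippets like print("blah"); match none of these hints and come back as None, which is the failure mode described above.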
The other obvious way is to use the file-extension. If it's *.pl it's probably Perl code..
What are you trying to achieve? If you want to syntax highlight, look at what google-code-prettify does - basically a reasonably intelligent, generic syntax highlighter..
In the ambiguous example above, print is probably a statement or function name, "blah" is probably a string. If you highlight those two differently, you've successfully highlighted a lot of different languages, without having to detect which one it actually is.. but that may not always work, depending on the task..
Ohcount has been developed for this exactly:
http://labs.ohloh.net/ohcount
They are using it at www.ohloh.net to count the contribution of people in languages.
The bad news is that it is coded in ruby, but I am sure that you can integrate it one way or the other in python.
Since you asked this question, GitHub have released the code they use to detect programming languages, Linguist. In my experience, GitHub is very accurate.
Language detection
Linguist defines the list of all languages known to GitHub in a yaml file. In order for a file to be highlighted, a language and lexer must be defined there.
Most languages are detected by their file extension. This is the fastest and most common situation.
For disambiguating between files with common extensions, we use a bayesian classifier. For an example, this helps us tell the difference between .h files which could be either C, C++, or Obj-C.
Ruby gem: http://rubygems.org/gems/github-linguist
If you can't use Ruby for whatever reason, the logic is simple enough to port https://github.com/github/linguist/blob/master/lib/linguist/language.rb
Vim uses a bunch of interesting tests and regular expressions to look for certain file formats. You can look at the vim instruction file at vim/vim71/filetype.vim, or here online.
what language a block of code is written in
What are your alternatives, among what languages? There is no way to determine this universally. But if you narrow your focus there is probably a tool somewhere
You can check highlight.js which automatically highlights the code block, they say they are using some kind of heuristic methods to accomplish this http://softwaremaniacs.org/soft/highlight/en/
As others have said, Pygments will be your best bet.

Scripting language choice for initial performance [closed]

Closed 9 years ago.
I have a small lightweight application that is used as part of a larger solution. Currently it is written in C but I am looking to rewrite it using a cross-platform scripting language. The solution needs to run on Windows, Linux, Solaris, AIX and HP-UX.
The existing C application works fine but I want to have a single script I can maintain for all platforms. At the same time, I do not want to lose a lot of performance but am willing to lose some.
Startup cost of the script is very important. This script can be called anywhere from every minute to many times per second. As a consequence, keeping its memory and startup time low is important.
So basically I'm looking for the best scripting language that is:
Cross platform.
Capable of XML parsing and HTTP Posts.
Low memory and low startup time.
Possible choices include but are not limited to: bash/ksh + curl, Perl, Python and Ruby. What would you recommend for this type of a scenario?
Lua is a scripting language that meets your criteria. It's certainly the fastest and lowest memory scripting language available.
Because of your requirement for fast startup time and a calling frequency greater than 1Hz I'd recommend either staying with C and figuring out how to make it portable (not always as easy as a few ifdefs) or exploring the possibility of turning it into a service daemon that is always running. Of course this depends on how
Python can have lower startup times if you compile the module and run the .pyc file, but it is still generally considered slow. Perl, in my experience, is the fastest of the scripting languages, so you might have good luck with a Perl daemon.
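For the .pyc route, a small sketch using the standard py_compile module. The helper name precompile is made up. Python already caches bytecode for imported modules on its own, so this mostly matters for the top-level script, which is not cached when run directly.

```python
import py_compile


def precompile(source_path):
    """Byte-compile `source_path` ahead of time and return the .pyc path."""
    # doraise=True turns compile errors into exceptions instead of stderr output.
    return py_compile.compile(source_path, doraise=True)
```

The returned path can then be handed to the interpreter in place of the source file. How much startup time this actually saves is worth measuring on your own platforms.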
You could also look at cross platform frameworks like gtk, wxWidgets and Qt. While they are targeted at GUIs they do have low level cross platform data types and network libraries that could make the job of using a fast C based application easier.
"called anywhere from every minute to many times per second. As a consequence, keeping it's memory and startup time low are important."
This doesn't sound like a script to me at all.
This sounds like a server handling requests that arrive from every minute to several times a second.
If it's a server, handling requests, start-up time doesn't mean as much as responsiveness. In which case, Python might work out well, and still keep performance up.
Rather than restarting, you're just processing another request. You get to keep as much state as you need to optimize performance.
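A bare-bones sketch of that shape using only the standard library. The STATE dict, the process function and the port are placeholders, and the real XML handling would go where process is.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer

# State that survives between requests, unlike a freshly started script.
STATE = {"requests": 0}


def process(body):
    """Placeholder for the real work; here it just counts requests."""
    STATE["requests"] += 1
    return b"request %d: %d bytes\n" % (STATE["requests"], len(body))


class Handler(BaseHTTPRequestHandler):
    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        reply = process(self.rfile.read(length))
        self.send_response(200)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(reply)))
        self.end_headers()
        self.wfile.write(reply)


def serve(host="127.0.0.1", port=8000):
    """Handle requests one at a time until the process is stopped."""
    HTTPServer((host, port), Handler).serve_forever()
```

Calling serve() starts the loop; the interpreter start-up cost is then paid once rather than on every call.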
When written properly, C should be platform independent and would only need a recompile for those different platforms. You might have to jump through some #ifdef hoops for the headers (not all systems use the same headers), but most normal (non-win32 API) calls are very portable.
For web access (which I presume you need as you mention bash+curl), you could take a look at libcurl, it's available for all the platforms you mentioned, and shouldn't be that hard to work with.
With execution time and memory cost in mind, I doubt you could go any faster than properly written C with any scripting language as you would lose at least some time on interpreting the script...
I concur with Lua: it is super-portable, it has XML libraries, either native or by binding C libraries like Expat, it has a good socket library (LuaSocket) plus, for complex stuff, some cURL bindings, and is well known for being very lightweight (often embedded in low memory devices), very fast (one of the fastest scripting languages), and powerful. And very easy to code!
It is coded in pure ANSI C, and a lot of people claim it has one of the best C binding APIs (calling C routines from Lua, calling Lua code from C...).
If Low memory and low startup time are truly important you might want to consider doing the work to keep the C code cross platform, however I have found this is rarely necessary.
Personally I would use Ruby or Python for this type of job, they both make it very easy to make clear understandable code that others can maintain (or you can maintain after not looking at it for 6 months). If you have the control to do so I would also suggest getting the latest version of the interpreter, as both Ruby and Python have made notable improvements around performance recently.
It is a bit of a personal thing. Programming Ruby makes me happy, C code does not (nor bash scripting for anything non-trivial).
As others have suggested, daemonizing your script might be a good idea; that would reduce the startup time to virtually zero. Either have a small C wrapper that connects to your daemon and transmits the request back and forth, or have the daemon handle requests directly.
It's not clear if this is intended to handle HTTP requests; if so, Perl has a good HTTP server module, bindings to several different C-based XML parsers, and blazing fast string support. (If you don't want to daemonize, it has a good, full-featured CGI module; if you have full control over the server it's running on, you could also use mod_perl to implement your script as an Apache handler.) Ruby's strings are a little slower, but there are some really good backgrounding tools available for it. I'm not as familiar with Python, I'm afraid, so I can't really make any recommendations about it.
In general, though, I don't think you're as startup-time-constrained as you think you are. If the script is really being called several times a second, any decent interpreter on any decent operating system will be cached in memory, as will the source code of your script and its modules. Result: the startup times won't be as bad as you might think.
Dagny:~ brent$ time perl -MCGI -e0
real 0m0.610s
user 0m0.036s
sys 0m0.022s
Dagny:~ brent$ time perl -MCGI -e0
real 0m0.026s
user 0m0.020s
sys 0m0.006s
(The parameters to the Perl interpreter load the rather large CGI module and then execute the line of code '0;'.)
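The same kind of measurement can be scripted rather than run by hand. The following is only a rough sketch, assuming you want to time a Python interpreter instead of Perl; the helper name `time_startup` is made up for illustration. The first launch is only noticeably slower when the interpreter is not already cached by the operating system.

```python
import subprocess
import sys
import time


def time_startup(argv, runs=5):
    """Return the wall-clock seconds taken by each of `runs` launches of argv."""
    timings = []
    for _ in range(runs):
        start = time.perf_counter()
        subprocess.run(argv, check=True)  # wait for the child process to exit
        timings.append(time.perf_counter() - start)
    return timings


if __name__ == "__main__":
    # Launch the current Python interpreter with a do-nothing program,
    # the equivalent of `perl -e0` in the transcript above.
    timings = time_startup([sys.executable, "-c", "pass"])
    print("first run:      %.3fs" % timings[0])
    print("best later run: %.3fs" % min(timings[1:]))
```

Swapping the argument list for your real script gives a quick idea of whether startup cost matters at your call rate.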
Python is good. I would also check out The Computer Languages Benchmarks Game website:
http://shootout.alioth.debian.org/
It might be worth spending a bit of time understanding the benchmarks (including numbers for startup times and memory usage). Lots of languages are compared such as Perl, Python, Lua and Ruby. You can also compare these languages against benchmarks in C.
I agree with others in that you should probably try to make this a more portable C app instead of porting it over to something else since any scripting language is going to introduce significant overhead from a startup perspective, have a much larger memory footprint, and will probably be much slower.
In my experience, Python is the most efficient of the three, followed by Perl and then Ruby with the difference between Perl and Ruby being particularly large in certain areas. If you really want to try porting this to a scripting language, I would put together a prototype in the language you are most comfortable with and see if it comes close to your requirements. If you don't have a preference, start with Python as it is easy to learn and use and if it is too slow with Python, Perl and Ruby probably won't be able to do any better.
Remember that if you choose Python, you can also extend it in C if the performance isn't great. Heck, you could probably even use some of the code you have right now. Just recompile it and wrap it using pyrex.
You can also do this fairly easily in Ruby, and in Perl (albeit with some more difficulty). Don't ask me about ways to do this though.
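For the Python case, here is a minimal sketch of calling existing C code. Note that it uses the standard library's `ctypes` module rather than Pyrex, and it calls `strlen` from the system C library only as a stand-in for your own compiled code. It assumes a POSIX system where the C library can be located; the wrapper name `c_strlen` is hypothetical.

```python
import ctypes
import ctypes.util

# Locate the C runtime. Assumption: a POSIX system; if find_library returns
# None, CDLL(None) falls back to the symbols already loaded in this process.
libc = ctypes.CDLL(ctypes.util.find_library("c"))

# Declare the C signature: size_t strlen(const char *s);
libc.strlen.argtypes = [ctypes.c_char_p]
libc.strlen.restype = ctypes.c_size_t


def c_strlen(text):
    """Return the byte length of text as computed by the C library."""
    return libc.strlen(text.encode("utf-8"))


print(c_strlen("hello"))
```

For your own code you would build it as a shared library and pass its path to `ctypes.CDLL` instead.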
Can you instead have it be a long-running process and answer HTTP or RPC requests?
This would satisfy the latency requirements in almost any scenario, but I don't know if that would break your memory footprint constraints.
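As a rough illustration of the long-running-process idea, here is a sketch using Python's standard `http.server` module. The `CACHE` dictionary, `Handler` class and `make_server` helper are hypothetical names standing in for whatever state and logic your real script has.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer

# State that is set up once at startup and kept between requests.
CACHE = {"requests_served": 0}


class Handler(BaseHTTPRequestHandler):
    def do_GET(self):
        # No interpreter startup here: we only pay for handling the request.
        CACHE["requests_served"] += 1
        text = "request #%d for %s\n" % (CACHE["requests_served"], self.path)
        body = text.encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "text/plain; charset=utf-8")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the example quiet; drop this override to get access logs


def make_server(host="127.0.0.1", port=0):
    """Create the server; port 0 asks the OS for any free port."""
    return HTTPServer((host, port), Handler)
```

Calling `make_server(port=8000).serve_forever()` would keep the process alive and answer requests until it is stopped. Whether the resident memory is acceptable is something you would have to measure.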
At first sight, it sounds like over-engineering; as a rule of thumb, I suggest fixing things only when they are broken.
You already have a working application. Apparently you want to call the feature it provides from several more sources. That looks like the description of a service to me (maybe easier to maintain).
Finally, you also mentioned that this is part of a larger solution, so you may want to reuse the language and facilities of that larger solution. From the description you gave (XML + HTTP), it seems quite a usual application that can be written in any general-purpose language (maybe a web container in Java?).
Some libraries can help you to make your code portable:
Boost,
Qt
more details may trigger more ideas :)
Port your app to Ruby. If your app is too slow, profile it and rewrite those parts in C.

Is Python good for big software projects (not web based)? [closed]

Closed. This question is opinion-based. It is not currently accepting answers.
Want to improve this question? Update the question so it can be answered with facts and citations by editing this post.
Closed 2 years ago.
Right now I'm developing mostly in C/C++, but I wrote some small utilities in Python to automate some tasks and I really love it as a language (especially the productivity).
Apart from performance (a problem that can sometimes be solved thanks to the ease of interfacing Python with C modules), do you think it is suitable for production use in the development of stand-alone complex applications (think, for example, of a word processor or a graphics tool)?
What IDE would you suggest? The IDLE provided with Python is not enough even for small projects in my opinion.
We've used IronPython to build our flagship spreadsheet application (40kloc production code - and it's Python, which IMO means loc per feature is low) at Resolver Systems, so I'd definitely say it's ready for production use of complex apps.
There are two ways in which this might not be a useful answer to you :-)
We're using IronPython, not the more usual CPython. This gives us the huge advantage of being able to use .NET class libraries. I may be setting myself up for flaming here, but I would say that I've never really seen a CPython application that looked "professional" - so having access to the WinForms widget set was a huge win for us. IronPython also gives us the advantage of being able to easily drop into C# if we need a performance boost. (Though to be honest we have never needed to do that. All of our performance problems to date have been because we chose dumb algorithms rather than because the language was slow.) Using C# from IP is much easier than writing a C Extension for CPython.
We're an Extreme Programming shop, so we write tests before we write code. I would not write production code in a dynamic language without writing the tests first; the lack of a compile step needs to be covered by something, and as other people have pointed out, refactoring without it can be tough. (Greg Hewgill's answer suggests he's had the same problem. On the other hand, I don't think I would write - or especially refactor - production code in any language these days without writing the tests first - but YMMV.)
Re: the IDE - we've been pretty much fine with each person using their favourite text editor; if you prefer something a bit more heavyweight then WingIDE is pretty well-regarded.
You'll find mostly two answers to that – the religious one (Yes! Of course! It's the best language ever!) and the other religious one (you gotta be kidding me! Python? No... it's not mature enough). I will maybe skip the last religion (Python?! Use Ruby!). The truth, as always, is far from obvious.
Pros: it's easy, readable, batteries included, has lots of good libraries for pretty much everything. It's expressive and dynamic typing makes it more concise in many cases.
Cons: as a dynamic language, it has way worse IDE support (proper syntax completion requires static typing, whether explicit as in Java or inferred as in SML), its object system is far from perfect (interfaces, anyone?) and it is easy to end up with messy code that has methods returning either an int or a boolean or an object of some sort under unknown circumstances.
My take – I love Python for scripting, automation, tiny webapps and other simple well defined tasks. In my opinion it is by far the best dynamic language on the planet. That said, I would never use it or any other dynamically typed language to develop an application of substantial size.
Say – it would be fine to use it for Stack Overflow, which has three developers and I guess no more than 30k lines of code. For bigger things – at first your development would be super fast, and then once the team and codebase grow, things slow down more than they would with Java or C#. You need to offset the lack of compile-time checks by writing more unit tests, and refactorings get harder because you never know what your refactoring broke until you run all the tests or even the whole big app, etc.
Now – decide on how big your team is going to be and how big the app is supposed to be once it is done. If you have 5 or fewer people and the target size is roughly Stack Overflow, go ahead, write in Python. You will finish in no time and be happy with a good codebase. But if you want to write a second Google or Yahoo, you will be much better off with C# or Java.
Side-note on C/C++ you have mentioned: if you are not writing performance critical software (say massive parallel raytracer that will run for three months rendering a film) or a very mission critical system (say Mars lander that will fly three years straight and has only one chance to land right or you lose $400mln) do not use it. For web apps, most desktop apps, most apps in general it is not a good choice. You will die debugging pointers and memory allocation in complex business logic.
In my opinion Python is more than ready for developing complex applications. I see Python's strength more on the server side than in writing graphical clients. But have a look at http://www.resolversystems.com/. They develop a whole spreadsheet in Python using IronPython, the .NET port.
If you are familiar with Eclipse, have a look at PyDev, which provides auto-completion and debugging support for Python with all the other Eclipse goodies like SVN support. The guy developing it has just been hired by Aptana, so this will be a solid choice for the future.
@Marcin
"Cons: as a dynamic language, has way worse IDE support (proper syntax completion requires static typing, whether explicit in Java or inferred in SML),"
You are right that static analysis may not provide full syntax completion for dynamic languages, but I think PyDev gets the job done very well. Furthermore, I have a different development style when programming in Python. I always have an IPython session open, and with one F5 I get not only perfect completion from IPython, but object introspection and manipulation as well.
"But if you want to write second Google or Yahoo, you will be much better with C# or Java."
Google just rewrote Jaiku to work on top of App Engine, all in Python. And as far as I know they use a lot of Python inside Google too.
I really like python, it's usually my language of choice these days for small (non-gui) stuff that I do on my own.
However, for some larger Python projects I've tackled, I'm finding that it's not quite the same as programming in say, C++. I was working on a language parser, and needed to represent an AST in Python. This is certainly within the scope of what Python can do, but I had a bit of trouble with some refactoring. I was changing the representation of my AST and changing methods and classes around a lot, and I found I missed the strong typing that would be available to me in a C++ solution. Python's duck typing was almost too flexible and I found myself adding a lot of assert code to try to check my types as the program ran. And then I couldn't really be sure that everything was properly typed unless I had 100% code coverage testing (which I didn't at the time).
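To make the "adding a lot of assert code" point concrete, here is a small sketch of what such checks might look like. The `Num` and `Add` node classes are invented for illustration and are not the poster's actual AST.

```python
class Node:
    """Base class for all AST nodes."""


class Num(Node):
    def __init__(self, value):
        # Runtime check standing in for what a static type system would enforce.
        assert isinstance(value, (int, float)), "Num needs a number, got %r" % (value,)
        self.value = value


class Add(Node):
    def __init__(self, left, right):
        assert isinstance(left, Node), "left must be a Node, got %r" % (left,)
        assert isinstance(right, Node), "right must be a Node, got %r" % (right,)
        self.left = left
        self.right = right


def evaluate(node):
    """Evaluate a tree built from Num and Add nodes."""
    if isinstance(node, Num):
        return node.value
    if isinstance(node, Add):
        return evaluate(node.left) + evaluate(node.right)
    raise TypeError("unknown node type: %r" % (node,))


print(evaluate(Add(Num(1), Num(2))))
```

The catch described above still applies: a mistake like `Add(1, Num(2))` is only caught when that line actually runs, and `assert` statements are skipped entirely when Python is started with `-O`.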
Actually, that's another thing that I miss sometimes. It's possible to write syntactically correct code in Python that simply won't run. The compiler is incapable of telling you about it until it actually executes the code, so in infrequently-used code paths such as error handlers you can easily have unseen bugs lurking around. Even code that's as simple as printing an error message with a % format string can fail at runtime because of mismatched types.
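A tiny example of the format-string problem, with made-up function names. Both functions compile without complaint; the first one only fails when the error path runs with a non-numeric code.

```python
def report_failure(path, code):
    # Latent bug: %d needs a number, but some callers pass the code as a string.
    return "could not open %s (error %d)" % (path, code)


def safe_report_failure(path, code):
    # %s accepts any object, so this cannot fail on the type of `code`.
    return "could not open %s (error %s)" % (path, code)


print(report_failure("data.txt", 2))            # fine: code is an int
print(safe_report_failure("data.txt", "ENOENT"))

try:
    report_failure("data.txt", "ENOENT")        # only now does the bug show up
except TypeError as exc:
    print("format string blew up:", exc)
```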
I haven't used Python for any GUI stuff so I can't comment on that aspect.
Python is considered (among Python programmers :) to be a great language for rapid prototyping. There's not a lot of extraneous syntax getting in the way of your thought processes, so most of the work you do tends to go into the code. (There are far fewer idioms involved in writing good Python code than in writing good C++.)
Given this, most Python (CPython) programmers subscribe to the "premature optimization is the root of all evil" philosophy. You write high-level (and significantly slower) Python code first, then optimize the bottlenecks out using C/C++ bindings when your application is nearing completion. At that point, proper profiling makes it clearer what your processor-intensive algorithms are. This way, you write most of the code in a very readable and maintainable manner while allowing for speedups down the road. You'll see several Python library modules written in C for this very reason.
Most graphics libraries in Python (e.g. wxPython) are just Python wrappers around C++ libraries anyway, so you're pretty much writing to a C++ backend.
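As a sketch of the profiling step, the standard library's `cProfile` and `pstats` modules can show where the time goes. The workload below (`slow_sum_of_squares`, `main`) and the `profile` helper are invented purely for illustration.

```python
import cProfile
import io
import pstats


def slow_sum_of_squares(n):
    """Deliberately naive loop: the kind of hotspot a profile would point at."""
    total = 0
    for i in range(n):
        total += i * i
    return total


def main():
    return [slow_sum_of_squares(20000) for _ in range(20)]


def profile(func):
    """Run func under cProfile and return a text report of the top entries."""
    profiler = cProfile.Profile()
    profiler.runcall(func)
    out = io.StringIO()
    stats = pstats.Stats(profiler, stream=out)
    stats.sort_stats("cumulative").print_stats(5)
    return out.getvalue()


print(profile(main))
```

Functions that dominate the cumulative time in such a report are the candidates for a C/C++ rewrite; everything else can stay in plain Python.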
To address your IDE question, SPE (Stani's Python Editor) is a good IDE that I've used and Eclipse with PyDev gets the job done as well. Both are OSS, so they're free to try!
[Edit] @Marcin: Have you had experience writing > 30k LOC in Python? It's also funny that you should mention Google's scalability concerns, since they're Python's biggest supporters! A small organization called NASA also uses Python frequently ;) see "One coder and 17,000 Lines of Code Later".
Nothing to add to the other answers, besides that if you choose python you must use something like pylint which nobody mentioned so far.
One way to judge what python is used for is to look at what products use python at the moment. This wikipedia page has a long list including various web frameworks, content management systems, version control systems, desktop apps and IDEs.
As it says here - "Some of the largest projects that use Python are the Zope application server, YouTube, and the original BitTorrent client. Large organizations that make use of Python include Google, Yahoo!, CERN and NASA. ITA uses Python for some of its components."
So in short, yes, it is "proper for production use in the development of stand-alone complex applications". So are many other languages, with various pros and cons. Which is the best language for your particular use case is too subjective to answer, so I won't try, but often the answer will be "the one your developers know best".
Refactoring is inevitable on larger codebases and the lack of static typing makes this much harder in python than in statically typed languages.
"And as far as I know they use a lot of python inside google too."
Well, I'd hope so; the maker of Python still works at Google, if I'm not mistaken?
As for the use of Python, I think it's a great language for stand-alone apps. It's heavily used in a lot of Linux programs, and there are a few nice widget sets out there to aid in the development of GUIs.
Python is a delight to use. I use it routinely and also write a lot of code for work in C#. There are two drawbacks to writing UI code in Python. One is that there is not a single UI framework that is accepted by the majority of the community. When you write in C#, the .NET runtime and class libraries are all meant to work together. With Python, every UI library has its own semantics, which are often at odds with the Pythonic mindset in which you are trying to write your program. I am not blaming the library writers. I've tried several libraries (wxWidgets, PythonWin [a wrapper around MFC], Tkinter). When doing so, I often felt that I was writing code in a language other than Python (despite the fact that it was Python), because the libraries aren't exactly Pythonic: they are ports from another language, be it C, C++ or Tk.
So for me I will write UI code in .NET (for me C#) because of the IDE & the consistency of the libraries. But when I can I will write business logic in python because it is more clear and more fun.
I know I'm probably stating the obvious, but don't forget that the quality of the development team and their familiarity with the technology will have a major impact on your ability to deliver.
If you have a strong team, then it's probably not an issue if they're familiar with it. But if you have people who are more 9-to-5ers and who aren't familiar with the technology, they will need more support, and you'd need to make a call on whether the productivity gains are worth whatever that support costs.
I have had only one Python experience, my trash-cli project.
I know that some or all of these problems probably stem from my inexperience with Python.
I found these things frustrating:
the difficulty of finding a good IDE for free
the limited support for automatic refactoring
Moreover:
the need to introduce two levels of grouping, packages and modules, confuses me.
it seems to me that there is not a widely adopted code naming convention
it seems to me that the docs for some standard library APIs are incomplete
the fact that some standard libraries are not fully object oriented annoys me
That said, some Python coders tell me that they do not have these problems, or say that these are not problems.
Try Django or Pylons, write a simple app with both of them and then decide which one suits you best. There are others (like Turbogears or Werkzeug) but those are the most used.
