The origin of using # as a comment in Python? - python

So I just had like this mental explosion dude! I was looking at my Python source code and was reading some comments and then I looked a the comments again. When I came across this:
#!/usr/bin/env python
# A regular comment
Which made me wonder, was # chosen as the symbol to start a comment because it would allow the python program to be invoked in a shell, like so:
./test.py
and then be ignored once the python interpreter was running?

Yes.
Using # to start a comment is a convention followed by every major interpreted language designed to work on POSIX systems (i.e. not Windows).
It also dovetails nicely with the fact that the sequence "#!" at the beginning of a file is recognized by the OS to mean "run the command on this line" when you try to run the script file itself.
But mostly, it's the commonly accepted convention. If python didn't use # to start a comment, it would confuse a lot of people.
EDIT
Using "#" as a comment marker apparently pre-dates the "#!" hash-bang notation. "#!" was introduced by Dennis Ritchie betwen Unix 7 and 8, while languages that support # as a comment marker existed earlier. For example, the Bourne shell was already the default when Version 7 Unix was introduced.
Therefore, the convention of using "#" as a comment marker probably influenced the choice of "#!" as the command line marker, and not the other way around.

Using # for comments was happening before Python came around. The shebang (#!/usr/bin/env python) convention is almost as old as UNIX itself. The two are intertwined for many interpreted (aka shell) languages.
Might as well study up on the history of the shebang!

"All the rest of the line after character X" is clearly the handiest way to do comments if you have an X available (C++ had to use two characters, //, for the purpose, to offer an alternative to C's clunky PL/I-inspired '/' ... '/' "brackets").
Almost all printable Ascii characters can be used for other purposes in Python -- if the choice is between # and ?, with the first already being familiar from its use in sh, bash, tcl, perl, awk, ... -- it's not a very hard choice, is it? The handiness of hashbang is just a vig.

Very good, you are correct.
This is known as the shebang or hash-bang. I suppose a scripting language could use any comment character it liked and allow #! on the first line as a comment, but it does seem easier to just make # be the comment character...

Related

When is semicolon use in Python considered "good" or "acceptable"?

Python is a "whitespace delimited" language. However, the use of semicolons are allowed. For example, the following works, but it is frowned upon:
print("Hello!");
print("This is valid");
I've been using Python for several years now, and the only time I have ever used semicolons is in generating one-time command-line scripts with Python:
python -c "import inspect, mymodule; print(inspect.getfile(mymodule))"
Or adding code in comments on Stack Overflow (i.e., "you should try import os; print os.path.join(a,b)")
I also noticed in this answer to a similar question that the semicolon can also be used to make one line if blocks, as in
if x < y < z: print(x); print(y); print(z)
which is convenient for the two usage examples I gave (command-line scripts and comments).
The above examples are for communicating code in paragraph form or making short snippets, but not something I would expect in a production codebase.
Here is my question: in Python, is there ever a reason to use the semicolon in a production code? I imagine that they were added to the language solely for the reasons I have cited, but it’s always possible that Guido had a grander scheme in mind.
No opinions please; I'm looking either for examples from existing code where the semicolon was useful, or some kind of statement from the python docs or from Guido about the use of the semicolon.
PEP 8 is the official style guide and says:
Compound statements (multiple statements on the same line) are generally discouraged.
(See also the examples just after this in the PEP.)
While I don't agree with everything PEP 8 says, if you're looking for an authoritative source, that's it. You should use multi-statement lines only as a last resort. (python -c is a good example of such a last resort, because you have no way to use actual linebreaks in that case.)
I use semicolons in code all of the time. Our code folds across lines quite frequently, and a semicolon is a definitive assertion that a statement is ending.
output, errors, status = generate_output_with_errors_and_status(
first_monstrous_functional_argument(argument_one_to_argument
, argument_two_to_argument)
, second_argument);
See? They're quite helpful to readability.

IPython magic commands and dashes

I recently switched my default shell to IPython, rather than bash, by creating an IPython profile with automagic, autocall and other such features turned on. To make executables visible to the IPython environment, I've included %rehashx to run automatically in my config files. The trouble with this is that commands with dashes in their names, such as xdg-open, are not properly translated into magic commands, and thus require using the shell-escape syntax to run. Is there a way to automagic commands with dashes, so that I can more closely emulate bash-like calling of such commands?
You will have to live with this.
If identifiers are handled across language boundaries (in this case bash/Python) you will have problems if the languages' rules for identifiers allow different things (in this case the - is allowed in bash but not in Python). One way to solve this is name mangling. Sometimes this is done, e. g. by replacing offending characters with allowed characters (e. g. xdg-open by xdg_open); to avoid name clashes (e. g. if there already is an xdg_open besides the xdg-open) the replacement often is escaped in some way, e.g. by the hex value of the character (e. g. - by _2d, _ by _5f etc.). You will probably know this from URL lines containing stuff like %20 and the like. This all becomes either unreadable very quickly, or the rules for the name mangling are very complicated (there's a trade-off).

Python - When Is It Ok to Use os.system() to issue common Linux commands

Spinning off from another thread, when is it appropriate to use os.system() to issue commands like rm -rf, cd, make, xterm, ls ?
Considering there are analog versions of the above commands (except make and xterm), I'm assuming it's safer to use these built-in python commands instead of using os.system()
Any thoughts? I'd love to hear them.
Rule of thumb: if there's a built-in Python function to achieve this functionality use this function. Why? It makes your code portable across different systems, more secure and probably faster as there will be no need to spawn an additional process.
One of the problems with system() is that it implies knowledge of the shell's syntax and language for parsing and executing your command line. This creates potential for a bug where you didn't validate input properly, and the shell might interpet something like variable substitution or determining where an argument begins or ends in a way you don't expect. Also, another OS's shell might have divergent syntax from your own, including very subtle divergence that you won't notice right away. For reasons like these I prefer to use execve() instead of system() -- you can pass argv tokens directly and not have to worry about something in the middle (mis-)parsing your input.
Another problem with system() (this also applies to using execve()) is that when you code that, you are saying, "look for this program, and pass it these args". This makes a couple of assumptions which may lead to bugs. First is that the program exists and can be found in $PATH. Maybe on some system it won't. Second, maybe on some system, or even a future version of your own OS, it will support a different set of options. In this sense, I would avoid doing this unless you are absolutely certain the system you will run on will have the program. (Like maybe you put the callee program on the system to begin with, or the way you invoke it is mandated by something like POSIX.)
Lastly... There's also a performance hit associated with looking for the right program, creating a new process, loading the program, etc. If you are doing something simple like a mv, it's much more efficient to use the system call directly.
These are just a few of the reasons to avoid system(). Surely there are more.
Darin's answer is a good start.
Beyond that, it's a matter of how portable you plan to be. If your program is only ever going to run on a reasonably "standard" and "modern" Linux then there's no reason for you to re-invent the wheel; if you tried to re-write make or xterm they'd be sending the men in the white coats for you. If it works and you don't have platform concerns, knock yourself out and simply use Python as glue!
If compatibility across unknown systems was a big deal you could try looking for libraries to do what you need done in a platform independent way. Or you need to look into a way to call on-board utilities with different names, paths and mechanisms depending on which kind of system you're on.
The only time that os.system might be appropriate is for a quick-and-dirty solution for a non-production script or some kind of testing. Otherwise, it is best to use built-in functions.
Your question seems to have two parts. You mention calling commands like "xterm", "rm -rf", and "cd".
Side Note: you cannot call 'cd' in a sub-shell. I bet that was a trick question ...
As far as other command-level things you might want to do, like "rm -rf SOMETHING", there is already a python equivalent. This answers the first part of your question. But I suspect you are really asking about the second part.
The second part of your question can be rephrased as "should I use system() or something like the subprocess module?".
I have a simple answer for you: just say NO to using "system()", except for prototyping.
It's fine for verifying that something works, or for that "quick and dirty" script, but there are just too many problems with os.system():
It forks a shell for you -- fine if you need one
It expands wild cards for you -- fine unless you don't have any
It handles redirect -- fine if you want that
It dumps output to stderr/stdout and reads from stdin by default
It tries to understand quoting, but it doesn't do very well (try 'Cmd" > "Ofile')
Related to #5, it doesn't always grok argument boundaries (i.e. arguments with spaces in them might get screwed up)
Just say no to "system()"!
I would suggest that you only use use os.system for things that there are not already equivalents for within the os module. Why make your life harder?
The os.system call is starting to be 'frowned upon' in python. The 'new' replacement would be subprocess.call or subprocess.Popen in the subprocess module. Check the docs for subprocess
The other nice thing about subprocess is you can read the stdout and stderr into variables, and process that without having to redirect to other file(s).
Like others have said above, there are modules for most things. Unless you're trying to glue together many other commands, I'd stick with the things included in the library. If you're copying files, use shutil, working with archives you've got modules like tarfile/zipfile and so on.
Good luck.

Switching from python-mode.el to python.el

I recently tried switching from using python-mode.el to python.el for editing python files in emacs, found the experience a little alien and unproductive, and scurried back. I've been using python-mode.el for something like ten years, so perhaps I'm a little set in my ways. I'd be interested in hearing from anyone who's carefully evaluated the two modes, in particular of the pros and cons they perceive of each and how their work generally interacts with the features specific to python.el.
The two major issues for me with python.el were
Each buffer visiting a python file gets its own inferior interactive python shell. I am used to doing development in one interactive shell and sharing data between python files. (Might seem like bad practice from a software-engineering perspective, but I'm usually working with huge datasets which take a while to load into memory.)
The skeleton-mode support in python.el, which seemed absolutely gratuitous (python's syntax makes such automation unnecessary) and badly designed (for instance, it has no knowledge of "for" loop generator expressions or "<expr 1> if <cond> else <expr 2>" expressions, so you have to go back and remove the colons it helpfully inserts after insisting that you enter the expression clauses in the minibuffer.) I couldn't figure out how to turn it off. There was a python.el variable which claimed to control this, but it didn't seem to work. It could be that the version of python.el I was using was broken (it came from the debian emacs-snapshot package) so if anyone knows of an up-to-date version of it, I'd like to hear about it. (I had the same problem with the version in CVS emacs as of approximately two weeks ago.)
For what it's worth, I do not see the behavior you are seeing in issue #1, "Each buffer visiting a python file gets its own inferior interactive python shell."
This is what I did using python.el from Emacs 22.2.
C-x C-f foo.py
[insert: print "foo"]
C-x C-f bar.py
[insert: print "bar"]
C-c C-z [*Python* buffer appears]
C-x o
C-c C-l RET ["bar" is printed in *Python*]
C-x b foo.py RET
C-c C-l RET ["foo" is printed in the same *Python* buffer]
Therefore the two files are sharing the same inferior python shell. Perhaps there is some unforeseen interaction between your personal customizations of python-mode and the default behaviors of python.el. Have you tried using python.el without your .emacs customizations and checking if it behaves the same way?
The major feature addition of python.el over python-mode is the symbol completion function python-complete-symbol. You can add something like this
(define-key inferior-python-mode-map "\C-c\t" 'python-complete-symbol)
Then typing
>>> import os
>>> os.f[C-c TAB]
you'll get a *Completions* buffer containing
Click <mouse-2> on a completion to select it.
In this buffer, type RET to select the completion near point.
Possible completions are:
os.fchdir os.fdatasync
os.fdopen os.fork
os.forkpty os.fpathconf
os.fstat os.fstatvfs
os.fsync os.ftruncate
It'll work in .py file buffers too.
I can't reproduce this behavior on Emacs v23.1, this must have been changed since then.
Forget about any mode's skeleton support and use the hyper-advanced and extensible yasnippet instead, it's really worth a try!
Note nearly everything said here is obsolete meanwhile as things changed.
python-mode.el commands are prefixed "py-" basically, you should be able to use commands from both, nonewithstanding which one was loaded first.
python-mode.el does not unload python.el; beside of python-mode-map, which is re-defined.
The diff is in the menu displayed and keysetting however, the last one loaded will determine.
python-mode.el is written by the Python community. python.el is written by the emacs community. I've used python-mode.el for as long as I can remember and python.el doesn't even come close to the standards of python-mode.el. I trust the Python community better than the Emacs community to come up with a decent mode file. Just stick with python-mode.el, is there really a reason not to?
python-mode.el has no support triple-quoted strings, so if your program contains long docstrings, all the syntax coloring (and associated syntaxic features) tends to break down.
my .02
Debian has deleted the python-mode package, alas, so I felt compelled to try python.el. I loaded it and ran "describe-bindings". It appeared to be designed for elisp coders who think c-X ; is the intuitive binding for commenting a line of Python code. (Wow.) Also, I found no way at all to comment a region of code, or, rather, no binding with the strings "region" and "comment" in it.
Good old python-mode can still be cloned via git clone https://gitlab.com/python-mode-devs/python-mode.git. It was last edited a week ago at this writing, so it's safe to assume it's not abandoned.

Is there a way around coding in Python without the tab, indent & whitespace criteria?

I want to start using Python for small projects but the fact that a misplaced tab or indent can throw a compile error is really getting on my nerves. Is there some type of setting to turn this off?
I'm currently using NotePad++. Is there maybe an IDE that would take care of the tabs and indenting?
The answer is no.
At least, not until something like the following is implemented:
from __future__ import braces
No. Indentation-as-grammar is an integral part of the Python language, for better and worse.
Emacs! Seriously, its use of "tab is a command, not a character", is absolutely perfect for python development.
All of the whitespace issues I had when I was starting Python were the result mixing tabs and spaces. Once I configured everything to just use one or the other, I stopped having problems.
In my case I configured UltraEdit & vim to use spaces in place of tabs.
It's possible to write a pre-processor which takes randomly-indented code with pseudo-python keywords like "endif" and "endwhile" and properly indents things. I had to do this when using python as an "ASP-like" language, because the whole notion of "indentation" gets a bit fuzzy in such an environment.
Of course, even with such a thing you really ought to indent sanely, at which point the conveter becomes superfluous.
I find it hard to understand when people flag this as a problem with Python. I took to it immediately and actually find it's one of my favourite 'features' of the language :)
In other languages I have two jobs:
1. Fix the braces so the computer can parse my code
2. Fix the indentation so I can parse my code.
So in Python I have half as much to worry about ;-)
(nb the only time I ever have problem with indendation is when Python code is in a blog and a forum that messes with the white-space but this is happening less and less as the apps get smarter)
I'm currently using NotePad++. Is
there maybe an IDE that would take
care of the tabs and indenting?
I liked pydev extensions of eclipse for that.
I do not believe so, as Python is a whitespace-delimited language. Perhaps a text editor or IDE with auto-indentation would be of help. What are you currently using?
No, there isn't. Indentation is syntax for Python. You can:
Use tabnanny.py to check your code
Use a syntax-aware editor that highlights such mistakes (vi does that, emacs I bet it does, and then, most IDEs do too)
(far-fetched) write a preprocessor of your own to convert braces (or whatever block delimiters you love) into indentation
You should disable tab characters in your editor when you're working with Python (always, actually, IMHO, but especially when you're working with Python). Look for an option like "Use spaces for tabs": any decent editor should have one.
I agree with justin and others -- pick a good editor and use spaces rather than tabs for indentation and the whitespace thing becomes a non-issue. I only recently started using Python, and while I thought the whitespace issue would be a real annoyance it turns out to not be the case. For the record I'm using emacs though I'm sure there are other editors out there that do an equally fine job.
If you're really dead-set against it, you can always pass your scripts through a pre-processor but that's a bad idea on many levels. If you're going to learn a language, embrace the features of that language rather than try to work around them. Otherwise, what's the point of learning a new language?
Not really. There are a few ways to modify whitespace rules for a given line of code, but you will still need indent levels to determine scope.
You can terminate statements with ; and then begin a new statement on the same line. (Which people often do when golfing.)
If you want to break up a single line into multiple lines you can finish a line with the \ character which means the current line effectively continues from the first non-whitespace character of the next line. This visually appears violate the usual whitespace rules but is legal.
My advice: don't use tabs if you are having tab/space confusion. Use spaces, and choose either 2 or 3 spaces as your indent level.
A good editor will make it so you don't have to worry about this. (python-mode for emacs, for example, you can just use the tab key and it will keep you honest).
Tabs and spaces confusion can be fixed by setting your editor to use spaces instead of tabs.
To make whitespace completely intuitive, you can use a stronger code editor or an IDE (though you don't need a full-blown IDE if all you need is proper automatic code indenting).
A list of editors can be found in the Python wiki, though that one is a bit too exhausting:
- http://wiki.python.org/moin/PythonEditors
There's already a question in here which tries to slim that down a bit:
https://stackoverflow.com/questions/60784/poll-which-python-ideeditor-is-the-best
Maybe you should add a more specific question on that: "Which Python editor or IDE do you prefer on Windows - and why?"
Getting your indentation to work correctly is going to be important in any language you use.
Even though it won't affect the execution of the program in most other languages, incorrect indentation can be very confusing for anyone trying to read your program, so you need to invest the time in figuring out how to configure your editor to align things correctly.
Python is pretty liberal in how it lets you indent. You can pick between tabs and spaces (but you really should use spaces) and can pick how many spaces. The only thing it requires is that you are consistent which ultimately is important no matter what language you use.
I was a bit reluctant to learn Python because of tabbing. However, I almost didn't notice it when I used Vim.
If you don't want to use an IDE/text editor with automatic indenting, you can use the pindent.py script that comes in the Tools\Scripts directory. It's a preprocessor that can convert code like:
def foobar(a, b):
if a == b:
a = a+1
elif a < b:
b = b-1
if b > a: a = a-1
end if
else:
print 'oops!'
end if
end def foobar
into:
def foobar(a, b):
if a == b:
a = a+1
elif a < b:
b = b-1
if b > a: a = a-1
# end if
else:
print 'oops!'
# end if
# end def foobar
Which is valid python.
Nope, there's no way around it, and it's by design:
>>> from __future__ import braces
File "<stdin>", line 1
SyntaxError: not a chance
Most Python programmers simply don't use tabs, but use spaces to indent instead, that way there's no editor-to-editor inconsistency.
I'm surprised no one has mentioned IDLE as a good default python editor. Nice syntax colors, handles indents, has intellisense, easy to adjust fonts, and it comes with the default download of python. Heck, I write mostly IronPython, but it's so nice & easy to edit in IDLE and run ipy from a command prompt.
Oh, and what is the big deal about whitespace? Most easy to read C or C# is well indented, too, python just enforces a really simple formatting rule.
Many Python IDEs and generally-capable text/source editors can handle the whitespace for you.
However, it is best to just "let go" and enjoy the whitespace rules of Python. With some practice, they won't get into your way at all, and you will find they have many merits, the most important of which are:
Because of the forced whitespace, Python code is simpler to understand. You will find that as you read code written by others, it is easier to grok than code in, say, Perl or PHP.
Whitespace saves you quite a few keystrokes of control characters like { and }, which litter code written in C-like languages. Less {s and }s means, among other things, less RSI and wrist pain. This is not a matter to take lightly.
In Python, indentation is a semantic element as well as providing visual grouping for readability.
Both space and tab can indicate indentation. This is unfortunate, because:
The interpretation(s) of a tab varies
among editors and IDEs and is often
configurable (and often configured).
OTOH, some editors are not
configurable but apply their own
rules for indentation.
Different sequences of
spaces and tabs may be visually
indistinguishable.
Cut and pastes can alter whitespace.
So, unless you know that a given piece of code will only be modified by yourself with a single tool and an unvarying config, you must avoid tabs for indentation (configure your IDE) and make sure that you are warned if they are introduced (search for tabs in leading whitespace).
And you can still expect to be bitten now and then, as long as arbitrary semantics are applied to control characters.
Check the options of your editor or find an editor/IDE that allows you to convert TABs to spaces. I usually set the options of my editor to substitute the TAB character with 4 spaces, and I never run into any problems.
Yes, there is a way. I hate these "no way" answers, there is no way until you discover one.
And in that case, whatever it is worth, there is one.
I read once about a guy who designed a way to code so that a simple script could re-indent the code properly. I didn't managed to find any links today, though, but I swear I read it.
The main tricks are to always use return at the end of a function, always use pass at the end of an if or at the end of a class definition, and always use continue at the end of a while. Of course, any other no-effect instruction would fit the purpose.
Then, a simple awk script can take your code and detect the end of block by reading pass/continue/return instructions, and the start of code with if/def/while/... instructions.
Of course, because you'll develop your indenting script, you'll see that you don't have to use continue after a return inside the if, because the return will trigger the indent-back mechanism. The same applies for other situations. Just get use to it.
If you are diligent, you'll be able to cut/paste and add/remove if and correct the indentations automagically. And incidentally, pasting code from the web will require you to understand a bit of it so that you can adapt it to that "non-classical" setting.

Categories