Is there a package naming convention for Python like Java's com.company.actualpackage? Most of the time I see simple, potentially colliding package names like "web".
If there is no such convention, is there a reason for it? What do you think of using the Java naming convention in the Python world?
Python has two "mantras" that cover this topic:
Explicit is better than implicit.
and
Namespaces are one honking great idea -- let's do more of those!
There is a convention for naming of and importing of modules that can be found in The Python Style Guide (PEP 8).
The biggest reason there is no convention of consistently prefixing your module names Java-style is that over time you end up with a lot of repetition in your code that doesn't really need to be there.
One of the problems with Java is it forces you to repeat yourself, constantly. There's a lot of boilerplate that goes into Java code that just isn't necessary in Python. (Getters/setters being a prime example of that.)
Namespaces aren't so much of a problem in Python because you are able to give modules an alias upon import. Such as:
import com.company.actualpackage as shortername
So you're not only able to create or manipulate the namespace within your programs, but are able to create your own keystroke-saving aliases as well.
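For instance, aliasing works with any dotted path; here a real standard-library package stands in for the hypothetical com.company one:

```python
# Alias a deeply nested module to a short local name at import time.
import xml.etree.ElementTree as ET

root = ET.fromstring("<doc><item>hi</item></doc>")
print(root.find("item").text)  # hi
```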
Java's convention also has its own drawbacks. Not every open-source package has a stable website behind it. What should a maintainer do if their website changes? Also, under this scheme package names become long and hard to remember. Finally, the name of a package should represent its purpose, not its owner.
An update for anyone else who comes looking for this:
As of 2012, PEP 423 addresses this. PEP 8 touches on the topic briefly, but only to say: use short, all-lowercase names, with underscores if they improve readability.
The gist of it: pick memorable, meaningful names that aren't already used on PyPI.
There is no Java-like naming convention for Python packages. You can of course adopt one for any package you develop yourself, but you might have to invasively edit any package you adopt from third parties, and the "culturally alien" naming convention will probably sap the chances of your own packages being widely adopted outside your organization.
Technically, there would be nothing wrong with Java's convention in Python (it would just make some from statements a tad longer, no big deal), but in practice the cultural aspects make it pretty much unfeasible.
The reason there's normally no package hierarchy is because Python packages aren't easily extended that way. Packages are actual directories, and though you can make packages look in multiple directories for sub-modules (by adding directories to the __path__ list of the package) it's not convenient, and easily done wrong. As for why Python packages aren't easily extended that way, well, that's a design choice. Guido didn't like deep hierarchies (and still doesn't) and doesn't think they're necessary.
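A minimal sketch of the __path__ manipulation mentioned above (the package and module names here are made up for illustration):

```python
# Make one package pull a sub-module from a second directory by
# appending to its __path__ list. Names (mypkg, mod2) are illustrative.
import os
import sys
import tempfile

base = tempfile.mkdtemp()
extra = tempfile.mkdtemp()

# the "real" package lives in base/
os.makedirs(os.path.join(base, "mypkg"))
open(os.path.join(base, "mypkg", "__init__.py"), "w").close()

# a sub-module that lives elsewhere, in extra/
with open(os.path.join(extra, "mod2.py"), "w") as f:
    f.write("NAME = 'from elsewhere'\n")

sys.path.insert(0, base)
import mypkg
mypkg.__path__.append(extra)   # now mypkg.mod2 resolves to extra/mod2.py

from mypkg import mod2
print(mod2.NAME)  # from elsewhere
```

As the answer says, this works but is easy to get wrong: every consumer of the package has to agree on who extends `__path__` and when.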
The convention is to pick a toplevel package name that's obvious but unique to your project -- for example, the name of the project itself. You can structure everything inside it however you want (because you are in control of it.) Splitting the package into separate bits with separate owners is a little more work, but with a few guidelines it's possible. It's rarely needed.
There's nothing stopping you using that convention if you want to, but it's not at all standard in the Python world and you'd probably get funny looks. It's not much fun to take care of admin on packages when they're deeply nested in com.
It may sound sloppy to someone coming from Java, but in reality it doesn't really seem to have caused any big difficulties, even with packages as poorly-named as web.py.
The place where you often do get namespace conflicts in practice is relative imports: where code in package.module1 tries to import module2 and there's both a package.module2 and a module2 in the standard library (which there commonly is as the stdlib is large and growing). Luckily, ambiguous relative imports are going away.
I've been using Python for years, and 99.9% of the collisions I have seen come from new developers trying to name a file "xml.py". I can see some advantages to the Java scheme, but most developers are smart enough to pick reasonable package names, so it really isn't that big of a problem.
Related
Is PEP-249 a set of "rules" that Python database modules (e.g. psycopg) should follow?
Is PEP-249 actual Python code -- a class or something "touchable" -- or is it something already (natively) inside Python itself?
Is PEP-249 a basis for the database modules that will be used by the Python language?
Taken from PEP-249 introduction:
This API has been defined to encourage similarity between the Python modules that are used to access databases. By doing this, we hope to achieve a consistency leading to more easily understood modules, code that is generally more portable across databases, and a broader reach of database connectivity from Python.
PEPs, i.e. Python Enhancement Proposals, are just that: proposals (see PEP-1). In this case, it's a proposal of a standard, "canonical" API design for database libraries. It isn't necessarily a "law of the land", but rather a reference for people who build such APIs.
The community has come together, decided that it might be a good idea to come up with something common, and put it up as a PEP. The PEP got accepted, and now it will be seen as a kind of a standard.
Whether or not this standard is followed is in the hands of the maintainers of each particular library. It's all a matter of agreement between developers.
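For a concrete sense of what the API standardizes: the standard library's sqlite3 module conforms to PEP 249, so the connect/cursor/execute/fetch pattern below carries over largely unchanged to psycopg, MySQLdb, and other conforming drivers:

```python
# sqlite3 implements the PEP 249 (DB API 2.0) interface, so this shape
# of code is what the PEP is standardizing across database drivers.
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE users (name TEXT, age INTEGER)")
cur.execute("INSERT INTO users VALUES (?, ?)", ("alice", 30))
conn.commit()
cur.execute("SELECT name, age FROM users")
rows = cur.fetchall()
print(rows)  # [('alice', 30)]
conn.close()
```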
Context
I write my own library for data analysis purposes. It includes classes to import data from the server, procedures to clean, analyze and display results. It also includes functions to compare results.
Concerns
I put all these in a single module and import it when I do a new project.
I put any newly-developed classes/functions in the same file.
I am concerned that my module is becoming longer and harder to browse and explain.
Questions
I started Python six months ago and want to know common practices:
How do you group your functions/classes and put them into separate files?
By Purpose? By project? By class/function?
Or you are not doing it at all?
In general how many lines of code in a single module?
What's the way to track the dependency among your own libraries?
Feel free to suggest any thoughts.
I believe the best way to answer this question is to look at what the leaders in this field are doing. There is a very healthy ecosystem of modules available on PyPI whose authors have wrestled with this question. Take a look at some of the modules you use frequently and thus are already installed on your system. Better yet, many of those modules have their development versions hosted on GitHub (the PyPI page usually has a pointer). Go there and look around.
The PEP 8 style guide (Python) says method names should be lowercase, and that sometimes method_names may have embedded underscores. Do experienced Python programmers often come across CamelCase (with leading capital) methods?
In my experience, a module's naming conventions depends on the following factors:
If it's a newly written, Python-only module, or if the authors especially care for "Pythonic code" (quite common!), the naming usually sticks to the recommendations in the Python Style Guide.
Beginners to Python often stick to their favorite style, especially if they have a Java/C#/... background.
Some companies prefer a consistent naming style across languages.
Some Python modules adapt an API from other languages (especially Java in the threading and xml.dom modules), including naming style.
Python bindings for C/C++ libraries usually keep the original naming (e.g. camelCase for PyQt or PascalCase for VTK).
I think the last point is the most common way that you will find PascalCase in Python code.
I was notorious for camel-casing in Python, having just prior come from Java. No more of that, though. I much prefer the "method_name" style of capitalization.
One Python library that I know of, that for sure has camel-cased method names, is the xml.dom package, and its subpackages (like xml.dom.minidom).
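For example, minidom mirrors the W3C DOM API, camelCase and all:

```python
# xml.dom.minidom follows the cross-language W3C DOM naming, so its
# method names are camelCased (createElement, appendChild, ...) even
# though that goes against PEP 8.
from xml.dom.minidom import Document

doc = Document()
root = doc.createElement("greeting")          # camelCase, from the DOM spec
root.appendChild(doc.createTextNode("hello"))
doc.appendChild(root)
print(doc.documentElement.toxml())  # <greeting>hello</greeting>
```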
I found myself using camelCase for method names quite a bit (I use underscored_names for attributes). The main reason I got this habit is probably that I've been using Qt.
Probably it would have been simpler if Python forced a naming convention and four space indenting ... style is substance.
I personally prefer not to use camelCase in Python (or at all -- my other daily use language is C), but I've seen it done. I had to do it once because we were using a mixed codebase (some Python and a lot of previously-written Matlab code) and the Matlab code used camelCase, though I'm not sure why since Matlab seems to frown on camel case as well.
In my experience, yes, some libraries use camelCase, but it is not the most common style.
I personally encourage following the Python standard (as defined in PEP 8), for a very simple reason: every person with a different programming-language background may tend to "impose" their own code style, and that is pretty dangerous. Imagine a Java fan and a PHP fan writing Python code together... it could be fun.
However, if there is already a proposed standard, why not follow it? I mean, in Java everyone uses camelCase. Why? Well, simply because it is the convention.
Finally, you have handy tools such as pylint that are configured by default to check your code against PEP 8.
A few libraries have this convention. Many programmers come from a background that uses PascalCase for method names (PascalCase is what you are actually referring to: the special case of camelCase with a leading capital).
It goes against the Python style guide, so I recommend against writing code this way.
Scenario: a modular app that loads .py modules on the fly as it runs. The programmer (me) wishes to edit the code of a module and then re-load it into the program without halting execution.
Can this be done?
I have tried running import a second time on an updated module.py, but the changes are not picked up.
While reload does reload a module, as the other answer mentions, you need quite a few precautions to make it work smoothly -- and for some things you might believe would work easily, you're in for quite a shock in terms of amount of work actually needed.
If you ever use the form from module import afunction, then you've almost ensured reload won't work: you must exclusively import modules, never functions, classes, etc., from inside modules, if you want any hope of reload doing something useful at all (otherwise you'd have to somehow chase down all the bits and pieces imported here and there from the module, and rebind each and every one of them -- eep;-). Note that I prefer following this rule anyway, whether I plan to do any reloading or not, but, with reload, it's crucial.
The difficult problem is: if you have, alive anywhere, instances of classes that existed in the previous version of the module, reload per se will do absolutely nothing to upgrade those instances. That problem is a truly hard one; one of the longest, hardest recipes in the Python Cookbook (2nd edition) is all about how to code your modules to support such "reload that actually upgrades existing instances". This only matters if you program in OOP style, of course, but... any Python program complex enough to need "reload this plugin" functionality is very likely to have lots of OOP in it, so it's hardly a minor issue.
The docs for reload are pretty complete and do mention this issue, but give no hint how to solve it. This recipe by Michael Hudson, from the Python Cookbook online, is better, but it's only the start of what evolved into the printed (2nd edition) recipe 20.15; the online version of that is here (incomplete unless you sign up for a free time-limited preview of O'Reilly's commercial online books service).
reload() will do what you need. You just have to be careful about what code executes in your module upon re-compile; it is usually easiest if your module just contains definitions.
Use the built-in function:
reload(module)
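Note that in Python 3, reload() is no longer a builtin; it lives in importlib. A minimal sketch, using a throwaway module written to a temp directory (the module name liveplugin is made up for the demo):

```python
# Demonstrate importlib.reload picking up an edit to a module's source.
import importlib
import os
import sys
import tempfile

sys.dont_write_bytecode = True        # keep the demo free of stale .pyc files
tmpdir = tempfile.mkdtemp()
sys.path.insert(0, tmpdir)
path = os.path.join(tmpdir, "liveplugin.py")

with open(path, "w") as f:
    f.write("VALUE = 1\n")
import liveplugin
print(liveplugin.VALUE)  # 1

with open(path, "w") as f:           # simulate editing the module on disk
    f.write("VALUE = 2\n")
importlib.reload(liveplugin)         # re-executes the module's source in place
print(liveplugin.VALUE)  # 2
```

Running `import liveplugin` a second time would have been a no-op, because the module is cached in sys.modules; reload() is what forces re-execution.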
You may get some use from the livecoding module.
Alex's answer and the others cover the general case.
However, since these modules you're working on are your own and you'll edit only them, it's possible to implement a local reloading mechanism rather than rely on the built-in reload. You can architect your module loader as something that loads the files, compiles them, evals them into a specific namespace, and tells the calling code when it's ready. You can take care of reloading by making sure there's a decent init function that re-initialises the module properly. I assume you'll be editing only these modules.
It's a bit of work but might be interesting and worth your time.
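A minimal sketch of such a loader, assuming the init-function convention described above (the names load_plugin, init, and myplugin.py are illustrative, not a real API):

```python
# Home-grown plugin loading: compile the file, exec it into a fresh
# namespace, call an agreed-upon init() hook. "Reloading" is then just
# calling load_plugin again after an edit.
import os
import tempfile

def load_plugin(path):
    """Compile a file and exec it into a fresh namespace; call init() if present."""
    namespace = {}
    with open(path) as f:
        code = compile(f.read(), path, "exec")
    exec(code, namespace)
    if "init" in namespace:          # the re-initialisation hook the answer suggests
        namespace["init"]()
    return namespace

# demo with a throwaway plugin file
path = os.path.join(tempfile.mkdtemp(), "myplugin.py")
with open(path, "w") as f:
    f.write("def greet():\n    return 'v1'\n")
plugin = load_plugin(path)
print(plugin["greet"]())  # v1

with open(path, "w") as f:           # edit the plugin on disk
    f.write("def greet():\n    return 'v2'\n")
plugin = load_plugin(path)           # load again -- no reload() involved
print(plugin["greet"]())  # v2
```

Because each load gets a fresh namespace, this sidesteps reload's stale-binding problems, but existing instances created from the old namespace still won't be upgraded, as the earlier answer explains.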
Out of curiosity, are there many compilers out there which target .pyc files?
After a bit of Googling, the only two I can find are:
unholy: why_'s Ruby-to-pyc compiler
Python: The PSF's Python to pyc compiler
So… Are there any more?
(as a side note, I got thinking about this because I want to write a Scheme-to-pyc compiler)
(as a second side note, I'm not under any illusion that a Scheme-to-pyc compiler would be useful, but it would give me an incredible excuse to learn some internals of both Scheme and Python)
"I want to write a Scheme-to-pyc compiler".
My brain hurts! Why would you want to do that? Python byte code is an intermediate language specifically designed to meet the needs of the Python language and designed to run on Python virtual machines that, again, have been tailored to the needs of Python. Some of the most important areas of Python development these days are moving Python to other "virtual machines", such as Jython (JVM), IronPython (.NET), PyPy and the Unladen Swallow project (moving CPython to an LLVM-based representation). Trying to squeeze the syntax and semantics of another, very different language (Scheme) into the intermediate representation of another high-level language seems to be attacking the problem (whatever the problem is) at the wrong level. So, in general, it doesn't seem like there would be many .pyc compilers out there and there's a good reason for that.
I wrote a compiler several years ago which accepted a lisp-like language called "Noodle" and produced Python bytecode. While it never became particularly useful, it was a tremendously good learning experience both for understanding Common Lisp better (I copied several of its features) and for understanding Python better.
I can think of two particular cases when it might be useful to target Python bytecode directly, instead of producing Python and passing it on to a Python compiler:
Full closures: in Python before 3.0 (before the nonlocal keyword), you can't modify the value of a closed-over variable without resorting to bytecode hackery. You can mutate values instead, so it's common practice to have a closure referencing a list, for example, and changing the first element in it from the inner scope. That can get real annoying. The restriction is part of the syntax, though, not the Python VM. My language had explicit variable declaration, so it successfully provided "normal" closures with modifiable closed-over values.
Getting at a traceback object without referencing any builtins. Real niche case, for sure, but I used it to break an early version of the "safelite" jail. See my posting about it.
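A sketch of the closure workaround described in the first point, next to the Python 3 nonlocal equivalent:

```python
# Pre-3.0 trick: you can't rebind a closed-over name, but you can
# mutate a list it refers to. Python 3's nonlocal makes the box
# unnecessary.
def make_counter_old():
    count = [0]                 # box the value in a mutable list
    def increment():
        count[0] += 1           # mutate, don't rebind
        return count[0]
    return increment

def make_counter_new():
    count = 0
    def increment():
        nonlocal count          # Python 3: rebind the outer name directly
        count += 1
        return count
    return increment

old, new = make_counter_old(), make_counter_new()
print(old(), old(), new(), new())  # 1 2 1 2
```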
So yeah, it's probably way more work than it's worth, but I enjoyed it, and you might too.
I suggest you focus on CPython.
http://www.network-theory.co.uk/docs/pytut/CompiledPythonfiles.html
Rather than a Scheme to .pyc translator, I suggest you write a Scheme to Python translator, and then let CPython handle the conversion to .pyc. (There is precedent for doing it this way; the first C++ compiler was Cfront which translated C++ into C, and then let the system C compiler do the rest.)
From what I know of Scheme, it wouldn't be that difficult to translate Scheme to Python.
One warning: the Python virtual machine is probably not as fast for Scheme as Scheme itself. For example, Python doesn't automatically turn tail recursion into iteration, and Python has a relatively shallow stack, so your translator would actually need to turn tail recursion into iteration itself.
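A small illustration of the translation such a tool would need to perform (plain hand-written Python, not generated code):

```python
# Python keeps a stack frame per tail call, so a deeply tail-recursive
# Scheme function must become a loop to avoid the (~1000-frame by
# default) recursion limit.
def fact_recursive(n, acc=1):
    if n <= 1:
        return acc
    return fact_recursive(n - 1, acc * n)   # tail call, still a new frame

def fact_iterative(n):
    acc = 1
    while n > 1:                            # the same tail call, as a loop
        acc *= n
        n -= 1
    return acc

print(fact_iterative(5))  # 120
# fact_recursive(5000) would raise RecursionError under the default limit
```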
As a bonus, once Unladen Swallow speeds up Python, your Scheme-to-Python translator would benefit, and at that point might even become practical!
If this seems like a fun project to you, I say go for it. Not every project has to be immediately practical.
P.S. If you want a project that is somewhat more practical, you might want to write an AWK to Python translator. That way, people with legacy AWK scripts could easily make the leap forward to Python!
Just for your interest, I have written a toy compiler from a simple LISP to Python. In practice, this makes it a LISP-to-pyc compiler.
Have a look: sinC - The tiniest LISP compiler
Probably a bit late to the party, but if you're still interested, the clojure-py project (https://github.com/halgari/clojure-py) is now able to compile a significant subset of Clojure to Python bytecode -- and some help is always welcome.
Targeting bytecode is not that hard in itself, except for one thing: it is not stable across Python versions (e.g. MAKE_FUNCTION pops 2 elements from the stack in Python 3 but only 1 in Python 2), and these differences are not clearly documented in a single spot (afaict) -- so you will probably need some abstraction layer.