Here is the situation: the company I work for gave me the freedom to use either Java or Python to develop my applications. The company's experience is mainly in Java.
I decided to go with Python, so they were very happy to ask me to maintain all the Python projects and scripts they have for database maintenance.
It's not that bad to handle all that stuff, and it's kind of fun to see how much free time I have compared to the Java programmers. There is just one catch: the project layout is a mess.
There are many scripts that simply lie around in virtual machines all over the company. Some of them have complex functionality that is spread across a few modules (4 at most).
While thinking about it, I realized that I don't know how to address that, so here are 3 questions.
Where do I put standalone scripts? We use git as our versioning system.
How do I structure the project's layout so that the user does not need to dig deep into the folders to run the programs? (In Java I created a jar, or a jar and a shell script to handle some bootstrap operations.)
What is a standard way to create modules that allow easy reusability (mycompany.myapp.mymodule)?
Where do I put standalone scripts?
You organize them "functionally" -- based on what they do and why people use them.
The language (Python vs. Java) is irrelevant.
You have to think of scripts as small applications focused on some need and create appropriate directory structures for that application.
We use /opt/thisapp and /opt/thatapp. If you want a shared mount-point, you might use a different path.
How do I structure the project's layout so that the user does not need to dig deep into the folders to run the programs?
You organize them "functionally" -- based on what they do and why people use them. At the top level of a /opt/thisapp directory, you might have an __init__.py (because it's a package) and perhaps a main.py script which starts the real work.
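In Python 2.7 and Python 3, you have the runpy module. With this you would name your top-level main script __main__.py, so that the application can be started with python /opt/thisapp (or with python -m thisapp when the package is importable).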
http://docs.python.org/library/runpy.html#module-runpy
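As a rough illustration of the __main__.py convention, here is a minimal sketch that builds a throwaway application directory and runs it through runpy. The directory name thisapp comes from the answer; the STATUS variable and the helper function names are made up for the example.

```python
import runpy
import tempfile
from pathlib import Path


def make_demo_app(root: Path) -> Path:
    """Create a throwaway app directory containing only a __main__.py."""
    app_dir = root / "thisapp"  # application name taken from the answer
    app_dir.mkdir()
    # STATUS is a hypothetical variable, used only to observe the run.
    (app_dir / "__main__.py").write_text(
        "STATUS = 'thisapp started'\n"
        "print(STATUS)\n"
    )
    return app_dir


def run_app(app_dir: Path) -> dict:
    """Run the directory's __main__.py and return the globals it defined."""
    # Given a directory, run_path looks for __main__.py inside it, which is
    # the same lookup that `python /path/to/thisapp` performs.
    return runpy.run_path(str(app_dir), run_name="__main__")


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        result = run_app(make_demo_app(Path(tmp)))
        print(result["STATUS"])
```

In day-to-day use you would not call runpy yourself; users would simply type python /opt/thisapp and the interpreter would do the same lookup.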
What is a standard way to create modules that allow easy reusability (mycompany.myapp.mymodule)?
Read about packages. http://docs.python.org/tutorial/modules.html#packages
A package is a way of creating a module hierarchy: if you make a file called __init__.py in a directory, Python will treat that directory as a package and allow you to import its contents using dotted imports:
spam/
    __init__.py
    ham.py
    eggs.py
import spam.ham
The modules inside a package can reference each other -- see the docs.
If these are all DB maintenance scripts, I would make a package called ourDB or something, and place them all in it. You can have subpackages for the more complicated ones. So if you had a script for, I don't know, cleaning up the transaction logs, you could put it in ourDB.clean and do
import ourDB.clean
ourDB.clean.transaction_logs()
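To make that concrete, here is a small sketch that creates such a package in a temporary directory and calls into it. The names ourDB, clean and transaction_logs come from the answer; the function body is a placeholder, since the real cleanup logic is not shown anywhere in the thread.

```python
import importlib
import sys
import tempfile
from pathlib import Path


def build_package(root: Path) -> None:
    """Write a minimal ourDB package with a clean module into `root`."""
    pkg = root / "ourDB"
    pkg.mkdir()
    (pkg / "__init__.py").write_text("")  # marks the directory as a package
    (pkg / "clean.py").write_text(
        "def transaction_logs():\n"
        "    # Placeholder body: the real script would clean the logs here.\n"
        "    return 'transaction logs cleaned'\n"
    )


def call_cleanup() -> str:
    """Import ourDB.clean from a temporary location and call it."""
    with tempfile.TemporaryDirectory() as tmp:
        build_package(Path(tmp))
        sys.path.insert(0, tmp)
        importlib.invalidate_caches()
        try:
            clean = importlib.import_module("ourDB.clean")
            return clean.transaction_logs()
        finally:
            # Leave the interpreter as we found it.
            sys.path.remove(tmp)
            sys.modules.pop("ourDB.clean", None)
            sys.modules.pop("ourDB", None)


if __name__ == "__main__":
    print(call_cleanup())
```

In a real setup the package would live in your git repository and be installed or put on the path once, rather than generated on the fly.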
I am running ROS Indigo. I have what should be a simple problem: I have a utility class in my package that I want to be callable from our scripts. It only needs to be called within our own package; I don't need it to be available to other ROS packages.
I defined a class named HandControl in a file HandControl.py. All my attempts to import it, or use it without importing, fail. Where in the catkin workspace do I put it: in the root of the package, or in scripts? Do I need __init__.py anywhere (I have tried several places)?
It is good practice to follow the Python and ROS conventions here. Scripts are typically placed in the /scripts directory, and they should not be imported into other Python scripts. Code that you want to reuse belongs in a Python module. Python modules should be placed in /src/package_name, and you should create an __init__.py there as well. Once the package is built (this needs a setup.py and a catkin_python_setup() call in CMakeLists.txt), the module will be available everywhere in your catkin workspace. This structure will likely help you keep things organized in the future, even if you do not seem to need it at the moment. Projects typically grow, and following the guidelines helps keep the code maintainable. For more specific details, check out this python doc.
Erica,
please see this school project, which was written in Python and runs on ROS Indigo. If you look in the /scripts folder, you can see an example of a custom class that is being called from other scripts. If you look into the launch files in /launch, you can see an example of configuring the ROS nodes; maybe that is your problem.
I'm working toward adopting Python as part of my team's development tool suite. With the other languages/tools we use, we develop many reusable functions and classes that are specific to the work we do. This standardizes the way we do things and saves a lot of wheel re-inventing.
I can't seem to find any examples of how this is usually handled with Python. Right now I have a development folder on a local drive, with multiple project folders below that, and an additional "common" folder containing packages and modules with re-usable classes and functions. These "common" modules are imported by modules within multiple projects.
Development/
    Common/
        Package_a/
        Package_b/
    Project1/
        Package1_1/
        Package1_2/
    Project2/
        Package2_1/
        Package2_2/
In trying to learn how to distribute a Python application, it seems that there is an assumption that all referenced packages are below the top-level project folder, not collateral to it. The thought also occurred to me that perhaps the correct approach is to develop common/framework modules in a separate project, and once tested, deploy those to each developer's environment by installing to the site-packages folder. However, that also raises questions re distribution.
Can anyone shed light on this, or point me to a resource that discusses this issue?
If you have common code that you want to share across multiple projects, it may be worth storing this code in a physically separate project, which is then imported as a dependency into your other projects. This is easily achieved if you host your common code project on GitHub or Bitbucket, from where you can use pip to install it in any other project. This approach not only helps you share common code across multiple projects, but also helps protect you from inadvertently creating bad dependencies (i.e. those directed from your common code to your non-common code).
The link below provides a good introduction to using pip and virtualenv to manage dependencies. It is definitely worth a read if you and your team are fairly new to working with Python, as this is a very common toolchain for just this kind of problem:
http://dabapps.com/blog/introduction-to-pip-and-virtualenv-python/
And the link below shows you how to pull in dependencies from github using pip:
How to use Python Pip install software, to pull packages from Github?
The must-read-first on this kind of stuff is here:
What is the best project structure for a Python application?
in case you haven't seen it (and follow the link in the second answer).
The key is that each major package be importable as if "." was the top level directory, which means that it will also work correctly when installed in a site-packages. What this implies is that major packages should all be flat within the top directory, as in:
myproject-0.1/
    myproject/
        framework/
    packageA/
        sub_package_in_A/
            module.py
    packageB/
        ...
Then both you (within your other packages) and your users can import as:
import myproject
import packageA.sub_package_in_A.module
etc
Which means you should think hard about @MattAnderson's comment, but if you want it to appear as a separately-distributable package, it needs to be in the top directory.
Note this doesn't stop you (or your users) from doing an:
import packageA.sub_package_in_A as sub_package_in_A
but it does stop you from allowing:
import sub_package_in_A
directly.
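The difference between the two imports can be checked with a small sketch. It recreates a cut-down version of the layout in a temporary directory, puts that directory on sys.path in the way site-packages would be, and tries both import forms. The helper names and the empty __init__.py files are assumptions added for the demonstration.

```python
import importlib
import sys
import tempfile
from pathlib import Path


def build_tree(top: Path) -> None:
    """Recreate packageA/sub_package_in_A/module.py under `top`."""
    sub = top / "packageA" / "sub_package_in_A"
    sub.mkdir(parents=True)
    (top / "packageA" / "__init__.py").write_text("")
    (sub / "__init__.py").write_text("")
    (sub / "module.py").write_text("WHERE = 'packageA.sub_package_in_A.module'\n")


def can_import(name: str) -> bool:
    """Return True if `name` imports cleanly, False on ImportError."""
    try:
        importlib.import_module(name)
        return True
    except ImportError:
        return False


def check_imports() -> dict:
    """Try the fully dotted import and the bare sub-package import."""
    names = ("packageA.sub_package_in_A.module", "sub_package_in_A")
    with tempfile.TemporaryDirectory() as tmp:
        build_tree(Path(tmp))
        sys.path.insert(0, tmp)  # plays the role of the top directory
        importlib.invalidate_caches()
        try:
            return {name: can_import(name) for name in names}
        finally:
            sys.path.remove(tmp)
            for cached in [m for m in sys.modules if m.split(".")[0] == "packageA"]:
                del sys.modules[cached]


if __name__ == "__main__":
    for name, ok in check_imports().items():
        print(name, "->", "importable" if ok else "not importable")
```

Only the fully dotted name resolves, because sub_package_in_A is not itself sitting in a directory on sys.path.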
...it seems that there is an assumption that all referenced packages are below the top-level project folder, not collateral to it.
That's mainly because the directory containing the script being run (or the current working directory, in an interactive session) is the first entry in sys.path by default, which makes it very convenient to import modules and packages below that directory.
If you remove it, you can't even import stuff from the current working directory...
$ touch foo.py
$ python
>>> import sys
>>> del sys.path[0]
>>> import foo
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ImportError: No module named foo
The thought also occurred to me that perhaps the correct approach is to develop common/framework modules in a separate project, and once tested, deploy those to each developer's environment by installing to the site-packages folder.
It's not really a major issue for development. If you're using version control, and all developers check out the source tree in the same structure, you can easily employ relative path hacks to ensure the code works correctly without having to mess around with environment variables or symbolic links.
However, that also raises questions re distribution.
This is where things can get a bit more complicated, but only if you're planning to release libraries independently of the projects which use them, and/or have multiple project installers share the same libraries. If that's the case, take a look at distutils (or at setuptools, which has since superseded it).
If not, you can simply employ the same relative path hacks used in development to ensure your project works "out of the box".
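For readers wondering what such a relative path hack might look like, here is one possible sketch. It assumes the Development/Common/Project1 layout from the question; add_shared_dir, demo and the ANSWER constant are hypothetical names invented for the example.

```python
import importlib
import sys
import tempfile
from pathlib import Path


def add_shared_dir(anchor_file: str, shared_name: str = "Common") -> Path:
    """Put <project dir>/../<shared_name> at the front of sys.path."""
    project_dir = Path(anchor_file).resolve().parent  # e.g. Development/Project1
    shared_dir = project_dir.parent / shared_name     # e.g. Development/Common
    if str(shared_dir) not in sys.path:
        sys.path.insert(0, str(shared_dir))
    return shared_dir


# Typical use at the top of Development/Project1/main.py:
#     add_shared_dir(__file__)
#     import Package_a


def demo() -> int:
    """Build the layout in a temp dir and import across the two folders."""
    with tempfile.TemporaryDirectory() as tmp:
        dev = Path(tmp) / "Development"
        pkg = dev / "Common" / "Package_a"
        pkg.mkdir(parents=True)
        (pkg / "__init__.py").write_text("ANSWER = 42\n")
        script = dev / "Project1" / "main.py"
        script.parent.mkdir()
        script.write_text("")
        shared = add_shared_dir(str(script))
        importlib.invalidate_caches()
        try:
            return importlib.import_module("Package_a").ANSWER
        finally:
            sys.path.remove(str(shared))
            sys.modules.pop("Package_a", None)


if __name__ == "__main__":
    print(demo())
```

The hack only works as long as every checkout keeps Common next to the project folders, which is the assumption stated in the answer.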
I think that this is the best reference for creating a distributable python package:
link removed as it leads to a hacked site.
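Also, don't feel that you need to nest everything under a single directory. You can do things like
platform/
    core/
        coremodule
    api/
        apimodule
and then do things like from platform.core import coremodule, etc. (In real code, pick a top-level name other than platform, since that one clashes with the standard library's platform module.)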
I started a new Python project and I want to have a good structure from the beginning. I'm reading some Python convention guides, but I can't find any info about how the main script must be named. Are there any rules for this? Are there any other conventions for folders or text files inside the project (like readme files)?
By the way, I'm programming a client-server app, so there is no way for this to become a package (at least in the way I think a package is).
If you want to package your application to allow a ZIP file containing it or its directory to be passed as an argument to the python interpreter to run the application, name your main script __main__.py. If you don't care about being able to do this (and most python applications do not), name it whatever you want.
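As a sketch of the ZIP case, the snippet below writes an archive containing only a __main__.py and hands it to the interpreter. The archive name app.zip and the printed message are invented for the example.

```python
import subprocess
import sys
import tempfile
import zipfile
from pathlib import Path


def build_zip_app(target: Path) -> Path:
    """Write a ZIP archive whose only member is a __main__.py."""
    with zipfile.ZipFile(target, "w") as archive:
        archive.writestr("__main__.py", "print('server starting')\n")
    return target


def run_zip_app(archive: Path) -> str:
    """Run `python <archive>` and return what the application printed."""
    completed = subprocess.run(
        [sys.executable, str(archive)],
        capture_output=True,
        text=True,
        check=True,
    )
    return completed.stdout.strip()


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        print(run_zip_app(build_zip_app(Path(tmp) / "app.zip")))
```

The interpreter finds __main__.py at the root of the archive and runs it. With any other file name inside the archive, the same command would fail.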
No such rule exists for the main script that starts a Python application. There are coding guidelines (PEP 8) that you can follow to keep your code clean, though.
You can look at existing Python applications, which are easy to find: open source/free software projects such as the yum command (on RPM-based distros), and many other Python apps whose source you can check out from public source code repositories (e.g. git repos). You can see which basic principles they follow, but there are no constraints as such.
Is there a standard file in python which lists all the modules comprising the project, and other metadata?
Is this simply the 'package'? Or, do different IDEs use their own specific files?
There really isn't a single file in any package that consistently lists every module the entire package imports. Some people add entries to __init__.py and some don't. Most IDEs with Python support will make available to you whatever is on your PYTHONPATH. Eclipse PyDev, for instance, will add the specific project to the PYTHONPATH of that project space.
If your project is on the PYTHONPATH, then it should resolve.
Application builders like py2app/py2exe will scan the entire project and create an import graph to discover every module needed for that project.
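Although no manifest file exists, the modules inside a package can be discovered at run time. The sketch below uses the standard library's pkgutil for that. It lists what sits directly inside a package and does not build a full import graph the way py2app/py2exe do. The json package is used only because it ships with Python; list_submodules is a made-up helper name.

```python
import json
import pkgutil


def list_submodules(package) -> list:
    """Return the dotted names of the modules found directly inside a package."""
    return sorted(
        info.name
        for info in pkgutil.iter_modules(package.__path__, prefix=package.__name__ + ".")
    )


if __name__ == "__main__":
    for name in list_submodules(json):
        print(name)
```

The same function works on your own package once it is importable, which is roughly the information an IDE collects when it indexes a project.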
There's no real equivalent in Python by itself. Python packages are the way to encapsulate a set of modules and include metadata, but it isn't exactly equivalent to the notion of a "project."
Otherwise, there are some projects which use project files in order to give you some of the features that IDEs provide. In particular, you should check out the rope library and the PyCharm IDE for some systems which implement a project file.
We have a growing library of apps depending on a set of common util modules. We'd like to:
share the same utils codebase between all projects
allow utils to be extended (and fixed!) by developers working on any project
have this be reasonably simple to use for devs (i.e. not a big disruption to workflow)
cross-platform (no diffs for devs on Macs/Win/Linux)
We currently do this "manually", with the utils versioned as part of each app. This has its benefits, but is also quite painful to repeatedly fix bugs across a growing number of codebases.
On the plus side, it's very simple to deal with in terms of workflow - util module is part of each app, so on that side there is zero overhead.
We also considered (fleetingly) using filesystem links or some such, but those are not portable between OSes.
I understand the implications about release testing and breakage, etc. These are less of a problem than the mismatched utils are at the moment.
You can take advantage of Python paths (the paths searched when looking for module to import).
Thus you can create a separate directory for the utils and keep it in a different repository from the projects that use them. Then add the path to this repository to PYTHONPATH.
This way, if you write import mymodule, Python will eventually find mymodule in the directory containing the utils. So, basically, it will work much as it does for standard Python modules.
This way you will have one repository for utils (or separate for each util, if you wish), and separate repositories for other projects, regardless of the version control system you use.
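Here is a minimal sketch of the PYTHONPATH approach. It starts a fresh interpreter with PYTHONPATH pointing at a utils directory and asks where mymodule was found. The directory name utils, the GREETING constant and the helper name are assumptions made for the demonstration.

```python
import os
import subprocess
import sys
import tempfile
from pathlib import Path


def locate_via_pythonpath(utils_dir: Path, module_name: str) -> str:
    """Import a module in a fresh interpreter that gets utils_dir via PYTHONPATH."""
    env = dict(os.environ, PYTHONPATH=str(utils_dir))
    code = f"import {module_name}; print({module_name}.__file__)"
    completed = subprocess.run(
        [sys.executable, "-c", code],
        env=env,
        capture_output=True,
        text=True,
        check=True,
    )
    return completed.stdout.strip()


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        utils = Path(tmp) / "utils"  # stands in for the checked-out utils repository
        utils.mkdir()
        (utils / "mymodule.py").write_text("GREETING = 'hello from utils'\n")
        print(locate_via_pythonpath(utils, "mymodule"))
```

Developers would normally set PYTHONPATH once in their shell profile or IDE settings rather than per process, but the lookup is the same.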
What version control system are you using? If you are using git, take a look at submodules. The idea in this case is that you would keep a single, separate repository with the utils, which would be pulled into the various projects automatically.
I have no direct experience with mercurial, but I believe subrepositories are the equivalent feature.
If you are under SVN... wait... I hope not! :)