Handling util functions in Python

In our current C project, we use Python scripts for support and testing purposes such as unit testing, integration testing, benchmarking, and communication.
Current folder structure (most files not shown):
.
|-- workingcopy1
|-- |-- config
|-- |-- |-- __init__.py
|-- |-- |-- parameters_one.py
|-- |-- |-- parameters_two.py
|-- |-- |-- parameters_three.py
|-- |-- src
|-- |-- |-- main_application_c_code.c
|-- |-- tests
|-- |-- |-- tools
|-- |-- |-- |-- display_communication_activity.py
|-- |-- |-- run_all_unit_tests.py
|-- |-- tools
|-- |-- |-- script1.py
|-- |-- |-- script2.py
|-- |-- |-- script3.py
|-- |-- utils
|-- |-- |-- python_utils
|-- |-- |-- |-- __init__.py
|-- |-- |-- |-- communication_utils.py
|-- |-- |-- |-- conversion_utils.py
|-- |-- |-- |-- constants.py
|-- |-- |-- |-- time_utils.py
|-- workingcopy2
...
|-- workingcopy3
...
Some Python files are intended to be executed as script files ($ python script1.py) and some are intended to be included as modules in other Python files.
What we would like to achieve is a structure that enables us to have parameters and utility functions that can be used by:
Test code
Other utility code
Smaller Python applications used for monitoring our system, e.g. custom benchmarking tools
It should also be possible to have several working copies checked out.
Up until now, all scripts have had the following lines at the top:
import os, sys
current_path = os.path.abspath(os.path.dirname(__file__))
sys.path.append(os.path.join(current_path, '..', 'utils')) # Varies depending on the script's location in the file tree
sys.path.append(os.path.join(current_path, '..', 'config')) # Varies depending on the script's location in the file tree
import python_utils.constants as constants
import config.parameters_one as parameters
With 20+ script files this has become hard to maintain. Is there any better way to achieve this?

You should convert your folders into Python packages by adding an empty __init__.py file to each of them.
You can also add the Python shebang, so the scripts are executable without explicitly calling the python command from the shell (see: Should I put #! (shebang) in Python scripts, and what form should it take?).
Once your folders are packages, you only have to add the main source path, and you will be able to import the child modules in an easier manner.
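For example, a minimal sketch of what each script's header could shrink to, assuming utils (like config) is given an __init__.py so both become importable packages; note that the number of '..' components still depends on the script's depth in the tree:
import os
import sys

# Add the working-copy root once; 'utils' and 'config' then resolve as packages.
repo_root = os.path.abspath(os.path.join(os.path.dirname(__file__), '..'))
if repo_root not in sys.path:
    sys.path.insert(0, repo_root)

from utils.python_utils import constants
from config import parameters_one as parameters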
Furthermore, you should use a virtual environment (virtualenv), which can handle the paths for you (http://docs.python-guide.org/en/latest/dev/virtualenvs/), and perhaps virtualenvwrapper, which gives you extra functionality.
I wanted to add a couple of additional strategies you could use here:
One of the cool things about Python is that everything is an object, so you could import your script modules and pass them as variables to a function that runs them, initialising the appropriate path. The function could also "discover" the scripts by looking into the folder and walking through it; a sketch of that idea follows.
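Here the run_scripts function and the main() convention are hypothetical, purely for illustration:
import importlib.util
import pathlib

def run_scripts(scripts_dir):
    # Walk the folder, import every .py file found, and call its main() if it has one.
    for path in sorted(pathlib.Path(scripts_dir).rglob('*.py')):
        spec = importlib.util.spec_from_file_location(path.stem, path)
        module = importlib.util.module_from_spec(spec)
        spec.loader.exec_module(module)
        if hasattr(module, 'main'):
            module.main()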
Again, all of this can be handled easily from virtualenvwrapper's pre/post-activate hooks.

Related

Can't import local module in Django

I have a directory structure like this:
|-- top-level-directory/
| |-- django-app/
| | |-- some-app/
| | | |-- models.py
| | | |-- ...
| | |-- some-other-app/
| | | |-- ...
| |-- local-module/
| | |-- __init__.py
| | |-- ...
| |-- other-directory-with-scripts/
| | |-- file-where-import-works-fine.py
In some-app/models.py, I have an import local-module statement. However, I get an error saying ModuleNotFoundError: No module named 'local-module'.
I can import the module in other scripts outside the django directory without any problems. Any idea why this could be happening or how I can fix it?
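(A common first check, sketched here assuming the hyphenated names are placeholders for valid Python identifiers such as local_module: make sure the directory that contains the module is on sys.path when Django runs, e.g. near the top of manage.py.)
import os
import sys

# manage.py lives in django-app/, so step up one level to reach the
# directory that contains local_module/ (hypothetical importable name).
sys.path.insert(0, os.path.abspath(os.path.join(os.path.dirname(__file__), '..')))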

PyGTK Builder add_from_file() different path approach with PyCharm

Well, in my app/view.py there's a function call to GTK.Builder().add_from_file("AppView.glade"); however, I have observed that it is interpreted differently depending on how I execute the code.
project
|-- app
| |-- __init__.py
| |-- model.py
| |-- view.py
| |-- ...
|
|-- test
| |-- __init__.py
| |-- model.py
| |-- ...
|
|-- ui
|-- AppView.glade
When I regularly execute (python3 -m app) or test (python3 -m unittest test) the project, the add_from_file function expects the path relative to the project root, that is, ui/AppView.glade.
However, when I run or debug the project's execution or tests from PyCharm, it expects a path relative to the file calling the function, in this case ../ui/AppView.glade.
Could I change this PyCharm behavior?
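(Not an answer about PyCharm settings, but a common way to sidestep the working-directory difference entirely is to resolve the path relative to the source file; a sketch, assuming view.py sits one level below the project root and PyGObject's gi bindings:)
import os
import gi
gi.require_version('Gtk', '3.0')
from gi.repository import Gtk

# Resolve AppView.glade relative to this source file instead of the current
# working directory, so PyCharm and shell launches behave the same.
PROJECT_ROOT = os.path.dirname(os.path.dirname(os.path.abspath(__file__)))
builder = Gtk.Builder()
builder.add_from_file(os.path.join(PROJECT_ROOT, 'ui', 'AppView.glade'))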

Why do some Python packages have repetitive directory names?

The question of what the directory structure of a Python project should be has been asked a number of times on Stack Overflow (e.g. here, here, and here), and many answers have been given. But one thing that doesn't seem to be clear in any of those answers is why some projects have repetitive directories. For example, in this article, which is often cited, the suggested layout is:
<root>/
|-- Twisted/
| |-- __init__.py
| |-- README
| |-- setup.py
| |-- twisted/
| | |-- __init__.py
| | |-- main.py
| | |-- test/
| | | |-- __init__.py
| | | |-- test_main.py
| | | |-- test_other.py
| | |-- bin/
| | | |-- myprogram
In this example, /Twisted/twisted/main.py is the main file.
But then on the other hand you have advice like this:
Many developers are structuring their repositories poorly due to the new bundled application templates.
<root>/
|-- samplesite/
| |-- manage.py
| |-- samplesite/
| | |-- settings.py
| | |-- wsgi.py
| | |-- sampleapp/
| | | |-- models.py
Don't do this.
Repetitive paths are confusing for both your tools and your developers. Unnecessary nesting doesn't help anybody. Let's do it properly:
<root>/
|-- manage.py
|-- samplesite/
| |-- settings.py
| |-- wsgi.py
| |-- sampleapp/
| | |-- models.py
My question is not necessarily "which way is better?", since there may be pros or cons to each way.
Instead, my question is, if I go with the more simplified second style, what will I lose? Is there a good reason to have a /<root>/Twisted/twisted/main.py directory structure rather than just /<root>/twisted/main.py ? Does it make it easier somehow to share my application or make the import process smoother? Something else?
I believe the most common layout of Python projects is something like this:
project/
|-- setup.py
|-- bin/
|-- docs/ ...
|-- examples/ ...
|-- package/
| |-- __init__.py
| |-- module1.py
| |-- module2.py
| |-- subpackage/ ...
|-- tests/ ...
Here project is the name of the project and package is the name of the top-level import, for example scikit-learn and sklearn. The package has everything that Python should be able to import, and you import using the package name, for example from package import thing or from package.module1 import thing. The project has the package and any supporting things like docs, examples, and installation scripts. Notice that there is typically no __init__.py in project, because project is not Python-importable. It is common for the project and package to have the same name, but it is not required.
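As an illustration, a minimal setup.py sketch for that layout, using the placeholder names from the tree above (find_packages is a standard setuptools helper):
from setuptools import setup, find_packages

setup(
    name='project',  # the project name, e.g. scikit-learn
    version='0.1',
    # Picks up 'package' and 'package.subpackage', but not docs/ or examples/.
    packages=find_packages(exclude=['tests', 'docs', 'examples']),
)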
Those two documents are closer than you think. Both Interesting Things, Largely Python and Twisted Related (your first example) and the django-admin startproject docs assume you are outside of the project repository while Structuring Your Project (your second example) assumes you are inside the repository. To quote, "Well, they go to their bare and fresh repository and run the following...".
The Django docs state that if you run
django-admin.py startproject samplesite
both the project directory and project package will be named samplesite, and the project directory will be created in the current working directory.
The command creates the project directory for you, so you certainly shouldn't be inside of an already-created project directory when you run it. The docs go on to say
django-admin startproject myproject /Users/jezdez/Code/myproject_repo
If the optional destination is provided, Django will use that existing directory as the project directory
Now, suppose you were already in /Users/jezdez/Code/myproject_repo. Then you would do
django-admin startproject myproject .
to create the project package in the current directory. Voila, you've got the second author's example! The author was really just telling you to avoid the first form if you are creating your repo before running the command.
So, let's redraw your directory structure. In the first example, <root> is the directory where you hold your dev repos, and Twisted is the directory with your repo. (As an aside, that directory shouldn't have an __init__.py, because it's not a package directory.) In the second example, <root> is the repo directory itself. Supposing I named that directory DjangoExample, the structure would be:
<root>
|-- Twisted/
| |-- __init__.py
| |-- README
| |-- setup.py
| |-- twisted/
| | |-- __init__.py
| | |-- main.py
| | |-- test/
| | | |-- __init__.py
| | | |-- test_main.py
| | | |-- test_other.py
| | |-- bin/
| | | |-- myprogram
|
|-- DjangoExample/
| |-- manage.py
| |-- samplesite/
| | |-- settings.py
| | |-- wsgi.py
| | |-- sampleapp/
| | | |-- models.py
As for other differences, the Django app has to follow the Django framework rules, whereas twisted follows the more generic Python package rules.

Inconsistent behaviour of bdist vs sdist when distributing a Python package

I have a big project with the following structure. utilities is a collection of small modules that are reused in various places by the different components of big_project: project1, project2, etc.
big_project/
|-- __init__.py
|-- utilities/
| |-- mod1.py
| |-- mod2.py
|-- project1/
| |-- setup.py
| |-- __init__.py
| |-- src/
| | |-- __init__.py
| | |-- mod1.py
| | |-- mod2.py
| |-- examples/
| | |-- __init__.py
| | |-- mod.py
|-- project2/
| |-- ...
|-- project3/
| |-- ...
I want to distribute project1, including utilities (because I don't want to distribute utilities separately). The distributed package would have the following structure:
project1/
|-- utilities/
|-- src/
|-- examples/
and project1/setup.py looks like this:
setup(
    name = 'project1',
    packages = ['project1.utilities', 'project1.src', 'project1.examples'],
    package_dir = {'project1.utilities': '../utilities/',
                   'project1.src': 'src',
                   'project1.examples': 'examples'}
)
The problem: python setup.py bdist produces a distribution with the right structure, but python setup.py sdist doesn't:
bdist: content of project1-0.1.linux-x86_64.tar.gz:
/./usr/local/lib/python2.7/site-packages/
|-- project1/
| |-- utilities
| |-- src
| |-- examples
sdist: content of project1-0.1.tar.gz:
project1/
|-- src/
|-- examples/
So sdist left out the utilities module, whereas bdist included it at the correct location. Why?
If anyone wants to look at the real project: https://testpypi.python.org/pypi/microscopy where both the bdist and sdist archives are available.
Both setuptools and distutils produce the same result. Because the project is pure Python, I'd rather use sdist...
One way that seems to work is to use bdist_wheel, which despite its name produces a platform-agnostic built distribution when the content is pure Python. And wheels are supposed to be the new standard.
setup.py also needs to be told about the root package project1, otherwise project1/__init__.py is missing:
setup(
    name = 'project1',
    packages = ['project1',
                'project1.utilities',
                'project1.src',
                'project1.examples'],
    package_dir = {'project1': '.',
                   'project1.utilities': '../utilities/',
                   'project1.src': 'src',
                   'project1.examples': 'examples'}
)
and then
python2.7 setup.py bdist_wheel
I suggest updating your MANIFEST.in file to include the utilities folder, e.g.:
recursive-include ../utilities *

Setting up Python path during development

I work on a couple of different programs and packages in Python. They are each developed in their own Git repository, but frequently need to import modules defined in other packages. For instance, during development, the directory structure looks something like:
src/
|-- project-a/
| |-- client.py
| |-- server.py
| |-- package-a/
| | |-- __init__.py
| | |-- module.py
|-- project-b/
| |-- package-b/
| | |-- __init__.py
| | |-- other_module.py
| |-- package-c/
| | |-- __init__.py
| | |-- third_module.py
|-- project-c/
| |-- server1.py
| |-- server2.py
| |-- package-d/
| |-- package-e/
| |-- package-f/
When they are all installed, they work fine; they are all installed such that each package is in your Python path, and you can import from them as you need.
However, in development, I want the development version of each of these to be in my Python path, not the installed version. When making changes, I don't want to have to install each package that I'm changing to test it, I want the changes to take effect immediately. That means my Python path needs to include the directories project-a, project-b, etc.
Our current solution is just to have an environment.bash in the top level, which you source in your shell and it sets PYTHONPATH. That works OK, but I frequently forget to do so; since this is a client server application, with communications between servers, I need to have at least four windows open to different VMs to run this, and it happens pretty often that I forget to source environment.bash in at least one of those, leading me to try debugging strange behavior until I realize I'm importing the wrong things.
Another solution would be to set sys.path from within the top level client.py or server.py. This would work fine for launching them directly, but I would also need the path set up for running tools like Pylint or Sphinx, which that solution wouldn't cover. I'd also need a way to distinguish between running from source (when I want the path to include . and ../project-b) and running the installed version (which should use the standard path without modification).
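For what it's worth, a sketch of that in-script approach; detecting "running from source" by probing for a sibling checkout is an assumption, not an established convention:
import os
import sys

_here = os.path.dirname(os.path.abspath(__file__))
_sibling = os.path.abspath(os.path.join(_here, '..', 'project-b'))

# Only prepend development paths when a sibling checkout exists; the
# installed version keeps the standard path untouched.
if os.path.isdir(_sibling):
    for p in (_here, _sibling, os.path.abspath(os.path.join(_here, '..', 'project-c'))):
        if p not in sys.path:
            sys.path.insert(0, p)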
Another choice would be to have a Makefile which sets up PYTHONPATH appropriately for various targets like make run-server, make lint, make doc, and so on. That's OK for those targets, which don't require any options, but would be inconvenient for running the client, which takes arguments. make run-client ARGS='foo bar' is a fairly cumbersome way to invoke it.
Is there any common way of setting up the Python path during development so that both my executables and tools like Pylint and Sphinx can pick it up appropriately, without interfering with how it will behave when installed?
A straightforward solution would be to simply symlink the directories for each module into a separate folder, and run things from there. That way Python sees them all as being in the same location, even though the actual sources are in different repositories.
src/
|-- project-a/
| |-- client.py
| |-- server.py
| |-- package-a/
| | |-- __init__.py
| | |-- module.py
|-- project-b/
| |-- package-b/
| | |-- __init__.py
| | |-- other_module.py
| |-- package-c/
| | |-- __init__.py
| | |-- third_module.py
run/
|-- client.py --> ../src/project-a/client.py
|-- server.py --> ../src/project-a/server.py
|-- package-a/ --> ../src/project-a/package-a/
|-- package-b/ --> ../src/project-b/package-b/
|-- package-c/ --> ../src/project-b/package-c/
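Creating those links could itself be scripted; a minimal sketch where the link map simply mirrors the listing above:
import os

LINKS = {
    'client.py': '../src/project-a/client.py',
    'server.py': '../src/project-a/server.py',
    'package-a': '../src/project-a/package-a',
    'package-b': '../src/project-b/package-b',
    'package-c': '../src/project-b/package-c',
}

os.makedirs('run', exist_ok=True)
for name, target in LINKS.items():
    link = os.path.join('run', name)
    if not os.path.lexists(link):
        os.symlink(target, link)  # relative targets, as in the listing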
