How to share Python classes up a directory tree?

I have an example file structure provided below.
/.git
/README.md
/project
    /Operation A
        generateinsights.py
        insights.py
    /Operation B
        generatetargets.py
        targets.py
generateinsights.py is run first; it references insights.py to get the definition of an insight object. Next, generatetargets.py is run; it references targets.py to get the definition of a target object. The issue I have is that generatetargets.py also needs to understand what an insight object is. How can I set up my imports so that insights.py and targets.py can be referenced by anything in the project directory? It seems like I should use __init__.py for this, but I can't get it to work properly.

Firstly, you have to rename Operation A and Operation B so that they are composed of only letters, numbers and underscores, for example Operation_A; this is needed so they can be used in an import statement without raising a SyntaxError.
Then, put an __init__.py file into the project, Operation_A and Operation_B folders. You can leave it empty, or use it to define additional attributes for your package.
Finally, you need to make Python find your modules - for this, either:
set your PYTHONPATH environment variable so that it includes the folder containing project or
put the package folder somewhere into Python's default import directories, for example `/usr/lib/python3/site-packages` (requires root permissions)
After that you can import both targets.py and insights.py from any place like this:
from project.Operation_A import insights
from project.Operation_B import targets
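For instance, with the renamed folders and __init__.py files in place, generatetargets.py could look like the sketch below. The class names Insight and Target are assumptions, since the contents of insights.py and targets.py aren't shown:

# project/Operation_B/generatetargets.py
from project.Operation_A.insights import Insight   # hypothetical class name
from project.Operation_B.targets import Target     # hypothetical class name

def generate_targets(insights):
    # build a Target from each Insight (illustrative logic only)
    return [Target(i) for i in insights]

If you'd rather not set PYTHONPATH, you can instead append the folder containing project to sys.path at the top of your entry-point script, before these imports run.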

Related

Is it possible to make __init__ files describe parent package dynamically?

So I have a tree of modules and packages; it's two levels deep.
My goal, at top level, in main.py, is to refer to a sub-module using this syntax
import pkg1
pkg1.bebrs.fn_3.run()
The dot, dot, dot notation is very important to me.
Inside the __init__.py in the bebrs folder, my init states
from pkg1.bebrs import fn_3
It explicitly states the names of the parent and current packages.
Is there any way to write pkg1 and bebrs in a more generalized way inside from pkg1.bebrs import fn_3? I wish I could just put generic __init__.py files and be able to import nested stuff with dot notation, instead of having a very explicit nested import inside each nested __init__.py file.
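For what it's worth, relative imports let each __init__.py name only its direct children, without spelling out the parent package. A minimal sketch, assuming the layout described in the question:

# pkg1/__init__.py
from . import bebrs        # relative: no need to write "pkg1"

# pkg1/bebrs/__init__.py
from . import fn_3         # relative: no need to write "pkg1.bebrs"

# main.py
import pkg1
pkg1.bebrs.fn_3.run()

The same one-line from . import <child> pattern works at any nesting depth, so the __init__.py files stay generic.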

__init__.py and proper initialization

I created a "project" with Spyder and wrote a few classes that I stored into separate files (I would call the files "module" but I am not sure it is the proper way to call it).
I read that the __init__.py file is meant to avoid unwanted overwriting of names.
Is it a proper way to program in Python to use __init__.py to initialize things for all my "modules": define constants (e.g. debug level), "calculated" values (e.g. system encoding), imports, start a global debugging function (tuned for my needs), etc.?
In other words: what is the best way to do the initialization?
Leave __init__.py empty and initialize my values in another "module" that is imported (a) at the beginning of each "module", or (b) by the __init__() of each class?
If I use __init__.py and import it in every "module" for initialization, Spyder warns because I don't use it explicitly: am I actually doing something wrong, or should I ignore the warning?
Should I put the definition of constants in a Config.ini file and read them through configparser.ConfigParser(), or just define them directly in the initialization module (__init__.py or another)?
Thanks in advance for your insights.
Pierre.
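For reference, the configparser route mentioned in the question can be as small as the sketch below; the file name is taken from the question, but the section and key names are hypothetical:

# settings.py
import configparser

_parser = configparser.ConfigParser()
_parser.read("Config.ini")

# hypothetical section/key; fallback keeps this working when the file is absent
DEBUG_LEVEL = _parser.getint("general", "debug_level", fallback=0)

Other modules can then do from settings import DEBUG_LEVEL, which keeps __init__.py empty and makes the dependency explicit, so Spyder has nothing to warn about.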

Insert Python Packages from Separate Directory into a different Namespace

Suppose in my Python path, I had the namespace foo. I have modules in a separate directory (not in the python path) called bar: x.py, y.py, z.py. So the layout might look something like this:
|--/python/path/site-packages/foo/
|----__init__.py
|--...
|--/some/other/directory/bar/
|----__init__.py
|----x.py
|----y.py
|----z.py
So, given that foo is already in my path, I can easily do import foo. However, is there any sort of black magic I can add to that foo/__init__.py so that in my Python shell, I can start doing something like from foo import x or from foo.x import my_function? Ideally looking for a solution that works on both Python 2.7 and Python 3.6, but that isn't strict.
EDIT: I wanted to add that bar/ could also have sub-folders or sub-packages, in the ideal scenario.
I forgot that I had asked this question here, but in case anyone else ends up here, this is what I ended up doing.
# /python/path/site-packages/foo/__init__.py
__path__.append("/some/other/directory/bar/")
The __path__ of a package tells Python which directories to search for that package's submodules.
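Once that line has run, the imports resolve as hoped. A small usage sketch, assuming x.py defines my_function as in the question:

import foo                     # executes foo/__init__.py, which extends __path__
from foo import x              # found in /some/other/directory/bar/
from foo.x import my_function  # works the same way

Sub-packages under bar/ work too, as long as they have their own __init__.py files, since submodule lookups also go through the extended __path__.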

Recursively populating __all__ in __init__.py

I'm using the following code to populate __all__ in my module's __init__.py and I was wondering if there was a more efficient way. Any ideas?
import fnmatch
import os

__all__ = []
basedir = os.path.dirname(__file__)
for root, dirnames, filenames in os.walk(basedir):
    root = root[len(basedir):]
    for filename in fnmatch.filter(filenames, "*.py"):
        __all__.append(os.path.join(root, filename[:-3]))
You probably shouldn't be doing this: The default behaviour of import is quite flexible. If you don't want a module (or any other variable) to be automatically exported, give it a name that starts with _ and python won't export it. That's the standard python way, and reinventing the wheel is considered unpythonic. Also, don't forget that other things besides modules may need exporting; once you set __all__, you'll need to find and export them as well.
Still, you ask how to best generate a list of your exportable modules. Since you can't export what's not present, I'd just check what modules of your own are known to your main module:
import os
import sys

basedir = os.path.dirname(__file__)
for m in sys.modules:
    if m in locals() and not m.startswith('_'):  # only export regular names
        mod = locals()[m]
        if '__file__' in mod.__dict__ and mod.__file__.startswith(basedir):
            print(m)
sys.modules includes the names of every module that Python has loaded, including many that have never been bound to a name in your main module, so we check whether they're in locals().
This is faster than scanning your filesystem, and more robust than assuming that every .py file in your directory tree will somehow end up as a top-level submodule. Naturally you should run this code near the end of your __init__.py, when everything has been loaded.
I work with a few complex packages that have sub-packages and sub-modules. I like to control this on a module by module basis. I use a simple package called auto-all which makes it easy (full disclosure - I am the author).
https://pypi.org/project/auto-all/
Here's an example:
from auto_all import start_all, end_all

def internal_helper():
    pass  # defined before start_all: left out of __all__, no underscore prefix needed
start_all(globals())

def public_function():
    pass  # defined between the markers: added to __all__
end_all(globals())
The reason I use this approach is mainly because of imports. As mentioned by alexis, you can implicitly make things private by prefixing object names with an underscore, however this can get messy or just impractical for imported objects. Consider the following code:
from pyspark.sql.session import SparkSession
If this appears in your module then you will be implicitly making SparkSession available to be accessed from outside the module. The alternative is to prefix all imported items with underscores, for example:
from pyspark.sql.session import SparkSession as _SparkSession
This also isn't ideal, so manually managing __all__ is the only way (I'm aware of) to manage what you make externally available.
You can easily do this by explicitly setting the contents of the __all__ variable (which is the pythonic way), but this can become tedious when managing a large number of objects, and can also lead to issues if a developer adds a new object and doesn't expose it by adding to the __all__ variable. This type of thing can slip through code reviews. Using simple helper functions to manage the variable contents makes this much easier.
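For comparison, the explicit version looks like the sketch below, with stand-in names; note that __all__ only controls what from module import * exposes:

from os.path import join   # an imported helper that would otherwise leak into star-imports

def public_fn():
    """The only name this module advertises."""
    pass

__all__ = ["public_fn"]    # star-imports now expose public_fn only; join stays out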

Python includes, module scope issue

I'm working on my first significant Python project and I'm having trouble with scope issues and executing code in included files. My previous experience is with PHP.
What I would like to do is have one single file that sets up a number of configuration variables, which would then be used throughout the code. I also want to make certain functions and classes available globally. For example, the main file would include a single other file, and that file would load a bunch of commonly used functions (each in its own file) and a configuration file. Within those loaded files, I also want to be able to access the functions and configuration variables. What I don't want to do is put the entire routine at the beginning of each (included) file to include all of the rest. Also, these included files are in various sub-directories, which makes it much harder to import them (especially if I have to re-import in every single file).
Anyway I'm looking for general advice on the best way to structure the code to achieve what I want.
Thanks!
In Python, it is a common practice to have a bunch of modules that implement various functions and then have one single module that is the point of access to all the functions. This is basically the facade pattern.
An example: say you're writing a package foo, which includes the bar, baz, and moo modules.
~/project/foo
~/project/foo/__init__.py
~/project/foo/bar.py
~/project/foo/baz.py
~/project/foo/moo.py
~/project/foo/config.py
What you would usually do is write __init__.py like this:
from foo.bar import func1, func2
from foo.baz import func3, constant1
from foo.moo import func1 as moofunc1
from foo.config import *
Now, when you want to use the functions you just do
import foo
foo.func1()
print(foo.constant1)
# assuming config defines a config1 variable
print(foo.config1)
If you wanted, you could arrange your code so that you only need to write
import foo
at the top of every module, and then access everything through foo (which you should probably name "globals" or something to that effect). If you don't like namespaces, you could even do
from foo import *
and have everything as global, but this is really not recommended. Remember: namespaces are one honking great idea!
This is a two-step process:
In your module globals.py import the items from wherever.
In all of your other modules, do "from globals import *"
This brings all of those names into the current module's namespace.
Now, having told you how to do this, let me suggest that you don't. First of all, you are loading up the local namespace with a bunch of "magically defined" entities. This violates precept 2 of the Zen of Python, "Explicit is better than implicit." Instead of "from foo import *", try using "import foo" and then saying "foo.some_value". If you want to use the shorter names, use "from foo import mumble, snort". Either of these methods directly exposes the actual use of the module foo.py. Using the globals.py method is just a little too magic. The primary exception to this is in an __init__.py where you are hiding some internal aspects of a package.
Globals are also semi-evil in that it can be very difficult to figure out who is modifying (or corrupting) them. If you have well-defined routines for getting/setting globals, then debugging them can be much simpler.
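A minimal sketch of such routines; the module and names are hypothetical:

# appglobals.py
_values = {"debug": False}

def get_setting(name):
    return _values[name]

def set_setting(name, value):
    # one choke point for all mutation: easy to log or breakpoint
    # when debugging who is modifying (or corrupting) shared state
    _values[name] = value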
I know that PHP has this "everything is one, big, happy namespace" concept, but it's really just an artifact of poor language design.
As far as I know, program-wide global variables/functions/classes/etc. do not exist in Python; everything is "confined" to some module (namespace). So if you want some functions or classes to be used in many parts of your code, one solution is creating modules like "globFunCl" (defining/importing everything you want to be "global") and "config" (containing configuration variables) and importing those everywhere you need them. If you don't like the idea of using nested namespaces you can use:
from globFunCl import *
This way you'll "hide" namespaces (making names look like "globals").
I'm not sure what you mean by not wanting to "put the entire routine at the beginning of each (included) file to include all of the rest"; I'm afraid you can't really escape from that. Check out Python packages, though; they should make it easier for you.
This depends a bit on how you want to package things up. You can either think in terms of files or modules. The latter is "more pythonic", and enables you to decide exactly which items (and they can be anything with a name: classes, functions, variables, etc.) you want to make visible.
The basic rule is that for any file or module you import, anything directly in its namespace can be accessed. So if myfile.py contains definitions def myfun(...): and class myclass(...) as well as myvar = ... then you can access them from another file by
import myfile
y = myfile.myfun(...)
x = myfile.myvar
or
from myfile import myfun, myvar, myclass
Crucially, anything at the top level of myfile is accessible, including imports. So if myfile contains from foo import bar, then myfile.bar is also available.
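A minimal illustration of that last point, assuming a foo module that defines a name bar:

# myfile.py
from foo import bar    # bar becomes an attribute of myfile

# client code
import myfile
print(myfile.bar)      # works: top-level imports are re-exported like any other name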
