I have defined several classes in a single Python file and would like to turn them into a library. Ideally I would import the library so that I can use the classes without a prefix (myclass() rather than mylibrary.myclass()), if "prefix" is the right word; I am not sure, as I am a beginner.
What is the proper way to achieve this, or failing that the best alternative? Should I define all classes in __init__.py? Keep them all in a single file, as I do now, such as AllMyClasses.py? Or have a separate file for every class in the library directory, such as FirstClass.py, SecondClass.py, etc.?
I realize this should be easy enough to google, but since I am still quite new to Python and programming in general, I haven't figured out the correct keywords for this kind of problem (hence my uncertainty about "prefix").
More information can be found in the tutorial on modules (single files) or packages (when in a directory with an __init__.py file) on the python site.
The suggested way (according to the style guide) is to spell out each class import specifically.
from my_module import MyClass1, MyClass2
object1 = MyClass1()
object2 = MyClass2()
You can also shorten the module name:
import my_module as mo
object1 = mo.MyClass1()
Using from my_module import * is best avoided, as it can be confusing (even though it is the conventional way to import some libraries, such as tkinter).
If it's for your personal use, you can just put all your classes Class1, Class2, ... in a file myFile.py. To use them from within another script, write import myFile (without the .py extension):
import myFile
myVar1 = myFile.Class1()
myVar2 = myFile.Class2()
If you want to be able to use the classes without the file name prefix, import the file like this:
from myFile import *
Note that the file you want to import should be in a directory where Python can find it (the same where the script is running or a directory in PYTHONPATH).
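As a rough illustration of where Python looks for modules, the sketch below writes a small module into a temporary directory, puts that directory on sys.path, and imports it. The file name my_classes.py and the variable names are invented for this example; setting PYTHONPATH before starting Python adds directories to the same search path.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Hypothetical module, written to a temporary directory for this demo only.
module_dir = Path(tempfile.mkdtemp())
(module_dir / "my_classes.py").write_text(
    "class Class1:\n"
    "    pass\n"
)

# Python searches the directories listed in sys.path, in order.
sys.path.insert(0, str(module_dir))
importlib.invalidate_caches()

import my_classes  # found because module_dir is now on sys.path

obj = my_classes.Class1()
print(type(obj).__name__)
```

Without the sys.path.insert line, the import would fail with ModuleNotFoundError unless the file happened to sit next to the running script.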
The __init__.py file is needed if you want to create a Python package for distribution. Here are the instructions: Distributing Python Modules
EDIT after checking the Python's style guide PEP 8 on imports:
Wildcard imports (from <module> import *) should be avoided, as they make it unclear which names are present in the namespace, confusing both readers and many automated tools.
So in this example you should have used
from myFile import Class1, Class2
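To make the PEP 8 warning concrete, here is a small sketch of what can go wrong. The module names reader_a and reader_b and the helper make_module are made up for this example, and the modules are built in memory so that the snippet runs on its own. Both modules define a function called load, and the wildcard imports let the second one silently replace the first.

```python
import sys
import types

def make_module(name, source):
    """Create an in-memory module from source text and register it."""
    module = types.ModuleType(name)
    exec(source, module.__dict__)
    sys.modules[name] = module
    return module

# Two hypothetical modules that both happen to define `load`.
make_module("reader_a", "def load():\n    return 'from reader_a'\n")
make_module("reader_b", "def load():\n    return 'from reader_b'\n")

# Wildcard imports: the second import silently replaces the first `load`.
wildcard_ns = {}
exec("from reader_a import *\nfrom reader_b import *\n", wildcard_ns)
print(wildcard_ns["load"]())

# Explicit imports: the clash is visible and easy to resolve with aliases.
explicit_ns = {}
exec(
    "from reader_a import load as load_a\n"
    "from reader_b import load as load_b\n",
    explicit_ns,
)
print(explicit_ns["load_a"](), explicit_ns["load_b"]())
```

The dictionaries wildcard_ns and explicit_ns stand in for the namespaces of two client scripts; in real code the import lines would simply sit at the top of each script.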
I have a file, myfile.py, which imports Class1 from file.py and file.py contains imports to different classes in file2.py, file3.py, file4.py.
In my myfile.py, can I access these classes or do I need to again import file2.py, file3.py, etc.?
Does Python automatically add all the imports included in the file I imported, and can I use them automatically?
Best practice is to import every module that defines identifiers you need, and use those identifiers as qualified by the module's name; I recommend using from only when what you're importing is a module from within a package. The question has often been discussed on SO.
Importing a module, say moda, from many modules (say modb, modc, modd, ...) that need one or more of the identifiers moda defines, does not slow you down: moda's bytecode is loaded (and possibly built from its sources, if needed) only once, the first time moda is imported anywhere, then all other imports of the module use a fast path involving a cache (a dict mapping module names to module objects that is accessible as sys.modules in case of need... if you first import sys, of course!-).
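The caching described above can be observed directly. The following is a minimal sketch: it writes a throwaway moda.py into a temporary directory (the file contents and variable names are assumptions made for the demo), imports it twice, and checks that both imports give back the very same module object from sys.modules.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Throwaway module whose body announces when it is executed.
demo_dir = Path(tempfile.mkdtemp())
(demo_dir / "moda.py").write_text(
    "print('moda body is running')\n"
    "VALUE = 42\n"
)
sys.path.insert(0, str(demo_dir))
importlib.invalidate_caches()

import moda  # first import: the module body runs and the message is printed
first = sys.modules["moda"]

import moda  # second import: served from the sys.modules cache, no message
second = importlib.import_module("moda")

print(first is second)
```

The message from the module body appears only once, no matter how many other modules import moda afterwards.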
Python doesn't automatically introduce anything into the namespace of myfile.py, but you can access everything that is in the namespaces of all the other modules.
That is to say, if in file1.py you did from file2 import SomeClass and in myfile.py you did import file1, then you can access it within myfile as file1.SomeClass. If in file1.py you did import file2 and in myfile.py you did import file1, then you can access the class from within myfile as file1.file2.SomeClass. (These aren't generally the best ways to do it, especially not the second example.)
This is easily tested.
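One way to test it, sketched with in-memory stand-ins for file1.py and file2.py so that it runs without creating files (the helper make_module and the dictionary myfile_ns are inventions of this example, with myfile_ns playing the role of the namespace of myfile.py):

```python
import sys
import types

def make_module(name, source):
    """Create an in-memory module from source text and register it."""
    module = types.ModuleType(name)
    sys.modules[name] = module
    exec(source, module.__dict__)
    return module

# file2 defines the class; file1 pulls it in in both ways discussed above.
make_module("file2", "class SomeClass:\n    pass\n")
make_module("file1", "import file2\nfrom file2 import SomeClass\n")

# Stand-in for the namespace of myfile.py, which only does `import file1`.
myfile_ns = {}
exec("import file1", myfile_ns)
file1 = myfile_ns["file1"]

# Nothing from file2 was added to myfile's own namespace.
print("SomeClass" in myfile_ns)
# Both access paths reach the same class object.
print(file1.SomeClass is file1.file2.SomeClass)
```

The first check prints False and the second prints True: the class is reachable through file1, but only by spelling out the path.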
In the myfile module, you can either do from file import ClassFromFile2 or from file2 import ClassFromFile2 to access ClassFromFile2, assuming that the class is also imported in file.
This technique is often used to simplify the API a bit. For example, a db.py module might import various things from the modules mysqldb, sqlalchemy and some other helpers. Then, everything can be accessed via the db module.
If you use a wildcard import, yes: a wildcard import creates new names in your current namespace for the contents of the imported module. Otherwise, you need to go through the namespace of the module you imported, as usual.
I'm using the following code to populate __all__ in my module's __init__.py and I was wondering if there was a more efficient way. Any ideas?
import fnmatch
import os
__all__ = []
for root, dirnames, filenames in os.walk(os.path.dirname(__file__)):
    root = root[os.path.dirname(__file__).__len__():]
    for filename in fnmatch.filter(filenames, "*.py"):
        __all__.append(os.path.join(root, filename[:-3]))
You probably shouldn't be doing this: The default behaviour of import is quite flexible. If you don't want a module (or any other variable) to be automatically exported, give it a name that starts with _ and python won't export it. That's the standard python way, and reinventing the wheel is considered unpythonic. Also, don't forget that other things besides modules may need exporting; once you set __all__, you'll need to find and export them as well.
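A small sketch of the default behaviour, using two made-up in-memory modules (no_all and with_all) and a hypothetical helper star_import that reports which names a wildcard import would bind:

```python
import sys
import types

def make_module(name, source):
    """Create an in-memory module from source text and register it."""
    module = types.ModuleType(name)
    exec(source, module.__dict__)
    sys.modules[name] = module
    return module

# Without __all__, names starting with an underscore are not exported.
make_module("no_all", "public = 1\n_private = 2\n")
# With __all__, only the listed names are exported.
make_module("with_all", "__all__ = ['chosen']\nchosen = 1\nnot_listed = 2\n")

def star_import(module_name):
    """Return the names that `from module_name import *` would bind."""
    namespace = {}
    exec(f"from {module_name} import *", namespace)
    return sorted(name for name in namespace if name != "__builtins__")

print(star_import("no_all"))    # ['public']
print(star_import("with_all"))  # ['chosen']
```

Note that not_listed is a perfectly ordinary public name, yet it is dropped from the wildcard import as soon as __all__ exists; this is the "you'll need to find and export them as well" point above.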
Still, you ask how to best generate a list of your exportable modules. Since you can't export what's not present, I'd just check what modules of your own are known to your main module:
import os
import sys

basedir = os.path.dirname(__file__)
for m in list(sys.modules):  # copy the keys in case sys.modules changes
    if m in locals() and not m.startswith('_'):  # Only export regular names
        mod = locals()[m]
        if '__file__' in mod.__dict__ and mod.__file__.startswith(basedir):
            print(m)
sys.modules includes the names of every module that python has loaded, including many that have not been exported to your main module-- so we check if they're in locals().
This is faster than scanning your filesystem, and more robust than assuming that every .py file in your directory tree will somehow end up as a top-level submodule. Naturally you should run this code near the end of your __init__.py, when everything has been loaded.
I work with a few complex packages that have sub-packages and sub-modules. I like to control this on a module by module basis. I use a simple package called auto-all which makes it easy (full disclosure - I am the author).
https://pypi.org/project/auto-all/
Here's an example:
from auto_all import start_all, end_all
# Define some internal stuff
start_all(globals())
# Define some external stuff
end_all(globals())
The reason I use this approach is mainly because of imports. As mentioned by alexis, you can implicitly make things private by prefixing object names with an underscore, however this can get messy or just impractical for imported objects. Consider the following code:
from pyspark.sql.session import SparkSession
If this appears in your module then you will be implicitly making SparkSession available to be accessed from outside the module. The alternative is to prefix all imported items with underscores, for example:
from pyspark.sql.session import SparkSession as _SparkSession
This also isn't ideal, so manually managing __all__ is the only way (I'm aware of) to manage what you make externally available.
You can easily do this by explicitly setting the contents of the __all__ variable (which is the pythonic way), but this can become tedious when managing a large number of objects, and can also lead to issues if a developer adds a new object and doesn't expose it by adding to the __all__ variable. This type of thing can slip through code reviews. Using simple helper functions to manage the variable contents makes this much easier.
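To give an idea of how such helper functions could work, here is a simplified sketch based on taking a snapshot of the namespace. It is not the auto-all implementation: the functions begin_exports and finish_exports are hypothetical names invented for this example, and the "module" is simulated by executing source text in a plain dictionary so that the snippet runs on its own.

```python
def begin_exports(namespace):
    """Remember which names exist before the public section starts."""
    namespace["_names_before_exports"] = set(namespace)

def finish_exports(namespace):
    """Set __all__ to the public names defined since begin_exports()."""
    before = namespace.pop("_names_before_exports")
    new_names = set(namespace) - before
    namespace["__all__"] = sorted(
        name for name in new_names if not name.startswith("_")
    )

# Simulated module body; in real code this would be the module file itself.
module_source = '''
from collections import OrderedDict   # imported helper, should stay private

begin_exports(globals())

def public_function():
    return OrderedDict()

PUBLIC_CONSTANT = 1

finish_exports(globals())
'''
module_ns = {"begin_exports": begin_exports, "finish_exports": finish_exports}
exec(module_source, module_ns)

print(module_ns["__all__"])  # ['PUBLIC_CONSTANT', 'public_function']
```

Because OrderedDict was imported before begin_exports() ran, it stays out of __all__ without needing an underscore alias.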
I am building an application in Python and I have my whole package. While I really like the fact that you have to explicitly state every import you need, I was wondering if there is a way to add some function or class to the global scope implicitly.
In my example I want a Factory class that should be available in all files. Classes like dict, str and so on are all available and I thought maybe it is possible to add my own class to the global scope in the same way in my __init__.py.
Is this possible?
For interactive mode, add all your import definitions to a file (say all_my_imports.py) like below:
from abc import xyz
from my_stuff import *
And, point the environment variable PYTHONSTARTUP to it.
From a script, simply import the file that contains the above definitions:
from all_my_imports import *
Remember, it is not good to depend on this functionality, it's (almost) always better to explicitly import all your modules.
I like the Java convention of having one public class per file, even if there are sometimes good reasons to put more than one public class into a single file. In my case I have alternative implementations of the same interface. But if I would place them into separate files, I'd have redundant names in the import statements (or misleading module names):
import someConverter.SomeConverter
whereas someConverter would be the file (and module) name and SomeConverter the class name. This looks pretty inelegant to me. To put all alternative classes into one file would lead to a more meaningful import statement:
import converters.SomeConverter
But I fear that the files become pretty large, if I put all related classes into a single module file. What is the Python best practise here? Is one class per file unusual?
A lot of it is personal preference. Using python modules, you do have the option to keep each class in a separate file and still allow for import converters.SomeConverter (or from converters import SomeConverter)
Your file structure could look something like this:
* converters
    - __init__.py
    - baseconverter.py
    - someconverter.py
    - otherconverter.py
and then in your __init__.py file:
from baseconverter import BaseConverter
from otherconverter import OtherConverter
Zach's solution breaks on Python 3, which no longer supports implicit relative imports. Keeping the same file structure, the fixed __init__.py uses absolute imports:
from converters.baseconverter import BaseConverter
from converters.otherconverter import OtherConverter
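For anyone who wants to try this layout without setting up a project, the sketch below builds a tiny converters package in a temporary directory and imports a class through the package. The class bodies and the convert method are invented for the demo; only the file names and the style of the __init__.py imports come from the answer.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Build the package on disk in a temporary directory.
root = Path(tempfile.mkdtemp())
package_dir = root / "converters"
package_dir.mkdir()

(package_dir / "baseconverter.py").write_text(
    "class BaseConverter:\n"
    "    def convert(self, value):\n"
    "        raise NotImplementedError\n"
)
(package_dir / "someconverter.py").write_text(
    "from converters.baseconverter import BaseConverter\n"
    "\n"
    "class SomeConverter(BaseConverter):\n"
    "    def convert(self, value):\n"
    "        return str(value)\n"
)
# __init__.py re-exports the classes so callers need not know the file names.
(package_dir / "__init__.py").write_text(
    "from converters.baseconverter import BaseConverter\n"
    "from converters.someconverter import SomeConverter\n"
)

sys.path.insert(0, str(root))
importlib.invalidate_caches()

from converters import SomeConverter  # one class per file, short import

print(SomeConverter().convert(3))
```

Inside a package, the explicit relative form (from .baseconverter import BaseConverter) is an alternative to the absolute imports used here.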
The above solutions are good, but the problem with importing modules in __init__.py is that it can cause modules to be loaded twice (which is inefficient), for example when a module in the package is also run directly as a script. Try adding a print statement at the end of otherconverter.py and running otherconverter.py. (You may see the print statement executed twice.)
I prefer the following. Use another package named "_converters" and define everything there. Your "converters.py" then becomes the interface for accessing all public members:
* _converters
    - __init__.py
    - baseconverter.py
    - someconverter.py
    - otherconverter.py
* converters.py
where converters.py is
from _converters.someconverter import SomeConverter
from _converters.otherconverter import OtherConverter
...
...
...
converters = [SomeConverter, OtherConverter, ...]
And as the previous solutions mentioned, it is a personal choice. A few practices involve defining a module "interface.py" within the package and importing all public members there. If you have many modules to load, you should choose efficiency over aesthetics.
I'm working on my first significant Python project and I'm having trouble with scope issues and executing code in included files. My previous experience is with PHP.
What I would like to do is have one single file that sets up a number of configuration variables, which would then be used throughout the code. Also, I want to make certain functions and classes available globally. For example, the main file would include a single other file, and that file would load a bunch of commonly used functions (each in its own file) and a configuration file. Within those loaded files, I also want to be able to access the functions and configuration variables. What I don't want to do, is to have to put the entire routine at the beginning of each (included) file to include all of the rest. Also, these included files are in various sub-directories, which is making it much harder to import them (especially if I have to re-import in every single file).
Anyway I'm looking for general advice on the best way to structure the code to achieve what I want.
Thanks!
In python, it is a common practice to have a bunch of modules that implement various functions and then have one single module that is the point-of-access to all the functions. This is basically the facade pattern.
An example: say you're writing a package foo, which includes the bar, baz, and moo modules.
~/project/foo
~/project/foo/__init__.py
~/project/foo/bar.py
~/project/foo/baz.py
~/project/foo/moo.py
~/project/foo/config.py
What you would usually do is write __init__.py like this:
from foo.bar import func1, func2
from foo.baz import func3, constant1
from foo.moo import func1 as moofunc1
from foo.config import *
Now, when you want to use the functions you just do
import foo
foo.func1()
print(foo.constant1)
# assuming config defines a config1 variable
print(foo.config1)
If you wanted, you could arrange your code so that you only need to write
import foo
at the top of every module, and then access everything through foo (which you should probably name "globals" or something to that effect). If you don't like namespaces, you could even do
from foo import *
and have everything as global, but this is really not recommended. Remember: namespaces are one honking great idea!
This is a two-step process:
1. In your module globals.py, import the items from wherever.
2. In all of your other modules, do "from globals import *".
This brings all of those names into the current module's namespace.
Now, having told you how to do this, let me suggest that you don't. First of all, you are loading up the local namespace with a bunch of "magically defined" entities. This violates precept 2 of the Zen of Python, "Explicit is better than implicit." Instead of "from foo import *", try using "import foo" and then saying "foo.some_value". If you want to use the shorter names, use "from foo import mumble, snort". Either of these methods directly exposes the actual use of the module foo.py. Using the globals.py method is just a little too magic. The primary exception to this is in an __init__.py where you are hiding some internal aspects of a package.
Globals are also semi-evil in that it can be very difficult to figure out who is modifying (or corrupting) them. If you have well-defined routines for getting/setting globals, then debugging them can be much simpler.
I know that PHP has this "everything is one, big, happy namespace" concept, but it's really just an artifact of poor language design.
As far as I know, program-wide global variables/functions/classes/etc. do not exist in Python; everything is "confined" in some module (namespace). So if you want some functions or classes to be used in many parts of your code, one solution is to create some modules like "globFunCl" (defining or importing from elsewhere everything you want to be "global") and "config" (containing configuration variables), and to import those everywhere you need them. If you don't like the idea of using nested namespaces you can use:
from globFunCl import *
This way you'll "hide" namespaces (making names look like "globals").
I'm not sure what you mean by not wanting to "put the entire routine at the beginning of each (included) file to include all of the rest"; I'm afraid you can't really escape from this. Check out Python packages though, they should make it easier for you.
This depends a bit on how you want to package things up. You can either think in terms of files or modules. The latter is "more pythonic", and enables you to decide exactly which items (and they can be anything with a name: classes, functions, variables, etc.) you want to make visible.
The basic rule is that for any file or module you import, anything directly in its namespace can be accessed. So if myfile.py contains definitions def myfun(...): and class myclass(...) as well as myvar = ... then you can access them from another file by
import myfile
y = myfile.myfun(...)
x = myfile.myvar
or
from myfile import myfun, myvar, myclass
Crucially, anything at the top level of myfile is accessible, including imports. So if myfile contains from foo import bar, then myfile.bar is also available.