I'm working on my first significant Python project and I'm having trouble with scope issues and executing code in included files. Previously my experience is with PHP.
What I would like to do is have one single file that sets up a number of configuration variables, which would then be used throughout the code. Also, I want to make certain functions and classes available globally. For example, the main file would include a single other file, and that file would load a bunch of commonly used functions (each in its own file) and a configuration file. Within those loaded files, I also want to be able to access the functions and configuration variables. What I don't want to do, is to have to put the entire routine at the beginning of each (included) file to include all of the rest. Also, these included files are in various sub-directories, which is making it much harder to import them (especially if I have to re-import in every single file).
Anyway I'm looking for general advice on the best way to structure the code to achieve what I want.
Thanks!
In python, it is a common practice to have a bunch of modules that implement various functions and then have one single module that is the point-of-access to all the functions. This is basically the facade pattern.
An example: say you're writing a package foo, which includes the bar, baz, and moo modules.
~/project/foo
~/project/foo/__init__.py
~/project/foo/bar.py
~/project/foo/baz.py
~/project/foo/moo.py
~/project/foo/config.py
What you would usually do is write __init__.py like this:
from foo.bar import func1, func2
from foo.baz import func3, constant1
from foo.moo import func1 as moofunc1
from foo.config import *
Now, when you want to use the functions you just do
import foo
foo.func1()
print foo.constant1
# assuming config defines a config1 variable
print foo.config1
If you wanted, you could arrange your code so that you only need to write
import foo
At the top of every module, and then access everything through foo (which you should probably name "globals" or something to that effect). If you don't like namespaces, you could even do
from foo import *
and have everything as global, but this is really not recommended. Remember: namespaces are one honking great idea!
This is a two-step process:
In your module globals.py import the items from wherever.
In all of your other modules, do "from globals import *"
This brings all of those names into the current module's namespace.
Now, having told you how to do this, let me suggest that you don't. First of all, you are loading up the local namespace with a bunch of "magically defined" entities. This violates precept 2 of the Zen of Python, "Explicit is better than implicit." Instead of "from foo import *", try using "import foo" and then saying "foo.some_value". If you want to use the shorter names, use "from foo import mumble, snort". Either of these methods directly exposes the actual use of the module foo.py. Using the globals.py method is just a little too magic. The primary exception to this is in an __init__.py where you are hiding some internal aspects of a package.
Globals are also semi-evil in that it can be very difficult to figure out who is modifying (or corrupting) them. If you have well-defined routines for getting/setting globals, then debugging them can be much simpler.
I know that PHP has this "everything is one, big, happy namespace" concept, but it's really just an artifact of poor language design.
As far as I know program-wide global variables/functions/classes/etc. does not exist in Python, everything is "confined" in some module (namespace). So if you want some functions or classes to be used in many parts of your code one solution is creating some modules like: "globFunCl" (defining/importing from elsewhere everything you want to be "global") and "config" (containing configuration variables) and importing those everywhere you need them. If you don't like idea of using nested namespaces you can use:
from globFunCl import *
This way you'll "hide" namespaces (making names look like "globals").
I'm not sure what you mean by not wanting to "put the entire routine at the beginning of each (included) file to include all of the rest", I'm afraid you can't really escape from this. Check out the Python Packages though, they should make it easier for you.
This depends a bit on how you want to package things up. You can either think in terms of files or modules. The latter is "more pythonic", and enables you to decide exactly which items (and they can be anything with a name: classes, functions, variables, etc.) you want to make visible.
The basic rule is that for any file or module you import, anything directly in its namespace can be accessed. So if myfile.py contains definitions def myfun(...): and class myclass(...) as well as myvar = ... then you can access them from another file by
import myfile
y = myfile.myfun(...)
x = myfile.myvar
or
from myfile import myfun, myvar, myclass
Crucially, anything at the top level of myfile is accessible, including imports. So if myfile contains from foo import bar, then myfile.bar is also available.
Related
I am building a python library. The functions I want available for users are in stemmer.py. Stemmer.py uses stemmerutil.py
I was wondering whether there is a way to make stemmerutil.py not accessible to users.
If you want to hide implementation details from your users, there are two routes that you can go. The first uses conventions to signal what is and isn't part of the public API, and the other is a hack.
The convention for declaring an API within a python library is to add all classes/functions/names that should be exposed into an __all__-list of the topmost __init__.py. It doesn't do that many useful things, its main purpose nowadays is a symbolic "please use this and nothing else". Yours would probably look somewhat like this:
urdu/urdu/__init__.py
from urdu.stemmer import Foo, Bar, Baz
__all__ = [Foo, Bar, Baz]
To emphasize the point, you can also give all definitions within stemmerUtil.py an underscore before their name, e.g. def privateFunc(): ... becomes def _privateFunc(): ...
But you can also just hide the code from the interpreter by making it a resource instead of a module within the package and loading it dynamically. This is a hack, and probably a bad idea, but it is technically possible.
First, you rename stemmerUtil.py to just stemmerUtil - now it is no longer a python module and can't be imported with the import keyword. Next, update this line in stemmer.py
import stemmerUtil
with
import importlib.util
import importlib.resources
# in python3.7 and lower, this is importlib_resources and needs to be installed first
stemmer_util_spec = importlib.util.spec_from_loader("stemmerUtil", loader=None)
stemmerUtil = importlib.util.module_from_spec(stemmer_util_spec)
with importlib.resources.path("urdu", "stemmerUtil") as stemmer_util_path:
with open(stemmer_util_path) as stemmer_util_file:
stemmer_util_code = stemmer_util_file.read()
exec(stemmer_util_code, stemmerUtil.__dict__)
After running this code, you can use the stemmerUtil module as if you had imported it, but it is invisible to anyone who installed your package - unless they run this exact code as well.
But as I said, if you just want to communicate to your users which part of your package is the public API, the first solution is vastly preferable.
I come from a PHP (as well as a bunch of other stuff) background and I am playing around with Python. In PHP when I want to include another file I just do include or require and everything in that file is included.
But it seems the recommended way to do stuff in python is from file import but that seems to be more for including libraries and stuff? How do you separate your code amongst several files? Is the only way to do it, to have a single file with a whole bunch of function calls and then import 15 other files?
Things are totally different between PHP and Python, and there are many reasons why.
But it seems the recommended way to do stuff in python is from file import but that seems to be more for including libraries and stuff?
Indeed, import statements are for importing objects from another module to current module. You can either import all the objects of the imported module to current module:
import foo
print foo.bar
or you can select what you want from that module:
from foo import bar
print bar
and even better, if you import a module twice, it will be only imported once:
>> import foo as foo1
>> import foo as foo2
>> foo1 is foo2
True
How do you separate your code amongst several files?
You have to think about your code... That's called software design, and here are a few rules:
you never write an algorithm at the module's level; instead make it a function, and call that function
you never instantiate an object at the module's level; you shall embed it in the function, and call that function
if you need an object in several different functions, create a class and encapsulate that object in that class, then use it in your functions bound to that class (so they now are called methods)
The only exception is when you want to launch a program from command line, you append:
if __name__ == "__main__":
at the end of the module. And my best advice would be to just call your first function afterwards:
if __name__ == "__main__":
main()
Is the only way to do it, to have a single file with a whole bunch of function calls and then import 15 other files?
It's not the only way to do it, but it's the best way to do it. You make all your algorithms into libraries of functions and objects, and then import exactly what you need in other libraries etc.. That's how you create a whole universe of reusable code and never have to reinvent the wheel! So forget about files, and think about modules that contains objects.
Finally, my best advice to you learning python is to unlearn every habit and usage you had while coding PHP, and learn those things again, differently. In the end, that can only make you a better software engineer.
I guess I understand what you are tring to say and to do.
Here is the random include example from PHP:
File #1 - vars.php
<?php
$color = 'green';
$fruit = 'apple';
?>
File #2 - main.php
<?php
echo "A $color $fruit"; // A
include 'vars.php';
echo "A $color $fruit"; // A green apple
?>
The fist echo command will print just "A" string, for it does not have any values assigned to the vars. The next echo will print a full string thanks to your include before it.
Python's "import", however, imports a module or it's part, so you could work with it in your current module.
Here is a python example:
File 1 - echo.py
apple = 'apple'
color = 'green'
File 2 - main.py
import echo
def func():
print "A "+echo.color+" "+echo.fruit
if __name__ == '__main__':
func()
In other words - you import some functionality from one module and then use it in your other module.
The example above is not really good from programming standarts or best practises, but I think it gives you some understanding.
Interesting question. As you know, in PHP, you can separate your code by using include, which literally takes all the code in the included file and puts it wherever you called include. This is convenient for writing web applications because you can easily divide a page into parts (such as header, navigation, footer, etc).
Python, on the other hand, is used for way more than just web applications. To reuse code, you must rely on functions or good old object-oriented programming. PHP also has functions and object-oriented programming FYI.
You write functions and classes in a file and import it in another file. This lets you access the functions or use the classes you defined in the other file.
Lets say you have a function called foo in file file1.py. From file2.py, you can write
import file1. Then, call foo with file1.foo(). Alternatively, write from file1 import foo and then you can call foo with foo(). Note that the from lets you call foo directly. For more info, look at the python docs.
On a technical level, a Python import is very similar to a PHP require, as it will execute the imported file. But since Python isn't designed to ultimately generate an HTML file, the way you use it is very different.
Typically a Python file will on the module level not include much executable code at all, but definitions of functions and classes. You them import them and use them as a library.
Hence having things like header() and footer() makes no sense in Python. Those are just functions. Call them like that, and the result they generate will be ignored.
So how do you split up your Python code? Well, you split it up into functions and classes, which you put into different files, and then import.
There's an execfile() function which does something vaguely comparable with PHP's include, here, but it's almost certainly something you don't want to do. As others have said, it's just a different model and a different programming need in Python. Your code is going to go from function to function, and it doesn't really make a difference in which order you put them, as long as they're in an order where you define things before you use them. You're just not trying to end up with some kind of ordered document like you typically are with PHP, so the need isn't there.
One way is to use import x, without using "from" keyword. So then you refer to things with their namespace everywhere.
Is there any other way? like doing something like in C++ ifnotdef __b__ def __b__ type of thing?
Merge any pair of modules that depend on each other into a single module. Then introduce extra modules to get the old names back.
E.g.,
# a.py
from b import B
class A: whatever
# b.py
from a import A
class B: whatever
becomes
# common.py
class A: whatever
class B: whatever
# a.py
from common import A
# b.py
from common import B
Circular imports are a "code smell," and often (but not always) indicate that some refactoring would be appropriate. E.g., if A.x uses B.y and B.y uses A.z, then you might consider moving A.z into its own module.
If you do think you need circular imports, then I'd generally recommend importing the module and referring to objects with fully qualified names (i.e, import A and use A.x rather than from A import x).
If you're trying to do from A import *, the answer is very simple: Don't do that. You're usually supposed to do import A and refer to the qualified names.
For quick&dirty scripts, and interactive sessions, that's a perfectly reasonable thing to do—but in such cases, you won't run into circular imports.
There are some cases where it makes sense to do import * in real code. For example, if you want to hide a module structure that's complex, or that you generate dynamically, or that changes frequently between versions, or if you're wrapping up someone else's package that's too deeply nested, import * may make sense from a "wrapper module" or a top-level package module. But in that case, nothing you import will be importing you.
In fact, I'm having a hard time imagining any case where import * is warranted and circular dependencies are even a possibility.
If you're doing from A import foo, there are ways around that (e.g., import A then foo = A.foo). But you probably don't want to do that. Again, consider whether you really need to bring foo into your namespace—qualified names are a feature, not a problem to be worked around.
If you're doing the from A import foo just for convenience in implementing your functions, because A is actually long_package_name.really_long_module_name and your code is unreadable because of all those calls to long_package_name.really_long_module_name.long_class_name.class_method_that_puts_me_over_80_characters, remember that you can always import long_package_name.really_long_module_name as P and then use P for you qualified calls.
(Also, remember that with any from done for implementation convenience, you probably want to make sure to specify a __all__ to make sure the imported names don't appear to be part of your namespace if someone does an import * on you from an interactive session.)
Also, as others have pointed out, most, but not all, cases of circular dependencies, are a symptom of bad design, and refactoring your modules in a sensible way will fix it. And in the rare cases where you really do need to bring the names into your namespace, and a circular set of modules is actually the best design, some artificial refactoring may still be a better choice than foo = A.foo.
a.py:
import b
import c
...
import z
class Foo(object):
...
Each of thoses module B-Z needs to use class foo.
Is some way, like importing, which allows indirect access (e.g. via an object) to all values of all modules A-Z, while still allowing each module B-Z access to A's namespace (e.g. foo).
No. They must each in turn import A themselves.
I still cannot tell what you are trying to do or even asking, but this is my best guess:
Normally, just use classic imports.
IF a module is growing too large, or if you have an extremely good reason to split things up but desire to share the same namespace, you can "hoist" values into a dummy namespace. For example if I had widget.Foo and widget.Bar and wanted them in different files, but I wanted to be able to type Foo and Bar in each file, I would normally have to from widget import Foo and from widget import Bar. If you have MANY of these files (foo.py,bar.py,baz.py,...,zeta.py) it can get a bit unwieldy. Thus you can improve your situation by importing them only once, in widget/__init__.py, and then going from foo import *, from bar import *, ... in each folder just once, and going from widget import * only once in each module. And you're done!... well... almost...
This gets you into a circular import scenario, which you have to be extremely careful of: Circular (or cyclic) imports in Python It will be fine for example if you reference Bar in a function in foo.py, everything is fine because you don't immediately use the value. However if you do x = Bar in foo.py then the value may not have been defined yet!
sidenote: You can programatically import using the __import__ function. If you couple this with os.walk then you can avoid having to type from ... import * for each file in your widget folder. This is a critical and necessary step to avoid bugs down the line.
As I understand it python has the following outermost namespaces:
Builtin - This namespace is global across the entire interpreter and all scripts running within an interpreter instance.
Globals - This namespace is global across a module, ie across a single file.
I am looking for a namespace in between these two, where I can share a few variables declared within the main script to modules called by it.
For example, script.py:
import Log from Log
import foo from foo
log = Log()
foo()
foo.py:
def foo():
log.Log('test') # I want this to refer to the callers log object
I want to be able to call script.py multiple times and in each case, expose the module level log object to the foo method.
Any ideas if this is possible?
It won't be too painful to pass down the log object, but I am working with a large chunk of code that has been ported from Javascript. I also understand that this places constraints on the caller of foo to expose its log object.
Thanks,
Paul
There is no namespace "between" builtins and globals -- but you can easily create your own namespaces and insert them with a name in sys.modules, so any other module can "import" them (ideally not using the from ... import syntax, which carries a load of problems, and definitely not using tghe import ... from syntax you've invented, which just gives a syntax error). For example, in script.py:
import sys
import types
sys.modules['yay'] = types.ModuleType('yay')
import Log
import foo
yay.log = Log.Log()
foo.foo()
and in foo.py
import yay
def foo():
yay.log.Log('test')
Do not fear qualified names -- they're goodness! Or as the last line of the Zen of Python (AKA import this) puts it:
Namespaces are one honking great idea -- let's do more of those!
You can make and use "more of those" most simply -- just qualify your names (situating them in the proper namespace they belong in!) rather than insisting on barenames where they're just not a good fit. There's a bazillion things that are quite easy with qualified names and anywhere between seriously problematic and well-nigh unfeasible for those who're stuck on barenames!-)
There is no such scope. You will need to either add to the builtins scope, or pass the relevant object.
Actually, I did figure out what I was looking for.
This hack is actually used PLY and that is where is stumbled across.
The library code can raise a runtime exception, which then gives access to the callers stack.