Treatment of `common` memoized Python functions

For a python application including the following modules:
commons.py
from functools import lru_cache

@lru_cache(maxsize=2)
def memoized_f(x):
    ...
pipeline_a.py
from commons import memoized_f
x = memoized_f(10)
y = memoized_f(11)
pipeline_b.py
from commons import memoized_f
x = memoized_f(20)
y = memoized_f(21)
Does Python store one memoized_f cache per pipeline_* module, so that in the example above there would be two caches in total for memoized_f? Or, because the caching is defined on memoized_f itself, is there only one cache for memoized_f in the application containing all the modules above?

functools.lru_cache doesn't do any magic. It is a function decorator, meaning it takes the decorated function (here: memoized_f) as input and returns a new function. Essentially it does memoized_f = lru_cache(maxsize=2)(memoized_f).
So your question boils down to whether a (function in a) module imported by two other modules shares state, to which the answer is yes, there is a common cache. This is because each module is only imported once.
See e.g. the official documentation
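A quick way to verify this is to look at the cache statistics once both pipelines have run. A minimal sketch, assuming the layout above; the body of memoized_f and the main.py glue module are illustrative, not part of the original question:
# commons.py
from functools import lru_cache

@lru_cache(maxsize=2)
def memoized_f(x):
    return x * x  # stand-in for the real computation

# main.py (hypothetical entry point that imports both pipelines)
import pipeline_a
import pipeline_b
from commons import memoized_f

# A single CacheInfo for the whole process: hits, misses and currsize
# accumulate across the calls made in pipeline_a and pipeline_b.
print(memoized_f.cache_info())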

Related

Is it bad practice to modify attributes of one module from another module?

I want to define a bunch of config variables that can be imported in all the modules in my project. The values of those variables will be constant during runtime but are not known before runtime; they depend on the input. Usually I'd define a dict in my top module which would be passed to all functions and classes from other modules; however, I was thinking it may be cleaner to simply create a blank config.py module which would be dynamically filled with config variables by the top module:
# top.py
import config
config.x = x
# config.py
x = None
# other.py
import config
print(config.x)
I like this approach because I don't have to save the parameters as attributes of classes in my other modules, which makes sense to me because the parameters don't describe the classes themselves.
This works but is it considered bad practice?
The question as such may be disputed, but I would generally say yes, it's "bad practice", because the scope and impact of a change get blurred. Note that the use case you're describing is not really about sharing configuration, but about different parts of the program (functions, objects, modules) exchanging data, and as such it's a variation on a (meta) global variable.
Reading common configuration values could be fine, but changing them along the way... you may lose track of what happened where, and in which order, as modules get imported and values get modified. For instance, assume the config.py above and two modules, m1.py:
import config
print(config.x)
config.x=1
and m2.py:
import config
print(config.x)
config.x=2
and a main.py that just does:
import m1
import m2
import config
print(config.x)
or:
import m2
import m1
import config
print(config.x)
The state in which config is found by each module (and really any other, including main.py here) depends on the order in which the imports occurred and on who assigned what value when. Even in a program entirely under your control, this can get confusing (and become a source of mistakes) rather quickly.
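Tracing the two variants through (output shown for illustration):
# main.py with "import m1" first:
None   # printed by m1: config.x still has its initial value
1      # printed by m2: m1 already set config.x = 1
2      # printed by main.py: m2 set config.x = 2

# main.py with "import m2" first:
None
2
1
Same modules, opposite final state, purely because of import order.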
For runtime data and for passing information between objects and modules (and your example really is that, not configuration that is predefined and shared between modules), I would suggest describing the information in a custom state (config) object and passing it around through an appropriate interface. But really, just a function / method argument may be all that is needed. The exact form depends on what exactly you're trying to achieve and what your overall design is.
In your example, other.py behaves differently when run or imported before top.py. That may still seem obvious and manageable in a minimal example, but it really is not a very sound design: anyone reading the code (including future you) should be able to follow its logic, and this IMO breaks its flow.
The most trivial (and procedural) example of what you've described (and what I now hopefully have a better grasp of) would be an other.py recreating your current behavior:
def do_stuff(value):
    print(value)  # We did something useful here

if __name__ == "__main__":
    do_stuff(None)  # Could also use config with defaults
And your top.py, presumably being the entry point and orchestrating the imports and execution, doing:
import other
x = get_the_value()
other.do_stuff(x)
You can of course introduce an interface to configure do_stuff: perhaps a dict, or a custom class, even with a default implementation in config.py:
class Params:
    def __init__(self, x=None):
        self.x = x
and your other.py:
import config

def do_stuff(params=config.Params()):
    print(params.x)  # We did something useful here
And in your top.py you can use:
params = config.Params(get_the_value())
other.do_stuff(params)
But you could also have any use-case-specific source of the value(s):
class TopParams:
    def __init__(self, url):
        self.x = get_value_from_url(url)
params = TopParams("https://example.com/value-source")
other.do_stuff(params)
x could even be a property which you retrieve every time you access it... or lazily when needed and then cached... Again, it really then is a matter of what you need to do.
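For instance, a lazily computed and then cached x could look roughly like this (a sketch assuming Python 3.8+ for functools.cached_property; get_the_value() is the hypothetical expensive lookup from above):
from functools import cached_property

class Params:
    @cached_property
    def x(self):
        # computed on first access, then stored on the instance
        return get_the_value()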
"Is it bad practice to modify attributes of one module from another module?"
Yes, it is considered bad practice: it is a violation of the Law of Demeter, which essentially means "talk to friends, not to strangers".
Objects should expose behaviour and functions, but should HIDE the data.
DataStructures should EXPOSE data, but should not have any (exposed) methods. The Law of Demeter does not apply to such DataStructures. OOP purists might cover such DataStructures with setters and getters, but that really adds no value in Python.
There is a lot of literature about that, for example: https://en.wikipedia.org/wiki/Law_of_Demeter
and of course a must-read: "Clean Code" by Robert C. Martin (Uncle Bob); check it out on YouTube as well.
For procedural programming it is perfectly normal to keep data in a DataStructure which does not have any (exposed) methods.
The procedures in the program work with that data. Consider using the attrs module (see https://www.attrs.org/en/stable/) for easy creation of such classes.
My preferred method for keeping config is (here without using attrs):
# conf_xy.py
"""
Config is code - so why use damned parsers, text files, xml, yaml, toml and all that,
if you can just use testable code as config that delivers the correct types,
as well as hinting in your favorite IDE?
Here, for demonstration, without using the attrs package - usually I use attrs (read the docs).
"""

class ConfXY(object):
    def __init__(self) -> None:
        self.x: int = 1
        self.z: float = get_z_from_input()
        ...

conf_xy = ConfXY()
# other.py
from conf_xy import conf_xy
...
y = conf_xy.x * 2
...
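For comparison, roughly the same config class written with the attrs package mentioned above might look like this (a sketch; the field values are illustrative defaults):
# conf_xy.py (attrs variant)
import attr

@attr.s(auto_attribs=True)
class ConfXY:
    x: int = 1    # illustrative defaults; real values or factories would go here
    z: float = 0.0

conf_xy = ConfXY()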

How to call a procedure inside of another procedure

I'm working on creating a large .py file that can be imported and used to solve mathematical formulas. I'd like to store each formula in a procedure named input1_input2_input3(); for example, the formula distance = speed * time is called dis_spe_tim().
The code so far is:
def dis_spe_tim():
    def distance(speed, time):
        result = speed * time
        unit = input("What unit are you measuring the distance in?")
        print(result, unit)
    def speed():
        print("speed")
and I would ideally like the user to use this like so:
import equations #name of the .py file
from equations import *
dis_spe_tim.distance(1,2)
Unfortunately, this is my first time ever doing something like this so I have absolutely no idea how to go about calling the procedure inside of the procedure and providing its arguments.
Short answer: you can't. Nested functions are local to the function they're defined in and only exist during the outer function's execution (def is an executable statement that, at runtime, creates a function object and binds it to its name in the enclosing namespace).
The canonical python solution is to use modules as namespaces (well, Python modules ARE, mainly, namespaces), ie have a distinct module for each "formula", and define the functions at the module's top-level:
# dis_spe_tim.py

def distance(speed, time):
    # code here

def speed():
    # code here
Then put all those modules in an equations package (mostly: a folder containing modules and an __init__.py file). Then you can do:
from equations import dis_spe_tim
dis_spe_tim.distance(1,2)
You can check the doc for more on modules and packages here: https://docs.python.org/3/tutorial/modules.html#packages
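The resulting layout would look something like this (module names other than dis_spe_tim.py are illustrative):
equations/
    __init__.py
    dis_spe_tim.py
    acc_vel_tim.py  # further formula modules
    ...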
Also note that
1/ "star imports" (also named "wildcard imports"), ie from somemodule import *, are highly discouraged as they tend to make the code harder to read and maintain and can cause unexpected (and sometimes subtles enough to be hard to spot) breakages.
2/ you shouldn't mix "domain" code (code that does the actual computations) with UI code (code that communicates with the user), so any call to input(), print() etc. should be outside the "domain" code. This is key to making your domain code usable with different UIs (command-line, text-based (curses etc.), GUI, web, whatever), but also, quite simply, to making sure your domain code is easily testable in isolation (unit testing...).
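A rough sketch of that separation (the function bodies and the cli.py module are illustrative, not part of the original question):
# equations/dis_spe_tim.py -- domain code only: no input()/print()
def distance(speed, time):
    return speed * time

# cli.py -- the UI layer asks the user and prints the result
from equations import dis_spe_tim

speed = float(input("Speed? "))
time = float(input("Time? "))
unit = input("What unit are you measuring the distance in? ")
print(dis_spe_tim.distance(speed, time), unit)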

What determines the methods listed by python's help()?

I have the following python file, test.py:
from math import floor
from logging import getLogger
When I do the following:
$ python3
>>> import test
>>> help(test)
I see this:
Help on module test:

NAME
    test

FUNCTIONS
    floor(...)
        floor(x)

        Return the floor of x as an Integral.
        This is the largest integer <= x.

FILE
    ...
Why is the floor method documented in the help text, but getLogger is not? More broadly, what determines which methods are listed in a python module's help text?
For modules, you can take a look at the docmodule method in the pydoc module, which generates this help text.
In a nutshell, built-in functions (like floor) are listed (see the isbuiltin call in the condition), while functions that don't belong to the module you've called help on don't get listed (that's what the inspect.getmodule(value) is object check takes care of). I'm not certain why this decision was made.
You can override this by defining an __all__ variable that contains the names of the functions/variables/classes to be visible.
Apart from these, there are also some special names that don't get picked up (e.g. names starting with _). You can see how this is handled by looking at the visiblename function, which is called for most of the names in your module.
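So, for the test.py above, defining __all__ should make both names show up (a sketch; the behaviour follows from the pydoc source referenced above):
# test.py
from math import floor
from logging import getLogger

# With __all__ defined, help(test) lists exactly these names,
# including getLogger even though it is defined in another module.
__all__ = ["floor", "getLogger"]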

Access objects from another module

I'm a very inexperienced programmer creating a game (using Python 3.3) as a learning exercise. I currently have a main module and a combat module.
The people in the game are represented by instances of class "Person", and are created in the main module. However, the combat module obviously needs access to those objects. Furthermore, I'm probably going to create more modules later that will also need access to those objects.
How do I allow other modules to access the Persons from main.py?
As things stand, main.py has
import combat
at the top; adding
import main
to combat.py doesn't seem to help.
Should I instantiate my objects in a separate module (common.py?) and import them to every module that needs to access them?
Yes, you should factor this out. What you tried is circular imports between your modules, and that typically causes more problems than it solves. If combat imports main and main imports combat, then you may get an error because some object definitions will be missing from main when you try to import them. This is because main will not have finished executing when combat starts executing for the import. Assuming main is your start up script, it should do nothing more than start the program by calling a method from another module; it may instantiate an object if the desired method is an instance method on a class. Avoid global variables, too. Even if it doesn't seem like they'll be a problem now, that can bite you later on.
That said, you can reference members of a module like so:
import common
x = common.some_method_in_common()
y = common.SomeClass()
or
from common import SomeClass
y = SomeClass()
Personally, I generally avoid referencing a method from another module without qualifying it with the module name, but this is also legal:
from common import some_method_in_common
x = some_method_in_common()
I typically use from ... import ... for classes, and I typically use the first form for methods. (Yes, this sometimes means I have specific class imports from a module in addition to importing the module itself.) But this is only my personal convention.
An alternate syntax, the use of which is strongly discouraged, is
from common import *
y = SomeClass()
This will import every member of common into the current scope that does not start with an underscore (_). The reason it's discouraged is because it makes identifying the source of the name harder and it makes breaking things too easy. Consider this pair of imports:
from common import *
from some_other_module import *
y = SomeClass()
Which module does SomeClass come from? There's no way to tell other than to go look at the two modules. Worse, what if both modules define SomeClass or SomeClass is later added to some_other_module?
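Putting the advice above together for the game example, a minimal factored-out layout might look like this (everything beyond the Person class is illustrative):
# common.py -- shared domain objects
class Person:
    def __init__(self, name, hp=10):
        self.name = name
        self.hp = hp

# combat.py -- imports common if it needs Person, but never main
import common

def attack(attacker, defender):
    defender.hp -= 1  # placeholder combat logic

# main.py -- thin start-up script
import common
import combat

hero = common.Person("Hero")
orc = common.Person("Orc")
combat.attack(hero, orc)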
If you have imported the main module in the combat module using import main, then you should use main.<name> (for things implemented in the main module) to access its classes and methods.
example:
import main
person = main.Person()
You can also use from main import * or from main import Person to avoid the main. prefix in the previous example.
There are some rules for importing modules as described in http://effbot.org/zone/import-confusion.htm :
import X imports the module X, and creates a reference to that module in the current namespace. Or in other words, after you've run this statement, you can use X.name to refer to things defined in module X.

from X import * imports the module X, and creates references in the current namespace to all public objects defined by that module (that is, everything that doesn't have a name starting with "_"). Or in other words, after you've run this statement, you can simply use a plain name to refer to things defined in module X. But X itself is not defined, so X.name doesn't work. And if name was already defined, it is replaced by the new version. And if name in X is changed to point to some other object, your module won't notice.

from X import a, b, c imports the module X, and creates references in the current namespace to the given objects. Or in other words, you can now use a and b and c in your program.

Finally, X = __import__('X') works like import X, with the difference that you 1) pass the module name as a string, and 2) explicitly assign it to a variable in your current namespace.
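For example, import X and __import__ end up binding the very same module object (illustrative):
import math
math2 = __import__("math")
print(math is math2)  # True: both names refer to the same module object in sys.modules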

How does/should global data in modules across packages be managed in Python/other languages?

I am trying to design the package and module system for a programming language (Heron) which can be both compiled and interpreted, and from what I have seen I really like the Python approach. Python has a rich choice of modules, which seems to contribute largely to its success.
What I don't know is what happens in Python if a module is included in two different compiled packages: are there separate copies made of the data, or is it shared?
Related to this are a bunch of side-questions:
Am I right in assuming that packages can be compiled in Python?
What are the pros and cons of the two approaches (copying or sharing of module data)?
Are there widely known problems with the Python module system, from the point of view of the Python community? For example, is there a PEP under consideration for enhancing modules/packages?
Are there certain aspects of the Python module/package system which wouldn't work well for a compiled language?
Well, you asked a lot of questions. Here are some hints to get a bit further:
a. Python code is lexed and compiled into Python specific instructions, but not compiled to machine executable code. The ".pyc" file is automatically created whenever you run python code that does not match the existing .pyc timestamp. This feature can be turned off. You might play with the dis module to see these instructions.
b. When a module is imported, it is executed (top to bottom) in its own namespace, and that namespace is cached globally. When you import from another module, the module is not executed again. Remember that def is just a statement. You may want to put a print('compiling this module') statement in your code to trace it.
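Regarding the dis module mentioned in point a., for example:
import dis

def add(a, b):
    return a + b

# Prints the interpreter instructions for add, e.g. LOAD_FAST followed by a
# binary-add opcode (exact opcode names vary between Python versions).
dis.dis(add)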
It depends.
There were recent enhancements, mostly around specifying which module needs to be loaded. Modules can have relative paths, so a huge project might have multiple modules with the same name.
Python itself won't work for a compiled language. Google for "unladen swallow blog" to see the tribulations of trying to speed up a language where "a = sum(b)" can change meanings between executions. Outside of corner cases, the module system forms a nice bridge between source code and a compiled library system. The approach works well, and Python's easy wrapping of C code (swig, etc.) helps.
Modules are the only truly global objects in Python, with all other global data based around the module system (which uses sys.modules as a registry). Packages are simply modules with special semantics for importing submodules. "Compiling" a .py file into a .pyc or .pyo isn't compilation as understood for most languages: it only checks the syntax and creates a code object which, when executed in the interpreter, creates the module object.
example.py:
print "Creating %s module." % __name__
def show_def(f):
print "Creating function %s.%s." % (__name__, f.__name__)
return f
#show_def
def a():
print "called: %s.a" % __name__
Interactive session:
>>> import example
# first sys.modules['example'] is checked
# since it doesn't exist, example.py is found and "compiled" to example.pyc
# (since example.pyc doesn't exist, same would happen if it was outdated, etc.)
Creating example module. # module code is executed
Creating function example.a. # def statement executed
>>> example.a()
called: example.a
>>> import example
# sys.modules['example'] found, local variable example assigned to that object
# no 'Creating ..' output
>>> d = {"__name__": "fake"}
>>> exec open("example.py") in d
# the first import in this session is very similar to this
# in that it creates a module object (which has a __dict__), initializes a few
# variables in it (__builtins__, __name__, and others---packages' __init__
# modules have their own as well---look at some_module.__dict__.keys() or
# dir(some_module))
# and executes the code from example.py in this dict (or the code object stored
# in example.pyc, etc.)
Creating fake module. # module code is executed
Creating function fake.a. # def statement executed
>>> d.keys()
['__builtins__', '__name__', 'a', 'show_def']
>>> d['a']()
called: fake.a
Your questions:
They are compiled, in a sense, but not as you would expect if you're familiar with how C compilers work.
If the data is immutable, copying is feasible, and should be indistinguishable from sharing except for object identity (is operator and id() in Python).
Imports may or may not execute code (they always assign a local variable to an object, but that poses no problems) and may or may not modify sys.modules. You must be careful to not import in threads, and generally it is best to do all imports at the top of every module: this leads to a cascading graph so all the imports are done at once and then __main__ continues and does the Real Work™.
I don't know of any current PEP, but there's already a lot of complex machinery in place, too. For example packages can have a __path__ attribute (really a list of paths) so submodules don't have to be in the same directory, and these paths can even be computed at runtime! (Example mungepath package below.) You can have your own import hooks, use import statements inside functions, directly call __import__, and I wouldn't be surprised to find 2-3 other unique ways to work with packages and modules.
A subset of the import system would work in a traditionally-compiled language, as long as it was similar to something like C's #include. You could run the "first level" of execution (creating the module objects) in the compiler, and compile those results. There are significant drawbacks to this, however, and it amounts to separate execution contexts for module-level code and functions executed at runtime (and some functions would have to run in both contexts!). (Remember that in Python every statement is executed at runtime, even def and class statements.)
I believe this is the main reason traditionally-compiled languages restrict "top-level" code to class, function, and object declarations, eliminating this second context. Even then, you have initialization problems for global objects in C/C++ (and others), unless managed carefully.
mungepath/__init__.py:
print __path__
__path__.append(".") # CWD, would be different in non-example code
print __path__
from . import example # this is example.py from above, and is NOT in mungepath/
# note that this is a degenerate case, in that we now have two names for the
# 'same' module: example and mungepath.example, but they're really different
# modules with different functions (use 'is' or 'id()' to verify)
Interactive session:
>>> import example
Creating example module.
Creating function example.a.
>>> example.__dict__.keys()
['a', '__builtins__', '__file__', 'show_def', '__package__',
'__name__', '__doc__']
>>> import mungepath
['mungepath']
['mungepath', '.']
Creating mungepath.example module.
Creating function mungepath.example.a.
>>> mungepath.example.a()
called: mungepath.example.a
>>> example is mungepath.example
False
>>> example.a is mungepath.example.a
False
Global data is scoped at the interpreter level.
"packages" can be compiled as a package is just a collection of modules which themselves can be compiled.
I am not sure I understand this one, given the established scoping of data.
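Regarding the compilation point above, the standard library's compileall module can byte-compile a whole package ahead of time, for example (package name illustrative):
python -m compileall mypackage/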
