I want to modify some classes in the standard library to use a different set of globals than the ones that other classes in that module use.
Example
The following is for illustration only:
# module_a.py
my_global = []

class A:
    def __init__(self):
        my_global.append(self)

class B:
    def __init__(self):
        my_global.append(self)
In this example, if I create an instance of A via A(), it will call append on the object named by my_global. But now I wish to create a new module, import B into it, and have B use my_global from the module it's been imported into, instead of the my_global from the module where B was originally defined.
# module_b.py
from module_a import B
my_global = []
Related
I'm struggling to explain my problem; here is my previous attempt, which in fact asked something completely different:
Clone a module and make changes to the copy
Update0
The example above is only for illustration of what I'm trying to achieve.
Since classes have no variable scope of their own (unlike, say, C++), I think a reference to a globals mapping is not stored in a class, but is instead attached to each function when it is defined.
Update1
An example was requested from the standard library:
Many (maybe all?) of the classes in the threading module make use of globals such as _allocate_lock, get_ident, and _active, defined here and here. One cannot change these globals without changing them for all the classes in that module.
You can't change the globals without affecting all other users of the module, but what you sort of can do is create a private copy of the whole module.
I trust you are familiar with sys.modules, and that if you remove a module from there, Python forgets it was imported, but old objects referencing it will continue to do so. When imported again, a new copy of the module will be made.
A hacky solution to your problem would be something like this:
import sys
import threading
# Remove the original module, but keep it around
main_threading = sys.modules.pop('threading')
# Get a private copy of the module
import threading as private_threading
# Cover up evidence by restoring the original
sys.modules['threading'] = main_threading
# Modify the private copy
private_threading._allocate_lock = my_allocate_lock()
And now, private_threading.Lock has globals entirely separate from threading.Lock!
Needless to say, the module wasn't written with this in mind, and especially with a system module such as threading you might run into problems. For example, threading._active is supposed to contain all running threads, but with this solution, neither _active will have them all. The code may also eat your socks and set your house on fire, etc. Test rigorously.
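For what it's worth, a minimal sanity check of the separation, reusing the names from the snippet above:
print(threading.Thread is private_threading.Thread)    # False -- two distinct classes
print(threading.Thread.run.__globals__
      is private_threading.Thread.run.__globals__)     # False -- separate globals mappings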
Okay, here's a proof-of-concept that shows how to do it. Note that it only goes one level deep -- properties and nested functions are not adjusted. To implement that, as well as make this more robust, each function's globals() should be compared to the globals() that should be replaced, and only make the substitution if they are the same.
from types import FunctionType

def migrate_class(cls, globals):
    """Recreates a class, substituting the passed-in globals for the
    globals already attached to the existing class. This proof-of-concept
    version only goes one level deep (i.e. properties and other nested
    functions are not changed)."""
    cls_name = cls.__name__
    bases = cls.__bases__
    new_dict = dict()
    if hasattr(cls, '__slots__'):
        new_dict['__slots__'] = cls.__slots__
        for slot_name in cls.__slots__:
            if hasattr(cls, slot_name):
                attr = getattr(cls, slot_name)
                if callable(attr):
                    defaults = attr.__defaults__
                    closure = attr.__closure__
                    func_code = attr.__code__
                    attr = FunctionType(func_code, globals, slot_name, defaults, closure)
                new_dict[slot_name] = attr
    if hasattr(cls, '__dict__'):
        od = getattr(cls, '__dict__')
        for attr_name, attr in od.items():
            if callable(attr):
                closure = attr.__closure__
                defaults = attr.__defaults__
                kwdefaults = attr.__kwdefaults__
                func_code = attr.__code__
                attr = FunctionType(func_code, globals, attr_name, defaults, closure)
                if kwdefaults:
                    attr.__kwdefaults__ = kwdefaults
            new_dict[attr_name] = attr
    return type(cls_name, bases, new_dict)
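As a purely illustrative sketch of how this could be used against the module_a example from the question (assuming migrate_class is importable here):
# module_b.py -- sketch only
import module_a

my_global = []
B = migrate_class(module_a.B, globals())

b = B()
print(b in my_global)           # True  -- appended to this module's list
print(b in module_a.my_global)  # False -- module_a's list is untouched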
After having gone through this exercise, I am really curious as to why you need to do this?
"One cannot change these globals without changing it for all the classes in that module." That's the root of the problem isn't it, and a good explanation of the problem with global variables in general. The use of globals in threading tethers its classes to those global objects.
By the time you jerry-rig something to find and monkey patch each use of a global variable within an individual class from the module, are you any further ahead of just reimplementing the code for your own use?
The only work around that "might" be of use in your situation is something like mock. Mock's patch decorators/context managers (or something similar) could be used to swap out a global variable for the life-time of a given object. It works well within the very controlled context of unit testing, but in any other circumstances I wouldn't recommend it and would think about just reimplementing the code to suit my needs.
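For illustration, here is a minimal sketch of that mock-based idea against the module_a example from the question; it is fine inside tests but, as said above, questionable anywhere else:
from unittest import mock
import module_a

other_registry = []
with mock.patch.object(module_a, 'my_global', other_registry):
    b = module_a.B()              # appends to other_registry while the patch is active
print(b in other_registry)        # True
print(b in module_a.my_global)    # False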
Globals are bad for exactly this reason, as I am sure you know well enough.
I'd try to reimplement A and B (maybe by subclassing them) in my own module, with all references to my_global replaced by a dependency injected into A and B, which I'll call registry here.
class A(orig.A):
    def __init__(self, registry):
        self.registry = registry
        self.registry.append(self)
    # more updated methods
If you are creating all instances of A yourself you are pretty much done. You might want to create a factory which hides away the new init parameter.
my_registry = []

def A_in_my_registry():
    return A(my_registry)
If foreign code creates orig.A instances for you, and you would rather have new A instances, you have to hope the foreign code is customizable with factories. If not, derive from the foreign classes and update them to use (newly injected) A factories instead. ... And rinse and repeat for the creation of those updated classes. I realize this can be tedious to almost impossible depending on the complexity of the foreign code, but most std libs are quite flat.
--
Edit: Monkey patch std lib code.
If you don't mind monkey patching std libs, you could also try to modify the original classes to work with a redirection level which defaults to the original globals, but is customizable per instance:
import orig

class A(orig.A):
    def __init__(self, registry=orig.my_globals):
        self.registry = registry
        self.registry.append(self)
    # more updated methods

orig.A = A
As before, you will need to control the creation of A instances which should use non-"standard" globals, but you won't have different A classes around as long as you monkey patch early enough.
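Once that patch module has been imported, callers can opt in to a different registry per instance; a sketch reusing the names above:
import orig

custom_registry = []
a1 = orig.A()                           # registered in orig.my_globals, as before
a2 = orig.A(registry=custom_registry)   # registered in the custom list instead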
If you use Python 3, you can subclass B and rebuild its __init__ method with a different __globals__ mapping, like this:
from module_a import B

function = type(lambda: 0)  # similar to 'from types import FunctionType as function', but faster

my_global = []

class My_B(B):
    __init__ = function(B.__init__.__code__, globals(), '__init__',
                        B.__init__.__defaults__, B.__init__.__closure__)
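A quick check of the effect, assuming the module_a layout from the question:
import module_a

obj = My_B()
print(obj in my_global)           # True  -- appended to this module's list
print(obj in module_a.my_global)  # False -- module_a's list is untouched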
IMHO it is not possible to override global variables...
Globals are rarely a good idea.
Implicit variables are rarely a good idea.
An implicitly-used global is easy to indict as also "rarely good".
Additionally, you don't want A.__init__() doing anything "class-level" like updating some mysterious collection that exists for the class as a whole. That's often a bad idea.
Rather than mess with an implicit class-level collection, you want a Factory in module_a that (1) creates A or B instances and (2) updates an explicit collection.
You can then use this factory in module_b, except with a different collection.
This can promote testability by exposing an implicit dependency.
module_a.py
class Factory(object):
    def __init__(self, collection):
        self.collection = collection
    def make(self, name, *args, **kw):
        obj = eval(name)(*args, **kw)
        self.collection.append(obj)
        return obj

module_collection = []
factory = Factory(module_collection)
module_b.py
import module_a

module_collection = []
factory = module_a.Factory(module_collection)
Now a client can do this
import module_b
a = module_b.factory.make( "A" )
b = module_b.factory.make( "B" )
print( module_b.module_collection )
You can make the API a bit more fluent by making the factory "callable" (implementing __call__ instead of make).
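For instance, a sketch of that callable variant (here the class itself is passed instead of its name, which also avoids the eval; it assumes A from module_a is in scope):
class Factory(object):
    def __init__(self, collection):
        self.collection = collection
    def __call__(self, cls, *args, **kw):
        obj = cls(*args, **kw)
        self.collection.append(obj)
        return obj

module_collection = []
factory = Factory(module_collection)
a = factory(A)    # instead of factory.make("A")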
The point is to make the collection explicit via a factory class.
Related
I am coding a small Python module composed of two parts:
some functions defining a public interface,
an implementation class used by the above functions, but which is not meaningful outside the module.
At first, I decided to "hide" this implementation class by defining it inside the function using it, but this hampers readability and cannot be used if multiple functions reuse the same class.
So, in addition to comments and docstrings, is there a mechanism to mark a class as "private" or "internal"? I am aware of the underscore mechanism, but as I understand it, it only applies to variable, function and method names.
Use a single underscore prefix:
class _Internal:
...
This is the official Python convention for 'internal' symbols; "from module import *" does not import underscore-prefixed objects.
Reference to the single underscore convention.
In short:
You cannot enforce privacy. There are no private classes/methods/functions in Python. At least, not strict privacy as in other languages, such as Java.
You can only indicate/suggest privacy. This follows a convention. The Python convention for marking a class/function/method as private is to preface it with an _ (underscore). For example, def _myfunc() or class _MyClass:. You can also create pseudo-privacy by prefacing the method with two underscores (for example, __foo). You cannot access the method directly, but you can still call it through a special prefix using the classname (for example, _classname__foo). So the best you can do is indicate/suggest privacy, not enforce it.
Python is like Perl in this respect. To paraphrase a famous line about privacy from the Perl book, the philosophy is that you should stay out of the living room because you weren't invited, not because it is defended with a shotgun.
For more information:
Private variables Python Documentation
Why are Python’s ‘private’ methods not actually private? Stack Overflow question 70528
Define __all__, a list of names that you want to be exported (see documentation).
__all__ = ['public_class'] # don't add here the 'implementation_class'
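For example, a minimal module using it might look like this (names taken from the comment above):
# mymodule.py
__all__ = ['public_class']    # "from mymodule import *" exports only this name

class public_class:
    ...

class implementation_class:   # still importable explicitly, but not via *
    ...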
A pattern that I sometimes use is this:
Define a class:
class x(object):
    def doThis(self):
        ...
    def doThat(self):
        ...
Create an instance of the class, overwriting the class name:
x = x()
Define symbols that expose the functionality:
doThis = x.doThis
doThat = x.doThat
Delete the instance itself:
del x
Now you have a module that only exposes your public functions.
The convention is to prepend "_" to internal classes, functions, and variables.
To address the issue of design conventions, and as chroder said, there's really no such thing as "private" in Python. This may sound twisted for someone coming from a C/C++ background (like me a while back), but eventually you'll probably realize that following conventions is plenty.
Seeing something with an underscore in front should be a good enough hint not to use it directly. If you're concerned with cluttering help(MyClass) output (which is what everyone looks at when searching for how to use a class), the underscored attributes/classes are not included there, so you'll end up with just your "public" interface described.
Plus, having everything public has its own awesome perks, like for instance, you can unit test pretty much anything from outside (which you can't really do with C/C++ private constructs).
Use two underscores to prefix names of "private" identifiers. For classes in a module, use a single leading underscore and they will not be imported using "from module import *".
class _MyInternalClass:
    def __my_private_method(self):
        pass
(There is no such thing as true "private" in Python. For example, Python just automatically mangles the names of class members with two leading underscores to _classname__member. So really, if you know the mangled name, you can use the "private" entity anyway. See here. And of course you can choose to manually import "internal" classes if you want to.)
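A tiny demonstration of that mangling:
class Secret:
    __token = 42              # stored on the class as _Secret__token

print(Secret._Secret__token)  # 42 -- the "private" name is still reachable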
In fact you can achieve something similar to private members by taking advantage of scoping. We can create a module-level class that creates new locally-scoped variables during creation of the class, then use those variables elsewhere in that class.
class Foo:
    def __new__(cls: "type[Foo]", i: int, o: object) -> "Foo":
        _some_private_int: int = i
        _some_private_obj: object = o
        foo = super().__new__(cls)

        def show_vars() -> None:
            print(_some_private_int)
            print(_some_private_obj)

        foo.show_vars = show_vars
        return foo

    def show_vars(self: "Foo") -> None:
        pass
We can then do, e.g.
foo = Foo(10, {"a":1})
foo.show_vars()
# 10
# {'a': 1}
Alternatively, here's a poor example that creates a class in a module that has access to variables scoped to the function in which the class is created. Do note that this state is shared between all instances (so be wary of this specific example). I'm sure there's a way to avoid this, but I'll leave that as an exercise for someone else.
def _foo_create():
    _some_private_int: int
    _some_private_obj: object

    class Foo:
        def __init__(self, i: int, o: object) -> None:
            nonlocal _some_private_int
            nonlocal _some_private_obj
            _some_private_int = i
            _some_private_obj = o

        def show_vars(self):
            print(_some_private_int)
            print(_some_private_obj)

    import sys
    sys.modules[__name__].Foo = Foo

_foo_create()
As far as I am aware, there is not a way to gain access to these locally-scoped variables, though I'd be interested to know otherwise, if it is possible.
I'm new to Python but as I understand it, Python isn't like Java.
Here's how it happens in Python:
class Student:
    __schoolName = 'XYZ School'      # private attribute

    def __nameprivamethod(self):     # private function
        print('two underscore')

class Student:
    _schoolName = 'XYZ School'       # protected attribute
Don't forget to check how to access the private and protected parts.
I would like to convert a singleton object programmatically into a Python module so that I can use the methods of this singleton object directly by importing them via the module instead of accessing them as object attributes. By "programmatically" I mean that I do not want to have to copy-paste the class methods explicitly into a module file. I need some sort of workaround that allows me to import the object methods into the global scope of another module.
I would really appreciate if someone could help me on this one.
Here is a basic example that should illustrate my problem:
mymodule.py
class MyClass:
    """This is my custom class"""

    def my_method(self):
        return "myValue"

singleton = MyClass()
main_as_is.py
from mymodule import MyClass
myobject = MyClass()
print(myobject.my_method())
main_to_be.py
from mymodule import my_method # or from mymodule.singleton import my_method
print(my_method())
You can use the same strategy that the standard random module uses. All the functions in that module are actually methods of a "private" instance of the Random class. That's convenient for most common uses of the module, although sometimes it's useful to create your own instances of Random so that you can have multiple independent random streams.
I've adapted your code to illustrate that technique. I named the class and its instance with a single leading underscore, since that's the usual convention in Python to signify a private name, but bear in mind it's simply a convention, Python doesn't do anything to enforce this privacy.
mymodule.py
class _MyClass:
    """ This is my custom class """

    def my_method(self):
        return "myValue"

_myclass = _MyClass()
my_method = _myclass.my_method
main_to_be.py
from mymodule import my_method
print(my_method())
output
myValue
BTW, the from mymodule import method1, method2 syntax is ok if you only import a small number of names, or it's clear from the name which module it's from (like math module functions and constants), and you don't import from many modules. Otherwise it's better to use this sort of syntax
import mymodule as mm
# Call a method from the module
mm.method1()
That way it's obvious which names are local, and which ones are imported and where they're imported from. Sure, it's a little more typing, but it makes the code a whole lot more readable. And it eliminates the possibility of name collisions.
FWIW, here's a way to automate adding all of the _myclass methods without explicitly listing them (but remember "explicit is better than implicit"). At the end of "mymodule.py", in place of my_method = _myclass.my_method, add this:
globals().update({k: getattr(_myclass, k) for k in _MyClass.__dict__
                  if not k.startswith('__')})
I'm not comfortable with recommending this, since it directly injects items into the globals() dict. Note that that code will add all class attributes, not just methods.
In your question you talk about singleton objects. We don't normally use singletons in Python, and many programmers in various OOP languages consider them to be an anti-pattern. See https://stackoverflow.com/questions/12755539/why-is-singleton-considered-an-anti-pattern for details. For this application there is absolutely no need at all to use a singleton. If you only want a single instance of _MyClass then simply don't create another instance of it, just use the instance that mymodule creates for you. But if your boss insists that you must use a singleton, please see the example code here.
tl;dr: How come property decorators work with class-level function definitions, but not with module-level definitions?
I was applying property decorators to some module-level functions, thinking they would allow me to invoke the methods by mere attribute lookup.
This was particularly tempting because I was defining a set of configuration functions, like get_port, get_hostname, etc., all of which could have been replaced with their simpler, more terse property counterparts: port, hostname, etc.
Thus, config.get_port() would just be the much nicer config.port
I was surprised when I found the following traceback, proving that this was not a viable option:
TypeError: int() argument must be a string or a number, not 'property'
I knew I had seen some precedent for property-like functionality at module level, as I had used it for scripting shell commands using the elegant but hacky pbs library.
The interesting hack below can be found in the pbs library source code. It enables the ability to do property-like attribute lookups at module-level, but it's horribly, horribly hackish.
# this is a thin wrapper around THIS module (we patch sys.modules[__name__]).
# this is in the case that the user does a "from pbs import whatever"
# in other words, they only want to import certain programs, not the whole
# system PATH worth of commands. in this case, we just proxy the
# import lookup to our Environment class
class SelfWrapper(ModuleType):
    def __init__(self, self_module):
        # this is super ugly to have to copy attributes like this,
        # but it seems to be the only way to make reload() behave
        # nicely. if i make these attributes dynamic lookups in
        # __getattr__, reload sometimes chokes in weird ways...
        for attr in ["__builtins__", "__doc__", "__name__", "__package__"]:
            setattr(self, attr, getattr(self_module, attr))

        self.self_module = self_module
        self.env = Environment(globals())

    def __getattr__(self, name):
        return self.env[name]
Below is the code for inserting this class into the import namespace. It actually patches sys.modules directly!
# we're being run as a stand-alone script, fire up a REPL
if __name__ == "__main__":
globs = globals()
f_globals = {}
for k in ["__builtins__", "__doc__", "__name__", "__package__"]:
f_globals[k] = globs[k]
env = Environment(f_globals)
run_repl(env)
# we're being imported from somewhere
else:
self = sys.modules[__name__]
sys.modules[__name__] = SelfWrapper(self)
Now that I've seen what lengths pbs has to go through, I'm left wondering why this facility of Python isn't built into the language directly. The property decorator in particular seems like a natural place to add such functionality.
Is there any particular reason or motivation for why this isn't built in directly?
This is related to a combination of two factors: first, that properties are implemented using the descriptor protocol, and second that modules are always instances of a particular class rather than being instantiable classes.
This part of the descriptor protocol is implemented in object.__getattribute__ (the relevant code is PyObject_GenericGetAttr starting at line 1319). The lookup rules go like this:
1. Search through the class mro for a type dictionary that has name
2. If the first matching item is a data descriptor, call its __get__ and return its result
3. If name is in the instance dictionary, return its associated value
4. If there was a matching item from the class dictionaries and it was a non-data descriptor, call its __get__ and return the result
5. If there was a matching item from the class dictionaries, return it
6. Raise AttributeError
The key to this is at number 3 - if name is found in the instance dictionary (as it will be with modules), then its value will just be returned - it won't be tested for descriptorness, and its __get__ won't be called. This leads to this situation (using Python 3):
>>> class F:
...     def __getattribute__(self, attr):
...         print('hi')
...         return object.__getattribute__(self, attr)
...
>>> f = F()
>>> f.blah = property(lambda: 5)
>>> f.blah
hi
<property object at 0xbfa1b0>
You can see that .__getattribute__ is being invoked, but isn't treating f.blah as a descriptor.
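For contrast, the same property attached to the class is treated as a data descriptor and does get invoked:
>>> F.blah2 = property(lambda self: 5)
>>> f.blah2
hi
5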
It is likely that the reason for the rules being structured this way is an explicit tradeoff between the usefulness of allowing descriptors on instances (and, therefore, in modules) and the extra code complexity that this would lead to.
Properties are a feature specific to classes (new-style classes specifically), so by extension the property decorator can only be applied to methods defined in a class body.
A new-style class is one that derives from object, i.e. class Foo(object):
Further info: Can modules have properties the same way that objects can?
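For completeness, the pbs-style wrapper from the question can be boiled down to a small sketch for the config case; the module and attribute names here are illustrative, not an existing API:
# config.py -- sketch only: the module replaces itself in sys.modules with an
# instance of a ModuleType subclass, so class-level properties apply to
# attribute access on the module
import sys
from types import ModuleType

class _ConfigModule(ModuleType):
    @property
    def port(self):
        return 8080            # illustrative value

    @property
    def hostname(self):
        return "localhost"     # illustrative value

_original = sys.modules[__name__]
_wrapper = _ConfigModule(__name__)
_wrapper.__dict__.update(_original.__dict__)
_wrapper._original = _original   # keep the original module alive, as pbs does
sys.modules[__name__] = _wrapper
After which import config; print(config.port) gives 8080, with the usual caveats about reload() and tooling that the pbs comments already mention.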
I have a large Python code base with many modules and classes. I have a special class whose single instance is needed everywhere throughout the code (it's a threaded application, and that instance of a class also holds thread-local storage, locks, etc.). It's a bit uncomfortable to always "populate" that instance in every imported module. I know using globals is not the best practice, but anyway: is there any "import hook" in Python that I can hook onto so my instance is available in every module without extra work? It should work for normal imports, for "from mod import ..." style imports, and for other import constructs too. If this is not possible, can you suggest a better solution? Certainly it's not fun to pass that instance to the constructors of every class, etc. Inheritance also does not help, since I have modules without classes, and I also need a single instance, not the class itself ...
class master():
    def import_module(self, name):
        mod = __import__(name)
        mod.m = self
        return mod

    [...]

m = master()
Currently I am thinking of something like this: but then I have to use m.import_module() to import modules, and the imported modules will then have the instance of the master class available under the name "m", so they can use m.import_module() too, etc. But then I have to give up using "normal" import statements, and I would have to write this:
example_mod = m.import_module("example_mod")
instead of just this:
import example_mod
(though I could certainly live with this too, since it assigns "m" to example_mod.m for me)
Certainly it's not fun to pass that instance to the constructors of every class
You don't have to do this. Set up your global class in a module, create the instance there, and import it wherever you need it:
# /myapp/enviroment/__init__.py
class ThatSingleInstanceClass:
    pass

# create the singleton object directly or have a function init the module
singleton = ThatSingleInstanceClass()

# /myapp/somewhere.py
# all you need to use the object is importing it
from myapp.enviroment import singleton

class SomeClass:
    def __init__(self):  # no need to pass that object
        print("Always the same object:", singleton)
What's wrong with having each module import the needed object? Explicit is better than implicit.