Is it possible to delete an object from inside its class?
class A():
    def __init__(self):
        print("init")
        self.b = "c"

    def __enter__(self):
        print("enter")
        return self

    def __exit__(self, type, value, traceback):
        print("exit")

with A() as a:
    print(a.b)
print(a.b)
returns:
init
enter
c
exit
c
How come I still have access to the a object after exiting the with block? Is there a way to auto-delete the object in __exit__?
Yes and no. Use del a after the with block. This removes the variable a, which holds the last reference to the object.
The object itself cannot, from within __exit__(), make the code that knows about it and holds a reference (i.e. the code around the with block) forget that reference. As long as the reference exists, the object will exist.
Of course, your object can empty itself in __exit__() and remain as a hollow thing (e.g. by del self.b in this case).
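A minimal sketch of the del approach (illustrative, not from the original answer):
class A:
    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        pass

with A() as a:
    pass

del a          # drop the last reference explicitly
# print(a)     # would now raise NameError: name 'a' is not defined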
Short answer: it is (to some extent) possible, but not advisable at all.
The with statement in Python has no dedicated scope, so variables bound in it are not removed afterwards. This is frequently the desired behavior. For example, if you load a file, you can write it like:
with open('foo.txt') as f:
    data = list(f)
print(data)
You do not want to remove the data variable: the with is used here to ensure that the file handle is properly closed (and it is also closed if an exception occurs in the body of the with).
Strictly speaking you can delete local variables that refer to the A() object, by a "hackish" solution: we inspect the call stack, and remove references to self (or another object), like:
import inspect

class A(object):
    def __enter__(self):
        return self

    def __exit__(self, type, value, traceback):
        locs = inspect.stack()[1][0].f_locals
        ks = [k for k, v in locs.items() if v is self]
        for k in ks:
            del locs[k]
Then the name is indeed removed:
>>> with A() as a:
...     pass
...
>>> a
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
NameError: name 'a' is not defined
But I would strongly advise against this. First of all, if the variable is global, or located outside the local scope, it will not get removed here (we could fix this, but it would introduce a lot of extra logic).
Furthermore, it is not guaranteed that such a variable even exists; if __enter__ returns an iterable, one can unpack it like:
# If A.__enter__ returns an iterable with two elements
with A() as (foo, bar):
    pass
In that case these names will not get removed. Finally, if __enter__ returns self, the hack may remove too much: one could write with foo as bar, and then both foo and bar will be removed.
Most IDEs will probably not be able to follow the logic in __exit__ anyway, and hence will still offer a in autocompletion.
In general, it is better to simply mark the object as closed, like:
class A(object):
    def __init__(self):
        self.closed = False

    def __enter__(self):
        return self

    def __exit__(self, type, value, traceback):
        self.closed = True

    def some_method(self):
        if self.closed:
            raise Exception('A object is closed')
        # process request
This is also the way it is handled for file objects.
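For comparison, this is roughly how a file object behaves after its with block (a small illustrative sketch, not from the original answer):
f = open('foo.txt', 'w')
with f:
    f.write('data')

print(f.closed)    # True: the object still exists, but it is marked as closed
try:
    f.write('more')
except ValueError as e:
    print(e)       # I/O operation on closed file.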
class A():
    def __init__(self):
        print("init")
        self.b = "c"

    def __enter__(self):
        print("enter")
        return self

    def __exit__(self, type, value, traceback):
        print("exit")
        del self.b

with A() as a:
    print(a.b)
print(a.b)
You can't delete the instance of class A itself from within __exit__. The best you can do is delete its attribute b.
init
enter
c
exit
Traceback (most recent call last):
File "main.py", line 14, in <module>
print(a.b)
AttributeError: A instance has no attribute 'b'
I saw a code snippet in Python 3.6.5 that can be replicated with the simplified example below, and I do not understand whether this is something concerning or not. I am surprised it works, honestly...
class Foo:
    def bar(numb):
        return numb
    A1 = bar(1)

print(Foo)
print(Foo.A1)
print(Foo.bar(17))
In all Python guides that I have seen, self appears as the first argument for all the purposes we know and love. When it is not, the methods are decorated with @staticmethod and all is well. This case works as it is, however. If I were to use @staticmethod on bar, I get a TypeError when setting A1:
Traceback (most recent call last):
File "/home/user/dir/understanding_classes.py", line 1, in <module>
class Foo:
File "/home/user/dir/understanding_classes.py", line 7, in Foo
A1 = bar(1)
TypeError: 'staticmethod' object is not callable
Is this something that is OK keeping in the code or is this a potential problem? I hope the question is not too broad, but how and why does this work?
The first parameter of the method will be set to the receiver. We call it self by convention, but self isn't a keyword; any valid parameter name would work just as well.
There are two different ways to invoke a method that are relevant here. Let's say we have a simple Person class with a name and a say_hi method:
class Person:
    def __init__(self, name):
        self.name = name

    def say_hi(self):
        print(f'Hi my name is {self.name}')

p = Person('J.K.')
If we call the method on p, we'll get a call to say_hi with self=p
p.say_hi() # self=p, prints 'Hi my name is J.K.'
What you're doing in your example is calling the method via the class, and passing that first argument explicitly. The equivalent call here would be
Person.say_hi(p) # explicit self=p, also prints 'Hi my name is J.K.'
In your example you're using a non-static method, calling it through the class, and explicitly passing the first parameter. It happens to work, but it doesn't make a lot of sense, because you should be able to invoke a non-static method like this:
f = Foo()
f.bar() # numb = f, works, but numb isn't a number it's a Foo
If you want to put a function inside a class that doesn't have a receiver, that's when you want to use @staticmethod (or, more often, @classmethod):
class Person:
    def __init__(self, name):
        self.name = name

    def say_hi(self):
        print(f'Hi my name is {self.name}')

    @staticmethod
    def say_hello():
        print('hello')

p = Person('J.K.')
Person.say_hello()
p.say_hello()
The following code reproduces the bug:
from multiprocessing import Process, set_start_method

class TestObject:
    def __init__(self) -> None:
        self.a = lambda *args: {}

    def __getattr__(self, item):
        return self.a

class TestProcess(Process):
    def __init__(self, testobject, **kwargs):
        super(TestProcess, self).__init__(**kwargs)
        self.testobject = testobject

    def run(self) -> None:
        print("heihei")
        print(self.testobject)

if __name__ == "__main__":
    set_start_method("spawn")
    testobject = TestObject()
    testprocess = TestProcess(testobject)
    testprocess.start()
Using 'spawn' causes an infinite loop in the method TestObject.__getattr__.
When the line set_start_method('spawn') is removed, everything works fine.
I would be very thankful to know why the infinite loop happens.
If you head over to pickle's documentation, you will find a note that says
At unpickling time, some methods like __getattr__(), __getattribute__(), or __setattr__() may be called upon the instance. In case those methods rely on some internal invariant being true, the type should implement __new__() to establish such an invariant, as __init__() is not called when unpickling an instance.
I am unsure of what exact conditions lead to a __getattribute__ call, but you can bypass the default behaviour by providing a __setstate__ method:
class TestObject:
    def __init__(self) -> None:
        self.a = lambda *args: {}

    def __getattr__(self, item):
        return self.a

    def __setstate__(self, state):
        self.__dict__ = state
If it's present, pickle calls this method with the unpickled state and you are free to restore it however you wish.
Now we have figured out what is really happening with this bug.
Before we look into the code, we should know two things:
First, when we define a __getattr__ method for our class, we should never access an attribute inside __getattr__ that may not exist on the class or the instance itself; otherwise it will cause an infinite loop. For example:
class TestObject:
    def __getattr__(self, item):
        return self.a

if __name__ == "__main__":
    testobject = TestObject()
    print(f"print a: {testobject.a}")
The result should be like this:
Traceback (most recent call last):
File "tmp_test.py", line 10, in <module>
print(f"print a: {testobject.a}")
File "tmp_test.py", line 6, in __getattr__
return self.a
File "tmp_test.py", line 6, in __getattr__
return self.a
File "tmp_test.py", line 6, in __getattr__
return self.a
[Previous line repeated 996 more times]
RecursionError: maximum recursion depth exceeded
Because a is not in the instance's __dict__, every time the lookup fails it falls back into __getattr__ again, which causes the infinite loop.
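One way to make such a __getattr__ safe against this recursion is to look only in the instance's __dict__ and raise AttributeError when the fallback attribute is missing (an illustrative sketch, not from the original post):
class SafeObject:
    def __init__(self):
        self.a = "fallback value"

    def __getattr__(self, item):
        # self.__dict__ is found on the type, so this lookup never
        # re-enters __getattr__; a missing key becomes AttributeError.
        try:
            return self.__dict__["a"]
        except KeyError:
            raise AttributeError(item)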
The next thing we should remember is how the pickle module in Python works. When pickling and unpickling an instance of a class, its dumps and loads functions (same for dump and load) will call the instance's __getstate__ (for dumps) and __setstate__ (for loads) methods. And guess where Python looks when our class does not define these two methods? Yes, the __getattr__ method! Normally this is fine when pickling the instance, because at that point the attributes used in __getattr__ still exist on the instance. But when unpickling, things go wrong.
This is how the pickle module documentation describes pickling class instances: https://docs.python.org/3/library/pickle.html#pickling-class-instances.
And here is what we should notice: when unpickling an instance of a class, pickle does not call __init__ to create the instance! So when unpickling, pickle's loads function checks whether the re-instantiated object has a __setstate__ method; as we said above, that lookup goes into __getattr__, but at that point the attributes the instance once owned have not yet been restored (that happens later, at obj.__dict__.update(attributes)), so, bingo, the infinite loop appears!
To reproduce the exact bug, you can run this code:
import pickle

class TestClass:
    def __init__(self):
        self.w = 1

class Test:
    def __init__(self):
        self.a = TestClass()

    def __getattr__(self, item):
        print(f"{item} begin.")
        print(self.a)
        print(f"{item} end.")
        try:
            return self.a.__getattribute__(item)
        except AttributeError as e:
            raise e

    # def __getstate__(self):
    #     return self.__dict__
    #
    # def __setstate__(self, state):
    #     self.__dict__ = state

if __name__ == "__main__":
    test = Test()
    print(test.w)
    test_data = pickle.dumps(test)
    new_test = pickle.loads(test_data)
    print(new_test.w)
You should get the infinite recursion when the __getstate__ and __setstate__ methods are not added, and adding them fixes it. You can also watch the printed info to see that the recursion happens at __getattr__('__setstate__').
And the connection between this pickle bug and our multiprocessing bug from the beginning is that, when using 'spawn', the child process pickles the parent process's objects and then unpickles them to inherit them. So now everything makes sense.
In both Python 2 and Python 3, the __name__ of a function is not used in the stack trace; the original name (the one specified after def) is used instead.
Consider the example:
import traceback

def a():
    return b()

def b():
    return c()

def c():
    print("\n".join(line.strip() for line in traceback.format_stack()))

a.__name__ = 'A'
b.__name__ = 'B'
c.__name__ = 'C'

a();
The output is:
File "test.py", line 16, in <module>
a();
File "test.py", line 4, in a
return b()
File "test.py", line 7, in b
return c()
File "test.py", line 10, in c
print("\n".join(line.strip() for line in traceback.format_stack()))
Why so? How do I change the name that is used in the stack trace? Where is the __name__ attribute used then?
So, basically every function has three things that can be considered the name of the function:
The original name of the code block
It's stored in f.__code__.co_name (where f is the function object). If you use def orig_name to create the function, orig_name is that name. For lambdas it's <lambda>.
This attribute is read-only and can't be changed, so the only way I'm aware of to create a function with a custom name at runtime is exec:
exec("""def {name}():
print '{name}'
""".format(name='any')) in globals()
any() # prints 'any'
(There is also a more low-level way to do this, mentioned in a comment to the question.)
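For completeness, here is a sketch of that lower-level route (this assumes Python 3.8+, where code objects have a replace() method; it is not from the original answer):
import types

def orig_name():
    return "hello"

# Build a new code object with a different co_name, then wrap it
# in a new function object.
renamed_code = orig_name.__code__.replace(co_name="renamed")
renamed = types.FunctionType(renamed_code, orig_name.__globals__, "renamed")

print(renamed())                  # hello
print(renamed.__code__.co_name)   # renamed -- this is what tracebacks show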
The immutability of co_name actually makes sense: with that you can be sure that the name you see in the debugger (or just a stack trace) is exactly the same as the one you see in the source code (along with the filename and line number).
The __name__ attribute of the function object
In Python 2 it's also aliased to func_name.
You can modify it (orig_name.__name__ = 'updated name') and you surely do so on a daily basis: @functools.wraps copies the __name__ of the decorated function onto the wrapper.
__name__ is used by tools like pydoc; that's why you need @functools.wraps: so that you don't see the technical details of every decorator in your documentation. Look at the example:
from functools import wraps

def decorator1(f):
    def decorated(*args, **kwargs):
        print('start1')
        f(*args, **kwargs)
    return decorated

def decorator2(f):
    @wraps(f)
    def decorated(*args, **kwargs):
        print('start2')
        f(*args, **kwargs)
    return decorated

@decorator1
def test1():
    print('test1')

@decorator2
def test2():
    print('test2')
Here is the pydoc output:
FUNCTIONS
decorator1(f)
decorator2(f)
test1 = decorated(*args, **kwargs)
test2(*args, **kwargs)
With wraps there is no sign of decorated in the documentation.
Name of the reference
One more thing that can be called the function's name (though it hardly is) is the name of the variable or attribute in which a reference to that function is stored.
If you create a function with def name, the function is bound to the name name in the current scope. In the case of a lambda you should assign the result to some variable: name = lambda: None.
Obviously you can create more than one reference to the same function, and all those references can have different names.
The only place where all three things are connected is the def foo statement, which creates a function object with both __name__ and __code__.co_name equal to foo and binds it to the name foo in the current scope. But they are not tied together in any way and can all differ from each other:
import traceback

def make_function():
    def orig_name():
        """Docstring here
        """
        traceback.print_stack()
    return orig_name

globals()['name_in_module'] = make_function()

name_in_module.__name__ = 'updated name'
name_in_module()
Output:
File "my.py", line 13, in <module>
name_in_module()
File "my.py", line 7, in orig_name
traceback.print_stack()
Pydoc:
FUNCTIONS
make_function()
name_in_module = updated name()
Docstring here
I thank other people for their comments and answers; they helped me organize my thoughts and knowledge.
I tried to explore the CPython implementation (I am definitely not an expert). As pointed out in the comments, when the stack entry of f is printed, the attribute f.__code__.co_name is used. Also, f.__name__ is initially set to f.__code__.co_name, but when you modify the former, the latter is not modified accordingly.
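A quick sketch of that (illustrative, not from the original answer):
def f():
    pass

print(f.__name__, f.__code__.co_name)   # f f
f.__name__ = 'g'
print(f.__name__, f.__code__.co_name)   # g f -- co_name is unchanged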
Therefore, I tried to modify that directly, but it is not possible:
>>> f.__code__.co_name = 'g'
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
TypeError: readonly attribute
>>>
Why are there two ways to name a function? Well, according to the documentation, __name__ is defined for "class, function, method, descriptor, or generator instance", so in the case of functions it maps to that attribute; for other objects it maps to something else.
How do you "disable" the __call__ method on a subclass so the following would be true:
class Parent(object):
    def __call__(self):
        return

class Child(Parent):
    def __init__(self):
        super(Child, self).__init__()
        object.__setattr__(self, '__call__', None)
>>> c = Child()
>>> callable(c)
False
This and other ways of trying to set __call__ to some non-callable value still result in the child appearing as callable.
You can't. As jonrsharpe points out, there's no way to make Child appear not to have the attribute, and that's what callable(Child()) relies on to produce its answer. Even making it a descriptor that raises AttributeError won't work, per this bug report: https://bugs.python.org/issue23990. A Python 2 example:
>>> class Parent(object):
...     def __call__(self): pass
...
>>> class Child(Parent):
...     __call__ = property()
...
>>> c = Child()
>>> c()
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
AttributeError: unreadable attribute
>>> c.__call__
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
AttributeError: unreadable attribute
>>> callable(c)
True
This is because callable(...) doesn't exercise the descriptor protocol. Actually calling the object, or accessing a __call__ attribute, involves retrieving the method through the normal descriptor protocol, even if it's behind a property. But callable(...) doesn't bother going that far: if it finds anything at all, it is satisfied, and every subclass of Parent will have something for __call__ -- either an attribute defined in the subclass, or the definition inherited from Parent.
So while you can make actually calling the instance fail with any exception you want, you can't ever make callable(some_instance_of_parent) return False.
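A small illustration of that point (a sketch, not from the original answer): callable() only looks at the type, so instance-level tricks change nothing:
class Parent(object):
    def __call__(self):
        print('Parent.__call__')

class Child(Parent):
    pass

c = Child()
print(callable(c))            # True: the *type* of c has a __call__ slot
# Shadowing __call__ on the instance changes nothing, because special
# methods (and callable()) look at the type, not the instance:
c.__dict__['__call__'] = None
print(callable(c))            # still True
c()                           # still runs Parent.__call__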
It's a bad idea to change the public interface of a class so radically between the parent and the subclass.
As pointed out elsewhere, you can't un-inherit __call__. If you really need to mix callable and non-callable classes, you should use another test (such as a class attribute) or simply make it safe to call the variants that have no functionality.
To do the latter, you could override __call__ to raise NotImplementedError (or better, a custom exception of your own) if for some reason you wanted to mix a non-callable class in with the callable variants:
class NotACallableInstanceException(Exception):
    pass

class Parent(object):
    def __call__(self):
        print("called")

class Child(Parent):
    def __call__(self):
        raise NotACallableInstanceException()

for child_or_parent in list_of_children_and_parents():
    try:
        child_or_parent()
    except NotACallableInstanceException:
        pass
Or, just override __call__ with pass:
class Parent(object):
    def __call__(self):
        print("called")

class Child(Parent):
    def __call__(self):
        pass
This will still be callable, but it will just be a no-op.
I have an interface class called iResource, and a number of subclasses, each of which implement the "request" method. The request functions use socket I/O to other machines, so it makes sense to run them asynchronously, so those other machines can work in parallel.
The problem is that when I start a thread with iResource.request and give it an instance of a subclass as the first argument, it'll call the superclass method. If I try to start it with "type(a).request" and "a" as the first argument, I get "<type 'instance'>" for the value of type(a). Any ideas what that means and how to get the true type of the method? Can I formally declare an abstract method in Python somehow?
EDIT: Including code.
def getSocialResults(self, query=''):
    # for a in self.types["social"]: print type(a)
    tasks = [type(a).request for a in self.types["social"]]
    argss = [(a, query, 0) for a in self.types["social"]]
    grabbers = executeChainResults(tasks, argss)
    return igrabber.cycleGrabber(grabbers)
"executeChainResults" takes a list "tasks" of callables and a list "argss" of args-tuples, and assumes each returns a list. It then executes each in a separate thread, and concatenates the lists of results. I can post that code if necessary, but I haven't had any problems with it so I'll leave it out for now.
The objects "a" are DEFINITELY not of type iResource, since it has a single constructor that just throws an exception. However, replacing "type(a).request" with "iResource.request" invokes the base class method. Furthermore, calling "self.types["social"][0].request" directly works fine, but the above code gives me: "type object 'instance' has no attribute 'request'".
Uncommenting the commented line prints <type 'instance'> several times.
You can just use the bound method object itself:
tasks = [a.request for a in self.types["social"]]
#        ^^^^^^^^^
grabbers = executeChainResults(tasks, [(query, 0)] * len(tasks))
#                                     ^^^^^^^^^^^^^^^^^^^^^^^^^
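A quick illustration of why the bound method is enough (Resource here is a hypothetical stand-in for one of your iResource subclasses):
class Resource(object):
    def request(self, query, flag):
        return ["%s:%s" % (query, flag)]

r = Resource()
bound = r.request            # a bound method already carries its instance
print(bound("foo", 0))       # no need to pass r explicitly
print(bound.__self__ is r)   # True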
If you insist on calling your methods through the base class you could also do it like this:
from abc import ABCMeta
from functools import wraps
def virtualmethod(method):
    method.__isabstractmethod__ = True
    @wraps(method)
    def wrapper(self, *args, **kwargs):
        return getattr(self, method.__name__)(*args, **kwargs)
    return wrapper

class IBase(object):
    __metaclass__ = ABCMeta

    @virtualmethod
    def my_method(self, x, y):
        pass

class AddImpl(IBase):
    def my_method(self, x, y):
        return x + y

class MulImpl(IBase):
    def my_method(self, x, y):
        return x * y

items = [AddImpl(), MulImpl()]
for each in items:
    print IBase.my_method(each, 3, 4)

b = IBase()  # <-- crash
Result:
7
12
Traceback (most recent call last):
File "testvirtual.py", line 30, in <module>
b = IBase()
TypeError: Can't instantiate abstract class IBase with abstract methods my_method
Python doesn't support interfaces the way e.g. Java does. But with the abc module you can ensure that certain methods must be implemented in subclasses. Normally you would do this with the abc.abstractmethod() decorator, but you still could not call the subclass's method through the base class, as you intend. I had a similar question once and came up with the virtualmethod() decorator. It's quite simple. It essentially does the same thing as abc.abstractmethod(), but also redirects the call to the subclass's method. The specifics of the abc module can be found in the docs and in PEP 3119.
BTW: I assume you're using Python >= 2.6.
You get the "<type 'instance'>" reference when you are using an "old style class" in Python, i.e. a class not derived from the "object" type hierarchy. Old style classes do not work with several of the newer features of the language, including descriptors. And, among other things, you can't retrieve an attribute (or method) from the class of an old-style instance the way you are doing:
>>> class C(object):
...     def c(self): pass
...
>>> c = C()
>>> type(c)
<class '__main__.C'>
>>> type(c).c
<unbound method C.c>
>>> class D:  # not inheriting from object: old style class
...     def d(self): pass
...
>>> d = D()
>>> type(d)
<type 'instance'>
>>> type(d).d
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: type object 'instance' has no attribute 'd'
>>>
Therefore, just make your base class inherit from "object" instead of nothing, and check whether you still get the error message when requesting the "request" method from type(a).
As for your other observation:
"The problem is that when I start a thread with iResource.request and give it a subclass as the first argument, it'll call the superclass method."
It seems that the "right" thing for it to do is exactly that:
>>> class A(object):
...     def b(self):
...         print "super"
...
>>> class B(A):
...     def b(self):
...         print "child"
...
>>> b = B()
>>> A.b(b)
super
>>>
Here, I call a method on the class "A", giving it a specialized instance of "A" (an instance of "B"): the method executed is still the one defined in class "A".