Python deleting class objects - python

I'm looking for a way to delete instances of a class, the methods I see to safely do this are like the one proposed by Clint Miller here. For what I'm trying to do, this does not seem like a viable option.
My question is, is the method below safe for getting rid of class instances, and if not, how (if at all) can it be modified to make it so?
class myClass:
_reg = []
def __init__(self, *args):
self.args = args
self._reg.append(self)
...
def close(self):
self._reg.pop(self._reg.index(self))
Instances of myClass are only ever created like this: myClass(arg1, arg2...) as opposed to this: a = myClass(arg1, arg2...) or through a function like myFunc. When the instance is no longer needed, its close function is called.
def myFunc(*args):
...
myClass(newArg1, newArg2...)
P.S You can assume that no new references will be made to items in myClass._reg and thanks in advance.
Edit: The registry exists as a way of keeping track of instances of myClass while they exist. It's not just an overcomplicated way of trying to immediately delete them (I probably wouldn't have created them were that the case).
The amount of time each instance needs to exist and perform functions that have been omitted for brevity (and relevance) varies a lot.
As for why I need to make sure unused instances are deleted new instances need to be created at unspecified times, and some of the intended functions of each instance will begin to clash with one another in ways that will be hard to deal with if they're not deleted or otherwise disabled. Besides that, the list of instances will almost certainly become extremely long as this happens which could cause other problems.

Related

Cleanest way to reset a class attribute

I have a class that looks like the following
class A:
communicate = set()
def __init__(self):
pass
...
def _some_func(self):
...some logic...
self.communicate.add(some_var)
The communicate variable is shared among the instances of the class. I use it to provide a convenient way for the instances of this class to communicate with one another (they have some mild orchestration needed and I don't want to force the calling object to serve as an intermediary of this communication). However, I realized this causes problems when I run my tests. If I try to test multiple aspects of my code, since the python interpreter is the same throughout all the tests, I won't get a "fresh" A class for the tests, and as such the communicate set will be the aggregate of all objects I add to that set (in normal usage this is exactly what I want, but for testing I don't want interactions between my tests). Furthermore, down the line this will also cause problems in my code execution if I want to loop over my whole process multiple times (because I won't have a way of resetting this class variable).
I know I can fix this issue where it occurs by having the creator of the A objects do something like
A.communicate = set()
before it creates and uses any instances of A. However, I don't really love this because it forces my caller to know some details about the communication pathways of the A objects, and I don't want that coupling. Is there a better way for me to to reset the communicate A class variable? Perhaps some method I could call on the class instead of an instance itself like (A.new_batch()) that would perform this resetting? Or is there a better way I'm not familiar with?
Edit:
I added a class method like
class A:
communicate = set()
def __init__(self):
pass
...
#classmethod
def new_batch(cls):
cls.communicate = set()
def _some_func(self):
...some logic...
self.communicate.add(some_var)
and this works with the caller running A.new_batch(). Is this the way it should be constructed and called, or is there a better practice here?

Should a Weakref object be called every time it's accessed?

I have a class structure where the instance of one class needs to hold a reference to an instance of the other. Reading through some other posts, the best (safest) way to do this, is using weakref. It would look like this:
class ClassA:
def __init__(self):
self.my_b = ClassB(self)
self.some_prop = 1
class ClassB:
def __init__(self, some_a):
self.some_a = weakref.ref(some_a)
The question that I have, is that to access some_prop through an instance of ClassB, you'd have call the reference which will make the object available, as per documentation:
self.some_a().some_prop
However, my question is whether calling the reference should be done every time. Can't we just call the weakref in init? I.e.
self.some_a = weakref.ref(some_a)()
and then access it (more naturally) like
self.some_a.some_prop
I have a feeling the first option is preferred, but I am trying to understand why. In my case there is no way that the referenced object gets deleted before the other.
For completeness' sake I will write this answer, which is a copy of #sanyash 's comment.
Python garbage collector is clever enough to detect cyclic references and delete both objects that reference each other if there is no reference to them in the outer scope. You really don't need to use weakref.

Python when to use instance vs static methods

I am struggling to understand when it makes sense to use an instance method versus a static method. Also, I don't know if my functions are static since there is not a #staticmethod decorator. Would I be able to access the class functions when I make a call to one of the methods?
I am working on a webscraper that sends information to a database. It’s setup to run once a week. The structure of my code looks like this
import libraries...
class Get:
def build_url(url_paramater1, url_parameter2, request_date):
return url_with_parameters
def web_data(request_date, url_parameter1, url_parameter2): #no use of self
# using parameters pull the variables to look up in the database
for a in db_info:
url = build_url(a, url_parameter2, request_date)
x = requests.Session().get(url, proxies).json()
#save data to the database
return None
#same type of function for pulling the web data from the database and parsing it
if __name__ == ‘__main__’:
Get.web_data(request_date, url_parameter1, url_parameter2)
Parse.web_data(get_date, parameter) #to illustrate the second part of the scrapper
That is the basic structure. The code is functional but I don’t know if I am using the methods (functions?) correctly and potentially missing out on ways to use my code in the future. I may even be writing bad code that will cause errors down the line that are impossibly hard to debug only because I didn’t follow best practices.
After reading about when class and instance methods are used. I cannot see why I would use them. If I want the url built or the data pulled from the website I call the build_url or get_web_data function. I don’t need an instance of the function to keep track of anything separate. I cannot imagine when I would need to keep something separate either which I think is part of the problem.
The reason I think my question is different than the previous questions is: the conceptual examples to explain the differences don't seem to help me when I am sitting down and writing code. I have not run into real world problems that are solved with the different methods that show when I should even use an instance method, yet instance methods seem to be mandatory when looking at conceptual examples of code.
Thank you!
Classes can be used to represent objects, and also to group functions under a common namespace.
When a class represents an object, like a cat, anything that this object 'can do', logically, should be an instance method, such as meowing.
But when you have a group of static functions that are all related to each other or are usually used together to achieve a common goal, like build_url and web_data, you can make your code clearer and more organized by putting them under a static class, which provides a common namespace, like you did.
Therefore in my opinion the structure you chose is legitimate. It is worth considering though, that you'd find static classes more in more definitively OOP languages, like Java, while in python it is more common to use modules for namespace separation.
This code doesn't need to be a class at all. It should just be a pair of functions. You can't see why you would need an instance method because you don't have a reason to instantiate the object in the first place.
The functions you have wrote in your code are instance methods but they were written incorrectly.
An instance method must have self as first parameter
i.e def build_url(self, url_paramater1, url_parameter2, request_date):
Then you call it like that
get_inst = Get()
get_inst.build_url(url_paramater1, url_parameter2, request_date)
This self parameter is provided by python and it allow you to access all properties and functions - static or not - of your Get class.
If you don't need to access other functions or properties in your class then you add #staticmethod decorator and remove self parameter
#staticmethod
def build_url(url_paramater1, url_parameter2, request_date):
And then you can call it directly
Get.build_url(url_paramater1, url_parameter2, request_date)
or call from from class instance
get_inst = Get()
get_inst.build_url(url_paramater1, url_parameter2, request_date)
But what is the problem with your current code you might ask?
Try calling it from an instance like this and u will see the problem
get_inst = Get()
get_inst.build_url(url_paramater1, url_parameter2, request_date)
Example where creating an instance is useful:
Let's say you want to make a chat client.
You could write code like this
class Chat:
def send(server_url, message):
connection = connect(server_url)
connection.write(message)
connection.close()
def read(server_url):
connection = connect(server_url)
message = connection.read()
connection.close()
return message
But a much cleaner and better way to do it:
class Chat:
def __init__(server_url):
# Initialize connection only once when instance is created
self.connection = connect(server_url)
def __del__()
# Close connection only once when instance is deleted
self.connection.close()
def send(self, message):
self.connection.write(message)
def read(self):
return self.connection.read()
To use that last class you do
# Create new instance and pass server_url as argument
chat = Chat("http://example.com/chat")
chat.send("Hello")
chat.read()
# deleting chat causes __del__ function to be called and connection be closed
delete chat
From given example, there is no need to have Get class after all, since you are using it just like a additional namespace. You do not have any 'state' that you want to preserve, in either class or class instance.
What seems like a good thing is to have separate module and define these functions in it. This way, when importing this module, you get to have this namespace that you want.

Pythonic approach to defining __init__ helper methods?

Please excuse me, I'm new to Python and trying to learn the Pythonic approach. I'm designing a class that essentially initializes its state from a number of different sources (files). I've isolated this behavior to a single instance method, _load_from_file(file). It's called a number of times in __init__, but I typically like to keep my constructors at the beginning of a class definition, and my internal helper methods towards the end.
However, if I were to take this approach, _load_from_file isn't defined at the point in __init__ where I'd like to use it. How do you pythonistas lay this situation out?
To elaborate:
class Thing(object):
def __init__(self, file_path):
f = open('file_path')
_load_from_file(self,"someData",f) # ISSUES!
_load_from_file(self,"moreData",f) # WRONG!
f.close()
# Interface
# ...
# Internal - Where do you guys put this stuff?
def _load_from_file(self,thingToLoad,file):
# logic
Are you sure it won't work in the order you're already using? Remember, you're not using C. The called method doesn't have to appear in the class definition before calling code, so long as it has been defined by the time it gets called.
I would, however, change this:
_load_from_file(self)
to this:
self._load_from_file()
Any name-not-defined error you were getting was not because your method call was at a file position earlier than the method's definition, but because you tried to call it like a global function instead of via an object on which the method is defined.

How dangerous is setting self.__class__ to something else?

Say I have a class, which has a number of subclasses.
I can instantiate the class. I can then set its __class__ attribute to one of the subclasses. I have effectively changed the class type to the type of its subclass, on a live object. I can call methods on it which invoke the subclass's version of those methods.
So, how dangerous is doing this? It seems weird, but is it wrong to do such a thing? Despite the ability to change type at run-time, is this a feature of the language that should completely be avoided? Why or why not?
(Depending on responses, I'll post a more-specific question about what I would like to do, and if there are better alternatives).
Here's a list of things I can think of that make this dangerous, in rough order from worst to least bad:
It's likely to be confusing to someone reading or debugging your code.
You won't have gotten the right __init__ method, so you probably won't have all of the instance variables initialized properly (or even at all).
The differences between 2.x and 3.x are significant enough that it may be painful to port.
There are some edge cases with classmethods, hand-coded descriptors, hooks to the method resolution order, etc., and they're different between classic and new-style classes (and, again, between 2.x and 3.x).
If you use __slots__, all of the classes must have identical slots. (And if you have the compatible but different slots, it may appear to work at first but do horrible things…)
Special method definitions in new-style classes may not change. (In fact, this will work in practice with all current Python implementations, but it's not documented to work, so…)
If you use __new__, things will not work the way you naively expected.
If the classes have different metaclasses, things will get even more confusing.
Meanwhile, in many cases where you'd think this is necessary, there are better options:
Use a factory to create an instance of the appropriate class dynamically, instead of creating a base instance and then munging it into a derived one.
Use __new__ or other mechanisms to hook the construction.
Redesign things so you have a single class with some data-driven behavior, instead of abusing inheritance.
As a very most common specific case of the last one, just put all of the "variable methods" into classes whose instances are kept as a data member of the "parent", rather than into subclasses. Instead of changing self.__class__ = OtherSubclass, just do self.member = OtherSubclass(self). If you really need methods to magically change, automatic forwarding (e.g., via __getattr__) is a much more common and pythonic idiom than changing classes on the fly.
Assigning the __class__ attribute is useful if you have a long time running application and you need to replace an old version of some object by a newer version of the same class without loss of data, e.g. after some reload(mymodule) and without reload of unchanged modules. Other example is if you implement persistency - something similar to pickle.load.
All other usage is discouraged, especially if you can write the complete code before starting the application.
On arbitrary classes, this is extremely unlikely to work, and is very fragile even if it does. It's basically the same thing as pulling the underlying function objects out of the methods of one class, and calling them on objects which are not instances of the original class. Whether or not that will work depends on internal implementation details, and is a form of very tight coupling.
That said, changing the __class__ of objects amongst a set of classes that were particularly designed to be used this way could be perfectly fine. I've been aware that you can do this for a long time, but I've never yet found a use for this technique where a better solution didn't spring to mind at the same time. So if you think you have a use case, go for it. Just be clear in your comments/documentation what is going on. In particular it means that the implementation of all the classes involved have to respect all of their invariants/assumptions/etc, rather than being able to consider each class in isolation, so you'd want to make sure that anyone who works on any of the code involved is aware of this!
Well, not discounting the problems cautioned about at the start. But it can be useful in certain cases.
First of all, the reason I am looking this post up is because I did just this and __slots__ doesn't like it. (yes, my code is a valid use case for slots, this is pure memory optimization) and I was trying to get around a slots issue.
I first saw this in Alex Martelli's Python Cookbook (1st ed). In the 3rd ed, it's recipe 8.19 "Implementing Stateful Objects or State Machine Problems". A fairly knowledgeable source, Python-wise.
Suppose you have an ActiveEnemy object that has different behavior from an InactiveEnemy and you need to switch back and forth quickly between them. Maybe even a DeadEnemy.
If InactiveEnemy was a subclass or a sibling, you could switch class attributes. More exactly, the exact ancestry matters less than the methods and attributes being consistent to code calling it. Think Java interface or, as several people have mentioned, your classes need to be designed with this use in mind.
Now, you still have to manage state transition rules and all sorts of other things. And, yes, if your client code is not expecting this behavior and your instances switch behavior, things will hit the fan.
But I've used this quite successfully on Python 2.x and never had any unusual problems with it. Best done with a common parent and small behavioral differences on subclasses with the same method signatures.
No problems, until my __slots__ issue that's blocking it just now. But slots are a pain in the neck in general.
I would not do this to patch live code. I would also privilege using a factory method to create instances.
But to manage very specific conditions known in advance? Like a state machine that the clients are expected to understand thoroughly? Then it is pretty darn close to magic, with all the risk that comes with it. It's quite elegant.
Python 3 concerns? Test it to see if it works but the Cookbook uses Python 3 print(x) syntax in its example, FWIW.
The other answers have done a good job of discussing the question of why just changing __class__ is likely not an optimal decision.
Below is one example of a way to avoid changing __class__ after instance creation, using __new__. I'm not recommending it, just showing how it could be done, for the sake of completeness. However it is probably best to do this using a boring old factory rather than shoe-horning inheritance into a job for which it was not intended.
class ChildDispatcher:
_subclasses = dict()
def __new__(cls, *args, dispatch_arg, **kwargs):
# dispatch to a registered child class
subcls = cls.getsubcls(dispatch_arg)
return super(ChildDispatcher, subcls).__new__(subcls)
def __init_subclass__(subcls, **kwargs):
super(ChildDispatcher, subcls).__init_subclass__(**kwargs)
# add __new__ contructor to child class based on default first dispatch argument
def __new__(cls, *args, dispatch_arg = subcls.__qualname__, **kwargs):
return super(ChildDispatcher,cls).__new__(cls, *args, **kwargs)
subcls.__new__ = __new__
ChildDispatcher.register_subclass(subcls)
#classmethod
def getsubcls(cls, key):
name = cls.__qualname__
if cls is not ChildDispatcher:
raise AttributeError(f"type object {name!r} has no attribute 'getsubcls'")
try:
return ChildDispatcher._subclasses[key]
except KeyError:
raise KeyError(f"No child class key {key!r} in the "
f"{cls.__qualname__} subclasses registry")
#classmethod
def register_subclass(cls, subcls):
name = subcls.__qualname__
if cls is not ChildDispatcher:
raise AttributeError(f"type object {name!r} has no attribute "
f"'register_subclass'")
if name not in ChildDispatcher._subclasses:
ChildDispatcher._subclasses[name] = subcls
else:
raise KeyError(f"{name} subclass already exists")
class Child(ChildDispatcher): pass
c1 = ChildDispatcher(dispatch_arg = "Child")
assert isinstance(c1, Child)
c2 = Child()
assert isinstance(c2, Child)
How "dangerous" it is depends primarily on what the subclass would have done when initializing the object. It's entirely possible that it would not be properly initialized, having only run the base class's __init__(), and something would fail later because of, say, an uninitialized instance attribute.
Even without that, it seems like bad practice for most use cases. Easier to just instantiate the desired class in the first place.
Here's an example of one way you could do the same thing without changing __class__. Quoting #unutbu in the comments to the question:
Suppose you were modeling cellular automata. Suppose each cell could be in one of say 5 Stages. You could define 5 classes Stage1, Stage2, etc. Suppose each Stage class has multiple methods.
class Stage1(object):
…
class Stage2(object):
…
…
class Cell(object):
def __init__(self):
self.current_stage = Stage1()
def goToStage2(self):
self.current_stage = Stage2()
def __getattr__(self, attr):
return getattr(self.current_stage, attr)
If you allow changing __class__ you could instantly give a cell all the methods of a new stage (same names, but different behavior).
Same for changing current_stage, but this is a perfectly normal and pythonic thing to do, that won't confuse anyone.
Plus, it allows you to not change certain special methods you don't want changed, just by overriding them in Cell.
Plus, it works for data members, class methods, static methods, etc., in ways every intermediate Python programmer already understands.
If you refuse to change __class__, then you might have to include a stage attribute, and use a lot of if statements, or reassign a lot of attributes pointing to different stage's functions
Yes, I've used a stage attribute, but that's not a downside—it's the obvious visible way to keep track of what the current stage is, better for debugging and for readability.
And there's not a single if statement or any attribute reassignment except for the stage attribute.
And this is just one of multiple different ways of doing this without changing __class__.
In the comments I proposed modeling cellular automata as a possible use case for dynamic __class__s. Let's try to flesh out the idea a bit:
Using dynamic __class__:
class Stage(object):
def __init__(self, x, y):
self.x = x
self.y = y
class Stage1(Stage):
def step(self):
if ...:
self.__class__ = Stage2
class Stage2(Stage):
def step(self):
if ...:
self.__class__ = Stage3
cells = [Stage1(x,y) for x in range(rows) for y in range(cols)]
def step(cells):
for cell in cells:
cell.step()
yield cells
For lack of a better term, I'm going to call this
The traditional way: (mainly abarnert's code)
class Stage1(object):
def step(self, cell):
...
if ...:
cell.goToStage2()
class Stage2(object):
def step(self, cell):
...
if ...:
cell.goToStage3()
class Cell(object):
def __init__(self, x, y):
self.x = x
self.y = y
self.current_stage = Stage1()
def goToStage2(self):
self.current_stage = Stage2()
def __getattr__(self, attr):
return getattr(self.current_stage, attr)
cells = [Cell(x,y) for x in range(rows) for y in range(cols)]
def step(cells):
for cell in cells:
cell.step(cell)
yield cells
Comparison:
The traditional way creates a list of Cell instances each with a
current stage attribute.
The dynamic __class__ way creates a list of instances which are
subclasses of Stage. There is no need for a current stage
attribute since __class__ already serves this purpose.
The traditional way uses goToStage2, goToStage3, ... methods to
switch stages.
The dynamic __class__ way requires no such methods. You just
reassign __class__.
The traditional way uses the special method __getattr__ to delegate
some method calls to the appropriate stage instance held in the
self.current_stage attribute.
The dynamic __class__ way does not require any such delegation. The
instances in cells are already the objects you want.
The traditional way needs to pass the cell as an argument to
Stage.step. This is so cell.goToStageN can be called.
The dynamic __class__ way does not need to pass anything. The
object we are dealing with has everything we need.
Conclusion:
Both ways can be made to work. To the extent that I can envision how these two implementations would pan-out, it seems to me the dynamic __class__ implementation will be
simpler (no Cell class),
more elegant (no ugly goToStage2 methods, no brain-teasers like why
you need to write cell.step(cell) instead of cell.step()),
and easier to understand (no __getattr__, no additional level of
indirection)

Categories