Say I want to create classes for car, tractor and boat. All these classes have an instance of engine, and I want to keep track of all the engines in a single list. If I understand correctly, if the engine object is mutable I can store it as an attribute of car and also store the same instance in a list.
I can't track down any solid info on whether user-defined classes are mutable and whether you get to choose when you define them. Can anybody shed some light?
User classes are considered mutable. Python doesn't have (absolutely) private attributes, so you can always change a class by reaching into the internals.
For using your class as a key in a dict or storing them in a set, you can define a .__hash__() method and a .__eq__() method, making a promise that your class is immutable. You generally design your class API to not mutate the internal state after creation in such cases.
For example, if your engines are uniquely defined by their id, you can use that as the basis of your hash:
class Engine(object):
    def __init__(self, id):
        self.id = id
    def __hash__(self):
        return hash(self.id)
    def __eq__(self, other):
        if isinstance(other, self.__class__):
            return self.id == other.id
        return NotImplemented
Now you can use instances of class Engine in sets:
>>> eng1 = Engine(1)
>>> eng2 = Engine(2)
>>> eng1 == eng2
False
>>> eng1 == eng1
True
>>> eng1 == Engine(1)
True
>>> engines = set([eng1, eng2])
>>> engines
set([<__main__.Engine object at 0x105ebef10>, <__main__.Engine object at 0x105ebef90>])
>>> engines.add(Engine(1))
>>> engines
set([<__main__.Engine object at 0x105ebef10>, <__main__.Engine object at 0x105ebef90>])
In the above sample I add another Engine(1) instance to the set, but it is recognized as already present and the set didn't change.
Note that as far as lists are concerned, the .__eq__() implementation is the important one; lists don't care if an object is mutable or not, but with the .__eq__() method in place you can test if a given engine is already in a list:
>>> Engine(1) in [eng1, eng2]
True
All objects (with the exception of a few in the standard library, some that implement special access mechanisms using things like descriptors and decorators, or some implemented in C) are mutable. This includes instances of user defined classes, classes themselves, and even the type objects that define the classes. You can even mutate a class object at runtime and have the modifications manifest in instances of the class created before the modification. By and large, things are only immutable by convention in Python if you dig deep enough.
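A minimal sketch of the runtime-mutation point above. The class name Widget and the function describe are hypothetical, chosen only for illustration; the last part shows one of the exceptions, a built-in type implemented in C.

```python
class Widget(object):
    """Hypothetical class used only for this illustration."""
    pass

w = Widget()  # instance created before the class is modified

# Mutate the class object itself at runtime by attaching a new method.
def describe(self):
    return "I am a " + type(self).__name__

Widget.describe = describe

# The existing instance finds the method through normal attribute lookup.
message = w.describe()
print(message)  # I am a Widget

# Built-in types implemented in C refuse this kind of modification.
builtin_error = None
try:
    int.describe = describe
except TypeError as exc:
    builtin_error = exc
    print("cannot modify int:", exc)
```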
I think you're confusing mutability with how python keeps references -- Consider:
class Foo(object):
    pass
t = (1, 2, Foo())  # t is a tuple, so t is immutable
b = t[2]  # b is an instance of Foo
b.foo = "Hello"  # b is mutable. (I just changed it)
print(hash(b))  # b is hashable -- although the default hash isn't very useful
d = {b: 3}  # since b is hashable, it can be used as a key in a dictionary (or set).
c = t  # even though t is immutable, we can create multiple references to it.
a = [t]  # here we add another reference to t in a list.
Now to your question about getting/storing a list of engines globally -- There are a few different ways to do this, here's one:
class Engine(object):
    def __init__(self, make, model):
        self.make = make
        self.model = model
class EngineFactory(object):
    def __init__(self, **kwargs):
        self._engines = kwargs
    def all_engines(self):
        return self._engines.values()
    def __call__(self, make, model):
        """ Return the same object for each make, model combination requested """
        if (make, model) in self._engines:
            return self._engines[(make, model)]
        else:
            a = self._engines[(make, model)] = Engine(make, model)
            return a
engine_factory = EngineFactory()
engine1 = engine_factory('cool_engine', 1.0)
engine2 = engine_factory('cool_engine', 1.0)
engine1 is engine2  # True !!! They're the same engine. Changing engine1 changes engine2
The example above could be improved a little bit by having the EngineFactory._engines dict store weakref.ref objects instead of actually storing real references to the objects. In that case, you'd check to make sure the reference is still alive (hasn't been garbage collected) before you return a new reference to the object.
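A rough sketch of that idea, under stated assumptions: instead of storing weakref.ref objects by hand it uses weakref.WeakValueDictionary from the standard library, which drops an entry on its own once the engine is no longer referenced elsewhere. The class name WeakEngineFactory is hypothetical.

```python
import weakref

class Engine(object):
    def __init__(self, make, model):
        self.make = make
        self.model = model

class WeakEngineFactory(object):
    """Hypothetical variant of EngineFactory that holds only weak references."""
    def __init__(self):
        # Entries disappear once no strong reference to the engine remains.
        self._engines = weakref.WeakValueDictionary()

    def __call__(self, make, model):
        key = (make, model)
        engine = self._engines.get(key)  # None if never created or already collected
        if engine is None:
            engine = self._engines[key] = Engine(make, model)
        return engine

factory = WeakEngineFactory()
engine1 = factory('cool_engine', 1.0)
engine2 = factory('cool_engine', 1.0)
same = engine1 is engine2
print(same)  # True while engine1 keeps the object alive
```

When exactly an unreferenced engine leaves the dictionary depends on the interpreter's garbage collection, so code should not rely on the timing.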
EDIT: This is conceptually wrong. The question "The immutable object in python" can shed some light as to why.
class Engine():
    def __init__(self, sn):
        self.sn = sn
a = Engine(42)
b = a
print(a is b)
prints True.
I have three variables that are closely tied together, and I do not want to pass them separately every time I call a function. What is the proper way to bundle them?
Context: The purpose of the variables is to keep track of some properties of a document while I am reading it word by word.
My current approach is to bundle them in a class:
class MarkdownIsOpen(object):
    def __init__(self):
        self.ChapterOpen = False
        self.SectionOpen = False
        self.ArticleOpen = False
But this seems a bit wrong to me, as I do not intend to add any methods or other functionalities.
A namedtuple would be perfect if it were mutable.
What would be the proper (most pythonic) way to bundle the three variables?
Use a dataclass:
from dataclasses import dataclass

@dataclass
class MarkdownIsOpen:
    ChapterOpen: bool = False
    SectionOpen: bool = False
    ArticleOpen: bool = False
Or:
from dataclasses import make_dataclass
MarkdownIsOpen = make_dataclass('MarkdownIsOpen', ['ChapterOpen', 'SectionOpen', 'ArticleOpen'])
Note that this requires Python 3.7 or later.
If you're using Python <= 3.6, then an ordinary class will do as well. Classes are not expensive, and they provide a hint to the user that your function does not expect any old dict-like, but a special container with the following attributes.
Compare this to, for example, C's struct or Scala's case class, which serve largely the same purpose.
Also, you can define __slots__ to prevent the addition of new attributes, and __getitem__ and __setitem__ to allow dict-like access:
class MarkdownIsOpen:
    __slots__ = ('ChapterOpen', 'SectionOpen', 'ArticleOpen')
    def __init__(self):
        self.ChapterOpen = False
        self.SectionOpen = False
        self.ArticleOpen = False
    def __getitem__(self, key):
        # m['ChapterOpen'] reads the attribute of the same name
        return getattr(self, key)
    def __setitem__(self, key, value):
        # m['ChapterOpen'] = True sets it; names outside __slots__ raise AttributeError
        setattr(self, key, value)
Example:
m = MarkdownIsOpen()
m['ChapterOpen'] = True
print(m['SectionOpen'])
m['Nonexistent'] = False
Output:
False
AttributeError: 'MarkdownIsOpen' object has no attribute 'Nonexistent'
You can use dataclasses.
from dataclasses import dataclass

@dataclass
class MarkdownIsOpen:
    ChapterOpen: bool = False
    SectionOpen: bool = False
    ArticleOpen: bool = False
You may want to take a look at this question: Existence of mutable named tuple in Python? It has two nice answers, covering recordclass and namedlist, which are mutable alternatives to named tuples.
You could use a simple named tuple or a simple dictionary for that purpose, if you really never need to define any methods on the class.
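A small sketch of both options, reusing the field names from the question. The variable names state and flags are made up for the example. Because a named tuple is immutable, "changing" a field means building a new tuple with _replace and rebinding the name.

```python
from collections import namedtuple

MarkdownIsOpen = namedtuple('MarkdownIsOpen',
                            ['ChapterOpen', 'SectionOpen', 'ArticleOpen'])

# Named tuple: immutable, so an "update" builds a new tuple and rebinds the name.
state = MarkdownIsOpen(False, False, False)
state = state._replace(ChapterOpen=True)
print(state.ChapterOpen)  # True

# Dictionary: mutable in place, but a mistyped key silently creates a new entry.
flags = {'ChapterOpen': False, 'SectionOpen': False, 'ArticleOpen': False}
flags['ChapterOpen'] = True
print(flags['ChapterOpen'])  # True
```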
In python3, I have a class. Like below:
class Foo:
    def __init__(self):
        self.x = 3
    def fcn(self, val):
        self.x += val
Then I instantiate objects of that class, like so:
new_obj = Foo()
new_obj2 = Foo()
Now when I hash these objects, I get different hash values. I need them to return the same hash, as they are the same objects (in theory).
Any idea how I can do this?
Thank you to all who answered. You're right that instantiating a new instance of the same class is not actually the same object, as it occupies a different place in memory. What I ended up doing is similar to what @nosklo suggested.
I created a 'get_hashables' function that returned a dictionary with all the properties of the class that would constitute a unique class object, like so:
def get_hashables(self):
    return {'data': self.data, 'result': self.result}
Then my main method would take these 'hashable' variables, and hash them to produce the hash itself.
class Foo:
    def __init__(self):
        self.x = 3
    def fcn(self, val):
        self.x += val
    def __hash__(self):
        return hash(self.x)
This will calculate the hash using self.x, which means the hash will be the same whenever self.x is the same. You can return anything from __hash__, but to prevent consistency bugs you should return the same hash if the objects compare equal. More about that in the docs.
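To make the "same hash if the objects compare equal" rule concrete, here is a sketch that adds an __eq__ method comparing the same data that __hash__ uses. The x parameter on __init__ is an addition for the example and is not part of the class in the question.

```python
class Foo:
    def __init__(self, x=3):  # the x parameter is an assumption for this sketch
        self.x = x

    def __eq__(self, other):
        if isinstance(other, Foo):
            return self.x == other.x
        return NotImplemented

    def __hash__(self):
        # Equal objects must hash equally, so hash the data __eq__ compares.
        return hash(self.x)

first = Foo()
second = Foo()
print(first == second)              # True
print(hash(first) == hash(second))  # True
print(len({first, second}))         # 1
```

One caveat: if x changes while the object sits in a set or is used as a dict key, later lookups can miss it, which is why hashable classes are usually treated as immutable.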
They are not the same object. The expression Foo() calls the class, which creates a new, unique instance and initializes it with Foo.__init__ on each call. Your two calls return two independent objects, residing in different memory locations, each with its own x attribute.
You might want to read up on Python class and instance theory.
Sorry if this is worded badly, I hope you can understand/edit my question to make it easier to understand.
I've been using Python's pickle to pickle/unpickle the state of the objects in a game (I do understand this is probably inefficient in terms of storage and generally lazy, but it's only while I'm learning more Python). However, I encounter errors when doing this with the classes for presenting information.
The issue at root I believe is that when I unpickle the save data to load, it overwrites the existing dictionaries but the object storage points change, so the information class is trying to detect a room that the player can no longer enter since the data was overwritten.
I've made a snippet to reproduce the issue I have:
import pickle
class A(object):
    def __init__(self):
        pass
obj_dict = {
    'a' : A(),
    'b' : A()
    ## etc.
}
d = obj_dict['a']
f = open('save', 'wb')
pickle.Pickler(f,2).dump(obj_dict)
f.close()
f = open('save', 'rb')
obj_dict = pickle.load(f)
f.close()
if d == obj_dict['a']:
    print('success')
else:
    print(str(d) + '\n' + str(obj_dict['a']))
I understand this is probably to be expected when rewriting variables like this, but is there a way around it? Many thanks
Is your issue that you want d == obj_dict['a'] to evaluate to true?
By default, the above == equality check will compare the references of the two objects, i.e. do d and obj_dict['a'] point to the same chunk of memory?
When you un-pickle your object, it will be created as a new object, in a new chunk of memory and thus your equality check will fail.
You need to override how your equality check behaves to get the behavior you want. The methods you need to override are: __eq__ and __hash__.
In order to track your object through repeated pickling and un-pickling, you'll need to assign a unique id to the object on creation:
import uuid
class A:
    def __init__(self):
        self.id = uuid.uuid4()  # assign a unique, random id
Now you must override the methods mentioned above:
    def __eq__(self, other):
        # is the other object also a class A and does it have the same id
        return isinstance(other, A) and self.id == other.id
    def __hash__(self):
        return hash(self.id)
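Putting the pieces together, a minimal sketch of the round trip. It pickles in memory with pickle.dumps and pickle.loads instead of going through the 'save' file, and the name restored is made up for the example.

```python
import pickle
import uuid

class A:
    def __init__(self):
        self.id = uuid.uuid4()  # unique, random id that survives pickling

    def __eq__(self, other):
        return isinstance(other, A) and self.id == other.id

    def __hash__(self):
        return hash(self.id)

obj_dict = {'a': A(), 'b': A()}
d = obj_dict['a']

# Round-trip through pickle in memory instead of a file.
restored = pickle.loads(pickle.dumps(obj_dict, protocol=2))

print(d == restored['a'])  # True: same id, so the objects compare equal
print(d is restored['a'])  # False: unpickling still builds a new object
print(d == restored['b'])  # False: different id
```

Note that d and restored['a'] are still two separate objects, so a change made to one after loading is not reflected in the other.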
I want two objects to share a single string object. How do I pass the string object from the first to the second such that any changes applied by one will be visible to the other? I am guessing that I would have to wrap the string in a sort of buffer object and do all sorts of complexity to get it to work.
However, I have a tendency to overthink problems, so undoubtedly there is an easier way. Or maybe sharing the string is the wrong way to go? Keep in mind that I want both objects to be able to edit the string. Any ideas?
Here is an example of a solution I could use:
class Buffer(object):
    def __init__(self):
        self.data = ""
    def assign(self, value):
        self.data = str(value)
    def __getattr__(self, name):
        return getattr(self.data, name)
class Descriptor(object):
    def __get__(self, instance, owner):
        return instance._buffer.data
    def __set__(self, instance, value):
        if not hasattr(instance, "_buffer"):
            if isinstance(value, Buffer):
                instance._buffer = value
                return
            instance._buffer = Buffer()
        instance._buffer.assign(value)
class First(object):
    data = Descriptor()
    def __init__(self, data):
        self.data = data
    def read(self, size=-1):
        if size < 0:
            size = len(self.data)
        data = self.data[:size]
        self.data = self.data[size:]
        return data
class Second(object):
    data = Descriptor()
    def __init__(self, data):
        self.data = data
    def add(self, newdata):
        self.data += newdata
    def reset(self):
        self.data = ""
    def spawn(self):
        return First(self._buffer)
s = Second("stuff")
f = s.spawn()
f.data == s.data
#True
f.read(2)
#"st"
f.data
# "uff"
f.data == s.data
#True
s.data
#"uff"
s._buffer == f._buffer
#True
Again, this seems like absolute overkill for what seems like a simple problem. As well, it requires the use of the Buffer class, a descriptor, and the descriptor's impositional _buffer variable.
An alternative is to put one of the objects in charge of the string and then have it expose an interface for making changes to the string. Simpler, but not quite the same effect.
"I want two objects to share a single string object."
They will, if you simply pass the string -- Python doesn't copy unless you tell it to copy.
"How do I pass the string object from the first to the second such that any changes applied by one will be visible to the other?"
There can never be any change made to a string object (it's immutable!), so your requirement is trivially met (since a false precondition implies anything).
"I am guessing that I would have to wrap the string in a sort of buffer object and do all sorts of complexity to get it to work."
You could use (assuming this is Python 2 and you want a string of bytes) an array.array with a typecode of c. Arrays are mutable, so you can indeed alter them (with mutating methods -- and some operators, which are a special case of methods since they invoke special methods on the object). They don't have the myriad non-mutating methods of strings, so, if you need those, you'll indeed need a simple wrapper (delegating said methods to the str(...) of the array that the wrapper also holds).
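As an illustration of the same idea on Python 3, where the c typecode no longer exists, the built-in bytearray type is a mutable sequence of bytes. This is a swapped-in stand-in for array.array, not what the answer itself used, and the names buf and alias are made up for the sketch.

```python
buf = bytearray(b"stuff")  # mutable sequence of bytes
alias = buf                # second name for the same object

buf.extend(b"ing")         # in-place mutation
del buf[:2]                # drop the first two bytes, still in place

print(alias)                  # bytearray(b'uffing')
print(alias.decode("ascii"))  # uffing
```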
It doesn't seem there should be any special complexity, unless of course you want to do something truly weird as you seem to given your example code (have an assignment, i.e., a rebinding of a name, magically affect a different name -- that has absolutely nothing to do with whatever object was previously bound to the name you're rebinding, nor does it change that object in any way -- the only object it "changes" is the one holding the attribute, so it's obvious that you need descriptors or other magic on said object).
You appear to come from some language where variables (and particularly strings) are "containers of data" (like C, Fortran, or C++). In Python (like, say, in Java), names (the preferred way to call what others call "variables") always just refer to objects, they don't contain anything except exactly such a reference. Some objects can be changed, some can't, but that has absolutely nothing to do with the assignment statement (see note 1) (which doesn't change objects: it rebinds names).
(note 1): except of course that rebinding an attribute or item does alter the object that "contains" that item or attribute -- objects can and do contain, it's names that don't.
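The distinction between rebinding a name and mutating an object can be seen in a few lines. This is only an illustrative sketch, and the variable names are arbitrary.

```python
# Rebinding: assignment makes the name refer to a different object.
a = "hello"
b = a             # both names refer to the same string object
a = a + " world"  # builds a new string and rebinds the name a
print(b)          # hello

# Mutation: the object itself changes, visible through every name bound to it.
items = [1, 2]
other = items
items.append(3)
same_after_mutation = other is items
print(other)      # [1, 2, 3]

# Rebinding again leaves the object that other refers to untouched.
items = [9]
print(other)      # [1, 2, 3]
```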
Just put your value to be shared in a list, and assign the list to both objects.
class A(object):
    def __init__(self, strcontainer):
        self.strcontainer = strcontainer
    def upcase(self):
        self.strcontainer[0] = self.strcontainer[0].upper()
    def __str__(self):
        return self.strcontainer[0]
# create a string, inside a shareable list
shared = ['Hello, World!']
x = A(shared)
y = A(shared)
# both objects have the same list
print(id(x.strcontainer))
print(id(y.strcontainer))
# change value in x
x.upcase()
# show how value is changed in both x and y
print(str(x))
print(str(y))
Prints:
10534024
10534024
HELLO, WORLD!
HELLO, WORLD!
I am not a great expert in Python, but I think that if you declare a variable in a module and add a getter/setter to the module for this variable, you will be able to share it this way.
What is the difference between doing
class a:
    def __init__(self):
        self.val=1
and doing
class a:
    val=1
    def __init__(self):
        pass
class a:
    def __init__(self):
        self.val=1
this creates a class (in Py2, a cruddy, legacy, old-style, don't do that! class; in Py3, the nasty old legacy classes have finally gone away so this would be a class of the one and only kind -- the good kind, which requires class a(object): in Py2) such that each instance starts out with its own reference to the integer object 1.
class a:
    val=1
    def __init__(self):
        pass
this creates a class (of the same kind) which itself has a reference to the integer object 1 (its instances start out with no per-instance reference).
For immutables like int values, it's hard to see a practical difference. For example, in either case, if you later do self.val = 2 on one instance of a, this will make an instance reference (the existing answer is badly wrong in this respect).
The distinction is important for mutable objects, because they have mutator methods, so it's pretty crucial to know if a certain list is unique per-instance or shared among all instances. But for immutable objects, since you can never change the object itself but only assign (e.g. to self.val, which will always make a per-instance reference), it's pretty minor.
Just about the only relevant difference for immutables: if you later assign a.val = 3, in the second case (class attribute) this will affect what's seen as self.val by each instance (except for instances that had their own self.val assigned to, or equivalent actions); in the first case (instance attribute), it will not affect what's seen as self.val by any instance (except for instances for which you had performed del self.val or equivalent actions).
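A short sketch of the lookup rules behind this: an instance attribute, when present, shadows the class attribute of the same name. The class names InstanceVal and ClassVal are hypothetical stand-ins for the two variants of class a.

```python
class InstanceVal(object):
    def __init__(self):
        self.val = 1  # stored on each instance

class ClassVal(object):
    val = 1           # stored on the class only

i, c = InstanceVal(), ClassVal()

InstanceVal.val = 3  # adds a class attribute; i still has its own val
ClassVal.val = 3     # replaces the class attribute that c falls back to

print(i.val)  # 1
print(c.val)  # 3

del i.val     # removing the instance attribute exposes the class attribute
print(i.val)  # 3

c.val = 2     # assigning through an instance creates a per-instance attribute
print(ClassVal.val)  # 3
```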
Others have explained the technical differences. I'll try to explain why you might want to use class variables.
If you're only instantiating the class once, then class variables effectively are instance variables. However, if you're making many copies, or want to share state among a few instances, then class variables are very handy. For example:
class Foo(object):
    def __init__(self):
        self.bar = expensivefunction()
myobjs = [Foo() for _ in range(1000000)]
will cause expensivefunction() to be called a million times. If it's going to return the same value each time, say fetching a configuration parameter from a database, then you should consider moving it into the class definition so that it's only called once and then shared across all instances.
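A sketch of that change, with a stand-in expensivefunction that records each call in a hypothetical call_log list so the difference is visible. The real function would be whatever slow lookup you have.

```python
call_log = []

def expensivefunction():
    """Stand-in for a slow lookup; records every call it receives."""
    call_log.append("called")
    return "config-value"

class Foo(object):
    # Evaluated once, when the class body executes.
    bar = expensivefunction()

myobjs = [Foo() for _ in range(1000)]

print(len(call_log))                              # 1
print(all(obj.bar is Foo.bar for obj in myobjs))  # True
```

Every instance reads bar from the class, so this only fits values that really are the same for all instances.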
I also use class variables a lot when memoizing results. Example:
class Foo(object):
    bazcache = {}
    @classmethod
    def baz(cls, key):
        try:
            result = cls.bazcache[key]
        except KeyError:
            result = expensivefunction(key)
            cls.bazcache[key] = result
        return result
In this case, baz is a class method; its result doesn't depend on any instance variables. That means we can keep one copy of the results cache in the class variable, so that 1) you don't store the same results multiple times, and 2) each instance can benefit from results that were cached from other instances.
To illustrate, suppose that you have a million instances, each operating on the results of a Google search. You'd probably much prefer that all those objects share those results than to have each one execute the search and wait for the answer.
So I'd disagree with Lennart here. Class variables are very convenient in certain cases. When they're the right tool for the job, don't hesitate to use them.
As mentioned by others, in one case it's an attribute on the class on the other an attribute on the instance. Does this matter? Yes, in one case it does. As Alex said, if the value is mutable. The best explanation is code, so I'll add some code to show it (that's all this answer does, really):
First a class defining two instance attributes.
>>> class A(object):
...     def __init__(self):
...         self.number = 45
...         self.letters = ['a', 'b', 'c']
...
And then a class defining two class attributes.
>>> class B(object):
...     number = 45
...     letters = ['a', 'b', 'c']
...
Now we use them:
>>> a1 = A()
>>> a2 = A()
>>> a2.number = 15
>>> a2.letters.append('z')
And all is well:
>>> a1.number
45
>>> a1.letters
['a', 'b', 'c']
Now use the class attribute variation:
>>> b1 = B()
>>> b2 = B()
>>> b2.number = 15
>>> b2.letters.append('z')
And all is...well...
>>> b1.number
45
>>> b1.letters
['a', 'b', 'c', 'z']
Yeah, notice that when you changed the mutable class attribute, it changed for all instances. That's usually not what you want.
If you are using the ZODB, you use a lot of class attributes because it's a handy way of upgrading existing objects with new attributes, or adding information on a class level that doesn't get persisted. Otherwise you can pretty much ignore them.