In Python, you cannot create a new attribute on an instance of object like this:
>>> a = object()
>>> a.hhh = 1
throws
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
AttributeError: 'object' object has no attribute 'hhh'
However, for a function object, it is OK.
>>> def f():
...     return 1
...
>>> f.hhh = 1
What is the rationale behind this difference?
The reason function objects support arbitrary attributes is that, before we added that feature, several frameworks (e.g. parser generator ones) were abusing function docstrings (and other attributes of function objects) to stash away per-function information that was crucial to them. With the need to associate arbitrary named attributes with function objects proven by example, supporting them directly in the language, rather than punting and letting (e.g.) docstrings be abused, was pretty obvious.
To support arbitrary instance attributes a type must supply every one of its instances with a __dict__ -- that's no big deal for functions (which are never tiny objects anyway), but it might well be for other objects intended to be tiny. By making the object type as light as we could, and also supplying __slots__ to allow avoiding per-instance __dict__ in subtypes of object, we supported small, specialized "value" types to the best of our ability.
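A minimal sketch of the point above, assuming Python 3 (the names `plain` and `tag` are illustrative only): a function carries a `__dict__`, a bare `object()` does not, and only the former accepts new attributes.

```python
def f():
    return 1

plain = object()

# Functions carry a per-instance __dict__; bare object() instances do not.
print(hasattr(f, "__dict__"))      # True
print(hasattr(plain, "__dict__"))  # False

f.tag = "parser-rule"   # stored in f.__dict__
print(f.__dict__)       # {'tag': 'parser-rule'}

try:
    plain.tag = "x"     # there is nowhere to store the attribute
except AttributeError as exc:
    print("AttributeError:", exc)
```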
Alex Martelli posted an awesome answer to your question. For anyone who is looking for a good way to accomplish arbitrary attributes on an empty object, do this:
class myobject(object):
    pass
o = myobject()
o.anything = 123
Or more efficient (and better documented) if you know the attributes:
class myobject(object):
    __slots__ = ('anything', 'anythingelse')
o = myobject()
o.anything = 123
o.anythingelse = 456
The rationale is that an instance of object() is a degenerate special case. It "is" an object but it isn't designed to be useful by itself.
Think of object as a temporary hack, bridging old-style types and classes. In Python 3.0 it will fade into obscurity because it will no longer be used as part of
class Foo( object ):
    pass
f = Foo()
f.randomAttribute = 3.1415926
Here's another alternative, as short as I could make it:
>>> dummy = type('', (), {})()
>>> dummy.foo = 5
>>> dummy.foo
5
I've been reading articles about OOP with python, specifically this one.
The author of that article has a description and then a code example:
The Python syntax to instantiate a class is the same of a function
call
>>> b = int()
>>> type(b)
<type 'int'>
By this I infer that an "instance" exists at the moment of execution and not before. When you execute type(b), that's the instance of the class int().
But then I read this stack overflow answer:
Instance is a variable that holds the memory address of the Object.
Which makes me a little bit confused about the term. So when I assign a variable at the moment of execution, the "instance" is created?
Finally, this explanation on ComputerHope points to the fact that instances are the same as variable assignments:
function Animal(numlegs, mysound) {
this.legs = numlegs;
this.sound = mysound;
}
var lion = new Animal(4, "roar");
var cat = new Animal(4, "meow");
var dog = new Animal(4, "bark");
The Animal object allows for the number of legs and the sound the
animal makes to be set by each instance of the object. In this case,
all three instances (lion, cat, and dog) have the same number of legs,
but make different sounds.
Could anyone actually provide a clear definition of when an instance exists?
I've been reading articles about OOP with python, specifically this
one.
The author of that article has a description and then a code example:
The Python syntax to instantiate a class is the same of a function
call
>>> b = int()
>>> type(b)
<type 'int'>
Also read the sentence before that:
Once you have a class you can instantiate it to get a concrete object (an instance) of that type, i.e. an object built according to the structure of that class.
So an instance of a class is an object that has that class as its type.
By this I infer that an "instance" exists at the moment of execution and
not before.
Yes, correct. "Instance" and "instance of" are runtime concepts in Python.
When you execute type(b) that's the instance of the
class int().
Not quite.
The int instance here starts existing when int() is called.[1] This process is what's called "instantiation" and the result (which is returned by this call, and in this example then assigned to b) is the "instance" of int.
But then I read this stack overflow answer:
Instance is a variable that holds the memory address of the Object.
Oh well, that's not quite correct. It's the object itself (the value at that memory address, if you will) that's the instance. Several variables may be bound to the same object (and thus the same instance). There's even an operator for testing that: is
>>> a = 5
>>> b = a
>>> a is b
True
Which makes me a little bit confused about the term. So when I assign a
variable at the moment of execution, the "instance" is created?
No, then the instance is bound to that variable. In Python, think of variables just as "names for values". So binding an object to a variable means giving that object that name. An object can have several names, as we saw above.
You can use an instance without assigning it to any variable, i.e., without naming it, e.g. by passing it to a function:
>>> print(int())
0
Finally, this explanation on ComputerHope points to the fact that
instances are the same as variable assignments:
function Animal(numlegs, mysound) {
this.legs = numlegs;
this.sound = mysound;
}
var lion = new Animal(4, "roar");
var cat = new Animal(4, "meow");
var dog = new Animal(4, "bark");
The Animal object allows for the number of legs and the sound the
animal makes to be set by each instance of the object. In this case,
all three instances (lion, cat, and dog) have the same number of legs,
but make different sounds.
Unfortunately, that explanation on ComputerHope will probably confuse most readers more than it helps them. First, it conflates the terms "class" and "object". They don't mean the same. A class is a template for one type of objects. Objects and templates for a type of objects aren't the same concept, just as cookie cutters aren't the same things as cookies.
Of course, [for the understanding] it doesn't particularly help that in Python, classes are (special, but not too special) objects (of type type) and that in JavaScript until the class concept was introduced, it was customary to use plain objects as templates for other objects. (The latter approach is known as "prototype based object orientation" or "prototype based inheritance". In contrast, most other object oriented languages, including Python, use class-based object orientation / class-based inheritance. I'm not quite sure in what category modern ECMAScript with the class keyword falls.)
Could anyone actually provide a clear definition of instance?
Like I wrote further up:
An instance of a class is an object that has that class as its type.
So an "instance" is always an "instance of" something. That also answers the linguistic take on the question in the title
When should I call it “instance”?
You should call it "instance" when you want to call it "instance of" something (usually of a class).
[1] I haven't told the whole truth. Try this:
>>> a = int()
>>> b = int()
>>> a is b
True
Wait, what? Shouldn't the two invocations of int each have returned a new instance, and thus two distinct ones?
That's what would have happened with most types, but some built-in types are different, int being one of them. The makers of the CPython implementation are aware that small integers are used a lot. Thus they don't let CPython create new ones all the time; they have it re-use the same integer (the same object / instance) each time the same value is required. Because Python integers are immutable, that doesn't usually cause any problems, and it saves a lot of memory and object-creation time in computation-intensive programs.
The Python language reference allows implementations to make this optimization but, AFAIK, doesn't require it. So this should be considered an implementation detail, and your program logic should never rely on it. (Your performance optimizations may rely on it, though.)
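As a small sketch of what "never rely on this" means in practice (assuming Python 3; `same_value` is a hypothetical helper, not part of any library): compare integers by value with `==` and treat the result of `is` as unspecified.

```python
a = int()
b = int()

# Value equality is guaranteed by the language.
print(a == b)   # True

# Identity is an implementation detail: CPython happens to reuse small
# integers, but other implementations are free not to.
print(a is b)   # True on CPython; do not build logic on it

def same_value(x, y):
    """Compare by value; reserve `is` for singletons such as None."""
    return x == y

print(same_value(10**20, 10**20))   # True, whether or not the objects are shared
```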
Generally in OOP
Classes and objects are the two main aspects of object oriented
programming. A class creates a new type where objects are instances of
the class.
As explained here.
Thus every time an object is created, it is called an instance.
Python is no different in this respect; however, things are a little different from other languages such as Java.
In fact in Python everything is an object, even classes themselves.
Here is a brief explanation of how it works:
Considering this snippet:
>>> class Foo(object):
...     pass
...
>>> type(Foo)
<type 'type'>
>>>
Class Foo is of type type, which is the metaclass of all classes in Python
(There is however a distinction between 'old' and 'new' classes, more here, here and here).
Class type being a class, is an instance of itself:
>>> isinstance(type, type)
True
So Foo despite being a class definition, is treated like an object by the interpreter.
Instances are created with statements like foo = Foo(). Like every object, foo is an instance of the object class:
>>> isinstance(foo, object)
True
This class provides all the methods an object needs, such as __new__() and __init__(). In short, the former is used to create a new instance of a class; the latter is called after the instance has been created and is used to initialize values, like you did with Animal.
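A short sketch of that order of events, assuming Python 3. The `Animal` class mirrors the JavaScript example from the question, and the `calls` list exists only to record what runs:

```python
calls = []

class Animal:
    def __new__(cls, *args, **kwargs):
        calls.append("__new__")
        # Create the bare instance; the constructor arguments are not
        # forwarded to object.__new__.
        return super().__new__(cls)

    def __init__(self, legs, sound):
        calls.append("__init__")
        self.legs = legs
        self.sound = sound

lion = Animal(4, "roar")
print(calls)       # ['__new__', '__init__']
print(lion.sound)  # roar
```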
The fact that everything is an object also means that we can write funny pieces of code like this one:
>>> class Foo:
...     var = 'hello'
...
>>> foo = Foo()
>>> foo.var
'hello'
>>> foo.other_var = 'world'
>>> foo.other_var
'world'
>>> Foo.other_var
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
AttributeError: type object 'Foo' has no attribute 'other_var'
>>> Foo.var
'hello'
>>>
Here I added an attribute to an object at runtime. That attribute exists only on foo; neither the class itself nor any other instance has it.
This is the difference between an instance variable (other_var) and a class variable (var).
Hope it all makes sense to you.
TL;DR
In Python, everything (class definitions, functions, modules, etc.) is treated as an object by the interpreter. Therefore 'everything' is an instance.
So, I was playing around with Python while answering this question, and I discovered that this is not valid:
o = object()
o.attr = 'hello'
due to an AttributeError: 'object' object has no attribute 'attr'. However, with any class inherited from object, it is valid:
class Sub(object):
    pass
s = Sub()
s.attr = 'hello'
Printing s.attr displays 'hello' as expected. Why is this the case? What in the Python language specification specifies that you can't assign attributes to vanilla objects?
For other workarounds, see How can I create an object and add attributes to it?.
To support arbitrary attribute assignment, an object needs a __dict__: a dict associated with the object, where arbitrary attributes can be stored. Otherwise, there's nowhere to put new attributes.
An instance of object does not carry around a __dict__. If it did, then even before considering the horrible circular dependence problem (since dict, like most everything else, inherits from object ;-), this would saddle every object in Python with a dict, which would mean an overhead of many bytes per object that currently doesn't have or need a dict (essentially, all objects that don't have arbitrarily assignable attributes don't have or need a dict).
For example, using the excellent pympler project (you can get it via svn from here), we can do some measurements...:
>>> from pympler import asizeof
>>> asizeof.asizeof({})
144
>>> asizeof.asizeof(23)
16
You wouldn't want every int to take up 144 bytes instead of just 16, right?-)
Now, when you make a class (inheriting from whatever), things change...:
>>> class dint(int): pass
...
>>> asizeof.asizeof(dint(23))
184
...the __dict__ is now added (plus, a little more overhead) -- so a dint instance can have arbitrary attributes, but you pay quite a space cost for that flexibility.
So what if you wanted ints with just one extra attribute foobar...? It's a rare need, but Python does offer a special mechanism for the purpose...
>>> class fint(int):
...     __slots__ = 'foobar',
...     def __init__(self, x): self.foobar=x+100
...
>>> asizeof.asizeof(fint(23))
80
...not quite as tiny as an int, mind you! (or even the two ints, one the self and one the self.foobar -- the second one can be reassigned), but surely much better than a dint.
When the class has the __slots__ special attribute (a sequence of strings), then the class statement (more precisely, the default metaclass, type) does not equip every instance of that class with a __dict__ (and therefore the ability to have arbitrary attributes), just a finite, rigid set of "slots" (basically places which can each hold one reference to some object) with the given names.
In exchange for the lost flexibility, you save a lot of bytes per instance (probably meaningful only if you have zillions of instances gallivanting around, but there are use cases for that).
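The trade-off can be seen without any third-party tools. The following is a sketch assuming Python 3; `Point` and `FlexiblePoint` are made-up names for illustration:

```python
class Point:
    __slots__ = ("x", "y")   # fixed set of attribute names, no __dict__

    def __init__(self, x, y):
        self.x = x
        self.y = y

p = Point(1, 2)
print(hasattr(p, "__dict__"))   # False

try:
    p.z = 3                     # not a declared slot
except AttributeError as exc:
    print("AttributeError:", exc)

class FlexiblePoint(Point):
    pass                        # no __slots__ here, so instances get a __dict__ again

q = FlexiblePoint(1, 2)
q.z = 3
print(q.__dict__)               # {'z': 3}; x and y still live in the slots
```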
As other answerers have said, an instance of object does not have a __dict__. object is the base class of all types, including int or str. Thus whatever is provided by object will be a burden to them as well. Even something as simple as an optional __dict__ would need an extra pointer for each value; this would waste an additional 4-8 bytes of memory for each object in the system, for a very limited utility.
Instead of creating an instance of a dummy class, in Python 3.3+ you can (and should) use types.SimpleNamespace for this.
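A brief usage sketch, reusing the attribute names from the earlier answers:

```python
from types import SimpleNamespace

o = SimpleNamespace(anything=123)
o.anythingelse = 456          # arbitrary attributes can be added later

print(o)                      # namespace(anything=123, anythingelse=456)
print(vars(o))                # {'anything': 123, 'anythingelse': 456}
```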
It is simply due to optimization.
Dicts are relatively large.
>>> import sys
>>> sys.getsizeof((lambda:1).__dict__)
140
Instances of most (maybe all) classes that are defined in C do not have a dict, as an optimization.
If you look at the source code you will see that there are many checks to see if the object has a dict or not.
So, investigating my own question, I discovered this about the Python language: you can inherit from things like int, and you see the same behaviour:
>>> class MyInt(int):
...     pass
...
>>> x = MyInt()
>>> print x
0
>>> x.hello = 4
>>> print x.hello
4
>>> x = x + 1
>>> print x
1
>>> print x.hello
Traceback (most recent call last):
File "<interactive input>", line 1, in <module>
AttributeError: 'int' object has no attribute 'hello'
I assume the error at the end is because the add function returns an int, so I'd have to override functions like __add__ and such in order to retain my custom attributes. But this all now makes sense to me (I think), when I think of "object" like "int".
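One way to do what the last paragraph describes, as a sketch only (assuming Python 3, handling addition alone, and copying attributes from the left-hand operand): override `__add__` so the result is again a `MyInt` with the custom attributes carried over.

```python
class MyInt(int):
    """int subclass whose addition result stays a MyInt (illustrative only)."""

    def __add__(self, other):
        total = int.__add__(self, other)
        if total is NotImplemented:
            return NotImplemented
        result = MyInt(total)
        # Copy the custom attributes over to the new instance.
        result.__dict__.update(self.__dict__)
        return result

    __radd__ = __add__   # addition is commutative, so the same logic applies

x = MyInt()
x.hello = 4
x = x + 1
print(x, x.hello)         # 1 4
print(type(x).__name__)   # MyInt
```

Every other arithmetic method (`__sub__`, `__mul__`, and so on) would need the same treatment, which is why this is rarely worth doing.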
https://docs.python.org/3/library/functions.html#object :
Note: object does not have a __dict__, so you can’t assign arbitrary attributes to an instance of the object class.
It's because object is a "type", not a class. In general, all classes that are defined in C extensions (like all the built in datatypes, and stuff like numpy arrays) do not allow addition of arbitrary attributes.
This is (IMO) one of the fundamental limitations with Python - you can't re-open classes. I believe the actual problem, though, is caused by the fact that classes implemented in C can't be modified at runtime... subclasses can, but not the base classes.
Python 2.7 docs for weakref module say this:
Not all objects can be weakly referenced; those objects which can
include class instances, functions written in Python (but not in C),
methods (both bound and unbound), ...
And Python 3.3 docs for weakref module say this:
Not all objects can be weakly referenced; those objects which can
include class instances, functions written in Python (but not in C),
instance methods, ...
To me, these indicate that weakrefs to bound methods (in all versions Python 2.7 - 3.3) should be good, and that weakrefs to unbound methods should be good in Python 2.7.
Yet in Python 2.7, creating a weakref to a method (bound or unbound) results in a dead weakref:
>>> import weakref
>>> def isDead(wr): print 'dead!'
...
>>> class Foo:
...     def bar(self): pass
...
>>> wr=weakref.ref(Foo.bar, isDead)
dead!
>>> wr() is None
True
>>> foo=Foo()
>>> wr=weakref.ref(foo.bar, isDead)
dead!
>>> wr() is None
True
Not what I would have expected based on the docs.
Similarly, in Python 3.3, a weakref to a bound method dies on creation:
>>> wr=weakref.ref(Foo.bar, isDead)
>>> wr() is None
False
>>> foo=Foo()
>>> wr=weakref.ref(foo.bar, isDead)
dead!
>>> wr() is None
True
Again not what I would have expected based on the docs.
Since this wording has been around since 2.7, it's surely not an oversight. Can anyone explain how the statements and the observed behavior are in fact not in contradiction?
Edit/Clarification: In other words, the statement for 3.3 says "instance methods can be weakly referenced"; doesn't this mean that it is reasonable to expect that weakref.ref(an instance method)() is not None? And if it is None, then shouldn't "instance methods" be left off the list of object types that can be weakly referenced?
Foo.bar produces a new unbound method object every time you access it, due to some gory details about descriptors and how methods happen to be implemented in Python.
The class doesn't own unbound methods; it owns functions. (Check out Foo.__dict__['bar'].) Those functions just happen to have a __get__ which returns an unbound-method object. Since nothing else holds a reference, it vanishes as soon as you're done creating the weakref. (In Python 3, the rather unnecessary extra layer goes away, and an "unbound method" is just the underlying function.)
Bound methods work pretty much the same way: the function's __get__ returns a bound-method object, which is really just partial(function, self). You get a new one every time, so you see the same phenomenon.
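The mechanism can be observed directly. This is a sketch assuming Python 3, where access through the class returns the plain function:

```python
class Foo:
    def bar(self):
        pass

foo = Foo()

# The class stores a plain function, not a method object.
func = Foo.__dict__["bar"]
print(type(func).__name__)          # function

# Attribute access calls the function's __get__, which builds a bound method.
bound = func.__get__(foo, Foo)
print(type(bound).__name__)         # method
print(bound.__self__ is foo)        # True
print(bound.__func__ is func)       # True

# Each access builds a fresh bound-method object.
print(foo.bar is foo.bar)           # False
print(foo.bar == foo.bar)           # True: same function, same instance

# In Python 3, access through the class returns the function itself.
print(Foo.bar is func)              # True
```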
You can store a method object and keep a reference to that, of course:
>>> import weakref
>>> def is_dead(wr): print "blech"
...
>>> class Foo(object):
...     def bar(self): pass
...
>>> method = Foo.bar
>>> wr = weakref.ref(method, is_dead)
>>> 1 + 1
2
>>> method = None
blech
This all seems of dubious use, though :)
Note that if Python didn't spit out a new method instance on every attribute access, that'd mean that classes refer to their methods and methods refer to their classes. Having such cycles for every single method on every single instance in the entire program would make garbage collection way more expensive—and before 2.1, Python didn't even have cycle collection, so they would've stuck around forever.
@Eevee's answer is correct but there is a subtlety that is important.
The Python docs state that instance methods (py3k) and un/bound methods (py2.4+) can be weakly referenced. You'd expect (naively, as I did) that weakref.ref(foo.bar)() would therefore be non-None, yet it is None, making the weak ref "dead on arrival" (DOA). This led to my question: if the weakref to an instance method is DOA, why do the docs say you can weakly reference a method?
So as @Eevee showed, you can create a non-dead weak reference to an instance method by creating a strong reference to the method object, which you then give to weakref:
m = foo.bar # creates a *new* instance method "Foo.bar" and strong refs it
wr = weakref.ref(m)
assert wr() is not None # success
The subtlety (to me, anyways) is that a new instance method object is created every time you access foo.bar, so even after the above code is run, the following will fail:
wr = weakref.ref(foo.bar)
assert wr() is not None # fails
because foo.bar is a new instance of the bound method "bar" of the Foo instance foo, different from m, and there is no strong ref to this new instance, so it is immediately gc'd, even if you created a strong reference to an earlier one (it is not the same object). To be clear,
>>> d1 = foo.bla # assume bla is a data member
>>> d2 = foo.bla # assume bla is a data member
>>> d1 is d2
True # which is what you expect
>>> m1 = foo.bar # assume bar is an instance method
>>> m2 = foo.bar
>>> m1 is m2
False # !!! counter-intuitive
This takes many people by surprise since no one expects access to an instance member to be creating a new instance of anything. For example, if foo.bla is a data member of foo, then using foo.bla in your code does not create a new instance of the object referenced by foo.bla. Now if bla is a "function", foo.bla does create a new instance of type "instance method" representing the bound function.
Why the weakref docs (since python 2.4!) don't point that out is very strange, but that's a separate issue.
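For readers on Python 3.4 or later, the standard library's `weakref.WeakMethod` addresses exactly this situation. A sketch of the difference follows (the `Foo` class is illustrative, and the "dead on arrival" result for the plain weakref is CPython behaviour):

```python
import gc
import weakref

class Foo:
    def bar(self):
        return "bar called"

foo = Foo()

# A plain weakref to a freshly created bound method is dead on arrival.
plain = weakref.ref(foo.bar)
print(plain() is None)        # True on CPython

# WeakMethod tracks the instance and the function instead of the
# short-lived bound-method object.
wm = weakref.WeakMethod(foo.bar)
method = wm()                 # re-creates the bound method while foo is alive
print(method())               # bar called

del method, foo
gc.collect()                  # not needed on CPython, harmless elsewhere
print(wm() is None)           # True
```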
While I see that there's an accepted answer as to why this should be so, from a simple use-case situation wherein one would like an object that acts as a weakref to a bound method, I believe that one might be able to sneak by with an object as such. It's kind of a runt compared to some of the 'codier' things out there, but it works.
from weakref import proxy
class WeakMethod(object):
    """A callable object. Takes one argument to init: 'object.method'.
    Once created, call this object -- MyWeakMethod() --
    and pass args/kwargs as you normally would.
    """
    def __init__(self, object_dot_method):
        self.target = proxy(object_dot_method.__self__)
        self.method = proxy(object_dot_method.__func__)
        ### Older versions of Python can use 'im_self' and 'im_func' in place of '__self__' and '__func__' respectively
    def __call__(self, *args, **kwargs):
        """Call the method with args and kwargs as needed."""
        return self.method(self.target, *args, **kwargs)
As an example of its ease of use:
class A(object):
    def __init__(self, name):
        self.name = name
    def foo(self):
        return "My name is {}".format(self.name)
>>> Stick = A("Stick")
>>> WeakFoo = WeakMethod(Stick.foo)
>>> WeakFoo()
'My name is Stick'
>>> Stick.name = "Dave"
>>> WeakFoo()
'My name is Dave'
Note that evil trickery will cause this to blow up, so depending on how you'd prefer it to work this may not be the best solution.
>>> A.foo = lambda self: "My eyes, aww my eyes! {}".format(self.name)
>>> Stick.foo()
'My eyes, aww my eyes! Dave'
>>> WeakFoo()
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "<stdin>", line 6, in __call__
ReferenceError: weakly-referenced object no longer exists
>>>
If you were going to be replacing methods on-the-fly you might need to use a getattr(weakref.proxy(object), 'name_of_attribute_as_string') approach instead. getattr is a fairly fast look-up so that isn't the literal worst thing in the world, but depending on what you're doing, YMMV.
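A sketch of that getattr-based variant, assuming Python 3. `LateBoundWeakMethod` is a hypothetical name, not an existing library class:

```python
import weakref

class LateBoundWeakMethod:
    """Hypothetical helper: holds the instance weakly and looks the method up on each call."""

    def __init__(self, obj, name):
        self._target = weakref.proxy(obj)
        self._name = name

    def __call__(self, *args, **kwargs):
        # getattr on the proxy raises ReferenceError once obj is gone.
        return getattr(self._target, self._name)(*args, **kwargs)

class A:
    def __init__(self, name):
        self.name = name

    def foo(self):
        return "My name is {}".format(self.name)

stick = A("Stick")
weak_foo = LateBoundWeakMethod(stick, "foo")
print(weak_foo())   # My name is Stick

# Replacing the method on the class no longer breaks the wrapper.
A.foo = lambda self: "Replaced! {}".format(self.name)
print(weak_foo())   # Replaced! Stick
```

The lookup happens on every call, so the wrapper always sees the current method at the cost of one extra attribute access.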
I really hope this is not a question posed by millions of newbies, but my search didn't really give me a satisfying answer.
So my question is fairly simple. Are classes basically a container for functions with its own namespace? What other functions do they have besides providing a separate namespace and holding functions while making them callable as class attributes? I'm asking in a Python context.
Oh and thanks for the great help most of you have been!
More importantly than functions, class instances hold data attributes, allowing you to define new data types beyond what is built into the language; and
they support inheritance and duck typing.
For example, here's a moderately useful class. Since Python files (created with open) don't remember their own name, let's make a file class that does.
class NamedFile(object):
    def __init__(self, name):
        self._f = open(name)
        self.name = name
    def readline(self):
        return self._f.readline()
Had Python not had classes, you'd probably be working with dicts instead:
def open_file(name):
    return {"name": name, "f": open(name)}
Needless to say, calling myfile["f"].readline() all the time will cause your fingers to hurt at some point. You could of course introduce a function readline in a NamedFile module (namespace), but then you'd always have to use that exact function. By contrast, NamedFile instances can be used anywhere you need an object with a readline method, so it would be a drop-in replacement for file in many situations. That's called polymorphism, one of the biggest benefits of OO/class-based programming.
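To illustrate the polymorphism point without touching the file system, here is a sketch assuming Python 3. `UpperCaseReader` and `first_line` are made-up names, and `io.StringIO` stands in for a real file:

```python
import io

class UpperCaseReader:
    """Hypothetical wrapper: anything with readline() can stand in for a file."""

    def __init__(self, source):
        self._source = source

    def readline(self):
        return self._source.readline().upper()

def first_line(f):
    # Works with any object that offers readline(), whatever its class.
    return f.readline().rstrip("\n")

print(first_line(io.StringIO("hello\nworld\n")))                   # hello
print(first_line(UpperCaseReader(io.StringIO("hello\nworld\n"))))  # HELLO
```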
(Also, dict is a class, so using it violates the assumption that there are no classes :)
In most languages, classes are just pieces of code that describe how to produce an object. That's kinda true in Python too:
>>> class ObjectCreator(object):
...     pass
...
>>> my_object = ObjectCreator()
>>> print my_object
<__main__.ObjectCreator object at 0x8974f2c>
But classes are more than that in Python. Classes are objects too.
Yes, objects.
As soon as you use the keyword class, Python executes it and creates an OBJECT. The instruction:
>>> class ObjectCreator(object):
...     pass
...
creates in memory an object with the name ObjectCreator.
This object (the class) is itself capable of creating objects (the instances), and this is why it's a class.
But still, it's an object, and therefore:
you can assign it to a variable
you can copy it
you can add attributes to it
you can pass it as a function parameter
e.g.:
>>> print ObjectCreator # you can print a class because it's an object
<class '__main__.ObjectCreator'>
>>> def echo(o):
...     print o
...
>>> echo(ObjectCreator) # you can pass a class as a parameter
<class '__main__.ObjectCreator'>
>>> print hasattr(ObjectCreator, 'new_attribute')
False
>>> ObjectCreator.new_attribute = 'foo' # you can add attributes to a class
>>> print hasattr(ObjectCreator, 'new_attribute')
True
>>> print ObjectCreator.new_attribute
foo
>>> ObjectCreatorMirror = ObjectCreator # you can assign a class to a variable
>>> print ObjectCreatorMirror.new_attribute
foo
>>> print ObjectCreatorMirror()
<__main__.ObjectCreator object at 0x8997b4c>
Classes (or objects) are used to provide encapsulation of data and operations that can be performed on that data.
They don't provide namespacing in Python per se; module imports provide the same type of stuff and a module can be entirely functional rather than object oriented.
You might gain some benefit from looking at OOP With Python, Dive into Python, Chapter 5. Objects and Object Oriented Programming or even just the Wikipedia article on object oriented programming
A class is the definition of an object. In this sense, the class provides a namespace of sorts, but that is not the true purpose of a class. The true purpose is to define what the object will 'look like' - what the object is capable of doing (methods) and what it will know (properties).
Note that my answer is intended to provide a sense of understanding on a relatively non-technical level, which is what my initial trouble was with understanding classes. I'm sure there will be many other great answers to this question; I hope this one adds to your overall understanding.
It's a thing that bugged me for a while. Why can't I do:
>>> a = ""
>>> a.foo = 2
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
AttributeError: 'str' object has no attribute 'foo'
...while I can do the following?
>>> class Bar():
...     pass
...
>>> a = Bar()
>>> a.foo = 10 #ok!
What's the rule here? Could you please point me to some description?
You can add attributes to any object that has a __dict__.
x = object() doesn't have it, for example.
Strings and other simple builtin objects also don't have it.
Instances of classes using __slots__ also do not have it (unless '__dict__' is itself listed in __slots__ or a base class provides one).
Classes defined with class have it unless the previous statement applies.
If an object is using __slots__ / doesn't have a __dict__, it's usually to save space. For example, in a str it would be overkill to have a dict - imagine the amount of bloat for a very short string.
If you want to test if a given object has a __dict__, you can use hasattr(obj, '__dict__').
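A quick sketch of that check applied to instances of several types, assuming Python 3. `accepts_new_attributes` is a hypothetical helper name:

```python
class Bar:
    pass

class Slotted:
    __slots__ = ("foo",)

def accepts_new_attributes(obj):
    """True if obj has a __dict__ in which arbitrary attributes can be stored."""
    return hasattr(obj, "__dict__")

# Print the type name of each sample object next to the result of the check.
for obj in ("", 42, [], object(), Slotted(), Bar(), lambda: None):
    print(type(obj).__name__, accepts_new_attributes(obj))
```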
This might also be interesting to read:
Some objects, such as built-in types and their instances (lists, tuples, etc.) do not have a __dict__. Consequently user-defined attributes cannot be set on them.
Another interesting article about Python's data model including __dict__, __slots__, etc. is this from the python reference.