Not able to get exact meaning of self and __init__ [duplicate] - python

What is the difference between class and instance variables in Python?
class Complex:
a = 1
and
class Complex:
def __init__(self):
self.a = 1
Using the call: x = Complex().a in both cases assigns x to 1.
A more in-depth answer about __init__() and self will be appreciated.

When you write a class block, you create class attributes (or class variables). All the names you assign in the class block, including methods you define with def become class attributes.
After a class instance is created, anything with a reference to the instance can create instance attributes on it. Inside methods, the "current" instance is almost always bound to the name self, which is why you are thinking of these as "self variables". Usually in object-oriented design, the code attached to a class is supposed to have control over the attributes of instances of that class, so almost all instance attribute assignment is done inside methods, using the reference to the instance received in the self parameter of the method.
Class attributes are often compared to static variables (or methods) as found in languages like Java, C#, or C++. However, if you want to aim for deeper understanding I would avoid thinking of class attributes as "the same" as static variables. While they are often used for the same purposes, the underlying concept is quite different. More on this in the "advanced" section below the line.
An example!
class SomeClass:
def __init__(self):
self.foo = 'I am an instance attribute called foo'
self.foo_list = []
bar = 'I am a class attribute called bar'
bar_list = []
After executing this block, there is a class SomeClass, with 3 class attributes: __init__, bar, and bar_list.
Then we'll create an instance:
instance = SomeClass()
When this happens, SomeClass's __init__ method is executed, receiving the new instance in its self parameter. This method creates two instance attributes: foo and foo_list. Then this instance is assigned into the instance variable, so it's bound to a thing with those two instance attributes: foo and foo_list.
But:
print instance.bar
gives:
I am a class attribute called bar
How did this happen? When we try to retrieve an attribute through the dot syntax, and the attribute doesn't exist, Python goes through a bunch of steps to try and fulfill your request anyway. The next thing it will try is to look at the class attributes of the class of your instance. In this case, it found an attribute bar in SomeClass, so it returned that.
That's also how method calls work by the way. When you call mylist.append(5), for example, mylist doesn't have an attribute named append. But the class of mylist does, and it's bound to a method object. That method object is returned by the mylist.append bit, and then the (5) bit calls the method with the argument 5.
The way this is useful is that all instances of SomeClass will have access to the same bar attribute. We could create a million instances, but we only need to store that one string in memory, because they can all find it.
But you have to be a bit careful. Have a look at the following operations:
sc1 = SomeClass()
sc1.foo_list.append(1)
sc1.bar_list.append(2)
sc2 = SomeClass()
sc2.foo_list.append(10)
sc2.bar_list.append(20)
print sc1.foo_list
print sc1.bar_list
print sc2.foo_list
print sc2.bar_list
What do you think this prints?
[1]
[2, 20]
[10]
[2, 20]
This is because each instance has its own copy of foo_list, so they were appended to separately. But all instances share access to the same bar_list. So when we did sc1.bar_list.append(2) it affected sc2, even though sc2 didn't exist yet! And likewise sc2.bar_list.append(20) affected the bar_list retrieved through sc1. This is often not what you want.
Advanced study follows. :)
To really grok Python, coming from traditional statically typed OO-languages like Java and C#, you have to learn to rethink classes a little bit.
In Java, a class isn't really a thing in its own right. When you write a class you're more declaring a bunch of things that all instances of that class have in common. At runtime, there's only instances (and static methods/variables, but those are really just global variables and functions in a namespace associated with a class, nothing to do with OO really). Classes are the way you write down in your source code what the instances will be like at runtime; they only "exist" in your source code, not in the running program.
In Python, a class is nothing special. It's an object just like anything else. So "class attributes" are in fact exactly the same thing as "instance attributes"; in reality there's just "attributes". The only reason for drawing a distinction is that we tend to use objects which are classes differently from objects which are not classes. The underlying machinery is all the same. This is why I say it would be a mistake to think of class attributes as static variables from other languages.
But the thing that really makes Python classes different from Java-style classes is that just like any other object each class is an instance of some class!
In Python, most classes are instances of a builtin class called type. It is this class that controls the common behaviour of classes, and makes all the OO stuff the way it does. The default OO way of having instances of classes that have their own attributes, and have common methods/attributes defined by their class, is just a protocol in Python. You can change most aspects of it if you want. If you've ever heard of using a metaclass, all that is is defining a class that is an instance of a different class than type.
The only really "special" thing about classes (aside from all the builtin machinery to make them work they way they do by default), is the class block syntax, to make it easier for you to create instances of type. This:
class Foo(BaseFoo):
def __init__(self, foo):
self.foo = foo
z = 28
is roughly equivalent to the following:
def __init__(self, foo):
self.foo = foo
classdict = {'__init__': __init__, 'z': 28 }
Foo = type('Foo', (BaseFoo,) classdict)
And it will arrange for all the contents of classdict to become attributes of the object that gets created.
So then it becomes almost trivial to see that you can access a class attribute by Class.attribute just as easily as i = Class(); i.attribute. Both i and Class are objects, and objects have attributes. This also makes it easy to understand how you can modify a class after it's been created; just assign its attributes the same way you would with any other object!
In fact, instances have no particular special relationship with the class used to create them. The way Python knows which class to search for attributes that aren't found in the instance is by the hidden __class__ attribute. Which you can read to find out what class this is an instance of, just as with any other attribute: c = some_instance.__class__. Now you have a variable c bound to a class, even though it probably doesn't have the same name as the class. You can use this to access class attributes, or even call it to create more instances of it (even though you don't know what class it is!).
And you can even assign to i.__class__ to change what class it is an instance of! If you do this, nothing in particular happens immediately. It's not earth-shattering. All that it means is that when you look up attributes that don't exist in the instance, Python will go look at the new contents of __class__. Since that includes most methods, and methods usually expect the instance they're operating on to be in certain states, this usually results in errors if you do it at random, and it's very confusing, but it can be done. If you're very careful, the thing you store in __class__ doesn't even have to be a class object; all Python's going to do with it is look up attributes under certain circumstances, so all you need is an object that has the right kind of attributes (some caveats aside where Python does get picky about things being classes or instances of a particular class).
That's probably enough for now. Hopefully (if you've even read this far) I haven't confused you too much. Python is neat when you learn how it works. :)

What you're calling an "instance" variable isn't actually an instance variable; it's a class variable. See the language reference about classes.
In your example, the a appears to be an instance variable because it is immutable. It's nature as a class variable can be seen in the case when you assign a mutable object:
>>> class Complex:
>>> a = []
>>>
>>> b = Complex()
>>> c = Complex()
>>>
>>> # What do they look like?
>>> b.a
[]
>>> c.a
[]
>>>
>>> # Change b...
>>> b.a.append('Hello')
>>> b.a
['Hello']
>>> # What does c look like?
>>> c.a
['Hello']
If you used self, then it would be a true instance variable, and thus each instance would have it's own unique a. An object's __init__ function is called when a new instance is created, and self is a reference to that instance.

Related

In python object and class attributes can also be defined outside the class definition, is this a bad coding practise? [duplicate]

I am studying python, and although I think I get the whole concept and notion of Python, today I stumbled upon a piece of code that I did not fully understand:
Say I have a class that is supposed to define Circles but lacks a body:
class Circle():
pass
Since I have not defined any attributes, how can I do this:
my_circle = Circle()
my_circle.radius = 12
The weird part is that Python accepts the above statement. I don't understand why Python doesn't raise an undefined name error. I do understand that via dynamic typing I just bind variables to objects whenever I want, but shouldn't an attribute radius exist in the Circle class to allow me to do this?
EDIT: Lots of wonderful information in your answers! Thank you everyone for all those fantastic answers! It's a pity I only get to mark one as an answer.
A leading principle is that there is no such thing as a declaration. That is, you never declare "this class has a method foo" or "instances of this class have an attribute bar", let alone making a statement about the types of objects to be stored there. You simply define a method, attribute, class, etc. and it's added. As JBernardo points out, any __init__ method does the very same thing. It wouldn't make a lot of sense to arbitrarily restrict creation of new attributes to methods with the name __init__. And it's sometimes useful to store a function as __init__ which don't actually have that name (e.g. decorators), and such a restriction would break that.
Now, this isn't universally true. Builtin types omit this capability as an optimization. Via __slots__, you can also prevent this on user-defined classes. But this is merely a space optimization (no need for a dictionary for every object), not a correctness thing.
If you want a safety net, well, too bad. Python does not offer one, and you cannot reasonably add one, and most importantly, it would be shunned by Python programmers who embrace the language (read: almost all of those you want to work with). Testing and discipline, still go a long way to ensuring correctness. Don't use the liberty to make up attributes outside of __init__ if it can be avoided, and do automated testing. I very rarely have an AttributeError or a logical error due to trickery like this, and of those that happen, almost all are caught by tests.
Just to clarify some misunderstandings in the discussions here. This code:
class Foo(object):
def __init__(self, bar):
self.bar = bar
foo = Foo(5)
And this code:
class Foo(object):
pass
foo = Foo()
foo.bar = 5
is exactly equivalent. There really is no difference. It does exactly the same thing. This difference is that in the first case it's encapsulated and it's clear that the bar attribute is a normal part of Foo-type objects. In the second case it is not clear that this is so.
In the first case you can not create a Foo object that doesn't have the bar attribute (well, you probably can, but not easily), in the second case the Foo objects will not have a bar attribute unless you set it.
So although the code is programatically equivalent, it's used in different cases.
Python lets you store attributes of any name on virtually any instance (or class, for that matter). It's possible to block this either by writing the class in C, like the built-in types, or by using __slots__ which allows only certain names.
The reason it works is that most instances store their attributes in a dictionary. Yes, a regular Python dictionary like you'd define with {}. The dictionary is stored in an instance attribute called __dict__. In fact, some people say "classes are just syntactic sugar for dictionaries." That is, you can do everything you can do with a class with a dictionary; classes just make it easier.
You're used to static languages where you must define all attributes at compile time. In Python, class definitions are executed, not compiled; classes are objects just like any other; and adding attributes is as easy as adding an item to a dictionary. This is why Python is considered a dynamic language.
No, python is flexible like that, it does not enforce what attributes you can store on user-defined classes.
There is a trick however, using the __slots__ attribute on a class definition will prevent you from creating additional attributes not defined in the __slots__ sequence:
>>> class Foo(object):
... __slots__ = ()
...
>>> f = Foo()
>>> f.bar = 'spam'
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
AttributeError: 'Foo' object has no attribute 'bar'
>>> class Foo(object):
... __slots__ = ('bar',)
...
>>> f = Foo()
>>> f.bar
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
AttributeError: bar
>>> f.bar = 'spam'
It creates a radius data member of my_circle.
If you had asked it for my_circle.radius it would have thrown an exception:
>>> print my_circle.radius # AttributeError
Interestingly, this does not change the class; just that one instance. So:
>>> my_circle = Circle()
>>> my_circle.radius = 5
>>> my_other_circle = Circle()
>>> print my_other_circle.radius # AttributeError
There are two types of attributes in Python - Class Data Attributes and Instance Data Attributes.
Python gives you flexibility of creating Data Attributes on the fly.
Since an instance data attribute is related to an instance, you can also do that in __init__ method or you can do it after you have created your instance..
class Demo(object):
classAttr = 30
def __init__(self):
self.inInit = 10
demo = Demo()
demo.outInit = 20
Demo.new_class_attr = 45; # You can also create class attribute here.
print demo.classAttr # Can access it
del demo.classAttr # Cannot do this.. Should delete only through class
demo.classAttr = 67 # creates an instance attribute for this instance.
del demo.classAttr # Now OK.
print Demo.classAttr
So, you see that we have created two instance attributes, one inside __init__ and one outside, after instance is created..
But a difference is that, the instance attribute created inside __init__ will be set for all the instances, while if created outside, you can have different instance attributes for different isntances..
This is unlike Java, where each Instance of a Class have same set of Instance Variables..
NOTE: - While you can access a class attribute through an instance, you cannot delete it..
Also, if you try to modify a class attribute through an instance, you actually create an instance attribute which shadows the class attribute..
How to prevent new attributes creation ?
Using class
To control the creation of new attributes, you can overwrite the __setattr__ method. It will be called every time my_obj.x = 123 is called.
See the documentation:
class A:
def __init__(self):
# Call object.__setattr__ to bypass the attribute checking
super().__setattr__('x', 123)
def __setattr__(self, name, value):
# Cannot create new attributes
if not hasattr(self, name):
raise AttributeError('Cannot set new attributes')
# Can update existing attributes
super().__setattr__(name, value)
a = A()
a.x = 123 # Allowed
a.y = 456 # raise AttributeError
Note that users can still bypass the checking if they call directly object.__setattr__(a, 'attr_name', attr_value).
Using dataclass
With dataclasses, you can forbid the creation of new attributes with frozen=True. It will also prevent existing attributes to be updated.
#dataclasses.dataclass(frozen=True)
class A:
x: int
a = A(x=123)
a.y = 123 # Raise FrozenInstanceError
a.x = 123 # Raise FrozenInstanceError
Note: dataclasses.FrozenInstanceError is a subclass of AttributeError
To add to Conchylicultor's answer, Python 3.10 added a new parameter to dataclass.
The slots parameter will create the __slots__ attribute in the class, preventing creation of new attributes outside of __init__, but allowing assignments to existing attributes.
If slots=True, assigning to an attribute that was not defined will throw an AttributeError.
Here is an example with slots and with frozen:
from dataclasses import dataclass
#dataclass
class Data:
x:float=0
y:float=0
#dataclass(frozen=True)
class DataFrozen:
x:float=0
y:float=0
#dataclass(slots=True)
class DataSlots:
x:float=0
y:float=0
p = Data(1,2)
p.x = 5 # ok
p.z = 8 # ok
p = DataFrozen(1,2)
p.x = 5 # FrozenInstanceError
p.z = 8 # FrozenInstanceError
p = DataSlots(1,2)
p.x = 5 # ok
p.z = 8 # AttributeError
As delnan said, you can obtain this behavior with the __slots__ attribute. But the fact that it is a way to save memory space and access type does not discard the fact that it is (also) a/the mean to disable dynamic attributes.
Disabling dynamic attributes is a reasonable thing to do, if only to prevent subtle bugs due to spelling mistakes. "Testing and discipline" is fine but relying on automated validation is certainly not wrong either – and not necessarily unpythonic either.
Also, since the attrs library reached version 16 in 2016 (obviously way after the original question and answers), creating a closed class with slots has never been easier.
>>> import attr
...
... #attr.s(slots=True)
... class Circle:
... radius = attr.ib()
...
... f = Circle(radius=2)
... f.color = 'red'
AttributeError: 'Circle' object has no attribute 'color'

Instance variables in methods outside the constructor (Python) -- why and how?

My questions concern instance variables that are initialized in methods outside the class constructor. This is for Python.
I'll first state what I understand:
Classes may define a constructor, and it may also define other methods.
Instance variables are generally defined/initialized within the constructor.
But instance variables can also be defined/initialized outside the constructor, e.g. in the other methods of the same class.
An example of (2) and (3) -- see self.meow and self.roar in the Cat class below:
class Cat():
def __init__(self):
self.meow = "Meow!"
def meow_bigger(self):
self.roar = "Roar!"
My questions:
Why is it best practice to initialize the instance variable within the constructor?
What general/specific mess could arise if instance variables are regularly initialized in methods other than the constructor? (E.g. Having read Mark Lutz's Tkinter guide in his Programming Python, which I thought was excellent, I noticed that the instance variable used to hold the PhotoImage objects/references were initialized in the further methods, not in the constructor. It seemed to work without issue there, but could that practice cause issues in the long run?)
In what scenarios would it be better to initialize instance variables in the other methods, rather than in the constructor?
To my knowledge, instance variables exist not when the class object is created, but after the class object is instantiated. Proceeding upon my code above, I demonstrate this:
>> c = Cat()
>> c.meow
'Meow!'
>> c.roar
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
AttributeError: 'Cat' object has no attribute 'roar'
>>> c.meow_bigger()
>>> c.roar
'Roar!'
As it were:
I cannot access the instance variable (c.roar) at first.
However, after I have called the instance method c.meow_bigger() once, I am suddenly able to access the instance variable c.roar.
Why is the above behaviour so?
Thank you for helping out with my understanding.
Why is it best practice to initialize the instance variable within the
constructor?
Clarity.
Because it makes it easy to see at a glance all of the attributes of the class. If you initialize the variables in multiple methods, it becomes difficult to understand the complete data structure without reading every line of code.
Initializing within the __init__ also makes documentation easier. With your example, you can't write "an instance of Cat has a roar attribute". Instead, you have to add a paragraph explaining that an instance of Cat might have a "roar" attribute, but only after calling the "meow_louder" method.
Clarity is king. One of the smartest programmers I ever met once told me "show me your data structures, and I can tell you how your code works without seeing any of your code". While that's a tiny bit hyperbolic, there's definitely a ring of truth to it. One of the biggest hurdles to learning a code base is understanding the data that it manipulates.
What general/specific mess could arise if instance variables are
regularly initialized in methods other than the constructor?
The most obvious one is that an object may not have an attribute available during all parts of the program, leading to having to add a lot of extra code to handle the case where the attribute is undefined.
In what scenarios would it be better to initialize instance variables
in the other methods, rather than in the constructor?
I don't think there are any.
Note: you don't necessarily have to initialize an attribute with it's final value. In your case it's acceptable to initialize roar to None. The mere fact that it has been initialized to something shows that it's a piece of data that the class maintains. It's fine if the value changes later.
Remember that class members in "pure" Python are just a dictionary. Members aren't added to an instance's dictionary until you run the function in which they are defined. Ideally this is the constructor, because that then guarantees that your members will all exist regardless of the order that your functions are called.
I believe your example above could be translated to:
class Cat():
def __init__(self):
self.__dict__['meow'] = "Meow!"
def meow_bigger(self):
self.__dict__['roar'] = "Roar!"
>>> c = Cat() # c.__dict__ = { 'meow': "Meow!" }
>>> c.meow_bigger() # c.__dict__ = { 'meow': "Meow!", 'roar': "Roar!" }
To initialize instance variables within the constructor, is - as you already pointed out - only recommended in python.
First of all, defining all instance variables within the constructor is a good way to document a class. Everybody, seeing the code, knows what kind of internal state an instance has.
Secondly, order matters. if one defines an instance variable V in a function A and there is another function B also accessing V, it is important to call A before B. Otherwise B will fail since V was never defined. Maybe, A has to be invoked before B, but then it should be ensured by an internal state, which would be an instance variable.
There are many more examples. Generally it is just a good idea to define everything in the __init__ method, and set it to None if it can not / should not be initialized at initialization.
Of course, one could use hasattr method to derive some information of the state. But, also one could check if some instance variable V is for example None, which can imply the same then.
So in my opinion, it is never a good idea to define an instance variable anywhere else as in the constructor.
Your examples state some basic properties of python. An object in Python is basically just a dictionary.
Lets use a dictionary: One can add functions and values to that dictionary and construct some kind of OOP. Using the class statement just brings everything into a clean syntax and provides extra stuff like magic methods.
In other languages all information about instance variables and functions are present before the object was initialized. Python does that at runtime. You can also add new methods to any object outside the class definition: Adding a Method to an Existing Object Instance
3.) But instance variables can also be defined/initialized outside the constructor, e.g. in the other methods of the same class.
I'd recommend providing a default state in initialization, just so its clear what the class should expect. In statically typed languages, you'd have to do this, and it's good practice in python.
Let's convey this by replacing the variable roar with a more meaningful variable like has_roared.
In this case, your meow_bigger() method now has a reason to set has_roar. You'd initialize it to false in __init__, as the cat has not roared yet upon instantiation.
class Cat():
def __init__(self):
self.meow = "Meow!"
self.has_roared = False
def meow_bigger(self):
print self.meow + "!!!"
self.has_roared = True
Now do you see why it often makes sense to initialize attributes with default values?
All that being said, why does python not enforce that we HAVE to define our variables in the __init__ method? Well, being a dynamic language, we can now do things like this.
>>> cat1 = Cat()
>>> cat2 = Cat()
>>> cat1.name = "steve"
>>> cat2.name = "sarah"
>>> print cat1.name
... "steve"
The name attribute was not defined in the __init__ method, but we're able to add it anyway. This is a more realistic use case of setting variables that aren't defaulted in __init__.
I try to provide a case where you would do so for:
3.) But instance variables can also be defined/initialized outside the constructor, e.g. in the other methods of the same class.
I agree it would be clear and organized to include instance field in the constructor, but sometimes you are inherit other class, which is created by some other people and has many instance fields and api.
But if you inherit it only for certain apis and you want to have your own instance field for your own apis, in this case, it is easier for you to just declare extra instance field in the method instead override the other's constructor without bothering to deep into the source code. This also support Adam Hughes's answer, because in this case, you will always have your defined instance because you will guarantee to call you own api first.
For instance, suppose you inherit a package's handler class for web development, you want to include a new instance field called user for handler, you would probability just declare it directly in the method--initialize without override the constructor, I saw it is more common to do so.
class BlogHandler(webapp2.RequestHandler):
def initialize(self, *a, **kw):
webapp2.RequestHandler.initialize(self, *a, **kw)
uid = self.read_cookie('user_id') #get user_id by read cookie in the browser
self.user = User.by_id(int(uid)) #run query in data base find the user and return user
These are very open questions.
Python is a very "free" language in the sense that it tries to never restrict you from doing anything, even if it looks silly. This is why you can do completely useless things such as replacing a class with a boolean (Yes you can).
The behaviour that you mention follows that same logic: if you wish to add an attribute to an object (or to a function - yes you can, too) dynamically, anywhere, not necessarily in the constructor, well... you can.
But it is not because you can that you should. The main reason for initializing attributes in the constructor is readability, which is a prerequisite for maintenance. As Bryan Oakley explains in his answer, class fields are key to understand the code as their names and types often reveal the intent better than the methods.
That being said, there is now a way to separate attribute definition from constructor initialization: pyfields. I wrote this library to be able to define the "contract" of a class in terms of attributes, while not requiring initialization in the constructor. This allows you in particular to create "mix-in classes" where attributes and methods relying on these attributes are defined, but no constructor is provided.
See this other answer for an example and details.
i think to keep it simple and understandable, better to initialize the class variables in the class constructor, so they can be directly called without the necessity of compiling of a specific class method.
class Cat():
def __init__(self,Meow,Roar):
self.meow = Meow
self.roar = Roar
def meow_bigger(self):
return self.roar
def mix(self):
return self.meow+self.roar
c=Cat("Meow!","Roar!")
print(c.meow_bigger())
print(c.mix())
Output
Roar!
Roar!
Meow!Roar!

What is the functionality difference between the Reference of a class and its object/instance in python while calling its objects?

I was searching for the meaning of default parameters object,self that are present as default class and function parameters, so moving away from it, if we are calling an attribute of a class should we use Foo (class reference) or should we use Foo() (instance of the class).
If you are reading a normal attribute then it doesn't matter. If you are binding a normal attribute then you must use the correct one in order for the code to work. If you are accessing a descriptor then you must use an instance.
The details of python's class semantics are quite well documented in the data model. Especially the __get__ semantics are at work here. Instances basically stack their namespace on top of their class' namespace and add some boilerplate for calling methods.
There are some large "it depends on what you are doing" gotchas at work here. The most important question: do you want to access class or instance attributes? Second, do you want attribute or methods?
Let's take this example:
class Foo(object):
bar = 1
baz = 2
def __init__(self, foobar="barfoo", baz=3):
self.foobar = foobar
self.baz = baz
def meth(self, param):
print self, param
#classmethod
def clsmeth(cls, param):
print cls, param
#staticmethod
def stcmeth(param):
print param
Here, bar is a class attribute, so you can get it via Foo.bar. Since instances have implicit access to their class namespace, you can also get it as Foo().bar. foobar is an instance attribute, since it is never bound to the class (only instances, i.e. selfs) - you can only get it as Foo().foobar. Last, baz is both a class and an instance attribute. By default, Foo.baz == 2 and Foo().baz == 3, since the class attribute is hidden by the instance attribute set in __init__.
Similarly, in an assignment there are slight differences whether you work on the class or an instance. Foo.bar=2 will set the class attribute (also for all instances) while Foo().bar=2 will create an instance attribute that shadows the class attribute for this specific instance.
For methods, it is somewhat similar. However, here you get the implicit self parameter for instance method (what a function is if defined for a class). Basically, the call Foo().meth(param=x) is silently translated to Foo.meth(self=Foo(), param=x). This is why it is usually not valid to call Foo.meth(param=x) - meth is not "bound" to an instance and thus lacks the self parameter.
Now, sometimes you do not need any instance data in a method - for example, you have strict string transformation that is an implementation detail of a larger parser class. This is where #classmethod and #staticmethod come into play. A classmethod's first parameter is always the class, as opposed to the instance for regular methods. Foo().clsmeth(param=x) and Foo.clsmeth(param=x) result in a call of clsmethod(cls=Foo, param=x). Here, the two are equivalent. Going one step further, a staticmethod doesn't get any class or instance information - it is like a raw function bound to the classes namespace.

Why I can Assign a attributes to a class instance, even this attributes is not defined in class?

What kind of mechanism is behind this?
For example, I can define a class
class test:
string_a="aaa"
then I can setup a instance for class test.
test_instance=test()
later I can assign a test_attr to test_instance.
test_instance.test_attr="bbb"
print test_instance.test_attr
this will print, "bbb"
so what is behavior for? anything conflict to __init__?
Thanks.
There are two things going on here:
Class attributes
Instance attributes
It is important to distinguish the difference.
Class Attributes as in:
class A:
a = 1
will persist with every Instance created. That is:
a = A()
b = A()
a.a == b.a # True
Instance Attributes are typically accessed by magic methods called __setattr__ (to set them) and __getattr__ (to retrieve them).
All instances of classes (unless you use __slots__) have an internal dict called __dict__. You don't have to "declare" attributes on a class in order to set them on an instance.
Have a read of Python Data Model

How to access calls to self from python class functions

Concretely, I have a user-defined class of type
class Foo(object):
def __init__(self, bar):
self.bar = bar
def bind(self):
val = self.bar
do_something(val)
I need to:
1) be able to call on the class (not an instance of the class) to recover all the self.xxx attributes defined within the class.
For an instance of a class, this can be done by doing a f = Foo('') and then f.__dict__. Is there a way of doing it for a class, and not an instance? If yes, how? I would expect Foo.__dict__ to return {'bar': None} but it doesn't work this way.
2) be able to access all the self.xxx parameters called from a particular function of a class. For instance I would like to do Foo.bind.__selfparams__ and recieve in return ['bar']. Is there a way of doing this?
This is something that is quite hard to do in a dynamic language, assuming I understand correctly what you're trying to do. Essentially this means going over all the instances in existence for the class and then collecting all the set attributes on those instances. While not infeasible, I would question the practicality of such approach both from a design as well as performance points of view.
More specifically, you're talking of "all the self.xxx attributes defined within the class"—but these things are not defined at all, not at least in a single place—they more like "evolve" as more and more instances of the class are brought to life. Now, I'm not saying all your instances are setting different attributes, but they might, and in order to have a reliable generic solution, you'd literally have to keep track of anything the instances might have done to themselves. So unless you have a static analysis approach in mind, I don't see a clean and efficient way of achieving it (and actually even static analysis is of no help generally speaking in a dynamic language).
A trivial example to prove my point:
class Foo(object):
def __init__(self):
# statically analysable
self.bla = 3
# still, but more difficult
if SOME_CONSTANT > 123:
self.x = 123
else:
self.y = 321
def do_something(self):
import random
setattr(self, "attr%s" % random.randint(1, 100), "hello, world of dynamic languages!")
foo = Foo()
foo2 = Foo()
# only `bla`, `x`, and `y` attrs in existence so far
foo2.do_something()
# now there's an attribute with a random name out there
# in order to detect it, we'd have to get all instances of Foo existence at the moment, and individually inspect every attribute on them.
And, even if you were to iterate all instances in existence, you'd only be getting a snapshot of what you're interested, not all possible attributes.
This is not possible. The class doesn't have those attributes, just functions that set them. Ergo, there is nothing to retrieve and this is impossible.
This is only possible with deep AST inspection. Foo.bar.func_code would normally have the attributes you want under co_freevars but you're looking up the attributes on self, so they are not free variables. You would have to decompile the bytecode from func_code.co_code to AST and then walk said AST.
This is a bad idea. Whatever you're doing, find a different way of doing it.
To do this, you need some way to find all the instances of your class. One way to do this is just to have the class itself keep track of its instances. Unfortunately, keeping a reference to every instance in the class means that those instances can never be garbage-collected. Fortunately, Python has weakref, which will keep a reference to an object but does not count as a reference to Python's memory management, so the instances can be garbage-collected as per usual.
A good place to update the list of instances is in your __init__() method. You could also do it in __new__() if you find the separation of concerns a little cleaner.
import weakref
class Foo(object):
_instances = []
def __init__(self, value):
self.value = value
cls = type(self)
type(self)._instances.append(weakref.ref(self,
type(self)._instances.remove))
#classmethod
def iterinstances(cls):
"Returns an iterator over all instances of the class."
return (ref() for ref in cls._instances)
#classmethod
def iterattrs(cls, attr, default=None):
"Returns an iterator over a named attribute of all instances of the class."
return (getattr(ref(), attr, default) for ref in cls._instances)
Now you can do this:
f1, f2, f3 = Foo(1), Foo(2), Foo(3)
for v in Foo.iterattrs("value"):
print v, # prints 1 2 3
I am, for the record, with those who think this is generally a bad idea and/or not really what you want. In particular, instances may live longer than you expect depending on where you pass them and what that code does with them, so you may not always have the instances you think you have. (Some of this may even happen implicitly.) It is generally better to be explicit about this: rather than having the various instances of your class be stored in random variables all over your code (and libraries), have their primary repository be a list or other container, and access them from there. Then you can easily iterate over them and get whatever attributes you want. However, there may be use cases for something like this and it's possible to code it up, so I did.

Categories