Python objects - avoiding creation of attribute with unknown name

Wishing to avoid a situation like this:
>>> class Point:
...     x = 0
...     y = 0
...
>>> a = Point()
>>> a.X = 4  # whoops, typo creates new attribute capital X
I created the following object to be used as a superclass:
class StrictObject(object):
    def __setattr__(self, item, value):
        if item in dir(self):
            object.__setattr__(self, item, value)
        else:
            raise AttributeError("Attribute " + item + " does not exist.")
While this seems to work, the python documentation says of dir():
Note: Because dir() is supplied primarily as a convenience for use at an interactive prompt, it tries to supply an interesting set of names more than it tries to supply a rigorously or consistently defined set of names, and its detailed behavior may change across releases. For example, metaclass attributes are not in the result list when the argument is a class.
Is there a better way to check if an object has an attribute?

Much better ways.
The most common way is "we're all consenting adults". That means you don't do any checking, and you leave it up to the user. Any checking you do makes the code less flexible in its use.
But if you really want to do this, there is __slots__, which works for every class in Python 3.x and for new-style classes in Python 2.x. From the documentation:
By default, instances of both old and new-style classes have a dictionary for attribute storage. This wastes space for objects having very few instance variables. The space consumption can become acute when creating large numbers of instances.
The default can be overridden by defining __slots__ in a new-style class definition. The __slots__ declaration takes a sequence of instance variables and reserves just enough space in each instance to hold a value for each variable. Space is saved because __dict__ is not created for each instance.
Without a __dict__ variable, instances cannot be assigned new variables not listed in the __slots__ definition. Attempts to assign to an unlisted variable name raises AttributeError. If dynamic assignment of new variables is desired, then add '__dict__' to the sequence of strings in the __slots__ declaration.
For example:
class Point(object):
    __slots__ = ("x", "y")

point = Point()
point.x = 5  # OK
point.y = 1  # OK
point.X = 4  # AttributeError is raised
And finally, the proper way to check if an object has a certain attribute is not to use dir, but to use the built-in function hasattr(object, name).
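As a rough sketch of that last point, the StrictObject from the question could swap dir() for hasattr(). The Point subclass and the error_message variable are illustrative names only, and the approach still assumes the allowed attributes already exist at class level (x = 0, y = 0), as in the question.

```python
class StrictObject(object):
    def __setattr__(self, name, value):
        # hasattr() performs a normal attribute lookup (instance, then class).
        if not hasattr(self, name):
            raise AttributeError("Attribute " + name + " does not exist.")
        object.__setattr__(self, name, value)


class Point(StrictObject):
    # Class-level defaults make "x" and "y" the only assignable names.
    x = 0
    y = 0


a = Point()
a.x = 4  # fine: "x" exists on the class

error_message = None
try:
    a.X = 4  # typo: rejected instead of silently creating a new attribute
except AttributeError as e:
    error_message = str(e)

print(error_message)
```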

I don't think it's a good idea to write code to prevent such errors. These "static" checks should be the job of your IDE. Pylint will warn you about assigning attributes outside of __init__ thus preventing typo errors. It also shows many other problems and potential problems and it can easily be used from PyDev.

In such a situation, you should look at what the Python standard library has to offer. Did you consider namedtuple?
from collections import namedtuple
Point = namedtuple("Point", "x, y")
a = Point(1,3)
print a.x, a.y
Because Point is now immutable, your problem simply can't happen. The drawback, naturally, is that you can't just add 1 to a coordinate of a in place; you have to create a completely new instance:
x,y = a
b = Point(x+1,y)
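For what it's worth, namedtuple instances also provide a _replace() method that builds the new instance for you. A minimal sketch, reusing the Point type from above (the typo_rejected flag is just an illustrative name):

```python
from collections import namedtuple

Point = namedtuple("Point", "x, y")
a = Point(1, 3)

# _replace() returns a new Point; the original is left untouched.
b = a._replace(x=a.x + 1)

# Assigning to a misspelled attribute fails, because namedtuple
# classes do not give their instances a per-instance __dict__.
typo_rejected = False
try:
    a.X = 4
except AttributeError:
    typo_rejected = True

print(a, b, typo_rejected)
```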

Related

In python, why we can create a new attribute from an instance and not a method?

In the following code,
# An example class with some variable and a method
class ExampleClass(object):
    def __init__(self):
        self.var = 10

    def dummyPrint(self):
        print('Hello World!')

# Creating instance and printing the init variable
inst_a = ExampleClass()
# This prints --> __init__ variable = 10
print('__init__ variable = %d' % (inst_a.var))
# This prints --> Hello World!
inst_a.dummyPrint()
# Creating a new attribute and printing it.
# This prints --> New variable = 20
inst_a.new_var = 20
print('New variable = %d' % (inst_a.new_var))
# Trying to create new method, which will give error
inst_a.newDummyPrint()
I am able to create a new attribute (new_var) outside the class, using an instance, and it works. I was expecting it not to work.
Similarly, I tried calling a new method (newDummyPrint()), which raises AttributeError: 'ExampleClass' object has no attribute 'newDummyPrint', as I expected.
My question is:
Why did creating a new attribute work?
Why didn't creating a new method work?

As already mentioned in comments, you are creating the new attribute here:
inst_a.new_var = 20
before reading it on the next line. You're NOT assigning newDummyPrint anywhere, so obviously the attribute resolution mechanism cannot find it and ends up raising an AttributeError. You'd get the very same result if you tried to access any other non-existing attribute, e.g. inst_a.whatever.
Note that since in Python everything is an object (including classes, functions etc.), there is no real distinction between accessing a "data" attribute and a method: they are all attributes (whether class or instance ones), and the attribute resolution rules are the same. In the case of methods (or any other callable attribute), the call operation happens after the attribute has been resolved.
To dynamically create a new "method", you mainly have two solutions: creating it as a class attribute (which will make it available to all other instances of the class), or as an instance attribute (which will, obviously, make it available only on this exact instance).
The first solution is as simple as it can be: define your function and bind it to the class:
# nb: inheriting from `object` for py2 compat
class Foo(object):
    def __init__(self, var):
        self.var = var

# a plain function, defined outside the class
def bar(self, x):
    return self.var * x

# testing before:
f = Foo(42)
try:
    print(f.bar(2))
except AttributeError as e:
    print(e)

# now binds the function to the class:
Foo.bar = bar
# and test it:
print(f.bar(2))
# and it's also available on other instances:
f2 = Foo(6)
print(f2.bar(7))
Creating a per-instance method is a (very tiny) bit more involved: you have to manually get the method from the function and bind this method to the instance:
def baaz(self):
    return "{}.var = {}".format(self, self.var)

# test before:
try:
    print(f.baaz())
except AttributeError as e:
    print(e)

# now binds the method to the instance
f.baaz = baaz.__get__(f, Foo)
# now `f` has a `baaz` method
print(f.baaz())
# but other Foo instances don't
try:
    print(f2.baaz())
except AttributeError as e:
    print(e)
You'll notice I talked about functions in the first case and methods in the second case. A Python "method" is actually just a thin callable wrapper around a function, an instance and a class, and is provided by the function type through the descriptor protocol, which is automagically invoked when the attribute is resolved on the class itself (i.e. it is a class attribute implementing the descriptor protocol) but not when it is resolved on the instance. This is why, in the second case, we have to manually invoke the descriptor protocol.
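As an alternative sketch of the per-instance case, the standard library's types.MethodType can build the bound method instead of calling __get__ by hand. Foo and baaz mirror the names used above and are redefined here only to keep the snippet self-contained.

```python
import types


class Foo(object):
    def __init__(self, var):
        self.var = var


def baaz(self):
    return "var = {}".format(self.var)


f = Foo(42)
f2 = Foo(6)

# Bind the plain function to this one instance only.
f.baaz = types.MethodType(baaz, f)

print(f.baaz())             # the bound method receives `f` as self
print(hasattr(f2, "baaz"))  # other instances are unaffected
```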
Also note that there are limitations on what's possible here: first, __magic__ methods (all methods named with two leading and two trailing underscores) are only looked up on the class itself so you cannot define them on a per-instance basis. Then, slots-based types and some builtin or C-coded types do not support dynamic attributes whatsoever. Those restrictions are mainly there for performance optimization reasons.
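A small sketch of the first limitation, using a hypothetical Box class: a __len__ stored on the instance is ignored by len(), while one stored on the class is used.

```python
class Box(object):
    def __init__(self, items):
        self.items = items


box = Box([1, 2, 3])

# Stored on the instance: len() never looks here.
box.__len__ = lambda: 3
try:
    len(box)
    instance_len_worked = True
except TypeError:
    instance_len_worked = False

# Stored on the class: this is where len() looks.
Box.__len__ = lambda self: len(self.items)
class_len = len(box)

print(instance_len_worked, class_len)
```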

You can create new attributes on the fly when you are using an empty class definition to emulate a Pascal "record" or a C "struct". Otherwise, what you are trying to do is not good practice, nor a good pattern for object-oriented programming. There are lots of books you can read about it. Generally speaking, the class definition should clearly state what an object of that class is and how it behaves: modifying its behavior on the fly (e.g. adding new methods) can lead to unexpected results, which makes your life very hard when reading that code a month later, and even worse when you are debugging.
There is even an anti-pattern problem called Ambiguous Viewpoint:
Lack of clarification of the modeling viewpoint leads to problematic
ambiguities in object models.
Anyway, if you are playing with Python and you swear you'll never use this code in production, you can write new attributes which store lambda functions, e.g.
c = ExampleClass()
c.newMethod = lambda s1, s2: str(s1) + ' and ' + str(s2)
print(c.newMethod('string1', 'string2'))
# output is: string1 and string2
but this is very ugly, I would never do it.

Python caching attributes in object with slots

I am trying to cache a computationally expensive property in a class defined with the __slots__ attribute.
Any idea how to store the cache for later use? Of course, the usual way of storing a dictionary in instance._cache would not work without __dict__ being defined. For several reasons I do not want to add a '_cache' string to __slots__.
I was wondering whether this is one of the rare use cases for global. Any thoughts or examples on this matter?

There is no magic possible here: you want to store a value, so you need a place to store it.
You can't just decide "I won't have an extra entry in my __slots__ because it is not elegant". You don't need to call it _cached; give it whatever name you want, but these cached values are something you want to exist on each instance, and therefore you need an attribute.
You can cache in a global (module-level) dictionary in which the keys are id(self), but that would be a major headache to keep synchronized when instances are deleted. (The same is true for a class-level dictionary, with the further downside that it is still visible on the instance.)
TL;DR: the "one and obvious way to do it" is to have a shadow attribute, starting with "_", to keep the values you want cached, and to declare it in __slots__. (If you use a _cached dictionary per instance, you lose the main advantage of __slots__, which is precisely not needing one dictionary per instance.)
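A minimal sketch of that shadow-attribute approach, assuming a hypothetical Circle class whose area stands in for the expensive computation. The compute_count class attribute exists only to show that the value is computed once.

```python
class Circle(object):
    # "_area" is the shadow slot that holds the cached value.
    __slots__ = ("radius", "_area")

    compute_count = 0  # class-level counter, for demonstration only

    def __init__(self, radius):
        self.radius = radius
        self._area = None  # nothing cached yet

    @property
    def area(self):
        if self._area is None:
            Circle.compute_count += 1
            self._area = 3.14159 * self.radius ** 2
        return self._area


c = Circle(2)
first = c.area   # computed
second = c.area  # served from the slot
print(first, second, Circle.compute_count)
```

If radius can change after construction, the cached slot would also need to be reset whenever it does.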

You don't quite need a global; you can store the cache as a class attribute and still define the expensive value as a property.
class Foo(object):
    __slots__ = ('a', 'b', 'c')
    expensive_cache = {}

    @property
    def expensive(self):
        if self not in self.expensive_cache:
            self.expensive_cache[self] = self._compute_expensive()
        return self.expensive_cache[self]

    def _compute_expensive(self):
        print("Computing expensive property for {}".format(self))
        return 3

f = Foo()
g = Foo()
print(f.expensive)
print("===")
print(f.expensive)
print("===")
print(g.expensive)
If you run this code, you can see that _compute_expensive is run only once, the first time you access expensive for each distinct object.
$ python3 tmp.py
Computing expensive property for <__main__.Foo object at 0x102861188>
3
===
3
===
Computing expensive property for <__main__.Foo object at 0x1028611c8>
3

Something like the Borg pattern can help.
You can alter the state of your instance in the __init__ or __new__ methods.

How does attribute resolution work in Python?

Consider the following code:
class A(object):
    def do(self):
        print self.z

class B(A):
    def __init__(self, y):
        self.z = y

b = B(3)
b.do()
Why does this work? When executing b = B(3), attribute z is set. When b.do() is called, Python's MRO finds the do function in class A. But why is it able to access an attribute defined in a subclass?
Is there a use case for this functionality? I would love an example.

It works in a pretty simple way: when a statement is executed that sets an attribute, it is set. When a statement is executed that reads an attribute, it is read. When you write code that reads an attribute, Python does not try to guess whether the attribute will exist when that code is executed; it just waits until the code actually is executed, and if at that time the attribute doesn't exist, then you'll get an exception.
By default, you can always set any attribute on an instance of a user-defined class; classes don't normally define lists of "allowed" attributes that could be set (although you can make that happen too), they just actually set attributes. Of course, you can only read attributes that exist, but again, what matters is whether they exist when you actually try to read them. So it doesn't matter if an attribute exists when you define a function that tries to read it; it only matters when (or if) you actually call that function.
In your example, it doesn't matter that there are two classes, because there is only one instance. Since you only create one instance and call methods on one instance, the self in both methods is the same object. First __init__ is run and it sets the attribute on self. Then do is run and it reads the attribute from the same self. That's all there is to it. It doesn't matter where the attribute is set; once it is set on the instance, it can be accessed from anywhere: code in a superclass, subclass, other class, or not in any class.

Since new attributes can be added to any object at any time, attribute resolution happens at execution time, not compile time. Consider this example which may be a bit more instructive, derived from yours:
class A(object):
    def do(self):
        print(self.z)  # references an attribute which we haven't "declared" in an __init__()

# make a new A
aa = A()
# this next line will error, as you would expect, because aa doesn't have a z attribute
aa.do()
# but we can make it work now by simply doing
aa.z = -42
aa.do()
The first one will squawk at you, but the second will print -42 as expected.
Python objects are just dictionaries. :)

When retrieving an attribute from an object (print self.attrname) Python follows these steps:
1. If attrname is a special (i.e. Python-provided) attribute for objectname, return it.
2. Check objectname.__class__.__dict__ for attrname. If it exists and is a data-descriptor, return the descriptor result. Search all bases of objectname.__class__ for the same case.
3. Check objectname.__dict__ for attrname, and return if found. If objectname is a class, search its bases too. If it is a class and a descriptor exists in it or its bases, return the descriptor result.
4. Check objectname.__class__.__dict__ for attrname. If it exists and is a non-data descriptor, return the descriptor result. If it exists, and is not a descriptor, just return it. If it exists and is a data descriptor, we shouldn't be here because we would have returned at point 2. Search all bases of objectname.__class__ for same case.
5. Raise AttributeError.
Source: Understanding get and set and Python descriptors
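The ordering in steps 2 to 4 can be observed with a small experiment. This is only a sketch; DataDesc, NonDataDesc and C are made-up names.

```python
class DataDesc(object):
    # Defines both __get__ and __set__: a data descriptor.
    def __get__(self, obj, objtype=None):
        return "data descriptor"

    def __set__(self, obj, value):
        raise AttributeError("read-only")


class NonDataDesc(object):
    # Defines only __get__: a non-data descriptor.
    def __get__(self, obj, objtype=None):
        return "non-data descriptor"


class C(object):
    d = DataDesc()
    n = NonDataDesc()


c = C()
# Write straight into the instance dictionary, bypassing __set__.
c.__dict__["d"] = "instance value"
c.__dict__["n"] = "instance value"

print(c.d)    # step 2 wins: the data descriptor on the class
print(c.n)    # step 3 wins: the instance dictionary
print(C().n)  # step 4: nothing on the instance, so the non-data descriptor
```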

Since you instantiated a B object, B.__init__ was invoked and added an attribute z. This attribute is now present in the object. It's not some weird overloaded magical shared local variable of B methods that somehow becomes inaccessible to code written elsewhere. There's no such thing. Neither does self become a different object when it's passed to a superclass' method (how's polymorphism supposed to work if that happens?).
There's also no such thing as a declaration that A objects have no such attribute (try a = A(); a.z = whatever), and neither is self in do required to be an instance of A [1]. In fact, there are no declarations at all. It's all "go ahead and try it"; that's kind of the definition of a dynamic language (not just dynamic typing).
That object's z attribute is present "everywhere", all the time [2], regardless of the "context" from which it is accessed. It never matters where code is defined for the resolution process, or for several other behaviors [3]. For the same reason, you can access a list's methods despite not writing C code in listobject.c ;-) And no, methods aren't special. They are just objects too (instances of the type function, as it happens) and are involved in exactly the same lookup sequence.
[1] This is a slight lie; in Python 2, A.do would be an "unbound method" object, which in fact throws an error if the first argument doesn't satisfy isinstance(<first arg>, A).
[2] Until it's removed with del or one of its function equivalents (delattr and friends).
[3] Well, there's name mangling, and in theory, code could inspect the stack, and thereby the caller code object, and thereby the location of its source code.

Using globals() to create class object

I'm new to programming, so please don't kill me for asking stupid questions.
I've been trying to understand all that class business in Python, and I got to the point where I could not find an answer to my question just by googling it.
In my program I need to call a class from within another class based on a string returned by a function. I found two solutions: one using getattr() and a second one using globals() / locals().
I decided to go for the second solution and got it working, but I really don't understand how it works.
So here is the code example:
class Test(object):
    def __init__(self):
        print "WORKS!"

room = globals()['Test']
room()
type(room()) gives:
<class '__main__.Test'>
type(room) gives:
<type 'type'> # What????
It looks like room() is a class object, but shouldn't that be room instead of room()?
Please help me because it is a little bit silly if I write a code which I don't understand myself.

What happens here is the following:
class Test(object):
    def __init__(self):
        print "WORKS!"

room = globals()['Test']
Here you got Test as room the way you wanted. Verify this:
room is Test
should give True.
type(room()) gives:
<class '__main__.Test'>
You take one step forward and then one step back: room() returns the same as Test() would, namely an instance of that class. type() "undoes" this step by getting the type of the object, which is, of course, Test.
type(room) gives:
<type 'type'> # What????
Of course - it is the type of a (new style) class. The same as type(Test).
Be aware, however, that for
In my program I need to call a class from within another class based on a string returned by a function. I found two solutions: one using getattr() and a second one using globals() / locals().
it could be better to create an explicitly separate dict. That way you have full control over which objects/classes/... are allowed in that context and which are not.

First of all, I'd go with getattr instead.
In your example, room equals Test and is a class. Its type is type.
When you call room(), you instantiate Test, so room() evaluates to an instance of Test, whose type is Test.

Classes are objects too, in Python. All this does:
class Test(object):
    def __init__(self):
        print "WORKS!"
is create a class object and bind it to the name Test. Much as this:
x = []
creates a list object and binds it to the name x.
Test() isn't magic syntax for creating an instance. The Test is perfectly ordinary variable lookup, and the () is perfectly ordinary "call with empty arguments". It just so happens that calling a class will create an instance of that class.
It follows then that your problem of instantiating a class chosen based on having the name of the class as a string boils down to the much simpler problem of finding an object stored in a variable. It's exactly the same problem as getting that list bound to the name x, given the string "x". Once you've got a reference to the class in any old variable, you can simply call it to create your instance.
globals() returns a dictionary mapping the names of globals to their values. So globals()['Test'] will get you the class Test just as easily as globals()['x'] will get you the list. However it's usually not considered great style to use globals() like this; your module probably contains a large number of callables (including a bunch imported from other modules) that you don't want to be accidentally invoked if the function can be made to return their name. Given that classes are just ordinary objects, you can put them in a dictionary of your own making:
classes = {
    'Test': Test,
    'SomethingElse': Something,
    ...
}
This involves a bit more typing, but it's also easier to see what the intended usage is, and it gives you a bit more flexibility, since you can also easily pass this dictionary to other modules and have the instantiation take place elsewhere (you could do that with globals(), but then you're getting very weird).
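A runnable sketch of that idea follows. Kitchen, ROOM_CLASSES and make_room are hypothetical names chosen for illustration, not anything from the question.

```python
class Test(object):
    pass


class Kitchen(object):  # hypothetical second class
    pass


# Explicit whitelist: only these names can be instantiated from a string.
ROOM_CLASSES = {
    "Test": Test,
    "Kitchen": Kitchen,
}


def make_room(name):
    try:
        cls = ROOM_CLASSES[name]
    except KeyError:
        raise ValueError("Unknown room class: {!r}".format(name))
    return cls()  # calling the class object creates the instance


room = make_room("Test")
print(type(room) is Test)
```

Unlike globals(), a lookup such as make_room("os") fails with a clear error instead of reaching an unrelated module-level name.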
Now, for the type(room) being type. Again, this is just a simple consequence of the fact that classes themselves are also objects. If a class is an object, then it should also be an instance of some class. What class is that? type, the "type of types". Much as any class defines the common behaviour of all its instances, the class type defines the common behaviour of all classes.
And just to make your brain hurt, type is an instance of itself (since type is also a class, and type is the class of classes). And it's a subclass of object (since all type instances are object instances, but not all object instances are type instances), and also an instance of object (since object is the root class of which everything is an instance).
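These relationships can be checked directly in the interpreter. A short sketch, with Test standing in for any user-defined class and results being an illustrative name:

```python
class Test(object):
    pass


results = [
    type(Test) is type,        # the class of a class is type
    type(Test()) is Test,      # the class of an instance is the class
    isinstance(type, type),    # type is an instance of itself
    issubclass(type, object),  # type is a subclass of object
    isinstance(object, type),  # object is itself a class
    isinstance(type, object),  # everything is an instance of object
]
print(results)  # every entry is True
```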
You can generally ignore type as an advanced topic, however. :)

Python Variable Declaration

I want to clarify how variables are declared in Python.
I have seen variable declaration as
class writer:
    path = ""
Sometimes there is no explicit declaration, just initialization using __init__:
def __init__(self, name):
    self.name = name
I understand the purpose of __init__, but is it advisable to declare variables in any other functions?
How can I create a variable to hold a custom type?
class writer:
    path = ""  # string value
    customObj = ??

Okay, first things first.
There is no such thing as "variable declaration" or "variable initialization" in Python.
There is simply what we call "assignment", but should probably just call "naming".
Assignment means "this name on the left-hand side now refers to the result of evaluating the right-hand side, regardless of what it referred to before (if anything)".
foo = 'bar' # the name 'foo' is now a name for the string 'bar'
foo = 2 * 3 # the name 'foo' stops being a name for the string 'bar',
# and starts being a name for the integer 6, resulting from the multiplication
As such, Python's names (a better term than "variables", arguably) don't have associated types; the values do. You can re-apply the same name to anything regardless of its type, but the thing still has behaviour that's dependent upon its type. The name is simply a way to refer to the value (object). This answers your second question: You don't create variables to hold a custom type. You don't create variables to hold any particular type. You don't "create" variables at all. You give names to objects.
Second point: Python follows a very simple rule when it comes to classes, that is actually much more consistent than what languages like Java, C++ and C# do: everything declared inside the class block is part of the class. So, functions (def) written here are methods, i.e. part of the class object (not stored on a per-instance basis), just like in Java, C++ and C#; but other names here are also part of the class. Again, the names are just names, and they don't have associated types, and functions are objects too in Python. Thus:
class Example:
    data = 42
    def method(self): pass
Classes are objects too, in Python.
So now we have created an object named Example, which represents the class of all things that are Examples. This object has two user-supplied attributes (In C++, "members"; in C#, "fields or properties or methods"; in Java, "fields or methods"). One of them is named data, and it stores the integer value 42. The other is named method, and it stores a function object. (There are several more attributes that Python adds automatically.)
These attributes still aren't really part of the object, though. Fundamentally, an object is just a bundle of more names (the attribute names), until you get down to things that can't be divided up any more. Thus, values can be shared between different instances of a class, or even between objects of different classes, if you deliberately set that up.
Let's create an instance:
x = Example()
Now we have a separate object named x, which is an instance of Example. The data and method are not actually part of the object, but we can still look them up via x because of some magic that Python does behind the scenes. When we look up method, in particular, we will instead get a "bound method" (when we call it, x gets passed automatically as the self parameter, which cannot happen if we look up Example.method directly).
What happens when we try to use x.data?
When we examine it, it's looked up in the object first. If it's not found in the object, Python looks in the class.
However, when we assign to x.data, Python will create an attribute on the object. It will not replace the class' attribute.
This allows us to do object initialization. Python will automatically call the class' __init__ method on new instances when they are created, if present. In this method, we can simply assign to attributes to set initial values for that attribute on each object:
class Example:
    name = "Ignored"
    def __init__(self, name):
        self.name = name
    # rest as before
Now we must specify a name when we create an Example, and each instance has its own name. Python will ignore the class attribute Example.name whenever we look up the .name of an instance, because the instance's attribute will be found first.
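The lookup order can be seen by deleting the instance attribute again. A minimal sketch based on the Example class above ("Alice" is an arbitrary value):

```python
class Example(object):
    name = "Ignored"

    def __init__(self, name):
        self.name = name  # creates an attribute on the instance


e = Example("Alice")
instance_name = e.name    # found on the instance first
class_name = Example.name # the class attribute is untouched

del e.name                # remove the instance attribute...
fallback_name = e.name    # ...and lookup falls back to the class

print(instance_name, class_name, fallback_name)
```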
One last caveat: modification (mutation) and assignment are different things!
In Python, strings are immutable. They cannot be modified. When you do:
a = 'hi '
b = a
a += 'mom'
You do not change the original 'hi ' string. That is impossible in Python. Instead, you create a new string 'hi mom', and cause a to stop being a name for 'hi ', and start being a name for 'hi mom' instead. We made b a name for 'hi ' as well, and after re-applying the a name, b is still a name for 'hi ', because 'hi ' still exists and has not been changed.
But lists can be changed:
a = [1, 2, 3]
b = a
a += [4]
Now b is [1, 2, 3, 4] as well, because we made b a name for the same thing that a named, and then we changed that thing. We did not create a new list for a to name, because Python simply treats += differently for lists.
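The difference can be checked with the is operator, which tests whether two names refer to the same object. In this sketch, nums and alias are illustrative names:

```python
a = 'hi '
b = a
a += 'mom'           # builds a new string and rebinds a
print(a is b)        # False: b still names the original 'hi '

nums = [1, 2, 3]
alias = nums
nums += [4]          # mutates the existing list in place
print(alias is nums) # True: both names still refer to one list

nums = nums + [5]    # plain + builds a new list and rebinds nums
print(alias, nums)
```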
This matters for objects because if you had a list as a class attribute, and used an instance to modify the list, then the change would be "seen" in all other instances. This is because (a) the data is actually part of the class object, and not any instance object; (b) because you were modifying the list and not doing a simple assignment, you did not create a new instance attribute hiding the class attribute.

This might be 6 years late, but in Python 3.6 and above, you can give a hint about a variable's type like this:
variable_name: type_name
or, in Python 3.5 and above, with a type comment on an assignment:
variable_name = value  # type: shinyType
This hint has no effect in the core Python interpreter, but many tools will use it to aid the programmer in writing correct code.
So in your case (if you have a CustomObject class defined), you can do:
customObj: CustomObject
See this or that for more info.
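Applied to the class from the question, this could look like the sketch below (Python 3.6+). Writer and custom_obj are illustrative spellings of the question's names. Note that an annotation without a value records the hint but does not create the attribute.

```python
class CustomObject:
    pass


class Writer:
    path: str = ""            # annotated and given a default value
    custom_obj: CustomObject  # annotation only: no attribute is created


declared = Writer.__annotations__
has_default = hasattr(Writer, "custom_obj")

w = Writer()
w.custom_obj = CustomObject()  # the attribute exists once it is assigned
w.path = 123                   # hints are not enforced at runtime

print(declared, has_default)
```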

There's no need to declare new variables in Python. If we're talking about variables in functions or modules, no declaration is needed. Just assign a value to a name where you need it: mymagic = "Magic". Variables in Python can hold values of any type, and you can't restrict that.
Your question specifically asks about classes, objects and instance variables though. The idiomatic way to create instance variables is in the __init__ method and nowhere else — while you could create new instance variables in other methods, or even in unrelated code, it's just a bad idea. It'll make your code hard to reason about or to maintain.
So for example:
class Thing(object):
    def __init__(self, magic):
        self.magic = magic
Easy. Now instances of this class have a magic attribute:
thingo = Thing("More magic")
# thingo.magic is now "More magic"
Creating variables in the namespace of the class itself leads to different behaviour altogether. It is functionally different, and you should only do it if you have a specific reason to. For example:
class Thing(object):
    magic = "Magic"

    def __init__(self):
        pass
Now try:
thingo = Thing()
Thing.magic = 1
# thingo.magic is now 1
Or:
class Thing(object):
    magic = ["More", "magic"]

    def __init__(self):
        pass

thing1 = Thing()
thing2 = Thing()
thing1.magic.append("here")
# thing1.magic AND thing2.magic are now ["More", "magic", "here"]
This is because the namespace of the class itself is different to the namespace of the objects created from it. I'll leave it to you to research that a bit more.
The take-home message is that idiomatic Python is to (a) initialise object attributes in your __init__ method, and (b) document the behaviour of your class as needed. You don't need to go to the trouble of full-blown Sphinx-level documentation for everything you ever write, but at least some comments about whatever details you or someone else might need to pick it up.

For scoping purpose, I use:
custom_object = None

Variables have scope, so yes, it is appropriate to have variables that are specific to your function. You don't always have to be explicit about their definition; usually you can just use them. Only if you want to do something specific to the type of the variable, like append for a list, do you need to define them before you start using them. A typical example of this:
items = []
for i in stuff:
    items.append(i)
By the way, this is not really a good way to set up the list. It would be better to say:
items = [i for i in stuff]  # list comprehension
...but I digress.
Your other question.
The custom object should be a class itself.
class CustomObject():  # always capitalize the class name...this is not syntax, just style.
    pass

customObj = CustomObject()

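As of Python 3.6, you can annotate variables with a type (function parameter annotations have existed since Python 3.0). These annotations are hints; the interpreter does not enforce them at runtime.
For instance, to annotate an integer, one can do it as follows:
x: int = 3
or:
def f(x: int):
    return x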
see this question for more detailed info about it:
Explicitly declaring a variable type in Python
