Python - Static Class Variables - python

I'm from a C++ background and have often used static variables to reduce the number of times variables have to be initialized (especially if the initialization takes very long). Other posts on StackOverflow suggested using static class variables as follows:
class MyClass(object):
    StaticList1 = [...] # Very large list
    StaticList2 = [...] # Very large list
Now this is fine if there exists at least 1 instance of MyClass throughout the execution of the program and the lists are only created once. However, if at some stage of the execution there is no instance of MyClass, Python seems to remove the static lists (I assume because the reference counter drops to 0).
So my question: is there an easy way, without using external modules, to initialize StaticList1 and StaticList2 only once (the first time they are used) and never remove them, even if there is no instance of MyClass, until the program exits (or you delete the lists manually)?
EDIT:
Maybe I oversimplified this issue. What I'm doing:
class MyClass(object):
    StaticList = None
    def __init__(self, info):
        if self.StaticList == None:
            print "Initializing ..."
            self.StaticList = []
            # Computationally expensive task to add elements to self.StaticList, depending on the value of parameter info
    def data(self):
        return self.StaticList
I import the module from another script and have a loop like this:
import myclass
for i in range(10000):
    m = myclass.MyClass(i)
    d = m.data()
    # Do something with d.
The initializing of the static list takes about 200 - 300 ms and is executed on every iteration of the loop, so the loop takes extremely long to finish.

While your class does have a static field called StaticList, you are actually initializing and working with an instance field of the same name because of the self qualifier you are using. I think your code will work fine if you use MyClass.StaticList to initialize and access it instead.
In general, by Python's name lookup, you can access a class field via an instance as if it was an instance field (e.g., self.StaticList) on that instance as long as you haven't actually set an instance field of the same name on that instance. From that moment on, the instance field shadows the class field (i.e., self.StaticList will find your new value, while MyClass.StaticList will still refer to your class value).
As an example fresh from the interpreter:
>>> class A(object):
...     v=2 # static initialization
...
>>> A.v
2
>>> a=A() # get an instance, and
>>> a.v # get the static value via the instance:
2
>>> a.v = 7 # but now set 'v' on the instance, and ...
>>> a.v # we will get the instance field's value:
7
>>> A.v # the static value is still the old:
2
>>> b=A() # and other instances of the class ...
>>> b.v # will use the same old static value:
2
The instance variable a.v is initially equal to A.v, but by explicitly setting a.v=7, you are "dissociating" them in that instance.
While this means that, in principle, you could make use of a static class field MyClass.Values as well as an instance field xyz.Values of the same name, this is often discouraged for exactly this kind of confusion.
As a separate remark, you could consider decorating the data method with @staticmethod (removing the self argument in the process) and calling it as MyClass.data(), to make it clearer that you get back the same list instance on every call.
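A minimal sketch of the fix described in this answer, written for Python 3. The `_build` helper, the `init_count` counter and the list contents are placeholders for the expensive computation (assumptions, not part of the original code), and the sketch assumes the list does not actually depend on `info`; if it does, something like a dictionary keyed by `info` would be needed instead.

```python
class MyClass(object):
    StaticList = None   # class attribute shared by all instances
    init_count = 0      # illustrative counter, not in the original code

    def __init__(self, info):
        self.info = info
        # Test and assign through the class, not through self, so the
        # result is stored on the class and outlives the instance.
        if MyClass.StaticList is None:
            MyClass.StaticList = self._build()

    @staticmethod
    def _build():
        # Stand-in for the computationally expensive initialization.
        MyClass.init_count += 1
        return [n * n for n in range(5)]

    def data(self):
        return MyClass.StaticList


for i in range(1000):
    m = MyClass(i)
    d = m.data()

print(MyClass.init_count)  # number of times the list was built
```

Because the list lives on the class object, it stays alive for as long as the class itself is referenced (normally until the program exits), whether or not any instance exists.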

Related

In python, why we can create a new attribute from an instance and not a method?

In the following code,
# An example class with some variable and a method
class ExampleClass(object):
    def __init__(self):
        self.var = 10
    def dummyPrint(self):
        print ('Hello World!')
# Creating instance and printing the init variable
inst_a = ExampleClass()
# This prints --> __init__ variable = 10
print ('__init__ variable = %d' %(inst_a.var))
# This prints --> Hello World!
inst_a.dummyPrint()
# Creating a new attribute and printing it.
# This prints --> New variable = 20
inst_a.new_var = 20
print ('New variable = %d' %(inst_a.new_var))
# Trying to create new method, which will give error
inst_a.newDummyPrint()
I am able to create a new attribute (new_var) outside the class, using an instance, and it works. I was expecting it not to work.
Similarly, I tried creating a new method (newDummyPrint()), which raises AttributeError: 'ExampleClass' object has no attribute 'newDummyPrint', as I expected.
My question is:
Why did creating a new attribute work?
Why didn't creating a new method work?
As already mentioned in the comments, you are creating the new attribute here:
inst_a.new_var = 20
before reading it on the next line. You're NOT assigning newDummyPrint anywhere, so obviously the attribute resolution mechanism cannot find it and ends up raising an AttributeError. You'd get the very same result if you tried to access any other non-existing attribute, e.g. inst_a.whatever.
Note that since in Python everything is an object (including classes, functions etc), there is no real distinction between accessing a "data" attribute and accessing a method - they are all attributes (whether class or instance ones), and the attribute resolution rules are the same. In the case of methods (or any other callable attribute), the call operation happens after the attribute has been resolved.
To dynamically create a new "method", you mainly have two solutions: creating it as a class attribute (which will make it available to all other instances of the class), or as an instance attribute (which will - obviously - make it available only on this exact instance).
The first solution is as simple as it can be: define your function and bind it to the class:
# nb: inheriting from `object` for py2 compat
class Foo(object):
    def __init__(self, var):
        self.var = var
# a plain module-level function, not (yet) part of the class
def bar(self, x):
    return self.var * x
# testing before:
f = Foo(42)
try:
    print(f.bar(2))
except AttributeError as e:
    print(e)
# now binds the function to the class:
Foo.bar = bar
# and test it:
print(f.bar(2))
# and it's also available on other instances:
f2 = Foo(6)
print(f2.bar(7))
Creating a per-instance method is a (very tiny) bit more involved - you have to manually get the method from the function and bind this method to the instance:
def baaz(self):
    return "{}.var = {}".format(self, self.var)
# test before:
try:
    print(f.baaz())
except AttributeError as e:
    print(e)
# now binds the method to the instance
f.baaz = baaz.__get__(f, Foo)
# now `f` has a `baaz` method
print(f.baaz())
# but other Foo instances don't
try:
    print(f2.baaz())
except AttributeError as e:
    print(e)
You'll notice I talked about functions in the first case and methods in the second case. A Python "method" is actually just a thin callable wrapper around a function, an instance and a class, and is provided by the function type through the descriptor protocol, which is automatically invoked when the attribute is found on the class (i.e. it is a class attribute implementing the descriptor protocol) but not when it is found on the instance itself. This is why, in the second case, we have to invoke the descriptor protocol manually.
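The binding step can be observed in isolation. The following is a small sketch in Python 3; the `Foo` and `baaz` names mirror the examples above, while the `raw` attribute is purely illustrative. It also shows `types.MethodType`, a standard-library alternative to calling `__get__` by hand.

```python
import types


class Foo(object):
    def __init__(self, var):
        self.var = var


def baaz(self):
    return "var = {}".format(self.var)


f = Foo(42)

# Plain functions are descriptors: __get__ turns one into a bound method.
bound = baaz.__get__(f, Foo)
print(bound())

# types.MethodType builds the same kind of bound method explicitly.
f.baaz = types.MethodType(baaz, f)
print(f.baaz())

# A bare function stored on the instance is NOT bound automatically,
# so calling it without arguments fails: `self` is never supplied.
f.raw = baaz
try:
    f.raw()
except TypeError as e:
    print("call without binding failed:", e)
```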
Also note that there are limitations on what's possible here: first, __magic__ methods (all methods named with two leading and two trailing underscores) are only looked up on the class itself so you cannot define them on a per-instance basis. Then, slots-based types and some builtin or C-coded types do not support dynamic attributes whatsoever. Those restrictions are mainly there for performance optimization reasons.
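The first limitation can be demonstrated with `len()`. This is a sketch in Python 3 with a hypothetical `Box` class: a `__len__` stored on the instance is accepted as an ordinary attribute but ignored by `len()`, which only consults the class.

```python
class Box(object):
    def __init__(self, items):
        self.items = items


b = Box([1, 2, 3])

# Assigning __len__ on the instance is allowed, but len() ignores it.
b.__len__ = lambda: 99
try:
    len(b)
    instance_hook_used = True
except TypeError:
    instance_hook_used = False

# Defining it on the class is what len() actually looks for.
Box.__len__ = lambda self: len(self.items)
print(instance_hook_used, len(b))
```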
You can create new attributes on the fly when you are using an empty class definition to emulate a Pascal "record" or a C "struct". Otherwise, what you are trying to do is not good practice, nor a good pattern for object-oriented programming. There are lots of books you can read about it. Generally speaking, the class definition should state clearly what an object of that class is and how it behaves: modifying its behavior on the fly (e.g. adding new methods) can lead to unexpected results, which makes your life very hard when reading that code a month later, and even worse when you are debugging.
There is even an anti-pattern problem called Ambiguous Viewpoint:
Lack of clarification of the modeling viewpoint leads to problematic
ambiguities in object models.
Anyway, if you are playing with Python and you swear you'll never use this code in production, you can write new attributes which store lambda functions, e.g.
c = ExampleClass()
c.newMethod = lambda s1, s2: str(s1) + ' and ' + str(s2)
print(c.newMethod('string1', 'string2'))
# output is: string1 and string2
but this is very ugly, I would never do it.

Why doesn't Python allow referencing a class inside its definition?

Python (3 and 2) doesn't allow you to reference a class inside its body (except in methods):
class A:
    static_attribute = A()
This raises a NameError in the second line because 'A' is not defined, while this
class A:
    def method(self):
        return A('argument')
works fine.
In other languages, for example Java, the former is no problem and it is advantageous in many situations, like implementing singletons.
Why isn't this possible in Python? What are the reasons for this decision?
EDIT:
I edited my other question so it asks only for ways to "circumvent" this restriction, while this question asks for its motivation / technical details.
Python is a dynamically typed language, and executes statements as you import the module. There is no compiled definition of a class object, the object is created by executing the class statement.
Python essentially executes the class body like a function, taking the resulting local namespace to form the body. Thus the following code:
class Foo(object):
    bar = baz
translates roughly to:
def _Foo_body():
    bar = baz
    return locals()
Foo = type('Foo', (object,), _Foo_body())
As a result, the name for the class is not assigned to until the class statement has completed executing. You can't use the name inside the class statement until that statement has completed, in the same way that you can't use a function until the def statement has completed defining it.
This does mean you can dynamically create classes on the fly:
def class_with_base(base_class):
    class Foo(base_class):
        pass
    return Foo
You can store those classes in a list:
classes = [class_with_base(base) for base in list_of_bases]
Now you have a list of classes with no global names referring to them anywhere. Without a global name, I can't rely on such a name existing in a method either; return Foo won't work as there is no Foo global for that to refer to.
Next, Python supports a concept called a metaclass, which produces classes just like a class produces instances. The type() function above is the default metaclass, but you are free to supply your own for a class. A metaclass is free to produce whatever it likes really, even things that are not classes! As such, Python cannot know up front what kind of object a class statement will produce, and can't make assumptions about what the name will end up bound to. See What is a metaclass in Python?
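To make the metaclass point concrete, here is a small sketch using Python 3 syntax. The `not_a_class` function is a made-up example of a callable used as a metaclass; it returns a dictionary, so the name `Foo` ends up bound to something that is not a class at all.

```python
def not_a_class(name, bases, namespace):
    # Any callable can act as a metaclass; this one returns a dict
    # describing the class body instead of building a class.
    attrs = sorted(k for k in namespace if not k.startswith("__"))
    return {"name": name, "attrs": attrs}


class Foo(metaclass=not_a_class):
    bar = 1
    baz = 2


print(type(Foo))
print(Foo)
```

Only after the class body has run and the metaclass has returned does Python know what to bind to `Foo`, which is why the name cannot be used inside the body.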
All this is not something you can do in a statically typed language like Java.
A class statement is executed just like any other statement. Your first example is (roughly) equivalent to
a = A()
A = type('A', (), {'static_attribute': a})
The first line obviously raises a NameError, because A isn't yet bound to anything.
In your second example, A isn't referenced until method is actually called, by which time A does refer to the class.
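Both cases can be reproduced in a few lines. This is only a sketch of the binding order (the class `B` and the `error` variable are illustrative names): the name becomes available as soon as the class statement finishes, so an attribute that needs an instance can be attached right after it.

```python
# Inside the class body the name is not bound yet.
try:
    class B(object):
        static_attribute = B()
except NameError as e:
    error = str(e)
    print(error)


class A(object):
    static_attribute = None  # placeholder; A does not exist yet here

    def method(self):
        # Looked up at call time, when the name A is already bound.
        return A()


# The class statement has completed, so A can be used now.
A.static_attribute = A()

print(isinstance(A.static_attribute, A))
print(isinstance(A().method(), A))
```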
Essentially, a class does not exist until its entire definition has been executed. This is similar to the explicit end-of-block markers of other languages; Python uses implicit block ends, which are determined by indentation.
The other answers are great at explaining why you can't reference the class by name within the class, but you can use class methods to access the class.
The @classmethod decorator marks a method that will be passed the class itself, instead of the usual class instance (self). This is similar to Java's static method (there's also a @staticmethod decorator, which is a little different).
For a singleton, you can use a class attribute to store the object instance (attributes defined at the class level correspond to fields declared static in a Java class):
class A(object):
    instance = None
    @classmethod
    def get_singleton(cls):
        if cls.instance is None:
            print "Creating new instance"
            cls.instance = cls()
        return cls.instance
>>> a1 = A.get_singleton()
Creating new instance
>>> a2 = A.get_singleton()
>>> print a1 is a2
True
You can also use class methods to make java-style "static" methods:
class Name(object):
    def __init__(self, name):
        self.name = name
    @classmethod
    def make_as_victoria(cls):
        return cls("Victoria")
    @classmethod
    def make_as_stephen(cls):
        return cls("Stephen")
>>> victoria = Name.make_as_victoria()
>>> stephen = Name.make_as_stephen()
>>> print victoria.name
Victoria
>>> print stephen.name
Stephen
The answer is "just because".
It has nothing to do with the type system of Python, or it being dynamic. It has to do with the order in which a newly introduced type is initialized.
Some months ago I developed an object system for the language TXR, in which this works:
1> (defstruct foo nil (:static bar (new foo)))
#
2> (new foo)
#S(foo)
3> *2.bar
#S(foo)
Here, bar is a static slot ("class variable") in foo. It is initialized by an expression which constructs a foo.
Why that works can be understood from the function-based API for the instantiation of a new type, where the static class initialization is performed by a function which is passed in. The defstruct macro compiles a call to make-struct-type in which the (new foo) expression ends up in the body of the anonymous function that is passed for the static-initfun argument. This function is called after the type is registered under the foo symbol already.
We could easily patch the C implementation of make_struct_type so that this breaks. The last few lines of that function are:
  sethash(struct_type_hash, name, stype);
  if (super) {
    mpush(stype, mkloc(su->dvtypes, super));
    memcpy(st->stslot, su->stslot, sizeof (val) * su->nstslots);
  }
  call_stinitfun_chain(st, stype);
  return stype;
}
The call_stinitfun_chain call does the initialization which ends up evaluating (new foo) and storing it in the bar static slot, and the sethash call is what registers the type under its name.
If we simply reverse the order in which these functions are called, the language and type system will still be the same, and almost everything will work as before. Yet, the (:static bar (new foo)) slot specifier will fail.
I put the calls in that order because I wanted the language-controlled aspects of the type to be as complete as possible before exposing it to the user-definable initializations.
I can't think of any reason for foo not to be known at the time when that struct type is being initialized, let alone a good reason. It is legitimate for static construction to create an instance. For example, we could use it to create a "singleton".
This looks like a bug in Python.

Creating a global variable (from a string) from within a class

Context: I'm making a Ren'py game. The value is Character(). Yes, I know this is a dumb idea outside of this context.
I need to create a variable from an input string inside of a class that exists outside of the class' scope:
class Test:
    def __init__(self):
        self.dict = {} # used elsewhere to give the inputs for the function below.
    def create_global_var(self, variable, value):
        # the equivalent of exec("global {0}; {0} = {1}".format(str(variable), str(value)))
    # other functions in the class that require this.
Test().create_global_var("abc", "123") # hence abc = 123
I have tried vars()[], globals()[variable] = value, etc, and they simply do not work (they don't even define anything) Edit: this was my problem.
I know that the following would work equally as well, but I want the variables in the correct scope:
setattr(self.__class__, variable, value) # d.abc = 123, now. but incorrect scope.
How can I create a variable in the global scope from within a class, using a string as the variable name, without using attributes or exec in python?
And yes, I'll be sanity checking.
First things first: what we call the "global" scope in Python is actually the "module" scope (on the good side, it diminishes the "evils" of using global vars).
Then, for creating a global var dynamically, although I still can't see why that would be better than using a module-level dictionary, just do:
globals()[variable] = value
This creates a variable in the current module. If you need to create a module variable on the module from which the method was called, you can peek at the globals dictionary from the caller frame using:
from inspect import currentframe
currentframe().f_back.f_globals[variable] = value
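Both variants can be tried in one self-contained sketch (Python 3). The method name `create_in_caller` and the `created` variable are illustrative, not from the question. Note that `inspect.currentframe()` may return None on Python implementations without stack frame support, so the second variant is best treated as CPython-specific.

```python
from inspect import currentframe


class Test(object):
    def create_global_var(self, variable, value):
        # Writes into the namespace of the module defining this class.
        globals()[variable] = value

    def create_in_caller(self, variable, value):
        # Writes into the module namespace of whoever called this method.
        currentframe().f_back.f_globals[variable] = value


t = Test()
t.create_global_var("abc", 123)
t.create_in_caller("xyz", 456)

# Here the class and the caller live in the same module, so both names
# end up in the same namespace.
created = (globals()["abc"], globals()["xyz"])
print(created)
```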
Now, this seems especially useless, since you may create a variable with a dynamic name, but you can't access it dynamically (unless you use the globals dictionary again).
Even in your test example, you create the "abc" variable passing the method a string, but then you have to access it by using a hardcoded "abc" - the language itself is designed to discourage this (hence the difference to Javascript, where array indexes and object attributes are interchangeable, while in Python you have distinct Mapping objects)
My suggestion is that you use a module-level explicit dictionary and create all your
dynamic variables as key/value pairs there:
names = {}
class Test(object):
    def __init__(self):
        self.dict = {} # used elsewhere to give the inputs for the function below.
    def create_global_var(self, variable, value):
        names[variable] = value
(on a side note, in Python 2 always inherit your classes from "object")
You can use setattr(__builtins__, 'abc', '123') for this.
Do mind you that this is most likely a design problem and you should rethink the design.

Python Variable Declaration

I want to clarify how variables are declared in Python.
I have seen variable declaration as
class writer:
    path = ""
sometimes, there is no explicit declaration but just initialization using __init__:
def __init__(self, name):
    self.name = name
I understand the purpose of __init__, but is it advisable to declare variable in any other functions?
How can I create a variable to hold a custom type?
class writer:
    path = "" # string value
    customObj = ??
Okay, first things first.
There is no such thing as "variable declaration" or "variable initialization" in Python.
There is simply what we call "assignment", but should probably just call "naming".
Assignment means "this name on the left-hand side now refers to the result of evaluating the right-hand side, regardless of what it referred to before (if anything)".
foo = 'bar' # the name 'foo' is now a name for the string 'bar'
foo = 2 * 3 # the name 'foo' stops being a name for the string 'bar',
# and starts being a name for the integer 6, resulting from the multiplication
As such, Python's names (a better term than "variables", arguably) don't have associated types; the values do. You can re-apply the same name to anything regardless of its type, but the thing still has behaviour that's dependent upon its type. The name is simply a way to refer to the value (object). This answers your second question: You don't create variables to hold a custom type. You don't create variables to hold any particular type. You don't "create" variables at all. You give names to objects.
Second point: Python follows a very simple rule when it comes to classes, that is actually much more consistent than what languages like Java, C++ and C# do: everything declared inside the class block is part of the class. So, functions (def) written here are methods, i.e. part of the class object (not stored on a per-instance basis), just like in Java, C++ and C#; but other names here are also part of the class. Again, the names are just names, and they don't have associated types, and functions are objects too in Python. Thus:
class Example:
    data = 42
    def method(self): pass
Classes are objects too, in Python.
So now we have created an object named Example, which represents the class of all things that are Examples. This object has two user-supplied attributes (In C++, "members"; in C#, "fields or properties or methods"; in Java, "fields or methods"). One of them is named data, and it stores the integer value 42. The other is named method, and it stores a function object. (There are several more attributes that Python adds automatically.)
These attributes still aren't really part of the object, though. Fundamentally, an object is just a bundle of more names (the attribute names), until you get down to things that can't be divided up any more. Thus, values can be shared between different instances of a class, or even between objects of different classes, if you deliberately set that up.
Let's create an instance:
x = Example()
Now we have a separate object named x, which is an instance of Example. The data and method are not actually part of the object, but we can still look them up via x because of some magic that Python does behind the scenes. When we look up method, in particular, we will instead get a "bound method" (when we call it, x gets passed automatically as the self parameter, which cannot happen if we look up Example.method directly).
What happens when we try to use x.data?
When we examine it, it's looked up in the object first. If it's not found in the object, Python looks in the class.
However, when we assign to x.data, Python will create an attribute on the object. It will not replace the class' attribute.
This allows us to do object initialization. Python will automatically call the class' __init__ method on new instances when they are created, if present. In this method, we can simply assign to attributes to set initial values for that attribute on each object:
class Example:
    name = "Ignored"
    def __init__(self, name):
        self.name = name
    # rest as before
Now we must specify a name when we create an Example, and each instance has its own name. Python will ignore the class attribute Example.name whenever we look up the .name of an instance, because the instance's attribute will be found first.
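The lookup and assignment rules described above can be checked directly. A minimal sketch in Python 3, reusing the `Example` class from this answer with both of its class attributes; `vars()` shows what is actually stored on the instance.

```python
class Example(object):
    name = "Ignored"      # class attribute
    data = 42             # class attribute, never assigned in __init__

    def __init__(self, name):
        self.name = name  # instance attribute, found first on lookup


x = Example("first")
y = Example("second")

print(x.name, y.name, Example.name)
print(x.data)             # not on the instance, so found on the class
print(vars(x))            # only the instance's own attributes

x.data = 7                # assignment creates an attribute on x only
print(x.data, y.data, Example.data)
```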
One last caveat: modification (mutation) and assignment are different things!
In Python, strings are immutable. They cannot be modified. When you do:
a = 'hi '
b = a
a += 'mom'
You do not change the original 'hi ' string. That is impossible in Python. Instead, you create a new string 'hi mom', and cause a to stop being a name for 'hi ', and start being a name for 'hi mom' instead. We made b a name for 'hi ' as well, and after re-applying the a name, b is still a name for 'hi ', because 'hi ' still exists and has not been changed.
But lists can be changed:
a = [1, 2, 3]
b = a
a += [4]
Now b is [1, 2, 3, 4] as well, because we made b a name for the same thing that a named, and then we changed that thing. We did not create a new list for a to name, because Python simply treats += differently for lists.
This matters for objects because if you had a list as a class attribute, and used an instance to modify the list, then the change would be "seen" in all other instances. This is because (a) the data is actually part of the class object, and not any instance object; (b) because you were modifying the list and not doing a simple assignment, you did not create a new instance attribute hiding the class attribute.
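A short sketch of that last point, in Python 3 with a hypothetical `Shared` class: mutating the list through one instance is visible everywhere, while assigning through an instance only hides the class attribute for that instance.

```python
class Shared(object):
    items = []            # one list object, stored on the class


a = Shared()
b = Shared()

a.items.append("x")       # mutation: changes the single shared list
print(b.items, Shared.items)

a.items = ["own"]         # assignment: new instance attribute on a only
print(a.items, b.items)
```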
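This might be 6 years late, but in Python 3.6 and above you can give a hint about a variable's type with an annotation, like this:
variable_name: type_name
or, in code that must also run on older versions, with a type comment on an assignment:
variable_name = value  # type: shinyType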
This hint has no effect in the core Python interpreter, but many tools will use it to aid the programmer in writing correct code.
So in your case(if you have a CustomObject class defined), you can do:
customObj: CustomObject
See this or that for more info.
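A brief sketch of what such hints do and do not do at runtime (Python 3.6+). The `Writer` and `CustomObject` classes follow the question's example; the `count` variable is illustrative. An annotation without a value records the hint but creates no attribute, and a mismatched value is not rejected by the interpreter.

```python
class CustomObject(object):
    pass


class Writer(object):
    path: str = ""
    custom_obj: CustomObject  # annotation only; no attribute is created


count: int = "three"          # runs without error; hints are not enforced

print(type(count).__name__)
print(hasattr(Writer, "custom_obj"))
print(Writer.__annotations__)
```

A static checker such as mypy would typically flag the `count` line, which is where these hints earn their keep.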
There's no need to declare new variables in Python. If we're talking about variables in functions or modules, no declaration is needed. Just assign a value to a name where you need it: mymagic = "Magic". Variables in Python can hold values of any type, and you can't restrict that.
Your question specifically asks about classes, objects and instance variables though. The idiomatic way to create instance variables is in the __init__ method and nowhere else — while you could create new instance variables in other methods, or even in unrelated code, it's just a bad idea. It'll make your code hard to reason about or to maintain.
So for example:
class Thing(object):
    def __init__(self, magic):
        self.magic = magic
Easy. Now instances of this class have a magic attribute:
thingo = Thing("More magic")
# thingo.magic is now "More magic"
Creating variables in the namespace of the class itself leads to different behaviour altogether. It is functionally different, and you should only do it if you have a specific reason to. For example:
class Thing(object):
    magic = "Magic"
    def __init__(self):
        pass
Now try:
thingo = Thing()
Thing.magic = 1
# thingo.magic is now 1
Or:
class Thing(object):
    magic = ["More", "magic"]
    def __init__(self):
        pass
thing1 = Thing()
thing2 = Thing()
thing1.magic.append("here")
# thing1.magic AND thing2.magic is now ["More", "magic", "here"]
This is because the namespace of the class itself is different to the namespace of the objects created from it. I'll leave it to you to research that a bit more.
The take-home message is that idiomatic Python is to (a) initialise object attributes in your __init__ method, and (b) document the behaviour of your class as needed. You don't need to go to the trouble of full-blown Sphinx-level documentation for everything you ever write, but at least some comments about whatever details you or someone else might need to pick it up.
For scoping purpose, I use:
custom_object = None
Variables have scope, so yes it is appropriate to have variables that are specific to your function. You don't always have to be explicit about their definition; usually you can just use them. Only if you want to do something specific to the type of the variable, like append for a list, do you need to define them before you start using them. Typical example of this.
list = []
for i in stuff:
    list.append(i)
By the way, this is not really a good way to setup the list. It would be better to say:
list = [i for i in stuff] # list comprehension
...but I digress.
Your other question.
The custom object should be a class itself.
class CustomObject(): # always capitalize the class name...this is not syntax, just style.
    pass
customObj = CustomObject()
As of Python 3.6, you can annotate variables with a type. These annotations are hints for tools and readers; the interpreter does not enforce them.
For instance, to declare an integer one can do it as follows:
x: int = 3
or:
def f(x: int):
    return x
see this question for more detailed info about it:
Explicitly declaring a variable type in Python

Python Class scope & lists

I'm still fairly new to Python, and my OO experience comes from Java. So I have some code I've written in Python that's acting very unusual to me, given the following code:
class MyClass():
    mylist = []
    mynum = 0
    def __init__(self):
        # populate list with some value.
        self.mylist.append("Hey!")
        # increment mynum.
        self.mynum += 1
a = MyClass()
print a.mylist
print a.mynum
b = MyClass()
print b.mylist
print b.mynum
Running this results in the following output:
['Hey!']
1
['Hey!', 'Hey!']
1
Clearly, I would expect the class variables to result in the same exact data, and the same exact output... What I can't seem to find anywhere is what makes a list different than say a string or number, why is the list referencing the same list from the first instantiation in subsequent ones? Clearly I'm probably misunderstanding some kind of scope mechanics or list creation mechanics..
tlayton's answer is part of the story, but it doesn't explain everything.
Add a
print MyClass.mynum
to become even more confused :). It will print '0'. Why? Because the line
self.mynum += 1
creates an instance variable and subsequently increases it. It doesn't increase the class variable.
The story of the mylist is different.
self.mylist.append("Hey!")
will not create a list. It expects a variable with an 'append' function to exist. Since the instance doesn't have such a variable, it ends up referring the one from the class, which does exist, since you initialized it. Just like in Java, an instance can 'implicitly' reference a class variable. A warning like 'Class fields should be referenced by the class, not by an instance' (or something like that; it's been a while since I saw it in Java) would be in order. Add a line
print MyClass.mylist
to verify this answer :).
In short: you are initializing class variables and updating instance variables. Instances can reference class variables, but some 'update' statements will automagically create the instance variables for you.
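The summary above can be verified by inspecting the class and the instance namespaces. A small sketch in Python 3, using the class from the question and `vars()` to show what each instance actually owns:

```python
class MyClass(object):
    mylist = []
    mynum = 0

    def __init__(self):
        self.mylist.append("Hey!")   # mutates the class-level list
        self.mynum += 1              # reads the class value, assigns on self


a = MyClass()
b = MyClass()

print(MyClass.mylist, MyClass.mynum)  # the list grew, the number did not
print(vars(a), vars(b))               # each instance only owns 'mynum'
```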
I believe the difference is that += is an assignment (just the same as = and +), while append changes an object in-place.
mylist = []
mynum = 0
This assigns some class variables, once, at class definition time.
self.mylist.append("Hey!")
This changes the value MyClass.mylist by appending a string.
self.mynum += 1
This is the same as self.mynum = self.mynum + 1, i.e., it assigns self.mynum (instance member). Reading from self.mynum falls through to the class member since at that time there is no instance member by that name.
What you are doing here is creating class variables only. In Python, a variable defined in the class body is stored on the class ("MyClass.mylist"); reading it through an instance ("a.mylist") falls back to that class variable until the instance is given an attribute of the same name.
Because the initial value is evaluated only once, when the class body runs, every instance of MyClass that has not been assigned its own mylist refers to the same single list object.
The difference between a list and a number in this case is not that numbers are copied. It is that self.mynum += 1 rebinds the name on the instance to a new integer (integers are immutable), while append() modifies the shared list in place, so your append() calls all act on the same list. Try this instead:
class MyClass():
    def __init__(self):
        self.mylist = ["Hey"]
        self.mynum = 1
This will cause the value to be evaluated separately each time an instance is created. Very much unlike Java, you don't need the class-body declarations to accompany this snippet; the assignments in the __init__() serve as all the declaration that is needed.
