Python: manipulating the global namespace vs explicit assignment of object reference

I've been reading plenty of posts here stating that one should never manipulate the global namespace. I understand the concerns with respect to good coding practice (i.e., keep things localized where you can, avoid side effects, etc.). But many comments also point to performance issues (i.e., that doing so bloats the globals dict). However, I am curious whether the following two cases are indeed treated differently by the Python interpreter:
# let's assume the following two classes
class someclass1():
    def __init__(self, name):
        self.n = name
class someclass2():
    def __init__(self, name):
        import builtins
        self.n = name
        setattr(builtins, name, self)
# now we create the instances as
SC1 = someclass1("SC1")
someclass2("SC2")
Both versions will create new object handles (SC1 and SC2) in the global namespace. Does it truly matter how I create them? Note, I am fully aware that this can be handled by a dict but let's keep the discussion of why I want to do this for another thread.
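A minimal sketch of where the two names end up, assuming Python 3 and that the code runs at the top level of a script. The classes are the ones from the question; the helper variables (sc1_in_module and so on) are made up for illustration. It suggests the two cases are not equivalent: plain assignment binds a name in the current module, while setattr on builtins binds it in the builtins module, which every module falls back to during name lookup.

```python
import builtins


class someclass1:
    def __init__(self, name):
        self.n = name


class someclass2:
    def __init__(self, name):
        self.n = name
        # Registers the instance as an attribute of the builtins module.
        setattr(builtins, name, self)


SC1 = someclass1("SC1")  # ordinary assignment: binds a name in this module
someclass2("SC2")        # setattr: binds a name in builtins instead

sc1_in_module = "SC1" in globals()
sc2_in_module = "SC2" in globals()
sc2_in_builtins = hasattr(builtins, "SC2")

# The bare name SC2 still resolves, because name lookup falls back to
# builtins after the module's globals.
sc2_name = SC2.n  # noqa: F821

delattr(builtins, "SC2")  # clean up so the name does not leak elsewhere
```

Because the builtins module is shared by the whole interpreter, the second form makes SC2 visible from every module, not just the one that created it.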

Related

Scattered declaration of class/instance variable names in Python due to combining declaration with usage

I'm very new to Python, so please bear with me if I've missed something.
It seems that class and instance variables in Python simply spring into existence when they are first used. Because use and declaration are combined, the class declaration is no longer the only place where a class's data can be defined. Consequently, given a class Cls, one cannot tell which attributes have been created for it simply by tracing back to the class's declaration, as one can in languages such as C and Java. Instead, one is compelled to search every module that might have used or defined variables for that class.
As a result, when a class is used by multiple modules, even if you know that some attribute belonging to that class must have been defined, to learn its name and type you have to search through all the imported files that might have used the class (in a way that creates new variables) to see where it was defined.
Example:
declaration:
class Cls:
    pass
module1:
Cls.m = 2.3
module_n:
print(Cls.m)
Again, the issue is that people working on module_n, before they write that print statement, don't know whether the variable will be defined in module1 or in some other module in the import list that could have used Cls. If they can't find where the variable is defined, they can't use it, since they don't know its name or even that it exists.
As someone less than a day into Python who is reading a rather lengthy project, I want to know whether there's any trick to solve this. Both as a reader and as a writer, how can such headaches be avoided?

Referencing an attribute within a method

Is it bad practice to reference an attribute with a new variable name in a method within a class? For example:
class Stuff:
    def __init__(self, a):
        self.a = a
    def some_method(self):
        a = self.a
        # Do some stuff with a
I've seen this in other people's code and I've gotten into the habit of doing it myself, especially with long variable names. It seems like a copy of a is created when I do this, which could be a problem if a is very large. Should I just stick to using self.a inside some_method? Does Python garbage collect the a created in some_method after it is called?
This isn't necessarily bad practice. There are two reasons you might make this assignment (see the comments by @ShadowRanger for a rather obscure third one):
Making the code more readable (as you mentioned, long names can be unwieldy).
Eliminating the dot: if you have a long-running loop that uses self.a, you may save some time by not performing the attribute look-up on every iteration (not much time, though). Additionally, if this were a function rather than a plain attribute, assigning it to a local variable would avoid the function-to-method transformation on each access, which also saves some execution time.
Also, copy isn't the best term, you just make a different name refer to the same object. After the method some_method is completed, a will just not exist because it is only created in the local scope.
No, garbage collection doesn't happen because a (which is assigned to the value of self.a) isn't the only reference; you still have self.a which keeps the value assigned to it alive.
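To illustrate the point that the assignment creates a second name rather than a copy, here is a small sketch assuming Python 3 and a list as the attribute value (the list and the same_object variable are illustrative choices, not part of the original question).

```python
class Stuff:
    def __init__(self, a):
        self.a = a

    def some_method(self):
        a = self.a                 # binds a second name to the same object; no copy
        same_object = a is self.a  # identity check, not equality
        a.append(4)                # mutating via the local name is visible via self.a
        return same_object


stuff = Stuff([1, 2, 3])
same_object = stuff.some_method()
```

After the call, the local name a no longer exists, but the list lives on because stuff.a still refers to it.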

Is it better to use self variable than pass variable in a class? [closed]

Closed. This question needs to be more focused. It is not currently accepting answers.
Closed 8 years ago.
I used to be a C programmer, where we had to pass every variable as an argument or a pointer and were discouraged from defining global variables.
I am going to use some variables in several functions in Python.
Generally, which is better: passing the variable as an argument, or defining a self variable when we get the value? Does Python have any general rules about this?
Like this:
class A:
    def func2(self, var):
        print(var)
    def func1(self):
        var = 1
        self.func2(var)
class B:
    def func2(self):
        print(self.var)
    def func1(self):
        self.var = 1
        self.func2()
Which is better? A or B?
In Python, you have a lot of freedom to do what "makes sense". In this case, I would say that it depends on how you plan on using func2 and who will be accessing it. If func2 is only ever supposed to act upon self.var, then you should code it as such. If other objects are going to need to pass in different arguments to func2, then you should allow for it to be an argument. Of course, this all depends on the larger scope of what you're trying to do, but given your simple example, this makes sense.
Also, I'm confused about how your question relates to global variables. Member variables are not the same thing as global variables.
Edited to reflect updated post:
The difference between A and B in your example is that B persists the information about self.var, while A does not. If var needs to be persisted as part of the object's state, then you need to store it as part of self. I get the sense that your question might relate more to objects as a general concept than anything Python-specific.
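The persistence difference between A and B can be checked directly. This is a sketch assuming Python 3; the methods return values instead of printing so the results can be inspected, and the a_keeps_var / b_keeps_var names are made up for the illustration.

```python
class A:
    def func2(self, var):
        return var

    def func1(self):
        var = 1            # local: gone once func1 returns
        return self.func2(var)


class B:
    def func2(self):
        return self.var

    def func1(self):
        self.var = 1       # stored on the instance: survives the call
        return self.func2()


a, b = A(), B()
a_result, b_result = a.func1(), b.func1()
a_keeps_var = hasattr(a, "var")
b_keeps_var = hasattr(b, "var")
```

Both calls produce the same result; only B leaves var behind as part of the object's state.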
Of course it's better to design your program to use scope intelligently. The most obvious problem is that a mutation of a global variable can affect distant parts of code in ways that are difficult to trace, but in addition, garbage collection (reference counting, whatever) becomes effectively moot when your references live in long-lived scopes.
That said, Python has a global keyword, but it doesn't have globals in the same way C does. Python globals are module-level, so they're namespaced with the module by default. The downstream programmer can bypass or alias this namespacing, but that's their problem. There are certainly cases where defining a module-level configuration value, pseudo-enum or pseudo-constant makes sense.
Next, consider whether you need to maintain state: if the behavior of an object depends on it being aware of a certain value, make it a property. You can do that by attaching the value to self. Otherwise, just pass the value as an argument. (But then, if you have a lot of methods and no state, ask yourself if you really need a class, or should they just be module functions?)
This question has implications for object-oriented design. Python is an object-oriented language; C is not. You would be dramatically undermining (and in some cases thwarting) the advantages of object orientation by using in-out programming or entirely global variables in Python, except where there's a particular reason to do so.
Consider the following reasons, which are not exhaustive:
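Garbage collection cannot reclaim objects that are still referenced by global names, so if every variable is global, nothing is freed until the name is rebound or deleted or the program exits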
You no longer have fields (which is what "self" helps you reference). Say your object is a Cat; there isn't some global name for a cat which you reassign whenever a new Cat appears in your neighborhood. Rather, each cat has its own name, age, size, etc. Someone who wants to find out how big the cat is shouldn't have to go to some global repository of cat sizes and look it up, they should just look at the cat
You can run into problems with primitives because Python, unlike C, does not let you track (easily) the reference of an object. If I pass in an integer variable, I can't change the value of the variable in its original location, only within the scope of the function. This can be solved with global variables, but only by being very messy. Consider the following code:
def foo(x):
    x = 3
myVar = 5
foo(myVar)
print(myVar)
This will, of course, output 5, not 3. There is no "*x" (pointer dereference) as there is in C, so making foo reassign 3 to the caller's variable would be rather tricky in Python. Instead, we could write
class Foo:
    x = 5
def foo(fooObj):
    fooObj.x = 3
myFoo = Foo()
foo(myFoo)
print(myFoo.x)
Problem solved - it now outputs 3, not 5!
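For completeness, another common way to get the new value back to the caller, which the answer above does not mention, is to return it and let the caller rebind its own name. A minimal sketch, assuming Python 3 (my_var and unchanged are illustrative names):

```python
def foo(x):
    # Rebinding the parameter only changes the local name x.
    x = 3
    return x


my_var = 5
foo(my_var)           # result discarded: my_var is unchanged
unchanged = my_var
my_var = foo(my_var)  # the caller rebinds its own name explicitly
```

This keeps the data flow visible at the call site without needing a wrapper object or a global.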
As a general rule, it is better to use self whenever possible, to encapsulate internal information and bind it to the object (or class). It may be helpful to explain how self and classes work.
In Python, the instance is declared explicitly as the first parameter (self) of every method, just as you would do in C if you wanted to do OOP. This differs from other object-oriented languages, like Java or C++, where the this argument is passed implicitly (but it is always passed!).
Thus, if you define a class like:
class B(object):
    def __init__(self, var=None):  # this is the constructor
        self.var = var
    def func2(self):
        print(self.var)
then when you call a method on an object with the . operator, the object is passed as the first argument, which maps to self in the method signature:
b = B(1) # object b is created and B.__init__(b, 1) is called
b.func2() # B.func2(b) is called, outputs 1
I hope this clears up things for you a bit
I recommend focusing on proximity. If the variable only relates to the current method or is created to be passed to another method, then it probably isn't expressing the persistent state of the class instance. Create the variable and throw it away when you're done.
If the variable describes an important facet of the instance, use self. This is not encroaching on your aversion to global variables as the variable is encapsulated within the instance. Class and module variables are also fine for the same reason.
In short, both A and B are proper implementations depending on context. I'm sorry that I haven't given you a clear answer but it has more to do with how important an object is to the objects around it than maintaining any sort of community standard. That you asked the question makes me think you'll make a reasonable judgement.

Should I be using "global" or "self." for class scope variables in Python?

Both of these blocks of code work. Is there a "right" way to do this?
class Stuff:
    def __init__(self, x = 0):
        global globx
        globx = x
    def inc(self):
        return globx + 1
myStuff = Stuff(3)
print(myStuff.inc())
Prints "4"
class Stuff:
    def __init__(self, x = 0):
        self.x = x
    def inc(self):
        return self.x + 1
myStuff = Stuff(3)
print(myStuff.inc())
Also prints "4"
I'm a noob, and I'm working with a lot of variables in a class. Started wondering why I was putting "self." in front of everything in sight.
Thanks for your help!
You should use the second way; then every instance has a separate x.
If you use a global variable then you may find you get surprising results when you have more than one instance of Stuff as changing the value of one will affect all the others.
It's normal to have explicit self's all over your Python code. If you try tricks to avoid that you will be making your code difficult to read for other Python programmers (and potentially introducing extra bugs)
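The "surprising results" with more than one instance can be reproduced with a short sketch, assuming Python 3. The two variants from the question are renamed GlobalStuff and InstanceStuff here purely so they can sit side by side.

```python
class GlobalStuff:
    def __init__(self, x=0):
        global globx
        globx = x          # one module-level name shared by every instance

    def inc(self):
        return globx + 1


class InstanceStuff:
    def __init__(self, x=0):
        self.x = x         # each instance keeps its own value

    def inc(self):
        return self.x + 1


g1 = GlobalStuff(3)
g2 = GlobalStuff(10)       # silently overwrites the value g1 relied on
i1 = InstanceStuff(3)
i2 = InstanceStuff(10)

global_results = (g1.inc(), g2.inc())
instance_results = (i1.inc(), i2.inc())
```

With the global version both instances report 11, because creating g2 replaced the single shared value; with self, the results are 4 and 11.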
There are two kinds of "class scope variables". One uses self and is called an instance variable: each instance of a class has its own copy of its instance variables. The other is defined in the class definition, like this:
class Stuff:
    globx = 0
    def __init__(self, x = 0):
        Stuff.globx = x
    ...
This is called a class attribute. It can be accessed directly as Stuff.globx and is owned by the class, not by instances of the class, just like static variables in Java.
You should never use the global statement for a "class scope variable", because it isn't one. A variable declared as global lives in the global scope, i.e. the namespace of the module in which the class is defined.
Namespaces and related concepts are introduced in the Python tutorial.
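A short sketch of how class attributes and instance attributes interact, assuming Python 3. The shadowing step at the end goes slightly beyond what the answer describes and is included only to show where the two kinds of attribute differ; shared and after_shadowing are illustrative names.

```python
class Stuff:
    globx = 0              # class attribute: one value owned by the class

    def __init__(self, x=0):
        self.x = x         # instance attribute: one value per instance


s1, s2 = Stuff(1), Stuff(2)
Stuff.globx = 42           # visible through the class and every instance
shared = (s1.globx, s2.globx)

s1.globx = 7               # creates an instance attribute that shadows the class one
after_shadowing = (s1.globx, s2.globx, Stuff.globx)
```

Assigning through the class (Stuff.globx = ...) changes the shared value, whereas assigning through an instance only affects that instance.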
Those are very different semantically. self. means it's an instance variable, i.e. each instance has its own. This is probably the most common kind, but not the only one. Then there are class variables, defined at class level (and therefore created when the class definition is executed) and accessible in class methods. They are the equivalent of most uses of static members in other languages, and most probably what you want when you need to share state between instances (this is perfectly valid, although not automatically the one and only way to solve a given problem). You probably want one of those, depending on what you're doing. Really, we can't read your mind and tell you which one fits your problem.
Global variables are a different story. They're, well, global: everyone has the same one. This is almost never a good idea (for reasons explained on many occasions), but if you're just writing a quick and dirty script and need to share something between several places, they can be acceptable.

Alternative to Passing Global Variables Around to Classes and Functions

I'm new to Python, and I've been reading that using global to pass variables to other functions is considered noobie, as well as bad practice. I would like to move away from using global variables, but I'm not sure what to do instead.
Right now I have a UI I've created in wxPython as its own separate class, and I have another class that loads settings from a .ini file. Since the settings in the UI should match those in the .ini, how do I pass those values around? I could use something like Settings = Settings() and then define the variables as something like self.settings1, but then I would have to make Settings a global variable to pass it to my UI class (which it wouldn't be if I assigned it in main()).
So what is the correct and pythonic way to pass around these variables?
Edit: Here is the code that I'm working with, and I'm trying to get it to work like Alex Martelli's example. The following code is saved in Settings.py:
import ConfigParser
class _Settings():
    @property
    def enableautodownload(self): return self._enableautodownload
    def __init__(self):
        self.config = ConfigParser.ConfigParser()
        self.config.readfp(open('settings.ini'))
        self._enableautodownload = self.config.getboolean('DLSettings', 'enableautodownload')
settings = _Settings()
Whenever I try to refer to Settings.settings.enableautodownload from another file I get: AttributeError: 'module' object has no attribute 'settings'. What am I doing wrong?
Edit 2: Never mind about the issue, I retyped the code and it works now, so it must have been a simple spelling or syntax error.
The alternatives to global variables are many -- mostly:
explicit arguments to functions, classes called to create one of their instances, etc. (this is usually the clearest, since it makes the dependency most explicit, when feasible and not too repetitious);
instance variables of an object, when the functions that need access to those values are methods on that same object (that's OK too, and a reasonable way to use OOP);
"accessor functions" that provide the values (or an object which has attributes or properties for the values).
Each of these (esp. the first and third ones) is particularly useful for values whose names must not be re-bound by all and sundry, but only accessed. The really big problem with global is that it provides a "covert communication channel" (not in the cryptographic sense, but in the literal one: apparently separate functions can actually be depending on each other, influencing each other, via global values that are not "obvious" from the functions' signatures -- this makes the code hard to test, debug, maintain, and understand).
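The first alternative, explicit arguments, might look roughly like this for the settings/UI case in the question. This is only a sketch under stated assumptions: Settings, MainWindow, checkbox_state and enable_auto_download are hypothetical names, and the classes are plain stand-ins rather than real wxPython or ConfigParser code.

```python
class Settings:
    """Hypothetical stand-in for the class that loads values from the .ini file."""

    def __init__(self, enable_auto_download):
        self.enable_auto_download = enable_auto_download


class MainWindow:
    """Hypothetical stand-in for the UI class; no wxPython needed for the sketch."""

    def __init__(self, settings):
        # The dependency is visible in the signature instead of hidden in a global.
        self.settings = settings

    def checkbox_state(self):
        return self.settings.enable_auto_download


def main():
    settings = Settings(enable_auto_download=True)  # created once, locally
    window = MainWindow(settings)                   # handed over explicitly
    return window.checkbox_state()


result = main()
```

Nothing here is global: the settings object is created inside main() and passed to whoever needs it, which also makes it easy to hand a different Settings instance to the UI in a test.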
For your specific problem, if you never use the global statement, but rather access the settings in a "read-only" way from everywhere (and you can ensure that more fully by making said object's attributes be read-only properties!), then having the "read-only" accesses be performed on a single, made-once-then-not-changed, module-level instance, is not too bad. I.e., in some module foo.py:
class _Settings(object):
    @property
    def one(self): return self._one
    @property
    def two(self): return self._two
    def __init__(self, one, two):
        self._one, self._two = one, two
settings = _Settings(23, 45)
and from everywhere else, import foo then just access foo.settings.one and foo.settings.two as needed. Note that I've named the class with a single leading underscore (just like the two instance attributes that underlie the read-only properties) to suggest that it's not meant to be used from "outside" the module -- only the settings object is supposed to be (there's no enforcement -- but any user violating such requested privacy is most obviously the only party responsible for whatever mayhem may ensue;-).
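To see what the read-only properties buy you, here is a trimmed-down sketch of the same class, assuming Python 3 and keeping only the attribute one. The assignment_rejected and value_after names are illustrative.

```python
class _Settings(object):
    @property
    def one(self):
        return self._one

    def __init__(self, one):
        self._one = one


settings = _Settings(23)

try:
    settings.one = 99      # no setter was defined for the property
    assignment_rejected = False
except AttributeError:
    assignment_rejected = True

value_after = settings.one
```

Reading settings.one works everywhere, while an accidental assignment raises AttributeError instead of silently changing shared state. The underlying _one attribute can still be modified deliberately, which matches the "no enforcement" caveat above.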
