In a Python method, I would like to have a local variable whose value persists between calls to the method.
This question shows how to declare such "static variables" (C++ terminology) inside functions. I tried to do the same in an instance method, and failed.
Here's a working minimal example that reproduces the problem. You can copy-paste it into an interpreter.
class SomeClass(object):
    def some_method(self):
        if not hasattr(SomeClass.some_method, 'some_static_var'):
            SomeClass.some_method.some_static_var = 1  # breaks here
        for i in range(3):
            print SomeClass.some_method.some_static_var
            SomeClass.some_method.some_static_var += 1

if __name__ == '__main__':
    some_instance = SomeClass()
    some_instance.some_method()
On the line labeled "# breaks here", I get:
AttributeError: 'instancemethod' object has no attribute 'some_static_var'
I realize there's an easy workaround, where I make some_static_var a member variable of SomeClass. However, the variable really has no use outside of the method, so I'd much prefer to keep it from cluttering up SomeClass' namespace if I could.
In Python 2, you have to deal with bound and unbound methods. Unlike functions, these do not have a __dict__ attribute:
#python 2
'__dict__' in dir(SomeClass.some_method)
Out[9]: False
def stuff():
    pass
'__dict__' in dir(stuff)
Out[11]: True
In Python 3, your code works fine! The concept of bound/unbound methods is gone; everything is a function.
#python 3
'__dict__' in dir(SomeClass.some_method)
Out[2]: True
Back to making your code work, you need to put the attribute on the thing which has a __dict__: the actual function:
if not hasattr(SomeClass.some_method.__func__, 'some_static_var'):
    # etc
Read more on im_func and __func__ here
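Applied to the question's example, the fix might look like the sketch below (Python 2; in Python 3 the original code already works):
class SomeClass(object):
    def some_method(self):
        # Reach through __func__ to the underlying function object,
        # which does have a __dict__.
        func = SomeClass.some_method.__func__
        if not hasattr(func, 'some_static_var'):
            func.some_static_var = 1
        for i in range(3):
            print func.some_static_var
            func.some_static_var += 1

some_instance = SomeClass()
some_instance.some_method()  # prints 1, 2, 3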
It is up to you to decide whether this makes your code more or less readable - for me, making these types of things class attributes is almost always the way to go; it doesn't matter that only one method is accessing said attribute, it's where I look for "static" type vars. I value readable code over clean namespaces.
This last paragraph was of course an editorial, everyone is entitled to their opinion :-)
You can't set attributes on method objects.
Creating class attributes instead (that is, SomeClass.some_var = 1) is the standard Python way. However, we might be able to suggest more appropriate fixes if you give us a high-level overview of your actual problem (what are you writing this code for?).
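For instance, a minimal sketch of the class-attribute approach applied to the question's counter:
class SomeClass(object):
    some_var = 1  # class attribute: one shared value that persists between calls

    def some_method(self):
        for i in range(3):
            print SomeClass.some_var
            SomeClass.some_var += 1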
Use the global keyword to access module-level variables:
my_static = None

class MyClass(object):
    def some_method(self):
        global my_static
        if my_static is None:
            my_static = 0
        else:
            my_static = my_static + 1
        print my_static

if __name__ == '__main__':
    instance = MyClass()
    instance.some_method()
    instance.some_method()
Outputs:
0
1
Although, as mentioned elsewhere, a class variable would be preferable.
I've gotten myself in trouble a few times now by accidentally (unintentionally) referencing global variables in a function or method definition.
My question is: is there any way to disallow python from letting me reference a global variable? Or at least warn me that I am referencing a global variable?
x = 123

def myfunc():
    print x  # throw a warning or something!!!
Let me add that the typical situation where this arises for me is using IPython as an interactive shell. I use execfile to execute a script that defines a class. In the interpreter, I access the class variable directly to do something useful, then decide I want to add that as a method in my class. When I was in the interpreter, I was referencing the class variable. However, when it becomes a method, it needs to reference self. Here's an example.
class MyClass:
    a = 1
    b = 2
    def add(self):
        return a + b

m = MyClass()
Now in my interpreter I run the script with execfile('script.py'), inspect my class, and type m.a * m.b. I decide that would be a useful method to have, so I modify my code with the unintentional copy/paste error:
class MyClass:
    a = 1
    b = 2
    def add(self):
        return a + b
    def mult(self):
        return m.a * m.b  # I really meant this to be self.a * self.b
This of course still executes in IPython, but it can really confuse me since it is now referencing the previously defined global variable!
Maybe someone has a suggestion given my typical IPython workflow.
First, you probably don't want to do this. As Martijn Pieters points out, many things, like top-level functions and classes, are globals.
You could filter this for only non-callable globals. Functions, classes, builtin-function-or-methods that you import from a C extension module, etc. are callable. You might also want to filter out modules (anything you import is a global). That still won't catch cases where you, say, assign a function to another name after the def. You could add some kind of whitelisting for that (which would also allow you to create global "constants" that you can use without warnings). Really, anything you come up with will be a very rough guide at best, not something you want to treat as an absolute warning.
Also, no matter how you do it, trying to detect implicit global access, but not explicit access (with a global statement) is going to be very hard, so hopefully that isn't important.
There is no obvious way to detect all implicit uses of global variables at the source level.
However, it's pretty easy to do with reflection from inside the interpreter.
The documentation for the inspect module has a nice chart that shows you the standard members of various types. Note that some of them have different names in Python 2.x and Python 3.x.
This function will get you a list of all the global names accessed by a bound method, unbound method, function, or code object in both versions:
def get_globals(thing):
    thing = getattr(thing, 'im_func', thing)
    thing = getattr(thing, '__func__', thing)
    thing = getattr(thing, 'func_code', thing)
    thing = getattr(thing, '__code__', thing)
    return thing.co_names
If you want to only handle non-callables, you can filter it:
def get_callable_globals(thing):
    thing = getattr(thing, 'im_func', thing)
    func_globals = getattr(thing, 'func_globals', {})
    thing = getattr(thing, 'func_code', thing)
    return [name for name in thing.co_names
            if callable(func_globals.get(name))]
This isn't perfect (e.g., if a function's globals have a custom builtins replacement, we won't look it up properly), but it's probably good enough.
A simple example of using it:
>>> def foo(myparam):
...     myglobal
...     mylocal = 1
>>> print get_globals(foo)
('myglobal',)
And you can pretty easily import a module and recursively walk its callables and call get_globals() on each one, which will work for the major cases (top-level functions, and methods of top-level and nested classes), although it won't work for anything defined dynamically (e.g., functions or classes defined inside functions).
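A rough sketch of such a walk (Python 2 names; report_module_globals is a made-up helper, and anything defined dynamically will be missed):
import inspect

def report_module_globals(module):
    # Top-level functions
    for name, func in inspect.getmembers(module, inspect.isfunction):
        print name, get_globals(func)
    # Methods of top-level classes
    for cls_name, cls in inspect.getmembers(module, inspect.isclass):
        for meth_name, meth in inspect.getmembers(cls, inspect.ismethod):
            print '%s.%s' % (cls_name, meth_name), get_globals(meth)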
If you only care about CPython, another option is to use the dis module to scan all the bytecode in a module, or .pyc file (or class, or whatever), and log each LOAD_GLOBAL op.
One major advantage of this over the inspect method is that it will find functions that have been compiled, even if they haven't been created yet.
The disadvantage is that there is no way to look up the names (how could there be, if some of them haven't even been created yet?), so you can't easily filter out callables. You can try to do something fancy, like connecting up LOAD_GLOBAL ops to corresponding CALL_FUNCTION (and related) ops, but… that's starting to get pretty complicated.
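Here is a minimal sketch of the simple, unfiltered scan, for Python 3.4+ where dis.get_instructions makes this easy; recursing into co_consts is what finds functions that are compiled but not yet created (log_load_globals is a made-up name):
import dis

def log_load_globals(code, where='<module>'):
    for instr in dis.get_instructions(code):
        if instr.opname == 'LOAD_GLOBAL':
            print('%s: LOAD_GLOBAL %s' % (where, instr.argval))
    for const in code.co_consts:
        if hasattr(const, 'co_code'):  # a nested code object
            log_load_globals(const, where + '.' + const.co_name)

# e.g., scan a whole source file:
# log_load_globals(compile(open('mymodule.py').read(), 'mymodule.py', 'exec'))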
Finally, if you want to hook things dynamically, you can always replace globals with a wrapper that warns every time you access it. For example:
import collections
import sys

class GlobalsWrapper(collections.MutableMapping):
    def __init__(self, globaldict):
        self.globaldict = globaldict
    # ... implement at least __setitem__, __delitem__, __iter__, __len__
    # in the obvious way, by delegating to self.globaldict
    def __getitem__(self, key):
        print >>sys.stderr, 'Warning: accessing global "{}"'.format(key)
        return self.globaldict[key]

globals_wrapper = GlobalsWrapper(globals())
Again, you can filter on non-callables pretty easily:
def __getitem__(self, key):
    value = self.globaldict[key]
    if not callable(value):
        print >>sys.stderr, 'Warning: accessing global "{}"'.format(key)
    return value
Obviously for Python 3 you'd need to change the print statement to a print function call.
You can also raise an exception instead of warning pretty easily. Or you might want to consider using the warnings module.
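With the warnings module, the filtered __getitem__ might look something like this (a sketch):
import warnings

def __getitem__(self, key):
    value = self.globaldict[key]
    if not callable(value):
        warnings.warn('accessing global "{}"'.format(key), stacklevel=2)
    return value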
You can hook this into your code in various different ways. The most obvious one is an import hook that gives each new module a GlobalsWrapper around its normally-built globals. I'm not sure how that will interact with C extension modules, but my guess is that it will either work or be harmlessly ignored, either of which is probably fine. The only problem is that this won't affect your top-level script. If that's important, you can write a wrapper script that execfiles the main script with a GlobalsWrapper, or something like that.
I've been struggling with a similar challenge (especially in Jupyter notebooks) and created a small package to limit the scope of functions.
>>> from localscope import localscope
>>> a = 'hello world'
>>> @localscope
... def print_a():
...     print(a)
Traceback (most recent call last):
...
ValueError: `a` is not a permitted global
The @localscope decorator uses Python's disassembler to find all instances of the decorated function using a LOAD_GLOBAL (global variable access) or LOAD_DEREF (closure access) statement. If the variable to be loaded is a builtin function, is explicitly listed as an exception, or satisfies a predicate, the variable is permitted. Otherwise, an exception is raised.
Note that the decorator analyses the code statically. Consequently, it does not have access to the values of variables accessed by closure.
I really hope this is not a question posed by millions of newbies, but my search didn't really give me a satisfying answer.
So my question is fairly simple: are classes basically containers for functions with their own namespace? What other purposes do they serve besides providing a separate namespace and holding functions while making them callable as class attributes? I'm asking in a Python context.
Oh, and thanks for the great help most of you have been!
More important than holding functions, class instances hold data attributes, allowing you to define new data types beyond what is built into the language; and
they support inheritance and duck typing.
For example, here's a moderately useful class. Since Python files (created with open) don't remember their own name, let's make a file class that does.
class NamedFile(object):
    def __init__(self, name):
        self._f = open(name)
        self.name = name
    def readline(self):
        return self._f.readline()
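Usage might look like this (the filename is just a placeholder):
nf = NamedFile('example.txt')
print nf.name        # the file remembers its own name
print nf.readline()  # delegated to the underlying file object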
Had Python not had classes, you'd probably be working with dicts instead:
def open_file(name):
    return {"name": name, "f": open(name)}
Needless to say, calling myfile["f"].readline() all the time will cause your fingers to hurt at some point. You could of course introduce a function readline in a NamedFile module (namespace), but then you'd always have to use that exact function. By contrast, NamedFile instances can be used anywhere you need an object with a readline method, so they are a plug-in replacement for file in many situations. That's called polymorphism, one of the biggest benefits of OO/class-based programming.
(Also, dict is a class, so using it violates the assumption that there are no classes :)
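To see that polymorphism in action, note that any function relying only on a readline method works with plain files and NamedFile instances alike (a sketch, reusing the hypothetical example.txt):
def first_line(source):
    return source.readline()

first_line(open('example.txt'))       # a plain file
first_line(NamedFile('example.txt'))  # a NamedFile: same interface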
In most languages, classes are just pieces of code that describe how to produce an object. That's kinda true in Python too:
>>> class ObjectCreator(object):
...     pass
...
>>> my_object = ObjectCreator()
>>> print my_object
<__main__.ObjectCreator object at 0x8974f2c>
But classes are more than that in Python. Classes are objects too.
Yes, objects.
As soon as you use the keyword class, Python executes it and creates an OBJECT. The instruction:
>>> class ObjectCreator(object):
...     pass
...
creates in memory an object with the name ObjectCreator.
This object (the class) is itself capable of creating objects (the instances), and this is why it's a class.
But still, it's an object, and therefore:
you can assign it to a variable
you can copy it
you can add attributes to it
you can pass it as a function parameter
e.g.:
>>> print ObjectCreator # you can print a class because it's an object
<class '__main__.ObjectCreator'>
>>> def echo(o):
...     print o
...
>>> echo(ObjectCreator) # you can pass a class as a parameter
<class '__main__.ObjectCreator'>
>>> print hasattr(ObjectCreator, 'new_attribute')
False
>>> ObjectCreator.new_attribute = 'foo' # you can add attributes to a class
>>> print hasattr(ObjectCreator, 'new_attribute')
True
>>> print ObjectCreator.new_attribute
foo
>>> ObjectCreatorMirror = ObjectCreator # you can assign a class to a variable
>>> print ObjectCreatorMirror.new_attribute
foo
>>> print ObjectCreatorMirror()
<__main__.ObjectCreator object at 0x8997b4c>
Classes (or objects) are used to provide encapsulation of data and operations that can be performed on that data.
They don't provide namespacing in Python per se; module imports provide the same type of stuff and a module can be entirely functional rather than object oriented.
You might gain some benefit from looking at OOP With Python, Dive into Python, Chapter 5. Objects and Object Oriented Programming or even just the Wikipedia article on object oriented programming
A class is the definition of an object. In this sense, the class provides a namespace of sorts, but that is not the true purpose of a class. The true purpose is to define what the object will 'look like' - what the object is capable of doing (methods) and what it will know (properties).
Note that my answer is intended to provide a sense of understanding on a relatively non-technical level, which is what my initial trouble was with understanding classes. I'm sure there will be many other great answers to this question; I hope this one adds to your overall understanding.
Is there any way to get the original object from a weakproxy pointing to it? I.e., is there an inverse to weakref.proxy()?
A simplified example (Python 2.7):
import weakref
class C(object):
def __init__(self, other):
self.other = weakref.proxy(other)
class Other(object):
pass
others = [Other() for i in xrange(3)]
my_list = [C(others[i % len(others)]) for i in xrange(10)]
I need to get the list of unique other members from my_list. The way I prefer for such tasks is to use a set:
unique_others = {x.other for x in my_list}
Unfortunately this throws TypeError: unhashable type: 'weakproxy'
I have managed to solve the specific problem in an imperative way (slow and dirty):
unique_others = []
for x in my_list:
    if x.other in unique_others:
        continue
    unique_others.append(x.other)
but the general problem noted in the title is still open.
What if I have only my_list under control, and the others are buried in some lib and someone may delete them at any time, and I want to prevent the deletion by collecting non-weak refs in a list?
Or I may want to get the repr() of the object itself, not <weakproxy at xx to Other at xx>
I guess there should be something like weakref.unproxy that I'm not aware of.
I know this is an old question but I was looking for an answer recently and came up with something. Like others said, there is no documented way to do it and looking at the implementation of weakproxy type confirms that there is no standard way to achieve this.
My solution uses the fact that all Python objects have a set of standard methods (like __repr__) and that bound method objects contain a reference to the instance (in __self__ attribute).
Therefore, by dereferencing the proxy to get the method object, we can get a strong reference to the proxied object from the method object.
Example:
>>> def func():
... pass
...
>>> weakfunc = weakref.proxy(func)
>>> f = weakfunc.__repr__.__self__
>>> f is func
True
Another nice thing is that it will work for strong references as well:
>>> func.__repr__.__self__ is func
True
So there's no need for type checks if either a proxy or a strong reference could be expected.
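Applied to the question's example, the set comprehension would become (a sketch using the trick above):
# each x.other is a weakproxy; __repr__.__self__ yields the real Other instance
unique_others = {x.other.__repr__.__self__ for x in my_list}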
Edit:
I just noticed that this doesn't work for proxies of classes. This is not universal then.
Basically there is something like weakref.unproxy, but it's just named weakref.ref(x)().
The proxy object is only there for delegation and the implementation is rather shaky...
The == operator doesn't behave as you would expect:
>>> weakref.proxy(object) == object
False
>>> weakref.proxy(object) == weakref.proxy(object)
True
>>> weakref.proxy(object).__eq__(object)
True
However, I see that you don't want to call weakref.ref objects all the time. A good working proxy with dereference support would be nice.
But at the moment, this is just not possible. If you look into the CPython source code, you see that you would need something like PyWeakref_GetObject, but there is just no call to this method at all (and it raises PyErr_BadInternalCall if the argument is wrong, so it seems to be an internal function). PyWeakref_GET_OBJECT is used much more, but there is no method in weakref.py that could do that.
So, sorry to disappoint you, but weakref.proxy is just not what most people would want for their use cases. You can, however, make your own proxy implementation. It isn't too hard. Just use weakref.ref internally and override __getattr__, __repr__, etc.
On a little sidenote on how PyCharm is able to produce the normal repr output (Because you mentioned that in a comment):
>>> class A(): pass
>>> a = A()
>>> weakref.proxy(a)
<weakproxy at 0x7fcf7885d470 to A at 0x1410990>
>>> weakref.proxy(a).__repr__()
'<__main__.A object at 0x1410990>'
>>> type( weakref.proxy(a))
<type 'weakproxy'>
As you can see, calling the original __repr__ can really help!
weakref.ref is hashable, whereas weakref.proxy is not. The API doesn't say anything about how you can actually get a handle on the object a proxy points to. With weakref.ref, it's easy: you can just call it. As such, you can roll your own proxy-like class... Here's a very basic attempt:
import weakref

class C(object):
    def __init__(self, obj):
        self.object = weakref.ref(obj)
    def __getattr__(self, key):
        # object has no __getattr__; use __getattribute__ to bypass
        # this method and avoid infinite recursion
        if key == "object":
            return object.__getattribute__(self, "object")
        elif key == "__init__":
            return object.__getattribute__(self, "__init__")
        else:
            obj = object.__getattribute__(self, "object")()  # dereference the weakref
            return getattr(obj, key)
class Other(object):
    pass

others = [Other() for i in range(3)]
my_list = [C(others[i % len(others)]) for i in range(10)]
unique_list = {x.object for x in my_list}
Of course, now unique_list contains weak refs, not proxies, which is fundamentally different...
I know that this is an old question, but I've been bitten by it (so, there's no real 'unproxy' in the standard library) and wanted to share my solution...
The way I solved it to get the real instance was just creating a property which returns it (although I suggest using weakref.ref instead of weakref.proxy, as code should really check whether the referent is still alive before accessing it, instead of having to remember to catch an exception whenever any attribute is accessed).
Anyways, if you still must use a proxy, the code to get the real instance is:
import weakref

class MyClass(object):
    @property
    def real_self(self):
        return self

instance = MyClass()
proxied = weakref.proxy(instance)
assert proxied.real_self is instance
I've got a bunch of functions (outside of any class) where I've set attributes on them, like funcname.fields = 'xxx'. I was hoping I could then access these variables from inside the function with self.fields, but of course it tells me:
global name 'self' is not defined
So... what can I do? Is there some magic variable I can access? Like __this__.fields?
A few people have asked "why?". You will probably disagree with my reasoning, but I have a set of functions that all must share the same signature (accept only one argument). For the most part, this one argument is enough to do the required computation. However, in a few limited cases, some additional information is needed. Rather than forcing every function to accept a long list of mostly unused variables, I've decided to just set them on the function so that they can easily be ignored.
Although, it occurs to me now that you could just use **kwargs as the last argument if you don't care about the additional args. Oh well...
Edit: Actually, some of the functions I didn't write, and would rather not modify to accept the extra args. By "passing in" the additional args as attributes, my code can work both with my custom functions that take advantage of the extra args, and with third party code that don't require the extra args.
Thanks for the speedy answers :)
self isn't a keyword in Python, it's just a normal variable name. When creating instance methods, you can name the first parameter whatever you want; self is just a convention.
You should almost always prefer passing arguments to functions over setting properties for input, but if you must, you can do so using the actual function's name to access variables within it:
def a():
    if a.foo:
        # blah
        a.foo = False

a.foo = True
a()
see python function attributes - uses and abuses for when this comes in handy. :D
def foo():
    print(foo.fields)

foo.fields = [1, 2, 3]
foo()
# [1, 2, 3]
There is nothing wrong with adding attributes to functions. Many memoizers use this to cache results in the function itself.
For example, notice the use of func.cache:
from decorator import decorator

@decorator
def memoize(func, *args, **kw):
    # Author: Michele Simoniato
    # Source: http://pypi.python.org/pypi/decorator
    if not hasattr(func, 'cache'):
        func.cache = {}
    if kw:  # frozenset is used to ensure hashability
        key = args, frozenset(kw.iteritems())
    else:
        key = args
    cache = func.cache  # attribute added by memoize
    if key in cache:
        return cache[key]
    else:
        cache[key] = result = func(*args, **kw)
        return result
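A quick usage sketch of that memoizer (fib here is just an illustration):
@memoize
def fib(n):
    return n if n < 2 else fib(n - 1) + fib(n - 2)

print fib(100)  # fast, because the recursive calls hit func.cache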
You can't do that "function accessing its own attributes" correctly for all situations - see for details here how can python function access its own attributes? - but here is a quick demonstration:
>>> def f(): return f.x
...
>>> f.x = 7
>>> f()
7
>>> g = f
>>> g()
7
>>> del f
>>> g()
Traceback (most recent call last):
File "<interactive input>", line 1, in <module>
File "<interactive input>", line 1, in f
NameError: global name 'f' is not defined
Basically, most methods directly or indirectly rely on accessing the function object through lookup by name in globals; if the original function name is deleted, this stops working. There are other kludgey ways of accomplishing this, like defining a class, or a factory - but thanks to your explanation it is clear you don't really need that.
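For completeness, here is roughly what the factory approach looks like: the inner function refers to itself through a closure cell rather than a global name, so rebinding or deleting outer names no longer breaks it (a sketch; make_f is a made-up name):
def make_f():
    def f():
        return f.x  # 'f' resolves in the enclosing scope, not in globals
    f.x = 7
    return f

g = make_f()
print g()  # 7, even though no global named 'f' exists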
Just do the mentioned keyword catch-all argument, like so:
def fn1(oneArg):
    # do the deed
    pass

def fn2(oneArg, **kw):
    if 'option1' in kw:
        print 'called with option1=', kw['option1']
    # do the rest

fn2(42)
fn2(42, option1='something')
Not sure what you mean in your comment about handling TypeError - that won't arise when using **kw. This approach works very well for some Python system functions - check min(), max(), sort(). Recently sorted(dct, key=dct.get, reverse=True) came in very handy for me in a CodeGolf challenge :)
Example:
>>> def x(): pass
>>> x
<function x at 0x100451050>
>>> x.hello = "World"
>>> x.hello
"World"
You can set attributes on functions, as these are just plain objects, but I've actually never seen something like this in real code.
Plus, self is not a keyword, just another variable name, which happens to be the particular instance of the class. self is passed implicitly, but received explicitly.
If you want globally set parameters for a callable 'thing', you could always create a class and implement the __call__ method.
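A minimal sketch of that idea (the class and parameter names are made up):
class Scaler(object):
    def __init__(self, factor):
        self.factor = factor  # the 'globally set' parameter lives on the instance
    def __call__(self, value):
        return value * self.factor

double = Scaler(2)
print double(21)  # 42 -- the instance is callable like a plain function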
There is no special way, within a function's body, to refer to the function object whose code is executing. Simplest is just to use funcname.field (with funcname being the function's name within the namespace it's in, which you indicate is the case -- it would be harder otherwise).
This isn't something you should do. I can't think of any way to do what you're asking except some walking around on the call stack and some weird introspection -- which isn't something that should happen in production code.
That said, I think this actually does what you asked:
import inspect

_code_to_func = dict()

def enable_function_self(f):
    _code_to_func[f.func_code] = f
    return f

def get_function_self():
    f = inspect.currentframe()
    code_obj = f.f_back.f_code
    return _code_to_func[code_obj]

@enable_function_self
def foo():
    me = get_function_self()
    print me

foo()
While I agree with the rest that this is probably not good design, the question did intrigue me. Here's my first solution, which I may update once I get decorators working. As it stands, it relies pretty heavily on being able to read the stack, which may not be possible in all implementations (something about sys._getframe() not necessarily being present...)
import sys, inspect

def cute():
    this = sys.modules[__name__].__dict__.get(inspect.stack()[0][3])
    print "My face is..." + this.face

cute.face = "very cute"
cute()
What do you think? :3
You could use the following (hideously ugly) code:
class Generic_Object(object):
    pass

def foo(a1, a2, self=Generic_Object()):
    self.args = (a1, a2)
    print "len(self.args):", len(self.args)
    return None
... as you can see it would allow you to use "self" as you described. You can't use an "object()" directly because you can't "monkey patch(*)" values into an object() instance. However, normal subclasses of object (such as the Generic_Object() I've shown here) can be "monkey patched"
If you wanted to always call your function with a reference to some object as the first argument that would be possible. You could put the defaulted argument first, followed by a *args and optional **kwargs parameters (through which any other arguments or dictionaries of options could be passed during calls to this function).
This is, as I said hideously ugly. Please don't ever publish any code like this or share it with anyone in the Python community. I'm only showing it here as a sort of strange educational exercise.
An instance method is like a function in Python. However, it exists within the namespace of a class (thus it must be accessed via an instance, myobject.foo() for example) and it is called with a reference to "self" (analogous to the "this" pointer in C++) as the first argument. Also, there's a method resolution process which causes the interpreter to search the namespace of the instance, then its class, and then each of the parent classes and so on, up through the inheritance tree.
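For instance, a small sketch of that lookup order (made-up class names):
class Base(object):
    def greet(self):
        return 'hello'

class Child(Base):
    pass

c = Child()
print c.greet()      # found on Base, after checking the instance and Child
print Child.__mro__  # (Child, Base, object) -- the search order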
An unbound function is called with whatever arguments you pass to it. There can't be any sort of automatically prepended object/instance reference in the argument list. Thus, writing a function with an initial argument named "self" is meaningless. (It's legal because Python doesn't place any special meaning on the name "self." But it's meaningless because callers to your function would have to manually supply some sort of object reference to the argument list, and it's not at all clear what that should be. Just some bizarre "Generic_Object" which then floats around in the global variable space?)
I hope that clarifies things a bit. It sounds like you're suffering from some very fundamental misconceptions about how Python and other object-oriented systems work.
("Monkey patching" is a term used to describe the direct manipulation of an objects attributes -- or "instance variables" by code that is not part of the class hierarchy of which the object is an instance).
As another alternative, you can make the functions into bound class methods like so:
class _FooImpl(object):
    a = "Hello "

    @classmethod
    def foo(cls, param):
        return cls.a + param

foo = _FooImpl.foo

# later...
print foo("World")  # yes, Hello World

# and if you have to change an attribute:
foo.im_self.a = "Goodbye "
If you want functions to share attribute namespaces, you just make them part of the same class. If not, give each its own class.
What exactly are you hoping "self" would point to, if the function is defined outside of any class? If your function needs some global information to execute properly, you need to send this information to the function in the form of an argument.
If you want your function to be context aware, you need to declare it within the scope of an object.