I recently spent way too long debugging a piece of code, only to realize that the issue was I did not include a () after a command. What is the logic behind which commands require a () and which do not?
For example:
import pandas as pd
col1=['a','b','c','d','e']
col2=[1,2,3,4,5]
df=pd.DataFrame(list(zip(col1,col2)),columns=['col1','col2'])
df.columns
Returns Index(['col1', 'col2'], dtype='object') as expected. If we use .columns() we get an error.
For other commands it is the opposite:
df.isna()
Returns:
    col1   col2
0  False  False
1  False  False
2  False  False
3  False  False
4  False  False
but df.isna returns:
<bound method DataFrame.isna of   col1  col2
0    a     1
1    b     2
2    c     3
3    d     4
4    e     5>
Which, while not throwing an error, is clearly not what we're looking for.
What's the logic behind which commands use a () and which do not?
I use pandas as an example here, but I think this is relevant to python more generally.
Because functions need parentheses for their arguments, while variables do not. That's why it's list.append(<item>) with parentheses, but df.columns without.
If you refer to a function without the parentheses, like list.append, what you get back is the function object itself, which the interpreter displays as a description of what it is, not the result of what it does.
As for classes, calling a class with parentheses instantiates an object of that class, while referring to a class without the parentheses points to the class itself. If you were to execute print(SomeClass) you'd get <class '__main__.SomeClass'>, which is a description of what it is, the same kind of response you'd get if you referred to a function without parentheses.
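A minimal sketch of that difference, using only built-ins. SomeClass and the variable names are throwaway names for illustration:

```python
class SomeClass:
    pass

items = [1, 2]

# Without parentheses we only get the object itself.
method = items.append
print(method)       # something like <built-in method append of list object at 0x...>
print(SomeClass)    # the class object, not an instance

# With parentheses we actually call it.
method(3)                 # same effect as items.append(3)
instance = SomeClass()    # creates an instance of the class

print(items)                             # [1, 2, 3]
print(isinstance(instance, SomeClass))   # True
```

Nothing happens to the list until the parentheses are added; before that, method is just another name for the function.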
What's the logic behind which commands use a () and which do not?
An object needs to have a __call__ method associated with it for it to be called as a function using ():
class Test:
    def __call__(self, arg):
        print("Called with", arg)

t = Test()  # Calling the Test class object uses __call__ (on its type) to create an instance
t(5)        # Then this line prints "Called with 5"
So, the difference is that the object you get from df.columns (an Index instance) doesn't have a __call__ method defined, while the Index and DataFrame classes, and bound methods such as df.isna, do.
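If you are unsure whether something can take (), the built-in callable() tells you before you try. A small sketch, where Plain is a made-up class that deliberately lacks __call__:

```python
class Test:
    def __call__(self, arg):
        return "Called with {}".format(arg)


class Plain:
    """No __call__ defined, so instances cannot be called."""


t = Test()
p = Plain()

print(callable(Test))   # True: classes are callable (calling one creates an instance)
print(callable(t))      # True: Test defines __call__
print(callable(p))      # False: Plain does not

try:
    p()
except TypeError as exc:
    error_message = str(exc)
    print(error_message)   # 'Plain' object is not callable
```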
TL;DR you just kinda have to know
Nominally, the parens are needed to call a function instead of just returning an object.
foo.bar # get the bar object
foo.bar() # call the bar object
Callable objects have a __call__ method. When python sees the (), it knows to call __call__. This is done at the C level.
In addition, python has the concept of a property. It's an attribute backed by function calls that looks like a regular data attribute.
class Foo:
    def __init__(self):
        self._foo = "foo"

    @property
    def foo(self):
        return "I am " + self._foo

    @foo.setter
    def foo(self, val):
        assert isinstance(val, str)
        self._foo = val + " you bet"

f = Foo()
f.foo = "Hello"  # calls setter
print(f.foo)     # calls getter
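The same pattern explains how a class can offer both a parenthesis-free name like columns and a method like isna(). The sketch below uses a hypothetical MiniFrame class; it only illustrates the general idea and is not how pandas is actually implemented:

```python
class MiniFrame:
    """Hypothetical stand-in for a data frame; not pandas' real implementation."""

    def __init__(self, data):
        self._data = data  # dict mapping column name -> list of values

    @property
    def columns(self):
        # Computed on access, but read like plain data: frame.columns
        return list(self._data)

    def isna(self):
        # A regular method: it only runs when called, frame.isna()
        return {name: [value is None for value in values]
                for name, values in self._data.items()}


frame = MiniFrame({"col1": ["a", None], "col2": [1, 2]})
print(frame.columns)   # ['col1', 'col2']
print(frame.isna())    # {'col1': [False, True], 'col2': [False, False]}

try:
    frame.columns()    # this tries to call the returned list
except TypeError as exc:
    print(exc)         # 'list' object is not callable
```

Note that the error from frame.columns() comes from calling the value the property returned, not from the property itself.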
Similarly, when python sees array notation foo[1] it will call an object's __getitem__ or __setitem__ methods and the object is free to overload that call in any way it sees fit.
Finally, the object itself can intercept attribute access with __getattr__, __getattribute__ and __setattr__ methods, leaving everything up in the air. In fact, python doesn't really know what getting and setting attributes means. It is calling these methods. Most objects just use the default versions inherited from object. If the class is implemented in C, there is no end to what could be going on in the background.
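To make those hooks concrete, here is a small sketch of a class that customizes both indexing and attribute access. Record and its field names are invented for the example:

```python
class Record:
    """Hypothetical class that customizes indexing and attribute lookup."""

    def __init__(self, **fields):
        self._fields = dict(fields)

    def __getitem__(self, key):
        # Runs for record[key]
        return self._fields[key]

    def __getattr__(self, name):
        # Runs only when normal attribute lookup fails
        try:
            return self._fields[name]
        except KeyError:
            raise AttributeError(name) from None


r = Record(x=1, y=2)
print(r["x"])           # 1, via __getitem__
print(r.y)              # 2, via __getattr__
print(hasattr(r, "z"))  # False, __getattr__ raised AttributeError
```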
Python is a dynamic language and many packages add abstractions to make it easier (?) to use their services. The downside is that you may spend more time with help text and documentation than you would like.
Object method vs. object attribute.
Objects have methods and attributes.
Methods require parentheses to call them, even if the method does not require arguments.
Attributes, on the other hand, are like variables: names that point to objects as the program progresses. You access these attributes just by their name (without parentheses). Of course, you may have to qualify both methods and attributes with the object name as required.
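When the documentation isn't at hand, a quick check is to fetch the name and ask whether the result is callable. The describe helper below is a made-up convenience, and it is only a rough guide, since an attribute can itself hold a callable object:

```python
def describe(obj, name):
    """Report whether obj.<name> looks like a method or a plain attribute."""
    value = getattr(obj, name)
    if callable(value):
        kind = "method (call it with parentheses)"
    else:
        kind = "attribute (no parentheses)"
    return "{}: {}".format(name, kind)


z = 3 + 4j
print(describe(z, "real"))        # real: attribute (no parentheses)
print(describe(z, "conjugate"))   # conjugate: method (call it with parentheses)
```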
I have following class with a function:
class A:
    def myfn():
        print("In myfn method.")
Here, the function does not have self as an argument. It also does not have @classmethod or @staticmethod as a decorator. However, it works if called on the class:
A.myfn()
Output:
In myfn method.
But it gives an error if called from any instance:
a = A()
a.myfn()
Error output:
Traceback (most recent call last):
  File "testing.py", line 16, in <module>
    a.myfn()
TypeError: myfn() takes 0 positional arguments but 1 was given
probably because self was also sent as an argument.
What kind of function will this be called? Will it be a static function? Is it advisable to use function like this in classes? What is the drawback?
Edit: This function works only when called with class and not with object/instance. My main question is what is such a function called?
Edit2: It seems from the answers that this type of function, despite being the simplest form, is not accepted as legal. However, as no serious drawback is mentioned in any of the many answers, I find this can be a useful construct, especially to group my own static functions in a class that I can call as needed. I would not need to create any instance of this class. At the least, it saves me from typing @staticmethod every time and makes code look less complex. It also gets derived neatly for someone to extend my class. Although all such functions can be kept at top/global level, keeping them in a class is more modular. However, I feel there should be a specific name for such a simple construct which works in this specific way and it should be recognized as legal. It may also help beginners understand why the self argument is needed for usual functions in a Python class. This will only add to the simplicity of this great language.
The function type implements the descriptor protocol, which means when you access myfn via the class or an instance of the class, you don't get the actual function back; you get instead the result of that function's __get__ method. That is,
A.myfn == A.myfn.__get__(None, A)
Here, myfn is an instance method, though one that hasn't been defined properly to be used as such. When accessed via the class, though, the return value of __get__ is simply the function object itself, and the function can be called the same as a static method.
Access via an instance results in a different call to __get__. If a is an instance of A, then
a.myfn() == A.myfn.__get__(a, A)()
Here, __get__ returns, essentially, a partial application of myfn to a, but because myfn doesn't take any arguments, calling it fails.
You might ask, what is a static method? staticmethod is a type that wraps a function and defines its own __get__ method. That method returns the underlying function whether or not the attribute is accessed via the class or an instance. Otherwise, there is very little difference between a static method and an ordinary function.
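The lookups described above can be checked directly in Python 3. This is only a sketch: static_fn is an extra method added here for comparison, and raw is just a local name for the function object stored in the class dictionary:

```python
class A:
    def myfn():
        return "In myfn method."

    @staticmethod
    def static_fn():
        return "In static_fn."


a = A()
raw = A.__dict__["myfn"]   # the plain function stored on the class

# Access through the class returns the function unchanged.
print(A.myfn is raw)                  # True
print(raw.__get__(None, A) is raw)    # True

# Access through an instance returns a bound method wrapping the function.
bound = raw.__get__(a, A)
print(bound.__self__ is a)            # True
print(bound.__func__ is raw)          # True

# A staticmethod's __get__ hands back the underlying function either way.
print(A.static_fn())                  # In static_fn.
print(a.static_fn())                  # In static_fn.
```

Calling bound() is what fails with the TypeError, because the bound method passes a as the first argument to a function that accepts none.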
This is not a true method. Correctly declared instance methods should have a self argument (the name is only a convention and can be changed if you want hard-to-read code), and classmethods and staticmethods should be introduced by their respective decorators.
But at a lower level, def in a class declaration just creates a function and assigns it to a class member. That is exactly what happens here: A.myfn is a function and can successfully be called as A.myfn().
But as it was not declared with @staticmethod, it is not a true static method and it cannot be applied on an A instance. Python sees a member of that name that happens to be a function which is neither a static nor a class method, so it prepends the current instance to the list of arguments and tries to execute it.
To answer your exact question, this is not a method but just a function that happens to be assigned to a class member.
Such a function isn't the same as what @staticmethod provides, but is indeed a static method of sorts.
With @staticmethod you can also call the static method on an instance of the class. If A is a class and A.a is a static method, you'll be able to do both A.a() and A().a(). Without this decorator, only the first example will work, because for the second one, as you correctly noticed, "self [will] also [be] sent as an argument":
class A:
    @staticmethod
    def a():
        return 1
Running this:
>>> A.a() # `A` is the class itself
1
>>> A().a() # `A()` is an instance of the class `A`
1
On the other hand:
class B:
    def b():
        return 2
Now, the second version doesn't work:
>>> B.b()
2
>>> B().b()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: b() takes 0 positional arguments but 1 was given
Further to @chepnet's answer, if you define a class whose objects implement the descriptor protocol like:
class Descr:
    def __get__(self, obj, type=None):
        print('get', obj, type)

    def __set__(self, obj, value):
        print('set', obj, value)

    def __delete__(self, obj):
        print('delete', obj)
you can embed an instance of this in a class and invoke various operations on it:
class Foo:
    foo = Descr()

Foo.foo
obj = Foo()
obj.foo
which outputs:
get None <class '__main__.Foo'>
get <__main__.Foo object at 0x106d4f9b0> <class '__main__.Foo'>
as functions also implement the descriptor protocol, we can replay this by doing:
def bar():
    pass

print(bar)
print(bar.__get__(None, Foo))
print(bar.__get__(obj, Foo))
which outputs:
<function bar at 0x1062da730>
<function bar at 0x1062da730>
<bound method bar of <__main__.Foo object at 0x106d4f9b0>>
hopefully that complements chepnet's answer which I found a little terse/opaque
I created some test code, but I can't really understand why it works.
Shouldn't moo be defined before we can use it?
#!/usr/bin/python3
class Test():
    def __init__(self):
        self.printer = None

    def foo(self):
        self.printer = self.moo
        self.printer()

    def moo(self):
        print("Y u printing?")

test = Test()
test.foo()
Output:
$ python test.py
Y u printing?
I know that the rule is define earlier, not higher, but in this case it's neither of those.
There's really nothing to be confused about here.
We have a function that says "when you call foo with a self parameter, look up moo in self's namespace, assign that value to printer in self's namespace, look up printer in self's namespace, and call that value". [1]
Unless/until you call that function, it doesn't matter whether or not anyone anywhere has an attribute named moo.
When you do call that method, whatever you pass as the self had better have a moo attribute or you're going to get an AttributeError. But this is no different from looking up an attribute on any object. If you write def spam(n): return n.bit_length() as a global function, when you call that function, whatever you pass as the n had better have a bit_length attribute or you're going to get an AttributeError.
So, we're calling it as test.foo(), so we're passing test as self. If you know how attribute lookup works (and there are already plenty of questions and answers on SO about that), you can trace this through. Slightly oversimplified:
Does test.__dict__ have a 'moo'? No.
Does type(test).__dict__ have a 'moo'? Yes. So we're done.
Again, this is the same way we check if 3 has a bit_length() method; there's no extra magic here.
That's really all there is to it.
In particular, notice that test.__dict__ does not have a 'moo'. Methods don't get created at construction time (__new__) any more than they get created at initialization time (__init__). The instance doesn't have any methods in it, because it doesn't have to; they can be looked up on the type. [2]
Sure, we could get into descriptors, and method resolution order, and object.__getattribute__, and how class and def statements are compiled and executed, and special method lookup to see if there's a custom __getattribute__, and so on, but you don't need any of that to understand this question.
1. If you're confused by this, it's probably because you're thinking in terms of semi-OO languages like C++ and its descendants, where a class has to specify all of its instances' attributes and methods, so the compiler can look at this->moo(), work out that this has a static type of Foo, work out that moo is the third method defined on Foo, and compile it into something like this->vptr[2](). If that's what you're expecting, forget all of it. In Python, methods are just attributes, and attributes are just looked up, by name, on demand.
2. If you're going to ask "then why is a bound method not the same thing as a function?", the answer is descriptors. Briefly: when an attribute is found on the type, Python calls the value's __get__ method, passing it the instance, and function objects' __get__ methods return method objects. So, if you want to refer specifically to bound method objects, then they get created every time a method is looked up. In particular, the bound method object does not exist yet when we call foo; it gets created by looking up self.moo inside foo.
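Both points, that the instance holds no methods and that a bound method is built on every lookup, can be verified with a few lines. A sketch with a cut-down Test class (moo returns its text here instead of printing, purely to make it easy to check):

```python
class Test:
    def moo(self):
        return "Y u printing?"


test = Test()

print("moo" in vars(test))   # False: nothing is stored on the instance
print("moo" in vars(Test))   # True: the function lives on the class

first = test.moo
second = test.moo
print(first == second)       # True: same function bound to the same instance
print(first is second)       # False: each lookup built a new bound method object
print(first.__func__ is vars(Test)["moo"])   # True: both wrap the class's function
```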
While all that @scharette says is likely true (I don't know enough of Python internals to agree with confidence :) ), I'd like to propose an alternative explanation as to why one can instantiate Test and call foo():
The method's body is not executed until you actually call it. It does not matter if foo() contains references to undefined attributes, it will be parsed fine. As long as you create moo before you call foo, you're ok.
Try entering a truncated Test class in your interpreter:
class Test():
    def __init__(self):
        self.printer = None

    def foo(self):
        self.printer = self.moo
        self.printer()
No moo, so we get this:
>>> test = Test()
>>> test.foo()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<stdin>", line 5, in foo
AttributeError: 'Test' object has no attribute 'moo'
Let's add moo to the class now:
>>> def moo(self):
...     print("Y u printing?")
...
>>> Test.moo = moo
>>> test1 = Test()
>>> test1.foo()
Y u printing?
>>>
Alternatively, you can add moo directly to the instance:
>>> def moo():
...     print("Y u printing?")
...
>>> test.moo = moo
>>> test.foo()
Y u printing?
The only difference is that the instance's moo does not take a self (see here for explanation).
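If you do want an instance-level function that still receives self, one option is to bind it explicitly with types.MethodType from the standard library. A sketch under that assumption, with moo returning a string rather than printing so the result is easy to inspect:

```python
import types


class Test:
    def foo(self):
        self.printer = self.moo
        return self.printer()


def moo(self):
    return "Y u printing? ({})".format(type(self).__name__)


test = Test()
# Functions stored on an instance are not bound automatically,
# so bind explicitly to keep the self parameter.
test.moo = types.MethodType(moo, test)
print(test.foo())   # Y u printing? (Test)
```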
This question already has answers here:
Overriding special methods on an instance
(5 answers)
Closed 5 years ago.
Suppose I want to run the following code using Python 3.6.3:
class Foo:
    def bar(self):
        return 1

    def __len__(self):
        return 2

class FooWrapper:
    def __init__(self, foo):
        self.bar = foo.bar
        self.__len__ = foo.__len__

f = Foo()
print(f.bar())
print(f.__len__())
print(len(f))

w = FooWrapper(Foo())
print(w.bar())
print(w.__len__())
print(len(w))
Here's the output:
1
2
2
1
2
TypeError: object of type 'FooWrapper' has no len()
So __len__() works, but len() does not? What gives, and how do I properly copy the __len__ method from Foo to FooWrapper?
By the way, the following behavior is universal for all 'special' methods, not only __len__: for example, __iter__ or __getitem__ do not work either (unless called directly)
The reason this happens is because the special methods have to be on an object's class, not on the instance.
len will look for __len__ in the FooWrapper class. BTW, although this looks like it "works", you are actually adding foo.__len__, i.e., the method already bound to the foo instance of Foo, to your FooWrapper object. That might be the intent, but you have to be aware of this.
The easiest way for this to work is to give FooWrapper its own __len__ method that calls the wrapped instance's __len__:
class FooWrapper:
    def __init__(self, foo):
        self.foo = foo
        self.bar = foo.bar
        self.__len__ = foo.__len__  # instance attribute: ignored by len()

    def __len__(self):
        # Defined on the class, so len(wrapper) finds it
        return len(self.foo)
Does that mean that any and all special methods have to explicitly exist in the wrapper class? Yes, it does, and it is one of the pains of creating proxies that behave just the same as the wrapped object.
That is because the special methods' checking for existence and calling is done directly in C, and not using Python's lengthy lookup mechanisms, as that would be too inefficient.
It is possible to create a wrapper-class factory thing that would inspect the object and create a brand new wrapper class, with all meaningful special methods proxied, though, but I think that would be too advanced for what you have in mind right now.
You'd better just use explicit special methods, or explicit access to the wrapped object in the remainder of the code. (Like, when you need to use __iter__ from a wrapped object, instead of doing just for x in wrapper, do for x in wrapper.wrapped.)
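For the curious, a wrapper-class factory of the kind mentioned above could look roughly like this. It is only a sketch: make_wrapper, the _wrapped attribute, and the default list of special method names are all invented for illustration, and a real proxy would need to cover many more special methods:

```python
def make_wrapper(wrapped, names=("__len__", "__iter__", "__getitem__", "__contains__")):
    """Build a one-off wrapper class that forwards the listed special methods.

    The default `names` tuple is an arbitrary illustrative selection.
    """
    def forwarder(name):
        def method(self, *args, **kwargs):
            return getattr(self._wrapped, name)(*args, **kwargs)
        method.__name__ = name
        return method

    # Only proxy the special methods the wrapped object's type really has.
    namespace = {name: forwarder(name)
                 for name in names
                 if hasattr(type(wrapped), name)}

    def __init__(self, obj):
        self._wrapped = obj

    def __getattr__(self, attr):
        # Ordinary attributes fall through to the wrapped object.
        return getattr(self._wrapped, attr)

    namespace["__init__"] = __init__
    namespace["__getattr__"] = __getattr__
    # The special methods end up on the new class, where len() and friends look.
    cls = type("Wrapper_" + type(wrapped).__name__, (), namespace)
    return cls(wrapped)


class Foo:
    def bar(self):
        return 1

    def __len__(self):
        return 2


w = make_wrapper(Foo())
print(w.bar())   # 1
print(len(w))    # 2
```

The key design point is that the forwarding methods are placed in the class namespace passed to type(), not assigned on the instance.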
I'm working on a project right now that deals with functions in an abstract mathematical sense. Without boring the reader with the details, I'll say that I had the following structure in an earlier version:
class Foo(Bar):
    def __init__(self, a, b):
        self.a = a
        self.b = b
        self.sub_unit = Foo(a, not b)
Of course, that's not quite the change I'm making to the arguments to Foo, but suffice to say, it is necessary that this property, if accessed repeatedly, result in an indefinitely long chain of Foo objects. Obviously, this results in an infinite recursion when one instantiates Foo. I solved this in the earlier version by removing the last line of init and adding the following to the Foo class:
    def __getattr__(self, attr: str):
        if attr == 'sub_unit':
            return Foo(self.a, not self.b)
        else:
            return super().__getattr__(attr)
This worked quite well, as I could calculate the next object in the chain as needed.
In going over the code, though, I realize that for other reasons, I need an instance of Bar, not a sub-class of it. To see if I could override the getattr for a single instance, I tried the following:
>>> foo = Bar(a=1, b=2) # sub_unit doesn't get set here.
>>> foo.__getattr__ = lambda attr: 'foo'
>>> foo.a
1
>>> foo.__getattr__('a')
'foo'
What is happening here that I don't understand? Why isn't foo.a calling foo.__getattr__('a')?
Is there a good way to overwrite __getattr__ for a single instance, or is my best bet to re-factor all the code I have that reads sub_unit and friends to call those as functions, to handle this special case?
When you look up the attribute a with foo.a, Python first tries the normal lookup, which includes the instance's attribute dictionary (__dict__). Only when the attribute is not found there is the __getattr__ method called.
Conversely, if a exists on the instance, __getattr__ will not be called.
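There is a second effect at play in the question's experiment: like other special methods, __getattr__ is looked up on the class, so one assigned to a single instance is never consulted, even for missing attributes. The sketch below assumes a simple stand-in Bar class and a hypothetical LazyBar subclass to show both halves:

```python
class Bar:
    def __init__(self, a, b):
        self.a = a
        self.b = b


foo = Bar(a=1, b=2)
foo.__getattr__ = lambda attr: "from instance"

print(foo.a)   # 1: found in foo.__dict__, so no fallback is needed
try:
    foo.missing
except AttributeError:
    print("instance-level __getattr__ was not consulted")


# Defining the hook on a class is what the attribute machinery looks for.
class LazyBar(Bar):
    def __getattr__(self, attr):
        if attr == "sub_unit":
            return LazyBar(self.a, not self.b)
        raise AttributeError(attr)


lazy = LazyBar(1, True)
print(lazy.sub_unit.b)            # False
print(lazy.sub_unit.sub_unit.b)   # True
```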
I know a ton has been written on this subject. I cannot, however, absorb much of it. Perhaps because I'm a complete novice teaching myself without the benefit of any training in computer science. Regardless, maybe if some of you big brains chime in on this specific example, you'll help other beginners like me.
So, I've written the following function, which works just fine when I run it on its own from a file called 'funky.py':
I type the following into my terminal:
python funky.py
and it runs fine.
def load_deck():
    suite = ('Spades', 'Hearts')
    rank = ('2', '3')
    full_deck = {}
    i = 0
    for s in suite:
        for r in rank:
            full_deck[i] = "%s of %s" % (r, s)
            i += 1
    return full_deck

print load_deck()
When I put the same function in a class, however, I get an error.
Here's my code for 'classy.py':
class GAME():
    def load_deck():
        suite = ('Spades', 'Hearts')
        rank = ('2', '3')
        full_deck = {}
        i = 0
        for s in suite:
            for r in rank:
                full_deck[i] = "%s of %s" % (r, s)
                i += 1
        return full_deck

MyGame = GAME()
print MyGame.load_deck()
I get the following error:
Traceback (most recent call last):
  File "classy.py", line 15, in <module>
    print MyGame.load_deck()
TypeError: load_deck() takes no arguments (1 given)
So, I changed the definition line to the following and it works fine:
def load_deck(self):
What is it about putting a function in a class that demands the use of 'self'? I understand that 'self' is just a convention. So, why is any argument needed at all? Do functions behave differently when they are called from within a class?
Also, and this is almost more important, why does my class work without the benefit of using __init__? What would using __init__ do for my class?
Basically, if someone has the time to explain this to me like I'm a 6-year-old, it would help. Thanks in advance for any help.
Defining a function in a class definition invokes some magic that turns it into a method descriptor. When you access foo.method it will automatically create a bound method and pass the object instance as the first parameter. You can avoid this by using the @staticmethod decorator.
__init__ is simply a method called when an instance of your class is created, to do optional setup. __new__ is what actually creates the object.
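The order of those two steps can be observed with a small sketch (Python 3 syntax). The log list and the players parameter are invented here just to record what runs when:

```python
class Game(object):
    def __new__(cls, *args, **kwargs):
        instance = super().__new__(cls)   # the object is created here
        instance.log = ["__new__"]
        return instance

    def __init__(self, players):
        self.log.append("__init__")       # the existing object is set up here
        self.players = players


game = Game(players=2)
print(game.log)       # ['__new__', '__init__']
print(game.players)   # 2
```

A class that defines neither simply inherits both from object, which is why the class in the question works without an __init__.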
Here are some examples
>>> class Foo(object):
        def bar(*args, **kwargs):
            print args, kwargs

>>> foo = Foo()
>>> foo.bar
<bound method Foo.bar of <__main__.Foo object at 0x01C9FEB0>>
>>> Foo.bar
<unbound method Foo.bar>
>>> foo.bar()
(<__main__.Foo object at 0x01C9FEB0>,) {}
>>> Foo.bar()
Traceback (most recent call last):
  File "<pyshell#29>", line 1, in <module>
    Foo.bar()
TypeError: unbound method bar() must be called with Foo instance as first argument (got nothing instead)
>>> Foo.bar(foo)
(<__main__.Foo object at 0x01C9FEB0>,) {}
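The session above is Python 2. For comparison, here is a sketch of the same experiment in Python 3, where unbound methods no longer exist and Foo.bar is just the plain function (bar returns its arguments here instead of printing them):

```python
class Foo:
    def bar(*args, **kwargs):
        return args, kwargs


foo = Foo()

print(type(Foo.bar).__name__)   # function: Python 3 has no unbound methods
print(type(foo.bar).__name__)   # method: bound to foo

args, kwargs = foo.bar()
print(args[0] is foo)           # True: the instance was passed implicitly

print(Foo.bar())                    # ((), {}): a plain call, nothing passed implicitly
print(Foo.bar(foo)[0][0] is foo)    # True: passing the instance by hand
```

So in Python 3, Foo.bar() no longer raises; only access through an instance inserts the extra argument.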
So, why is any argument needed at all?
To access attributes on the current instance of the class.
Say you have a class with two methods, load_deck and shuffle. At the end of load_deck you want to shuffle the deck (by calling the shuffle method)
In Python you'd do something like this:
import random

class Game(object):
    def shuffle(self, deck):
        random.shuffle(deck)  # shuffles in place and returns None
        return deck

    def load_deck(self):
        # ...
        return self.shuffle(full_deck)
Compare this to the roughly-equivalent C++ code:
class Game {
    shuffle(deck) {
        return random.shuffle(deck);
    }
    load_deck() {
        // ...
        return shuffle(full_deck)
    }
}
On the shuffle(full_deck) line, it first looks for a local variable called shuffle. This doesn't exist, so next it checks one level higher and finds an instance method called shuffle (if this didn't exist, it would check for a global variable with the right name).
This is okay, but it's not clear if shuffle refers to some local variable, or the instance method. To address this ambiguity, instance-methods or instance-attributes can also be accessed via this:
...
    load_deck() {
        // ...
        return this->shuffle(full_deck)
    }
this is almost identical to Python's self, except it's not passed as an argument.
Why is it useful to have self as an argument? The FAQ lists several good reasons; these can be summarised by a line in "The Zen of Python":
Explicit is better than implicit.
This is backed up by a post in The History of Python blog,
I decided to give up on the idea of implicit references to instance variables. Languages like C++ let you write this->foo to explicitly reference the instance variable foo (in case there’s a separate local variable foo). Thus, I decided to make such explicit references the only way to reference instance variables. In addition, I decided that rather than making the current object ("this") a special keyword, I would simply make "this" (or its equivalent) the first named argument to a method. Instance variables would just always be referenced as attributes of that argument.
With explicit references, there is no need to have a special syntax for method definitions nor do you have to worry about complicated semantics concerning variable lookup. Instead, one simply defines a function whose first argument corresponds to the instance, which by convention is named "self."
If you don't intend to use self you should probably declare the method to be a staticmethod.
class Game:
    @staticmethod
    def load_deck():
        ....
This undoes the automatic default packing that ordinarily happens to turn a function in a class scope into a method taking the instance as an argument.
Passing arguments you don't use is disconcerting to others trying to read your code.
Most classes have members. Yours doesn't, so all of its methods should be static. As your project develops, you will probably find data that should be accessible to all of the functions in it, and you will put those in self, and pass it around to all of them.
In this context, where the application itself is your primary object, __init__ is just the function that would initialize all of those shared values.
This is the first step toward an object-oriented style, wherein smaller pieces of data get used as objects themselves. But this is a normal stage in moving from straight scripting to OO programming.
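As a rough sketch of where that progression might lead (Python 3), the class below keeps shared state on self and leaves the state-free helper static. The seed parameter, the rng attribute, and the use of a list instead of the question's dict are choices made only for this illustration:

```python
import random


class Game:
    suits = ("Spades", "Hearts")
    ranks = ("2", "3")

    def __init__(self, seed=None):
        # Shared state lives on self so every method can reach it.
        self.rng = random.Random(seed)   # seed is a hypothetical convenience parameter
        self.deck = self.load_deck()

    @staticmethod
    def load_deck():
        # Needs no instance data, so it takes no self.
        return ["%s of %s" % (r, s) for s in Game.suits for r in Game.ranks]

    def shuffle(self):
        # Uses instance state, so it is a regular method.
        self.rng.shuffle(self.deck)
        return self.deck


game = Game(seed=0)
print(Game.load_deck())   # ['2 of Spades', '3 of Spades', '2 of Hearts', '3 of Hearts']
print(sorted(game.shuffle()) == sorted(Game.load_deck()))   # True: same cards, new order
```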