It seems a bit weird that Python requires you to explicitly pass self as the first argument to all class functions. Are there other languages that require something similar?
By explicit, do you mean "explicitly passed as an argument to each class function"?
If so, then Python is the only one I know off-hand.
Most OO languages support this or self in some form, but most of them let you define class functions without always defining self as the first argument.
Depending on your point of view, Lua. To quote the reference: "A call v:name(args) is syntactic sugar for v.name(v,args), except that v is evaluated only once." You can also define methods using either notation. So you could say that Lua has an optional explicit self.
The programming language Oberon 2 has an explicitly named but not explicitly passed 'this' or 'self' argument for member functions of classes (known as type bound procedures in Oberon terminology)
The following example is an Insert method on a type Text, where the identifier 't' is specified to bind to the explicit 'this' or 'self' reference.
PROCEDURE (t: Text) Insert (string: ARRAY OF CHAR; pos: LONGINT);
BEGIN ...
END Insert;
More details on Object Orientation in Oberon are here.
F# (presumably from its OCAML heritage) requires an explicit name for all self-references; though the name is any arbitrary identifier e.g.
override x.BeforeAnalysis() =
base.BeforeAnalysis()
DoWithLock x.AddReference
Here we're defining an overriding member function BeforeAnalysis which calls another member function AddReference. The identifier x here is arbitrary, but is required in both the declaration and any reference to members of the "this"/"self" instance.
Modula-3 does. Which is not too surprising since Python's class mechanism is a mixture of the ones found in Modula-3 and C++.
any Object-Oriented language has a notion of this or self within member functions.
Clojure isn't an OOP language but does use explicit self parameters in some circumstances: most notably when you implement a protocol, and the "self" argument (you can name it anything you like) is the first parameter to a protocol method. This argument is then used for polymorphic dispatch to determine the right function implementation, e.g.:
(defprotocol MyProtocol
(foo [this that]))
(extend-protocol MyProtocol String
(foo [this that]
(str this " and " that)))
(extend-protocol MyProtocol Long
(foo [this that]
(* this that)))
(foo "Cat" "Dog")
=> "Cat and Dog"
(foo 10 20)
=> 200
Also, the first parameter to a function is often used by convention to mean the object that is being acted upon, e.g. the following code to append to a vector:
(conj [1 2 3] 4)
=> [1 2 3 4]
many object oriented languages if not all of them
for example c++ support "this" instead of "self"
but you dont have to pass it, it is passed passively
hope that helps ;)
Related
Here's what I mean:
>>> class Foo:
pass
>>> foo = Foo()
>>> setattr(foo, "#%#$%", 10)
>>> foo.#%#$%
SyntaxError: invalid syntax
>>> getattr(foo, "#%#$%")
10
>>> foo.__dict__
{'#%#$%': 10}
I looked it up and it has been brought up twice on the issue tracker for python 2:
https://bugs.python.org/issue14029
https://bugs.python.org/issue25205
And once for python 3:
https://bugs.python.org/issue35105
They insist it isn't a bug. Yet this behavior is quite obviously not intended; it's not documented in any version. What is the explanation for this? It seems like something that can be ignored easily, but that feels like sweeping it under the rug. So, is there any reason behind setattr's behavior or is it just a benign idiosyncrasy of python?
A bug is something that happens when it's not supposed to happen, i.e., when there's some method of communication forbidding it. If there's no documentation stating this shouldn't happen then (at worst) it's an idiosyncrasy, not a bug.
There appears to be nothing in the Python documentation forbidding attribute names that are not usable with the dot notation (which is, after all, just syntactic sugar), like foo.#%#$%. The only mention is an example of where they are equivalent, specifically:
For example, setattr(x, 'foobar', 123) is equivalent to x.foobar = 123.
The only restriction appears to be whether the class itself allows it:
The function assigns the value to the attribute, provided the object allows it.
In a more formal sense, the dot notation is specified here:
6.3.1. Attribute references
An attribute reference is a primary followed by a period and a name: attributeref ::= primary "." identifier.
The primary must evaluate to an object of a type that supports attribute references, which most objects do. This object is then asked to produce the attribute whose name is the identifier. This production can be customized by overriding the __getattr__() method.
Note the identifier in that syntax, it has limits above and beyond those of actual attribute names, as per here, and PEP 3131 is a more detailed look at what is allowed (it was the PEP that moved identifiers into the non-ASCII world).
Since the limits of identifiers are more restrictive that what is allowed in strings, it makes sense that the getattr/setattr attribute names could be a superset of the ones allowed in dot notation.
I am studying Scott Meyers' More Effective C++. Item 7 advises to never overload && and ||, because their short-circuit behavior cannot be replicated when the operators are turned into function calls (or is this no longer the case?).
As operators can also be overloaded in Python, I am curious whether this situation exists there as well. Is there any operator in Python (2.x, 3.x) that, when overridden, cannot be given its original meaning?
Here is an example of 'original meaning'
class MyInt {
public:
MyInt operator+(MyInt &m) {
return MyInt(this.val + m.val);
};
int val;
MyInt(int v) : val(v){}
}
Exactly the same rationale applies to Python. You shouldn't (and can't) overload and and or, because their short-circuiting behavior cannot be expressed in terms of functions. not isn't permitted either - I guess this is because there's no guarantee that it will be invoked at all.
As pointed out in the comments, the proposal to allow the overloading of logical and and or was officially rejected.
The assignment operator can also not be overloaded.
class Thing: ...
thing = Thing()
thing = 'something else'
There is nothing you can override in Thing to change the behavior of the = operator.
(You can overload property assignment though.)
In Python, all object methods that represent operators are treated "equal": their precedences are described in the language model, and there is no conflict with overriding any.
But both C++ "&&" and "||" - in Python "and" and "or" - are not available in Python as object methods to start with - they check for the object truthfulness, though - which is defined by __bool__. If __bool__is not implemented, Python check for a __len__ method, and check if its output is zero, in which case the object's truth value is False. In all other cases its truth value is True. That makes it for any semantic problems that would arise from combining overriding with the short-circuiting behavior.
Note one can override & and | by implementing __and__ and __or__ with no problems.
As for the other operators, although not directly related, one should just take care with __getattribute__ - the method called when retrieving any attribute from an object (we normally don't mention it as an operator) - including calls from within itself. The __getattr__ is also in place, and is just invoked at the end of the attribute search chain, when an attribute is not found.
I have a framework with some C-like language. Now I'm re-writing that framework and the language is being replaced with Python.
I need to find appropriate Python replacement for the following code construction:
SomeFunction(&arg1)
What this does is a C-style pass-by-reference so the variable can be changed inside the function call.
My ideas:
just return the value like v = SomeFunction(arg1)
is not so good, because my generic function can have a lot of arguments like SomeFunction(1,2,'qqq','vvv',.... and many more)
and I want to give the user ability to get the value she wants.
Return the collection of all the arguments no matter have they changed or not, like: resulting_list = SomeFunction(1,2,'qqq','vvv',.... and many more) interesting_value = resulting_list[3]
this can be improved by giving names to the values and returning dictionary interesting_value = resulting_list['magic_value1']
It's not good because we have constructions like
DoALotOfStaff( [SomeFunction1(1,2,3,&arg1,'qq',val2),
SomeFunction2(1,&arg2,v1),
AnotherFunction(),
...
], flags1, my_var,... )
And I wouldn't like to load the user with list of list of variables, with names or indexes she(the user) should know. The kind-of-references would be very useful here ...
Final Response
I compiled all the answers with my own ideas and was able to produce the solution. It works.
Usage
SomeFunction(1,12, get.interesting_value)
AnotherFunction(1, get.the_val, 'qq')
Explanation
Anything prepended by get. is kind-of reference, and its value will be filled by the function. There is no need in previous defining of the value.
Limitation - currently I support only numbers and strings, but these are sufficient form my use-case.
Implementation
wrote a Getter class which overrides getattribute and produces any variable on demand
all newly created variables has pointer to their container Getter and support method set(self,value)
when set() is called it checks if the value is int or string and creates object inheriting from int or str accordingly but with addition of the same set() method. With this new object we replace our instance in the Getter container
Thank you everybody. I will mark as "answer" the response which led me on my way, but all of you helped me somehow.
I would say that your best, cleanest, bet would be to construct an object containing the values to be passed and/or modified - this single object can be passed, (and will automatically be passed by reference), in as a single parameter and the members can be modified to return the new values.
This will simplify the code enormously and you can cope with optional parameters, defaults, etc., cleanly.
>>> class C:
... def __init__(self):
... self.a = 1
... self.b = 2
...
>>> c=C
>>> def f(o):
... o.a = 23
...
>>> f(c)
>>> c
<class __main__.C at 0x7f6952c013f8>
>>> c.a
23
>>>
Note
I am sure that you could extend this idea to have a class of parameter that carried immutable and mutable data into your function with fixed member names plus storing the names of the parameters actually passed then on return map the mutable values back into the caller parameter name. This technique could then be wrapped into a decorator.
I have to say that it sounds like a lot of work compared to re-factoring your existing code to a more object oriented design.
This is how Python works already:
def func(arg):
arg += ['bar']
arg = ['foo']
func(arg)
print arg
Here, the change to arg automatically propagates back to the caller.
For this to work, you have to be careful to modify the arguments in place instead of re-binding them to new objects. Consider the following:
def func(arg):
arg = arg + ['bar']
arg = ['foo']
func(arg)
print arg
Here, func rebinds arg to refer to a brand new list and the caller's arg remains unchanged.
Python doesn't come with this sort of thing built in. You could make your own class which provides this behavior, but it will only support a slightly more awkward syntax where the caller would construct an instance of that class (equivalent to a pointer in C) before calling your functions. It's probably not worth it. I'd return a "named tuple" (look it up) instead--I'm not sure any of the other ways are really better, and some of them are more complex.
There is a major inconsistency here. The drawbacks you're describing against the proposed solutions are related to such subtle rules of good design, that your question becomes invalid. The whole problem lies in the fact that your function violates the Single Responsibility Principle and other guidelines related to it (function shouldn't have more than 2-3 arguments, etc.). There is really no smart compromise here:
either you accept one of the proposed solutions (i.e. Steve Barnes's answer concerning your own wrappers or John Zwinck's answer concerning usage of named tuples) and refrain from focusing on good design subtleties (as your whole design is bad anyway at the moment)
or you fix the design. Then your current problem will disappear as you won't have the God Objects/Functions (the name of the function in your example - DoALotOfStuff really speaks for itself) to deal with anymore.
Consider these different behaviour::
>> def minus(a, b):
>> return a - b
>> minus(**dict(b=2, a=1))
-1
>> int(**dict(base=2, x='100'))
4
>> import operator
>> operator.sub.__doc__
'sub(a, b) -- Same as a - b.'
>> operator.sub(**dict(b=2, a=1))
TypeError: sub() takes no keyword arguments
Why does operator.sub behave differently from int(x, [base]) ?
It is an implementation detail. The Python C API to retrieve arguments separates between positional and keyword arguments. Positional arguments do not even have a name internally.
The code used to retrieve the arguments of the operator.add functions (and similar ones like sub) is this:
PyArg_UnpackTuple(a,#OP,2,2,&a1,&a2)
As you can see, it does not contain any argument name. The whole code related to operator.add is:
#define spam2(OP,AOP) static PyObject *OP(PyObject *s, PyObject *a) { \
PyObject *a1, *a2; \
if(! PyArg_UnpackTuple(a,#OP,2,2,&a1,&a2)) return NULL; \
return AOP(a1,a2); }
spam2(op_add , PyNumber_Add)
#define spam2(OP,ALTOP,DOC) {#OP, op_##OP, METH_VARARGS, PyDoc_STR(DOC)}, \
{#ALTOP, op_##OP, METH_VARARGS, PyDoc_STR(DOC)},
spam2(add,__add__, "add(a, b) -- Same as a + b.")
As you can see, the only place where a and b are used is in the docstring. The method definition also does not use the METH_KEYWORDS flag which would be necessary for the method to accept keyword arguments.
Generally spoken, you can safely assume that a python-based function where you know an argument name will always accept keyword arguments (of course someone could do nasty stuff with *args unpacking but creating a function doc where the arguments look normal) while C functions may or may not accept keyword arguments. Chances are good that functions with more than a few arguments or optional arguments accept keyword arguments for the later/optional ones. But you pretty much have to test it.
You can find a discussion about supporting keyword arguments everywhere on the python-ideas mailinglist. There is also a statement from Guido van Rossum (the Benevolent Dictator For Life aka the creator of Python) on it:
Hm. I think for many (most?) 1-arg and selected 2-arg functions (and
rarely 3+-arg functions) this would reduce readability, as the example
of ord(char=x) showed.
I would actually like to see a syntactic feature to state that an
argument cannot be given as a keyword argument (just as we already
added syntax to state that it must be a keyword).
One area where I think adding keyword args is outright wrong: Methods
of built-in types or ABCs and that are overridable. E.g. consider the
pop() method on dict. Since the argument name is currently
undocumented, if someone subclasses dict and overrides this method, or
if they create another mutable mapping class that tries to emulate
dict using duck typing, it doesn't matter what the argument name is --
all the callers (expecting a dict, a dict subclass, or a dict-like
duck) will be using positional arguments in the call. But if we were
to document the argument names for pop(), and users started to use
these, then most dict sublcasses and ducks would suddenly be broken
(except if by luck they happened to pick the same name).
operator is a C module, which defines functions differently. Unless the function declaration in the module initialization includes METH_KEYWORDS, the function will not accept keyword arguments under any conditions and you get the error given in the question.
minus(**dict(b=2, a=1)) expands to minus(b=2, a=1). This works because your definition has argument names a and b.
operator.sub(**dict(b=2, a=1)) expands to operator.sub(b=2, a=1). This doesn't work because sub does not accept keyword arguments.
In Python, what is the difference between add and __add__ methods?
A method called add is just that - a method with that name. It has no special meaning whatsoever to the language or the interpreter. The only other thing that could be said about it is that sets have a method with the same name. That's it, nothing special about it.
The method __add__ is called internally by the + operator, so it gets special attention in the language spec and by the interpreter and you override it to define addition for object of a class. You don't call it directly (you can - they're still normal methods, they only get called implicitly in some circumstances and have some extra restrictions - but there's rarely if ever a reason - let alone a good reason). See the docs on "special" methods for details and a complete list of other "special" methods.
If you just went through this doc https://docs.python.org/3/library/operator.html and was curious about the differences between e.g.
operator.add(a, b)
operator.__add__(a, b)
Check the source code https://github.com/python/cpython/blob/3.10/Lib/operator.py :
def add(a, b):
"Same as a + b."
return a + b
...
# All of these "__func__ = func" assignments have to happen after importing
# from _operator to make sure they're set to the right function
...
__add__ = add
So
print(3+3) # call `operator.__add__` which is `operator.add`
import operator
print(operator.add(3, 3)) # call `operator.add` directory
To add to the earlier posts, __*__ are often discouraged as names for identifiers in own-classes unless one is doing some hacking on core-python functionality, like modifying / over-loading standard operators, etc. And also, often such names are linked with magical behavior, so it might be wise to avoid using them in own-namespaces unless the magical nature of a method is implied.
See this post for an elaborate argument