Python: emulate C-style pass-by-reference for variables

I have a framework built around a C-like language. I'm now rewriting that framework, and the language is being replaced with Python.
I need to find an appropriate Python replacement for the following construct:
SomeFunction(&arg1)
What this does is a C-style pass-by-reference so the variable can be changed inside the function call.
My ideas:
Just return the value, like v = SomeFunction(arg1).
This is not so good, because my generic function can have a lot of arguments, like SomeFunction(1,2,'qqq','vvv',.... and many more), and I want to give the user the ability to get the value she wants.
Return the collection of all the arguments, whether they have changed or not, like:
resulting_list = SomeFunction(1,2,'qqq','vvv',.... and many more)
interesting_value = resulting_list[3]
This can be improved by giving names to the values and returning a dictionary:
interesting_value = resulting_list['magic_value1']
It's not good because we have constructions like
DoALotOfStuff( [SomeFunction1(1,2,3,&arg1,'qq',val2),
                SomeFunction2(1,&arg2,v1),
                AnotherFunction(),
                ...
               ], flags1, my_var,... )
And I wouldn't like to burden the user with lists of lists of variables, with names or indexes that she (the user) would have to know. The kind-of-references would be very useful here ...
Final Response
I compiled all the answers with my own ideas and was able to produce the solution. It works.
Usage
SomeFunction(1,12, get.interesting_value)
AnotherFunction(1, get.the_val, 'qq')
Explanation
Anything prefixed with get. is a kind of reference, and its value will be filled in by the function. There is no need to define the value beforehand.
Limitation: currently I support only numbers and strings, but these are sufficient for my use case.
Implementation
I wrote a Getter class which overrides __getattribute__ and produces any variable on demand.
All newly created variables have a pointer to their container Getter and support the method set(self, value).
When set() is called, it checks whether the value is an int or a string and creates an object inheriting from int or str accordingly, but with the addition of the same set() method. This new object replaces our instance in the Getter container.
Thank you everybody. I will mark as "answer" the response which led me on my way, but all of you helped me somehow.

I would say that your best, cleanest bet would be to construct an object containing the values to be passed and/or modified. This single object can be passed in as a single parameter (and will automatically be passed by reference), and its members can be modified to return the new values.
This will simplify the code enormously and you can cope with optional parameters, defaults, etc., cleanly.
>>> class C:
...     def __init__(self):
...         self.a = 1
...         self.b = 2
...
>>> c = C()
>>> def f(o):
...     o.a = 23
...
>>> f(c)
>>> c
<__main__.C instance at 0x7f6952c013f8>
>>> c.a
23
>>>
Note
I am sure that you could extend this idea to have a parameter class that carries immutable and mutable data into your function under fixed member names and also stores the names of the parameters actually passed; on return, it would map the mutable values back to the caller's parameter names. This technique could then be wrapped into a decorator.
I have to say that it sounds like a lot of work compared to re-factoring your existing code to a more object oriented design.

This is how Python works already:
def func(arg):
    arg += ['bar']
arg = ['foo']
func(arg)
print arg
Here, the change to arg automatically propagates back to the caller.
For this to work, you have to be careful to modify the arguments in place instead of re-binding them to new objects. Consider the following:
def func(arg):
    arg = arg + ['bar']
arg = ['foo']
func(arg)
print arg
Here, func rebinds arg to refer to a brand new list and the caller's arg remains unchanged.

Python doesn't come with this sort of thing built in. You could make your own class which provides this behavior, but it will only support a slightly more awkward syntax where the caller would construct an instance of that class (equivalent to a pointer in C) before calling your functions. It's probably not worth it. I'd return a "named tuple" (look it up) instead--I'm not sure any of the other ways are really better, and some of them are more complex.
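For reference, here is a minimal sketch of the named-tuple approach (Python 3 syntax). The Result type, its field names, and some_function are made up for illustration; only collections.namedtuple itself is the standard library API.

```python
from collections import namedtuple

# Hypothetical result type; the field names are illustrative only.
Result = namedtuple("Result", ["status", "magic_value1"])


def some_function(a, b, text):
    # Return every output by name instead of writing through pointer arguments.
    return Result(status=a + b, magic_value1=text.upper())


result = some_function(1, 2, "qqq")
print(result.magic_value1)   # access by name
print(result[0])             # positional access still works
```

The caller picks out only the fields she cares about and can ignore the rest.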

There is a major inconsistency here. The drawbacks you're describing against the proposed solutions are related to such subtle rules of good design, that your question becomes invalid. The whole problem lies in the fact that your function violates the Single Responsibility Principle and other guidelines related to it (function shouldn't have more than 2-3 arguments, etc.). There is really no smart compromise here:
either you accept one of the proposed solutions (i.e. Steve Barnes's answer concerning your own wrappers or John Zwinck's answer concerning usage of named tuples) and refrain from focusing on good design subtleties (as your whole design is bad anyway at the moment)
or you fix the design. Then your current problem will disappear as you won't have the God Objects/Functions (the name of the function in your example - DoALotOfStuff really speaks for itself) to deal with anymore.

Related

Python, Do I need to return an object that was passed in as parameter?

When writing a function which accepts a mutable object that could be changed, is it necessary to return this object to the caller?
By necessary I mean...
Is there a specific PEP guideline around this?
If not, what is most common in the world of Python programming?
A little bit of code:
def foo(args):
    args['a'] = 'new-value'
    args['b'] = args['b'] + 1
    # is there a need for a 'return args' ?
args = {'a': 'old-value', 'b': 99}
foo(args) # is there a need for args = foo(args)
print(args['a'], args['b']) # outputs new-value 100
"Explicit is better than implicit." makes me think I should make the potential for args to change very explicit in the main body, so that one does not have to look into the function to see if args might be changed...
This is not covered by any PEP, and it's really up to style of the author. Generally in API design though, methods that mutate arguments won't return anything so you don't forget you're mutating things. Be very careful with this kind of design.
In terms of what is more commonplace in python, there are some examples where the object passed is altered. These tend to be methods of the object to be amended (e.g. list.append, where list is a type), but most functions tend to take a copy of the object passed and return a new one (e.g. string.strip, where string is the module string).
This of course also brings up str.strip which is also a method of a str type object which returns a new object.
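The built-ins show the same convention, as this small sketch illustrates: list.sort mutates its list and returns None, while sorted leaves its argument alone and returns a new list.

```python
numbers = [3, 1, 2]

# sorted() leaves its argument alone and returns a new list.
ordered = sorted(numbers)
untouched = list(numbers)   # snapshot showing that numbers is still unsorted here

# list.sort() mutates in place and returns None, so there is nothing to reassign.
returned = numbers.sort()

print(ordered, untouched, numbers, returned)
```

Because the mutating call returns None, writing numbers = numbers.sort() by mistake fails loudly instead of silently hiding the mutation.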

Why was the mutable default argument's behavior never changed? [duplicate]

I had a very difficult time with understanding the root cause of a problem in an algorithm. Then, by simplifying the functions step by step I found out that evaluation of default arguments in Python doesn't behave as I expected.
The code is as follows:
class Node(object):
    def __init__(self, children = []):
        self.children = children
The problem is that every instance of Node class shares the same children attribute, if the attribute is not given explicitly, such as:
>>> n0 = Node()
>>> n1 = Node()
>>> id(n1.children)
Out[0]: 25000176
>>> id(n0.children)
Out[0]: 25000176
I don't understand the logic of this design decision. Why did the Python designers decide that default arguments are to be evaluated at definition time? This seems very counter-intuitive to me.
The alternative would be quite heavyweight -- storing "default argument values" in the function object as "thunks" of code to be executed over and over again every time the function is called without a specified value for that argument -- and would make it much harder to get early binding (binding at def time), which is often what you want. For example, in Python as it exists:
def ack(m, n, _memo={}):
    key = m, n
    if key not in _memo:
        if m==0: v = n + 1
        elif n==0: v = ack(m-1, 1)
        else: v = ack(m-1, ack(m, n-1))
        _memo[key] = v
    return _memo[key]
...writing a memoized function like the above is quite an elementary task. Similarly:
for i in range(len(buttons)):
    buttons[i].onclick(lambda i=i: say('button %s', i))
...the simple i=i, relying on the early-binding (definition time) of default arg values, is a trivially simple way to get early binding. So, the current rule is simple, straightforward, and lets you do all you want in a way that's extremely easy to explain and understand: if you want late binding of an expression's value, evaluate that expression in the function body; if you want early binding, evaluate it as the default value of an arg.
The alternative, forcing late binding in both situations, would not offer this flexibility, and would force you to jump through hoops (such as wrapping your function into a closure factory) every time you needed early binding, as in the above examples -- yet more heavy-weight boilerplate forced on the programmer by this hypothetical design decision (beyond the "invisible" ones of generating and repeatedly evaluating thunks all over the place).
In other words, "There should be one, and preferably only one, obvious way to do it [1]": when you want late binding, there's already a perfectly obvious way to achieve it (since all of the function's code is only executed at call time, obviously everything evaluated there is late-bound); having default-arg evaluation produce early binding gives you an obvious way to achieve early binding as well (a plus!-) rather than giving TWO obvious ways to get late binding and no obvious way to get early binding (a minus!-).
[1]: "Although that way may not be obvious at first unless you're Dutch."
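A small self-contained sketch (Python 3) of the difference between the two kinds of binding discussed above. It uses plain lambdas in a list instead of the buttons example, so nothing here depends on a GUI toolkit.

```python
# Late binding: each lambda looks up i when it is called, after the loop has finished.
late = [lambda: i for i in range(3)]

# Early binding: the default value i=i is evaluated when each lambda is defined.
early = [lambda i=i: i for i in range(3)]

print([f() for f in late])
print([f() for f in early])
```

The late-bound lambdas all see the final value of i, while the early-bound ones each keep the value i had when they were created.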
The issue is this.
It's too expensive to evaluate a function as an initializer every time the function is called.
0 is a simple literal. Evaluate it once, use it forever.
int is a function (like list) that would have to be evaluated each time it's required as an initializer.
The construct [] is literal, like 0, that means "this exact object".
The problem is that some people hope that it means list, as in "evaluate this function for me, please, to get the object that is the initializer".
It would be a crushing burden to add the necessary if statement to do this evaluation all the time. It's better to take all arguments as literals and not do any additional function evaluation as part of trying to do a function evaluation.
Also, more fundamentally, it's technically impossible to implement argument defaults as function evaluations.
Consider, for a moment the recursive horror of this kind of circularity. Let's say that instead of default values being literals, we allow them to be functions which are evaluated each time a parameter's default values are required.
[This would parallel the way collections.defaultdict works.]
def aFunc( a=another_func ):
    return a*2
def another_func( b=aFunc ):
    return b*3
What is the value of another_func()? To get the default for b, it must evaluate aFunc, which requires an eval of another_func. Oops.
Of course in your situation it is difficult to understand. But you must see, that evaluating default args every time would lay a heavy runtime burden on the system.
Also you should know, that in case of container types this problem may occur -- but you could circumvent it by making the thing explicit:
def __init__(self, children = None):
    if children is None:
        children = []
    self.children = children
The workaround for this, discussed here (and very solid), is:
class Node(object):
    def __init__(self, children = None):
        self.children = [] if children is None else children
As for why, look for an answer from von Löwis, but it's likely because the function definition makes a code object due to the architecture of Python, and there might not be a facility for working with reference types like this in default arguments.
I thought this was counterintuitive too, until I learned how Python implements default arguments.
A function's an object. At load time, Python creates the function object, evaluates the defaults in the def statement, puts them into a tuple, and adds that tuple as an attribute of the function named func_defaults. Then, when a function is called, if the call doesn't provide a value, Python grabs the default value out of func_defaults.
For instance:
>>> class C():
...     pass
...
>>> def f(x=C()):
...     pass
...
>>> f.func_defaults
(<__main__.C instance at 0x0298D4B8>,)
So all calls to f that don't provide an argument will use the same instance of C, because that's the default value.
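The same sharing can be observed in Python 3, where the defaults tuple is exposed as __defaults__ (func_defaults is the Python 2 name). A minimal sketch, with append_to as a made-up example function:

```python
def append_to(item, bucket=[]):
    # The default list is created once, when the def statement runs.
    bucket.append(item)
    return bucket


first = append_to(1)
second = append_to(2)

print(second)                  # the same list, now holding both items
print(first is second)         # both calls returned the one default object
print(append_to.__defaults__)  # the stored defaults tuple
```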
As far as why Python does it this way: well, that tuple could contain functions that would get called every time a default argument value was needed. Apart from the immediately obvious problem of performance, you start getting into a universe of special cases, like storing literal values instead of functions for non-mutable types to avoid unnecessary function calls. And of course there are performance implications galore.
The actual behavior is really simple. And there's a trivial workaround, in the case where you want a default value to be produced by a function call at runtime:
def f(x = None):
    if x is None:
        x = g()
This comes from Python's emphasis on syntax and execution simplicity. A def statement occurs at a certain point during execution. When the Python interpreter reaches that point, it evaluates the code in that line (including the default values), and then creates a function object whose body will be run later, when you call the function.
It's a simple split between function declaration and function body. The declaration is executed when it is reached in the code. The body is executed at call time. Note that the declaration is executed every time it is reached, so you can create multiple functions by looping.
funcs = []
for x in xrange(5):
    def foo(x=x, lst=[]):
        lst.append(x)
        return lst
    funcs.append(foo)
for func in funcs:
    print "1: ", func()
    print "2: ", func()
Five separate functions have been created, with a separate list created each time the function declaration was executed. On each pass through funcs, the same function is executed twice, using the same list both times. This gives the results:
1: [0]
2: [0, 0]
1: [1]
2: [1, 1]
1: [2]
2: [2, 2]
1: [3]
2: [3, 3]
1: [4]
2: [4, 4]
Others have given you the workaround, of using param=None, and assigning a list in the body if the value is None, which is fully idiomatic python. It's a little ugly, but the simplicity is powerful, and the workaround is not too painful.
Edited to add: For more discussion on this, see effbot's article here: http://effbot.org/zone/default-values.htm, and the language reference, here: http://docs.python.org/reference/compound_stmts.html#function
I'll provide a dissenting opinion, by addressing the main arguments in the other posts.
Evaluating default arguments when the function is executed would be bad for performance.
I find this hard to believe. If default argument assignments like foo='some_string' really add an unacceptable amount of overhead, I'm sure it would be possible to identify assignments to immutable literals and precompute them.
If you want a default assignment with a mutable object like foo = [], just use foo = None, followed by foo = foo or [] in the function body.
While this may be unproblematic in individual instances, as a design pattern it's not very elegant. It adds boilerplate code and obscures default argument values. Patterns like foo = foo or ... don't work if foo can be an object like a numpy array with undefined truth value. And in situations where None is a meaningful argument value that may be passed intentionally, it can't be used as a sentinel and this workaround becomes really ugly.
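To illustrate the last point: when None is itself a valid argument, the usual way out is a dedicated sentinel object, which is yet more boilerplate. A minimal Python 3 sketch, where _MISSING and describe are names invented for this example:

```python
_MISSING = object()  # unique sentinel; no caller can pass it by accident


def describe(value=_MISSING):
    # Hypothetical function in which None is a legitimate argument value.
    if value is _MISSING:
        return "no argument given"
    return "got %r" % (value,)


print(describe())
print(describe(None))
```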
The current behaviour is useful for mutable default objects that should be shared across function calls.
I would be happy to see evidence to the contrary, but in my experience this use case is much less frequent than mutable objects that should be created anew every time the function is called. To me it also seems like a more advanced use case, whereas accidental default assignments with empty containers are a common gotcha for new Python programmers. Therefore, the principle of least astonishment suggests default argument values should be evaluated when the function is executed.
In addition, it seems to me that there exists an easy workaround for mutable objects that should be shared across function calls: initialise them outside the function.
So I would argue that this was a bad design decision. My guess is that it was chosen because its implementation is actually simpler and because it has a valid (albeit limited) use case. Unfortunately, I don't think this will ever change, since the core Python developers want to avoid a repeat of the amount of backwards incompatibility that Python 3 introduced.
Python function definitions are just code, like all the other code; they're not "magical" in the way that some languages are. For example, in Java you could refer "now" to something defined "later":
public static void foo() { bar(); }
public static void main(String[] args) { foo(); }
public static void bar() {}
but in Python
def foo(): bar()
foo() # boom! "bar" has no binding yet
def bar(): pass
foo() # ok
So, the default argument is evaluated at the moment that that line of code is evaluated!
Because if they had, then someone would post a question asking why it wasn't the other way around :-p
Suppose now that they had. How would you implement the current behaviour if needed? It's easy to create new objects inside a function, but you cannot "uncreate" them (you can delete them, but it's not the same).

is there any way to prevent side effects in python?

Is there any way to prevent side effects in python? For example, the following function has a side effect, is there any keyword or any other way to have the python complain about it?
def func_with_side_affect(a):
    a.append('foo')
Python is really not set up to enforce prevention of side-effects. As some others have mentioned, you can try to deepcopy the data or use immutable types, but these still have corner cases that are tricky to catch, and it's just a ton more effort than it's worth.
Using a functional style in Python normally involves the programmer simply designing their functions to be functional. In other words, whenever you write a function, you write it in such a way that it doesn't mutate the arguments.
If you're calling someone else's function, then you have to make sure the data you are passing in either cannot be mutated, or you have to keep around a safe, untouched copy of the data yourself, that you keep away from that untrusted function.
No, but with your example you could use immutable types and pass a tuple as the a argument. Side effects cannot affect immutable types: for example, you cannot append to a tuple, you can only create another tuple by extending the given one.
UPD: But still, your function could change objects which are referenced by your immutable object (as was pointed out in the comments), write to files and do other IO.
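A short sketch (Python 3) of both halves of that statement: the tuple itself rejects modification, but a list stored inside it can still be changed. The function and variable names are made up for the example.

```python
def func_with_side_effect(a):
    # The tuple itself cannot be changed, but a list stored inside it can.
    a[0].append('foo')


data = (['x'], 'y')
func_with_side_effect(data)
print(data)

try:
    data.append('foo')   # tuples have no append method
except AttributeError as exc:
    print(exc)
```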
Sorry really late to the party. You can use effect library to isolate side-effects in your python code. As others have said in Python you have to explicitly write functional style code but this library really encourages towards it.
About the only way to enforce that would be to overwrite the function specification to deepcopy any arguments before they are passed to the original function. You could do that with a function decorator.
That way, the function has no way to actually change the originally passed arguments. This however has the "sideeffect" of a considerable slowdown as the deepcopy operation is rather costly in terms of memory (and garbage-collection) usage as well as CPU consumption.
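A minimal sketch of such a decorator in Python 3. The name isolate_arguments is an assumption for illustration; copy.deepcopy and functools.wraps are the standard library calls doing the work.

```python
import copy
import functools


def isolate_arguments(func):
    """Call func with deep copies so it cannot touch the caller's objects."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        return func(*copy.deepcopy(args), **copy.deepcopy(kwargs))
    return wrapper


@isolate_arguments
def func_with_side_effect(a):
    a.append('foo')
    return a


original = ['bar']
returned = func_with_side_effect(original)
print(original, returned)
```

The function still mutates what it receives, but what it receives is a copy, so the caller's list stays as it was.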
I'd rather recommend you properly test your code to ensure that no accidental changes happen or use a language that uses full copy-by-value semantics (or has only immutable variables).
As another workaround, you could make your passed objects basically immutable by defining your classes like this:
class Immutable(object):
    """An immutable class with a single attribute 'value'."""
    def __setattr__(self, *args):
        raise TypeError("can't modify immutable instance")
    __delattr__ = __setattr__
    def __init__(self, value):
        # we can no longer use self.value = value to store the instance data
        # so we must explicitly call the superclass
        super(Immutable, self).__setattr__('value', value)
(Code copied from the Wikipedia article about Immutable object)
Since any Python code can do IO, any Python code could launch intercontinental ballistic missiles (and I'd consider launching ICBMs to be a fairly catastrophic side effect for most purposes).
The only way to avoid side effects is to not use Python code in the first place but rather data - i.e. you end up creating a domain specific language which disallows side effects, and a Python interpreter which executes programs of that language.
You'll have to make a copy of the list first. Something like this:
def func_without_side_affect(a):
    b = a[:]
    b.append('foo')
    return b
This shorter version might work for you too:
def func_without_side_affect(a):
    return a[:] + ['foo']
If you have nested lists or other things like that, you'll probably want to look at copy.deepcopy to make the copy instead of the [:] slice operator.
It would be very difficult to do for the general case, but for some practical cases you could do something like this:
from copy import deepcopy
from itertools import izip  # Python 2; on Python 3 use the built-in zip

def call_function_checking_for_modification(f, *args, **kwargs):
    # Snapshot the arguments before the call
    myargs = [deepcopy(x) for x in args]
    mykwargs = dict((x, deepcopy(kwargs[x])) for x in kwargs)
    retval = f(*args, **kwargs)
    # Compare each argument against its snapshot
    for arg, myarg in izip(args, myargs):
        if arg != myarg:
            raise ValueError('Argument was modified during function call!')
    for kwkey in kwargs:
        if kwargs[kwkey] != mykwargs[kwkey]:
            raise ValueError('Argument was modified during function call!')
    return retval
But, obviously, there are a few issues with this. For trivial things (i.e. all the inputs are simple types), this isn't very useful anyway: those will likely be immutable, and in any case changes to them are easier (well, relatively) to detect than changes to complex types.
For complex types, though, the deepcopy will be expensive, and there's no guarantee that the == operator will actually work correctly. (And a simple copy isn't good enough: imagine a list where one element changes value. A simple copy just stores a reference to that element, so the copy changes too and the modification goes undetected.)
In general, though, this is not that useful: if you are already worried about side effects when calling these functions, you can guard against them more intelligently (by storing your own copy if needed, auditing the destination function, etc.), and if it's your own function you are worried about, you will have audited it already.
Something like the above could be wrapped in a decorator, though. With the expensive parts gated by a global variable (if _debug == True:, something like that), it could maybe be useful in projects where lots of people are editing the same code.
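A rough Python 3 sketch of that decorator idea. The names _DEBUG, forbid_argument_mutation, pure and impure are assumptions for illustration, and the check still depends on == being meaningful for the argument types involved.

```python
import copy
import functools

_DEBUG = True  # hypothetical module-level switch for the expensive check


def forbid_argument_mutation(func):
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        if not _DEBUG:
            return func(*args, **kwargs)
        # Snapshot positional and keyword arguments before the call.
        snapshot = copy.deepcopy((args, kwargs))
        result = func(*args, **kwargs)
        # Relies on == being meaningful for the argument types involved.
        if (args, kwargs) != snapshot:
            raise ValueError('Argument was modified during function call!')
        return result
    return wrapper


@forbid_argument_mutation
def pure(items):
    return items + ['foo']


@forbid_argument_mutation
def impure(items):
    items.append('foo')


print(pure(['bar']))
try:
    impure(['bar'])
except ValueError as exc:
    print(exc)
```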
Edit: This only works for environments where a more 'strict' form of 'side effects' is expected. In many programming languages, you can make the availability of side effects much more explicit: in C++, for instance, everything is passed by value unless it is explicitly a pointer or reference, and even then you can declare incoming references as const so that they can't be modified. There, 'side effects' can cause errors at compile time (of course, there are ways to get some anyway).
The above enforces that any modified values are in the return value/tuple. If you are in python 3 (i'm not yet) I think you could specify decoration in the function declaration itself to specify attributes of function arguments, including whether they would be allowed to be modified, and include that in the above function to allow some arguments explicitly to be mutable.
Note that I think you could probably also do something like this:
class ImmutableObject(object):
    def __init__(self, inobj):
        self._inited = False
        self._inobj = inobj
        self._inited = True
    def __repr__(self):
        return self._inobj.__repr__()
    def __str__(self):
        return self._inobj.__str__()
    def __getitem__(self, key):
        return ImmutableObject(self._inobj.__getitem__(key))
    def __iter__(self):
        # Iterate over the wrapped object (self.__iter__() would recurse forever)
        return (ImmutableObject(x) for x in self._inobj)
    def __setitem__(self, key, value):
        raise AttributeError('Object is read-only')
    def __getattr__(self, key):
        x = getattr(self._inobj, key)
        if callable(x):
            return x
        else:
            return ImmutableObject(x)
    def __setattr__(self, attr, value):
        if attr not in ['_inobj', '_inited'] and self._inited == True:
            raise AttributeError('Object is read-only')
        object.__setattr__(self, attr, value)
(Probably not a complete implementation, haven't tested much, but a start). Works like this:
a = [1,2,3]
b = [a,3,4,5]
c = ImmutableObject(b)
print c
[[1, 2, 3], 3, 4, 5]
c[0][1:] = [7,8]
AttributeError: Object is read-only
It would let you protect a specific object from modification if you didn't trust the downstream function, while still being relatively lightweight. Still requires explicit wrapping of the object though. You could probably build a decorator to do this semi-automatically though for all arguments. Make sure to skip the ones that are callable.

Python: how to pass a reference to a function

IMO Python is pass-by-value if the parameter is a basic type, like a number or a boolean:
def func_a(bool_value):
    bool_value = True
This will not change the outside bool_value, right?
So my question is: how can I make the change to bool_value take effect in the outside variable (pass by reference)?
You can use a list to enclose the inout variable:
def func(container):
    container[0] = True
container = [False]
func(container)
print container[0]
The call-by-value/call-by-reference misnomer is an old debate. Python's semantics are more accurately described by CLU's call-by-sharing. See Fredrik Lundh's write up of this for more detail:
Call By Object
Python (always), like Java (mostly), passes arguments (and, in simple assignment, binds names) by object reference. There is no concept of "pass by value", nor any concept of "reference to a variable" -- only reference to a value (some express this by saying that Python doesn't have "variables"... it has names, which get bound to values -- and that is all that can ever happen).
Mutable objects can have mutating methods (some of which look like operators or even assignment, e.g. a.b = c actually means type(a).__setattr__(a, 'b', c), which calls a method which may well be a mutating one).
But simple assignment to a barename (and argument passing, which is exactly the same as simple assignment to a barename) never has anything at all to do with any mutating methods.
Quite independently of the types involved, simple barename assignment (and, identically, argument passing) only ever binds or rebinds the specific name on the left of the =, never affecting any other name nor any object in any way whatsoever. You're very mistaken if you believe that types have anything to do with the semantics of argument passing (or, identically, simple assignment to barenames).
Immutable types can't be changed in place, but if you pass a user-defined class instance, a list or a dictionary, you can change it and still have only one object.
Like this:
def add1(my_list):
    my_list.append(1)
a = []
add1(a)
print a
But if you do my_list = [1], you create a new instance and lose the original reference inside the function. That's why you can't just do "my_bool = False" and expect the variable outside the function to become False.

Why should functions always return the same type?

I read somewhere that functions should always return only one type
so the following code is considered as bad code:
def x(foo):
    if 'bar' in foo:
        return (foo, 'bar')
    return None
I guess the better solution would be
def x(foo):
    if 'bar' in foo:
        return (foo, 'bar')
    return ()
Wouldn't it be cheaper memory-wise to return None than to create a new empty tuple, or is this difference too small to notice even in larger projects?
Why should functions return values of a consistent type? To meet the following two rules.
Rule 1 -- a function has a "type" -- inputs mapped to outputs. It must return a consistent type of result, or it isn't a function. It's a mess.
Mathematically, we say some function, F, is a mapping from domain, D, to range, R. F: D -> R. The domain and range form the "type" of the function. The input types and the result type are as essential to the definition of the function as is the name or the body.
Rule 2 -- when you have a "problem" or can't return a proper result, raise an exception.
def x(foo):
    if 'bar' in foo:
        return (foo, 'bar')
    raise Exception( "oh, dear me." )
You can break the above rules, but the cost of long-term maintainability and comprehensibility is astronomical.
"Wouldn't it be cheaper memory wise to return a None?" Wrong question.
The point is not to optimize memory at the cost of clear, readable, obvious code.
It's not so clear that a function must always return objects of a limited type, or that returning None is wrong. For instance, re.search can return a _sre.SRE_Match object or a NoneType object:
import re
match=re.search('a','a')
type(match)
# <type '_sre.SRE_Match'>
match=re.search('a','b')
type(match)
# <type 'NoneType'>
Designed this way, you can test for a match with the idiom
if match:
    # do xyz
If the developers had required re.search to return a _sre.SRE_Match object, then
the idiom would have to change to
if match.group(1) is None:
    # do xyz
There would not be any major gain by requiring re.search to always return a _sre.SRE_Match object.
So I think how you design the function must depend on the situation and in particular, how you plan to use the function.
Also note that both _sre.SRE_Match and NoneType are instances of object, so in a broad sense they are of the same type. So the rule that "functions should always return only one type" is rather meaningless.
Having said that, there is a beautiful simplicity to functions that return objects which all share the same properties. (Duck typing, not static typing, is the Python way!) It can allow you to chain together functions, as in foo(bar(baz)), and know with certainty the type of object you'll receive at the other end.
This can help you check the correctness of your code. By requiring that a function returns only objects of a certain limited type, there are fewer cases to check. "foo always returns an integer, so as long as an integer is expected everywhere I use foo, I'm golden..."
Best practice in what a function should return varies greatly from language to language, and even between different Python projects.
For Python in general, I agree with the premise that returning None is bad if your function generally returns an iterable, because iterating without testing becomes impossible. Just return an empty iterable in this case, it will still test False if you use Python's standard truth testing:
ret_val = x()
if ret_val:
    do_stuff(ret_val)
and still allow you to iterate over it without testing:
for child in x():
    do_other_stuff(child)
For functions that are likely to return a single value, I think returning None is perfectly acceptable, just document that this might happen in your docstring.
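One way to document that in Python 3 is a docstring plus an Optional return annotation. This is only a sketch: find_user_id and the users dictionary are hypothetical names, not part of the question.

```python
from typing import Optional


def find_user_id(name: str, users: dict) -> Optional[int]:
    """Return the id stored for name, or None if the name is unknown."""
    return users.get(name)


users = {'alice': 1}
user_id = find_user_id('bob', users)
if user_id is None:
    print('no such user')
```

The explicit "is None" test matters here, because a valid id of 0 would also test False.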
Here are my thoughts on all that and I'll try to also explain why I think that the accepted answer is mostly incorrect.
First of all programming functions != mathematical functions. The closest you can get to mathematical functions is if you do functional programming but even then there are plenty of examples that say otherwise.
Functions do not have to have input
Functions do not have to have output
Functions do not have to map input to output (because of the previous two bullet points)
A function in terms of programming is to be viewed simply as a block of memory with a start (the function's entry point), a body (empty or otherwise) and exit point (one or multiple depending on the implementation) all of which are there for the purpose of reusing code that you've written. Even if you don't see it a function always "returns" something. This something is actually the address of next statement right after the function call. This is something you will see in all of its glory if you do some really low-level programming with an Assembly language (I dare you to go the extra mile and do some machine code by hand like Linus Torvalds who ever so often mentions this during his seminars and interviews :D). In addition you can also take some input and also spit out some output. That is why
def foo():
    pass
is a perfectly correct piece of code.
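In Python specifically, a function without a `return` statement still hands a value back to the caller: the call evaluates to `None`. A quick check, redefining the same `foo` so the snippet is self-contained:

```python
def foo():
    pass


result = foo()  # no return statement, so the call evaluates to None
print(result is None)  # True
```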
So why would returning multiple types be bad? Well... it isn't, unless you abuse it. Abuse is of course a matter of poor programming skills and/or not knowing what the language you're using can do.
Wouldn't it be cheaper memory-wise to return None than to create a new empty tuple, or is this time difference too small to notice even in larger projects?
As far as I know, yes: returning None would be cheaper memory-wise. Here is a small experiment (the returned values are sizes in bytes):
>>> import sys
>>> sys.getsizeof(None)
16
>>> sys.getsizeof(())
48
Depending on the type of object you use as your return value (numeric type, list, dictionary, tuple, etc.), Python manages memory differently, including how much storage it reserves initially.
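One caveat to the measurement: `sys.getsizeof` reports the size of an object, not the cost of returning it. In CPython (an implementation detail, not a language guarantee) both `None` and the empty tuple are shared objects, so returning either one does not allocate new memory. A small sketch with hypothetical function names:

```python
import sys


def returns_none():
    return None


def returns_empty_tuple():
    return ()


# Both calls hand back the same pre-existing object every time (CPython).
same_none = returns_none() is returns_none()
same_tuple = returns_empty_tuple() is returns_empty_tuple()

# Exact byte counts depend on the Python version and platform.
sizes = (sys.getsizeof(None), sys.getsizeof(()))
```

If this holds on your interpreter, the memory argument mostly disappears and the choice comes down to how the calling code reads.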
However, you also have to consider the code around the function call and how it handles whatever your function returns. Do you check for None? Or do you simply check whether the returned tuple has a length of 0? Propagating the returned value and its type (None vs. an empty tuple in your case) might actually be more tedious to handle and blow up in your face. Don't forget that the code itself is loaded into memory, so if handling None requires too much code (even small pieces of code, but in large quantity), you had better stick with the empty tuple. That also avoids confusing people who use your function and forget that it actually returns two types of values.
Speaking of returning multiple types of values, this is the part where I agree with the accepted answer (but only partially): returning a single type makes the code more maintainable, without a doubt. It's much easier to check only for type A than for A, B, C, etc.
However, Python is an object-oriented language, and as such inheritance, abstract classes and the rest of the OOP shenanigans come into play. It can go as far as generating classes on the fly, which I discovered a few months ago and was stunned by (I had never seen that in C/C++).
Side note: You can read a little bit about metaclasses and dynamic classes in this nice overview article with plenty of examples.
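As a rough illustration of generating a class on the fly, here is a sketch that uses the three-argument form of the built-in `type`. The helper `make_record_class` and the `Point` example are hypothetical names chosen for this illustration:

```python
def make_record_class(name, field_names):
    """Build a simple class at runtime with the given attribute names."""

    def __init__(self, *values):
        for field, value in zip(field_names, values):
            setattr(self, field, value)

    # type(name, bases, namespace) creates a new class object.
    return type(name, (object,), {"__init__": __init__})


Point = make_record_class("Point", ["x", "y"])
p = Point(1, 2)
print(Point.__name__, p.x, p.y)  # Point 1 2
```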
There are in fact multiple design patterns and techniques that wouldn't even exist without so-called polymorphic functions. Below are two very popular topics (I can't find a better way to summarize both in a single term):
Duck typing - often a feature of dynamically typed languages, of which Python is a representative
Factory method design pattern - basically it's a function that returns various objects based on the input it receives.
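Both ideas can be sketched together. In the example below, `make_writer` is a factory function whose return type depends on its input, and the caller relies only on the returned object having a `write` method. All class and function names are hypothetical:

```python
import io
import json


class JsonWriter:
    def write(self, record, stream):
        stream.write(json.dumps(record, sort_keys=True))


class CsvWriter:
    def write(self, record, stream):
        stream.write(",".join(str(record[key]) for key in sorted(record)))


def make_writer(fmt):
    """Factory: the returned type depends on the input."""
    writers = {"json": JsonWriter, "csv": CsvWriter}
    return writers[fmt]()


# Duck typing: the caller only relies on a write() method existing.
out = io.StringIO()
make_writer("csv").write({"a": 1, "b": 2}, out)
csv_text = out.getvalue()
```

Here returning different types is the whole point of the function, and it stays manageable because every returned object supports the same operation.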
Finally, whether your function returns one or multiple types depends entirely on the problem you have to solve. Can this polymorphic behaviour be abused? Sure, like everything else.
I personally think it is perfectly fine for a function to return a tuple or None. However, a function should return at most two different types, and the second one should be None. A function should never return a string and a list, for example.
If x is called like this
foo, bar = x(foo)
returning None would result in a
TypeError: 'NoneType' object is not iterable
if 'bar' is not in foo.
Example
def x(foo):
    if 'bar' in foo:
        return (foo, 'bar')
    return None
foo, bar = x(["foo", "bar", "baz"])
print foo, bar
foo, bar = x(["foo", "NOT THERE", "baz"])
print foo, bar
This results in:
['foo', 'bar', 'baz'] bar
Traceback (most recent call last):
File "f.py", line 9, in <module>
foo, bar = x(["foo", "NOT THERE", "baz"])
TypeError: 'NoneType' object is not iterable
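One way to avoid this failure, sketched here in Python 3 with a hypothetical name `find_bar`, is to keep the shape of the return value constant and put `None` inside the tuple when nothing is found. The original `x` is repeated to show that unpacking its `None` result raises `TypeError` (the exact message wording differs between Python versions):

```python
def x(foo):
    if 'bar' in foo:
        return (foo, 'bar')
    return None


def find_bar(items):
    """Always return a 2-tuple; the second element is None if 'bar' is absent."""
    if 'bar' in items:
        return (items, 'bar')
    return (items, None)


# Unpacking always works with the consistent version.
items, found = find_bar(["foo", "NOT THERE", "baz"])

# The original version fails at the unpacking step.
try:
    foo, bar = x(["foo", "NOT THERE", "baz"])
    unpack_failed = False
except TypeError:
    unpack_failed = True
```

The caller then tests `found is None` instead of guarding the unpacking itself.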
Premature optimization is the root of all evil. The minuscule efficiency gains might be important, but not until you've proven that you need them.
Whatever your language: a function is defined once, but tends to be used in any number of places. Having a consistent return type (not to mention documented pre- and postconditions) means you have to spend more effort defining the function, but you simplify the usage of the function enormously. Guess whether the one-time costs tend to outweigh the repeated savings...?
