How can I pass an integer by reference in Python?
I want to modify the value of a variable that I am passing to the function. I have read that everything in Python is pass by value, but there has to be an easy trick. For example, in Java you could pass the reference types of Integer, Long, etc.
How can I pass an integer into a function by reference?
What are the best practices?
It doesn't quite work that way in Python. Python passes references to objects. Inside your function you have an object -- You're free to mutate that object (if possible). However, integers are immutable. One workaround is to pass the integer in a container which can be mutated:
def change(x):
    x[0] = 3
x = [1]
change(x)
print(x)  # [3]
This is ugly/clumsy at best, but you're not going to do any better in Python. The reason is that in Python, assignment (=) takes whatever object is the result of the right-hand side and binds it to whatever is on the left-hand side *(or passes it to the appropriate function).
Understanding this, we can see why there is no way to change the value of an immutable object inside a function -- you can't change any of its attributes because it's immutable, and you can't just assign the "variable" a new value because then you're actually creating a new object (which is distinct from the old one) and giving it the name that the old object had in the local namespace.
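The difference between rebinding a name and mutating an object can be sketched with id(). This is only an illustration; the function names rebind and mutate are made up for the example.

```python
def rebind(n):
    # Assignment binds the local name `n` to a new int object;
    # the caller's name still refers to the original object.
    n = n + 1
    return id(n)

def mutate(box):
    # Item assignment calls box.__setitem__, which changes the list in place.
    box[0] = box[0] + 1
    return id(box)

value = 1000
new_id = rebind(value)
print(value)                # 1000: unchanged in the caller
print(new_id == id(value))  # False: the function ended up with a different object

box = [1]
same_id = mutate(box)
print(box)                  # [2]: the mutation is visible to the caller
print(same_id == id(box))   # True: the same list object throughout
```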
Usually the workaround is to simply return the object that you want:
def multiply_by_2(x):
    return 2*x
x = 1
x = multiply_by_2(x)
*In the first example case above, 3 actually gets passed to x.__setitem__.
Most cases where you would need to pass by reference are where you need to return more than one value back to the caller. A "best practice" is to use multiple return values, which is much easier to do in Python than in languages like Java.
Here's a simple example:
import math
def RectToPolar(x, y):
    r = (x ** 2 + y ** 2) ** 0.5
    theta = math.atan2(y, x)
    return r, theta  # return 2 things at once
r, theta = RectToPolar(3, 4)  # assign 2 things at once
Not exactly passing a value directly, but using it as if it was passed.
x = 7
def my_method():
    nonlocal x  # valid only when this snippet is itself nested inside a function
    x += 1
my_method()
print(x) # 8
Caveats:
nonlocal was introduced in Python 3.
If the enclosing scope is the global one, use global instead of nonlocal.
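A minimal sketch of both caveats side by side, assuming Python 3. The names make_counter, increment, total and add_to_total are invented for the example.

```python
def make_counter():
    count = 0            # local to make_counter; the enclosing scope for increment

    def increment():
        nonlocal count   # rebind the variable in the enclosing function scope
        count += 1
        return count

    return increment

counter = make_counter()
counter()
print(counter())  # 2

total = 0                # module-level (global) variable

def add_to_total(n):
    global total         # nonlocal would be a SyntaxError here
    total += n

add_to_total(5)
print(total)  # 5
```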
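Maybe it's not the Pythonic way, but you can do this:
import ctypes
def incr(a):
    a.value += 1  # mutate the C integer in place
x = ctypes.c_int(1)  # create a mutable C int
incr(x)  # pass the ctypes object itself; ctypes.byref() is only for foreign function calls
print(x.value)  # 2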
Really, the best practice is to step back and ask whether you really need to do this. Why do you want to modify the value of a variable that you're passing in to the function?
If you need to do it for a quick hack, the quickest way is to pass a list holding the integer, and add a [0] to every use of it, as mgilson's answer demonstrates.
If you need to do it for something more significant, write a class that has an int as an attribute, so you can just set it. Of course this forces you to come up with a good name for the class, and for the attribute—if you can't think of anything, go back and read the sentence again a few times, and then use the list.
More generally, if you're trying to port some Java idiom directly to Python, you're doing it wrong. Even when there is something directly corresponding (as with static/@staticmethod), you still don't want to use it in most Python programs just because you'd use it in Java.
Maybe slightly more self-documenting than the list-of-length-1 trick is the old empty type trick:
def inc_i(v):
    v.i += 1
x = type('', (), {})()
x.i = 7
inc_i(x)
print(x.i)
A numpy single-element array is mutable and yet for most purposes, it can be evaluated as if it was a numerical python variable. Therefore, it's a more convenient by-reference number container than a single-element list.
import numpy as np
def triple_var_by_ref(x):
    x[0] = x[0] * 3
a = np.array([2])
triple_var_by_ref(a)
print(a + 1)
output:
[7]
The correct answer is to use a class and put the value inside it; this lets you pass by reference exactly as you desire.
class Thing:
    def __init__(self, a):
        self.a = a
def dosomething(ref):
    ref.a += 1
t = Thing(3)
dosomething(t)
print("T is now", t.a)
In Python, every value is a reference (a pointer to an object), just like non-primitives in Java. Also, like Java, Python only has pass by value. So, semantically, they are pretty much the same.
Since you mention Java in your question, I would like to see how you achieve what you want in Java. If you can show it in Java, I can show you how to do it exactly equivalently in Python.
class PassByReference:
    def Change(self, var):
        self.a = var
        print(self.a)
s = PassByReference()
s.Change(5)
class Obj:
    def __init__(self, a):
        self.value = a
    def sum(self, a):
        self.value += a
a = Obj(1)
b = a
a.sum(1)
print(a.value, b.value)  # 2 2
In Python, everything is passed by value, but if you want to modify some state, you can change the value of an integer inside a list or object that's passed to a method.
Integers are immutable in Python: once created, their value cannot be changed. Using the assignment operator on a variable only makes it point to a different object, not to the previous one.
In Python a function can return multiple values, and we can make use of that:
def swap(a,b):
    return b,a
a,b=22,55
a,b=swap(a,b)
print(a,b)
To change the object a variable refers to, we can wrap immutable data types (int, long, float, complex, str, bytes, tuple, frozenset) inside mutable data types (bytearray, list, set, dict).
# var is an instance of the dict type
def change(var,key,new_value):
    var[key]=new_value
var =dict()
var['a']=33
change(var,'a',2625)
print(var['a'])
Related
I apologize if I'm butchering the terminology. I'm trying to understand the code in this example on how to chain a custom function onto a PySpark dataframe. I'd really want to understand exactly what it's doing, and if it is not awful practice before I implement anything.
From the way I'm understanding the code, it:
defines a function g with sub-functions inside of it, that returns a copy of itself
assigns the sub-functions to g as attributes
assigns g as a property of the DataFrame class
I don't think at any step in the process do any of them become a method (when I do getattr, it always says "function")
When I run a (as best as I can do) simplified version of the code (below), it seems like only when I assign the function as a property to a class, and then instantiate at least one copy of the class, do the attributes on the function become available (even outside of the class). I want to understand what and why that is happening.
An answer here (https://stackoverflow.com/a/17007966/19871699) indicates that this is how it behaves, but doesn't really explain what it is or why. I've read this too, but I'm having trouble seeing the connection to the code above.
I read here about the setattr part of the code, but it doesn't mention the exact use case above. This post has some use cases where people do it, but I don't see how it directly applies to the above, unless I've missed something.
The confusing part is when the inner attributes become available.
class SampleClass():
    def __init__(self):
        pass
def my_custom_attribute(self):
    def inner_function_one():
        pass
    setattr(my_custom_attribute,"inner_function",inner_function_one)
    return my_custom_attribute
[x for x in dir(my_custom_attribute) if x[0] != "_"]
returns []
then when I do:
SampleClass.custom_attribute = property(my_custom_attribute)
[x for x in dir(my_custom_attribute) if x[0] != "_"]
it returns []
but when I do:
class_instance = SampleClass()
class_instance.custom_attribute
[x for x in dir(my_custom_attribute) if x[0] != "_"]
it returns ['inner_function']
In the code above though, if I do SampleClass.custom_attribute = my_custom_attribute instead of =property(...) the [x for x... code still returns [].
edit: I'm not intending to access the function itself outside of the class. I just don't understand the behavior, and don't like implementing something I don't understand.
So, setattr is not relevant here. This would all work exactly the same without it, say, by just doing my_custom_attribute.inner_function = inner_function_one. What is relevant is that the approach in the link you showed (whose purpose your example doesn't make entirely clear) relies on using a property, which is a descriptor. The function won't get called unless you access the attribute corresponding to the property on an instance. This comes down to how property works. For any property, given a class Foo:
Foo.attribute_name = property(some_function)
Then some_function won't get called until you do Foo().attribute_name. That is the whole point of property.
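A small sketch of that behavior, recording calls in a list so it is visible when the getter actually runs. The calls list and the return value 42 are just illustration.

```python
calls = []

def some_function(self):
    # Record each invocation so we can see when the getter actually runs.
    calls.append("called")
    return 42

class Foo:
    pass

Foo.attribute_name = property(some_function)
print(len(calls))                          # 0: assigning the property does not call the function

print(type(Foo.attribute_name).__name__)   # property: class access returns the descriptor itself
print(len(calls))                          # 0: still not called

print(Foo().attribute_name)                # 42: instance access triggers the getter
print(len(calls))                          # 1
```

This is why, in the question, the attributes only show up on the function after the property has been read on an instance: that read is the first time the function body runs.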
But this whole solution is very confusingly engineered. It relies on the above behavior, and it sets attributes on the function object.
Note, if all you want to do is add some method to your DataFrame class, you don't need any of this. Consider the following example (using pandas for simplicity):
>>> import pandas as pd
>>> def foobar(self):
...     print("in foobar with instance", self)
...
>>> pd.DataFrame.baz = foobar
>>> df = pd.DataFrame(dict(x=[1,2,3], y=['a','b','c']))
>>> df
   x  y
0  1  a
1  2  b
2  3  c
>>> df.baz()
in foobar with instance    x  y
0  1  a
1  2  b
2  3  c
That's it. You don't need all that rigamarole. Of course, if you wanted to add a nested accessor, df.custom.whatever, you would need something a bit more complicated. You could use the approach in the OP, but I would prefer something more explicit:
import pandas as pd
class AccessorDelegator:
    def __init__(self, accessor_type):
        self.accessor_type = accessor_type
    def __get__(self, instance, cls=None):
        return self.accessor_type(instance)
class CustomMethods:
    def __init__(self, instance):
        self.instance = instance
    def foo(self):
        # do something with self.instance as if this were your `self` on the dataframe being augmented
        print(self.instance.value_counts())
pd.DataFrame.custom = AccessorDelegator(CustomMethods)
df = pd.DataFrame(dict(a=[1,2,3], b=['a','b','c']))
df.custom.foo()
The above will print something like:
a  b
1  a    1
2  b    1
3  c    1
Because when you call a function, the attributes set within that function aren't returned; only the return value is passed back.
In other words, the additional attributes are only available on the returned function and not on 'g' itself.
Try moving setattr() outside of the function.
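A rough sketch of that suggestion, reusing the names from the question. The helper sets_attribute_when_called is invented here only to show the contrast.

```python
def inner_function_one():
    pass

def my_custom_attribute(self):
    return my_custom_attribute

# Attach the attribute at definition time, outside the function body,
# so it exists whether or not the function has ever been called.
my_custom_attribute.inner_function = inner_function_one
print([x for x in dir(my_custom_attribute) if x[0] != "_"])  # ['inner_function']

# Contrast: an attribute set inside the body only appears after a call.
def sets_attribute_when_called():
    sets_attribute_when_called.inner_function = inner_function_one

print(hasattr(sets_attribute_when_called, "inner_function"))  # False: body has not run yet
sets_attribute_when_called()
print(hasattr(sets_attribute_when_called, "inner_function"))  # True
```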
How can I assign the results of a function call to multiple variables when the results are stored by name (not index-able), in python.
For example (tested in Python 3),
import random
# foo, as defined somewhere else where we can't or don't want to change it
def foo():
    t = random.randint(1,100)
    # put in a dummy class instead of just "return t,t+1"
    # because otherwise we could subscript or just A,B = foo()
    class Cat(object):
        x = t
        y = t + 1
    return Cat()
# METHOD 1
# clearly wrong; B should be 1 more than A, but they point to fields of different objects
A,B = foo().x, foo().y
print(A,B)
# METHOD 2
# correct, but requires two lines and an implicit variable
t = foo()
A,B = t.x, t.y
del t # don't really want t lying around
print(A,B)
# METHOD 3
# correct and one line, but an obfuscated mess
A,B = [ (t.x,t.y) for t in (foo(),) ][0]
print(A,B)
print(t) # this will raise an exception, but unless you know your python cold it might not be obvious before running
# METHOD 4
# Conforms to the suggestions in the links below without modifying the initial function foo or class Cat.
# All subsequent calls are pretty, but we have to use an otherwise meaningless shell function
def get_foo():
    t = foo()
    return t.x, t.y
A,B = get_foo()
What we don't want to do
If the results were indexable (Cat extended tuple/list, we had used a namedtuple, etc.), we could simply write A,B = foo() as indicated in the comment above the Cat class. That's what's recommended here, for example.
Let's assume we have a good reason not to allow that. Maybe we like the clarity of assigning from the variable names (if they're more meaningful than x and y) or maybe the object is not primarily a container. Maybe the fields are properties, so access actually involves a method call. We don't have to assume any of those to answer this question though; the Cat class can be taken at face value.
This question already deals with how to design functions/classes the best way possible; if the function's expected return values are already well defined and do not involve tuple-like access, what is the best way to accept multiple values when returning?
I would strongly recommend either using multiple statements, or just keeping the result object without unpacking its attributes. That said, you can use operator.attrgetter for this:
from operator import attrgetter
a, b, c = attrgetter('a', 'b', 'c')(foo())
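Applied to the x and y fields from the question, this might look like the sketch below. The Cat class here is a simplified stand-in with a fixed value instead of a random one, so the output is predictable.

```python
from operator import attrgetter

class Cat:
    # Stand-in for the result object from the question.
    def __init__(self, t):
        self.x = t
        self.y = t + 1

def foo():
    return Cat(10)

# attrgetter with several names returns a callable that produces a tuple,
# so foo() is called once and both attributes come from the same object.
A, B = attrgetter('x', 'y')(foo())
print(A, B)  # 10 11
```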
Python beginner's question. I'm trying to change the value of some variables inside a function, and I don't understand why sometimes it works and sometimes it doesn't. So I would like to know what's happening behind the scenes. If I write:
def assign(**kwargs):
    kwargs['test'] = 3
    kwargs['steps'] += [1]
t = 1
s = []
assign(test=t, steps=s)
print(t)
print(s)
This still prints
1
[]
Now, if I change the function assign to
def assign(**kwargs):
    kwargs['test'] += 3
    kwargs['steps'] += [1, 2, 3]
it changes the list but not the integer. So I guess this has to do with the fact that integers are immutable and lists are mutable. So then I thought I'd use a dictionary instead, to make sure that my variables are changed. So then:
dict = {'test':1, 'steps': []}
assign(**dict)
print(dict)
still prints
{'test': 1, 'steps': [1, 2, 3]}
with exactly the same behavior, so now I'm really puzzled. It seems that when unpacking the dictionary, I am not passing references to the dictionary variables anymore so that these unpacked variables are being copied by value? What's the best way to achieve what I try to do then?
UPDATE
Thanks to the discussion with @6502, since
In Python there is no way to change a parameter that has been passed to a function because you cannot have "pointers" or "references".
The most Pythonic way is not doing it. A function receives parameters and provides results. If a parameter is mutable, the function can mutate its state, but changing the call parameter itself was considered unnecessary.
Then I decided to return a dictionary with the results instead:
def assign(**kwargs):
    kwargs['test'] += 3
    kwargs['steps'] += [1, 2, 3]
    return kwargs
dict = {'test':1, 'steps': []}
dict = assign(**dict)
print(dict)
This works of course, but I wonder about the implications for large data, as it seems to me (coming from a C++ world) that there's a lot of copying around.
The first example is wrong; you get s=[1].
That's because the list s is passed as the parameter steps, and you change the contents of that list.
Much simpler:
def assign(step):
    step += [1]  # change the contents of the list
s = []
assign(step=s)
print(s) # gives [1]
That has nothing to do with keyword arguments or dict expansion. Whether you use ** or give the keywords directly as parameters, the behavior is exactly the same.
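A short sketch of what happens with ** unpacking, using the assign function from the question (without self). It also bears on the copying concern: only a new dict is built for the call, while the values themselves are not copied.

```python
def assign(**kwargs):
    kwargs['test'] += 3           # rebinds the key in the function's own dict
    kwargs['steps'] += [1, 2, 3]  # list += mutates the list object in place
    return kwargs

d = {'test': 1, 'steps': []}
result = assign(**d)

print(result is d)                    # False: ** builds a new dict for the call
print(result['steps'] is d['steps'])  # True: both dicts hold the same list object
print(d)                              # {'test': 1, 'steps': [1, 2, 3]}
print(result['test'])                 # 4
```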
Don't try to pass variables «by reference». That's not possible with python. Use return values instead.
def assign(test, steps):
    return 3, steps + [1]
t, s = assign(3, [])
A simple way to rationalize the semantics is to consider that all Python values are pointers to objects and that those pointers are always passed by value.
If you change the object pointed to, the caller can see the mutation, but if you assign to the variable, you're just changing what it's pointing to and the caller won't notice.
In Python there is no way to change a parameter that has been passed to a function because you cannot have "pointers" or "references" to variables. The only way to rebind a variable is by using its name.
A work-around (in Python 3) is to pass a closure that can access the local, for example:
def foo(x, y, set_result):
    set_result(x + y)
def bar():
    res = None
    def set_res(x):
        nonlocal res
        res = x
    foo(10, 20, set_res)
    print(res)
This kind of trickery is however rarely needed because in languages with reference parameter passing the most common use is to return multiple values, e.g. (C++)
bool read_xy(double& x, double& y);
where the function needs to return three values, x, y and a success flag.
In Python however this is not needed as you can just write
return x, y
and use it as
x, y = read_xy()
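If the success flag is still wanted, it can simply be returned alongside the values. The sketch below is hypothetical: the text argument and the comma-separated input format are assumptions made for the example, not part of the C++ signature above.

```python
def read_xy(text):
    # Hypothetical parser: expects two comma-separated numbers, e.g. "1.5, 2".
    parts = text.split(',')
    if len(parts) != 2:
        return False, None, None
    try:
        x, y = float(parts[0]), float(parts[1])
    except ValueError:
        return False, None, None
    return True, x, y

ok, x, y = read_xy("1.5, 2")
print(ok, x, y)  # True 1.5 2.0
```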
Is there a way to make function B able to access a non-global variable that was declared only in function A, without return statements from function A?
As asked, the question:
Define two functions:
p: prints the value of a variable
q: increments the variable
such that
The initial value of the variable is 0. You can't define the variable in the global environment.
The variable is not located in the global environment and the only way to change it is by invoking q().
The global environment should know only p() and q().
Tip: 1) In Python, a function can return more than one value. 2) A function can be assigned to a variable.
# Example:
>>> p()
0
>>> q()
>>> q()
>>> p()
2
The question says the global environment should know only p and q.
So, taking that literally, it could be done inline using a single function scope:
>>> p, q = (lambda x=[0]: (lambda: print(x[0]), lambda: x.__setitem__(0, x[0] + 1)))()
>>> p()
0
>>> q()
>>> q()
>>> p()
2
Using the tips provided as clues, it could be done something like this:
def make_p_and_q():
    context = {'local_var': 0}
    def p():
        print('{}'.format(context['local_var']))
    def q():
        context['local_var'] += 1
    return p, q
p, q = make_p_and_q()
p() # --> 0
q()
q()
p() # --> 2
The collection of things that a function can access is generally called its scope. One interpretation of your question is whether B can access a "local variable" of A; that is, one that is defined normally as
def A():
    x = 1
The answer here is "not easily": Python lets you do a lot, but local variables are one of the things that are not meant to be accessed from outside a function.
I suspect what your teacher is getting at is that A can modify things outside of its scope, in order to send information out without sending it through the return value. (Whether this is good coding practise is another matter.) For example, functions are themselves Python objects, and you can assign arbitrary properties to Python objects, so you can actually store values on the function object and read them from outside it.
def a():
    a.key = "value"
a()
print(a.key)
Introspection and hacking with function objects
In fact, you can sort of get at the constant values defined in A by looking at the compiled code object generated when you define a function. In the example above, "value" is a constant, and constants are stored on the code object (a.func_code in Python 2; a.__code__ in Python 3):
In [9]: a.func_code.co_consts
Out[9]: (None, 'value')
This is probably not what you meant.
Firstly, it's bad practise to do so. Such variables make debugging difficult and are easy to lose track of, especially in complex code.
Having said that, you can accomplish what you want by declaring a variable as global:
def funcA():
    global foo
    foo = 3
def funcB():
    print(foo)  # prints 3 once funcA() has been called
That's one weird homework assignment; especially the tips make me suspect that you've misunderstood or left out something.
Anyway, here's a simpler solution than the accepted answer: Since calls to q increment the value of the variable, it must be a persistent ("static") variable of some sort. Store it somewhere other than the global namespace, and tell p about it. The obvious place to store it is as an attribute of q:
def q():
    q.x += 1
q.x = 0  # Initialize
def p():
    print(q.x)
In Python, when I write x = 5, x becomes an instance of int automatically. But suppose I have defined a new class, say number, and I want x to become an instance of number instead of int when I assign it the value 5. Is this possible?
ie, Instead of this -->
>>> x = 5
>>> type(x)
<type 'int'>
Is this possible:
>>> x = 5
>>> type(x)
<type 'number'>
No. You would have to write a monkey patch to achieve this, which is incredibly unpythonic. Can you not simply write
x = number(5)
:)
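For illustration, a minimal sketch of the explicit-constructor approach. The question never defines its number class, so the int subclass below is purely hypothetical.

```python
class number(int):
    # Hypothetical subclass: behaves like an int but reports its own type.
    def __repr__(self):
        return "number({})".format(int(self))

x = number(5)
print(type(x).__name__)  # number
print(x + 1)             # 6 (arithmetic falls back to int and returns a plain int)
```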
Note that you really should never do something like this. Jakob has the right answer, i.e. use x = number(5).
However, that said, I wanted to try how it could be done in theory, and here's one solution in the form of a decorator:
import types
class number(object):
    def __init__(self, value):
        self.value = value
def replace_int(x):
    if isinstance(x, int):
        return number(x)
    else:
        return x
def custom_numbers(f):
    # Python 2 only: relies on func_code and the Python 2 CodeType signature
    code = f.func_code
    consts = tuple(map(replace_int, code.co_consts))
    new_code = types.CodeType(code.co_argcount, code.co_nlocals,
                              code.co_stacksize, code.co_flags,
                              code.co_code, consts, code.co_names,
                              code.co_varnames, code.co_filename,
                              code.co_name, code.co_firstlineno,
                              code.co_lnotab)
    return types.FunctionType(new_code, f.func_globals, f.func_name)
Any function you decorate will end up using your custom number class:
@custom_numbers
def test():
    x = 5
    print type(x)
>>> test()
<class '__main__.number'>
The decorator works by replacing integer constants from the function's code-object with instances of the custom class. However, since function.co_code and code.co_consts are both read-only attributes, we have to create new code and function objects with the altered values.
One caveat is that the values are assumed to be constants, so new instances are not created for each invocation of the function. If you mutate the value, the new value will be reflected in each subsequent call of the function.
You would have to take advantage of Python's language services to compile the statement and then walk the AST replacing the objects as appropriate.
In fact, 5 is an instance of int, x is just pointing to it. All variables in Python are references to objects. Thus, when you write type(x) you get the type of the object which x holds a reference to, in this case it is int.
If you assign another value to x, say x = "string", x will hold a reference to that string object, and type(x) will return <type 'str'>.