I have a numpy array that is changed by a function.
After calling the function, I want to continue working with the initial value of the array (the value before the modifying function was called):
# Init of the array
array = np.array([1, 2, 3])
# Function that modifies array
func(array)
# Print the init value [1,2,3]
print(array)
Is there a way to pass the array by value or am I obligated to make a deep copy?
np.ndarray objects are mutable data structures. This means that any variables referring to a particular array will all reflect a change made to that object.
However, keep in mind that most numpy functions that transform arrays return new array objects, leaving the original unchanged.
What you need to do here depends on exactly what the function does. If it modifies the same array in place, then you'll need to pass it a copy, which you can make with np.ndarray.copy.
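For instance, a minimal sketch (the body of func here is just an assumed in-place modification, standing in for whatever the real function does):

import numpy as np

def func(arr):
    arr += 10                # hypothetical in-place modification

array = np.array([1, 2, 3])
func(array.copy())           # the function receives and modifies a copy
print(array)                 # [1 2 3]; the original is untouched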
Alternatively, you could use the pynverse library (https://pypi.python.org/pypi/pynverse), which numerically inverts your function, and call the inverse to recover the original values, like so:
import numpy as np
from pynverse import inversefunc

cube = lambda x: x**3
invcube = inversefunc(cube)

arr = np.array([1, 2, 3])
arr = cube(arr)    # stands in for the modifying function from the question

In [1]: arr
Out[1]: array([1, 8, 27], dtype=int32)
In [2]: invcube(arr)
Out[2]: array([1, 2, 3])
I'd like to be able to reference an element of a Numpy n-dimensional array when the shape of the array itself is variable.
For a 4-D array, I know I'd be able to access some specific element index (let's say the very first one for argument) by direct referencing, for example
array[1, 1, 1, 1]
But what if on the next iteration I only had a 3-dimensional array? I'd need:
array[1, 1, 1]
And for a 6-dimensional array? Then I'd need
array[1, 1, 1, 1, 1, 1]
Assuming that I know the appropriate index for each case, how do I go about programmatically referencing array indices when the shape can change?
Thanks
You can always reshape the array to 1-D and expand the index accordingly:
import numpy as np

a = np.arange(10*100*77*10*10).reshape(10, 100, 77, 10, 10)
print(a[1, 2, 5, 9, 9])
# output: 785999

def new_index(index, array):
    shape = array.shape
    terms = [v * np.prod(shape[i+1:]) for i, v in enumerate(index)]
    return int(sum(terms))

# make a 1-D array
oneD = a.reshape(np.prod(a.shape))
print(oneD[new_index([1, 2, 5, 9, 9], a)])
# output: 785999
I'm assuming there's a numpy function that does this, but I don't know it off the top of my head.
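The built-in being alluded to is most likely np.ravel_multi_index, which converts a multi-dimensional index into a flat index for a given shape:

import numpy as np

a = np.arange(10*100*77*10*10).reshape(10, 100, 77, 10, 10)
flat = np.ravel_multi_index((1, 2, 5, 9, 9), a.shape)
print(a.ravel()[flat])
# output: 785999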
hpaulj had the answer I was after. I'd tried referencing via a list (created by appending values with increasing dimension), but not via a tuple.
With the variable-dimension index held as a list, first convert it to a tuple, and then use the tuple to index the array:
list_index = [1,1,1,1]
tuple_index = tuple(list_index)
result = array[tuple_index]
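If the number of dimensions itself varies, the index tuple can also be built programmatically; a small sketch (the array shape and index values here are just for illustration):

import numpy as np

array = np.zeros((3, 4, 5, 6))      # any number of dimensions
index = tuple([1] * array.ndim)     # e.g. (1, 1, 1, 1) for a 4-D array
result = array[index]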
I'm trying to clip the values in arrays. I found the function np.clip() and it does what I need. However, the way it modifies the array values in the list of arrays confuses me. Here is the code:
import numpy as np
a = np.arange(5)
b = np.arange(5)
for x in [a, b]:
    np.clip(x, 1, 3, out=x)

Result:
>>> a
array([1, 1, 2, 3, 3])
>>> b
array([1, 1, 2, 3, 3])
The values of a and b have been changed without any explicit assignment to them, even though np.clip() only operates on x.
There are some questions related, but they use index of the list, e.g Modifying list elements in a for loop, Changing iteration variable inside for loop in Python.
Can somebody explain how the function np.clip() can directly modify the values of the arrays in the list?
It's not specific to the np.clip function. It's because you loop over a list of mutable objects, so each element's value can be modified in place. See Immutable vs Mutable types for more information.
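A small sketch of what is actually happening: the loop variable x is bound to the very same array objects as a and b, and out=x writes the clipped values back into them:

import numpy as np

a = np.arange(5)
for x in [a]:
    print(x is a)              # True: x and a are the same array object
    np.clip(x, 1, 3, out=x)    # out=x writes the result into that object
print(a)                       # [1 1 2 3 3]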
I have a function with one of the parameters being a numpy.ndarray. It's mutable (and therefore unhashable), so it cannot be cached by lru_cache.
Is there any existing solution for this?
Possibly the simplest way of doing so is to memoize a version taking only immutable objects.
Say your function takes an np.array, and let's assume it's a 1d array. Fortunately, it is easily translated to a tuple:
import numpy as np
a = np.array([1, 2, 3, 4])
>>> tuple(a)
(1, 2, 3, 4)
and vice versa:
>>> np.array(tuple(a))
array([1, 2, 3, 4])
So you get something like
# Function called by the rest of your program
def array_foo(a):            # `a` is an `np.ndarray`
    ...
    return tuple_foo(tuple(a))

then memoize this internal function instead:

import functools

# Internal implementation
@functools.lru_cache(maxsize=None)
def tuple_foo(t):            # `t` is a tuple
    a = np.array(t)
    ...
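If you'd rather keep passing arrays at the call sites, the conversion can be wrapped in a small decorator; a sketch under the same 1-D assumption (np_cache and total are purely illustrative names, not a standard API):

import functools
import numpy as np

def np_cache(func):
    """Memoize a function of one 1-D ndarray by converting the array to a tuple."""
    @functools.lru_cache(maxsize=None)
    def cached(t):
        return func(np.array(t))

    @functools.wraps(func)
    def wrapper(arr):
        return cached(tuple(arr))
    return wrapper

@np_cache
def total(arr):
    return arr.sum()

total(np.array([1, 2, 3]))   # computed once; equal inputs hit the cache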
Let's say I have a numpy array like
x = np.arange(10)
is it somehow possible to create a reference to a single element i.e.
y = create_a_reference_to(x[3])
y = 100
print x
[ 0 1 2 100 4 5 6 7 8 9]
You can't create a reference to a single element, but you can get a view over that single element:
>>> x = numpy.arange(10)
>>> y = x[3:4]
>>> y[0] = 100
>>> x
array([0, 1, 2, 100, 4, 5, 6, 7, 8, 9])
The reason you can't do the former is that everything in Python is a reference. By doing y = 100, you're changing what y points to, not its value.
If you really want to, you can get that behaviour on instance attributes by using properties. Note this is only possible because the python data model specifies additional operations while accessing class attributes - it's not possible to get this behaviour for variables.
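A minimal sketch of that property-based approach (the class and attribute names here are purely illustrative):

import numpy as np

class ElementRef(object):
    def __init__(self, arr, index):
        self._arr = arr
        self._index = index

    @property
    def value(self):
        return self._arr[self._index]

    @value.setter
    def value(self, new_value):
        self._arr[self._index] = new_value

x = np.arange(10)
ref = ElementRef(x, 3)
ref.value = 100
print(x)   # [  0   1   2 100   4   5   6   7   8   9]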
No, you cannot do that, and that is by design.
Numpy arrays are of type numpy.ndarray. Individual items in it can be accessed with numpy.ndarray.item which does "copy an element of an array to a standard Python scalar and return it".
I'm guessing numpy returns a copy instead of a direct reference to the element to prevent the mutation of numpy items outside of numpy's own implementation.
Just as a thought experiment: let's assume this weren't the case and you were allowed to get a reference to an individual item. What would happen if numpy were in the middle of a calculation and you altered an individual item in another thread?
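You can see that copy-on-access behaviour directly with ndarray.item: the value it returns is an independent Python scalar, so rebinding it never touches the array:

import numpy as np

x = np.arange(10)
y = x.item(3)    # a plain Python int, copied out of the array
y = 100          # rebinds the name y only
print(x)         # [0 1 2 3 4 5 6 7 8 9]  (unchanged)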
@goncalopp gives a correct answer, but there are a few variations that will achieve similar effects.
All of the notations shown below are able to reference a single element while still returning a view:
import numpy as np

x = np.arange(10)
two_index_method = [None] * 10
scalar_element_method = [None] * 10
expansion_method = [None] * 10

for i in range(10):
    two_index_method[i] = x[i:i+1]
    scalar_element_method[i] = x[..., i]        # x[i, ...] works, too
    expansion_method[i] = x[:, np.newaxis][i]   # np.newaxis == None
two_index_method[5] # Returns a length 1 numpy.ndarray, shape=(1,)
# >>> array([5])
scalar_element_method[5] # Returns a numpy scalar, shape = ()
# >>> array(5)
expansion_method[5] # Returns a length 1 numpy.ndarray, shape=(1,)
# >>> array([5])
x[5] = 42 # Change the value in the original `ndarray`
x
# >>> array([0, 1, 2, 3, 4, 42, 6, 7, 8, 9]) # The element has been updated
# All methods presented here are correspondingly updated:
two_index_method[5], scalar_element_method[5], expansion_method[5]
# >>> (array([42]), array(42), array([42]))
Since the object in scalar_element_method is a zero-dimensional array, attempting to reference the element it contains via element[0] raises an IndexError. For a zero-dimensional array, element[()] can be used instead to reference the contained element. This notation can also be used for assignment to a length-1 ndarray, but it has the unfortunate side effect that it does not dereference a length-1 ndarray to a Python scalar. Fortunately, a single method, element.item(), can be used (for dereferencing only) to obtain the value regardless of whether the element is a length-1 ndarray or a zero-dimensional ndarray:
scalar_element_method[5][0] # This fails
# >>> IndexError: too many indices for array
scalar_element_method[5][()] # This works for scalar `ndarray`s
# >>> 42
scalar_element_method[5][()] = 6
expansion_method[5][0] # This works for length-1 `ndarray`s
# >>> 6
expansion_method[5][()] # Doesn't return a python scalar (or even a numpy scalar)
# >>> array([6])
expansion_method[5][()] = 8 # But can still be used to change the value by reference
scalar_element_method[5].item() # item() works to dereference all methods
# >>> 8
expansion_method[5].item()
# >>> 8
TLDR; You can create a single-element view v with v = x[i:i+1], v = x[..., i], or v = x[:, None][i]. While different setters and getters work with each method, you can always assign values with v[()]=new_value, and you can always retrieve a python scalar with v.item().
I have read in multiple places that numpy arrays are passed by reference, but when I execute the following code, why is there a difference between the behavior of foo and bar?
import numpy as np
def foo(arr):
    arr = arr - 3

def bar(arr):
    arr -= 3
a = np.array([3, 4, 5])
foo(a)
print a # prints [3, 4, 5]
bar(a)
print a # prints [0, 1, 2]
I'm using python 2.7 and numpy version 1.6.1
In Python, all variable names are references to values.
When Python evaluates an assignment, the right-hand side is evaluated before the left-hand side. arr - 3 creates a new array; it does not modify arr in-place.
arr = arr - 3 makes the local variable arr reference this new array. It does not modify the value originally referenced by arr, which was passed to foo; the name arr is simply rebound to the new array arr - 3. Moreover, arr is a local variable name in the scope of the foo function; once foo completes, there is no more reference to it and Python is free to garbage collect the value it references. As Reti43 points out, in order for arr's value to affect a, foo must return arr and a must be assigned to that value:
def foo(arr):
    arr = arr - 3
    return arr
# or simply combine both lines into `return arr - 3`

a = foo(a)
In contrast, arr -= 3, which Python translates into a call to the __isub__ special method, does modify the array referenced by arr in place.
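One way to see the difference is to check the object's identity around each operation (a small sketch):

import numpy as np

a = np.array([3, 4, 5])
print(id(a))   # some id

a = a - 3      # builds a new array and rebinds the name
print(id(a))   # a different id: a new object was created

a -= 3         # __isub__ writes into the existing array
print(id(a))   # unchanged from the previous line: same object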
The first function computes arr - 3 and then assigns the local name arr to the result, which doesn't affect the array data passed in. My guess is that in the second function, np.ndarray overrides the -= operator and operates in place on the array data.
Python passes the array by reference:
$:python
...python startup message
>>> import numpy as np
>>> x = np.zeros((2,2))
>>> x
array([[0.,0.],[0.,0.]])
>>> def setx(x):
... x[0,0] = 1
...
>>> setx(x)
>>> x
array([[1.,0.],[0.,0.]])
The top answer is referring to a phenomenon that occurs even in compiled C code: any BLAS-style operation involves a "read onto" step in which either a new array is formed that the user (the code writer, in this case) is aware of, or a new array is formed "under the hood" in a temporary variable that the user is unaware of (you might see this as a .eval() call).
However, I can clearly access the memory of the array as if it were in a more global scope than the called function (i.e., setx(...)), which is exactly what "passing by reference" means in terms of writing code.
And let's do a few more tests to check the validity of the accepted answer:
(continuing the session above)
>>> def minus2(x):
... x[:,:] -= 2
...
>>> minus2(x)
>>> x
array([[-1.,-2.],[-2.,-2.]])
Seems to be passed by reference. Let us do a calculation which will definitely compute an intermediate array under the hood, and see if x is modified as if it is passed by reference:
>>> def pow2(x):
... x = x * x
...
>>> pow2(x)
>>> x
array([[-1.,-2.],[-2.,-2.]])
Huh, I thought x was passed by reference, but maybe it is not? No: here we have shadowed x with a brand-new local binding (which happens implicitly during interpretation), and Python will not propagate this shadowing back to the global scope (that would violate Python's use case, namely to be a beginner-friendly language that can still be used effectively by an expert).
However, I can very easily perform this operation in a "pass-by-reference" manner by forcing the memory (which is not copied when I submit x to the function) to be modified instead:
>>> def refpow2(x):
... x *= x
...
>>> refpow2(x)
>>> x
array([[1., 4.],[4., 4.]])
And so you see that python can be finessed a bit to do what you are trying to do.