In Python, I am trying to change the values of a NumPy array inside a function:
import numpy as np
def function(array):
    array = array + 1
array = np.zeros((10, 1))
function(array)
Since array is a function parameter, it should be passed as a reference, so I should be able to modify its contents inside the function.
array = array + 1 performs an element-wise operation that adds one to every element in the array, so it should change the values inside.
But the array does not actually change after the function call. I am guessing that the program thinks I am trying to change the reference itself, not the contents of the array, because of the syntax of the element-wise operation. Is there any way to get the intended behavior? I don't want to loop through individual elements or make the function return the new array.
This line:
array = array + 1
… does perform an elementwise operation, but the operation it performs is creating a new array with each element incremented. Assigning that array back to the local variable array doesn't do anything useful, because that local variable is about to go away, and you haven't done anything to change the global variable of the same name.
On the other hand, this line:
array += 1
… performs the elementwise operation of incrementing all of the elements in-place, which is probably what you want here.
In Python, mutable collections are only allowed, not required, to handle the += statement this way; they could handle it the same way as array = array + 1 (as immutable types like str do). But builtin types like list, and most popular third-party types like np.ndarray, do what you want.
Another solution if you want to change the content of your array is to use this:
array[:] = array + 1
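The difference between the three spellings discussed above can be checked directly. This is only an illustrative sketch; the function names rebind, add_in_place, and assign_slice are made up for the example.

```python
import numpy as np

def rebind(array):
    # Builds a new array and binds it to the local name only.
    array = array + 1

def add_in_place(array):
    # Updates the caller's array in place.
    array += 1

def assign_slice(array):
    # Builds a temporary array, then copies it into the caller's buffer.
    array[:] = array + 1

a = np.zeros((10, 1))
rebind(a)
after_rebind = a.copy()      # still all zeros
add_in_place(a)
after_in_place = a.copy()    # all ones
assign_slice(a)
after_slice = a.copy()       # all twos
```

Note that the slice assignment still allocates a temporary for array + 1, so for large arrays += is likely the cheaper of the two in-place options.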
I'm new to Python (MATLAB background).
I have a function (np.unique) that can output either 1 or 2 arrays:
an array of unique values;
the counts for each value (enabled by setting the argument return_counts=True).
When the function is set to return a single array only, assigning the result into the undefined variable "uni" makes it an ndarray type:
uni=np.unique(iris_2d['species'],return_counts=False)
But when the function is set to return 2 arrays the variable "uni" is created as a tuple containing 2 ndarrays.
Is there a way to force the output directly into a 2D array (or a multidimensional array in general), without predefining the variable "uni" or using a second function like numpy.stack/numpy.asarray?
import numpy as np
url = 'https://archive.ics.uci.edu/ml/machine-learning-databases/iris/iris.data'
names = ('sepallength', 'sepalwidth', 'petallength', 'petalwidth', 'species')
dtype=np.dtype({'names':names, 'formats':np.append(np.repeat('float',4),'<U16')})
iris_2d = np.genfromtxt(url, delimiter=',', dtype=dtype, usecols=[0,1,2,3,4])
uni_isTuple=np.unique(iris_2d['species'],return_counts=True)
uni_isNdArray=np.unique(iris_2d['species'],return_counts=False)
I'm unaware of a way to force np.unique() to return an ndarray instead of a tuple. I realize you asked for a solution that doesn't call another function, but if you can tolerate passing the tuple to np.array() to build an ndarray from it, that might give you what you want.
uni_isTuple = np.array(np.unique(iris_2d['species'],return_counts=True))
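As a small sketch of what the tuple contains and how it can be stacked, here is the same call on made-up integer labels (the labels array is an assumption standing in for the species column). Keep in mind that a plain ndarray has one dtype, so with string labels the two rows would have to share a dtype; unpacking the tuple into two names avoids that.

```python
import numpy as np

# Hypothetical integer labels standing in for the species column.
labels = np.array([2, 0, 1, 2, 2, 0])

# With return_counts=True, np.unique returns a tuple of two arrays.
values, counts = np.unique(labels, return_counts=True)

# Both rows are integers here, so stacking them keeps an integer dtype.
stacked = np.array((values, counts))
print(stacked.shape)
```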
What I want to do:
I want to create an array and add each Item from a List to the array. This is what I have so far:
count = 0
arr = []
with open(path, encoding='utf-8-sig') as f:
    data = f.readlines()  # data is the list
for s in data:
    arr[count] = s
    count += 1
What am I doing wrong? The Error I get is IndexError: list assignment index out of range
When you try to access arr at index 0, there is not anything there. What you are trying to do is add to it. You should do arr.append(s)
Your arr is an empty array. So, arr[count] = s is giving that error.
Either you initialize your array with empty elements, or use the append method of array. Since you do not know how many elements you will be entering into the array, it is better to use the append method in this case.
for s in data:
    arr.append(s)
    count += 1
It's worth taking a step back and asking what you're trying to do here.
f is already an iterable of lines: something you can loop over with for line in f:. But it's "lazy"—once you loop over it once, it's gone. And it's not a sequence—you can loop over it, but you can't randomly access it with indexes or slices like f[20] or f[-10:].
f.readlines() copies that into a list of lines: something you can loop over, and index. While files have the readlines method for this, it isn't really necessary—you can convert any iterable to a list just like this by just calling list(f).
Your loop appears to be an attempt to create another list of the same lines. Which you could do with just list(data). Although it's not clear why you need another list in the first place.
Also, the term "array" betrays some possible confusion.
A Python list is a dynamic array, which can be indexed and modified, but can also be resized by appending, inserting, and deleting elements. So, technically, arr is an array.
But usually when people talk about "arrays" in Python, they mean fixed-size arrays, usually of fixed-size objects, like those provided by the stdlib array module, the third-party numpy library, or special-purpose types like the builtin bytearray.
In general, converting a list or other iterable into any of these works the same way as converting into a list: just call the constructor. For example, if you have a list of integers in the range 0-255, you can do bytearray(lst) to get a bytearray of the same numbers. Or, if you have a list of lists of float values, np.array(lst) will give you a 2D numpy array of floats. And so on.
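A quick sketch of those constructor calls, using made-up sample values. One detail worth noting: the stdlib array type also needs a typecode as its first argument.

```python
from array import array

import numpy as np

# A list of small integers becomes a bytearray.
byte_values = [0, 17, 255]
as_bytes = bytearray(byte_values)

# A list of lists of floats becomes a 2-D numpy array.
rows = [[1.0, 2.0], [3.0, 4.0]]
as_matrix = np.array(rows)

# The stdlib array needs a typecode; 'd' means C double.
as_doubles = array('d', [1.5, 2.5])
```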
So, why doesn't your code work?
When you write arr = [], you're creating a list of 0 elements.
When you write arr[count] = s, you're trying to set the countth element in the list to s. But there is no countth element. You're writing past the end of the list.
One option is to call arr.append(s) instead. This makes the list 1 element longer than it used to be, and puts s in the new slot.
Another option is to create a list of the right size in the first place, like arr = [None for _ in data]. Then, arr[count] = s can replace the None in the countth slot with s.
But if you really just want a copy of data in another list, you're better off just using arr = list(data), or arr = data[:].
And if you don't have any need for another copy, just do arr = data, or just use data as-is—or even, if it works for your needs, just use f in the first place.
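The options above can be compared side by side. This sketch uses io.StringIO with made-up text as a stand-in for the opened file, so it runs without a real path.

```python
import io

# Stand-in for the opened file; the sample text is made up.
f = io.StringIO("first\nsecond\nthird\n")
data = f.readlines()

# Option 1: grow the list one element at a time.
appended = []
for s in data:
    appended.append(s)

# Option 2: create a list of the right size, then assign by index.
prefilled = [None for _ in data]
for count, s in enumerate(data):
    prefilled[count] = s

# Option 3: copy the whole list in one step.
copied = list(data)
```

All three end up with the same contents; the last one is the shortest when a plain copy is all that is needed.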
It seems like you are coming from a MATLAB or R background. When you do arr = [], it creates an empty list, not an array.
import numpy
count = 0
with open(path, encoding='utf-8-sig') as f:
    data = f.readlines()  # data is the list of lines
size = len(data)
arr = numpy.zeros((size, 1))  # float array with one row per line
for s in data:
    arr[count, 0] = float(s)  # assumes every line holds a number
    count += 1
Suppose I have a very large numpy array a, and I want to add the numerical value 1 to each element of the array. From what I have read so far:
a += 1
is a good way of doing it rather than:
a = a + 1
since in the second case a new array a is created in a different memory slot, while in the first case the old array is effectively replaced in the same memory slot.
Suppose I want to do the following instead:
a = 1-a
What would be the memory efficient way of doing the above?
numpy.subtract(1, a, out=a)
Using the subtract ufunc directly gives you more control than the - operator. Here, we use the out parameter to place the results of the subtraction back into a.
You could do it in place like so:
a *= -1
a += 1
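Both in-place approaches can be checked against each other. The sketch below uses small made-up values and compares the buffer address before and after to confirm that no new array was bound to a.

```python
import numpy as np

a = np.array([0.25, 0.5, 2.0])
address_before = a.ctypes.data

# Approach 1: write 1 - a back into a's own buffer.
np.subtract(1, a, out=a)
first_result = a.copy()   # [0.75, 0.5, -1.0]

# Approach 2: negate in place, then add one in place.
b = np.array([0.25, 0.5, 2.0])
b *= -1
b += 1

same_buffer = a.ctypes.data == address_before
```

The ufunc version makes one pass over the data, while the two-statement version makes two, so the former is probably preferable for very large arrays.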
If I want to get the dot product of two arrays, I can get a performance boost by specifying an array to store the output in instead of creating a new array (if I am performing this operation many times).
import numpy as np
a = np.array([[1.0,2.0],[3.0,4.0]])
b = np.array([[2.0,2.0],[2.0,2.0]])
out = np.empty([2,2])
np.dot(a,b, out = out)
Is there any way I can take advantage of this feature if I need to modify an array in place? For instance, if I want:
out = np.array([[3.0,3.0],[3.0,3.0]])
out *= np.dot(a,b)
Yes, you can use the out argument to modify an array (e.g. array=np.ones(10)) in-place, e.g. np.multiply(array, 3, out=array).
You can even use in-place operator syntax, e.g. array *= 2.
To confirm if the array was updated in-place, you can check the memory address array.ctypes.data before and after the modification.
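Applied to the dot-product case from the question, one possible sketch is to write the product into a reusable buffer and then multiply in place. The scratch array is an assumption introduced for this example; it is not part of the original code.

```python
import numpy as np

a = np.array([[1.0, 2.0], [3.0, 4.0]])
b = np.array([[2.0, 2.0], [2.0, 2.0]])
out = np.array([[3.0, 3.0], [3.0, 3.0]])
scratch = np.empty((2, 2))   # hypothetical reusable buffer

address_before = out.ctypes.data
np.dot(a, b, out=scratch)             # product goes into the scratch buffer
np.multiply(out, scratch, out=out)    # same effect as out *= scratch
address_after = out.ctypes.data
```

If this runs many times, scratch can be allocated once outside the loop so that no temporary is created per iteration.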
I would like to run the contraction algorithm on an array of vertices n^2 times so as to calculate the minimum cut of a graph. After the first for-loop iteration, the array is altered and the remaining iterations use the altered array, which is not what I want. How can I simulate pointers so as to have the original input array during each for-loop iteration?
def n_squared_runs(array):
    min_cut, length = 9999, len(array) ** 2
    for i in range(0, length):
        # perform operation on original input array
        array = contraction(array)
        if len(array) < min_cut:
            min_cut = len(array)
    return min_cut
The contraction() operation should then create and return a new array rather than modifying in place the array it receives as a parameter. You should also use a different variable name for the returned array: if you use array to name both the parameter and the local result, the parameter gets overwritten inside the function.
This has nothing to do with pointers, but with the contracts of the functions in use. If the original array must be preserved, then the helper functions need to make sure that this restriction is enforced. Notice that in Python if you do this:
array = [1, 2, 3]
f(array)
The array received by the f function is the same one that was declared "outside" of it. In fact, all that f receives is a reference to the array, not a copy of it, so naturally any modifications you make to the array inside f will be reflected outside. It's also worth pointing out that all arguments in Python are passed by value, where the value passed is a reference to the object; there is no pass-by-reference in the C++ sense, so rebinding the parameter name inside f never affects the caller's variable.
Don't overwrite the original array.
def n_squared_runs(array):
    min_cut, length = 9999, len(array) ** 2
    for i in range(0, length):
        # perform operation on original input array
        new_array = contraction(array)
        if len(new_array) < min_cut:
            min_cut = len(new_array)
    return min_cut
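To see that the input survives every iteration, here is a runnable sketch. The contraction function below is a hypothetical stand-in, not the real contraction algorithm: it only illustrates the contract of working on a copy and returning a new list.

```python
import random

def contraction(vertices):
    # Hypothetical stand-in, not the real algorithm: works on a copy
    # and drops random entries until two remain.
    remaining = list(vertices)
    while len(remaining) > 2:
        remaining.pop(random.randrange(len(remaining)))
    return remaining

def n_squared_runs(array):
    min_cut = float("inf")
    for _ in range(len(array) ** 2):
        new_array = contraction(array)  # the input list is left intact
        min_cut = min(min_cut, len(new_array))
    return min_cut

graph = [1, 2, 3, 4, 5]
result = n_squared_runs(graph)
```

If the real contraction mutates nested structures such as adjacency lists, a shallow list() copy would not be enough and copy.deepcopy may be needed instead.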