Confusion about inability to assign numpy array element using multiple array indexing

I ran into a bug caused by using multiple sets of brackets to index an array, i.e. using a[i][j] (for various reasons, mostly laziness, which I've now fixed properly). While attempting to assign an element of the array, I found that I was unable to, but I didn't receive any kind of error to tell me why. I am confused as to why I can't do this:
>>> import numpy as np
>>> x = np.arange(0,50,10)
>>> idx = np.array([1,3,4])
>>> x[idx]
array([10, 30, 40])
>>> x[idx][1]
30
>>> x[idx][1] = 10 #this assignment doesn't work
>>> x[idx][1]
30
However, if I instead index the idx array inside the brackets, then it seems to work:
>>> x[idx[1]]
30
>>> x[idx[1]] = 100
>>> x[idx[1]]
100
Can someone explain to me why?

Another way to explain this is that each [] translates into a __getitem__() call, and each []= into a __setitem__() call.
x[idx][1]= 2
is then
x.__getitem__(idx).__setitem__(1, 2)
If x.__getitem__(idx) produces a new array with its own data buffer, the set changes that new array, not x. If, on the other hand, the get produces a view, then the change will appear in x.
It's important when working with arrays that you understand the difference between a view and a copy.
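To make the view/copy distinction concrete, here is a minimal sketch using the arrays from the question. The names picked and window are only for illustration. np.shares_memory reports whether two arrays overlap in memory.

```python
import numpy as np

x = np.arange(0, 50, 10)        # [ 0 10 20 30 40]
idx = np.array([1, 3, 4])

picked = x[idx]                 # indexing with an array: a copy
window = x[1:4]                 # basic slicing: a view onto x

picked[1] = -1                  # changes only the copy
window[0] = 99                  # writes through to x[1]

print(np.shares_memory(x, picked))   # False
print(np.shares_memory(x, window))   # True
print(x)                             # [ 0 99 20 30 40]
```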

Because:
x[idx]
Creates a new array object with an independent, underlying buffer.
So then you index into that:
[1] = 10
Which does work, but then you don't keep that new array around, and it is discarded immediately.
Whereas:
x[idx[1]] = 100
Assigns to some particular index in the existing array object.
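As a rough sketch of the two forms that do modify x (same x and idx as in the question): both are a single assignment on x itself, so no temporary array is involved.

```python
import numpy as np

x = np.arange(0, 50, 10)        # [ 0 10 20 30 40]
idx = np.array([1, 3, 4])

x[idx[1]] = 100                 # one __setitem__ on x, with the scalar index 3
print(x)                        # [  0  10  20 100  40]

x[idx] = [1, 2, 3]              # also one __setitem__ on x, for all three positions
print(x)                        # [ 0  1 20  2  3]
```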

Related

Why ndarray function as immutable object

I probably misunderstood the terms immutable/mutable and in-place change.
Creating ndarray: x = np.arange(6) . Reshaping ndarray: x.reshape(3,2).
Now when I look at x, it is unchanged; the ndarray is still 1-dim.
When I do the same with a built-in python list, the list gets changed.

As @morhc mentioned, you're not changing your array x because x.reshape returns a new array instead of modifying the existing one. Some numpy methods do work in place (ndarray.sort and ndarray.resize, for example), but reshape is not one of them.
Mutability is a somewhat related, but more general concept.
A mutable object is one that can be changed after it has been created and stored in memory. Immutable objects, once created, cannot be changed; if you want to modify an immutable object, you have to create a new one. For example, lists in python are mutable: You can add or remove elements, and the list remains stored at the same place in memory. A string however is immutable: If you take the string "foobar" and call its replace method to replace some characters, then the modified string that you get as a result is a new object stored in a different place in memory.
You can use the id built-in function in python to check at what memory address an object is stored. So, to demonstrate:
test_list = []
id(test_list)
>>> 1696610990848 # memory address of the empty list object we just created
test_list.append(1)
id(test_list)
>>> 1696610990848 # the list with one item added to it is still the same object
test_str = "foobar"
id(test_str)
>>> 1696611256816 # memory address of the string object we just created
test_str = test_str.replace("b", "x")
id(test_str)
>>> 1696611251312 # the replace method returned a new object
In fact, numpy arrays are in principle mutable:
test_arr = np.zeros(4)
id(test_arr)
>>> 1696611361200
test_arr[0] = 1
id(test_arr)
>>> 1696611361200 # after changing an entry in the array, it's still the same object
reshape is simply not one of the in-place operations: it returns a new array object (where possible a view onto the same data) and leaves the shape of the original untouched.
Note also that making an assignment like test_arr2 = test_arr does not make a copy; instead, test_arr2 points to the same object in memory. If you truly want to make a new copy, you should use test_arr.copy().
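A small sketch of what this looks like for the array from the question. It assumes a freshly created, contiguous array, for which reshape can return a view rather than a copy.

```python
import numpy as np

x = np.arange(6)
y = x.reshape(3, 2)             # new array object; x keeps its shape

print(x.shape, y.shape)         # (6,) (3, 2)
print(y is x)                   # False
print(np.shares_memory(x, y))   # True: here y is a view onto x's data

y[0, 0] = 99                    # writing through the view...
print(x[0])                     # ...shows up in x: 99
```

So x is mutable, and its data can even be changed through y; it is only the shape of x that reshape leaves alone.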

The issue is that you are not overwriting the array. You are currently writing:
x = np.arange(6)
x.reshape(3,2)
But you should rather be writing:
x = np.arange(6)
x = x.reshape(3,2)
Note that the array has to be of size six to rearrange to (3,2).

Fill a portion of a list in Python / Equivalent of std::fill

I was just wondering if there is a clean solution in python to filling a portion of a list with some value (apart from simply iterating over the sublist). E.g., in C++ I would use std::fill. So far, I found the following syntax:
x = [0]*10 # some array
x[2:5] = [7]*3
A solution using numpy would be fine as well.

You can use np.repeat:
import numpy as np
x = np.repeat(0, 10)
x[2:5] = np.repeat(7, 3)

Each class has its own methods. For numpy:
x = np.zeros(10, int)
makes an array of zeros.
x[2:7] = 3
assigns 3 to a portion of it.
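That's similar to your list example, but critically different in some ways: list slice assignment replaces the slice with the elements of an iterable and can change the length of the list, whereas numpy broadcasts the value into the slice and never changes the size of the array.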
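A short sketch of that difference, with lst and arr as illustrative names:

```python
import numpy as np

lst = [0] * 10
lst[2:5] = [7] * 3              # three elements replaced by three: length stays 10
lst[2:5] = [7]                  # three elements replaced by one: the list shrinks
print(len(lst))                 # 8

arr = np.zeros(10, dtype=int)
arr[2:5] = 7                    # the scalar is broadcast; the size never changes
print(arr)                      # [0 0 7 7 7 0 0 0 0 0]
```

For a plain list, lst[2:5] = 7 raises a TypeError, because the right-hand side has to be an iterable.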

Numpy empty array changes value after creating normal array

I read on the official numpy page that there was an alternative to creating an initialised array without using just zeros:
empty, unlike zeros, does not set the array values to zero, and may therefore be marginally faster.
I created two functions below that demonstrate an unusual issue with using this function:
import numpy as np
def getInitialisedArray():
    return np.empty((), dtype=float).tolist()
def createFloatArray(x):
    return np.array(float(x))
If I just call getInitialisedArray() on its own, here is what it outputs:
>>> getInitialisedArray()
0.007812501848093234
And if I just call the createFloatArray() function:
>>> createFloatArray(3.1415)
3.1415
This all seems fine, but if I repeat the test and call the getInitialisedArray() after creating the float array, there is an issue:
print(getInitialisedArray())
print(createFloatArray(3.1415))
print(getInitialisedArray())
Output:
>>>
0.007812501848093234
3.1415
3.1415
It seems the second call to get an initialised array gets the same value as what was put in the normal np.array(). I don't get why this occurs. Shouldn't they be separate arrays that have no link between each other?
--- Update ---
I repeated this and changed the size of the empty array:
import numpy as np
def getInitialisedArray():
    # Changed size to 2 x 2
    return np.empty((2, 2), dtype=float).tolist()
def createFloatArray(x):
    return np.array(float(x))
print(getInitialisedArray())
print(createFloatArray(3.1415))
print(getInitialisedArray())
Output:
[[1.6717403e-316, 6.9051865033801e-310], [9.97338022253e-313, 2.482735075993e-312]]
3.1415
[[1.6717403e-316, 6.9051865033801e-310], [9.97338022253e-313, 2.482735075993e-312]]
This is the sort of output I was expecting, but here it works because I changed the size. Does size now affect if an empty array takes on the same value of a normal np.array()?

From the documentation, np.empty() will "return a new array of given shape and type, without initializing entries." This should mean that it will just allocate a space in memory for the variable it is assigned to. Whatever the data in the memory space is will not be changed.
In the first example, you print the return value of getInitialisedArray without storing it, so nothing references the array afterwards and its memory is released. The allocator is then free to hand the same address to the next object that needs a block of that size. The result of createFloatArray is not stored either, so 3.1415 is written to that address and the block is released again. When you call getInitialisedArray a second time, np.empty receives the same block and, because it does not initialise it, you see 3.1415. If you change what is requested (for example the dimensions of the array), more memory is needed and the allocator may return a different address. In theory, if createFloatArray produced an array of the same shape as getInitialisedArray, you could see the same behaviour.
WARNING! I would highly recommend not doing this. It is possible that python or your system in general will perform a task between those two operations which would change the memory address between calls even if it is the same datatype.
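If you need defined values, one option is to fill the array explicitly. This is a minimal sketch using np.zeros, np.full and ndarray.fill; the variable names are only for illustration.

```python
import numpy as np

scratch = np.empty((2, 2), dtype=float)   # contents are arbitrary at this point
scratch.fill(0.0)                         # now every entry is defined

zeros = np.zeros((2, 2))                  # allocated and zeroed in one step
pies = np.full((2, 2), 3.1415)            # allocated and set to a chosen value

print(np.array_equal(scratch, zeros))     # True
print(pies)
```

np.empty is only safe when every element is written before it is read.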

Check id() for each array after creating it. np.empty() only reserves memory, and a block that has been released can be handed out again to a later array of the same shape.
For more understanding:
print(np.array(float(1)))
print(np.empty((),dtype=float).tolist())
Same thing, but assigning to variables:
x = np.array(float(1))
y = np.empty((),dtype=float).tolist()
print(x)
print(y)

numpy.ndarray sent as argument doesn't need loop for iteration?

In this code np.linspace() assigns to inputs 200 evenly spaced numbers from -20 to 20.
This function works. What I am not understanding is how could it work. How can inputs be sent as an argument to output_function() without needing a loop to iterate over the numpy.ndarray?
import numpy as np
import matplotlib.pyplot as plt
def output_function(x):
    return 100 - x ** 2
inputs = np.linspace(-20, 20, 200)
plt.plot(inputs, output_function(inputs), 'b-')
plt.show()

numpy works by defining operations on vectors the way that you really want to work with them mathematically. So, I can do something like:
a = np.arange(10)
b = np.arange(10)
c = a + b
And it works as you might hope -- each element of a is added to the corresponding element of b and the result is stored in a new array c. If you want to know how numpy accomplishes this, it's all done via the magic methods in the python data model. Specifically in my example case, the __add__ method of numpy's ndarray would be overridden to provide the desired behavior.
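To illustrate the mechanism without numpy, here is a toy sketch. Vec is a made-up class, not how ndarray is actually implemented. It defines just the two magic methods that 100 - x ** 2 needs, and the unchanged output_function then works on it.

```python
class Vec:
    """Toy container that applies arithmetic to every element."""

    def __init__(self, values):
        self.values = list(values)

    def __pow__(self, exponent):
        # Called for `vec ** exponent`
        return Vec(v ** exponent for v in self.values)

    def __rsub__(self, other):
        # Called for `other - vec` when `other` (here an int) does not know Vec
        return Vec(other - v for v in self.values)


def output_function(x):
    return 100 - x ** 2


result = output_function(Vec([-2, 0, 3]))
print(result.values)   # [96, 100, 91]
```

The loop still exists, but it lives inside the methods of the object rather than in the calling code. numpy does the same thing with loops written in C.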

What you want to use is numpy.vectorize which behaves similarly to the python builtin map.
Here is one way you can use numpy.vectorize:
outputs = (np.vectorize(output_function))(inputs)
You asked why it worked: it works because numpy arrays can perform operations on their elements en masse, for example:
a = np.array([1,2,3,4]) # gives you a numpy array of 4 elements [1,2,3,4]
b = a - 1 # this operation on a numpy array will subtract 1 from every element resulting in the array [0,1,2,3]
Because of this property of numpy arrays you can perform certain operations on every element of a numpy array very quickly without using a loop (as you would have to with a regular Python list).
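np.vectorize is mostly useful for functions that only accept scalars. A minimal sketch, with clip_negative as a hypothetical example of such a function (its if needs a single True/False, so calling it directly on an array raises a ValueError):

```python
import numpy as np

def clip_negative(v):
    # Hypothetical scalar-only function: returns 0.0 for non-positive input
    return v if v > 0 else 0.0

inputs = np.array([-1.5, 0.5, 2.0])

clip_all = np.vectorize(clip_negative)   # wraps the function so it accepts arrays
outputs = clip_all(inputs)
print(outputs)
```

Note that the numpy documentation describes vectorize as a convenience and not a performance tool; internally it is essentially a loop. For output_function from the question it is not needed, since the arithmetic already works on whole arrays.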

How do I declare an array in Python?


variable = []
Now variable refers to an empty list*.
Of course this is an assignment, not a declaration. There's no way to say in Python "this variable should never refer to anything other than a list", since Python is dynamically typed.
*The default built-in Python type is called a list, not an array. It is an ordered container of arbitrary length that can hold a heterogenous collection of objects (their types do not matter and can be freely mixed). This should not be confused with the array module, which offers a type closer to the C array type; the contents must be homogenous (all of the same type), but the length is still dynamic.

This is a surprisingly complex topic in Python.
Practical answer
Arrays are represented by class list (see reference and do not mix them with generators).
Check out usage examples:
# empty array
arr = []
# init with values (can contain mixed types)
arr = [1, "eels"]
# get item by index (can be negative to access end of array)
arr = [1, 2, 3, 4, 5, 6]
arr[0] # 1
arr[-1] # 6
# get length
length = len(arr)
# supports append and insert
arr.append(8)
arr.insert(6, 7)
Theoretical answer
Under the hood, Python's list is a wrapper around a real array that contains references to the items. The underlying array is also allocated with some extra space.
Consequences of this are:
random access is really cheap (arr[6653] costs the same as arr[0])
append is effectively free while there is still extra space
insert is expensive
Check this awesome table of operations complexity.
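The extra space can be observed indirectly. This is a rough sketch that assumes CPython; the exact sizes are an implementation detail and differ between versions, so no concrete numbers are shown.

```python
import sys

arr = []
sizes = []
for i in range(20):
    arr.append(i)
    sizes.append(sys.getsizeof(arr))   # size of the list object, in bytes

# The reported size grows in steps, not on every append: several
# consecutive appends reuse capacity that was reserved earlier.
print(sizes)
print(len(set(sizes)), "distinct sizes for", len(sizes), "appends")
```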
Also, please see this picture, where I've tried to show most important differences between array, array of references and linked list:

You don't actually declare things, but this is how you create an array in Python:
from array import array
intarray = array('i')
For more info see the array module: http://docs.python.org/library/array.html
Now possibly you don't want an array, but a list, but others have answered that already. :)

I think you meant that you want a list with the first 30 cells already filled.
So
f = []
for i in range(30):
    f.append(0)
An example to where this could be used is in Fibonacci sequence.
See problem 2 in Project Euler

This is how:
my_array = [1, 'rebecca', 'allard', 15]

For calculations, use numpy arrays like this:
import numpy as np
a = np.ones((3,2)) # a 2D array with 3 rows, 2 columns, filled with ones
b = np.array([1,2,3]) # a 1D array initialised using a list [1,2,3]
c = np.linspace(2,3,100) # an array with 100 points between (and including) 2 and 3
print(a*1.5) # all elements of a times 1.5
print(a.T+b) # b added to the transpose of a
These numpy arrays can be saved and loaded from disk (even compressed), and complex calculations with large numbers of elements are C-like fast.
Much used in scientific environments. See here for more.

JohnMachin's comment should be the real answer.
All the other answers are just workarounds in my opinion!
So:
array=[0]*element_count

A couple of contributions suggested that arrays in Python are represented by lists. This is incorrect. Python has an independent array implementation in the standard library module array (array.array()), so it is incorrect to confuse the two. Lists are lists in Python, so be careful with the nomenclature used.
list_01 = [4, 6.2, 7-2j, 'flo', 'cro']
list_01
Out[85]: [4, 6.2, (7-2j), 'flo', 'cro']
There is one very important difference between list and array.array(). While both of these objects are ordered sequences, array.array() is an ordered homogeneous sequence, whereas a list is a non-homogeneous sequence.
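A quick sketch of that difference. The names mixed and ints are only for illustration, and the exact wording of the error message depends on the Python version, so it is printed rather than quoted.

```python
from array import array

mixed = [4, 6.2, 7 - 2j, 'flo']   # a list accepts any mix of types
ints = array('i', [1, 2, 3])      # typecode 'i': signed C ints only

try:
    ints.append('flo')            # wrong type for an 'i' array
except TypeError as exc:
    rejected = True
    print("rejected:", exc)
else:
    rejected = False

print(ints.typecode, list(ints))  # i [1, 2, 3]
```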

You don't declare anything in Python. You just use it. I recommend you start out with something like http://diveintopython.net.

I would normally just do a = [1,2,3] which is actually a list but for arrays look at this formal definition

To add to Lennart's answer, an array may be created like this:
from array import array
float_array = array("f",values)
where values can take the form of a tuple, list, or np.array, but not array:
values = [1,2,3]
values = (1,2,3)
values = np.array([1,2,3],'f')
# 'i' will work here too, but if array is 'i' then values have to be int
wrong_values = array('f',[1,2,3])
# TypeError: 'array.array' object is not callable
and the output will still be the same:
print(float_array)
print(float_array[1])
print(isinstance(float_array[1],float))
# array('f', [1.0, 2.0, 3.0])
# 2.0
# True
Most methods for list work with array as well, common
ones being pop(), extend(), and append().
Judging from the answers and comments, it appears that the array
data structure isn't that popular. I like it though, the same
way as one might prefer a tuple over a list.
The array structure has stricter rules than a list or np.array, and this can
reduce errors and make debugging easier, especially when working with numerical
data.
Attempts to insert/append a float to an int array will throw a TypeError:
values = [1,2,3]
int_array = array("i",values)
int_array.append(float(1))
# or int_array.extend([float(1)])
# TypeError: integer argument expected, got float
Keeping values which are meant to be integers (e.g. list of indices) in the array
form may therefore prevent a "TypeError: list indices must be integers, not float", since arrays can be iterated over, similar to np.array and lists:
int_array = array('i',[1,2,3])
data = [11,22,33,44,55]
sample = []
for i in int_array:
    sample.append(data[i])
Annoyingly, appending an int to a float array will cause the int to become a float, without throwing an exception.
np.array retains the same data type for its entries too, but instead of giving an error it will change its data type to fit new entries (usually to double or str):
import numpy as np
numpy_int_array = np.array([1,2,3],'i')
for i in numpy_int_array:
    print(type(i))
# <class 'numpy.int32'>
numpy_int_array_2 = np.append(numpy_int_array,int(1))
# still <class 'numpy.int32'>
numpy_float_array = np.append(numpy_int_array,float(1))
# <class 'numpy.float64'> for all values
numpy_str_array = np.append(numpy_int_array,"1")
# <class 'numpy.str_'> for all values
data = [11,22,33,44,55]
sample = []
for i in numpy_int_array_2:
    sample.append(data[i])
# no problem here, but TypeError for the other two
This is true during assignment as well. If the data type is specified, np.array will, wherever possible, transform the entries to that data type:
int_numpy_array = np.array([1,2,float(3)],'i')
# 3 becomes an int
int_numpy_array_2 = np.array([1,2,3.9],'i')
# 3.9 gets truncated to 3 (same as int(3.9))
invalid_array = np.array([1,2,"string"],'i')
# ValueError: invalid literal for int() with base 10: 'string'
# Same error as int('string')
str_numpy_array = np.array([1,2,3],'str')
print(str_numpy_array)
print([type(i) for i in str_numpy_array])
# ['1' '2' '3']
# <class 'numpy.str_'>
or, in essence:
data = [1.2,3.4,5.6]
list_1 = np.array(data,'i').tolist()
list_2 = [int(i) for i in data]
print(list_1 == list_2)
# True
while array will simply give:
invalid_array = array('i',[1,2,3.9])
# TypeError: integer argument expected, got float
Because of this, it is not a good idea to use np.array for type-specific commands. The array structure is useful here. list preserves the data type of the values.
And for something I find rather pesky: the data type is specified as the first argument in array(), but (usually) the second in np.array(). :|
The relation to C is referred to here:
Python List vs. Array - when to use?
Have fun exploring!
Note: The typed and rather strict nature of array leans more towards C rather than Python, and by design Python does not have many type-specific constraints in its functions. Its unpopularity also creates a positive feedback in collaborative work, and replacing it mostly involves an additional [int(x) for x in file]. It is therefore entirely viable and reasonable to ignore the existence of array. It shouldn't hinder most of us in any way. :D

How about this...
>>> a = list(range(12))
>>> a
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11]
>>> a[7]
7

Following on from Lennart, there's also numpy which implements homogeneous multi-dimensional arrays.

Python calls them lists. You can write a list literal with square brackets and commas:
>>> [6,28,496,8128]
[6, 28, 496, 8128]

I had an array of strings and needed an array of booleans of the same length, initialized to True. This is what I did:
strs = ["Hi","Bye"]
bools = [ True for s in strs ]

You can create lists and convert them into arrays, or you can create arrays using the numpy module. Below are a few examples to illustrate this. Numpy also makes it easier to work with multi-dimensional arrays.
import numpy as np
a = np.array([1, 2, 3, 4])
#For custom inputs
a = np.array([int(x) for x in input().split()])
You can also reshape this array into a 2x2 matrix using the reshape function, which takes the dimensions of the matrix as input.
mat = a.reshape(2, 2)

# This creates a list of 5000 zeros
a = [0] * 5000
You can read and write any element in this list with a[n] notation, in the same way as you would with an array.
It does seem to have the same random access performance as an array. I cannot say how it allocates memory because it also supports a mix of different types including strings and objects if you need it to.
