How to append to a Numpy array using a for-loop - python

I'm trying to learn how to work with Numpy arrays in Python, and I'm working on a task where the goal is to append certain values from a square function to an np array.
To be specific, I'm trying to append to the array in such a way that the result looks like this:
[[0, 0], [1, 1], [2, 4], [3, 9], [4, 16], [5, 25], ...]
In other words, something like using a for loop to append to a nested list:
N = 101
def f(x):
    return x**2
list1 = []
for i in range(N+1):
    list1.append([i])
    list1[i].append(f(i))
print(list1)
When I try to do this similarly with Numpy arrays, like below:
import numpy as np
N = 101
x_min = 1
x_max = 10
y = np.zeros(N)
x = np.linspace(x_min,x_max, N)
def f(x):
    return x**2
for i in y:
    np.append(y,f(x))
print(y)
I get the following output:
[0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.
0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.
0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.
0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.
0. 0. 0. 0. 0.]
... which is obviously wrong.
Arrays as a datatype are quite new to me, so I would massively appreciate it if anyone could help me out.
Best regards from a rookie who is motivated to learn and welcomes all help.

It is kind of un-numpy-thonic (if that's a thing) to mix and match numpy arrays with vanilla Python operations like for loops and appending. If I were to do this in pure numpy, I would first start with your original array:
>>> import numpy as np
>>> N = 101
>>> values = np.arange(N)
>>> values
array([ 0, 1, 2, ..., 99, 100])
Then I would compute the squares and stack them with the original values to create your 2D result:
>>> values = np.array([values, values**2])
>>> values.T
array([[ 0, 0],
[ 1, 1],
[ 2, 4],
...
[ 98, 9604],
[ 99, 9801],
[ 100, 10000]])

Numpy gets its speed advantages in two primary ways:
Faster execution of large numbers of repeated operations (i.e. without Python for loops)
Avoiding moving data in memory (i.e. re-allocating memory space).
It's impossible to implement an indefinite append operation with Numpy arrays and still get both of these advantages. So don't do it!
I can't see in your example why an append is necessary because you know the size of the result array in advance (N).
Perhaps what you are looking for instead is vectorized function execution and assignment:
y[:] = f(x)
print(y)
(Instead of your for loop.)
This produces:
[ 1. 1.1881 1.3924 1.6129 1.8496 2.1025 2.3716 2.6569
2.9584 3.2761 3.61 3.9601 4.3264 4.7089 5.1076 5.5225
5.9536 6.4009 6.8644 7.3441 7.84 8.3521 8.8804 9.4249
9.9856 10.5625 11.1556 11.7649 12.3904 13.0321 13.69 14.3641
...etc.
Or, to get a similar output to your first bit of code:
y = np.zeros((N, 2))
y[:, 0] = x
y[:, 1] = f(x)
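To see why the loop in the question leaves y unchanged, it helps to know that np.append does not modify its argument; it returns a new array, which the loop then throws away. The following is a small sketch of that behaviour next to the preallocated version (the names `appended` and `table` are chosen here for illustration and are not from the original post):

```python
import numpy as np

N = 101
x = np.linspace(1, 10, N)

def f(x):
    return x**2

y = np.zeros(N)
appended = np.append(y, f(x))   # returns a NEW array; y itself is untouched

print(y.shape)         # (101,)
print(appended.shape)  # (202,)

# Preallocate the (N, 2) result once and fill it column by column.
table = np.zeros((N, 2))
table[:, 0] = x
table[:, 1] = f(x)
print(table[:3])
```

Because the result size is known up front, nothing needs to grow, so no reallocation happens.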

You could simply broadcast the operation and column_stack them.
col1 = np.arange(N)
col2 = col1 **2
list1 = np.column_stack((col1,col2))
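As a quick check (a sketch, not part of the original answer), the stacked array can be compared with the nested list built by the loop in the question. The names `nested` and `stacked` are illustrative, and N + 1 is used here so that both versions cover 0 through N, as the question's loop does:

```python
import numpy as np

N = 101

# Nested-list version, equivalent to the loop in the question.
nested = []
for i in range(N + 1):
    nested.append([i, i**2])

# Vectorized version: build both columns, then stack them side by side.
col1 = np.arange(N + 1)
col2 = col1**2
stacked = np.column_stack((col1, col2))

print(stacked.shape)
print(np.array_equal(stacked, np.array(nested)))
```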

Related

Reversing numpy where coordinate

The objective is to extract the coordinates of the cells equal to 1 in a 2D array:
[[1. 0. 0. 1. 0. 0. 0. 0. 0.]
[0. 0. 0. 0. 0. 0. 0. 0. 0.]
[0. 0. 0. 0. 0. 0. 0. 0. 0.]
[0. 0. 0. 0. 0. 0. 0. 0. 0.]
[0. 0. 1. 0. 0. 0. 0. 0. 0.]
[0. 0. 0. 0. 0. 0. 0. 0. 0.]
[0. 0. 0. 0. 0. 0. 0. 0. 0.]
[0. 0. 0. 0. 0. 0. 0. 0. 0.]
[0. 0. 0. 0. 0. 0. 0. 0. 0.]]
Here the coordinate system is flipped compared to the conventional one; the vertical axis counts up from the bottom:
8
7
6
5
4
3
2
1
0
0 1 2 3 4 5 6 7 8
Hence, for the 2D array above, the cells equal to 1 should be reported as
[(0, 8),(2,4),(3,8)]
I'm curious how I can tweak np.where to take this type of coordinate system into account.
Simply using
cor=np.array(np.where(arr==1)).T
gives, as expected, a different result from the one I want.
The above array can be reproduced with:
arr=np.zeros((9,9))
arr[0,0]=1
arr[4,2]=1
arr[0,3]=1
Remark: the order within each pair is not important, such that
[(0, 8),(2,4),(3,8)] is equivalent to [(8, 0),(4,2),(8,3)]
The function np.where for a 2D array returns a tuple (np.ndarray, np.ndarray), where the first entry contains all row indices and the second one all column indices. So if you want to index the transposed array, you have to swap the tuple entries:
indices = np.where(condition)[::-1]
If you want to transform the coordinates into a list of 2-element tuples, you could do:
indices = [(r, c) for r, c in zip(*np.where(condition))]
Edit:
After clarification, I now understand that rpb wants to change the origin of indexing, so that the zeroth row becomes the last and so on. Furthermore, the coordinates are desired as one 2-element tuple per found entry.
import numpy as np
arr=np.zeros((9,9))
arr[0,0]=1
arr[4,2]=1
arr[0,3]=1
print(arr)
print(np.where(arr==1.)[0])
n_rows = arr.shape[0]
indices = [(n_rows - 1 -r, c) for r, c in zip(*np.where(arr==1.))]
print(indices)
>>> [(8, 0), (8, 3), (4, 2)]
You can use np.flip on axis 0 to flip the array and get your coordinates:
arr = np.flip(arr, axis=0)
cor = np.array(np.where(arr==1))
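The two approaches above can be put side by side. This is a minimal sketch that reports each hit as an (x, y) pair in the flipped system, which is the ordering the question asks for; the names `flipped_y`, `coords` and `flip_coords` are illustrative, and the conversion to int is only there to keep the printed output readable across NumPy versions:

```python
import numpy as np

arr = np.zeros((9, 9))
arr[0, 0] = 1
arr[4, 2] = 1
arr[0, 3] = 1

n_rows = arr.shape[0]
rows, cols = np.where(arr == 1)

# Row 0 is drawn at the top, so its y value in the flipped system is n_rows - 1.
flipped_y = n_rows - 1 - rows

# (x, y) pairs, converted to plain Python ints for readable printing.
coords = [(int(c), int(y)) for c, y in zip(cols, flipped_y)]
print(coords)   # [(0, 8), (3, 8), (2, 4)]

# Same coordinates via np.flip: flipping axis 0 reverses the row order.
rows_f, cols_f = np.where(np.flip(arr, axis=0) == 1)
flip_coords = [(int(c), int(r)) for r, c in zip(rows_f, cols_f)]
print(sorted(flip_coords) == sorted(coords))
```

Both routes find the same cells; only the order in which they are listed differs.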

Convert categorical data back to numbers using keras utils to_categorical

I am using to_categorical from keras.utils for one-hot encoding the numbers in a list. How can I get the numbers back from the categorical data? Is there any function available for that?
Y=to_categorical(y, num_classes=79)
You can do it simply with np.argmax():
import numpy as np
from keras.utils import to_categorical
y = [0, 1, 2, 0, 4, 5]
Y = to_categorical(y, num_classes=len(y))
print(Y)
y = np.argmax(Y, axis=-1)
print(y)
# [0 1 2 0 4 5]
Why use argmax(axis=-1)?
In the above example, to_categorical returns a matrix with shape (6,6). Setting axis=-1 means taking the index of the largest value in each row.
[[1. 0. 0. 0. 0. 0.]
[0. 1. 0. 0. 0. 0.]
[0. 0. 1. 0. 0. 0.]
[1. 0. 0. 0. 0. 0.]
[0. 0. 0. 0. 1. 0.]
[0. 0. 0. 0. 0. 1.]]
See here for more about indexing.
What if my data have more than 1 dimension?
No difference. Each entry in the original list is converted to a one-hot encoding of size [1, nb_classes], in which only one index is one and the rest are zero. As in the above example, taking the argmax along the last axis recovers the original list.
import keras
y = [[0, 1], [2, 0], [4, 5]]
Y = keras.utils.to_categorical(y, num_classes=6)
#[[[1. 0. 0. 0. 0. 0.]
# [0. 1. 0. 0. 0. 0.]]
#
# [[0. 0. 1. 0. 0. 0.]
# [1. 0. 0. 0. 0. 0.]]
#
# [[0. 0. 0. 0. 1. 0.]
# [0. 0. 0. 0. 0. 1.]]]
y = np.argmax(Y, axis=-1)
#[[0 1]
# [2 0]
# [4 5]]
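The round trip can also be reproduced without Keras, which makes it easier to see what argmax is undoing. The sketch below uses a hypothetical helper `one_hot` built on np.eye as a stand-in for to_categorical; it is an assumption for illustration, not the Keras implementation:

```python
import numpy as np

def one_hot(labels, num_classes):
    """NumPy stand-in for one-hot encoding (illustrative, not the Keras function)."""
    labels = np.asarray(labels)
    # Indexing the identity matrix with integer labels appends a class axis.
    return np.eye(num_classes)[labels]

y = np.array([[0, 1], [2, 0], [4, 5]])
Y = one_hot(y, num_classes=6)
print(Y.shape)      # (3, 2, 6)

# argmax over the last (class) axis recovers the integer labels.
recovered = np.argmax(Y, axis=-1)
print(recovered)
```

Whatever the shape of the label array, the class axis is the last one, which is why axis=-1 works in both the 1-D and the nested case.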

1D Numpy array does not get reshaped into a 2D array

columns = np.shape(lines)[0] # Gets x-axis dimension of array lines (to get numbers of columns)
lengths = np.zeros(shape=(2,1)) # Create a 2D array
# lengths = [[ 0.]
# [ 0.]]
lengths = np.arange(columns).reshape((columns)) # Makes array have the same number of columns as columns and fills it with elements going up from zero <--- This line seems to be turning it into a 1D array
Output after printing lengths array:
print(lengths)
[0 1 2]
Expected Output Example:
print(lengths)
[[0 1 2]] # Notice the double square bracket
As a result, I can't add data along the second dimension of the array, because that dimension no longer exists:
np.append(lengths, 65, axis=1)
AxisError: axis 1 is out of bounds for array of dimension 1
I want the array to be 2D so I can store "IDs" on the first row and values on the second (at a later point in the program). I'm also aware that I could add another row to the array instead of doing it at initialization. But I'd rather not do that since I heard that's inefficient and this program's success is highly dependent on performance.
Thank you.
Since you eventually want a 2d array with ids in one row and values in the second, I'd suggest starting with the right size
In [535]: arr = np.zeros((2,10),int)
In [536]: arr
Out[536]:
array([[0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0, 0, 0, 0, 0]])
In [537]: arr[0,:]=np.arange(10)
In [538]: arr
Out[538]:
array([[0, 1, 2, 3, 4, 5, 6, 7, 8, 9],
[0, 0, 0, 0, 0, 0, 0, 0, 0, 0]])
Sure you could start with a 1 row array of ids, but adding that 2nd row at a later time requires making a new array anyways. np.append is just a variation on np.concatenate.
But to make a 2d array from arange I like:
In [539]: np.arange(10)[None,:]
Out[539]: array([[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]])
reshape also works, but has to be given the correct shape, e.g. (1,10).
In:
lengths = np.zeros(shape=(2,1)) # Create a 2D array
lengths = np.arange(columns).reshape((columns))
the 2nd lengths assignment replaces the first. You have to do an indexed assignment as I did with arr[0,:] to modify an existing array. lengths[0,:] = np.arange(10) wouldn't work because lengths only has 1 column, not 10. Assignments like this require correct pairing of dimensions.
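The points above can be tried out in a few lines. This is a sketch under the assumption that `columns` is 3, as in the printed output of the question; `row_a`, `row_b`, `longer` and `table` are illustrative names:

```python
import numpy as np

columns = 3

row_a = np.arange(columns).reshape((1, columns))  # explicit (1, 3) shape
row_b = np.arange(columns)[None, :]               # same result via a new leading axis
print(row_a.shape, row_b.shape)

# np.append with axis=1 needs a value with the same number of dimensions.
longer = np.append(row_a, [[65]], axis=1)
print(longer)

# Preallocate both rows, then fill the id row by indexed assignment.
table = np.zeros((2, columns), dtype=int)
table[0, :] = np.arange(columns)
print(table)
```

Note that np.append still builds a new array each time, so the preallocated `table` is the better fit when performance matters.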
You don't need 2D data to fill a column of a 2D array; 1D data is enough.
You can put the data into the 0th row instead of the 0th column if you change how the array is laid out. That way the data is copied into contiguous memory (memory without gaps), which is faster.
Program:
import numpy as np
data = np.arange(12)
#method 1
buf = np.zeros((12, 6))
buf[:,0] = data
print(buf)
#method 2
buf = np.zeros((6, 12))
buf[0] = data
print(buf)
Result:
[[ 0. 0. 0. 0. 0. 0.]
[ 1. 0. 0. 0. 0. 0.]
[ 2. 0. 0. 0. 0. 0.]
[ 3. 0. 0. 0. 0. 0.]
[ 4. 0. 0. 0. 0. 0.]
[ 5. 0. 0. 0. 0. 0.]
[ 6. 0. 0. 0. 0. 0.]
[ 7. 0. 0. 0. 0. 0.]
[ 8. 0. 0. 0. 0. 0.]
[ 9. 0. 0. 0. 0. 0.]
[ 10. 0. 0. 0. 0. 0.]
[ 11. 0. 0. 0. 0. 0.]]
[[ 0. 1. 2. 3. 4. 5. 6. 7. 8. 9. 10. 11.]
[ 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
[ 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
[ 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
[ 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
[ 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]]
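The difference in memory layout between the two methods can be inspected directly. The sketch below only shows the layout (flags and strides) for NumPy's default row-major float64 arrays; how much faster the contiguous copy is in practice depends on array size and hardware, so no timing is claimed here. The variable names are illustrative:

```python
import numpy as np

buf_cols = np.zeros((12, 6))   # method 1 layout
buf_rows = np.zeros((6, 12))   # method 2 layout

column_view = buf_cols[:, 0]   # elements are 6 floats apart in memory
row_view = buf_rows[0]         # elements sit next to each other

print(column_view.flags['C_CONTIGUOUS'], column_view.strides)
print(row_view.flags['C_CONTIGUOUS'], row_view.strides)
```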

Numpy broadcasting of fancy index

How does np.newaxis work within the index of a numpy array in program 1? Why does it work like this?
Program 1:
import numpy as np
x_id = np.array([0, 3])[:, np.newaxis]
y_id = np.array([1, 3, 4, 7])
A = np.zeros((6,8))
A[x_id, y_id] += 1
print(A)
Result 1:
[[ 0. 1. 0. 1. 1. 0. 0. 1.]
[ 0. 0. 0. 0. 0. 0. 0. 0.]
[ 0. 0. 0. 0. 0. 0. 0. 0.]
[ 0. 1. 0. 1. 1. 0. 0. 1.]
[ 0. 0. 0. 0. 0. 0. 0. 0.]
[ 0. 0. 0. 0. 0. 0. 0. 0.]]
The newaxis turns the x_id array into a column vector, same as np.array([[0],[3]]).
So you are indexing A at the cartesian product of [0,3] and [1,3,4,7]. Or to put it another way, you end up with 1s for rows 0 and 3, columns 1,3,4,and 7.
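To make the broadcasting step visible, the index arrays can be expanded explicitly. This is a small sketch; the array `A` is filled with np.arange here (an assumption for illustration) so that the selected values can be told apart:

```python
import numpy as np

x_id = np.array([0, 3])[:, np.newaxis]   # shape (2, 1)
y_id = np.array([1, 3, 4, 7])            # shape (4,)

# Index arrays are broadcast against each other before indexing.
rows, cols = np.broadcast_arrays(x_id, y_id)
print(rows)
print(cols)

A = np.arange(48).reshape(6, 8)
picked = A[x_id, y_id]
print(picked.shape)   # (2, 4)

# np.ix_ builds the same pair of broadcastable index arrays.
same = A[np.ix_([0, 3], [1, 3, 4, 7])]
print(np.array_equal(picked, same))
```

Each element of the (2, 4) result is A[rows[i, j], cols[i, j]], which is exactly the row-by-column grid described above.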
Also look at np.ix_([0,3], [1,3,4,7])
or
In [832]: np.stack(np.meshgrid([0,3],[1,3,4,7],indexing='ij'),axis=2)
Out[832]:
array([[[0, 1],
[0, 3],
[0, 4],
[0, 7]],
[[3, 1],
[3, 3],
[3, 4],
[3, 7]]])
Be a little careful with the +=1 setting; if indices are duplicated you might not get what you expect, or would get with a loop.
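The caveat about duplicated indices is easy to demonstrate. In this sketch (array names are illustrative), the in-place += counts a repeated index only once, while np.add.at accumulates once per occurrence, which matches what a Python loop would do:

```python
import numpy as np

idx = np.array([1, 1, 3])   # index 1 appears twice

buffered = np.zeros(5)
buffered[idx] += 1          # the duplicate index is only incremented once

unbuffered = np.zeros(5)
np.add.at(unbuffered, idx, 1)   # accumulates once per occurrence

print(buffered)     # [0. 1. 0. 1. 0.]
print(unbuffered)   # [0. 2. 0. 1. 0.]
```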

Update 3 and 4 dimension elements of numpy array

I have a numpy array of shape [12, 8, 5, 5]. I want to modify the values along the 3rd and 4th dimensions for each element.
For example:
import numpy as np
x = np.zeros((12, 80, 5, 5))
print(x[0,0,:,:])
Output:
[[ 0. 0. 0. 0. 0.]
[ 0. 0. 0. 0. 0.]
[ 0. 0. 0. 0. 0.]
[ 0. 0. 0. 0. 0.]
[ 0. 0. 0. 0. 0.]]
Modify values:
y = np.ones((5,5))
x[0,0,:,:] = y
print(x[0,0,:,:])
Output:
[[ 1. 1. 1. 1. 1.]
[ 1. 1. 1. 1. 1.]
[ 1. 1. 1. 1. 1.]
[ 1. 1. 1. 1. 1.]
[ 1. 1. 1. 1. 1.]]
I can modify for all x[i,j,:,:] using two for loops. But, I was wondering if there is any pythonic way to do it without running two loops. Just curious to know :)
UPDATE
Actual use case:
dict_weights = copy.deepcopy(combined_weights)
for i in range(0, len(combined_weights[each_layer][:, 0, 0, 0])):
    for j in range(0, len(combined_weights[each_layer][0, :, 0, 0])):
        # Extract 5x5
        trans_weight = combined_weights[each_layer][i,j]
        trans_weight = np.fliplr(np.flipud(trans_weight ))
        # Update
        dict_weights[each_layer][i, j] = trans_weight
NOTE: The dimensions i, j of combined_weights can vary. There are around 200 elements in this list with varied i and j dimensions, but 3rd and 4th dimensions are always same (i.e. 5x5).
I just want to know if I can update the elements combined_weights[:,:,5, 5] with transposed values without running 2 for loops.
Thanks.
Simply do:
dict_weights[each_layer] = combined_weights[each_layer][...,::-1,::-1]
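A quick way to convince yourself that the slice matches the double loop is to compare them on a small array. The sketch below uses a made-up `weights` array of shape (2, 3, 5, 5) in place of one entry of combined_weights; the names are illustrative:

```python
import numpy as np

weights = np.arange(2 * 3 * 5 * 5, dtype=float).reshape(2, 3, 5, 5)

# Loop version from the question: flip each 5x5 block in both directions.
looped = np.empty_like(weights)
for i in range(weights.shape[0]):
    for j in range(weights.shape[1]):
        looped[i, j] = np.fliplr(np.flipud(weights[i, j]))

# Slicing version: reverse the last two axes for every (i, j) at once.
sliced = weights[..., ::-1, ::-1]

print(np.array_equal(looped, sliced))
```

One thing to keep in mind: the sliced result is a view of the original data, so call .copy() on it if an independent array is needed.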
